OffensEval - Offensive Language Target Identification
Multi-step offensive language annotation combining offensiveness detection, target type classification, and offensive span identification, based on the SemEval 2020 OffensEval shared task (Zampieri et al., SemEval 2020).
Configuration File: config.yaml
# OffensEval - Offensive Language Target Identification
# Based on Zampieri et al., SemEval 2020
# Paper: https://aclanthology.org/2020.semeval-1.188/
# Dataset: https://sites.google.com/site/offaborita/olid
#
# This task implements a multi-step offensive language annotation pipeline.
# Annotators first determine whether a social media post is offensive,
# then classify the target type (individual, group, other, or untargeted),
# and finally highlight the specific offensive spans and target mentions.
#
# Annotation Guidelines:
# 1. Read the post carefully and determine if it contains offensive language
# 2. If offensive, classify whether the offense targets an individual, group, or other entity
# 3. Use span annotation to highlight offensive expressions and target mentions
# 4. A post can be offensive without targeting anyone specific (untargeted)
annotation_task_name: "OffensEval - Offensive Language Target Identification"
task_dir: "."
data_files:
  - sample-data.json
item_properties:
  id_key: "id"
  text_key: "text"
output_annotation_dir: "annotation_output/"
output_annotation_format: "json"
port: 8000
server_name: localhost
annotation_schemes:
  # Step 1: Is the post offensive?
  - annotation_type: radio
    name: offensiveness
    description: "Is this post offensive?"
    labels:
      - "Offensive"
      - "Not Offensive"
    keyboard_shortcuts:
      "Offensive": "1"
      "Not Offensive": "2"
    tooltips:
      "Offensive": "The post contains any form of offensive language, insult, or profanity"
      "Not Offensive": "The post does not contain offensive language"
  # Step 2: What type of target?
  - annotation_type: multiselect
    name: target_type
    description: "If offensive, what type of target is addressed? (select all that apply)"
    labels:
      - "Individual"
      - "Group"
      - "Other"
      - "Untargeted"
    tooltips:
      "Individual": "The offense targets a specific named or unnamed individual"
      "Group": "The offense targets a group of people based on identity, affiliation, or characteristics"
      "Other": "The offense targets an organization, event, or abstract entity"
      "Untargeted": "The post is offensive but does not target any specific entity"
  # Step 3: Highlight offensive spans and target mentions
  - annotation_type: span
    name: offensive_spans
    description: "Highlight the offensive expressions and target mentions in the text"
    labels:
      - "Offensive Span"
      - "Target Mention"
    tooltips:
      "Offensive Span": "A word or phrase that constitutes the offensive expression"
      "Target Mention": "A word or phrase that refers to the target of the offense"
annotation_instructions: |
  You will be shown social media posts. Your task is to:
  1. Determine whether the post contains offensive language.
  2. If offensive, classify the target type (individual, group, other, or untargeted). You may select multiple types.
  3. Highlight specific offensive expressions and target mentions using span annotation.
  Offensive language includes insults, threats, profanity directed at someone, and derogatory language.
  A post can be offensive without targeting anyone specifically (e.g., general profanity or vulgarity).
html_layout: |
  <div style="padding: 15px; max-width: 800px; margin: auto;">
    <div style="background: #fef2f2; border: 1px solid #fecaca; border-radius: 8px; padding: 16px; margin-bottom: 16px;">
      <strong style="color: #991b1b;">Post:</strong>
      <p style="font-size: 16px; line-height: 1.7; margin: 8px 0 0 0;">{{text}}</p>
    </div>
  </div>
allow_all_users: true
instances_per_annotator: 50
annotation_per_instance: 2
allow_skip: true
skip_reason_required: false
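To make the three-step pipeline concrete, here is an illustrative sketch of what one annotator's judgment for a single post could capture. This is not Potato's actual output schema; the field names mirror the `name` keys in the config above, and the span offsets are invented for illustration.

```python
# Illustrative only: one annotator's judgment for one post, keyed by the
# scheme names declared in config.yaml. Span offsets below are made up.
annotation = {
    "id": "offenseval_001",
    "offensiveness": "Offensive",       # Step 1: radio, exactly one choice
    "target_type": ["Group"],           # Step 2: multiselect, one or more choices
    "offensive_spans": [                # Step 3: highlighted character spans
        {"label": "Offensive Span", "start": 62, "end": 77},
        {"label": "Target Mention", "start": 38, "end": 58},
    ],
}

# Because Step 2 is a multiselect, its value is always a list, even when
# only one target type applies.
assert isinstance(annotation["target_type"], list)
print(annotation["offensiveness"])
```

Note that an item judged "Not Offensive" at Step 1 would simply carry no target types or spans; the steps form a cascade rather than three independent questions.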
Sample Data: sample-data.json
[
  {
    "id": "offenseval_001",
    "text": "This policy is a complete disaster and the people who support it are delusional sheep who can't think for themselves."
  },
  {
    "id": "offenseval_002",
    "text": "Just watched the new documentary on climate change. Really eye-opening and well-produced."
  }
]
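The `item_properties` block in the config maps `id_key` to "id" and `text_key` to "text", so every item in the data file must supply both fields. A quick sanity check before starting the server can catch malformed items; the snippet below embeds the two sample items inline so it runs standalone.

```python
import json

# Inline copy of the two sample items shown above (in practice you would
# read sample-data.json from disk).
SAMPLE = """[
  {"id": "offenseval_001",
   "text": "This policy is a complete disaster and the people who support it are delusional sheep who can't think for themselves."},
  {"id": "offenseval_002",
   "text": "Just watched the new documentary on climate change. Really eye-opening and well-produced."}
]"""

items = json.loads(SAMPLE)
for item in items:
    # These key names must match id_key and text_key in config.yaml.
    assert "id" in item and "text" in item, f"malformed item: {item}"
print(len(items))
```

The same check scales to the full file: replace `SAMPLE` with `open("sample-data.json").read()` and any item missing an "id" or "text" field will be reported before annotators ever see it.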
// ... and 8 more items

Get This Design
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/text/computational-social-science/offenseval-target-id
potato start config.yaml
Found an issue or want to improve this design?
Open an Issue

Related Designs
Food Hazard Detection
Food safety hazard detection task requiring annotators to identify hazards, products, and risk levels in food incident reports, and classify the type of contamination. Based on SemEval-2025 Task 9.
HateXplain - Explainable Hate Speech Detection
Multi-task hate speech annotation with classification (hate/offensive/normal), target community identification, and rationale span highlighting. Based on the HateXplain benchmark (Mathew et al., AAAI 2021) - the first dataset covering classification, target identification, and rationale extraction.
MediTOD Medical Dialogue Annotation
Medical history-taking dialogue annotation based on the MediTOD dataset. Annotators label dialogue acts, identify medical entities (symptoms, conditions, medications, tests), and assess doctor-patient communication quality across multi-turn clinical conversations.