Toxic Spans Detection
Character-level toxic span annotation based on SemEval-2021 Task 5 (Pavlopoulos et al., 2021). Instead of binary toxicity classification, annotators identify the specific words/phrases that make a comment toxic, enabling more nuanced content moderation.
Configuration File: config.yaml
# Toxic Spans Detection
# Based on SemEval-2021 Task 5 (Pavlopoulos et al., 2021)
# Paper: https://aclanthology.org/2021.semeval-1.6/
# Dataset: https://github.com/ipavlopoulos/toxic_spans
#
# Task: Identify the specific character sequences within comments that
# contribute to toxicity, rather than making binary judgments about
# entire comments.
#
# Guidelines:
# - Mark the exact words/phrases that make the text toxic
# - Focus on language that is abusive, offensive, or harmful
# - Be precise: highlight only the toxic portions, not surrounding context
# - Multiple spans can be marked in a single comment
# - Some comments may have no toxic spans (false positives in toxicity detection)
port: 8000
server_name: localhost
task_name: "Toxic Spans Detection"
data_files:
  - sample-data.json
id_key: id
text_key: text
output_file: annotations.json
annotation_schemes:
  # First: determine if the text contains toxicity
  - annotation_type: radio
    name: contains_toxicity
    description: "Does this text contain any toxic content?"
    labels:
      - "Yes - contains toxic content"
      - "No - not toxic"
    keyboard_shortcuts:
      "Yes - contains toxic content": "y"
      "No - not toxic": "n"
    tooltips:
      "Yes - contains toxic content": "The text contains language that is abusive, offensive, or harmful"
      "No - not toxic": "The text does not contain toxic language (may be critical but not abusive)"

  # Then: highlight the specific toxic spans
  - annotation_type: span
    name: toxic_spans
    description: "Highlight the specific words or phrases that make this text toxic"
    labels:
      - Toxic
    label_colors:
      Toxic: "#ef4444"
    tooltips:
      Toxic: "Words or phrases that are abusive, offensive, threatening, or otherwise harmful"
    allow_overlapping: false

  # Optional: categorize the type of toxicity
  - annotation_type: multiselect
    name: toxicity_type
    description: "What type(s) of toxicity are present? (select all that apply)"
    labels:
      - Insult
      - Profanity
      - Threat
      - Identity Attack
      - Sexual Content
      - Other
    tooltips:
      Insult: "Personal attacks or demeaning language"
      Profanity: "Vulgar or obscene language"
      Threat: "Expressions of intent to harm"
      "Identity Attack": "Attacks based on identity (race, gender, religion, etc.)"
      "Sexual Content": "Sexually explicit or inappropriate content"
      Other: "Other forms of toxic content"
allow_all_users: true
instances_per_annotator: 100
annotation_per_instance: 3
allow_skip: true
skip_reason_required: false
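The span scheme above records annotations as character-level offsets into the original text, following SemEval-2021 Task 5, where each toxic span is identified by the character positions it covers. The exact field names in Potato's annotations.json export are not reproduced here; the snippet below is only a minimal sketch of the character-offset representation itself, with illustrative offsets for the second sample comment.

# Sketch of the character-offset span representation used by SemEval-2021
# Task 5. The offsets below are illustrative, not taken from a real export.
text = "You're such an idiot if you believe this garbage. Completely braindead take."

# Toxic spans given as [start, end) character offsets into `text`.
toxic_spans = [(15, 20), (41, 48)]

for start, end in toxic_spans:
    print(repr(text[start:end]))  # 'idiot', 'garbage'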
Sample Data: sample-data.json
[
  {
    "id": "toxic_001",
    "text": "This article is well-researched and presents a balanced view of the issue."
  },
  {
    "id": "toxic_002",
    "text": "You're such an idiot if you believe this garbage. Completely braindead take."
  }
]
// ... and 10 more items
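To annotate the original SemEval-2021 Task 5 comments instead of the bundled samples, the dataset's CSV files can be converted into the id/text JSON format that data_files, id_key, and text_key expect. A hedged sketch, assuming a CSV with a "text" column (as in the toxic_spans repository's tsd_*.csv files; check the actual file and column names before running):

# Sketch: convert a SemEval-2021 Task 5 CSV into the sample-data.json format.
# The input filename and "text" column are assumptions about the source data.
import csv
import json

rows = []
with open("tsd_train.csv", newline="", encoding="utf-8") as f:
    for i, row in enumerate(csv.DictReader(f)):
        rows.append({"id": f"toxic_{i:03d}", "text": row["text"]})

with open("sample-data.json", "w", encoding="utf-8") as f:
    json.dump(rows, f, ensure_ascii=False, indent=2)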
Get This Design
Clone or download from the repository.
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/toxic-spans
potato start config.yaml
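Before starting the server, it can help to confirm that the data files referenced in config.yaml exist and that every record carries the fields named by id_key and text_key. A small, optional sanity check, not part of Potato itself (requires PyYAML):

# Optional sanity check: every data file listed in config.yaml should exist
# and every record should contain the configured id_key and text_key fields.
import json
import yaml  # pip install pyyaml

with open("config.yaml") as f:
    config = yaml.safe_load(f)

for path in config["data_files"]:
    with open(path) as f:
        records = json.load(f)
    for record in records:
        assert config["id_key"] in record and config["text_key"] in record, record
    print(f"{path}: {len(records)} instances look OK")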
Found an issue or want to improve this design?
Open an Issue

Related Designs
HateXplain - Explainable Hate Speech Detection
Multi-task hate speech annotation with classification (hate/offensive/normal), target community identification, and rationale span highlighting. Based on the HateXplain benchmark (Mathew et al., AAAI 2021) - the first dataset covering classification, target identification, and rationale extraction.
Coreference Resolution (OntoNotes)
Link pronouns and noun phrases to the entities they refer to in text. Based on the OntoNotes coreference annotation guidelines and CoNLL shared tasks. Identify mention spans and cluster coreferent mentions together.
Dialogue Relation Extraction (DialogRE)
Extract relations between entities in dialogue. Based on Yu et al., ACL 2020. Identify 36 relation types between speakers and entities mentioned in conversations.