Hate Speech Detection
Identify and categorize hate speech, offensive language, and toxic content in text.
Difficulty: intermediate
Annotation type: text annotation
Configuration File: config.yaml
# Hate Speech Detection Configuration
# Identify and categorize toxic content
annotation_task_name: "Hate Speech Detection"
data_files:
  - "data/posts.json"
item_properties:
  id_key: "id"
  text_display_key: "text"
user_config:
  allow_all_users: true
annotation_schemes:
  - annotation_type: "radio"
    name: "primary_label"
    description: "How would you classify this content?"
    labels:
      - name: "Hate Speech"
        tooltip: "Attacks a group based on protected characteristics"
        key_value: "h"
        color: "#dc2626"
      - name: "Offensive Language"
        tooltip: "Vulgar or inappropriate but not targeting a group"
        key_value: "o"
        color: "#f97316"
      - name: "Neither"
        tooltip: "No hate speech or offensive language"
        key_value: "n"
        color: "#22c55e"
  - annotation_type: "multiselect"
    name: "target_groups"
    description: "If hate speech, which groups are targeted?"
    labels:
      - name: "Race/Ethnicity"
      - name: "Religion"
      - name: "Gender"
      - name: "Sexual Orientation"
      - name: "Disability"
      - name: "National Origin"
      - name: "Age"
      - name: "Other"
    show_if:
      field: "primary_label"
      value: "Hate Speech"
  - annotation_type: "multiselect"
    name: "hate_type"
    description: "What type of hate speech?"
    labels:
      - name: "Slurs/Epithets"
      - name: "Dehumanization"
      - name: "Stereotyping"
      - name: "Threats/Incitement"
      - name: "Exclusion/Marginalization"
    show_if:
      field: "primary_label"
      value: "Hate Speech"
  - annotation_type: "likert"
    name: "severity"
    description: "How severe is this content?"
    size: 5
    min_label: "Mild"
    max_label: "Severe"
  - annotation_type: "radio"
    name: "action_recommendation"
    description: "Recommended action"
    labels:
      - name: "No action"
      - name: "Warning label"
      - name: "Hide behind warning"
      - name: "Remove content"
      - name: "Remove + ban user"
output: "annotation_output/"
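The two multiselect schemes above only appear when primary_label is set to "Hate Speech", via their show_if blocks. A quick way to catch wiring mistakes (a show_if pointing at a renamed scheme, or at a label that no longer exists) is a small pre-flight check. The sketch below is a hypothetical helper, not part of Potato; the dict mirrors a trimmed version of config.yaml:

```python
# Sanity-check show_if wiring in a Potato-style config.
# check_show_if is a hypothetical helper (assumption, not Potato API);
# CONFIG mirrors a trimmed version of the YAML above.

CONFIG = {
    "annotation_schemes": [
        {
            "annotation_type": "radio",
            "name": "primary_label",
            "labels": [{"name": "Hate Speech"},
                       {"name": "Offensive Language"},
                       {"name": "Neither"}],
        },
        {
            "annotation_type": "multiselect",
            "name": "target_groups",
            "labels": [{"name": "Race/Ethnicity"}, {"name": "Other"}],
            "show_if": {"field": "primary_label", "value": "Hate Speech"},
        },
    ]
}

def check_show_if(config):
    """Return a list of problems with show_if references (empty if none)."""
    schemes = {s["name"]: s for s in config["annotation_schemes"]}
    problems = []
    for scheme in config["annotation_schemes"]:
        cond = scheme.get("show_if")
        if cond is None:
            continue
        parent = schemes.get(cond["field"])
        if parent is None:
            # show_if points at a scheme name that does not exist
            problems.append(f"{scheme['name']}: unknown field {cond['field']!r}")
        elif cond["value"] not in {lab["name"] for lab in parent["labels"]}:
            # show_if value is not one of the parent scheme's labels
            problems.append(
                f"{scheme['name']}: {cond['value']!r} is not a label of {cond['field']!r}"
            )
    return problems

print(check_show_if(CONFIG))  # prints []
```

Running this before `potato start` turns a silently-hidden question into an explicit error list.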
Sample Data: sample-data.json
[
  {
    "id": "hs_001",
    "text": "I can't believe how long the line was at the DMV today. Absolutely ridiculous!",
    "source": "social_media",
    "timestamp": "2024-01-15T10:30:00Z"
  },
  {
    "id": "hs_002",
    "text": "That movie was so stupid, I want my two hours back.",
    "source": "social_media",
    "timestamp": "2024-01-15T11:45:00Z"
  }
]
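Each item must expose the keys named in item_properties ("id" and "text" here); extra fields such as source and timestamp are carried along but not required. A minimal pre-flight validator, assuming this item format (validate_items is a hypothetical helper, not part of Potato):

```python
# Check that every data item has the keys configured in item_properties.
# validate_items is a hypothetical pre-flight helper (assumption), matching
# the id_key/text_display_key settings in config.yaml.
import json

ITEM_PROPERTIES = {"id_key": "id", "text_display_key": "text"}

SAMPLE = json.loads("""
[
  {"id": "hs_001",
   "text": "I can't believe how long the line was at the DMV today.",
   "source": "social_media", "timestamp": "2024-01-15T10:30:00Z"},
  {"id": "hs_002",
   "text": "That movie was so stupid, I want my two hours back.",
   "source": "social_media", "timestamp": "2024-01-15T11:45:00Z"}
]
""")

def validate_items(items, props):
    """Return indices of items missing the configured id/text keys."""
    required = (props["id_key"], props["text_display_key"])
    return [i for i, item in enumerate(items)
            if any(key not in item for key in required)]

print(validate_items(SAMPLE, ITEM_PROPERTIES))  # prints []
```

An empty list means every item can be displayed; any returned index points at a record to fix before starting the server.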
// ... and 1 more item
Get This Design
View on GitHub
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/hate-speech-detection
potato start config.yaml
Details
Annotation Types
radio, multiselect, likert
Domain
nlp, content-moderation
Use Cases
content-moderation, toxicity-detection
Tags
hate-speech, toxicity, moderation, offensive, classification
Found an issue or want to improve this design?
Open an Issue
Related Designs
Intent Classification
Classify user utterances into intents for chatbot and virtual assistant training.
radio, multiselect
CheXpert Chest X-Ray Classification
Multi-label classification of chest radiographs for 14 observations (Irvin et al., AAAI 2019). Annotate chest X-rays with pathology labels including uncertainty handling for clinical findings.
radio, multiselect
Deceptive Review Detection
Distinguish between truthful and deceptive (fake) reviews. Based on Ott et al., ACL 2011. Identify fake reviews written to deceive versus genuine customer experiences.
radio, multiselect