Hate Speech Detection
Identify and categorize hate speech, offensive language, and toxic content in text.
Difficulty: intermediate
Annotation type: text annotation
Configuration File: config.yaml
# Hate Speech Detection Configuration
# Identify and categorize toxic content
annotation_task_name: "Hate Speech Detection"
data_files:
  - "data/posts.json"
item_properties:
  id_key: "id"
  text_display_key: "text"
user_config:
  allow_all_users: true
annotation_schemes:
  - annotation_type: "radio"
    name: "primary_label"
    description: "How would you classify this content?"
    labels:
      - name: "Hate Speech"
        tooltip: "Attacks a group based on protected characteristics"
        key_value: "h"
        color: "#dc2626"
      - name: "Offensive Language"
        tooltip: "Vulgar or inappropriate but not targeting a group"
        key_value: "o"
        color: "#f97316"
      - name: "Neither"
        tooltip: "No hate speech or offensive language"
        key_value: "n"
        color: "#22c55e"
  - annotation_type: "multiselect"
    name: "target_groups"
    description: "If hate speech, which groups are targeted?"
    labels:
      - name: "Race/Ethnicity"
      - name: "Religion"
      - name: "Gender"
      - name: "Sexual Orientation"
      - name: "Disability"
      - name: "National Origin"
      - name: "Age"
      - name: "Other"
    show_if:
      field: "primary_label"
      value: "Hate Speech"
  - annotation_type: "multiselect"
    name: "hate_type"
    description: "What type of hate speech?"
    labels:
      - name: "Slurs/Epithets"
      - name: "Dehumanization"
      - name: "Stereotyping"
      - name: "Threats/Incitement"
      - name: "Exclusion/Marginalization"
    show_if:
      field: "primary_label"
      value: "Hate Speech"
  - annotation_type: "likert"
    name: "severity"
    description: "How severe is this content?"
    size: 5
    min_label: "Mild"
    max_label: "Severe"
  - annotation_type: "radio"
    name: "action_recommendation"
    description: "Recommended action"
    labels:
      - name: "No action"
      - name: "Warning label"
      - name: "Hide behind warning"
      - name: "Remove content"
      - name: "Remove + ban user"
output: "annotation_output/"
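The two multiselect schemes above only appear when primary_label is set to "Hate Speech", via their show_if blocks. A quick way to catch wiring mistakes (a show_if pointing at a renamed scheme, or at a label that no longer exists) is a small pre-flight check. The sketch below is a hypothetical helper, not part of Potato; the dict mirrors a trimmed version of config.yaml:

```python
# Sanity-check show_if wiring in a Potato-style config.
# check_show_if is a hypothetical helper (assumption, not Potato API);
# CONFIG mirrors a trimmed version of the YAML above.

CONFIG = {
    "annotation_schemes": [
        {
            "annotation_type": "radio",
            "name": "primary_label",
            "labels": [{"name": "Hate Speech"},
                       {"name": "Offensive Language"},
                       {"name": "Neither"}],
        },
        {
            "annotation_type": "multiselect",
            "name": "target_groups",
            "labels": [{"name": "Race/Ethnicity"}, {"name": "Other"}],
            "show_if": {"field": "primary_label", "value": "Hate Speech"},
        },
    ]
}

def check_show_if(config):
    """Return a list of problems with show_if references (empty if none)."""
    schemes = {s["name"]: s for s in config["annotation_schemes"]}
    problems = []
    for scheme in config["annotation_schemes"]:
        cond = scheme.get("show_if")
        if cond is None:
            continue
        parent = schemes.get(cond["field"])
        if parent is None:
            # show_if points at a scheme name that does not exist
            problems.append(f"{scheme['name']}: unknown field {cond['field']!r}")
        elif cond["value"] not in {lab["name"] for lab in parent["labels"]}:
            # show_if value is not one of the parent scheme's labels
            problems.append(
                f"{scheme['name']}: {cond['value']!r} is not a label of {cond['field']!r}"
            )
    return problems

print(check_show_if(CONFIG))  # prints []
```

Running this before `potato start` turns a silently-hidden question into an explicit error list.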
Sample Data: sample-data.json
[
  {
    "id": "hs_001",
    "text": "I can't believe how long the line was at the DMV today. Absolutely ridiculous!",
    "source": "social_media",
    "timestamp": "2024-01-15T10:30:00Z"
  },
  {
    "id": "hs_002",
    "text": "That movie was so stupid, I want my two hours back.",
    "source": "social_media",
    "timestamp": "2024-01-15T11:45:00Z"
  }
]
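Each item must expose the keys named in item_properties ("id" and "text" here); extra fields such as source and timestamp are carried along but not required. A minimal pre-flight validator, assuming this item format (validate_items is a hypothetical helper, not part of Potato):

```python
# Check that every data item has the keys configured in item_properties.
# validate_items is a hypothetical pre-flight helper (assumption), matching
# the id_key/text_display_key settings in config.yaml.
import json

ITEM_PROPERTIES = {"id_key": "id", "text_display_key": "text"}

SAMPLE = json.loads("""
[
  {"id": "hs_001",
   "text": "I can't believe how long the line was at the DMV today.",
   "source": "social_media", "timestamp": "2024-01-15T10:30:00Z"},
  {"id": "hs_002",
   "text": "That movie was so stupid, I want my two hours back.",
   "source": "social_media", "timestamp": "2024-01-15T11:45:00Z"}
]
""")

def validate_items(items, props):
    """Return indices of items missing the configured id/text keys."""
    required = (props["id_key"], props["text_display_key"])
    return [i for i, item in enumerate(items)
            if any(key not in item for key in required)]

print(validate_items(SAMPLE, ITEM_PROPERTIES))  # prints []
```

An empty list means every item can be displayed; any returned index points at a record to fix before starting the server.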
// ... and 1 more item
Get This Design
View on GitHub
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/hate-speech-detection
potato start config.yaml
Details
Annotation Types
radio, multiselect, likert
Domain
nlp, content-moderation
Use Cases
content-moderation, toxicity-detection
Tags
hate-speech, toxicity, moderation, offensive, classification
Found an issue or want to improve this design?
Open an Issue
Related Designs
Intent Classification
Classify user utterances into intents for chatbot and virtual assistant training.
radio, multiselect
CheXpert Chest X-Ray Classification
Multi-label classification of chest radiographs for 14 observations (Irvin et al., AAAI 2019). Annotate chest X-rays with pathology labels including uncertainty handling for clinical findings.
radio, multiselect
Deceptive Review Detection
Distinguish between truthful and deceptive (fake) reviews. Based on Ott et al., ACL 2011. Identify fake reviews written to deceive versus genuine customer experiences.
radio, multiselect