Explainable Online Sexism Detection
Detection and fine-grained classification of online sexism with span-level evidence extraction. Categories include threats, derogation, animosity, and prejudiced discussion. Based on SemEval-2023 Task 10 (Kirk et al.).
Configuration File: config.yaml
# Explainable Online Sexism Detection
# Based on Kirk et al., SemEval 2023
# Paper: https://aclanthology.org/2023.semeval-1.305/
# Dataset: https://github.com/rewire-online/edos
#
# This task asks annotators to classify online posts for sexist content
# and highlight the specific expressions that are sexist. The classification
# follows a fine-grained taxonomy covering different types of sexism.
#
# Classification Labels:
# - Not Sexist: The text does not contain sexist content
# - Threats: Direct or indirect threats of harm based on gender
# - Derogation: Derogatory or demeaning language targeting gender
# - Animosity: Expression of hostility or contempt toward a gender
# - Prejudiced Discussion: Stereotyping or generalizing about gender
annotation_task_name: "Explainable Online Sexism Detection"
task_dir: "."
data_files:
  - sample-data.json
item_properties:
  id_key: "id"
  text_key: "text"
output_annotation_dir: "annotation_output/"
output_annotation_format: "json"
port: 8000
server_name: localhost
annotation_schemes:
  - annotation_type: radio
    name: sexism_category
    description: "Classify the type of sexism present in this text"
    labels:
      - "Not Sexist"
      - "Threats"
      - "Derogation"
      - "Animosity"
      - "Prejudiced Discussion"
    keyboard_shortcuts:
      "Not Sexist": "1"
      "Threats": "2"
      "Derogation": "3"
      "Animosity": "4"
      "Prejudiced Discussion": "5"
    tooltips:
      "Not Sexist": "The text does not contain sexist content"
      "Threats": "Direct or indirect threats of harm, violence, or intimidation based on gender"
      "Derogation": "Derogatory, demeaning, or belittling language targeting a gender"
      "Animosity": "Expression of hostility, contempt, or ill-will toward a gender group"
      "Prejudiced Discussion": "Stereotyping, generalizing, or expressing prejudiced views about gender"
  - annotation_type: span
    name: sexist_expression
    description: "Highlight the specific text that contains the sexist expression"
    labels:
      - "Sexist Expression"
annotation_instructions: |
  You will see a social media post. Your task is to:
  1. Determine whether the post contains sexist content.
  2. If sexist, classify the type of sexism (threats, derogation, animosity, or prejudiced discussion).
  3. Highlight the specific text spans that contain the sexist language.
  Note: Focus on the content, not the speaker's identity. Consider context carefully.
html_layout: |
  <div style="padding: 15px; max-width: 800px; margin: auto;">
    <div style="background: #fef2f2; border: 1px solid #fecaca; border-radius: 8px; padding: 16px; margin-bottom: 16px;">
      <strong style="color: #991b1b;">Post:</strong>
      <p style="font-size: 16px; line-height: 1.7; margin: 8px 0 0 0;">{{text}}</p>
    </div>
  </div>
allow_all_users: true
instances_per_annotator: 50
annotation_per_instance: 3
allow_skip: true
skip_reason_required: false
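The radio scheme in this configuration repeats its label names under labels, keyboard_shortcuts, and tooltips, so a typo in one place could leave a label without a shortcut or tooltip. The Python sketch below shows one possible way to check that the three stay in sync. It is an illustration under assumptions: `check_scheme` is a hypothetical helper and not part of Potato, and the `scheme` dict is a hand-copied mirror of the sexism_category scheme rather than something loaded from config.yaml (loading the file would need a YAML parser).

```python
# Hypothetical helper (not part of Potato): checks that a radio scheme's
# labels, keyboard shortcuts, and tooltips all refer to the same label set.

def check_scheme(scheme: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means consistent."""
    problems = []
    labels = scheme["labels"]
    label_set = set(labels)
    if len(label_set) != len(labels):
        problems.append("labels: duplicate label names")

    # Every label should appear as a key in both mappings, and nothing else.
    for field in ("keyboard_shortcuts", "tooltips"):
        keys = set(scheme.get(field, {}))
        for missing in sorted(label_set - keys):
            problems.append(f"{field}: no entry for label {missing!r}")
        for extra in sorted(keys - label_set):
            problems.append(f"{field}: entry for unknown label {extra!r}")

    # Two labels bound to the same key would make the shortcut ambiguous.
    shortcuts = list(scheme.get("keyboard_shortcuts", {}).values())
    if len(set(shortcuts)) != len(shortcuts):
        problems.append("keyboard_shortcuts: the same key is assigned twice")
    return problems


# Hand-copied mirror of the sexism_category scheme from config.yaml.
LABELS = ["Not Sexist", "Threats", "Derogation", "Animosity", "Prejudiced Discussion"]
scheme = {
    "labels": LABELS,
    "keyboard_shortcuts": {label: str(i) for i, label in enumerate(LABELS, start=1)},
    "tooltips": {label: "..." for label in LABELS},  # tooltip text omitted here
}

if __name__ == "__main__":
    print(check_scheme(scheme) or "scheme is consistent")
```

Running a check like this before starting the annotation server is optional; it only guards against the three lists drifting apart when labels are renamed.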
Sample Data: sample-data.json
[
  {
    "id": "sexism_001",
    "text": "Just finished reading an amazing book by a female author. Really well-written thriller with great character development."
  },
  {
    "id": "sexism_002",
    "text": "Women shouldn't be in leadership positions because they're too emotional to make rational decisions under pressure."
  }
]
// ... and 8 more items
Get This Design
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/semeval/2023/task10-explainable-sexism
potato start config.yaml
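Before starting the server, it can help to confirm that the data file matches the id_key and text_key named in the configuration. The sketch below is one way to do that in Python, assuming the data file is a JSON array of objects as in the sample shown earlier. `validate_items` is a hypothetical helper, not a Potato function, and the two items in `example` are placeholders rather than entries from the dataset.

```python
import json
from collections import Counter


def validate_items(items, id_key="id", text_key="text"):
    """Return a list of problems found in a list of data items."""
    problems = []
    for index, item in enumerate(items):
        for key in (id_key, text_key):
            value = item.get(key) if isinstance(item, dict) else None
            # Both fields are expected to be non-empty strings.
            if not isinstance(value, str) or not value.strip():
                problems.append(f"item {index}: missing or empty {key!r}")

    # Duplicate ids would make annotations for different posts collide.
    ids = Counter(
        item[id_key]
        for item in items
        if isinstance(item, dict) and isinstance(item.get(id_key), str)
    )
    for item_id, count in ids.items():
        if count > 1:
            problems.append(f"id {item_id!r} appears {count} times")
    return problems


# Placeholder items in the same shape as sample-data.json.
example = json.loads("""[
  {"id": "item_a", "text": "first placeholder post"},
  {"id": "item_b", "text": "second placeholder post"}
]""")

if __name__ == "__main__":
    print(validate_items(example) or "data looks fine")
```

To check the real file, the same function could be called on the result of `json.load` applied to an open handle on sample-data.json, with the key names taken from item_properties.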
Related Designs
Aspect-Based Sentiment Analysis
Identification of aspect terms in review text with sentiment polarity classification for each aspect. Based on SemEval-2016 Task 5 (ABSA).
Causal Medical Claim Detection and PICO Extraction
Detection of causal claims in medical texts and extraction of PICO (Population, Intervention, Comparator, Outcome) elements. Based on SemEval-2023 Task 8 (Khetan et al.).
Character Identification on Multiparty Dialogues
Identification and linking of character mentions in TV show dialogue, combining span annotation with entity resolution for the main cast of Friends. Based on SemEval-2018 Task 4.