RuSentiment - Social Media Sentiment
Five-class sentiment annotation for social media posts, based on RuSentiment (Rogers et al., COLING 2018). The classes are Positive, Negative, Neutral, Speech Act (greetings/thanks), and Skip. The scheme achieved a Fleiss' kappa of 0.654 at an annotation speed of 250-350 posts per hour.
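The agreement figure above is a Fleiss' kappa, which compares observed agreement among a fixed number of annotators with the agreement expected by chance. As a rough illustration of how such a score is computed for this label set, here is a minimal sketch in plain Python. The function name `fleiss_kappa` and the toy ratings are made up for illustration; they are not RuSentiment data and not part of Potato.

```python
from collections import Counter

LABELS = ["Positive", "Negative", "Neutral", "Speech Act", "Skip"]


def fleiss_kappa(ratings, labels=LABELS):
    """Fleiss' kappa for a list of posts, each rated by the same number of annotators.

    ratings: list of lists, e.g. [["Positive", "Positive", "Neutral"], ...]
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(r) != n_raters for r in ratings):
        raise ValueError("every post needs the same number (>= 2) of ratings")
    if any(label not in labels for r in ratings for label in r):
        raise ValueError("rating uses a label outside the scheme")

    counts = [Counter(r) for r in ratings]

    # Observed agreement: share of agreeing annotator pairs, averaged over posts.
    per_item = [
        (sum(c[label] ** 2 for label in labels) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement: sum of squared overall label proportions.
    proportions = [
        sum(c[label] for c in counts) / (n_items * n_raters) for label in labels
    ]
    p_expected = sum(p ** 2 for p in proportions)

    if p_expected == 1.0:
        # Every rating used the same label; kappa is undefined in that case.
        return float("nan")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy example (hypothetical ratings, three annotators per post).
toy = [
    ["Positive", "Positive", "Positive"],
    ["Speech Act", "Speech Act", "Positive"],
    ["Negative", "Negative", "Neutral"],
    ["Neutral", "Neutral", "Neutral"],
]
print(round(fleiss_kappa(toy), 3))
```

With three annotators per post, the same function can be run over exported annotations once they are grouped by post id; how the export is grouped depends on the output format and is not shown here.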
Configuration file: config.yaml
# RuSentiment - Social Media Sentiment Classification
# Based on Rogers et al., COLING 2018
# Paper: https://aclanthology.org/C18-1064/
# Dataset: https://github.com/text-machine-lab/rusentiment
#
# 5-class sentiment scheme designed for social media:
# - Positive: explicit or implicit positive sentiment
# - Negative: explicit or implicit negative sentiment
# - Neutral: no sentiment expressed
# - Speech Act: formulaic posts (greetings, thanks, congratulations)
# - Skip: unclear, noisy, or user-generated content like poems
#
# Guidelines:
# - Mixed sentiment: annotate based on dominant sentiment
# - Hashtags and emojis are NOT automatic sentiment labels
# - Speech Acts may not reflect sender's actual sentiment
# - Annotation speed target: 250-350 posts per hour
annotation_task_name: "RuSentiment: Social Media Sentiment Classification"
task_dir: "."
data_files:
  - sample-data.json
item_properties:
  id_key: "id"
  text_key: "text"
output_annotation_dir: "annotation_output/"
output_annotation_format: "json"
annotation_schemes:
  - annotation_type: radio
    name: sentiment
    description: "Classify the sentiment of this social media post"
    labels:
      - Positive
      - Negative
      - Neutral
      - Speech Act
      - Skip
    keyboard_shortcuts:
      Positive: "1"
      Negative: "2"
      Neutral: "3"
      "Speech Act": "4"
      Skip: "5"
    tooltips:
      Positive: "Post expresses positive emotion or favorable attitude (explicit or implicit)"
      Negative: "Post expresses negative emotion or unfavorable attitude (explicit or implicit)"
      Neutral: "Post contains no sentiment markers; purely informational"
      "Speech Act": "Formulaic posts: greetings, thank-yous, congratulations, wishes (may not reflect true sentiment)"
      Skip: "Unclear posts, excessive noise, user-generated content like poems or lyrics"
  # Optional: For mixed sentiment posts
  - annotation_type: radio
    name: mixed_sentiment
    description: "Does this post contain mixed sentiment?"
    labels:
      - "No - single sentiment"
      - "Yes - but positive dominant"
      - "Yes - but negative dominant"
      - "Yes - balanced/unclear"
    keyboard_shortcuts:
      "No - single sentiment": "n"
      "Yes - but positive dominant": "p"
      "Yes - but negative dominant": "g"
      "Yes - balanced/unclear": "b"
    tooltips:
      "No - single sentiment": "The post expresses only one type of sentiment"
      "Yes - but positive dominant": "Mixed, but overall more positive"
      "Yes - but negative dominant": "Mixed, but overall more negative"
      "Yes - balanced/unclear": "Cannot determine dominant sentiment"
allow_all_users: true
instances_per_annotator: 500
annotation_per_instance: 3
allow_skip: false
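Each scheme in the configuration repeats its label names as keys under keyboard_shortcuts and tooltips, so a renamed label or a reused key is easy to miss. The sketch below shows one way to check that consistency before starting an annotation run. It is only an illustration: `check_scheme` is a hypothetical helper, not a Potato function, and the scheme is copied by hand into a Python dict rather than parsed from config.yaml.

```python
# Hand-copied subset of the "sentiment" scheme from config.yaml (assumption:
# in practice the dict would come from a YAML parser instead).
sentiment_scheme = {
    "name": "sentiment",
    "labels": ["Positive", "Negative", "Neutral", "Speech Act", "Skip"],
    "keyboard_shortcuts": {
        "Positive": "1",
        "Negative": "2",
        "Neutral": "3",
        "Speech Act": "4",
        "Skip": "5",
    },
}


def check_scheme(scheme):
    """Return a list of problems found in one scheme; an empty list means none."""
    problems = []
    labels = scheme["labels"]

    # Every per-label mapping that is present should use exactly the label names.
    for field in ("keyboard_shortcuts", "tooltips"):
        if field not in scheme:
            continue
        mapping = scheme[field]
        for label in labels:
            if label not in mapping:
                problems.append(f"{field}: no entry for label '{label}'")
        for key in mapping:
            if key not in labels:
                problems.append(f"{field}: entry '{key}' is not a label")

    # A shortcut key should not be assigned to two labels in the same scheme.
    seen = {}
    for label, key in scheme.get("keyboard_shortcuts", {}).items():
        if key in seen:
            problems.append(f"shortcut '{key}' used by '{seen[key]}' and '{label}'")
        else:
            seen[key] = label
    return problems


for problem in check_scheme(sentiment_scheme):
    print(problem)
```

The check is per scheme, so the mixed_sentiment scheme can be passed through the same function; whether shortcuts must also be unique across schemes is not something this sketch assumes.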
Sample data: sample-data.json
[
  {
    "id": "rusent_001",
    "text": "Just had the best coffee of my life! This cafe is amazing!"
  },
  {
    "id": "rusent_002",
    "text": "Happy birthday! Wishing you all the best on your special day!"
  }
]
// ... and 13 more items
Get this design
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/text/emotion-sentiment/rusentiment
potato start config.yaml
Related designs
AfriSenti - African Language Sentiment
Sentiment analysis for tweets in African languages, classifying text as positive, negative, or neutral. Covers 14 African languages including Amharic, Hausa, Igbo, Yoruba, and Swahili. Based on SemEval-2023 Task 12 (Muhammad et al.).
Detecting Stance in Tweets
Classification of stance expressed in tweets toward specific targets as favor, against, or neither. Based on SemEval-2016 Task 6 (Stance Detection).
Explainable Online Sexism Detection
Detection and fine-grained classification of online sexism with span-level evidence extraction. Categories include threats, derogation, animosity, and prejudiced discussion. Based on SemEval-2023 Task 10 (Kirk et al.).