intermediate, text

Sentiment Analysis for Code-Mixed Social Media Text

Classify the sentiment of code-mixed social media posts where two languages are interleaved within a single utterance, based on SemEval-2020 Task 9 (Patwa et al.). Supports Hindi-English and Spanish-English language pairs.


Configuration file: config.yaml

# Sentiment Analysis for Code-Mixed Social Media Text
# Based on Patwa et al., SemEval 2020
# Paper: https://aclanthology.org/2020.semeval-1.100/
# Dataset: https://ritual-uh.github.io/sentimix2020/
#
# Annotators classify the sentiment of social media posts written in
# code-mixed text (two languages interleaved). The task captures the
# challenge of understanding sentiment in multilingual, informal text.

annotation_task_name: "Sentiment Analysis for Code-Mixed Social Media Text"
task_dir: "."

data_files:
  - sample-data.json

item_properties:
  id_key: "id"
  text_key: "text"

output_annotation_dir: "annotation_output/"
output_annotation_format: "json"

port: 8000
server_name: localhost

annotation_schemes:
  - annotation_type: radio
    name: sentiment
    description: "What is the overall sentiment of this code-mixed text?"
    labels:
      - "Positive"
      - "Negative"
      - "Neutral"
      - "Mixed"
    keyboard_shortcuts:
      "Positive": "1"
      "Negative": "2"
      "Neutral": "3"
      "Mixed": "4"
    tooltips:
      "Positive": "The text expresses a positive emotion or opinion"
      "Negative": "The text expresses a negative emotion or opinion"
      "Neutral": "The text does not express a clear positive or negative sentiment"
      "Mixed": "The text expresses both positive and negative sentiments"

annotation_instructions: |
  You will see a social media post written in code-mixed text (switching between
  two languages). Your task is to:
  1. Read the text carefully, considering both languages used.
  2. Determine the overall sentiment expressed in the post.
  3. Select Positive, Negative, Neutral, or Mixed.

  Note: Even if you don't understand every word, try to infer sentiment from
  the words you do recognize, emoticons, and overall tone.

html_layout: |
  <div style="padding: 15px; max-width: 800px; margin: auto;">
    <div style="background: #f0f9ff; border: 1px solid #bae6fd; border-radius: 8px; padding: 16px; margin-bottom: 16px;">
      <strong style="color: #0369a1;">Code-Mixed Text:</strong>
      <p style="font-size: 16px; line-height: 1.7; margin: 8px 0 0 0;">{{text}}</p>
    </div>
    <div style="background: #f0fdf4; border: 1px solid #bbf7d0; border-radius: 8px; padding: 16px; margin-bottom: 16px;">
      <strong style="color: #166534;">Language Pair:</strong>
      <span style="font-size: 14px;">{{language_pair}}</span>
    </div>
  </div>

allow_all_users: true
instances_per_annotator: 50
annotation_per_instance: 2
allow_skip: true
skip_reason_required: false
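
Because `annotation_per_instance: 2` gives every post two independent labels, a natural follow-up is to check how often those labels agree. The sketch below computes Cohen's kappa over two parallel label lists. It assumes the labels have already been extracted from the annotation output and paired per item. The function name `cohen_kappa` and the example lists are hypothetical, and nothing here reads Potato's actual output files.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long lists of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)

    # Observed agreement: share of items where both labels match.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of each annotator's marginal label
    # frequencies, summed over every label that occurs.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    all_labels = set(counts_a) | set(counts_b)
    expected = sum(counts_a[l] * counts_b[l] for l in all_labels) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Made-up labels for illustration only, not real annotation results.
    first = ["Positive", "Positive", "Negative", "Neutral"]
    second = ["Positive", "Negative", "Negative", "Neutral"]
    print(round(cohen_kappa(first, second), 3))
```

With `allow_all_users: true`, the two labels on an item may come from different annotator pairs across the dataset. In that setting a multi-annotator measure such as Krippendorff's alpha may be a better fit than pairwise kappa.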

Sample data: sample-data.json

[
  {
    "id": "cm_001",
    "text": "Yaar ye movie bahut amazing thi! Best film of the year, no doubt about it.",
    "language_pair": "Hindi-English"
  },
  {
    "id": "cm_002",
    "text": "Esta clase es so boring, I can't even stay awake. Necesito cafe urgente.",
    "language_pair": "Spanish-English"
  }
]

// ... and 8 more items
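
The config reads `id` and `text` through `item_properties`, and the layout also references `{{language_pair}}`, so each item needs all three fields. The following is a minimal sketch of a pre-flight check for a data file of this shape. The helper `find_problems` is hypothetical and not part of Potato, and the two items shown above are inlined so the example runs on its own.

```python
import json

# Fields the config and layout rely on: id_key, text_key, and the
# {{language_pair}} placeholder used in html_layout.
REQUIRED_KEYS = ("id", "text", "language_pair")


def find_problems(items):
    """Return a list of human-readable problems found in the items."""
    problems = []
    seen_ids = set()
    for index, item in enumerate(items):
        for key in REQUIRED_KEYS:
            value = item.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"item {index}: missing or empty '{key}'")
        item_id = item.get("id")
        if isinstance(item_id, str):
            if item_id in seen_ids:
                problems.append(f"item {index}: duplicate id '{item_id}'")
            seen_ids.add(item_id)
    return problems


# The two items shown above, inlined so the sketch is self-contained.
SAMPLE = json.loads("""
[
  {"id": "cm_001",
   "text": "Yaar ye movie bahut amazing thi! Best film of the year, no doubt about it.",
   "language_pair": "Hindi-English"},
  {"id": "cm_002",
   "text": "Esta clase es so boring, I can't even stay awake. Necesito cafe urgente.",
   "language_pair": "Spanish-English"}
]
""")

if __name__ == "__main__":
    print(find_problems(SAMPLE) or "no problems found")
```

To check the real file, replace `SAMPLE` with `json.load(open("sample-data.json", encoding="utf-8"))`.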

Get this design

View on GitHub

Clone or download from the repository

Quick start:

git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/semeval/2020/task09-code-mixed-sentiment
potato start config.yaml

Details

Annotation types

radio

Domain

NLP, SemEval

Use cases

Sentiment Analysis, Code-Mixing, Multilingual NLP

Tags

semeval, semeval-2020, shared-task, code-mixing, sentiment, multilingual, social-media
