IndicNLP Suite - Multilingual Sentiment Analysis
Multilingual sentiment analysis for Indic languages based on the IndicNLP Suite (Kakwani et al., Findings of EMNLP 2020). Annotators classify the sentiment of product reviews in various Indic languages and provide optional notes.
Configuration file: config.yaml
# IndicNLP Suite - Multilingual Sentiment Analysis
# Based on Kakwani et al., Findings of EMNLP 2020
# Paper: https://aclanthology.org/2020.findings-emnlp.445/
# Dataset: https://indicnlp.ai4bharat.org/
#
# This task classifies the sentiment of product reviews written in various
# Indic languages. Annotators assign a sentiment label and may provide
# optional notes about the text.
#
# Sentiment Labels:
# - Positive: The review expresses satisfaction or recommendation
# - Negative: The review expresses dissatisfaction or criticism
# - Neutral: The review is balanced or purely descriptive
#
# Annotation Guidelines:
# 1. Read the product review in the given language
# 2. Note the language identifier for context
# 3. Assign a sentiment label (Positive, Negative, or Neutral)
# 4. Optionally add notes about any difficulties or observations
annotation_task_name: "IndicNLP Suite - Multilingual Sentiment Analysis"
task_dir: "."
data_files:
  - sample-data.json
item_properties:
  id_key: "id"
  text_key: "text"
output_annotation_dir: "annotation_output/"
output_annotation_format: "json"
port: 8000
server_name: localhost
annotation_schemes:
  - annotation_type: radio
    name: sentiment
    description: "What is the sentiment of this product review?"
    labels:
      - "Positive"
      - "Negative"
      - "Neutral"
    keyboard_shortcuts:
      "Positive": "1"
      "Negative": "2"
      "Neutral": "3"
    tooltips:
      "Positive": "The review expresses satisfaction, praise, or recommendation"
      "Negative": "The review expresses dissatisfaction, criticism, or complaint"
      "Neutral": "The review is balanced, factual, or does not express clear sentiment"
  - annotation_type: text
    name: notes
    description: "Optional notes about the review or annotation"
annotation_instructions: |
  You will be shown a product review written in an Indic language.
  The language of the review is indicated above the text.
  1. Read the product review carefully.
  2. Assign a sentiment label: Positive, Negative, or Neutral.
  3. Optionally, add any notes about the review or your annotation.
  Note: You should be proficient in the indicated language to annotate accurately.
html_layout: |
  <div style="padding: 15px; max-width: 800px; margin: auto;">
    <div style="background: #dbeafe; border-radius: 8px; padding: 8px 12px; margin-bottom: 12px; display: inline-block;">
      <strong style="color: #1e40af;">Language:</strong>
      <span style="color: #1e3a5f;">{{language}}</span>
    </div>
    <div style="background: #f0f9ff; border: 1px solid #bae6fd; border-radius: 8px; padding: 16px; margin-bottom: 16px;">
      <strong style="color: #0369a1;">Product Review:</strong>
      <p style="font-size: 16px; line-height: 1.8; margin: 8px 0 0 0;">{{text}}</p>
    </div>
  </div>
allow_all_users: true
instances_per_annotator: 50
annotation_per_instance: 2
allow_skip: true
skip_reason_required: false
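The html_layout above pulls each item's language and text fields into {{language}} and {{text}} placeholders. As a rough illustration of that substitution (Potato's real template engine may behave differently; this is only a sketch of the placeholder idea, and the item dict here is made up):

```python
import re

# A cut-down version of the html_layout template above.
layout = '<strong>Language:</strong> <span>{{language}}</span> <p>{{text}}</p>'

# Hypothetical item in the same shape as the sample data below.
item = {"id": "indicnlp_001", "text": "Great camera.", "language": "English"}

def render(template: str, fields: dict) -> str:
    # Replace each {{name}} placeholder with the matching item field;
    # unknown names become empty strings.
    return re.sub(r"\{\{(\w+)\}\}", lambda m: str(fields.get(m.group(1), "")), template)

html = render(layout, item)
print(html)
```

This is only meant to show why every data item needs a language key in addition to the id and text keys named in item_properties.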
Sample data: sample-data.json
[
{
"id": "indicnlp_001",
"text": "This phone has an excellent camera and the battery lasts all day. Very happy with my purchase and would recommend it to anyone looking for a budget smartphone.",
"language": "English"
},
{
"id": "indicnlp_002",
"text": "Product quality is very poor. The stitching came apart within a week and the color faded after the first wash. Completely disappointed and want a refund.",
"language": "English"
}
]
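Before starting the server, it can help to sanity-check that every item carries the keys the config expects: id and text (named in item_properties) plus language (referenced by html_layout). A minimal check with the standard json module, using two items inlined in the same shape as sample-data.json:

```python
import json

# Two items in the same shape as sample-data.json above.
sample = json.loads("""
[
  {"id": "indicnlp_001",
   "text": "This phone has an excellent camera and the battery lasts all day.",
   "language": "English"},
  {"id": "indicnlp_002",
   "text": "Product quality is very poor.",
   "language": "English"}
]
""")

# id_key/text_key come from item_properties; language comes from html_layout.
REQUIRED_KEYS = {"id", "text", "language"}

for item in sample:
    missing = REQUIRED_KEYS - item.keys()
    assert not missing, f"item {item.get('id', '?')} is missing {missing}"

print(f"{len(sample)} items OK")
```

In practice you would json.load the real sample-data.json file instead of the inlined string.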
// ... and 8 more items
Get this layout
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/text/cross-lingual/indicnlp-multilingual-sa
potato start config.yaml
Found a problem or want to improve this layout?
Open an issue
Related layouts
Argument Reasoning in Civil Procedure
Legal argument reasoning task requiring annotators to answer multiple-choice questions about civil procedure by selecting the best answer and providing legal reasoning. Based on SemEval-2024 Task 5.
BRAINTEASER - Commonsense-Defying QA
Lateral thinking and commonsense-defying question answering task requiring annotators to select answers to brain teasers that defy default commonsense assumptions and provide explanations. Based on SemEval-2024 Task 9 (BRAINTEASER).
Check-COVID: Fact-Checking COVID-19 News Claims
Fact-checking COVID-19 news claims. Annotators verify claims against evidence, identify supporting/refuting spans, and provide verdicts with explanations. Based on the Check-COVID dataset targeting misinformation during the pandemic.