Multilingual Word-in-Context

A word sense disambiguation task, in multilingual and cross-lingual settings, in which annotators decide whether a target word is used in the same sense in two different contexts. Based on SemEval-2021 Task 2 (Martelli et al.).

Configuration File: config.yaml

yaml

# Multilingual Word-in-Context
# Based on Martelli et al., SemEval 2021
# Paper: https://aclanthology.org/2021.semeval-1.3/
# Dataset: https://github.com/SapienzaNLP/xl-wic
#
# Annotators decide whether a target word carries the same meaning
# in two different sentence contexts. The task supports multilingual
# and cross-lingual settings.

annotation_task_name: "Multilingual Word-in-Context"
task_dir: "."

data_files:
  - sample-data.json

item_properties:
  id_key: "id"
  text_key: "text"

output_annotation_dir: "annotation_output/"
output_annotation_format: "json"

port: 8000
server_name: localhost

annotation_schemes:
  - annotation_type: radio
    name: sense_match
    description: "Does the target word have the same sense in both contexts?"
    labels:
      - "Same Sense"
      - "Different Sense"
    keyboard_shortcuts:
      "Same Sense": "1"
      "Different Sense": "2"
    tooltips:
      "Same Sense": "The target word is used with the same meaning in both contexts"
      "Different Sense": "The target word is used with different meanings in the two contexts"

annotation_instructions: |
  You will see a target word used in two different sentence contexts. Your task
  is to determine whether the target word carries the same meaning (sense) in both
  sentences.

  1. Read both contexts carefully.
  2. Focus on the meaning of the target word in each sentence.
  3. Select "Same Sense" if the word has the same meaning in both contexts.
  4. Select "Different Sense" if the word has different meanings.

  Example:
  - "bank" in "I went to the bank to deposit money" vs "The river bank was muddy"
    -> Different Sense

html_layout: |
  <div style="padding: 15px; max-width: 800px; margin: auto;">
    <div style="background: #fefce8; border: 1px solid #fde68a; border-radius: 8px; padding: 16px; margin-bottom: 16px; text-align: center;">
      <strong style="color: #a16207;">Target Word:</strong>
      <span style="font-size: 20px; font-weight: bold; color: #b45309;">{{target_word}}</span>
      <span style="margin-left: 12px; color: #78716c;">({{language}})</span>
    </div>
    <div style="background: #f0f9ff; border: 1px solid #bae6fd; border-radius: 8px; padding: 16px; margin-bottom: 12px;">
      <strong style="color: #0369a1;">Context 1:</strong>
      <p style="font-size: 16px; line-height: 1.7; margin: 8px 0 0 0;">{{text}}</p>
    </div>
    <div style="background: #f0fdf4; border: 1px solid #bbf7d0; border-radius: 8px; padding: 16px; margin-bottom: 16px;">
      <strong style="color: #15803d;">Context 2:</strong>
      <p style="font-size: 16px; line-height: 1.7; margin: 8px 0 0 0;">{{context_2}}</p>
    </div>
  </div>

allow_all_users: true
instances_per_annotator: 50
annotation_per_instance: 2
allow_skip: true
skip_reason_required: false
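
With `annotation_per_instance: 2`, each item receives two `sense_match` labels, so inter-annotator agreement can be checked once annotation is done. Below is a minimal sketch of Cohen's kappa in plain Python. It assumes the same two annotators labeled every item in the same order; the label lists are hypothetical stand-ins, because the layout of the files written to `annotation_output/` is not shown here, and `cohen_kappa` is an illustrative helper, not part of Potato.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items in the same order."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: fraction of items where the two labels are identical.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, estimated from each annotator's own label distribution.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    all_labels = set(counts_a) | set(counts_b)
    expected = sum(counts_a[label] * counts_b[label] for label in all_labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label for every item
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for four items (not taken from the dataset).
annotator_1 = ["Same Sense", "Different Sense", "Different Sense", "Same Sense"]
annotator_2 = ["Same Sense", "Different Sense", "Same Sense", "Same Sense"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.2f}")  # kappa = 0.50
```

Here the annotators agree on 3 of 4 items (0.75) against a chance agreement of 0.50, which gives a kappa of 0.50. If different annotator pairs cover different items, a multi-rater measure would be a better fit than this pairwise one.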

Sample Data: sample-data.json

json

[
  {
    "id": "wic_001",
    "text": "The bank approved my loan application after reviewing my credit history.",
    "context_2": "We sat on the bank of the river and watched the sunset.",
    "target_word": "bank",
    "language": "English"
  },
  {
    "id": "wic_002",
    "text": "She decided to run for president of the student council.",
    "context_2": "He went for a run in the park every morning before work.",
    "target_word": "run",
    "language": "English"
  }
]

// ... and 8 more items
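
The `html_layout` template refers to item fields by name (`{{target_word}}`, `{{language}}`, `{{text}}`, `{{context_2}}`), so it can be worth checking the data file for missing keys before starting the server. The following is a minimal sketch, assuming placeholders are plain `{{name}}` tokens. `missing_fields` is a hypothetical helper rather than a Potato function, the layout string is abbreviated, and the second item is an invented, deliberately incomplete record.

```python
import json
import re

# Abbreviated copy of the html_layout template from config.yaml.
LAYOUT = """
<span>{{target_word}}</span> <span>({{language}})</span>
<p>{{text}}</p>
<p>{{context_2}}</p>
"""

# One real-shaped item plus one hypothetical item that lacks two fields.
SAMPLE_JSON = """
[
  {"id": "wic_001",
   "text": "The bank approved my loan application after reviewing my credit history.",
   "context_2": "We sat on the bank of the river and watched the sunset.",
   "target_word": "bank",
   "language": "English"},
  {"id": "wic_broken", "text": "A deliberately incomplete item.", "target_word": "item"}
]
"""


def missing_fields(layout, items, id_key="id"):
    """Map each item id to the sorted list of required keys it does not define."""
    # Required keys: every {{name}} placeholder in the layout, plus the id key.
    required = set(re.findall(r"\{\{\s*(\w+)\s*\}\}", layout))
    required.add(id_key)
    problems = {}
    for index, item in enumerate(items):
        absent = sorted(required - set(item))
        if absent:
            problems[item.get(id_key, f"<item {index}>")] = absent
    return problems


items = json.loads(SAMPLE_JSON)
for item_id, absent in missing_fields(LAYOUT, items).items():
    print(f"{item_id}: missing {', '.join(absent)}")
```

Run against these two items, the check reports only `wic_broken`, which lacks `context_2` and `language`. Pointing `json.loads` at the contents of the real `sample-data.json` and passing the full layout string would apply the same check to the whole file.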

Get This Design

Clone or download from the repository

Quick start:

git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/semeval/2021/task02-crosslingual-wic
potato start config.yaml

Details

Annotation Types

radio

Domain

NLP, SemEval

Use Cases

Word Sense Disambiguation, Cross-lingual NLP

Related Designs

ADMIRE - Multimodal Idiomaticity Recognition

Multimodal idiomaticity detection task requiring annotators to identify whether expressions are used idiomatically or literally, with supporting cue analysis. Based on SemEval-2025 Task 1 (ADMIRE).

radio, multiselect

AfriSenti - African Language Sentiment

Sentiment analysis for tweets in African languages, classifying text as positive, negative, or neutral. Covers 14 African languages including Amharic, Hausa, Igbo, Yoruba, and Swahili. Based on SemEval-2023 Task 12 (Muhammad et al.).