Graded Word Similarity in Context

Rate the semantic similarity of two words, each in its own sentence context, on a graded scale. The design is based on SemEval-2020 Task 3 (Armendariz et al.). Annotators assess how similar the two word meanings are when each word appears in a specific sentence.

Configuration File: config.yaml

This Potato config reproduces the annotation task. Save it as config.yaml and run potato start config.yaml to try it.

yaml

# Graded Word Similarity in Context
# Based on Armendariz et al., SemEval 2020
# Paper: https://aclanthology.org/2020.semeval-1.3/
# Dataset: https://competitions.codalab.org/competitions/20905
#
# Annotators rate how similar the meanings of two target words are,
# given their respective sentence contexts. The task captures the
# graded, context-dependent nature of word similarity.

annotation_task_name: "Graded Word Similarity in Context"
task_dir: "."

data_files:
  - sample-data.json

item_properties:
  id_key: "id"
  text_key: "text"

output_annotation_dir: "annotation_output/"
output_annotation_format: "json"

port: 8000
server_name: localhost

annotation_schemes:
  - annotation_type: likert
    name: similarity_likert
    description: "Rate how similar the meanings of the two target words are in their respective contexts."
    min_label: "Completely Different"
    max_label: "Identical"
    size: 5

  - annotation_type: slider
    name: similarity_slider
    description: "Provide a fine-grained similarity rating on a continuous scale."
    min_value: 0
    max_value: 4
    starting_value: 2

annotation_instructions: |
  You will see two sentences, each containing a target word. Your task is to:
  1. Read both sentences carefully and pay attention to the highlighted target words.
  2. Consider how similar the meanings of the two words are IN THEIR GIVEN CONTEXTS.
  3. Rate the similarity on the Likert scale (Completely Different to Identical).
  4. Provide a fine-grained continuous rating using the slider (0 = completely different,
     4 = identical meaning).

  Important: Focus on the meaning of the words as used in these specific contexts,
  not their general dictionary definitions.

html_layout: |
  <div style="padding: 15px; max-width: 800px; margin: auto;">
    <div style="background: #f0f9ff; border: 1px solid #bae6fd; border-radius: 8px; padding: 16px; margin-bottom: 16px;">
      <strong style="color: #0369a1;">Context 1 (Target: <em>{{word_1}}</em>):</strong>
      <p style="font-size: 16px; line-height: 1.7; margin: 8px 0 0 0;">{{text}}</p>
    </div>
    <div style="background: #fefce8; border: 1px solid #fde68a; border-radius: 8px; padding: 16px; margin-bottom: 16px;">
      <strong style="color: #a16207;">Context 2 (Target: <em>{{word_2}}</em>):</strong>
      <p style="font-size: 16px; line-height: 1.7; margin: 8px 0 0 0;">{{context_2}}</p>
    </div>
  </div>

allow_all_users: true
instances_per_annotator: 50
annotation_per_instance: 2
allow_skip: true
skip_reason_required: false
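
Because the config collects both a 5-point Likert rating and a 0 to 4 slider value for each item, the two ratings can be cross-checked after annotation. The sketch below is an illustration only and is not part of Potato. It assumes the Likert choice is available as an integer from 1 to 5 and maps it onto the slider range by subtracting 1. The function names and the tolerance of 1.0 are hypothetical.

```python
def likert_to_slider_scale(likert: int, size: int = 5) -> float:
    """Map a 1..size Likert choice onto the 0..(size - 1) slider range.

    Assumed mapping: 1 ("Completely Different") -> 0, 5 ("Identical") -> 4.
    """
    if not 1 <= likert <= size:
        raise ValueError(f"Likert rating must be in 1..{size}, got {likert}")
    return float(likert - 1)


def ratings_disagree(likert: int, slider: float, tolerance: float = 1.0) -> bool:
    """Return True when the slider value lies more than `tolerance`
    away from the Likert choice after mapping both to the 0..4 range."""
    if not 0.0 <= slider <= 4.0:
        raise ValueError(f"Slider value must be in 0..4, got {slider}")
    return abs(likert_to_slider_scale(likert) - slider) > tolerance


if __name__ == "__main__":
    # A Likert choice of 1 with a slider near 0 is consistent.
    print(ratings_disagree(1, 0.3))
    # A Likert choice of 5 with a slider at 1.5 is flagged for review.
    print(ratings_disagree(5, 1.5))
```

Flagged pairs are not necessarily errors; they are simply candidates for a second look, and the tolerance should be tuned to the project.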

Sample Data: sample-data.json

json

[
  {
    "id": "gws_001",
    "text": "The bank of the river was covered in wildflowers during spring.",
    "context_2": "She deposited her savings at the bank downtown.",
    "word_1": "bank",
    "word_2": "bank"
  },
  {
    "id": "gws_002",
    "text": "The bright star was visible in the night sky even from the city center.",
    "context_2": "The movie star arrived at the premiere in a luxury limousine.",
    "word_1": "star",
    "word_2": "star"
  }
]

// ... and 8 more items
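
Before starting the server, it can help to confirm that every item carries the fields the layout references (text, context_2, word_1, word_2) plus the id key. The following is a minimal sketch of such a check, not a Potato feature. The helper name find_item_problems is hypothetical, and the target-word test is a simple case-insensitive substring match, so it would miss inflected forms.

```python
import json

# Fields used by item_properties and html_layout in the config above.
REQUIRED_KEYS = ("id", "text", "context_2", "word_1", "word_2")


def find_item_problems(items):
    """Return a list of human-readable problems found in the data items."""
    problems = []
    for index, item in enumerate(items):
        item_id = item.get("id", f"<item {index}>")
        missing = [key for key in REQUIRED_KEYS if key not in item]
        if missing:
            problems.append(f"{item_id}: missing keys {missing}")
            continue
        # Each target word should occur in its own context sentence.
        for word_key, context_key in (("word_1", "text"), ("word_2", "context_2")):
            if item[word_key].lower() not in item[context_key].lower():
                problems.append(
                    f"{item_id}: {word_key} '{item[word_key]}' not found in {context_key}"
                )
    return problems


# The two items shown in the sample above.
SAMPLE = json.loads("""
[
  {"id": "gws_001",
   "text": "The bank of the river was covered in wildflowers during spring.",
   "context_2": "She deposited her savings at the bank downtown.",
   "word_1": "bank", "word_2": "bank"},
  {"id": "gws_002",
   "text": "The bright star was visible in the night sky even from the city center.",
   "context_2": "The movie star arrived at the premiere in a luxury limousine.",
   "word_1": "star", "word_2": "star"}
]
""")

if __name__ == "__main__":
    for problem in find_item_problems(SAMPLE):
        print(problem)
```

To check the full file instead of the inline sample, load it with json.load on an open handle to sample-data.json and pass the resulting list to the same function.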

Get This Design

Clone or download from the repository

Quick start:

git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/semeval/2020/task03-graded-word-similarity
potato start config.yaml

Dataset & paper

Armendariz et al., SemEval 2020

Official dataset: https://competitions.codalab.org/competitions/20905
Paper: https://aclanthology.org/2020.semeval-1.3/

Citation (BibTeX)

bibtex

@inproceedings{armendariz-etal-2020-semeval,
    title = "{S}em{E}val-2020 {T}ask 3: {G}raded {W}ord {S}imilarity in {C}ontext",
    author = "Armendariz, Carlos S. and Purver, Matthew and Pollak, Senja and Ljube{\v{s}}i{\'c}, Nikola and Ul{\v{c}}ar, Matej and Vuli{\'c}, Ivan and Pilehvar, Mohammad Taher",
    booktitle = "Proceedings of the Fourteenth Workshop on Semantic Evaluation",
    year = "2020",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.semeval-1.3"
}

Details

Annotation Types

likert, slider

Domain

NLP, SemEval

Use Cases

Word Similarity, Lexical Semantics, Contextualized Meaning

Related Designs

Lexical Complexity Prediction

Predict the complexity of words in context using both Likert scale and continuous slider ratings, based on SemEval-2021 Task 1 (Shardlow et al.). Annotators assess how difficult a target word is for a non-native English speaker to understand.

likert, slider

Multilingual Semantic Word Similarity

Graded word similarity judgment across multiple languages, based on SemEval-2017 Task 2. Annotators rate how semantically similar two words are on a continuous scale, supporting cross-lingual evaluation of distributional semantic models.

likert, slider

Semantic Textual Relatedness

Semantic textual relatedness task requiring annotators to rate the degree of semantic relatedness between sentence pairs using both a Likert scale and a continuous slider. Based on SemEval-2024 Task 1 (STR).

likert, slider
