Semantic Textual Relatedness

A semantic textual relatedness task in which annotators rate how semantically related two sentences are, using both a 5-point Likert scale and a continuous slider. Based on SemEval-2024 Task 1 (STR).

Configuration File: config.yaml

This Potato config reproduces the annotation task. Save it as config.yaml and run potato start config.yaml to try it.

yaml

# Semantic Textual Relatedness
# Based on Abdalla et al., SemEval 2024
# Paper: https://aclanthology.org/volumes/2024.semeval-1/
# Dataset: https://github.com/semantic-textual-relatedness/Semantic_Relatedness_SemEval2024
#
# This task asks annotators to rate how semantically related two sentences
# are, using both a discrete Likert scale and a continuous slider for
# fine-grained judgments.

annotation_task_name: "Semantic Textual Relatedness"
task_dir: "."

data_files:
  - sample-data.json

item_properties:
  id_key: "id"
  text_key: "text"

output_annotation_dir: "annotation_output/"
output_annotation_format: "json"

port: 8000
server_name: localhost

annotation_schemes:
  - annotation_type: likert
    name: relatedness_likert
    description: "Rate the semantic relatedness of the two sentences."
    min_label: "Completely Unrelated"
    max_label: "Maximally Related"
    size: 5

  - annotation_type: slider
    name: relatedness_slider
    description: "Provide a fine-grained relatedness score."
    min_value: 0
    max_value: 1
    starting_value: 0.5

annotation_instructions: |
  You will be shown two sentences. Your task is to:
  1. Read both sentences carefully.
  2. Rate how semantically related the two sentences are on a 5-point Likert scale.
  3. Use the slider to provide a fine-grained relatedness score between 0 and 1.
  Note: Relatedness is broader than similarity. Two sentences can be related without
  being similar (e.g., cause and effect, or part and whole).

html_layout: |
  <div style="padding: 15px; max-width: 800px; margin: auto;">
    <div style="background: #f0f9ff; border: 1px solid #bae6fd; border-radius: 8px; padding: 16px; margin-bottom: 16px;">
      <strong style="color: #0369a1;">Sentence 1:</strong>
      <p style="font-size: 16px; line-height: 1.7; margin: 8px 0 0 0;">{{text}}</p>
    </div>
    <div style="background: #fefce8; border: 1px solid #fde68a; border-radius: 8px; padding: 16px; margin-bottom: 16px;">
      <strong style="color: #a16207;">Sentence 2:</strong>
      <p style="font-size: 16px; line-height: 1.7; margin: 8px 0 0 0;">{{sentence_2}}</p>
    </div>
    <div style="background: #f0fdf4; border: 1px solid #bbf7d0; border-radius: 8px; padding: 12px;">
      <strong style="color: #166534;">Language:</strong> <span>{{language}}</span>
    </div>
  </div>

allow_all_users: true
instances_per_annotator: 50
annotation_per_instance: 2
allow_skip: true
skip_reason_required: false
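
The config collects two ratings per sentence pair (a 5-point Likert rating and a 0 to 1 slider value) from two annotators per instance. A minimal Python sketch of how those ratings might be put on a common scale and averaged is shown here. It assumes Likert ratings are stored as integers 1 to 5; the function names and the example ratings are hypothetical and are not part of Potato or of the SemEval task.

```python
from statistics import mean

LIKERT_SIZE = 5  # matches `size: 5` in the config


def likert_to_unit(rating: int, size: int = LIKERT_SIZE) -> float:
    """Map a 1..size Likert rating linearly onto [0, 1]."""
    if not 1 <= rating <= size:
        raise ValueError(f"rating must be in 1..{size}, got {rating}")
    return (rating - 1) / (size - 1)


def aggregate(ratings: list[float]) -> float:
    """Average the per-annotator scores for one sentence pair."""
    return mean(ratings)


# Hypothetical ratings from two annotators for one item (illustrative values only).
likert_scores = [likert_to_unit(r) for r in (4, 5)]
slider_scores = [0.7, 0.9]

item_likert = aggregate(likert_scores)
item_slider = aggregate(slider_scores)
print(f"likert (rescaled): {item_likert:.3f}, slider: {item_slider:.3f}")
```

A linear rescaling is only one option. It treats the Likert points as evenly spaced, which may not hold for every annotator.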

Sample Data: sample-data.json

json

[
  {
    "id": "str_001",
    "text": "The cat sat on the warm windowsill watching the birds outside.",
    "sentence_2": "A kitten perched on the window ledge, gazing at the pigeons in the garden.",
    "language": "English"
  },
  {
    "id": "str_002",
    "text": "Heavy rainfall caused widespread flooding in the coastal region.",
    "sentence_2": "Residents were evacuated from their homes due to rising water levels.",
    "language": "English"
  }
]

// ... and 8 more items
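
The layout in config.yaml reads the fields id, text, sentence_2, and language from each record, so it can help to check a data file before starting the server. The following is a rough Python sketch of such a check, run here on the two records shown above. The validate_items helper is hypothetical and is not part of Potato.

```python
import json

# Fields referenced by item_properties and html_layout in config.yaml.
REQUIRED_KEYS = {"id", "text", "sentence_2", "language"}


def validate_items(items):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    seen_ids = set()
    for index, item in enumerate(items):
        missing = REQUIRED_KEYS - set(item)
        if missing:
            problems.append(f"item {index}: missing keys {sorted(missing)}")
        item_id = item.get("id")
        if item_id in seen_ids:
            problems.append(f"item {index}: duplicate id {item_id!r}")
        seen_ids.add(item_id)
    return problems


sample = json.loads("""
[
  {"id": "str_001",
   "text": "The cat sat on the warm windowsill watching the birds outside.",
   "sentence_2": "A kitten perched on the window ledge, gazing at the pigeons in the garden.",
   "language": "English"},
  {"id": "str_002",
   "text": "Heavy rainfall caused widespread flooding in the coastal region.",
   "sentence_2": "Residents were evacuated from their homes due to rising water levels.",
   "language": "English"}
]
""")

problems = validate_items(sample)
print(problems or "no problems found")
```

To check the full file instead, the inline string could be replaced with the result of json.load on sample-data.json.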

Get This Design

Clone or download from the GitHub repository.

Quick start:

git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/semeval/2024/task01-semantic-relatedness
potato start config.yaml

Dataset & paper

Abdalla et al., SemEval 2024

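Official dataset: https://github.com/semantic-textual-relatedness/Semantic_Relatedness_SemEval2024
Paper: https://aclanthology.org/volumes/2024.semeval-1/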

Citation (BibTeX)

bibtex

@inproceedings{abdalla-etal-2024-str,
    title = "Semantic Textual Relatedness",
    author = "Abdalla, Mohamed and others",
    booktitle = "Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)",
    year = "2024",
    publisher = "Association for Computational Linguistics"
}

Details

Annotation Types

likert, slider

Domain

SemEval, NLP, Semantic Similarity, Multilingual

Use Cases

Semantic Relatedness, Sentence Similarity, Cross-Lingual NLP

Related Designs

Graded Word Similarity in Context

Rate the semantic similarity of two words in their respective contexts on a graded scale, based on SemEval-2020 Task 3 (Armendariz et al.). Annotators assess how similar word meanings are when each word appears in a specific sentence context.

likert, slider

Lexical Complexity Prediction

Predict the complexity of words in context using both Likert scale and continuous slider ratings, based on SemEval-2021 Task 1 (Shardlow et al.). Annotators assess how difficult a target word is for a non-native English speaker to understand.

likert, slider

Multilingual Semantic Word Similarity

Graded word similarity judgment across multiple languages, based on SemEval-2017 Task 2. Annotators rate how semantically similar two words are on a continuous scale, supporting cross-lingual evaluation of distributional semantic models.

likert, slider
