Interpretable Semantic Textual Similarity
Fine-grained semantic similarity assessment between sentence pairs with span alignment, combining chunk-level annotation with graded similarity scoring. Based on SemEval-2016 Task 2.
Configuration file: config.yaml
# Interpretable Semantic Textual Similarity
# Based on Agirre et al., SemEval 2016
# Paper: https://aclanthology.org/S16-1082/
# Dataset: http://ixa2.si.ehu.eus/stswiki/
#
# This task asks annotators to assess semantic similarity between
# sentence pairs at a fine-grained level. Annotators highlight aligned
# chunks and rate the overall similarity on a 6-point scale.
#
# Span Labels:
# - Aligned Chunk: A text chunk that corresponds to content in the other sentence
#
# Similarity Scale (Likert 1-6):
# 1 = Completely Different (no semantic overlap)
# 6 = Identical Meaning (same meaning, possibly different wording)
annotation_task_name: "Interpretable Semantic Textual Similarity"
task_dir: "."
data_files:
  - sample-data.json
item_properties:
  id_key: "id"
  text_key: "text"
output_annotation_dir: "annotation_output/"
output_annotation_format: "json"
port: 8000
server_name: localhost
annotation_schemes:
  - annotation_type: span
    name: aligned_chunks
    description: "Highlight chunks in Sentence 1 that align with content in Sentence 2."
    labels:
      - "Aligned Chunk"
  - annotation_type: likert
    name: similarity_score
    description: "How semantically similar are the two sentences?"
    min_label: "Completely Different"
    max_label: "Identical Meaning"
    size: 6
annotation_instructions: |
  You will be shown two sentences. Your task is to:
  1. Highlight chunks in Sentence 1 that correspond to content in Sentence 2.
  2. Rate the overall semantic similarity between the two sentences on a 1-6 scale.
  Consider both the meaning and the information conveyed by each sentence.
html_layout: |
  <div style="padding: 15px; max-width: 800px; margin: auto;">
    <div style="background: #f0f9ff; border: 1px solid #bae6fd; border-radius: 8px; padding: 16px; margin-bottom: 12px;">
      <strong style="color: #0369a1;">Sentence 1:</strong>
      <p style="font-size: 16px; line-height: 1.7; margin: 8px 0 0 0;">{{text}}</p>
    </div>
    <div style="background: #f0fdf4; border: 1px solid #bbf7d0; border-radius: 8px; padding: 16px; margin-bottom: 16px;">
      <strong style="color: #15803d;">Sentence 2:</strong>
      <p style="font-size: 16px; line-height: 1.7; margin: 8px 0 0 0;">{{sentence_2}}</p>
    </div>
  </div>
allow_all_users: true
instances_per_annotator: 50
annotation_per_instance: 2
allow_skip: true
skip_reason_required: false
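With `annotation_per_instance: 2`, each sentence pair receives two independent similarity ratings. A quick consistency check is the mean absolute gap between the paired 1-6 scores. The sketch below assumes the ratings have already been collected from potato's JSON output into a dict keyed by item id; that input shape is an assumption for illustration, not potato's literal output format:

```python
from statistics import mean

def mean_score_gap(ratings_by_item):
    """Average absolute gap between the two 1-6 similarity ratings per item.

    `ratings_by_item` maps item id -> [score_a, score_b]; this input
    shape is an assumption, not potato's literal output layout.
    """
    gaps = [abs(a - b) for a, b in ratings_by_item.values()]
    return mean(gaps) if gaps else 0.0

# Example: one pair rated 6 and 5, another rated 1 by both annotators
print(mean_score_gap({"ists_001": [6, 5], "ists_002": [1, 1]}))  # -> 0.5
```

A gap near 0 suggests annotators apply the 6-point scale consistently; larger gaps flag pairs worth adjudicating before analysis.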
Sample data: sample-data.json
[
  {
    "id": "ists_001",
    "text": "A man is playing a guitar on stage.",
    "sentence_2": "A musician performs with a guitar in front of an audience."
  },
  {
    "id": "ists_002",
    "text": "The cat sat on the mat near the fireplace.",
    "sentence_2": "A dog was sleeping in the garden outside."
  }
]
// ... and 8 more items
Get this design
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/semeval/2016/task02-interpretable-sts
potato start config.yaml
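Before running `potato start`, it can help to check that every item in `sample-data.json` carries the keys the config references: `id` (via `id_key`), `text` (via `text_key`), and the `sentence_2` field used by `html_layout`. A minimal sketch of such a check (this script is illustrative and not part of the repository):

```python
import json

# Keys referenced by item_properties and the html_layout template
REQUIRED_KEYS = {"id", "text", "sentence_2"}

def find_malformed(items):
    """Return (index, sorted missing keys) for each incomplete item."""
    return [
        (i, sorted(REQUIRED_KEYS - set(item)))
        for i, item in enumerate(items)
        if REQUIRED_KEYS - set(item)
    ]

if __name__ == "__main__":
    with open("sample-data.json") as f:
        for i, missing in find_malformed(json.load(f)):
            print(f"item {i} is missing: {missing}")
```

Items missing `sentence_2` would render the second box of the layout empty, so catching them up front saves a confusing annotation session.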
Found a problem or want to improve this design?
Open an issue
Related designs
Semantic Textual Relatedness
Semantic textual relatedness task requiring annotators to rate the degree of semantic relatedness between sentence pairs using both a Likert scale and a continuous slider. Based on SemEval-2024 Task 1 (STR).
ESA: Error Span Annotation for Machine Translation
Error span annotation for machine translation output. Annotators identify error spans in translations, classify error types (accuracy, fluency, terminology, style), and rate severity.
LongEval: Faithfulness Evaluation for Long-form Summarization
Faithfulness evaluation of long-form summaries. Annotators identify atomic content units in summaries, check each against source documents for faithfulness, and rate overall summary quality.