beginner, text
Semantic Similarity
Rate the semantic similarity between pairs of sentences on a continuous scale.
text annotation
Configuration File: config.yaml
# Semantic Similarity Configuration
# Rate similarity between sentence pairs
annotation_task_name: "Semantic Similarity"
data_files:
  - "data/sentence_pairs.json"
item_properties:
  id_key: "id"
  text_display_key: "display"
user_config:
  allow_all_users: true
annotation_schemes:
  - annotation_type: "slider"
    name: "similarity_score"
    description: "How similar are these two sentences in meaning?"
    min: 0
    max: 5
    step: 0.5
    min_label: "Completely different"
    max_label: "Identical meaning"
    show_value: true
  - annotation_type: "radio"
    name: "relationship_type"
    description: "What is the relationship between the sentences?"
    labels:
      - name: "Paraphrase"
        tooltip: "Same meaning, different words"
      - name: "Entailment"
        tooltip: "If A is true, B must be true"
      - name: "Partial overlap"
        tooltip: "Share some meaning but differ in details"
      - name: "Contradiction"
        tooltip: "Opposite or conflicting meanings"
      - name: "Unrelated"
        tooltip: "Different topics entirely"
  - annotation_type: "multiselect"
    name: "similarity_aspects"
    description: "In what ways are they similar?"
    labels:
      - name: "Same topic"
      - name: "Same entities mentioned"
      - name: "Same event/action"
      - name: "Same sentiment"
      - name: "Similar structure"
      - name: "Shared vocabulary"
  - annotation_type: "likert"
    name: "confidence"
    description: "How confident are you in your rating?"
    size: 5
    min_label: "Not confident"
    max_label: "Very confident"
output: "annotation_output/"
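The slider scheme in this configuration runs from `min: 0` to `max: 5` in steps of `0.5`. As a rough illustration, the Python sketch below enumerates the positions such a slider could take and checks whether a recorded score falls on that grid. The helper names `slider_positions` and `is_valid_score` are hypothetical and not part of Potato; how Potato itself stores or validates slider values is not shown on this page.

```python
from decimal import Decimal


def slider_positions(minimum, maximum, step):
    """List every value a slider with these settings could take."""
    # Decimal avoids floating-point drift when repeatedly adding the step.
    lo, hi, inc = (Decimal(str(v)) for v in (minimum, maximum, step))
    positions = []
    current = lo
    while current <= hi:
        positions.append(float(current))
        current += inc
    return positions


def is_valid_score(score, minimum=0, maximum=5, step=0.5):
    """Check that a score lies on the slider grid (defaults mirror the config)."""
    return float(score) in slider_positions(minimum, maximum, step)


positions = slider_positions(0, 5, 0.5)
print(len(positions))        # 11 positions: 0.0, 0.5, ..., 5.0
print(is_valid_score(3.5))   # True
print(is_valid_score(3.2))   # False
```

A check like this could be run over exported annotations to catch scores that do not match the configured scale, assuming the export exposes the raw slider value.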
Sample Data: sample-data.json
[
  {
    "id": "sim_001",
    "sentence1": "The cat sat on the mat.",
    "sentence2": "A feline was resting on the rug.",
    "display": "**Sentence 1:** The cat sat on the mat.\n\n**Sentence 2:** A feline was resting on the rug."
  },
  {
    "id": "sim_002",
    "sentence1": "The stock market crashed yesterday.",
    "sentence2": "Stock prices plummeted in yesterday's trading.",
    "display": "**Sentence 1:** The stock market crashed yesterday.\n\n**Sentence 2:** Stock prices plummeted in yesterday's trading."
  }
]
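Each sample record carries a `display` string that combines both sentences, and the configuration points `text_display_key` at that field. A minimal Python sketch of how such records might be generated from raw sentence pairs follows; `make_item` is a hypothetical helper, and the `sim_NNN` ID pattern is simply copied from the two sample records.

```python
import json


def make_item(item_id, sentence1, sentence2):
    """Build one record with the same fields as the sample data."""
    display = f"**Sentence 1:** {sentence1}\n\n**Sentence 2:** {sentence2}"
    return {
        "id": item_id,
        "sentence1": sentence1,
        "sentence2": sentence2,
        "display": display,
    }


pairs = [
    ("The cat sat on the mat.", "A feline was resting on the rug."),
    ("The stock market crashed yesterday.",
     "Stock prices plummeted in yesterday's trading."),
]

# IDs follow the sim_001, sim_002, ... pattern seen in the sample records.
items = [
    make_item(f"sim_{i:03d}", s1, s2)
    for i, (s1, s2) in enumerate(pairs, start=1)
]

# Serialize as a JSON array; this string can be written to the data file.
payload = json.dumps(items, indent=2, ensure_ascii=False)
print(payload)
```

Writing `payload` to the path listed under `data_files` would give the task its input, assuming the file is expected to hold a single JSON array as in the sample.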
// ... and 3 more items
Get This Design
View on GitHub
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/semantic-similarity
potato start config.yaml
Details
Annotation Types
slider, likert
Domain
nlp, linguistics
Use Cases
similarity, paraphrase-detection
Tags
similarity, semantic, paraphrase, sentences, nlp
Found an issue or want to improve this design?
Open an Issue
Related Designs
Continuous Emotion Rating
Rate emotional dimensions (valence, arousal, dominance) continuously following MSP-IMPROV protocol.
slider, likert
Acoustic Scene Classification
Classify audio recordings by acoustic environment following the TUT/DCASE dataset format.
radio, likert
Argument Quality Assessment
Multi-dimensional argument quality annotation based on the Wachsmuth et al. (2017) taxonomy. Rates arguments on three dimensions: Cogency (logical validity), Effectiveness (persuasive power), and Reasonableness (contribution to resolution). Used in Dagstuhl-ArgQuality and GAQCorpus datasets.
likert, radio