Semantic Similarity

Rate the semantic similarity between pairs of sentences on a continuous scale.

Configuration file: config.yaml

# Semantic Similarity Configuration
# Rate similarity between sentence pairs

task_dir: "."
annotation_task_name: "Semantic Similarity"

data_files:
  - "data/sentence_pairs.json"

item_properties:
  id_key: "id"
  text_key: "display"
  text_display_key: "display"

user_config:
  allow_all_users: true

annotation_schemes:
  - annotation_type: "slider"
    name: "similarity_score"
    description: "How similar are these two sentences in meaning?"
    min_value: 0
    max_value: 5
    starting_value: 2.5
    step: 0.5
    min_label: "Completely different"
    max_label: "Identical meaning"
    show_value: true

  - annotation_type: "radio"
    name: "relationship_type"
    description: "What is the relationship between the sentences?"
    labels:
      - name: "Paraphrase"
        tooltip: "Same meaning, different words"
      - name: "Entailment"
        tooltip: "If A is true, B must be true"
      - name: "Partial overlap"
        tooltip: "Share some meaning but differ in details"
      - name: "Contradiction"
        tooltip: "Opposite or conflicting meanings"
      - name: "Unrelated"
        tooltip: "Different topics entirely"

  - annotation_type: "multiselect"
    name: "similarity_aspects"
    description: "In what ways are they similar?"
    labels:
      - name: "Same topic"
      - name: "Same entities mentioned"
      - name: "Same event/action"
      - name: "Same sentiment"
      - name: "Similar structure"
      - name: "Shared vocabulary"

  - annotation_type: "likert"
    name: "confidence"
    description: "How confident are you in your rating?"
    size: 5
    min_label: "Not confident"
    max_label: "Very confident"

output: "annotation_output/"

output_annotation_dir: "annotation_output/"
output_annotation_format: "json"
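The slider settings above (min_value 0, max_value 5, step 0.5) imply a fixed grid of selectable scores. The following is a minimal sketch that enumerates that grid, which can be handy when sanity-checking collected scores. It assumes the slider snaps to multiples of `step` starting from `min_value`, as a standard HTML range input does; whether Potato enforces this is an assumption, and `slider_values` is a hypothetical helper, not part of Potato.

```python
from decimal import Decimal


def slider_values(min_value, max_value, step):
    """List the scores a slider with these settings could produce.

    Hypothetical helper: assumes values snap to min_value + k * step.
    Decimal is used so that fractional steps do not accumulate float error.
    """
    lo, hi, st = (Decimal(str(v)) for v in (min_value, max_value, step))
    if st <= 0:
        raise ValueError("step must be positive")
    if hi < lo:
        raise ValueError("max_value must be >= min_value")
    count = int((hi - lo) / st)  # number of whole steps that fit in the range
    return [float(lo + i * st) for i in range(count + 1)]


# Values taken from the similarity_score scheme in config.yaml
values = slider_values(0, 5, 0.5)
starting_value = 2.5

print(len(values), "possible scores:", values)
print("starting_value on grid:", starting_value in values)
```

With these settings the scale has eleven points, which is finer than the six-point integer scale a 0 to 5 range might suggest.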

Sample data: sample-data.json

[
  {
    "id": "sim_001",
    "sentence1": "The cat sat on the mat.",
    "sentence2": "A feline was resting on the rug.",
    "display": "**Sentence 1:** The cat sat on the mat.\n\n**Sentence 2:** A feline was resting on the rug."
  },
  {
    "id": "sim_002",
    "sentence1": "The stock market crashed yesterday.",
    "sentence2": "Stock prices plummeted in yesterday's trading.",
    "display": "**Sentence 1:** The stock market crashed yesterday.\n\n**Sentence 2:** Stock prices plummeted in yesterday's trading."
  }
]

// ... and 3 more items
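In the sample items, the `display` field (the one named by `text_key` in the config) simply combines `sentence1` and `sentence2` under bold labels. The following is a minimal sketch for generating items in that shape from raw sentence pairs. The helper names `make_item` and `build_items` and the `sim_` ID scheme are illustrative assumptions modeled on the sample, not part of Potato.

```python
import json


def make_item(item_id, sentence1, sentence2):
    """Build one item in the same shape as the sample data."""
    display = f"**Sentence 1:** {sentence1}\n\n**Sentence 2:** {sentence2}"
    return {
        "id": item_id,
        "sentence1": sentence1,
        "sentence2": sentence2,
        "display": display,
    }


def build_items(pairs, prefix="sim_"):
    """Assign zero-padded sequential IDs (sim_001, sim_002, ...) to each pair."""
    return [
        make_item(f"{prefix}{i:03d}", s1, s2)
        for i, (s1, s2) in enumerate(pairs, start=1)
    ]


pairs = [
    ("The cat sat on the mat.", "A feline was resting on the rug."),
    ("The stock market crashed yesterday.",
     "Stock prices plummeted in yesterday's trading."),
]

items = build_items(pairs)

# Print the JSON; redirect it to the file listed under data_files
# (data/sentence_pairs.json in the config above) to use it.
print(json.dumps(items, indent=2))
```

Keeping `sentence1` and `sentence2` as separate fields alongside `display` preserves the raw pair for later analysis, even though only `display` is shown to annotators.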

Get this design

View on GitHub

Clone or download from the repository

Quick start:

git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/templates/text/semantic-similarity
potato start config.yaml

Details

Annotation types

slider, likert

Domain

nlp, linguistics

Use cases

similarity, paraphrase-detection

Tags

similarity, semantic, paraphrase, sentences, nlp

Related designs

Automated Essay Scoring

Holistic and analytic scoring of student essays using a deep-neural approach to automated essay scoring (Uto, arXiv 2022). Annotators provide overall quality ratings, holistic scores on a 1-6 scale, and detailed feedback comments for educational assessment.

likert, slider

Continuous Emotion Rating

Rate emotional dimensions (valence, arousal, dominance) continuously following MSP-IMPROV protocol.

slider, likert

Graded Word Similarity in Context

Rate the semantic similarity of two words in their respective contexts on a graded scale, based on SemEval-2020 Task 3 (Armendariz et al.). Annotators assess how similar word meanings are when each word appears in a specific sentence context.

likert, slider