Machine Translation Evaluation
Evaluate machine translation quality with adequacy and fluency ratings.
survey annotation
Configuration File (config.yaml)
task_name: "Machine Translation Evaluation"
task_description: "Evaluate the quality of the machine translation."
task_dir: "."
port: 8000
data_files:
  - "sample-data.json"
item_properties:
  id_key: id
  text_key: source
  context_key: translation
annotation_schemes:
  - annotation_type: likert
    name: adequacy
    description: "How much of the source meaning is preserved in the translation?"
    size: 5
    min_label: "None"
    max_label: "All"
    required: true
  - annotation_type: likert
    name: fluency
    description: "How fluent is the translation in the target language?"
    size: 5
    min_label: "Incomprehensible"
    max_label: "Flawless"
    required: true
  - annotation_type: multiselect
    name: errors
    description: "Select any errors present in the translation"
    labels:
      - "Mistranslation"
      - "Omission"
      - "Addition"
      - "Grammar error"
      - "Word order"
      - "Terminology"
    required: false
output_annotation_dir: "output/"
output_annotation_format: "json"
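Since output_annotation_format is "json", the collected annotations can be post-processed with standard tooling. The sketch below assumes a simplified record shape (one dict per annotator judgment carrying the item id plus the three scheme values defined above); Potato's actual output schema may differ, so treat the field names here as illustrative.

```python
from collections import Counter
from statistics import mean

# Hypothetical annotation records -- the real Potato output layout may differ.
# Each record pairs an item id with the adequacy/fluency Likert scores (1-5)
# and the multiselect error labels chosen by one annotator.
annotations = [
    {"id": "1", "adequacy": 5, "fluency": 5, "errors": []},
    {"id": "2", "adequacy": 4, "fluency": 5, "errors": ["Terminology"]},
    {"id": "2", "adequacy": 3, "fluency": 4, "errors": ["Terminology", "Word order"]},
]

def summarize(records):
    """Average the Likert scores and tally error labels per item."""
    by_item = {}
    for r in records:
        by_item.setdefault(r["id"], []).append(r)
    return {
        item_id: {
            "adequacy": mean(r["adequacy"] for r in recs),
            "fluency": mean(r["fluency"] for r in recs),
            "errors": Counter(e for r in recs for e in r["errors"]),
        }
        for item_id, recs in by_item.items()
    }

print(summarize(annotations))
```

This kind of aggregation (mean score per dimension, error-label counts) is a common first step before computing inter-annotator agreement on the adequacy and fluency scales.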
Sample Data (sample-data.json)
[
  {
    "id": "1",
    "source": "El gato negro duerme en el sofá.",
    "source_lang": "Spanish",
    "target_lang": "English",
    "translation": "The black cat sleeps on the couch."
  },
  {
    "id": "2",
    "source": "Je voudrais réserver une table pour deux personnes.",
    "source_lang": "French",
    "target_lang": "English",
    "translation": "I would like to book a table for two people."
  }
]

Get This Design
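Each data item must carry the keys that item_properties maps to (id, source, translation). A quick sanity check like the sketch below, with the sample data inlined for self-containment, can catch malformed items before starting the server; the missing_keys helper is illustrative, not part of Potato.

```python
import json

# Inline copy of sample-data.json; in practice you would json.load() the file.
sample_data = json.loads("""
[
  {"id": "1", "source": "El gato negro duerme en el sofá.",
   "source_lang": "Spanish", "target_lang": "English",
   "translation": "The black cat sleeps on the couch."},
  {"id": "2", "source": "Je voudrais réserver une table pour deux personnes.",
   "source_lang": "French", "target_lang": "English",
   "translation": "I would like to book a table for two people."}
]
""")

# The keys that the config's id_key, text_key, and context_key point at.
REQUIRED_KEYS = {"id", "source", "translation"}

def missing_keys(items):
    """Return (item index, missing key set) pairs for malformed items."""
    return [(i, REQUIRED_KEYS - item.keys())
            for i, item in enumerate(items)
            if not REQUIRED_KEYS <= item.keys()]

assert missing_keys(sample_data) == [], "every item must carry the configured keys"
```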
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/machine-translation-eval
potato start config.yaml
Found an issue or want to improve this design? Open an Issue

Related Designs
Emotion Detection (SemEval-2018 Task 1)
Multi-label emotion classification with intensity ratings based on SemEval-2018 Task 1. Annotate text for emotions (anger, anticipation, disgust, fear, joy, love, optimism, pessimism, sadness, surprise, trust) with intensity scales.
Toxicity Detection
Multi-label toxicity classification with severity ratings for content moderation.
Argument Quality Assessment
Multi-dimensional argument quality annotation based on the Wachsmuth et al. (2017) taxonomy. Rates arguments on three dimensions: Cogency (logical validity), Effectiveness (persuasive power), and Reasonableness (contribution to resolution). Used in Dagstuhl-ArgQuality and GAQCorpus datasets.