Machine Translation Evaluation
Evaluate machine translation quality with adequacy and fluency ratings.
Configuration file: config.yaml
annotation_task_name: "Machine Translation Evaluation"
task_description: "Evaluate the quality of the machine translation."
task_dir: "."
port: 8000
data_files:
  - "sample-data.json"
item_properties:
  id_key: id
  text_key: source
  context_key: translation
annotation_schemes:
  - annotation_type: likert
    name: adequacy
    description: "How much of the source meaning is preserved in the translation?"
    size: 5
    min_label: "None"
    max_label: "All"
    required: true
  - annotation_type: likert
    name: fluency
    description: "How fluent is the translation in the target language?"
    size: 5
    min_label: "Incomprehensible"
    max_label: "Flawless"
    required: true
  - annotation_type: multiselect
    name: errors
    description: "Select any errors present in the translation"
    labels:
      - "Mistranslation"
      - "Omission"
      - "Addition"
      - "Grammar error"
      - "Word order"
      - "Terminology"
    required: false
output_annotation_dir: "output/"
output_annotation_format: "json"
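Once annotations are collected, the JSON files in output/ can be aggregated into per-item scores. The exact output schema depends on your Potato version, so the sketch below rests on an assumption: each record carries the item "id" plus likert values keyed by the scheme names ("adequacy", "fluency") from the config above. Check your actual output files before relying on it.

```python
# Sketch: average adequacy/fluency per item from annotation output.
# ASSUMPTION: each record in output/*.json looks like
#   {"id": "1", "adequacy": 4, "fluency": 5, ...}
# -- verify against your Potato version's real output schema.
import json
from collections import defaultdict
from pathlib import Path
from statistics import mean

def summarize(output_dir="output/"):
    """Return {item_id: {"adequacy": mean, "fluency": mean}}."""
    scores = defaultdict(lambda: {"adequacy": [], "fluency": []})
    for path in Path(output_dir).glob("*.json"):
        for record in json.loads(path.read_text(encoding="utf-8")):
            item = scores[record["id"]]
            for scheme in ("adequacy", "fluency"):
                if scheme in record:
                    item[scheme].append(int(record[scheme]))
    # Drop schemes with no ratings, average the rest.
    return {
        item_id: {k: mean(v) for k, v in vals.items() if v}
        for item_id, vals in scores.items()
    }
```

With multiple annotators (one output file each), ratings for the same item id are pooled before averaging.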
Sample data: sample-data.json
[
  {
    "id": "1",
    "source": "El gato negro duerme en el sofá.",
    "source_lang": "Spanish",
    "target_lang": "English",
    "translation": "The black cat sleeps on the couch."
  },
  {
    "id": "2",
    "source": "Je voudrais réserver une table pour deux personnes.",
    "source_lang": "French",
    "target_lang": "English",
    "translation": "I would like to book a table for two people."
  }
]

Get this design
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/evaluation/machine-translation-eval
potato start config.yaml
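Before starting the server, it can help to confirm that every item in sample-data.json carries the fields that item_properties in config.yaml points at (id, source, translation). A minimal stdlib-only sketch:

```python
# Sketch: validate sample-data.json against the keys referenced by
# item_properties in config.yaml (id_key, text_key, context_key).
import json

REQUIRED_KEYS = {"id", "source", "translation"}

def check_items(path="sample-data.json"):
    """Return a list of (index, missing_keys) pairs; empty means OK."""
    with open(path, encoding="utf-8") as f:
        items = json.load(f)
    problems = []
    for i, item in enumerate(items):
        missing = REQUIRED_KEYS - item.keys()
        if missing:
            problems.append((i, sorted(missing)))
    return problems
```

Running this against the sample file above should return an empty list; any tuple in the result names an item that Potato would fail to render fully.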
Found a problem or want to improve this design?
Open an issue

Related designs
AnnoMI Counselling Dialogue Annotation
Annotation of motivational interviewing counselling dialogues based on the AnnoMI dataset. Annotators label therapist and client utterances for MI techniques (open questions, reflections, affirmations) and client change talk (sustain talk, change talk), with quality ratings for therapeutic interactions.
Clickbait Detection (Webis Clickbait Corpus)
Classify headlines and social media posts as clickbait or non-clickbait based on the Webis Clickbait Corpus. Identify manipulative content designed to attract clicks through sensationalism, curiosity gaps, or misleading framing.
Conversation Quality Attributes
Dialogue quality assessment based on controllable dialogue generation research (See et al., NAACL 2019). Annotators evaluate conversation turns for engagement quality, rate overall conversation quality, and identify specific dialogue attributes.