Text Summarization Evaluation
Rate the quality of AI-generated summaries on fluency, coherence, and faithfulness.
Configuration file: config.yaml
annotation_task_name: "Text Summarization Evaluation"
task_description: "Rate the quality of the summary compared to the source document."
task_dir: "."
port: 8000
data_files:
  - "sample-data.json"
item_properties:
  id_key: id
  text_key: source
  context_key: summary
annotation_schemes:
  - annotation_type: likert
    name: fluency
    description: "How fluent and grammatical is the summary?"
    size: 5
    min_label: "Not fluent"
    max_label: "Very fluent"
    required: true
  - annotation_type: likert
    name: coherence
    description: "How well-organized and coherent is the summary?"
    size: 5
    min_label: "Incoherent"
    max_label: "Very coherent"
    required: true
  - annotation_type: likert
    name: faithfulness
    description: "Does the summary accurately reflect the source without hallucinations?"
    size: 5
    min_label: "Unfaithful"
    max_label: "Faithful"
    required: true
  - annotation_type: text
    name: comments
    description: "Optional comments on the summary quality"
    required: false
output_annotation_dir: "output/"
output_annotation_format: "json"
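With output_annotation_format set to json, the exported ratings can be post-processed directly. As a minimal sketch of aggregating the three Likert dimensions, assuming per-item records keyed by the scheme names above (the record shape is an assumption; the actual export schema depends on your Potato version):

```python
from statistics import mean

# Hypothetical annotation records -- the real export schema depends
# on the Potato version, so adjust the keys to match your output files.
records = [
    {"id": "1", "fluency": 4, "coherence": 5, "faithfulness": 3},
    {"id": "2", "fluency": 5, "coherence": 4, "faithfulness": 5},
]

def mean_scores(records, dims=("fluency", "coherence", "faithfulness")):
    """Average each Likert dimension across all annotated items."""
    return {d: mean(r[d] for r in records) for d in dims}

print(mean_scores(records))
```

The same loop extends naturally to per-annotator breakdowns or inter-annotator agreement once multiple annotators' outputs are collected.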
Sample data: sample-data.json
[
  {
    "id": "1",
    "source": "The International Space Station (ISS) has been continuously occupied since November 2000. It serves as a microgravity and space environment research laboratory where crew members conduct experiments in biology, physics, astronomy, and other fields. The ISS is a joint project among five space agencies: NASA, Roscosmos, JAXA, ESA, and CSA.",
    "summary": "The ISS has been occupied since 2000 and serves as a research lab for experiments. It's run by five space agencies including NASA."
  },
  {
    "id": "2",
    "source": "Machine learning models require large amounts of training data to achieve good performance. Data annotation is the process of labeling data to provide ground truth for model training. High-quality annotations are essential for building reliable AI systems.",
    "summary": "ML models need lots of labeled training data. Good annotations are crucial for building reliable AI."
  }
]
Get this design
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/evaluation/text-summarization-eval
potato start config.yaml
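Before starting the server, it can help to sanity-check that every item in the data file carries the keys referenced by item_properties (id, source, summary). A minimal sketch; this helper is hypothetical and not part of Potato:

```python
import json

# Keys referenced by item_properties in config.yaml
REQUIRED_KEYS = {"id", "source", "summary"}

def invalid_items(items):
    """Return the ids of items missing any key the config expects."""
    return [item.get("id", "<no id>") for item in items
            if not REQUIRED_KEYS <= item.keys()]

# In practice, load sample-data.json; inlined here for illustration.
items = json.loads('[{"id": "1", "source": "s", "summary": "t"},'
                   ' {"id": "2", "source": "s"}]')
print(invalid_items(items))  # the second item lacks "summary"
```

An empty list means every item is well-formed; any ids printed point to records that Potato would render with missing text or context.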
Found a problem or want to improve this design? Open an issue.
Related designs
Automated Essay Scoring
Holistic and analytic scoring of student essays using a deep-neural approach to automated essay scoring (Uto, arXiv 2022). Annotators provide overall quality ratings, holistic scores on a 1-6 scale, and detailed feedback comments for educational assessment.
Coreference Resolution (OntoNotes)
Link pronouns and noun phrases to the entities they refer to in text. Based on the OntoNotes coreference annotation guidelines and CoNLL shared tasks. Identify mention spans and cluster coreferent mentions together.
FinBERT - Financial Headline Sentiment Analysis
Classify sentiment of financial news headlines as positive, negative, or neutral, based on the FinBERT model (Araci, arXiv 2019). Annotators also rate market outlook on a bearish-to-bullish scale and provide reasoning for their sentiment judgment.