FActScore: Fine-grained Atomic Evaluation of Factual Precision
Atomic fact evaluation in LLM-generated text. Annotators decompose generated text into atomic facts and verify each fact as supported, not-supported, or irrelevant against a reference source. Based on the FActScore framework for evaluating factual precision in long-form text generation.
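Before verification, generated text must be broken into atomic facts. The FActScore paper uses an LLM for this decomposition; as a very rough, hypothetical proxy for illustration only, sentence splitting gives the flavor of the step:

```python
import re

def naive_atomic_facts(paragraph):
    """Very rough stand-in for atomic-fact decomposition: split on
    sentence boundaries. FActScore itself prompts an LLM to break each
    sentence into finer-grained, independently checkable claims."""
    sentences = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [s for s in sentences if s]

text = ("Albert Einstein was born in Ulm, Germany on March 14, 1879. "
        "He received the Nobel Prize in Physics in 1921.")
print(naive_atomic_facts(text))
```

Note this heuristic keeps compound sentences intact and trips on abbreviations; it is only meant to show where atomic facts come from before annotators see them.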
Configuration File: config.yaml
# FActScore: Fine-grained Atomic Evaluation of Factual Precision
# Based on Min et al., EMNLP 2023
# Paper: https://aclanthology.org/2023.emnlp-main.741/
# Dataset: https://github.com/shmsw25/FActScore
#
# This task evaluates factual precision in LLM-generated text at the
# atomic fact level. Each item presents a generated paragraph alongside
# an extracted atomic fact and a reference source for verification.
#
# Factuality Verdicts:
# - Supported: The atomic fact is confirmed by the reference source
# - Not Supported: The atomic fact is contradicted by or absent from the reference
# - Irrelevant: The atomic fact cannot be verified (opinion, trivial, or out of scope)
#
# Annotation Guidelines:
# 1. Read the full generated paragraph for context
# 2. Focus on the specific atomic fact to be verified
# 3. Carefully check the reference source for matching information
# 4. A fact is Supported only if the reference explicitly confirms it
# 5. Minor discrepancies (e.g., wrong year, wrong location) mean Not Supported
# 6. If the fact is correct but the reference doesn't mention it, mark Not Supported
# 7. If a correction is needed, provide the corrected version
annotation_task_name: "FActScore: Atomic Factuality"
task_dir: "."
data_files:
  - sample-data.json
item_properties:
  id_key: "id"
  text_key: "text"
output_annotation_dir: "annotation_output/"
output_annotation_format: "json"
annotation_schemes:
  # Step 1: Factuality verdict
  - annotation_type: radio
    name: factuality
    description: "Is this atomic fact supported by the reference source?"
    labels:
      - "Supported"
      - "Not Supported"
      - "Irrelevant"
    keyboard_shortcuts:
      "Supported": "s"
      "Not Supported": "n"
      "Irrelevant": "i"
    tooltips:
      "Supported": "The reference source explicitly confirms this atomic fact"
      "Not Supported": "The reference source contradicts this fact or does not contain information to confirm it"
      "Irrelevant": "The fact is an opinion, trivially true, or cannot be verified against the reference"
  # Step 2: Correction (if Not Supported)
  - annotation_type: text
    name: correction
    description: "If the fact is Not Supported, provide the correct information from the reference source (leave blank if Supported or Irrelevant)"
html_layout: |
  <div style="margin-bottom: 10px; padding: 10px; background: #f5f3ff; border-left: 4px solid #8b5cf6; border-radius: 4px;">
    <strong>Generated Paragraph:</strong><br>{{text}}
  </div>
  <div style="margin-bottom: 10px; padding: 10px; background: #fef3c7; border-left: 4px solid #f59e0b; border-radius: 4px;">
    <strong>Atomic Fact to Verify:</strong> {{atomic_fact}}
  </div>
  <div style="margin-bottom: 10px; padding: 10px; background: #f0fdf4; border-left: 4px solid #22c55e; border-radius: 4px;">
    <strong>Reference Source:</strong><br>{{reference_source}}
  </div>
allow_all_users: true
instances_per_annotator: 100
annotation_per_instance: 3
allow_skip: true
skip_reason_required: false
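Once annotations are collected (three per instance, per `annotation_per_instance`), factual precision can be computed as the fraction of Supported facts among all verifiable ones. A minimal sketch, assuming verdicts have already been aggregated per item id from the JSON annotation output (the exact output layout depends on the Potato version):

```python
from collections import Counter

def factscore(annotations):
    """FActScore-style precision from per-fact verdicts.

    `annotations` maps each item id to the list of verdicts from its
    annotators. The majority verdict decides each fact; "Irrelevant"
    facts are excluded from the denominator, following Min et al. (2023).
    """
    supported = total = 0
    for item_id, verdicts in annotations.items():
        verdict, _ = Counter(verdicts).most_common(1)[0]
        if verdict == "Irrelevant":
            continue
        total += 1
        if verdict == "Supported":
            supported += 1
    return supported / total if total else 0.0

# Hypothetical aggregated verdicts for three atomic facts:
votes = {
    "factscore_001": ["Supported", "Supported", "Supported"],
    "factscore_002": ["Not Supported", "Not Supported", "Supported"],
    "factscore_003": ["Irrelevant", "Irrelevant", "Supported"],
}
print(factscore(votes))  # 1 supported of 2 counted -> 0.5
```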
Sample Data: sample-data.json
[
  {
    "id": "factscore_001",
    "text": "Albert Einstein was born in Ulm, Germany on March 14, 1879. He developed the theory of special relativity in 1905 while working as a patent clerk in Bern, Switzerland. He received the Nobel Prize in Physics in 1921 for his explanation of the photoelectric effect. He later became a professor at Princeton University in the United States.",
    "atomic_fact": "Einstein was born in Ulm, Germany on March 14, 1879.",
    "reference_source": "Albert Einstein (14 March 1879 - 18 April 1955) was a German-born theoretical physicist. He was born in Ulm, in the Kingdom of Württemberg in the German Empire. His family moved to Munich when he was an infant."
  },
  {
    "id": "factscore_002",
    "text": "Marie Curie was a Polish-born physicist who became the first woman to win a Nobel Prize. She discovered two elements, polonium and radium, during her research on radioactivity. She won Nobel Prizes in both Physics (1903) and Chemistry (1911), making her the first person to win Nobel Prizes in two different sciences.",
    "atomic_fact": "Marie Curie was the first person to win Nobel Prizes in two different sciences.",
    "reference_source": "Marie Curie (1867-1934) was a Polish-French physicist and chemist. She was awarded the Nobel Prize in Physics in 1903, shared with Pierre Curie and Henri Becquerel. In 1911, she received the Nobel Prize in Chemistry. She remains the only person to have won Nobel Prizes in two different sciences."
  }
]
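Each item must carry the fields the interface renders: `id` and `text` (named by `id_key`/`text_key`) plus the `{{atomic_fact}}` and `{{reference_source}}` placeholders used in `html_layout`. A small sketch for checking a data file before launch (the helper name is ours, not part of Potato):

```python
import json

# Fields referenced by config.yaml and its html_layout template.
REQUIRED = {"id", "text", "atomic_fact", "reference_source"}

def missing_fields(items):
    """Return (item id, missing-field-names) pairs for items that lack
    any field the annotation interface expects to render."""
    problems = []
    for i, item in enumerate(items):
        missing = REQUIRED - item.keys()
        if missing:
            problems.append((item.get("id", f"item {i}"), sorted(missing)))
    return problems

# Usage: missing_fields(json.load(open("sample-data.json")))
```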
// ... and 8 more items
Get This Design
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/text/fact-verification/factscore-atomic-factuality
potato start config.yaml
Related Designs
GPQA - Graduate-Level Expert QA Evaluation
Expert-level question answering evaluation on graduate-level science questions from the GPQA benchmark (Rein et al., ICLR 2024). Questions span physics, chemistry, and biology, designed to be answerable only by domain experts.
Bias Benchmark for QA (BBQ)
Annotate question-answering examples designed to probe social biases. Based on BBQ (Parrish et al., Findings of ACL 2022). Annotators select the correct answer given a context, assess the direction of bias in the question, categorize the type of bias, and explain their reasoning.
BIG-Bench Task Evaluation
Evaluate language model responses on diverse reasoning tasks from the BIG-Bench benchmark. Annotators assess correctness, provide reasoning explanations, and rate confidence for model outputs across multiple task categories.