# Detecting Hallucinations with Span Annotation

Source: https://www.potatoannotator.com/docs/guides/detecting-hallucinations

**A hallucination is a confident statement a model makes that isn't supported by its input or by fact. The most useful way to capture one is to highlight the exact words and label what's wrong with them, which makes it a [span annotation](/docs/guides/span-annotation) task over model output.** Span-level labels are far more actionable than a single "this answer is wrong" flag.

See [hallucination (artificial intelligence)](https://en.wikipedia.org/wiki/Hallucination_(artificial_intelligence)) for background.

## Why mark spans, not whole answers

A whole-answer "unfaithful" label tells you *that* something is wrong; a span tells you *what* and *where*. Span data lets you measure error rates per type, find patterns, and build targeted training data. Marking error spans mirrors [MQM](https://themqm.org/) (Multidimensional Quality Metrics), the standard error-span framework from machine-translation evaluation.
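As a rough illustration of "error rates per type", the sketch below counts how many answers contain at least one span of each label. The record layout and field names (`id`, `spans`, `label`, `start`, `end`) are assumptions made for this example, not a documented export format, so adapt them to whatever your annotation output actually looks like.

```python
from collections import Counter

# Hypothetical records: one per model answer, each with its labeled spans.
# Field names are illustrative assumptions, not a documented export format.
annotations = [
    {"id": "a1", "spans": [{"label": "unsupported_claim", "start": 10, "end": 42}]},
    {"id": "a2", "spans": []},
    {"id": "a3", "spans": [
        {"label": "factual_error", "start": 0, "end": 18},
        {"label": "unsupported_claim", "start": 30, "end": 55},
    ]},
    {"id": "a4", "spans": [{"label": "fabricated_citation", "start": 5, "end": 25}]},
]


def error_rates(records):
    """Per label, the fraction of answers containing at least one span of that type."""
    total = len(records)
    if total == 0:
        return {}
    answers_with = Counter()
    for rec in records:
        # Use a set so an answer with two spans of the same type counts once.
        for label in {span["label"] for span in rec["spans"]}:
            answers_with[label] += 1
    return {label: count / total for label, count in answers_with.items()}


rates = error_rates(annotations)
for label, rate in sorted(rates.items()):
    print(f"{label}: {rate:.0%}")
```

Counting answers rather than raw spans keeps one heavily annotated answer from dominating the rate; counting spans instead is equally reasonable if you care about error density.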

## Setting up error-span annotation

```yaml
annotation_schemes:
  - annotation_type: span
    name: errors
    description: "Highlight each problematic span and label the error type."
    labels: [unsupported_claim, factual_error, contradiction, fabricated_citation]
    label_colors:
      unsupported_claim: "#f59e0b"
      factual_error: "#ef4444"
      contradiction: "#8b5cf6"
      fabricated_citation: "#ec4899"
  - annotation_type: radio
    name: severity
    description: "How serious is the worst error?"
    labels: [Minor, Major, Critical]
```

The `severity` scheme adds a severity judgment so you can weight a trivial slip differently from a dangerous fabrication, the way MQM does.
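One way to use the severity judgment downstream is to map each level to a numeric penalty and average over a batch. The sketch below assumes one worst-error severity per answer (as in the config above) and uses `None` for answers with no marked errors. The weights are placeholders chosen for illustration, not values prescribed by this guide; pick ones that reflect how costly each level is in your setting.

```python
# Illustrative weights only: severity-weighted scoring penalizes more serious
# errors more heavily, but these exact numbers are an assumption to tune.
SEVERITY_WEIGHTS = {"Minor": 1, "Major": 5, "Critical": 25}


def penalty(severity):
    """Map a severity label to a penalty; None (no errors marked) costs nothing."""
    if severity is None:
        return 0
    return SEVERITY_WEIGHTS[severity]


def mean_penalty(severities):
    """Average penalty over a batch, one worst-error severity per answer."""
    if not severities:
        return 0.0
    return sum(penalty(s) for s in severities) / len(severities)


# Hypothetical batch of four answers; the second had no errors marked.
batch = ["Minor", None, "Critical", "Major"]
score = mean_penalty(batch)
print(score)
```

A lower mean penalty is better. Comparing it across model versions or prompt variants shows whether the serious errors, not just the total error count, are going down.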

## Defining the error types

- **Unsupported claim**: not backed by the source (the [RAG](/docs/guides/rag-evaluation) case).
- **Factual error**: contradicts established fact.
- **Contradiction**: conflicts with something earlier in the same output.
- **Fabricated citation**: a reference that doesn't exist or doesn't say what's claimed.

Keep the set small, and give each type a one-line definition with an example, per [Writing Annotation Guidelines](/docs/guides/writing-annotation-guidelines).

## Quality considerations

- Give annotators the source material; "unsupported" is undefinable without it.
- Boundary rules matter: does the span cover the whole sentence or just the false clause? Decide once and apply the rule consistently.
- Faithfulness is subjective at the edges; collect overlap and track [agreement](/docs/guides/inter-annotator-agreement).
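To see how much boundary choices affect agreement, a simple check is character-level overlap between two annotators on the same output. The sketch below computes an F1-style score over covered character offsets; the span layout (`label`, `start`, `end`, with `end` exclusive) is an assumption for this example. It is a quick pairwise diagnostic, not a chance-corrected agreement coefficient.

```python
def covered_chars(spans, label=None):
    """Character offsets covered by spans (end-exclusive), optionally for one label."""
    chars = set()
    for span in spans:
        if label is None or span["label"] == label:
            chars.update(range(span["start"], span["end"]))
    return chars


def char_f1(spans_a, spans_b, label=None):
    """Character-level F1 overlap between two annotators' spans."""
    a = covered_chars(spans_a, label)
    b = covered_chars(spans_b, label)
    if not a and not b:
        return 1.0  # neither annotator marked anything: treat as full agreement
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical spans from two annotators on the same model output.
annotator_1 = [{"label": "factual_error", "start": 10, "end": 30}]
annotator_2 = [{"label": "factual_error", "start": 20, "end": 40}]

overlap_score = char_f1(annotator_1, annotator_2, label="factual_error")
print(overlap_score)
```

Here both annotators found the same error but disagreed on its boundaries, so the score is partial rather than zero. A run of low scores on spans that clearly target the same error usually points to a missing boundary rule rather than real disagreement about faithfulness.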

## Further reading

- [RAG Evaluation](/docs/guides/rag-evaluation)
- [Span Annotation](/docs/guides/span-annotation)
- [How to Evaluate AI Agents](/docs/guides/evaluating-ai-agents)
