
LLM and Vision Pre-Annotation

How to speed up annotation with LLM pre-labeling and human verification, in-context learning, option highlighting, and vision pre-annotation, using Potato's AI support.

Pre-annotation uses a model to propose labels that a human then verifies or corrects. Checking a good suggestion is far faster than labeling from scratch, so pre-annotation can cut annotation time substantially. Keeping a human in the loop is what protects label quality, which makes this a form of human-in-the-loop machine learning.

Potato has built-in AI support for OpenAI, Claude, Gemini, Ollama, and others.

How pre-annotation works

A model (an LLM, or a vision model for images) predicts a label for each item.
The prediction is shown to the annotator as a pre-filled suggestion or a highlighted option.
The annotator confirms or fixes it.
The verified label, not the raw model output, becomes your data.
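
The four steps above can be sketched as a small loop. This is only an illustration of the workflow, not Potato's internals: `pre_annotate`, `VerifiedLabel`, `toy_model`, and `toy_annotator` are hypothetical names, and the "model" and "annotator" are stand-in functions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class VerifiedLabel:
    item_id: str
    suggestion: str   # raw model output, kept only for auditing
    label: str        # what the human confirmed or corrected
    corrected: bool   # True if the human changed the suggestion


def pre_annotate(
    items: Dict[str, str],
    predict: Callable[[str], str],
    review: Callable[[str, str], str],
) -> List[VerifiedLabel]:
    """Suggest a label for each item, then let a reviewer confirm or fix it."""
    results = []
    for item_id, text in items.items():
        suggestion = predict(text)           # step 1: model predicts
        final = review(text, suggestion)     # steps 2-3: human sees it, confirms or fixes
        results.append(                      # step 4: the verified label is the data
            VerifiedLabel(item_id, suggestion, final, final != suggestion)
        )
    return results


# Stand-ins for a real model and a real annotator (assumptions for the demo).
def toy_model(text: str) -> str:
    return "positive" if "good" in text.lower() else "negative"


def toy_annotator(text: str, suggestion: str) -> str:
    # The annotator overrides the model when the text says "great".
    return "positive" if "great" in text.lower() else suggestion


items = {"a": "Good value", "b": "Great phone", "c": "Broke in a week"}
labels = pre_annotate(items, toy_model, toy_annotator)
```

Storing the raw suggestion next to the verified label makes it possible to audit later how often annotators changed the model's output.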

Turning it on

Enable AI support in the project's YAML configuration:

ai_support:
  enabled: true
  endpoint_type: openai      # or anthropic, gemini, ollama, ...
  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}
    temperature: 0.3

Potato offers a few flavors of pre-annotation:

In-context learning labeling: the model labels items from a few examples in the prompt; the human verifies.
Option highlighting: the model pre-selects the labels it thinks are most likely, so the annotator confirms rather than searches.
Visual AI support: vision models (GPT-4V, Claude, Gemini, or a detector like YOLO) propose image labels and boxes.
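
To make the first two flavors concrete, here is a minimal sketch of how a few-shot prompt might be assembled and how label scores might be turned into highlighted options. The function names, the prompt wording, and the `top_k` and `min_score` parameters are assumptions for illustration; they are not Potato's actual prompt format or settings.

```python
from typing import Dict, List, Tuple


def build_few_shot_prompt(
    examples: List[Tuple[str, str]], labels: List[str], text: str
) -> str:
    """In-context learning: show labeled examples, then the item to label."""
    lines = [f"Classify the text as one of: {', '.join(labels)}.", ""]
    for ex_text, ex_label in examples:
        lines += [f"Text: {ex_text}", f"Label: {ex_label}", ""]
    lines += [f"Text: {text}", "Label:"]
    return "\n".join(lines)


def options_to_highlight(
    scores: Dict[str, float], top_k: int = 2, min_score: float = 0.2
) -> List[str]:
    """Option highlighting: keep the top_k labels that clear a score floor."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [label for label, score in ranked[:top_k] if score >= min_score]


prompt = build_few_shot_prompt(
    examples=[("I love it", "positive"), ("Waste of money", "negative")],
    labels=["positive", "negative", "neutral"],
    text="It arrived on time",
)
highlighted = options_to_highlight(
    {"positive": 0.7, "negative": 0.25, "neutral": 0.05}
)
```

The score floor keeps unlikely options from being highlighted just to fill the `top_k` quota.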

The risk: automation bias

The danger of pre-annotation is automation bias: annotators rubber-stamp the model's suggestions, importing its errors into your "gold" data. Guard against it:

Keep gold standards running so you can detect blind acceptance.
Don't pre-fill on the items you use to measure agreement; measure on un-suggested items.
For hard cases, show low-confidence suggestions as hints rather than as pre-filled defaults.
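
One way to detect blind acceptance on gold items is to look only at the cases where the suggestion was wrong and measure how often the annotator accepted it anyway. The sketch below assumes a hypothetical record layout with `gold`, `suggestion`, and `answer` keys; adapt it to however your annotation output is stored.

```python
from typing import Dict, List, Optional


def acceptance_on_wrong_suggestions(records: List[Dict[str, str]]) -> Optional[float]:
    """Share of incorrect suggestions that the annotator accepted unchanged.

    Returns None when no gold item had an incorrect suggestion.
    """
    wrong = [r for r in records if r["suggestion"] != r["gold"]]
    if not wrong:
        return None
    accepted = sum(1 for r in wrong if r["answer"] == r["suggestion"])
    return accepted / len(wrong)


gold_records = [
    {"gold": "spam", "suggestion": "spam", "answer": "spam"},  # correct suggestion
    {"gold": "ham", "suggestion": "spam", "answer": "spam"},   # wrong, rubber-stamped
    {"gold": "ham", "suggestion": "spam", "answer": "ham"},    # wrong, caught
    {"gold": "ham", "suggestion": "ham", "answer": "ham"},     # correct suggestion
]
rate = acceptance_on_wrong_suggestions(gold_records)
```

A rate near 0 suggests the annotator is actually checking; a rate near 1 suggests rubber-stamping.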

Pre-annotation vs. active learning

Pre-annotation makes each label faster. Active learning makes each label more valuable by choosing which items to label next. They combine well: active learning picks the items, and pre-annotation speeds up labeling them.
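
As a rough sketch of how the two can fit together, the function below uses least-confidence sampling (one common active learning strategy, chosen here for illustration) to pick the next batch, and attaches the model's prediction to each item as a suggestion. The name `next_batch`, the `hint_below` threshold, and the `"hint"`/`"prefill"` modes are hypothetical, not Potato settings.

```python
from typing import Dict, List, Tuple


def next_batch(
    predictions: Dict[str, Tuple[str, float]],
    batch_size: int = 2,
    hint_below: float = 0.6,
) -> List[Tuple[str, str, str]]:
    """Pick the least-confident items and decide how to show each suggestion.

    predictions maps item_id -> (predicted_label, confidence in [0, 1]).
    Returns (item_id, suggested_label, mode) tuples, least confident first.
    """
    ranked = sorted(predictions.items(), key=lambda kv: kv[1][1])
    batch = []
    for item_id, (label, confidence) in ranked[:batch_size]:
        # Low-confidence predictions are shown as hints, not pre-filled defaults.
        mode = "hint" if confidence < hint_below else "prefill"
        batch.append((item_id, label, mode))
    return batch


predictions = {
    "a": ("spam", 0.95),
    "b": ("ham", 0.55),
    "c": ("spam", 0.70),
}
batch = next_batch(predictions)
```

Here active learning decides which items reach the annotator, and pre-annotation decides what the annotator sees when they arrive.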
