LLM and Vision Pre-Annotation

How to speed up annotation with LLM pre-labeling and human verification, in-context learning, option highlighting, and vision pre-annotation, using Potato's AI support.

Pre-annotation uses a model to propose labels that a human then verifies or corrects. Checking a good suggestion is far faster than labeling from scratch, so pre-annotation can cut annotation time substantially. The human check is what keeps the labels trustworthy, which makes this a form of human-in-the-loop machine learning.

Potato has built-in AI support for OpenAI, Claude, Gemini, Ollama, and others.

How pre-annotation works

  1. A model (an LLM, or a vision model for images) predicts a label for each item.
  2. The prediction is shown to the annotator as a pre-filled suggestion or a highlighted option.
  3. The annotator confirms or fixes it.
  4. The verified label, not the raw model output, becomes your data.
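
The four steps above can be sketched as a small verification loop. This is an illustrative sketch only, not Potato's internal code: `Suggestion`, `resolve_label`, and the sample data are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Suggestion:
    """A model prediction shown to the annotator (hypothetical structure)."""
    item_id: str
    label: str


def resolve_label(suggestion: Suggestion, correction: Optional[str]) -> dict:
    """Return the verified record for one item.

    `correction` is None when the annotator confirms the suggestion;
    otherwise it is the label the annotator chose instead.
    """
    final = suggestion.label if correction is None else correction
    return {
        "item_id": suggestion.item_id,
        "model_label": suggestion.label,  # kept for auditing only
        "label": final,                   # the verified label is the data
        "accepted": correction is None,
    }


suggestions = [Suggestion("a1", "positive"), Suggestion("a2", "negative")]
# The annotator confirms a1 and corrects a2 to "neutral".
corrections = {"a2": "neutral"}

records = [resolve_label(s, corrections.get(s.item_id)) for s in suggestions]
```

Storing the raw model label next to the verified one makes it possible to audit later how often suggestions were accepted or overridden.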

Turning it on

```yaml
ai_support:
  enabled: true
  endpoint_type: openai      # or anthropic, gemini, ollama, ...
  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}
    temperature: 0.3
```

Potato offers a few flavors of pre-annotation:

  • In-context learning labeling: the model labels items from a few examples in the prompt; the human verifies.
  • Option highlighting: the model pre-selects the labels it thinks are most likely, so the annotator confirms rather than searches.
  • Visual AI support: vision models (GPT-4V, Claude, Gemini, or a detector like YOLO) propose image labels and boxes.
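
To make the in-context learning flavor concrete, here is a minimal sketch of how a few-shot labeling prompt might be assembled. It is an assumption about the general technique, not Potato's actual prompt template; `build_few_shot_prompt` and the example texts are hypothetical.

```python
def build_few_shot_prompt(examples, labels, text):
    """Assemble a few-shot classification prompt.

    examples: list of (text, label) pairs already verified by a human
    labels:   the allowed label names
    text:     the new item the model should label
    """
    lines = [f"Classify the text as one of: {', '.join(labels)}.", ""]
    for ex_text, ex_label in examples:
        lines.append(f"Text: {ex_text}")
        lines.append(f"Label: {ex_label}")
        lines.append("")
    # The prompt ends with an open "Label:" for the model to complete.
    lines.append(f"Text: {text}")
    lines.append("Label:")
    return "\n".join(lines)


examples = [("Great movie", "positive"), ("Terrible plot", "negative")]
prompt = build_few_shot_prompt(examples, ["positive", "negative"], "Not bad at all")
```

The resulting string would be sent to whichever provider is configured, and the model's completion shown to the annotator as a suggestion to verify.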

The risk: automation bias

The danger of pre-annotation is automation bias: annotators rubber-stamp the model's suggestions, importing its errors into your "gold" data. Guard against it:

  • Keep gold standards running so you can detect blind acceptance.
  • Don't pre-fill on the items you use to measure agreement; measure on un-suggested items.
  • For hard cases, show lower-confidence suggestions as hints rather than pre-filled defaults.
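
One way to detect blind acceptance with gold items is to look only at the cases where the suggestion was wrong and measure how often the annotator accepted it anyway. The sketch below assumes a hypothetical record layout with `gold`, `suggested`, and `chosen` fields; it is not a Potato export format.

```python
def blind_acceptance_rate(records):
    """Fraction of wrong suggestions that the annotator accepted anyway.

    Returns None when no suggestion on the gold items was wrong,
    because the rate is undefined in that case.
    """
    wrong = [r for r in records if r["suggested"] != r["gold"]]
    if not wrong:
        return None
    accepted = sum(1 for r in wrong if r["chosen"] == r["suggested"])
    return accepted / len(wrong)


gold_checks = [
    {"gold": "pos", "suggested": "pos", "chosen": "pos"},  # correct suggestion
    {"gold": "neg", "suggested": "pos", "chosen": "pos"},  # wrong, accepted
    {"gold": "neu", "suggested": "neg", "chosen": "neg"},  # wrong, accepted
    {"gold": "pos", "suggested": "neg", "chosen": "pos"},  # wrong, corrected
]

rate = blind_acceptance_rate(gold_checks)
```

A rate near zero suggests the annotator is genuinely checking suggestions, while a rate near one suggests rubber-stamping. What threshold should trigger a review is a project-level decision.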

Pre-annotation vs. active learning

Pre-annotation makes each label faster. Active learning makes each label more valuable by choosing which items to label next. They combine well.
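
As an illustration of "choosing which items to label next", the sketch below uses uncertainty sampling, one common active learning strategy, and ranks unlabeled items by the entropy of the model's predicted probabilities. The function names and the pool data are hypothetical, and this is not necessarily the strategy Potato implements.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def pick_next(pool, k):
    """Return the ids of the k items the model is least certain about."""
    ranked = sorted(pool, key=lambda item: entropy(item["probs"]), reverse=True)
    return [item["id"] for item in ranked[:k]]


pool = [
    {"id": "x1", "probs": [0.98, 0.02]},  # confident prediction
    {"id": "x2", "probs": [0.55, 0.45]},  # near coin-flip
    {"id": "x3", "probs": [0.80, 0.20]},
]

chosen = pick_next(pool, 2)  # most uncertain items first
```

The same model probabilities can serve both purposes: the top prediction becomes the pre-annotation suggestion, and the entropy decides which items reach annotators first.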

Further reading