LLM and Vision Pre-Annotation
How to speed up annotation with LLM pre-labeling and human verification, in-context learning, option highlighting, and vision pre-annotation, using Potato's AI support.
Pre-annotation uses a model to propose labels that a human then verifies or corrects. Checking a good suggestion is far faster than labeling from scratch, so pre-annotation can cut annotation time substantially, as long as you keep a human in the loop. This is human-in-the-loop machine learning.
Potato has built-in AI support for OpenAI, Claude, Gemini, Ollama, and others.
How pre-annotation works
- A model (an LLM, or a vision model for images) predicts a label for each item.
- The prediction is shown to the annotator as a pre-filled suggestion or a highlighted option.
- The annotator confirms or fixes it.
- The verified label, not the raw model output, becomes your data.
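The loop above can be sketched in a few lines of Python. This is an illustration only, not Potato's internals: `Item`, `pre_annotate`, `verify`, and `toy_predict` are hypothetical names, and the keyword rule stands in for a real model call.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Item:
    text: str
    suggestion: Optional[str] = None  # raw model output, shown to the annotator
    label: Optional[str] = None       # human-verified label, the one you keep


def pre_annotate(items: list[Item], predict: Callable[[str], str]) -> None:
    """Attach a model suggestion to every item."""
    for item in items:
        item.suggestion = predict(item.text)


def verify(item: Item, human_choice: Optional[str] = None) -> None:
    """Record the annotator's decision: None confirms, a value overrides."""
    item.label = item.suggestion if human_choice is None else human_choice


def toy_predict(text: str) -> str:
    """Hypothetical stand-in for an LLM call: a naive keyword rule."""
    return "positive" if "good" in text.lower() else "negative"


items = [Item("Good battery life"), Item("Not good at all")]
pre_annotate(items, toy_predict)
verify(items[0])              # annotator accepts the suggestion
verify(items[1], "negative")  # annotator corrects a wrong "positive"
final_labels = [item.label for item in items]
```

Keeping `suggestion` and `label` as separate fields means the raw model output is never mistaken for verified data, and you can later measure how often annotators overrode the model.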
Turning it on
    ai_support:
      enabled: true
      endpoint_type: openai # or anthropic, gemini, ollama, ...
      ai_config:
        model: gpt-4
        api_key: ${OPENAI_API_KEY}
        temperature: 0.3

Potato offers a few flavors:
- In-context learning labeling: the model labels items from a few examples in the prompt; the human verifies.
- Option highlighting: the model pre-selects the labels it thinks are most likely, so the annotator confirms rather than searches.
- Visual AI support: vision models (GPT-4V, Claude, Gemini, or a detector like YOLO) propose image labels and boxes.
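To make the in-context learning flavor concrete, here is a minimal sketch of how a few-shot prompt might be assembled and how a free-form reply could be mapped back onto the allowed labels. The prompt wording and the helper names `build_few_shot_prompt` and `parse_label` are assumptions for illustration; Potato's own prompts may look different.

```python
def build_few_shot_prompt(labels, examples, text):
    """Build a classification prompt from labeled examples plus one new item."""
    lines = [f"Classify the text as one of: {', '.join(labels)}.", ""]
    for example_text, example_label in examples:
        lines += [f"Text: {example_text}", f"Label: {example_label}", ""]
    lines += [f"Text: {text}", "Label:"]
    return "\n".join(lines)


def parse_label(reply, labels):
    """Map a model reply onto an allowed label, or None if nothing matches."""
    cleaned = reply.strip().lower()
    for label in labels:
        if cleaned.startswith(label.lower()):
            return label
    return None


LABELS = ["positive", "negative"]
EXAMPLES = [
    ("Love it, works perfectly", "positive"),
    ("Broke after two days", "negative"),
]

prompt = build_few_shot_prompt(LABELS, EXAMPLES, "Arrived late and scratched")
suggestion = parse_label(" Negative.\n", LABELS)  # simulated model reply
unparsed = parse_label("I am not sure", LABELS)   # falls back to None
```

Returning `None` for an unparseable reply lets the interface show the item without a suggestion instead of pre-selecting a guess.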
The risk: automation bias
The danger of pre-annotation is automation bias: annotators rubber-stamp the model's suggestions, importing its errors into your "gold" data. Guard against it:
- Keep gold standards running so you can detect blind acceptance.
- Don't pre-fill on the items you use to measure agreement; measure on un-suggested items.
- For hard cases, show lower-confidence suggestions as hints, not as pre-filled defaults.
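One way to detect blind acceptance is to look only at gold items where the suggestion was wrong and count how often the annotator kept it anyway. The sketch below assumes a hypothetical record format with `suggestion`, `label` (what the annotator submitted), and `gold` fields; it is not a Potato export format.

```python
def blind_acceptance_rate(records):
    """Share of wrong suggestions that the annotator accepted unchanged.

    Returns None when no suggestion on the gold items was wrong,
    because the rate is undefined in that case.
    """
    wrong = [r for r in records if r["suggestion"] != r["gold"]]
    if not wrong:
        return None
    accepted = sum(r["label"] == r["suggestion"] for r in wrong)
    return accepted / len(wrong)


# Hypothetical gold-item log for one annotator.
records = [
    {"suggestion": "positive", "label": "positive", "gold": "positive"},
    {"suggestion": "positive", "label": "positive", "gold": "negative"},  # rubber-stamped
    {"suggestion": "negative", "label": "positive", "gold": "positive"},  # corrected
]

rate = blind_acceptance_rate(records)
```

A rate near zero suggests the annotator is actually checking suggestions; a rate near one suggests they are accepting whatever the model proposes.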
Pre-annotation vs. active learning
Pre-annotation makes each label faster. Active learning makes each label more valuable by choosing which items to label next. They combine well.
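As a rough illustration of how the two combine, the sketch below uses least-confidence sampling, one common active learning strategy, to choose which items to send to annotators next, while each item still carries its model suggestion for pre-annotation. The `confidence` field and the `least_confident` helper are assumptions for illustration, not part of Potato's API.

```python
def least_confident(predictions, k):
    """Return the k predictions the model is least sure about."""
    return sorted(predictions, key=lambda p: p["confidence"])[:k]


# Hypothetical model outputs: a suggested label plus a confidence score.
predictions = [
    {"id": "a", "suggestion": "positive", "confidence": 0.97},
    {"id": "b", "suggestion": "negative", "confidence": 0.55},
    {"id": "c", "suggestion": "positive", "confidence": 0.62},
    {"id": "d", "suggestion": "negative", "confidence": 0.91},
]

# Active learning picks the items; pre-annotation supplies the suggestion
# the annotator will verify for each one.
queue = least_confident(predictions, k=2)
queue_ids = [p["id"] for p in queue]
```

Because the selected items are exactly the ones the model is unsure about, their suggestions are more likely to be wrong, which is a reason to present them as hints rather than pre-filled defaults.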
Further reading
- AI Support feature reference
- Active Learning for Annotation
- Solo Mode, a guided human-plus-LLM workflow