Question 1

What is data annotation?

Accepted Answer

Data annotation is the process of adding labels to raw data such as text, images, audio, video, or model outputs, so the data can be used to train or evaluate machine learning models. A label might be a category, a highlighted span, a rating, or a comparison. Potato lets you set up any of these task types with a short YAML configuration.

Question 2

What is inter-annotator agreement?

Accepted Answer

Inter-annotator agreement measures how often independent annotators give the same label to the same item. It is the standard evidence that a task is well defined and the labels are reliable. Common measures are Cohen's kappa, Fleiss' kappa, and Krippendorff's alpha, which correct for agreement that would happen by chance. Potato reports Krippendorff's alpha in its admin dashboard.
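
As a rough illustration of chance correction, here is a minimal sketch of Cohen's kappa for two annotators, written in plain Python. It is not Potato's implementation (Potato reports Krippendorff's alpha), and the function name `cohens_kappa` and the sample labels are made up for the example.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items in the same order."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: fraction of items where the two labels match.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement, from each annotator's own label distribution.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)

# Hypothetical labels from two annotators on six items.
annotator_1 = ["pos", "pos", "neg", "neg", "pos", "neg"]
annotator_2 = ["pos", "neg", "neg", "neg", "pos", "pos"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

Here the annotators agree on four of six items (0.667 raw agreement), but half of that would be expected by chance, so kappa comes out much lower than the raw figure.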

Question 3

What is the best free annotation tool?

Accepted Answer

It depends on your data and goals, so there is no single answer. For work that spans text, images, audio, and AI-agent evaluation, Potato is a strong free and open-source option with more than 50 task types and a zero-code YAML setup. Label Studio, Doccano, brat, and Argilla are other open-source choices with different strengths.

Question 4

How do I label data for machine learning?

Accepted Answer

Start by defining the task and the label set, then write clear guidelines and have several annotators label overlapping items. Measure agreement, resolve the disagreements, and export the result in a format your training pipeline can read. Potato covers this whole workflow and exports to JSON, CoNLL, Hugging Face, spaCy, and COCO/YOLO.
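
To make the export step concrete, here is a small sketch that turns character-offset span labels into CoNLL-style token and tag rows using BIO tags. It assumes whitespace tokenization and spans that align with token boundaries; the function `spans_to_bio` is hypothetical and is not Potato's exporter.

```python
def spans_to_bio(text, spans):
    """Convert (start, end, label) character spans, end exclusive, to (token, tag) rows."""
    rows = []
    pos = 0
    for token in text.split():
        # Locate the token in the original text to recover its offsets.
        start = text.index(token, pos)
        end = start + len(token)
        pos = end
        tag = "O"
        for s, e, label in spans:
            if start >= s and end <= e:
                # First token of a span gets B-, later tokens get I-.
                tag = ("B-" if start == s else "I-") + label
                break
        rows.append((token, tag))
    return rows

# Hypothetical sentence and span labels.
text = "Ada Lovelace lived in London"
spans = [(0, 12, "PER"), (22, 28, "LOC")]
rows = spans_to_bio(text, spans)
conll = "\n".join(f"{tok}\t{tag}" for tok, tag in rows)
print(conll)
```

Real pipelines usually need a proper tokenizer and a policy for spans that cut through a token, which this sketch leaves out.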

Question 5

How many annotators do I need per item?

Accepted Answer

Clear, objective tasks can often use one annotator, with a small overlapping sample for quality checks. Moderately subjective tasks usually use three annotators resolved by majority vote. Highly subjective tasks use five or more, and sometimes keep the full range of opinions rather than collapsing to one answer. For most other tasks, the gain in label reliability from each additional annotator drops off quickly past three.
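
Majority-vote resolution can be sketched in a few lines. This example is an assumption about how you might aggregate labels yourself, not a description of any tool's built-in behavior; the item ids and labels are invented. Ties return `None` so they can go to adjudication instead of being settled arbitrarily.

```python
from collections import Counter

def majority_vote(labels):
    """Return the most common label, or None when the top labels are tied."""
    if not labels:
        return None
    ranked = Counter(labels).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie: leave for adjudication instead of guessing
    return ranked[0][0]

# Hypothetical item ids mapped to the labels three annotators gave.
annotations = {
    "item-1": ["toxic", "toxic", "ok"],
    "item-2": ["ok", "ok", "ok"],
    "item-3": ["toxic", "ok", "unsure"],
}
resolved = {item: majority_vote(labels) for item, labels in annotations.items()}
print(resolved)
```

Using an odd number of annotators avoids ties on binary tasks, which is one reason three is a common choice.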

Question 6

What is active learning in data annotation?

Accepted Answer

Active learning chooses which items to annotate next so a model reaches a target accuracy with fewer labels than random sampling would need. The model flags the items it finds most informative, often the ones it is least certain about, and a person labels those. Potato supports uncertainty, diversity, BADGE, and BALD strategies.
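
The uncertainty strategy can be illustrated with entropy-based sampling: rank unlabeled items by the entropy of the model's predicted class probabilities and send the top few to annotators. This is a generic sketch under that assumption, with made-up predictions and a hypothetical `select_most_uncertain` helper; it does not show how Potato implements its strategies.

```python
import math

def entropy(probs):
    """Shannon entropy of a probability distribution, in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_most_uncertain(predictions, k):
    """Pick the k item ids whose predicted distributions have the highest entropy."""
    ranked = sorted(predictions, key=lambda item: entropy(predictions[item]), reverse=True)
    return ranked[:k]

# Hypothetical model outputs: item id -> class probabilities.
predictions = {
    "a": [0.98, 0.02],
    "b": [0.55, 0.45],
    "c": [0.80, 0.20],
    "d": [0.50, 0.50],
}
to_label = select_most_uncertain(predictions, k=2)
print(to_label)
```

After people label the selected items, the model is retrained and the ranking is recomputed on the remaining pool.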

Question 7

What is the difference between classification and span annotation?

Accepted Answer

Classification assigns one or more labels to a whole item, such as marking a review positive or negative. Span annotation marks a region inside an item, such as highlighting a name in a sentence or an event on an audio waveform. Named entity recognition and error marking are span tasks. Potato supports both, and you can combine them on one screen.
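
One way to see the difference is in the shape of the stored record. The sketch below uses two hypothetical dataclasses, `ClassificationLabel` and `SpanLabel`; the field names are illustrative and do not reflect any particular tool's output schema.

```python
from dataclasses import dataclass

@dataclass
class ClassificationLabel:
    item_id: str
    label: str  # applies to the whole item

@dataclass
class SpanLabel:
    item_id: str
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive
    label: str  # applies only to text[start:end]

text = "Ada Lovelace wrote the first program."

# A classification label describes the whole sentence.
doc_label = ClassificationLabel(item_id="s1", label="history")

# A span label points at a region inside the same sentence.
span = SpanLabel(item_id="s1", start=0, end=12, label="PERSON")
covered = text[span.start:span.end]
print(doc_label.label, "|", span.label, "->", covered)
```

Both records share the same `item_id`, which is how the two kinds of label can be combined on one item.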

Question 8

How do I evaluate LLM or AI agent outputs?

Accepted Answer

Have people judge the outputs: rate them on a scale, compare two side by side, score them against a rubric, or mark specific errors with spans. For agents that take multiple steps, you can also judge each step of the trajectory. Potato provides all of these and can read agent traces from formats such as OpenAI, Anthropic, and ReAct.
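
For side-by-side comparisons, a simple first summary is each system's win rate, counting a tie as half a win for both sides. The sketch below assumes judgments are stored as `(system_a, system_b, winner)` tuples; that layout, the `win_rates` helper, and the model names are all invented for the example.

```python
from collections import defaultdict

def win_rates(judgments):
    """Compute per-system win rate from (system_a, system_b, winner) tuples.

    winner is one of the two system names, or "tie".
    """
    wins = defaultdict(float)
    total = defaultdict(int)
    for a, b, winner in judgments:
        total[a] += 1
        total[b] += 1
        if winner == "tie":
            wins[a] += 0.5
            wins[b] += 0.5
        else:
            wins[winner] += 1.0
    return {system: wins[system] / total[system] for system in total}

# Hypothetical human judgments on four prompts.
judgments = [
    ("model-x", "model-y", "model-x"),
    ("model-x", "model-y", "model-y"),
    ("model-x", "model-y", "model-x"),
    ("model-x", "model-y", "tie"),
]
rates = win_rates(judgments)
print(rates)
```

With only a handful of judgments the rates are noisy, so it is worth collecting enough comparisons and checking agreement between judges before drawing conclusions.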

Data Annotation Concepts

