Image Annotation
How to annotate images in Potato: classification, multi-label tagging, bounding boxes, polygons, and landmarks, plus export to COCO/YOLO.
Image annotation ranges from a single label per image to precise regions drawn on the pixels: boxes, polygons, and points. The right level depends on what your model needs: a classifier needs labels, a detector needs bounding boxes, and a segmentation model needs polygons. For the feature reference, see Image Annotation.
Whole-image classification and tagging
Use radio when each image gets exactly one label and multiselect when it can get several. The image classification showcase is a working example.
annotation_schemes:
  - annotation_type: multiselect
    name: contents
    description: "Select everything visible in the image."
    labels: [Person, Vehicle, Animal, Building, Vegetation]
Regions: boxes, polygons, landmarks
For localization, annotators draw on the image:
- Bounding boxes for object detection.
- Polygons for image segmentation, where object shape matters.
- Landmarks / keypoints for poses and faces.
Potato's image annotation supports these region types with per-class colors, the same way span annotation works for text.
Boundary and labeling rules
- Tightness. Do boxes hug the object or include a margin? Be consistent.
- Occlusion and truncation. Decide how to box a partly hidden object.
- Small objects and crowds. Set a minimum size and a rule for dense scenes.
These rules drive inter-annotator agreement far more than the drawing tool does.
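To see how much a boundary rule matters, it can help to score two annotators' boxes against each other. The sketch below uses intersection over union (IoU), a common overlap measure for boxes; it is not taken from Potato, and the function name `box_iou`, the corner-based box layout, and the example coordinates are all assumptions made for illustration.

```python
from typing import Tuple

# Assumed layout for this sketch: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes (0.0 to 1.0)."""
    # Corners of the overlapping rectangle, if there is one.
    ix_min = max(a[0], b[0])
    iy_min = max(a[1], b[1])
    ix_max = min(a[2], b[2])
    iy_max = min(a[3], b[3])

    # Clamp to zero so boxes that do not overlap get zero intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical example: same object, one annotator hugs it,
# the other leaves a 10-pixel margin on every side.
tight = (10, 10, 110, 110)
loose = (0, 0, 120, 120)
print(round(box_iou(tight, loose), 3))  # 0.694
```

Both annotators found the same 100-pixel object, yet the margin alone pulls the overlap down to about 0.69, which is why a written tightness rule tends to matter more than the tool.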
Export for vision models
Potato exports image annotations to COCO and YOLO formats, which detection and segmentation training pipelines read directly. See Exporting Annotations for ML.
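The two formats store a box differently: COCO uses `[x, y, width, height]` in absolute pixels measured from the top-left corner, while a YOLO label line is `class x_center y_center width height` with coordinates normalized to the image size. The sketch below converts one to the other to make the difference concrete. It is not Potato's exporter; the function name `coco_to_yolo` and the sample numbers are hypothetical, and the caller is assumed to have already mapped the COCO category to a zero-based YOLO class index.

```python
from typing import Sequence


def coco_to_yolo(bbox: Sequence[float], img_w: int, img_h: int, class_id: int) -> str:
    """Turn one COCO-style box into one YOLO-style label line.

    bbox is [x, y, width, height] in pixels, with (x, y) the top-left corner.
    class_id is the zero-based class index the YOLO label should carry.
    """
    x, y, w, h = bbox
    # YOLO stores the box center, not the corner, as a fraction of image size.
    x_center = (x + w / 2) / img_w
    y_center = (y + h / 2) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {w / img_w:.6f} {h / img_h:.6f}"


# Hypothetical box covering the middle of a 400x200 image.
print(coco_to_yolo([100, 50, 200, 100], img_w=400, img_h=200, class_id=0))
# 0 0.500000 0.500000 0.500000 0.500000
```

Knowing which convention a file uses is mostly useful for sanity checks: a YOLO label with any coordinate above 1.0 usually means pixel values were written without normalizing.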