Image Annotation

Annotate images with classification, bounding boxes, and region labeling.

Potato supports image annotation for classification, object detection, and region labeling tasks.

Enabling Image Display

yaml
image:
  enabled: true
  max_width: 800
  max_height: 600

Data Format

Reference each image by path in your data file:

json
{
  "id": "img_1",
  "image_path": "images/photo_001.jpg",
  "description": "Optional description"
}

Then tell Potato which field holds the image path:

yaml
data_files:
  - path: data/image_tasks.json
    image_field: image_path
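
As a rough sketch, records in the shape shown above could be generated from a list of image paths. The helper name `build_records` and the `IMAGE_EXTENSIONS` set are hypothetical, and the sketch only builds the records; how they are serialized into `data/image_tasks.json` (one object per line or a single array) is not specified here and is left to your setup.

```python
import json
from pathlib import Path

# Hypothetical filter; adjust to the formats your project accepts.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def build_records(image_paths):
    """Return one record per image with the `id` and `image_path` fields."""
    # Keep only image files, sorted so that ids are stable across runs.
    images = sorted(
        p for p in image_paths if Path(p).suffix.lower() in IMAGE_EXTENSIONS
    )
    return [
        {"id": f"img_{i}", "image_path": p}
        for i, p in enumerate(images, start=1)
    ]


records = build_records(
    ["images/photo_002.jpg", "images/notes.txt", "images/photo_001.jpg"]
)
print(json.dumps(records, indent=2))
```

Sorting before numbering keeps the `id` values reproducible when the script is rerun on the same directory.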

Image Classification

Classify entire images:

yaml
annotation_schemes:
  - annotation_type: radio
    name: category
    description: "What is shown in this image?"
    labels:
      - Cat
      - Dog
      - Bird
      - Other
 
  - annotation_type: multiselect
    name: attributes
    description: "Select all that apply"
    labels:
      - Indoor
      - Outdoor
      - Multiple animals
      - Human present

Multi-Label Classification

yaml
annotation_schemes:
  - annotation_type: multiselect
    name: objects
    description: "What objects are visible?"
    labels:
      - Person
      - Car
      - Building
      - Tree
      - Animal
      - Furniture
      - Food
      - Electronic device

Image Quality Assessment

yaml
annotation_schemes:
  - annotation_type: likert
    name: quality
    description: "Overall image quality"
    size: 5
    min_label: "Very poor"
    max_label: "Excellent"
 
  - annotation_type: multiselect
    name: issues
    description: "Select any quality issues"
    labels:
      - Blurry
      - Overexposed
      - Underexposed
      - Noisy
      - Low resolution
      - Watermark visible

Bounding Box Annotation

Draw boxes around objects:

yaml
annotation_schemes:
  - annotation_type: bbox
    name: objects
    description: "Draw boxes around objects"
    labels:
      - Person
      - Car
      - Bicycle
      - Traffic sign
    label_colors:
      Person: "#3b82f6"
      Car: "#10b981"
      Bicycle: "#f59e0b"
      "Traffic sign": "#ef4444"

Bounding Box Output

json
{
  "id": "img_1",
  "objects": [
    {
      "label": "Person",
      "x": 100,
      "y": 50,
      "width": 80,
      "height": 200
    },
    {
      "label": "Car",
      "x": 300,
      "y": 150,
      "width": 150,
      "height": 100
    }
  ]
}
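
Downstream tools often expect corner coordinates rather than a position plus size. The sketch below converts the output shown above, assuming `x` and `y` are the top-left corner in pixels (the output format does not state this, so verify against your own data). The helper names `to_corners` and `area` are illustrative.

```python
def to_corners(box):
    """Convert an x/y/width/height box to (x_min, y_min, x_max, y_max).

    Assumes (x, y) is the top-left corner in pixel coordinates.
    """
    return (
        box["x"],
        box["y"],
        box["x"] + box["width"],
        box["y"] + box["height"],
    )


def area(box):
    """Return the box area in square pixels."""
    return box["width"] * box["height"]


annotation = {
    "id": "img_1",
    "objects": [
        {"label": "Person", "x": 100, "y": 50, "width": 80, "height": 200},
        {"label": "Car", "x": 300, "y": 150, "width": 150, "height": 100},
    ],
}

for obj in annotation["objects"]:
    print(obj["label"], to_corners(obj), area(obj))
```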

Pre-Loaded Bounding Boxes

Include existing annotations (for example, model predictions) in the data so annotators can review them:

json
{
  "id": "img_1",
  "image_path": "images/photo_001.jpg",
  "predictions": [
    {"label": "Person", "x": 100, "y": 50, "width": 80, "height": 200, "confidence": 0.95}
  ]
}

Then enable prediction loading in the scheme:

yaml
annotation_schemes:
  - annotation_type: bbox
    name: objects
    load_predictions: true
    prediction_field: predictions
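
If your detector emits corner coordinates, the predictions need converting to the x/y/width/height shape used above. The following is a minimal sketch under that assumption; the tuple layout of `detections`, the `min_confidence` threshold, and the helper names are hypothetical and not part of Potato.

```python
def to_prediction(label, x_min, y_min, x_max, y_max, confidence):
    """Convert one corner-format detection to an x/y/width/height prediction."""
    return {
        "label": label,
        "x": x_min,
        "y": y_min,
        "width": x_max - x_min,
        "height": y_max - y_min,
        "confidence": confidence,
    }


def build_item(item_id, image_path, detections, min_confidence=0.5):
    """Build a data item, dropping detections below the confidence threshold."""
    predictions = [
        to_prediction(*det) for det in detections if det[5] >= min_confidence
    ]
    return {"id": item_id, "image_path": image_path, "predictions": predictions}


# Hypothetical detector output: (label, x_min, y_min, x_max, y_max, confidence)
detections = [
    ("Person", 100, 50, 180, 250, 0.95),
    ("Car", 300, 150, 450, 250, 0.30),
]
item = build_item("img_1", "images/photo_001.jpg", detections)
print(item)
```

Filtering out low-confidence detections before review is a design choice: it reduces clutter for annotators, at the cost of hiding boxes that might only have needed a small correction.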

Region/Polygon Annotation

For non-rectangular regions:

yaml
annotation_schemes:
  - annotation_type: polygon
    name: regions
    description: "Outline regions of interest"
    labels:
      - Building
      - Road
      - Vegetation
      - Water

Image Comparison

Compare two images:

yaml
data_files:
  - path: data/image_pairs.json
    item_a_field: image_original
    item_b_field: image_edited
 
annotation_schemes:
  - annotation_type: pairwise
    name: preference
    description: "Which image looks better?"
    options:
      - label: "Original"
        value: "A"
      - label: "Edited"
        value: "B"
      - label: "Same"
        value: "tie"

Image Captioning

yaml
annotation_schemes:
  - annotation_type: text
    name: caption
    description: "Write a caption for this image"
    textarea: true
    placeholder: "Describe what you see..."
    min_length: 10
    max_length: 300

Caption Quality Review

yaml
data_files:
  - path: data/captions.json
    image_field: image_path
    text_field: generated_caption
 
annotation_schemes:
  - annotation_type: likert
    name: accuracy
    description: "How accurate is this caption?"
    size: 5
    min_label: "Very inaccurate"
    max_label: "Very accurate"
 
  - annotation_type: likert
    name: fluency
    description: "How natural is the language?"
    size: 5
    min_label: "Very awkward"
    max_label: "Very natural"
 
  - annotation_type: text
    name: improved_caption
    description: "Suggest a better caption (optional)"
    textarea: true

Display Options

Image Sizing

yaml
image:
  max_width: 800
  max_height: 600
  preserve_aspect_ratio: true
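
To see what these limits mean for a given image, the arithmetic can be sketched as follows. This illustrates aspect-ratio-preserving scaling in general, not Potato's actual implementation; the function name `fit_within` is hypothetical, and the sketch assumes images smaller than the limits are not enlarged.

```python
def fit_within(width, height, max_width=800, max_height=600):
    """Scale (width, height) to fit the limits while keeping the aspect ratio.

    Assumes images already within the limits are left at their original size.
    """
    scale = min(max_width / width, max_height / height, 1.0)
    return round(width * scale), round(height * scale)


print(fit_within(1600, 900))   # wide image, limited by max_width
print(fit_within(1000, 2000))  # tall image, limited by max_height
print(fit_within(400, 300))    # already fits, unchanged
```

Using the smaller of the two scale factors guarantees that both limits are respected at once.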

Zoom Controls

yaml
image:
  zoom_enabled: true
  initial_zoom: fit  # 'fit', 'actual', or percentage

Full-Screen Mode

yaml
image:
  fullscreen_enabled: true

Content Moderation

yaml
annotation_schemes:
  - annotation_type: radio
    name: safe_for_work
    description: "Is this image safe for work?"
    labels:
      - Safe
      - Questionable
      - Not Safe
 
  - annotation_type: multiselect
    name: violation_types
    description: "Select all violations (if any)"
    labels:
      - Violence
      - Adult content
      - Hate symbols
      - Graphic content
      - Spam/advertisement
    show_if:
      scheme: safe_for_work
      value: ["Questionable", "Not Safe"]
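
The `show_if` block above reads as "show `violation_types` only when the `safe_for_work` answer is one of the listed values". A minimal sketch of that condition follows; it illustrates the intended logic only, and the `should_show` helper and the flat `answers` dictionary are assumptions, not Potato's internals.

```python
def should_show(show_if, answers):
    """Return True when the referenced scheme's answer is in the listed values."""
    return answers.get(show_if["scheme"]) in show_if["value"]


show_if = {"scheme": "safe_for_work", "value": ["Questionable", "Not Safe"]}

print(should_show(show_if, {"safe_for_work": "Safe"}))      # hidden
print(should_show(show_if, {"safe_for_work": "Not Safe"}))  # shown
```

The same check can be reused when cleaning exported annotations, for example to flag items that list violations even though the image was marked Safe.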

Supported Formats

Common image formats supported:

  • JPEG/JPG
  • PNG
  • GIF
  • WebP
  • BMP
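
To restrict which formats are accepted, list them explicitly (this example leaves out GIF and BMP):

yaml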
image:
  allowed_formats: ["jpg", "jpeg", "png", "webp"]
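
Before loading a dataset, it can help to check which files would fall outside the allowed list. The sketch below assumes formats are matched by file extension, case-insensitively; that matching rule and the helper name `split_by_format` are assumptions, not documented Potato behavior.

```python
from pathlib import Path

# Mirrors the allowed_formats list in the config above.
ALLOWED_FORMATS = ["jpg", "jpeg", "png", "webp"]


def split_by_format(paths, allowed=ALLOWED_FORMATS):
    """Split paths into (accepted, rejected) by file extension."""
    allowed_set = {ext.lower() for ext in allowed}
    accepted, rejected = [], []
    for path in paths:
        # Compare the extension without its leading dot, ignoring case.
        ext = Path(path).suffix.lower().lstrip(".")
        (accepted if ext in allowed_set else rejected).append(path)
    return accepted, rejected


accepted, rejected = split_by_format(["a.JPG", "b.png", "c.gif", "d.bmp"])
print("accepted:", accepted)
print("rejected:", rejected)
```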

Full Example: Object Detection Review

yaml
task_name: "Object Detection Verification"
 
image:
  enabled: true
  max_width: 1000
  zoom_enabled: true
 
data_files:
  - path: data/detections.json
    image_field: image_path
 
annotation_schemes:
  # Review pre-loaded predictions
  - annotation_type: bbox
    name: objects
    description: "Verify and correct object boxes"
    labels:
      - Person
      - Vehicle
      - Animal
      - Object
    load_predictions: true
    prediction_field: model_predictions
    label_colors:
      Person: "#3b82f6"
      Vehicle: "#10b981"
      Animal: "#f59e0b"
      Object: "#6b7280"
 
  # Overall assessment
  - annotation_type: radio
    name: prediction_quality
    description: "How accurate were the predictions?"
    labels:
      - All correct
      - Minor corrections needed
      - Major corrections needed
      - Mostly incorrect
 
  - annotation_type: number
    name: missed_objects
    description: "How many objects were missed?"
    min: 0
    max: 50
 
  - annotation_type: text
    name: notes
    description: "Any issues or comments?"
    textarea: true
    required: false
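
After review, one common way to quantify how much annotators changed a prediction is intersection over union (IoU) between the original and corrected boxes. This is a general-purpose sketch, not a Potato feature; it assumes both boxes use the x/y/width/height format with a top-left origin, and the `iou` helper is illustrative.

```python
def iou(a, b):
    """Intersection over union of two x/y/width/height boxes."""
    ax2, ay2 = a["x"] + a["width"], a["y"] + a["height"]
    bx2, by2 = b["x"] + b["width"], b["y"] + b["height"]
    # The overlap is zero when the boxes do not intersect on an axis.
    inter_w = max(0, min(ax2, bx2) - max(a["x"], b["x"]))
    inter_h = max(0, min(ay2, by2) - max(a["y"], b["y"]))
    inter = inter_w * inter_h
    union = a["width"] * a["height"] + b["width"] * b["height"] - inter
    return inter / union if union else 0.0


predicted = {"x": 100, "y": 50, "width": 80, "height": 200}
corrected = {"x": 110, "y": 50, "width": 80, "height": 200}
print(round(iou(predicted, corrected), 3))
```

An IoU of 1.0 means the annotator left the box unchanged, and values near 0 indicate the prediction was largely replaced.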

Performance Tips

  1. Optimize image size - Resize large images before annotation
  2. Use JPEG for photos - Smaller file sizes, faster loading
  3. Use PNG for graphics - Better quality for diagrams/screenshots
  4. Enable lazy loading - For large datasets
  5. Consider thumbnails - Show previews in list views
  6. Pre-process consistently - Normalize sizes and formats