Segment Anything (SA-1B) Interactive Segmentation
Interactive image segmentation annotation. Annotators draw segmentation masks (polygons) and bounding boxes around every distinct object in an image, following the Segment Anything Model (SAM) annotation protocol.
Configuration file: config.yaml
# Segment Anything (SA-1B) Interactive Segmentation Configuration
# Based on Kirillov et al., ICCV 2023
annotation_task_name: "SA-1B Interactive Segmentation"
task_dir: "."
data_files:
  - "sample-data.json"
item_properties:
  id_key: "id"
  text_key: "image_url"
  context_key: "scene_description"
user_config:
  allow_all_users: true
annotation_schemes:
  - annotation_type: "text"
    name: "segmentation_masks"
    description: "Draw polygon segmentation masks around each distinct object (format: object_label,x1,y1,x2,y2,...,xn,yn per line)"
  - annotation_type: "text"
    name: "bounding_boxes"
    description: "Draw bounding boxes around each distinct object (format: object_label,x,y,width,height per line)"
  - annotation_type: "radio"
    name: "object_category"
    description: "Select the primary object category visible in the image"
    labels:
      - name: "person"
        tooltip: "Human figures, body parts"
      - name: "animal"
        tooltip: "Animals of any species"
      - name: "vehicle"
        tooltip: "Cars, trucks, bicycles, boats, etc."
      - name: "furniture"
        tooltip: "Chairs, tables, beds, etc."
      - name: "food"
        tooltip: "Food items, dishes, produce"
      - name: "electronics"
        tooltip: "Phones, laptops, screens, etc."
      - name: "nature"
        tooltip: "Plants, trees, rocks, water"
      - name: "building"
        tooltip: "Structures, architecture"
      - name: "other"
        tooltip: "Objects not in the above categories"
  - annotation_type: "radio"
    name: "scene_complexity"
    description: "Rate the segmentation complexity of this image"
    labels:
      - name: "low"
        tooltip: "Few objects, clear boundaries (1-5 objects)"
      - name: "medium"
        tooltip: "Moderate number of objects, some overlap (6-15 objects)"
      - name: "high"
        tooltip: "Many objects, significant overlap or clutter (16+ objects)"
  - annotation_type: "radio"
    name: "annotation_difficulty"
    description: "How difficult is it to draw precise segmentation masks?"
    labels:
      - name: "easy"
        tooltip: "Clear object boundaries, high contrast"
      - name: "moderate"
        tooltip: "Some ambiguous boundaries or partial occlusion"
      - name: "hard"
        tooltip: "Complex shapes, heavy occlusion, or low contrast"
interface_config:
  item_display_format: "<img src='{{text}}' style='max-width:100%; max-height:500px;'/><br/><small>{{scene_description}}</small>"
output_annotation_format: "json"
output_annotation_dir: "annotations"
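The two free-text schemes above encode geometry as one comma-separated record per line. As a minimal sketch (these helpers are illustrative, not part of Potato), the declared formats could be parsed downstream like this:

```python
def parse_masks(text):
    """Parse 'object_label,x1,y1,...,xn,yn' lines into labeled polygons."""
    masks = []
    for line in text.strip().splitlines():
        parts = [p.strip() for p in line.split(",")]
        label, coords = parts[0], [float(v) for v in parts[1:]]
        # A polygon needs at least 3 (x, y) pairs, and coords must pair up.
        if len(coords) < 6 or len(coords) % 2 != 0:
            raise ValueError(f"polygon needs >= 3 (x, y) pairs: {line!r}")
        points = list(zip(coords[0::2], coords[1::2]))
        masks.append({"label": label, "points": points})
    return masks


def parse_boxes(text):
    """Parse 'object_label,x,y,width,height' lines into labeled boxes."""
    boxes = []
    for line in text.strip().splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 5:
            raise ValueError(f"expected label,x,y,width,height: {line!r}")
        x, y, w, h = (float(v) for v in parts[1:])
        boxes.append({"label": parts[0], "x": x, "y": y,
                      "width": w, "height": h})
    return boxes
```

For example, `parse_masks("cat,10,10,50,10,30,40")` yields one triangle labeled `cat`; malformed lines fail loudly rather than producing degenerate geometry.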
Sample data: sample-data.json
[
  {
    "id": "sa1b_001",
    "image_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/4/47/PNG_transparency_demonstration_1.png/800px-PNG_transparency_demonstration_1.png",
    "scene_description": "A still life arrangement of dice and game pieces on a wooden surface. Segment each individual object.",
    "expected_objects": [
      "dice",
      "game piece",
      "wooden surface"
    ]
  },
  {
    "id": "sa1b_002",
    "image_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/3/3a/Cat03.jpg/1200px-Cat03.jpg",
    "scene_description": "A domestic cat sitting on a grassy lawn. Draw masks around the cat and background regions.",
    "expected_objects": [
      "cat",
      "grass",
      "ground"
    ]
  }
]
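Because config.yaml's `item_properties` maps `id_key`, `text_key`, and `context_key` to `id`, `image_url`, and `scene_description`, every item in sample-data.json must carry those three keys. A small sanity check (a sketch, not part of Potato) could be:

```python
import json

# Keys that config.yaml's item_properties expects on every item.
REQUIRED_KEYS = ("id", "image_url", "scene_description")


def find_invalid_items(items):
    """Return (index, id, missing_keys) for items lacking required keys."""
    bad = []
    for i, item in enumerate(items):
        missing = [k for k in REQUIRED_KEYS if k not in item]
        if missing:
            bad.append((i, item.get("id", "<no id>"), missing))
    return bad
```

Typical use: `find_invalid_items(json.load(open("sample-data.json")))`, which returns an empty list when the data file matches the config.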
// ... and 8 more items

Get this design
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/image/sa1b-segment-anything
potato start config.yaml
Found a problem or want to improve this design?
Submit an issue

Related designs
CUB-200-2011 Fine-Grained Bird Classification
Fine-grained visual categorization of 200 bird species (Wah et al., 2011). Annotate bird images with species labels, part locations, and attributes.
EPIC-KITCHENS Egocentric Action Annotation
Annotate fine-grained actions in egocentric kitchen videos with verb-noun pairs. Identify cooking actions from a first-person perspective.
FLAIR: French Land Cover from Aerospace Imagery
Land use and land cover classification from high-resolution aerial imagery. Annotators classify the primary land use category of aerial image patches and identify any secondary land uses present. Based on the FLAIR dataset from the French National Institute of Geographic and Forest Information (IGN).