Video Annotation with Frame-by-Frame Controls

Video annotation produces training data for action recognition, object tracking, and temporal event detection. This tutorial covers frame navigation, temporal segmentation, and efficient video-labeling workflows.

Video Annotation Capabilities

Potato supports video annotation for labeling and classification tasks:

Video classification: Label entire video clips
Multi-label tagging: Apply multiple tags to videos
Temporal annotation: Mark events in videos

Basic Video Setup

yaml

annotation_task_name: "Video Action Recognition"
 
data_files:
  - "data/videos.json"
 
annotation_schemes:
  - annotation_type: video_annotation
    name: action
    description: "What action is shown in this video?"
    labels:
      - Walking
      - Running
      - Jumping
      - Sitting
      - Standing
      - Other
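
The config above points at data/videos.json, but this tutorial does not show what that file contains. The sketch below builds one record per video in Python. The `id` and `video_path` field names are borrowed from the Output Format example in this tutorial and are only an assumption about the input schema; `build_video_items` and `dump_items` are hypothetical helper names, not part of Potato.

```python
import json
from pathlib import PurePosixPath


def build_video_items(paths, extensions=(".mp4", ".webm")):
    """Return one record per video path, sorted by path.

    The "id" and "video_path" keys are assumptions, not a confirmed schema.
    """
    items = []
    for raw in sorted(paths):
        path = PurePosixPath(raw)
        # Skip anything that is not a video file (matched case-insensitively).
        if path.suffix.lower() in extensions:
            items.append({"id": path.stem, "video_path": raw})
    return items


def dump_items(items):
    """Serialize the records as a JSON array string."""
    return json.dumps(items, indent=2)
```

Writing the string returned by `dump_items` to data/videos.json would give the config a file to load, provided the field names match what your Potato version expects.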

Frame-by-Frame Navigation

Label individual frames as you step through the video:

yaml

annotation_schemes:
  - annotation_type: video_annotation
    name: frame_label
    description: "Annotate video frames"
    labels:
      - Action
      - No action

Temporal Segment Annotation

Mark events with start and end times:

yaml

annotation_task_name: "Video Event Detection"
 
data_files:
  - "data/videos.json"
 
annotation_schemes:
  - annotation_type: video_annotation
    name: events
    description: "Mark all events and their duration"
    labels:
      - name: conversation
        color: "#4ECDC4"
      - name: action_sequence
        color: "#FF6B6B"
      - name: transition
        color: "#45B7D1"
      - name: title_card
        color: "#FFEAA7"

Creating Segments

1. Navigate to the start of an event
2. Press [ or click "Mark Start"
3. Navigate to the end of the event
4. Press ] or click "Mark End"
5. Select the event label
6. Repeat for all events
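
The mark-start / mark-end cycle above can be modeled as a small state machine. The following Python sketch is illustrative only and is not Potato's implementation: `Segment` and `SegmentRecorder` are hypothetical names, and times are assumed to be in seconds.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    label: str
    start: float  # seconds from the beginning of the video
    end: float    # seconds from the beginning of the video


class SegmentRecorder:
    """Collects labeled segments from mark-start / mark-end actions."""

    def __init__(self) -> None:
        self.segments: List[Segment] = []
        self._pending_start: Optional[float] = None

    def mark_start(self, time_s: float) -> None:
        # Pressing "Mark Start" again simply moves the pending start.
        self._pending_start = time_s

    def mark_end(self, time_s: float, label: str) -> Segment:
        if self._pending_start is None:
            raise ValueError("mark_end called before mark_start")
        if time_s <= self._pending_start:
            raise ValueError("segment end must be after its start")
        segment = Segment(label, self._pending_start, time_s)
        self.segments.append(segment)
        self._pending_start = None  # ready for the next event
        return segment


recorder = SegmentRecorder()
recorder.mark_start(2.0)
recorder.mark_end(5.5, "conversation")
recorder.mark_start(5.5)
recorder.mark_end(9.0, "transition")
```

Rejecting an end time that is not after the start catches the most common slip, which is marking the end before scrubbing forward.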

Video Classification

Mark key moments in a video, giving each label a description and a color:

yaml

annotation_schemes:
  - annotation_type: video_annotation
    name: key_moments
    description: "Mark important moments"
    labels:
      - name: action_start
        description: "When the action begins"
        color: "#22C55E"
      - name: action_peak
        description: "Most intense moment"
        color: "#EF4444"
      - name: action_end
        description: "When the action completes"
        color: "#3B82F6"

Per-Video Classification

To label a whole video rather than moments within it, use a standard annotation type such as radio:

yaml

annotation_schemes:
  - annotation_type: radio
    name: video_label
    description: "What is happening in this video?"
    labels:
      - Person visible
      - No person
      - Transition/blur

Video with Labels

Annotate videos with category labels:

yaml

annotation_schemes:
  - annotation_type: video_annotation
    name: video_labels
    description: "Label video content"
    labels:
      - name: person
        color: "#FF6B6B"
      - name: vehicle
        color: "#4ECDC4"
      - name: ball
        color: "#FFEAA7"

Complete Video Annotation Configuration

yaml

annotation_task_name: "Sports Video Analysis"
 
data_files:
  - "data/sports_clips.json"
 
output_annotation_dir: "annotations/"
output_annotation_format: "jsonl"
 
annotation_schemes:
  # Game events
  - annotation_type: video_annotation
    name: game_events
    description: "Mark game events"
    labels:
      - name: goal
        color: "#22C55E"
      - name: foul
        color: "#EF4444"
      - name: corner_kick
        color: "#3B82F6"
      - name: free_kick
        color: "#F59E0B"
      - name: penalty
        color: "#EC4899"
      - name: offside
        color: "#8B5CF6"
 
  # Clip-level annotation
  - annotation_type: multiselect
    name: clip_tags
    description: "Tags for this clip"
    labels:
      - Highlight worthy
      - Good camera angle
      - Multiple players
      - Close-up
      - Wide shot
      - Slow motion available
 
annotation_guidelines:
  title: "Sports Video Annotation Guide"
  content: |
    ## Event Marking
    - Mark events from when they START
    - Include the full play sequence
    - Goal: From shot to ball crossing line
 
    ## Navigation
    - Space: Play/Pause
    - Arrow keys: Frame navigation

Output Format

json

{
  "id": "clip_001",
  "video_path": "/videos/match_highlight.mp4",
  "annotations": {
    "game_events": ["goal", "corner_kick"],
    "clip_tags": ["Highlight worthy", "Good camera angle"]
  }
}
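
Records shaped like the one above are easy to summarize after annotation. The sketch below counts how often each label appears for one scheme, assuming one JSON object per line (matching the `jsonl` output setting) and the `annotations` layout shown in the example. `count_labels` is a hypothetical helper, and the second sample record is made up for illustration.

```python
import json
from collections import Counter


def count_labels(lines, scheme):
    """Count label occurrences for one annotation scheme across JSONL lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        # A record without this scheme contributes nothing.
        counts.update(record.get("annotations", {}).get(scheme, []))
    return counts


sample_lines = [
    json.dumps({
        "id": "clip_001",
        "video_path": "/videos/match_highlight.mp4",
        "annotations": {
            "game_events": ["goal", "corner_kick"],
            "clip_tags": ["Highlight worthy", "Good camera angle"],
        },
    }),
    # Made-up second record, for illustration only.
    json.dumps({
        "id": "clip_002",
        "video_path": "/videos/example.mp4",
        "annotations": {"game_events": ["goal"], "clip_tags": []},
    }),
]

event_counts = count_labels(sample_lines, "game_events")
```

With real data, pass an open file handle from the output directory in place of `sample_lines`; iterating over a text file yields its lines.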

Tips for Video Annotation

First pass overview: Watch at normal speed first
Slow motion for precision: Use 0.25x for exact timestamps
Keyboard shortcuts: Much faster than mouse
Take breaks: Video annotation is visually demanding
Consistent criteria: Document edge cases clearly
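
The slow-motion tip comes down to simple arithmetic: at a lower playback rate each frame stays on screen longer, which makes it easier to stop on the exact frame. The Python sketch below shows the conversions, assuming a constant frame rate; the function names are hypothetical and not part of Potato.

```python
import math


def timestamp_to_frame(time_s: float, fps: float) -> int:
    """Index of the frame on screen at time_s (frame 0 starts at 0.0 s)."""
    # The small epsilon guards against float error such as 74.9999999999.
    return math.floor(time_s * fps + 1e-9)


def frame_to_timestamp(frame: int, fps: float) -> float:
    """Time in seconds at which the given frame first appears."""
    return frame / fps


def seconds_on_screen(fps: float, playback_rate: float) -> float:
    """Wall-clock time each frame stays visible at a given playback rate."""
    return 1.0 / (fps * playback_rate)
```

For a 30 fps clip, `seconds_on_screen(30, 0.25)` is about 0.133 seconds per frame, compared with about 0.033 seconds at normal speed.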

Next Steps

Learn image comparison for video quality assessment
Set up crowdsourcing for large-scale video annotation

Full documentation at /docs/features/image-annotation (video section).
