Video Annotation with Frame-by-Frame Controls
Set up video annotation in Potato with frame-by-frame navigation, timestamp markers, temporal event labeling, and per-segment classification using radio or span schemas.
Video annotation is how you build training data for things like action recognition, object tracking, and temporal event detection. This tutorial covers frame navigation, segmenting a clip in time, and keeping the labeling fast. For the full video scheme reference, see the source documentation.
What video annotation can do
Potato handles video for both labeling and classification tasks:
- Video classification: Label entire video clips
- Multi-label tagging: Apply multiple tags to videos
- Temporal annotation: Mark events in videos
Basic Video Setup
annotation_task_name: "Video Action Recognition"
data_files:
  - "data/videos.json"
annotation_schemes:
  - annotation_type: video_annotation
    name: action
    description: "What action is shown in this video?"
    labels:
      - Walking
      - Running
      - Jumping
      - Sitting
      - Standing
      - Other
Frame-by-Frame Navigation
annotation_schemes:
  - annotation_type: video_annotation
    name: frame_label
    description: "Annotate video frames"
    labels:
      - Action
      - No action
Temporal segment annotation
Mark events with a start and end time:
annotation_task_name: "Video Event Detection"
data_files:
  - "data/videos.json"
annotation_schemes:
  - annotation_type: video_annotation
    name: events
    description: "Mark all events and their duration"
    labels:
      - name: conversation
        color: "#4ECDC4"
      - name: action_sequence
        color: "#FF6B6B"
      - name: transition
        color: "#45B7D1"
      - name: title_card
        color: "#FFEAA7"
Creating Segments
- Navigate to the start of an event
- Press [ or click "Mark Start"
- Navigate to the end of the event
- Press ] or click "Mark End"
- Select the event label
- Repeat for all events
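The steps above produce a set of labeled time spans, and it can help to sanity-check them after export. The sketch below is a minimal illustration in Python, not Potato's own data model: the `Segment` class, its field names, and the `find_overlaps` helper are hypothetical, the times are made-up sample values, and the frame conversion assumes a constant frame rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One labeled time span. Field names are illustrative, not Potato's schema."""
    label: str
    start: float  # seconds from the beginning of the clip
    end: float    # seconds from the beginning of the clip

    def __post_init__(self):
        # Reject spans whose end does not come after their start.
        if self.start < 0 or self.end <= self.start:
            raise ValueError(f"invalid segment bounds: {self.start}..{self.end}")

    def frame_range(self, fps: float) -> tuple[int, int]:
        """Nearest frame indices for start and end, assuming a constant frame rate."""
        return round(self.start * fps), round(self.end * fps)


def find_overlaps(segments):
    """Return pairs of segments whose time spans intersect."""
    ordered = sorted(segments, key=lambda s: s.start)
    overlaps = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second.start >= first.end:
                break  # sorted by start, so nothing later can overlap `first`
            overlaps.append((first, second))
    return overlaps


# Made-up sample values using labels from the event detection config.
segments = [
    Segment("title_card", 0.0, 3.5),
    Segment("conversation", 3.0, 12.0),
    Segment("transition", 12.0, 13.0),
]
for a, b in find_overlaps(segments):
    print(f"{a.label} overlaps {b.label}")
print(segments[1].frame_range(fps=30))
```

Whether overlapping segments count as an error depends on the task; for mutually exclusive event types, a check like this can flag clips that need a second look.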
Marking key moments
Tag specific moments within a clip:
annotation_schemes:
  - annotation_type: video_annotation
    name: key_moments
    description: "Mark important moments"
    labels:
      - name: action_start
        description: "When the action begins"
        color: "#22C55E"
      - name: action_peak
        description: "Most intense moment"
        color: "#EF4444"
      - name: action_end
        description: "When the action completes"
        color: "#3B82F6"
Per-video classification
Classify a whole video with the standard annotation types:
annotation_schemes:
  - annotation_type: radio
    name: video_label
    description: "What is happening in this video?"
    labels:
      - Person visible
      - No person
      - Transition/blur
Video with category labels
Annotate videos with category labels:
annotation_schemes:
  - annotation_type: video_annotation
    name: video_labels
    description: "Label video content"
    labels:
      - name: person
        color: "#FF6B6B"
      - name: vehicle
        color: "#4ECDC4"
      - name: ball
        color: "#FFEAA7"
A complete configuration
annotation_task_name: "Sports Video Analysis"
data_files:
  - "data/sports_clips.json"
output_annotation_dir: "annotations/"
export_annotation_format: "jsonl"
annotation_schemes:
  # Game events
  - annotation_type: video_annotation
    name: game_events
    description: "Mark game events"
    labels:
      - name: goal
        color: "#22C55E"
      - name: foul
        color: "#EF4444"
      - name: corner_kick
        color: "#3B82F6"
      - name: free_kick
        color: "#F59E0B"
      - name: penalty
        color: "#EC4899"
      - name: offside
        color: "#8B5CF6"
  # Clip-level annotation
  - annotation_type: multiselect
    name: clip_tags
    description: "Tags for this clip"
    labels:
      - Highlight worthy
      - Good camera angle
      - Multiple players
      - Close-up
      - Wide shot
      - Slow motion available
annotation_guidelines:
  title: "Sports Video Annotation Guide"
  content: |
    ## Event Marking
    - Mark events from when they START
    - Include the full play sequence
    - Goal: From shot to ball crossing line
    ## Navigation
    - Space: Play/Pause
    - Arrow keys: Frame navigation
Output Format
{
  "id": "clip_001",
  "video_path": "/videos/match_highlight.mp4",
  "annotations": {
    "game_events": ["goal", "corner_kick"],
    "clip_tags": ["Highlight worthy", "Good camera angle"]
  }
}
Tips for video annotation
- First pass overview: Watch at normal speed first
- Slow motion for precision: Use 0.25x for exact timestamps
- Keyboard shortcuts: Much faster than mouse
- Take breaks: Video annotation is visually demanding
- Consistent criteria: Document edge cases clearly
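One way to check whether criteria are being applied consistently is to tally how often each label is used across the exported records. The sketch below is a minimal Python illustration, assuming the export is JSON Lines with one record per line in the shape shown under Output Format. The `tally_labels` helper is hypothetical, and the second sample record (`clip_002`) is made up for the example.

```python
import json
from collections import Counter


def tally_labels(lines, scheme):
    """Count how often each label appears under `scheme` across JSONL records."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        value = record.get("annotations", {}).get(scheme, [])
        # Assumption: a single-choice scheme may store one string rather than a list.
        labels = [value] if isinstance(value, str) else value
        counts.update(labels)
    return counts


# Sample records; in practice these lines would come from the exported file.
sample = [
    '{"id": "clip_001", "annotations": {"game_events": ["goal", "corner_kick"]}}',
    '{"id": "clip_002", "annotations": {"game_events": ["foul", "goal"]}}',
]
for label, count in tally_labels(sample, "game_events").most_common():
    print(label, count)
```

A label that is used far more or far less often than expected is often a sign that the guidelines leave an edge case open.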
Next Steps
- Learn image comparison for video quality assessment
- Set up crowdsourcing for large-scale video annotation
Full documentation at /docs/features/image-annotation (video section).