Video Annotation
Annotate temporal segments, classify frames, mark keyframes, and track objects in videos.
Potato supports four kinds of video annotation: temporal segmentation, frame classification, keyframe marking, and object tracking. A combined mode exposes all of them in one interface.
Annotation Modes
| Mode | Description | Use Case |
|---|---|---|
| segment | Mark temporal ranges | Scene detection, speaker turns |
| frame | Classify individual frames | Frame-level labeling |
| keyframe | Mark important moments | Highlight detection |
| tracking | Track objects across frames | Object tracking |
| combined | All modes in one interface | Complex annotation tasks |
Basic Configuration
annotation_schemes:
  - name: "video_segments"
    description: "Mark segments where the speaker changes"
    annotation_type: "video_annotation"
    labels:
      - name: "Speaker A"
        color: "#FF6B6B"
      - name: "Speaker B"
        color: "#4ECDC4"
Configuration Options
| Field | Type | Default | Description |
|---|---|---|---|
| name | string | Required | Unique identifier for the annotation |
| description | string | Required | Instructions shown to annotators |
| annotation_type | string | Required | Must be "video_annotation" |
| labels | list | Required | Available annotation labels |
| mode | string | "segment" | Annotation mode |
| min_segments | integer | 0 | Minimum required annotations |
| max_segments | integer | null | Maximum allowed annotations |
| timeline_height | integer | 70 | Timeline height in pixels |
| overview_height | integer | 40 | Overview bar height in pixels |
| zoom_enabled | boolean | true | Enable timeline zoom |
| playback_rate_control | boolean | true | Show playback speed selector |
| frame_stepping | boolean | true | Enable frame-by-frame navigation |
| show_timecode | boolean | true | Display frame number and time |
| video_fps | integer | 30 | Video frame rate for calculations |
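The table above can be read as a set of required fields plus defaults. The following is a minimal Python sketch of how a scheme could be checked before launching a task. It is not Potato's own validation code: `resolve_scheme` is a hypothetical helper, and the `max_segments >= min_segments` check is an assumption rather than documented behavior.

```python
# Hypothetical pre-flight check for a video_annotation scheme.
# Field names and defaults are copied from the Configuration Options table.

VALID_MODES = {"segment", "frame", "keyframe", "tracking", "combined"}
REQUIRED_FIELDS = ("name", "description", "annotation_type", "labels")
DEFAULTS = {
    "mode": "segment",
    "min_segments": 0,
    "max_segments": None,
    "timeline_height": 70,
    "overview_height": 40,
    "zoom_enabled": True,
    "playback_rate_control": True,
    "frame_stepping": True,
    "show_timecode": True,
    "video_fps": 30,
}


def resolve_scheme(scheme: dict) -> dict:
    """Return a copy of `scheme` with table defaults applied, or raise ValueError."""
    missing = [field for field in REQUIRED_FIELDS if field not in scheme]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    if scheme["annotation_type"] != "video_annotation":
        raise ValueError('annotation_type must be "video_annotation"')
    # Values given in the scheme override the defaults.
    resolved = {**DEFAULTS, **scheme}
    if resolved["mode"] not in VALID_MODES:
        raise ValueError(f"unknown mode: {resolved['mode']!r}")
    # Assumption: an upper bound below the lower bound is a configuration mistake.
    max_segments = resolved["max_segments"]
    if max_segments is not None and max_segments < resolved["min_segments"]:
        raise ValueError("max_segments must be >= min_segments")
    return resolved


scheme = resolve_scheme({
    "name": "video_segments",
    "description": "Mark segments where the speaker changes",
    "annotation_type": "video_annotation",
    "labels": ["Speaker A", "Speaker B"],
})
print(scheme["mode"], scheme["video_fps"])  # segment 30
```

Running a check like this over a config before starting the server surfaces typos in `mode` or missing fields early, rather than in the annotator's browser.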
Label Configuration
Labels can be simple strings or detailed objects with colors and keyboard shortcuts:
labels:
  - name: "intro"
    color: "#4ECDC4"
    key_value: "1"
  - name: "content"
    color: "#3B82F6"
    key_value: "2"
  - name: "outro"
    color: "#8B5CF6"
    key_value: "3"
Examples by Mode
Segment Mode (Default)
Mark temporal ranges with start and end times:
annotation_schemes:
  - name: "scene_detection"
    description: "Mark each scene in the video"
    annotation_type: "video_annotation"
    mode: "segment"
    labels:
      - name: "indoor"
        color: "#3B82F6"
        key_value: "1"
      - name: "outdoor"
        color: "#22C55E"
        key_value: "2"
      - name: "transition"
        color: "#F59E0B"
        key_value: "3"
    zoom_enabled: true
    playback_rate_control: true
Frame Mode
Classify individual frames:
annotation_schemes:
  - name: "frame_quality"
    description: "Classify each frame's quality"
    annotation_type: "video_annotation"
    mode: "frame"
    labels:
      - name: "good"
        color: "#22C55E"
      - name: "blurry"
        color: "#F59E0B"
      - name: "occluded"
        color: "#EF4444"
    frame_stepping: true
    show_timecode: true
Keyframe Mode
Mark important moments:
annotation_schemes:
  - name: "highlights"
    description: "Mark key moments in the video"
    annotation_type: "video_annotation"
    mode: "keyframe"
    labels:
      - name: "goal"
        color: "#22C55E"
      - name: "foul"
        color: "#EF4444"
      - name: "highlight"
        color: "#F59E0B"
    frame_stepping: true
Tracking Mode
Track objects across frames with bounding boxes:
annotation_schemes:
  - name: "object_tracking"
    description: "Track the ball throughout the video"
    annotation_type: "video_annotation"
    mode: "tracking"
    labels:
      - name: "ball"
        color: "#3B82F6"
      - name: "player"
        color: "#22C55E"
    frame_stepping: true
    video_fps: 30
Combined Mode
Use multiple annotation types in one interface:
annotation_schemes:
  - name: "comprehensive"
    description: "Full video annotation"
    annotation_type: "video_annotation"
    mode: "combined"
    labels:
      - name: "action"
        color: "#3B82F6"
      - name: "dialogue"
        color: "#22C55E"
      - name: "transition"
        color: "#F59E0B"
    zoom_enabled: true
    playback_rate_control: true
    frame_stepping: true
    show_timecode: true
Video Display
For simple video playback without annotation (e.g., to accompany other annotation types), use the video type:
annotation_schemes:
  - name: "video_player"
    description: "Watch the video clip"
    annotation_type: "video"
    video_path: "{{video_url}}"
    controls: true
    autoplay: false
    loop: false
    muted: false
Keyboard Shortcuts
| Key | Action |
|---|---|
| Space | Play/pause |
| , / . | Previous/next frame |
| [ | Mark segment start |
| ] | Mark segment end |
| Enter | Create segment |
| K | Mark keyframe |
| C | Classify current frame |
| Delete | Remove selected annotation |
| 1-9 | Select label |
| + / - | Zoom in/out |
Data Format
Input Data
Your data file should include video paths or URLs:
[
  {
    "id": "video_1",
    "video_url": "https://example.com/videos/sample1.mp4",
    "description": "Meeting recording - identify speakers"
  },
  {
    "id": "video_2",
    "video_url": "/data/videos/sample2.mp4",
    "description": "Activity video - label actions"
  }
]
Output Format
Output varies by mode:
Segment Mode:
{
  "id": "video_1",
  "annotations": {
    "scene_detection": [
      {
        "start": 0.0,
        "end": 5.2,
        "start_frame": 0,
        "end_frame": 156,
        "label": "indoor"
      }
    ]
  }
}
Frame Mode:
{
  "id": "video_1",
  "annotations": {
    "frame_quality": [
      {
        "frame": 120,
        "time": 4.0,
        "label": "good"
      }
    ]
  }
}
Keyframe Mode:
{
  "id": "video_1",
  "annotations": {
    "highlights": [
      {
        "frame": 450,
        "time": 15.0,
        "label": "goal",
        "note": "First goal of the match"
      }
    ]
  }
}
Tracking Mode:
{
  "id": "video_1",
  "annotations": {
    "object_tracking": [
      {
        "id": "track_1",
        "label": "ball",
        "frames": [
          {"frame": 0, "bbox": {"x": 100, "y": 200, "w": 30, "h": 30}},
          {"frame": 1, "bbox": {"x": 105, "y": 198, "w": 30, "h": 30}}
        ]
      }
    ]
  }
}
Supported Video Formats
- MP4 (recommended)
- WebM
- MOV
- OGG
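To catch unsupported files before annotators see them, the input data can be screened by file extension. The sketch below is illustrative only: `has_supported_extension` is a hypothetical helper, the mapping from the format list to file extensions is an assumption (OGG video, for instance, is sometimes stored as `.ogv`), and `video_3` is an invented item added to show a rejection.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumed extensions for the formats listed above (MP4, WebM, MOV, OGG).
SUPPORTED_EXTENSIONS = {".mp4", ".webm", ".mov", ".ogg"}


def has_supported_extension(video_url: str) -> bool:
    """Check the path part of a URL or local path against the assumed extensions."""
    path = urlparse(video_url).path  # drops any query string or fragment
    return PurePosixPath(path).suffix.lower() in SUPPORTED_EXTENSIONS


items = [
    {"id": "video_1", "video_url": "https://example.com/videos/sample1.mp4"},
    {"id": "video_2", "video_url": "/data/videos/sample2.mp4"},
    {"id": "video_3", "video_url": "/data/videos/sample3.avi"},  # invented example
]
unsupported = [item["id"] for item in items if not has_supported_extension(item["video_url"])]
print(unsupported)  # ['video_3']
```

An extension check says nothing about the codec inside the container, so a file that passes may still fail to play in a given browser.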
Video Object Tracking with Keyframe Interpolation
New in v2.2.0
The tracking mode now supports keyframe interpolation, allowing annotators to mark object positions at key frames and have intermediate frames automatically interpolated. This significantly speeds up object tracking tasks.
annotation_schemes:
  - name: "tracking"
    description: "Track objects with keyframe interpolation"
    annotation_type: "video_annotation"
    mode: "tracking"
    labels:
      - name: "person"
        color: "#3B82F6"
      - name: "vehicle"
        color: "#22C55E"
    frame_stepping: true
    video_fps: 30
Workflow
- Navigate to a keyframe and draw a bounding box around the object
- Skip ahead several frames and reposition the bounding box
- The system automatically interpolates positions for intermediate frames
- Review and adjust interpolated positions as needed
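The interpolation step in the workflow above can be illustrated with a short Python sketch. It assumes linear interpolation between two keyframes, which this page does not confirm as the method Potato uses, and `interpolate_bbox` is a hypothetical name. The entries follow the `frame` / `bbox` shape shown in the tracking output format.

```python
def interpolate_bbox(start: dict, end: dict) -> list[dict]:
    """Linearly interpolate bbox entries for the frames strictly between two keyframes."""
    span = end["frame"] - start["frame"]
    if span <= 0:
        raise ValueError("end keyframe must come after start keyframe")
    frames = []
    for frame in range(start["frame"] + 1, end["frame"]):
        t = (frame - start["frame"]) / span  # 0 < t < 1 between the keyframes
        bbox = {
            key: start["bbox"][key] + t * (end["bbox"][key] - start["bbox"][key])
            for key in ("x", "y", "w", "h")
        }
        frames.append({"frame": frame, "bbox": bbox})
    return frames


# Two annotator-drawn keyframes, four frames apart.
first = {"frame": 0, "bbox": {"x": 100, "y": 200, "w": 30, "h": 30}}
second = {"frame": 4, "bbox": {"x": 120, "y": 192, "w": 30, "h": 30}}
for entry in interpolate_bbox(first, second):
    print(entry["frame"], entry["bbox"])
# 1 {'x': 105.0, 'y': 198.0, 'w': 30.0, 'h': 30.0}
# 2 {'x': 110.0, 'y': 196.0, 'w': 30.0, 'h': 30.0}
# 3 {'x': 115.0, 'y': 194.0, 'w': 30.0, 'h': 30.0}
```

Linear interpolation only fits objects that move at roughly constant speed between keyframes. For fast or erratic motion, keyframes need to be placed closer together, which is why the final review step matters.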
The full set of valid annotation modes is: segment, frame, keyframe, tracking, and combined.
Technical Notes
- Uses Peaks.js for timeline visualization
- Standard HTML5 video elements (no additional server dependencies)
- Timeline supports drag-to-select for segment creation
Best Practices
- Use compressed videos - Large files slow down loading
- Set appropriate FPS - Match video_fps to your actual video frame rate
- Enable frame stepping - Essential for precise frame-level annotation
- Use playback rate control - Slow motion helps with detailed work
- Provide clear segment definitions - Define what constitutes segment boundaries
- Use distinct colors - Make labels visually distinguishable
- Consider timeline height - Increase for complex segmentation tasks
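The FPS advice can be made concrete with a little arithmetic. In the output examples on this page, frame numbers equal the time in seconds multiplied by 30, the default `video_fps`. The sketch below reproduces that relationship and shows what happens when the configured rate differs from the real one. The helper names are hypothetical, and rounding to the nearest frame is an assumption about how fractional frames are handled.

```python
def time_to_frame(seconds: float, video_fps: int = 30) -> int:
    """Convert a timestamp in seconds to the nearest frame index (rounding is assumed)."""
    return round(seconds * video_fps)


def frame_to_time(frame: int, video_fps: int = 30) -> float:
    """Convert a frame index back to a timestamp in seconds."""
    return frame / video_fps


# Segment taken from the segment-mode output example.
segment = {"start": 0.0, "end": 5.2, "start_frame": 0, "end_frame": 156, "label": "indoor"}

print(time_to_frame(segment["end"]))      # 156, matches end_frame at 30 fps
print(time_to_frame(segment["end"], 25))  # 130, the frame a 25 fps video is actually on
print(frame_to_time(450))                 # 15.0, matches the keyframe example
```

If a 25 fps video is annotated with `video_fps: 30`, the recorded frame numbers drift from the true frames by 26 frames after only 5.2 seconds, so frame-level labels end up attached to the wrong images.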