# Video Annotation with Frame-by-Frame Controls

Source: https://www.potatoannotator.com/blog/video-frame-annotation

Video annotation is how you build training data for things like action recognition, object tracking, and temporal event detection. This tutorial covers frame navigation, segmenting a clip in time, and keeping the labeling fast. For the full video scheme reference, see the [source documentation](https://github.com/davidjurgens/potato/blob/master/docs/annotation-types/multimedia/video_annotation.md).

## What video annotation can do

Potato supports three kinds of video task:

- **Video classification**: Label an entire clip
- **Multi-label tagging**: Apply several tags to one video
- **Temporal annotation**: Mark events at specific points or spans in a video

## Basic Video Setup

```yaml
annotation_task_name: "Video Action Recognition"

data_files:
  - "data/videos.json"

annotation_schemes:
  - annotation_type: video_annotation
    name: action
    description: "What action is shown in this video?"
    labels:
      - Walking
      - Running
      - Jumping
      - Sitting
      - Standing
      - Other
```

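## Frame-by-Frame Navigation

This scheme labels frames as you step through a clip. The annotation guidelines in the complete configuration later in this tutorial list Space for play/pause and the arrow keys for frame navigation.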

```yaml
annotation_schemes:
  - annotation_type: video_annotation
    name: frame_label
    description: "Annotate video frames"
    labels:
      - Action
      - No action
```
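Frame-level labels usually need to be converted to timestamps (or back) in downstream processing. The following is a minimal Python sketch of that arithmetic, assuming a constant frame rate and zero-based frame indices. The function names are illustrative and are not part of Potato.

```python
import math


def frame_to_seconds(frame_index: int, fps: float) -> float:
    """Return the start time, in seconds, of a zero-based frame index."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    if frame_index < 0:
        raise ValueError("frame_index must be non-negative")
    return frame_index / fps


def seconds_to_frame(seconds: float, fps: float) -> int:
    """Return the zero-based index of the frame on screen at `seconds`."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    # The small epsilon guards against float rounding just below a frame boundary.
    return math.floor(seconds * fps + 1e-9)
```

For variable-frame-rate video this mapping does not hold, and the per-frame timestamps stored in the container would be needed instead.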

## Temporal segment annotation

Mark events with a start and end time:

```yaml
annotation_task_name: "Video Event Detection"

data_files:
  - "data/videos.json"

annotation_schemes:
  - annotation_type: video_annotation
    name: events
    description: "Mark all events and their duration"
    labels:
      - name: conversation
        color: "#4ECDC4"
      - name: action_sequence
        color: "#FF6B6B"
      - name: transition
        color: "#45B7D1"
      - name: title_card
        color: "#FFEAA7"
```

### Creating Segments

1. Navigate to the start of an event
2. Press `[` or click "Mark Start"
3. Navigate to the end of the event
4. Press `]` or click "Mark End"
5. Select the event label
6. Repeat for all events
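
The steps above produce a set of labeled time spans. As a hedged illustration of how such spans might be checked after export, here is a small Python sketch. The `Segment` class and its field names are hypothetical and do not reflect Potato's internal or exported representation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Segment:
    """One labeled time span (hypothetical structure, times in seconds)."""
    label: str
    start: float
    end: float


def validate_segment(seg: Segment) -> None:
    """Raise ValueError if the segment has a negative start or no duration."""
    if seg.start < 0:
        raise ValueError(f"{seg.label}: start must be non-negative")
    if seg.end <= seg.start:
        raise ValueError(f"{seg.label}: end must come after start")


def find_overlaps(segments: List[Segment]) -> List[Tuple[Segment, Segment]]:
    """Return every pair of segments whose time spans overlap."""
    ordered = sorted(segments, key=lambda s: (s.start, s.end))
    overlaps = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second.start >= first.end:
                break  # sorted by start, so no later segment overlaps `first`
            overlaps.append((first, second))
    return overlaps
```

Overlapping events may be perfectly valid for a given task, so this sketch only reports them rather than rejecting them. Segments that merely touch (one ends exactly where the next starts) are not counted as overlapping.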

## Marking key moments

Tag specific moments within a clip:

```yaml
annotation_schemes:
  - annotation_type: video_annotation
    name: key_moments
    description: "Mark important moments"
    labels:
      - name: action_start
        description: "When the action begins"
        color: "#22C55E"
      - name: action_peak
        description: "Most intense moment"
        color: "#EF4444"
      - name: action_end
        description: "When the action completes"
        color: "#3B82F6"
```
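
The three labels above imply an ordering: the action starts, peaks, then ends. A quick sanity check on marked moments could look like the Python sketch below. It assumes, purely for illustration, that each clip's moments are available as a mapping from label name to a timestamp in seconds; that structure is not taken from Potato.

```python
from typing import Mapping

# Expected chronological order of the labels defined in the scheme above.
KEY_MOMENT_ORDER = ["action_start", "action_peak", "action_end"]


def moments_in_order(moments: Mapping[str, float]) -> bool:
    """Return True if the marked moments satisfy start <= peak <= end.

    Labels that were not marked are skipped rather than treated as errors.
    """
    times = [moments[name] for name in KEY_MOMENT_ORDER if name in moments]
    return all(earlier <= later for earlier, later in zip(times, times[1:]))
```

Clips that fail the check are good candidates for a second look or for a note in the edge-case guidelines.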

## Per-video classification

Classify a whole video with the standard annotation types:

```yaml
annotation_schemes:
  - annotation_type: radio
    name: video_label
    description: "What is happening in this video?"
    labels:
      - Person visible
      - No person
      - Transition/blur
```

## Video with category labels

Each label takes a `name` and a display `color`:

```yaml
annotation_schemes:
  - annotation_type: video_annotation
    name: video_labels
    description: "Label video content"
    labels:
      - name: person
        color: "#FF6B6B"
      - name: vehicle
        color: "#4ECDC4"
      - name: ball
        color: "#FFEAA7"
```

## A complete configuration

```yaml
annotation_task_name: "Sports Video Analysis"

data_files:
  - "data/sports_clips.json"

output_annotation_dir: "annotations/"
export_annotation_format: "jsonl"

annotation_schemes:
  # Game events
  - annotation_type: video_annotation
    name: game_events
    description: "Mark game events"
    labels:
      - name: goal
        color: "#22C55E"
      - name: foul
        color: "#EF4444"
      - name: corner_kick
        color: "#3B82F6"
      - name: free_kick
        color: "#F59E0B"
      - name: penalty
        color: "#EC4899"
      - name: offside
        color: "#8B5CF6"

  # Clip-level annotation
  - annotation_type: multiselect
    name: clip_tags
    description: "Tags for this clip"
    labels:
      - Highlight worthy
      - Good camera angle
      - Multiple players
      - Close-up
      - Wide shot
      - Slow motion available

annotation_guidelines:
  title: "Sports Video Annotation Guide"
  content: |
    ## Event Marking
    - Mark events from when they START
    - Include the full play sequence
    - Goal: From shot to ball crossing line

    ## Navigation
    - Space: Play/Pause
    - Arrow keys: Frame navigation
```
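
With six color-coded labels, typos in names or colors are easy to miss. The Python sketch below checks a label list for duplicate names and for colors that are not in the six-digit `#RRGGBB` form used throughout this tutorial. It works on plain dictionaries that mirror the `labels` entries above; the `check_labels` helper is hypothetical, and whether Potato accepts other color formats is not covered here.

```python
import re
from typing import Dict, List

# Six-digit hex colors, the form used in this tutorial's configs.
HEX_COLOR = re.compile(r"#[0-9A-Fa-f]{6}")


def check_labels(labels: List[Dict[str, str]]) -> List[str]:
    """Return a list of human-readable problems found in the label list."""
    problems = []
    seen = set()
    for label in labels:
        name = label.get("name")
        color = label.get("color", "")
        if not name:
            problems.append("label without a name")
            continue
        if name in seen:
            problems.append(f"duplicate label name: {name}")
        seen.add(name)
        if not HEX_COLOR.fullmatch(color):
            problems.append(f"{name}: color {color!r} is not a #RRGGBB value")
    return problems


game_events = [
    {"name": "goal", "color": "#22C55E"},
    {"name": "foul", "color": "#EF4444"},
    {"name": "corner_kick", "color": "#3B82F6"},
]
print(check_labels(game_events))  # an empty list means no problems were found
```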

## Output Format

Each annotated clip produces a record like the following (pretty-printed here for readability):

```json
{
  "id": "clip_001",
  "video_path": "/videos/match_highlight.mp4",
  "annotations": {
    "game_events": ["goal", "corner_kick"],
    "clip_tags": ["Highlight worthy", "Good camera angle"]
  }
}
```
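
To get a quick picture of label distribution, records shaped like the one above can be tallied with a few lines of Python. This sketch assumes one JSON object per line (as the `jsonl` export setting suggests) and that each scheme's value is a list of label strings, as in the example record. The `count_labels` function is illustrative, not a Potato utility.

```python
import json
from collections import Counter
from typing import Iterable


def count_labels(lines: Iterable[str], scheme: str) -> Counter:
    """Count how often each label appears under `scheme` across all records."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        # Assumes the scheme's value is a list of label strings.
        counts.update(record.get("annotations", {}).get(scheme, []))
    return counts


sample = [
    '{"id": "clip_001", "annotations": {"game_events": ["goal", "corner_kick"]}}',
    '{"id": "clip_002", "annotations": {"game_events": ["goal"]}}',
]
print(count_labels(sample, "game_events"))
```

Any iterable of lines works as input, so an open file handle on the exported file can be passed in place of `sample`.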

## Tips for video annotation

1. **First pass overview**: Watch the clip at normal speed before marking anything
2. **Slow motion for precision**: Drop to 0.25x playback when you need exact timestamps
3. **Keyboard shortcuts**: They are much faster than the mouse
4. **Take breaks**: Video annotation is visually demanding
5. **Consistent criteria**: Document edge cases clearly in the guidelines

## Next Steps

- Learn [image comparison](/blog/image-comparison-preference) for video quality assessment
- Set up [crowdsourcing](/blog/mturk-deployment) for large-scale video annotation

---

*Full documentation at [/docs/features/image-annotation](/docs/features/image-annotation) (video section).*
