# Video Annotation

Source: https://www.potatoannotator.com/docs/guides/video-annotation

**Video annotation adds a time axis to image work. The same clip can be labeled as a whole, segmented into time intervals ("the goal happens from 0:12 to 0:15"), or annotated frame by frame.** Potato provides frame navigation and temporal controls so annotators can move through a clip precisely.

Video tasks are central to [activity recognition](https://en.wikipedia.org/wiki/Activity_recognition) and [object tracking](https://en.wikipedia.org/wiki/Video_tracking).

## Clip-level classification

The simplest task: one label for the whole clip.

```yaml
annotation_schemes:
  - annotation_type: radio
    name: action
    description: "What is the main action in this clip?"
    labels: [Walking, Running, Sitting, Jumping, Other]
```

## Temporal segments: when something happens

To mark intervals on the timeline, use a span over the video's time axis, just like [sound event detection](/docs/guides/audio-annotation) does for audio.

```yaml
annotation_schemes:
  - annotation_type: span
    name: events
    description: "Mark the start and end of each event and label it."
    labels: [Goal, Foul, Substitution, Replay]
```
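When segment boundaries are recorded as timestamps such as "0:12" to "0:15", downstream code often needs them as seconds or frame indices. The sketch below shows one way to do that conversion. The helper names `parse_timestamp` and `segment_to_frames` are hypothetical and not part of Potato, and the code assumes a constant frame rate and a half-open interval (the end time itself is excluded).

```python
from __future__ import annotations

import math


def parse_timestamp(ts: str) -> float:
    """Convert 'SS', 'M:SS', or 'H:MM:SS' (seconds may be fractional) to seconds."""
    seconds = 0.0
    for part in ts.split(":"):
        # Each colon-separated field is worth 60x the one after it.
        seconds = seconds * 60 + float(part)
    return seconds


def segment_to_frames(start: str, end: str, fps: float) -> tuple[int, int]:
    """Return 0-based (first, last) frame indices covered by [start, end).

    Assumes a constant frame rate; variable-frame-rate video would need
    per-frame timestamps instead.
    """
    first = math.floor(parse_timestamp(start) * fps)
    last = math.ceil(parse_timestamp(end) * fps) - 1
    # A segment shorter than one frame still covers the frame it starts in.
    return first, max(first, last)


if __name__ == "__main__":
    print(segment_to_frames("0:12", "0:15", fps=30))  # (360, 449)
```

Keeping the interval half-open means adjacent segments (0:12 to 0:15, then 0:15 to 0:20) never claim the same frame.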

## Per-frame annotation and tracking

Frame-level work means classifying individual frames or tracking an object across frames: annotators step through the video and annotate each frame they visit. Decide on a sampling rate up front (every frame, every Nth frame, or keyframes only). Labeling every frame is expensive, so most projects subsample.
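To make the sampling decision concrete, here is a minimal sketch of fixed-stride subsampling. The function names are hypothetical, the code is independent of Potato, and it assumes a constant frame rate. Keyframe-based selection would need a video library and is not shown.

```python
from __future__ import annotations


def stride_for_rate(video_fps: float, target_fps: float) -> int:
    """Stride (in frames) that approximates annotating target_fps frames per second."""
    if video_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")
    # Never go below 1: you cannot annotate more frames than the video has.
    return max(1, round(video_fps / target_fps))


def sample_frames(total_frames: int, stride: int = 1) -> list[int]:
    """Return the 0-based indices of every `stride`-th frame, starting at frame 0."""
    if stride < 1:
        raise ValueError("stride must be at least 1")
    return list(range(0, total_frames, stride))


if __name__ == "__main__":
    # A 10-second clip at 30 fps has 300 frames.
    stride = stride_for_rate(video_fps=30, target_fps=2)
    frames = sample_frames(total_frames=300, stride=stride)
    print(stride, len(frames))  # 15 20
```

In this example, annotating at roughly 2 frames per second cuts the work from 300 frames to 20, which is the kind of trade-off worth settling before annotators start.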

## Keeping video annotation consistent

- **Boundary precision.** Agree on how exact segment start and end times must be; frame-level precision is costly.
- **Occlusion and exit.** Write rules for when a tracked object is hidden or leaves the frame.
- **Workload.** Video is the most time-consuming modality. Run a pilot to estimate cost before scaling, and consider [LLM/vision pre-annotation](/docs/guides/llm-pre-annotation) to seed labels.
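One way to turn a boundary-precision rule into something checkable is to compare two annotators' segments, either by temporal overlap or by a fixed tolerance on each boundary. The sketch below illustrates both. The function names and the tolerance value are assumptions for illustration, not Potato features, and segments are `(start, end)` pairs in seconds.

```python
from __future__ import annotations

Segment = tuple[float, float]  # (start_seconds, end_seconds)


def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection-over-union of two time intervals; 0.0 if they do not overlap."""
    intersection = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - intersection
    return intersection / union if union > 0 else 0.0


def boundaries_agree(a: Segment, b: Segment, tolerance: float) -> bool:
    """True if both start and end differ by at most `tolerance` seconds."""
    return abs(a[0] - b[0]) <= tolerance and abs(a[1] - b[1]) <= tolerance


if __name__ == "__main__":
    annotator_1 = (12.0, 15.0)
    annotator_2 = (12.5, 15.5)
    print(round(temporal_iou(annotator_1, annotator_2), 3))
    print(boundaries_agree(annotator_1, annotator_2, tolerance=0.5))
```

A tolerance check matches a guideline like "boundaries within half a second", while overlap is more forgiving for long segments. Which one fits depends on how the labels will be used.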

## Further reading

- [Audio Annotation](/docs/guides/audio-annotation): the same temporal-span ideas, applied to audio
- [Image Annotation](/docs/guides/image-annotation)
- [Span Annotation](/docs/guides/span-annotation)