# Multi-Object Tracking Annotation

Source: https://www.potatoannotator.com/blog/multi-object-tracking

Multi-object tracking (MOT) annotation produces training data for applications such as surveillance, self-driving cars, and sports analytics. This post walks through the core ideas behind MOT annotation and shows how far Potato's current video features will take you on a basic tracking workflow.

A quick caveat before you read on: Potato does not have dedicated MOT tooling yet. If you need automatic ID management and interpolation today, you will probably want a specialized tool. But for smaller jobs, the manual approach below works fine.

## What makes MOT annotation hard

- Keeping object IDs consistent from one frame to the next
- Dealing with objects that get occluded and then come back
- Following objects through crowded scenes
- Sorting out ID switches and merges

## What Potato's video annotation does today

Potato handles basic video annotation through the `video_annotation` type. The MOT-specific niceties (automatic ID management, interpolation, occlusion handling) are not implemented yet, but you can still set up a basic video labeling workflow.

### Basic video annotation setup

```yaml
annotation_task_name: "Video Object Labeling"

data_files:
  - data/videos.json

annotation_schemes:
  - annotation_type: video_annotation
    name: objects
    description: "Label objects in video frames"
    video_path: video
    labels:
      - name: person
      - name: vehicle
      - name: cyclist
```

### Sample data format

Your `data/videos.json` file holds entries with video paths:

```json
[
  {
    "id": "video_001",
    "video": "/path/to/video.mp4"
  },
  {
    "id": "video_002",
    "video": "/path/to/another_video.mp4"
  }
]
```
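
If you have a folder full of clips, writing this file by hand gets old fast. Here is a minimal sketch that builds the same `id` / `video` structure from a directory. The function names, the `video_NNN` numbering, and the list of extensions are assumptions made for illustration; they are not part of Potato.

```python
import json
from pathlib import Path


def build_video_manifest(video_dir, extensions=(".mp4", ".webm", ".mov")):
    """Return one {"id", "video"} entry per video file, sorted by file name."""
    files = sorted(
        p for p in Path(video_dir).iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    )
    # IDs are assigned in sorted order: video_001, video_002, ...
    return [
        {"id": f"video_{i:03d}", "video": str(p)}
        for i, p in enumerate(files, start=1)
    ]


def write_manifest(entries, out_path):
    """Write the entries as a JSON array, creating parent folders if needed."""
    out = Path(out_path)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(entries, indent=2))
```

Calling `write_manifest(build_video_manifest("videos"), "data/videos.json")` would produce a file in the shape shown above, assuming your clips live in a `videos` folder.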

## Tracking by hand

Without dedicated MOT features, you can still track objects manually. It is more tedious, but it works.

### Building tracks one frame at a time

1. Go to the frame where an object first shows up
2. Label it in the video annotation interface
3. Give it a consistent identifier in the annotation, like "person_001"
4. Step through the following frames and keep labeling it with that same identifier
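
Because the identifiers are typed by hand, it is worth checking them afterwards. The sketch below assumes you have already turned your annotations into a flat list of records with `frame`, `track_id`, and `label` keys. That layout is hypothetical and chosen for illustration; it is not Potato's export format, so you would need a small conversion step first.

```python
from collections import defaultdict


def check_track_consistency(records):
    """Return a list of human-readable problems found in manual track labels."""
    problems = []
    labels_by_track = defaultdict(set)
    seen = set()
    for rec in records:
        key = (rec["frame"], rec["track_id"])
        # The same identifier should not be used twice in one frame.
        if key in seen:
            problems.append(
                f"{rec['track_id']} appears twice in frame {rec['frame']}"
            )
        seen.add(key)
        labels_by_track[rec["track_id"]].add(rec["label"])
    # One identifier should always carry the same label.
    for track_id, labels in sorted(labels_by_track.items()):
        if len(labels) > 1:
            problems.append(
                f"{track_id} has conflicting labels: {sorted(labels)}"
            )
    return problems
```

An empty list means no duplicate or conflicting identifiers were found. It does not tell you whether the identifier is attached to the right object, so a visual review pass is still needed.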

### Dealing with occlusions

When an object disappears behind something:

1. Note the last frame where you could see it
2. When it comes back, reuse the same identifier so the track stays continuous
3. Jot down the occlusion period in your annotation notes
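
If you label the object in every frame where it is visible, the occlusion periods can be recovered from the frame numbers alone, which makes a handy cross-check against your notes. This is a small sketch under that assumption; `find_gaps` is a hypothetical helper, not a Potato function.

```python
def find_gaps(frames):
    """Return (last_seen, reappeared) pairs for a single track.

    `frames` is the collection of frame numbers where the track was labeled.
    A gap is any jump of more than one frame between consecutive labels.
    """
    ordered = sorted(set(frames))
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > 1]
```

For a track labeled in frames 1, 2, 3, 7, and 8, this reports one gap: last seen in frame 3, back in frame 7.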

## Features we are thinking about

These would make Potato much better at MOT, and they are on the list for future work:

- Automatic ID assignment that auto-increments IDs for new objects
- Track interpolation, linear or cubic, between keyframes
- Occlusion handling with visibility levels (visible, partial, heavy, not_visible)
- Trajectory visualization to show object paths across frames
- A track management panel for merging, splitting, and managing track IDs
- Per-frame attributes for properties that change frame to frame

If any of these matter to you, get in touch with the Potato team or contribute the feature yourself.
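
To make the linear interpolation item on that list concrete, here is a minimal sketch of what filling in boxes between two keyframes involves. It is not Potato code. The `(x, y, width, height)` box layout and the function name are assumptions for illustration.

```python
def interpolate_boxes(frame_a, box_a, frame_b, box_b):
    """Linearly interpolate boxes for the frames strictly between two keyframes.

    Boxes are (x, y, width, height) tuples. Returns {frame_number: box}.
    """
    if frame_b <= frame_a:
        raise ValueError("frame_b must come after frame_a")
    span = frame_b - frame_a
    result = {}
    for frame in range(frame_a + 1, frame_b):
        # t runs from 0 at frame_a to 1 at frame_b.
        t = (frame - frame_a) / span
        result[frame] = tuple(a + t * (b - a) for a, b in zip(box_a, box_b))
    return result
```

Linear interpolation assumes the object moves and resizes at a constant rate between keyframes, so it works best when keyframes are placed wherever the motion changes.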

## Tips for manual MOT annotation

1. Work in short segments of 100 to 200 frames at a time.
2. Use a clear ID scheme like "person_001" or "vehicle_023" and stick to it.
3. Keep notes about occlusions and the track decisions you made.
4. Do a review pass: watch the segment forward, then backward, to catch errors.
5. Lean on external tools. Pre-annotating with an object detection model gives you starting boxes to correct, which saves a lot of clicking.
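
The first two tips are easy to script. The sketch below splits a video into fixed-length frame ranges and formats zero-padded IDs. Both helpers are hypothetical, and the default of 150 frames is simply a value inside the 100 to 200 range suggested above.

```python
def frame_segments(total_frames, segment_length=150):
    """Return (start, end) frame ranges covering the video, end exclusive."""
    if segment_length <= 0:
        raise ValueError("segment_length must be positive")
    return [
        (start, min(start + segment_length, total_frames))
        for start in range(0, total_frames, segment_length)
    ]


def make_track_id(label, number, width=3):
    """Format an ID such as person_001 from a label and a running number."""
    return f"{label}_{number:0{width}d}"
```

A 400-frame clip splits into frames 0 to 149, 150 to 299, and 300 to 399, and `make_track_id("vehicle", 23)` gives `vehicle_023`.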

## Other ways to go about it

If you need full MOT capabilities now, here are a few routes:

1. Run a hybrid workflow: do the initial labeling in Potato, then hand off to a specialized MOT tool for track management.
2. Pre-annotate with object detectors to generate starting bounding boxes, then refine them in Potato.
3. Export your Potato annotations and run tracking algorithms on them after the fact.
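
As a rough idea of what the third route can look like, here is a deliberately simple greedy tracker that links boxes between consecutive frames by overlap (intersection over union). It is a sketch, not a production tracker and not part of Potato. It assumes you have already converted your annotations into one list of `(x1, y1, x2, y2)` boxes per frame, and the `0.3` threshold is an arbitrary starting value.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_tracks(frames, threshold=0.3):
    """Assign integer track IDs to boxes, frame by frame.

    `frames` is a list of lists of boxes, in frame order. The result has the
    same shape, with a track ID in place of each box.
    """
    next_id = 1
    previous = []  # (track_id, box) pairs from the previous frame
    output = []
    for boxes in frames:
        # Score every (previous box, current box) pair, best overlap first.
        pairs = sorted(
            (
                (iou(prev_box, box), pi, bi)
                for pi, (_, prev_box) in enumerate(previous)
                for bi, box in enumerate(boxes)
            ),
            reverse=True,
        )
        assigned = {}
        used_previous = set()
        for score, pi, bi in pairs:
            if score < threshold:
                break
            if pi in used_previous or bi in assigned:
                continue
            assigned[bi] = previous[pi][0]
            used_previous.add(pi)
        # Any box left unmatched starts a new track.
        ids = []
        for bi in range(len(boxes)):
            if bi not in assigned:
                assigned[bi] = next_id
                next_id += 1
            ids.append(assigned[bi])
        output.append(ids)
        previous = list(zip(ids, boxes))
    return output
```

Because it only looks one frame back, this sketch gives a reappearing object a fresh ID after any occlusion. That limitation is the main reason established tracking algorithms or a dedicated MOT tool are worth using for anything beyond short, uncrowded clips.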

## Where to go next

- Learn about [video frame annotation](/blog/video-frame-annotation)
- Explore [image annotation features](/docs/features/image-annotation)
- Read about [inter-annotator agreement](/blog/inter-annotator-agreement) for quality control

For the full picture on how video annotation works in Potato, see the [source documentation](https://github.com/davidjurgens/potato/blob/master/docs/annotation-types/multimedia/video_annotation.md).

---

*For current video annotation documentation, see [/docs/features/image-annotation](/docs/features/image-annotation).*