Multi-Object Tracking Annotation
An overview of multi-object tracking annotation concepts and how Potato's video annotation capabilities can support basic tracking workflows.
Multi-Object Tracking (MOT) annotation creates training data for surveillance, autonomous driving, and sports analytics. This tutorial discusses MOT annotation concepts and how Potato's current video annotation features can support basic tracking workflows.
MOT Annotation Challenges
- Maintaining consistent object IDs across frames
- Handling occlusions and re-appearances
- Tracking through crowded scenes
- Managing ID switches and merges
Current Video Annotation Support
Potato currently supports basic video annotation through the video_annotation type. MOT-specific features such as automatic ID management, interpolation, and occlusion handling are not yet implemented, but you can still set up basic video labeling workflows.
Basic Video Annotation Setup
annotation_task_name: "Video Object Labeling"
data_files:
  - data/videos.json
annotation_schemes:
  - annotation_type: video_annotation
    name: objects
    description: "Label objects in video frames"
    video_path: video
    labels:
      - name: person
      - name: vehicle
      - name: cyclist

Sample Data Format
Your data/videos.json file should contain entries with video paths:
[
  {
    "id": "video_001",
    "video": "/path/to/video.mp4"
  },
  {
    "id": "video_002",
    "video": "/path/to/another_video.mp4"
  }
]

Manual Tracking Workflow
Without dedicated MOT features, you can still perform tracking annotation manually:
Creating Tracks Manually
- Navigate to the frame where an object first appears
- Use the video annotation interface to label the object
- Include a consistent identifier in your annotation (e.g., "person_1")
- Move to subsequent frames and continue labeling with the same identifier
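Because identifiers are typed by hand in this workflow, it can help to check them after export. The following is a minimal sketch, assuming a hypothetical flat record format (dicts with "frame" and "track_id" keys) and a hypothetical ID pattern; Potato's actual export layout may differ, so adapt the field names accordingly.

```python
import re
from collections import defaultdict

# Hypothetical ID scheme: lowercase class name, underscore, number (e.g. "person_1").
ID_PATTERN = re.compile(r"^[a-z]+_\d+$")


def group_tracks(records):
    """Group per-frame records into {track_id: sorted list of frame numbers}.

    `records` is an assumed format: dicts with "frame" and "track_id" keys.
    Raises ValueError for identifiers that do not match ID_PATTERN.
    """
    tracks = defaultdict(set)
    for rec in records:
        track_id = rec["track_id"]
        if not ID_PATTERN.match(track_id):
            raise ValueError(f"Malformed track ID: {track_id!r}")
        tracks[track_id].add(rec["frame"])
    return {tid: sorted(frames) for tid, frames in tracks.items()}


# Example input in the assumed record format.
records = [
    {"frame": 0, "track_id": "person_1"},
    {"frame": 1, "track_id": "person_1"},
    {"frame": 1, "track_id": "vehicle_1"},
]
tracks = group_tracks(records)
```

Grouping by identifier makes typos visible: a misspelled ID either fails the pattern check or shows up as an unexpectedly short track.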
Handling Occlusions
When an object becomes occluded:
- Note the last frame where the object was visible
- When the object reappears, use the same identifier to maintain track continuity
- Document occlusion periods in your annotation notes
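Occlusion periods can also be recovered from the labeled frames themselves. The sketch below assumes that an object is labeled on every frame where it is visible, so any gap in a track's frame numbers is treated as a candidate occlusion; the function name and input format are illustrative, not part of Potato.

```python
def occlusion_gaps(frames):
    """Return (first_missing, last_missing) frame ranges inside one track.

    `frames` is the list of frame numbers on which the object was labeled.
    Assumes every visible frame was labeled, so gaps indicate occlusion.
    """
    ordered = sorted(set(frames))
    gaps = []
    for prev, curr in zip(ordered, ordered[1:]):
        if curr - prev > 1:
            gaps.append((prev + 1, curr - 1))
    return gaps


# Object labeled on frames 10-12, 20-21, and 30.
gaps = occlusion_gaps([10, 11, 12, 20, 21, 30])
```

If you label only keyframes rather than every frame, the gaps reflect your sampling interval instead, and this check does not apply.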
Proposed MOT Features
The following features would enhance Potato's MOT annotation capabilities and are being considered for future development:
- Automatic ID assignment: Auto-increment IDs for new objects
- Track interpolation: Linear or cubic interpolation between keyframes
- Occlusion handling: Visibility levels (visible, partial, heavy, not_visible)
- Trajectory visualization: Show object paths across frames
- Track management panel: Merge, split, and manage track IDs
- Per-frame attributes: Properties that change frame-to-frame
If you're interested in these features, please reach out to the Potato development team or contribute to the project.
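Until track interpolation is available, linear interpolation between two manually labeled keyframes can be done in a few lines outside Potato. This is a sketch of the general technique, not a Potato API; the (x, y, width, height) box layout and the function name are assumptions.

```python
def interpolate_box(box_a, box_b, frame_a, frame_b, frame):
    """Linearly interpolate a box between two keyframes.

    Boxes are assumed to be (x, y, width, height) tuples.
    `frame` must lie between `frame_a` and `frame_b` inclusive.
    """
    if frame_a >= frame_b:
        raise ValueError("frame_a must be earlier than frame_b")
    if not frame_a <= frame <= frame_b:
        raise ValueError("frame is outside the keyframe interval")
    t = (frame - frame_a) / (frame_b - frame_a)
    return tuple(a + t * (b - a) for a, b in zip(box_a, box_b))


# Keyframes at frames 0 and 10; estimate the box at frame 5.
midpoint = interpolate_box((0, 0, 10, 10), (10, 20, 10, 10), 0, 10, 5)
```

Linear interpolation works for roughly constant motion; for objects that accelerate or turn, add more keyframes rather than relying on the estimate.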
Tips for Manual MOT Annotation
- Work in short segments: 100-200 frames at a time
- Consistent naming: Use a clear ID scheme (e.g., "person_001", "vehicle_023")
- Document your process: Keep notes about occlusions and track decisions
- Review passes: Watch forward then backward to catch errors
- Use external tools: Consider pre-processing with detection models
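The first two tips can be made concrete with two small helpers, sketched here under the assumption that frames are numbered from 0. The function names and the default segment size of 150 frames are illustrative choices.

```python
def segment_ranges(total_frames, segment_size=150):
    """Split a video into inclusive (start, end) frame ranges."""
    if total_frames < 0 or segment_size <= 0:
        raise ValueError("total_frames must be >= 0 and segment_size > 0")
    return [
        (start, min(start + segment_size, total_frames) - 1)
        for start in range(0, total_frames, segment_size)
    ]


def make_track_id(label, index, width=3):
    """Build a zero-padded identifier such as 'vehicle_023'."""
    return f"{label}_{index:0{width}d}"


segments = segment_ranges(400)
track_id = make_track_id("vehicle", 23)
```

Zero-padded identifiers sort correctly as strings, which keeps tracks in order in exported files and spreadsheets.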
Alternative Approaches
For projects requiring full MOT annotation capabilities:
- Hybrid workflow: Use Potato for initial labeling and specialized MOT tools for track management
- Pre-annotation: Run object detectors to generate initial bounding boxes, then refine in Potato
- Post-processing: Export Potato annotations and apply tracking algorithms externally
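As an illustration of the post-processing approach, the sketch below links per-frame boxes into tracks with greedy intersection-over-union (IoU) matching. It is a deliberately simple baseline, not a Potato feature: the (x1, y1, x2, y2) box layout, the 0.3 threshold, and the function names are assumptions, and a track that disappears for even one frame receives a new ID when it returns.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_track(frames, threshold=0.3):
    """Assign track IDs to boxes by matching each frame to the previous one.

    `frames` is a list with one list of boxes per frame.
    Returns one list of (track_id, box) pairs per frame.
    """
    next_id = 1
    previous = []  # (track_id, box) pairs from the preceding frame
    output = []
    for boxes in frames:
        # Score every (previous box, current box) pair, best overlap first.
        candidates = sorted(
            (
                (iou(prev_box, box), pi, bi)
                for pi, (_, prev_box) in enumerate(previous)
                for bi, box in enumerate(boxes)
            ),
            reverse=True,
        )
        assigned = {}      # current box index -> track ID
        used_prev = set()  # previous box indices already matched
        for score, pi, bi in candidates:
            if score < threshold:
                break
            if pi in used_prev or bi in assigned:
                continue
            assigned[bi] = previous[pi][0]
            used_prev.add(pi)
        current = []
        for bi, box in enumerate(boxes):
            if bi not in assigned:  # unmatched box starts a new track
                assigned[bi] = next_id
                next_id += 1
            current.append((assigned[bi], box))
        output.append(current)
        previous = current
    return output


frames = [
    [(0, 0, 10, 10), (50, 50, 60, 60)],
    [(1, 0, 11, 10), (51, 50, 61, 60)],
    [(100, 100, 110, 110)],
]
result = greedy_track(frames)
ids_per_frame = [[track_id for track_id, _ in frame] for frame in result]
```

For crowded scenes or long occlusions, a dedicated tracking algorithm with motion or appearance modeling is a better fit than this frame-to-frame matcher.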
Next Steps
- Learn about video frame annotation
- Explore image annotation features
- Read about inter-annotator agreement for quality control
For current video annotation documentation, see /docs/features/image-annotation.