TRECVID Shot Boundary Detection

Detect shot boundaries and classify transition types in broadcast video. Mark cuts, dissolves, fades, and other transitions between camera shots.

Configuration File: config.yaml

yaml

# TRECVID Shot Boundary Detection Configuration
# Task: Detect and classify shot transitions in broadcast video

annotation_task_name: "Shot Boundary Detection"
task_dir: "."

data_files:
  - data.json
item_properties:
  id_key: "id"
  text_key: "video_url"

output_annotation_dir: "annotation_output/"
output_annotation_format: "json"

annotation_schemes:
  - name: "shot_boundaries"
    description: |
      Mark every shot boundary and classify the transition type.
      A shot is a continuous sequence from a single camera.
    annotation_type: "video_annotation"
    mode: "keyframe"
    labels:
      - name: "cut"
        color: "#EF4444"
        key_value: "c"
      - name: "dissolve"
        color: "#8B5CF6"
        key_value: "d"
      - name: "fade_in"
        color: "#22C55E"
        key_value: "i"
      - name: "fade_out"
        color: "#F97316"
        key_value: "o"
      - name: "wipe"
        color: "#3B82F6"
        key_value: "w"
      - name: "other_gradual"
        color: "#EC4899"
        key_value: "g"
    frame_stepping: true
    show_timecode: true
    video_fps: 30

allow_all_users: true
instances_per_annotator: 40
annotation_per_instance: 2

annotation_instructions: |
  ## Shot Boundary Detection Task

  Mark every transition between camera shots.

  ### Transition Types:
  - **Cut (c)**: Instantaneous change (most common)
  - **Dissolve (d)**: Two shots overlap/blend
  - **Fade In (i)**: From black to image
  - **Fade Out (o)**: From image to black
  - **Wipe (w)**: One shot pushes another off screen
  - **Other Gradual (g)**: Any other gradual transition

  ### Guidelines:
  - Mark at the FIRST frame of the new shot (for cuts)
  - For gradual transitions, mark the midpoint
  - Use frame stepping for accuracy
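
The marking guidelines above can be sketched in a few lines of Python. This is only an illustration, not part of Potato or of the TRECVID tooling: the helper names `frame_to_mark` and `timecode` are hypothetical, and the sketch assumes frame indices start at 0 and that the video runs at the 30 fps given by `video_fps` in the configuration.

```python
FPS = 30  # assumed to match video_fps in config.yaml

# Labels from the configuration that describe gradual transitions.
GRADUAL = {"dissolve", "fade_in", "fade_out", "wipe", "other_gradual"}


def frame_to_mark(label, first_frame, last_frame=None):
    """Return the frame an annotator would mark for one transition.

    For a cut, first_frame is the first frame of the new shot.
    For a gradual transition, first_frame and last_frame bound the
    transition and the midpoint is returned.
    """
    if label == "cut":
        return first_frame
    if label in GRADUAL:
        if last_frame is None or last_frame < first_frame:
            raise ValueError("gradual transitions need first_frame <= last_frame")
        return (first_frame + last_frame) // 2
    raise ValueError(f"unknown label: {label}")


def timecode(frame, fps=FPS):
    """Format a frame index as HH:MM:SS:FF (non-drop-frame)."""
    seconds, ff = divmod(frame, fps)
    minutes, ss = divmod(seconds, 60)
    hh, mm = divmod(minutes, 60)
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"


if __name__ == "__main__":
    # A dissolve spanning frames 100..130 is marked at its midpoint.
    mid = frame_to_mark("dissolve", 100, 130)
    print(mid, timecode(mid))
```

Integer division keeps the midpoint on a whole frame, which matches the frame-stepping workflow the guidelines recommend.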

Sample Data: sample-data.json

json

[
  {
    "id": "sbd_001",
    "video_url": "https://example.com/videos/news_broadcast.mp4",
    "source": "broadcast_news",
    "duration_seconds": 180
  },
  {
    "id": "sbd_002",
    "video_url": "https://example.com/videos/documentary_clip.mp4",
    "source": "documentary",
    "duration_seconds": 240
  }
]
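
Before starting an annotation run, it can help to check that every item carries the two fields named under `item_properties` in the configuration (`id` and `video_url`). The following is a minimal sketch using only the Python standard library. The function `validate_items` is a hypothetical helper, not a Potato API, and the sample data is inlined so the snippet runs on its own.

```python
import json


def validate_items(items, id_key="id", url_key="video_url"):
    """Return a list of problems; an empty list means the items look usable."""
    problems = []
    seen = set()
    for pos, item in enumerate(items):
        # Both keys must be present and hold a non-empty string.
        for key in (id_key, url_key):
            value = item.get(key)
            if not isinstance(value, str) or not value:
                problems.append(f"item {pos}: missing or empty '{key}'")
        # Ids must be unique so annotations can be matched back to videos.
        item_id = item.get(id_key)
        if isinstance(item_id, str) and item_id:
            if item_id in seen:
                problems.append(f"item {pos}: duplicate id '{item_id}'")
            seen.add(item_id)
    return problems


SAMPLE = json.loads("""
[
  {"id": "sbd_001",
   "video_url": "https://example.com/videos/news_broadcast.mp4",
   "source": "broadcast_news", "duration_seconds": 180},
  {"id": "sbd_002",
   "video_url": "https://example.com/videos/documentary_clip.mp4",
   "source": "documentary", "duration_seconds": 240}
]
""")

if __name__ == "__main__":
    print(validate_items(SAMPLE) or "no problems found")
```

Extra fields such as `source` and `duration_seconds` are ignored by this check, since the configuration only names the id and video URL keys.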

Get This Design


Clone or download from the repository

Quick start:

git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/video/boundary-detection/shot-boundary-detection
potato start config.yaml

Details

Annotation Types

video_annotation

Domain

Computer Vision, Broadcast Media

Use Cases

Shot Detection, Video Editing, Content Indexing

Related Designs

ActivityNet Captions Dense Annotation

Dense temporal annotation with natural language descriptions. Annotators segment videos into events and write descriptive captions for each temporal segment.

video_annotation, text

ActivityNet Temporal Localization

Temporal activity localization in untrimmed videos. Annotators identify activity instances by marking precise start and end timestamps across 200 activity classes.