Scene Boundary Detection
Identify scene boundaries in documentary and narrative videos. Annotators mark transitions between semantically coherent scenes based on visual, audio, and narrative cues.
Configuration File: config.yaml
# Scene Boundary Detection Configuration
# Based on BBC Planet Earth Scene Dataset (Sidiropoulos et al., 2011)
# Task: Mark scene transitions in documentary videos

annotation_task_name: "Scene Boundary Detection"
task_dir: "."

data_files:
  - data.json

item_properties:
  id_key: "id"
  text_key: "video_url"

output_annotation_dir: "annotation_output/"
output_annotation_format: "json"

annotation_schemes:
  - name: "scene_boundaries"
    description: |
      Mark the START of each new scene. A scene is a semantically coherent
      segment unified by location, time, characters, or narrative topic.
    annotation_type: "video_annotation"
    mode: "keyframe"
    labels:
      - name: "scene_start"
        color: "#EF4444"
        key_value: "s"
      - name: "gradual_transition"
        color: "#F97316"
        key_value: "g"
      - name: "cut_transition"
        color: "#3B82F6"
        key_value: "c"
    frame_stepping: true
    show_timecode: true
    playback_rate_control: true

  - name: "scene_type"
    description: "What type of scene follows this boundary?"
    annotation_type: radio
    labels:
      - "Establishing shot"
      - "Action/Event"
      - "Dialogue/Interview"
      - "Transition/Montage"
      - "Credits/Title"

allow_all_users: true
instances_per_annotator: 30
annotation_per_instance: 2

annotation_instructions: |
  ## Scene Boundary Detection Task
  Mark where each new SCENE begins in the video.

  ### What defines a scene boundary?
  - Change in location or setting
  - Significant time jump
  - Change in main subject/topic
  - Narrative shift

  ### Transition Types:
  - **Cut**: Instantaneous change between shots
  - **Gradual**: Fade, dissolve, or wipe transition

  ### Tips:
  - A scene is NOT the same as a shot (scenes contain multiple shots)
  - Mark the FIRST frame of the new scene
  - Use frame stepping for precision
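
Because annotators mark only the first frame of each new scene, every scene implicitly runs from one `scene_start` mark to the next, and the last scene runs to the end of the video. The following is a minimal sketch of that conversion. It assumes the marks have already been extracted as timestamps in seconds; `boundaries_to_scenes` and its input shape are hypothetical and do not reflect Potato's actual annotation output format.

```python
from typing import List, Tuple


def boundaries_to_scenes(
    starts: List[float], duration: float
) -> List[Tuple[float, float]]:
    """Turn scene-start timestamps (seconds) into (start, end) scene spans.

    Hypothetical helper: the input is a plain list of timestamps, not
    Potato's actual annotation output format.
    """
    # Keep marks that fall inside the video, drop duplicates, and sort them.
    marks = sorted({t for t in starts if 0.0 <= t < duration})
    # The video opens with a scene even if nobody marked time 0.
    if not marks or marks[0] > 0.0:
        marks.insert(0, 0.0)
    # Each scene ends where the next one begins; the last ends with the video.
    ends = marks[1:] + [duration]
    return list(zip(marks, ends))
```

For a 600-second video with marks at 42.5 and 130.0 seconds, this yields three spans: 0.0 to 42.5, 42.5 to 130.0, and 130.0 to 600.0.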
Sample Data: sample-data.json
[
  {
    "id": "scene_001",
    "video_url": "https://example.com/videos/nature_documentary_ep1.mp4",
    "title": "Planet Earth - Mountains Episode",
    "duration_seconds": 600
  },
  {
    "id": "scene_002",
    "video_url": "https://example.com/videos/nature_documentary_ep2.mp4",
    "title": "Planet Earth - Ocean Deep",
    "duration_seconds": 540
  }
]

Get This Design
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/video/boundary-detection/scene-boundary-detection
potato start config.yaml
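
Before launching, it can help to confirm that every record in the data file carries the two fields the config names under `item_properties` (`id` and `video_url`) and that ids are unique. The following is a minimal sketch, assuming the data file is a JSON array of objects like the sample data; `load_items` and `check_items` are hypothetical helpers and are not part of Potato.

```python
import json
from typing import Any, Dict, List


def load_items(path: str) -> List[Dict[str, Any]]:
    """Read a JSON array of item records from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def check_items(
    items: List[Dict[str, Any]],
    id_key: str = "id",
    text_key: str = "video_url",
) -> List[str]:
    """Return human-readable problems; an empty list means no issues found."""
    problems: List[str] = []
    seen = set()
    for index, item in enumerate(items):
        # Both keys named under item_properties must be present and non-empty.
        for key in (id_key, text_key):
            if not item.get(key):
                problems.append(f"item {index}: missing or empty '{key}'")
        # Ids must be unique across the file.
        item_id = item.get(id_key)
        if item_id in seen:
            problems.append(f"item {index}: duplicate id '{item_id}'")
        elif item_id:
            seen.add(item_id)
    return problems
```

Calling `check_items(load_items("data.json"))` from the design directory returns an empty list when the file listed under `data_files` is consistent with the config.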
Related Designs
DiDeMo Moment Retrieval
Localizing natural language descriptions to specific video moments. Given a text query, annotators identify the corresponding temporal segment in the video.
VSTAR Video-grounded Dialogue
Video-grounded dialogue annotation. Annotators watch videos and answer questions requiring situated understanding, write dialogue turns grounded in specific video moments, and mark relevant temporal segments.
YouTube Highlights Detection
Detect highlight-worthy moments in domain-specific videos. Annotators identify the most engaging segments for automatic highlight generation.