intermediate, video
LSMDC Keyframe Selection
Select representative keyframes from movie clips for video description tasks. Annotators identify frames that best summarize the visual content of each shot.
Configuration file: config.yaml
# LSMDC Keyframe Selection Configuration
# Based on Rohrbach et al., IJCV 2017
# Task: Select representative keyframes from movie clips

annotation_task_name: "LSMDC Keyframe Selection"
task_dir: "."

data_files:
  - data.json

item_properties:
  id_key: "id"
  text_key: "video_url"

output_annotation_dir: "annotation_output/"
output_annotation_format: "json"

annotation_schemes:
  - name: "keyframes"
    description: |
      Select the most REPRESENTATIVE frame(s) from this clip.
      A good keyframe captures the main action or content of the shot.
    annotation_type: "video_annotation"
    mode: "keyframe"
    labels:
      - name: "best_keyframe"
        color: "#22C55E"
        key_value: "k"
      - name: "alternative_keyframe"
        color: "#3B82F6"
        key_value: "a"
    frame_stepping: true
    show_timecode: true
    playback_rate_control: true
    video_fps: 24

  - name: "keyframe_quality"
    description: "How representative is the best keyframe of the clip?"
    annotation_type: radio
    labels:
      - "Excellent - captures everything important"
      - "Good - captures main content"
      - "Fair - captures some content"
      - "Poor - no single frame works well"

  - name: "clip_content"
    description: "What does this clip primarily show?"
    annotation_type: radio
    labels:
      - "Person/Character focus"
      - "Action/Movement"
      - "Dialogue scene"
      - "Establishing shot/Environment"
      - "Object focus"
      - "Multiple subjects"

allow_all_users: true
instances_per_annotator: 80
annotation_per_instance: 2

annotation_instructions: |
  ## Keyframe Selection Task
  Select the best frame(s) to represent each movie clip.
  ### What makes a good keyframe?
  - Shows the main subject clearly
  - Captures the key action or moment
  - Is visually clear (not blurry)
  - Could stand alone as a summary of the clip
  ### Guidelines:
  - Select ONE best keyframe per clip
  - Optionally mark alternative keyframes
  - Avoid: blurry frames, transitions, extreme close-ups
  - Prefer: clear faces, complete actions, informative composition
  ### Tips:
  - Use frame stepping to find the exact best frame
  - For dialogue, choose a frame with visible faces
  - For action, choose the peak of the action
  - For establishing shots, choose the most informative view
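The `video_fps: 24` setting is what relates frame stepping to the displayed timecode. As a rough sketch of that relationship, the following Python converts between a playback time in seconds and a frame index at a constant 24 fps. The function names, the zero-based frame numbering, and the `HH:MM:SS:FF` format are assumptions made for illustration; they are not part of Potato's API.

```python
FPS = 24  # matches video_fps in the config above


def time_to_frame(seconds: float, fps: int = FPS) -> int:
    """Return the zero-based index of the frame displayed at `seconds`."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    # Small epsilon guards against float error at exact frame boundaries.
    return int(seconds * fps + 1e-9)


def frame_to_timecode(frame: int, fps: int = FPS) -> str:
    """Format a zero-based frame index as HH:MM:SS:FF (assumed format)."""
    if frame < 0:
        raise ValueError("frame must be non-negative")
    total_seconds, ff = divmod(frame, fps)
    total_minutes, ss = divmod(total_seconds, 60)
    hh, mm = divmod(total_minutes, 60)
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"


def frames_in_clip(duration_seconds: float, fps: int = FPS) -> int:
    """Number of whole frames in a clip of the given duration."""
    return int(duration_seconds * fps + 1e-9)


if __name__ == "__main__":
    print(time_to_frame(2.5))     # frame shown 2.5 s into the clip
    print(frame_to_timecode(60))  # timecode for that frame
    print(frames_in_clip(5))      # candidate frames in a 5 s clip
```

If `clip_duration` in the sample data is read as seconds (an assumption), a 5-second clip at 24 fps offers 120 candidate frames and an 8-second clip offers 192, which is why frame stepping is useful for picking one exact frame.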
Sample data: sample-data.json
[
  {
    "id": "lsmdc_001",
    "video_url": "https://example.com/videos/movie_clip_001.mp4",
    "movie": "Sample Movie",
    "clip_duration": 5
  },
  {
    "id": "lsmdc_002",
    "video_url": "https://example.com/videos/movie_clip_002.mp4",
    "movie": "Sample Movie",
    "clip_duration": 8
  }
]
Get this design
View on GitHub
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/video/summarization/lsmdc-keyframe-selection
potato start config.yaml
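Before starting the server, it may be worth checking that every item in the data file carries the keys the config points at (`id_key: "id"` and `text_key: "video_url"`). The sketch below does this with the Python standard library only. The function name `validate_items` and the specific checks (non-empty strings, unique ids) are assumptions for illustration, not something Potato itself is known to perform or require.

```python
import json

ID_KEY = "id"           # item_properties.id_key in config.yaml
TEXT_KEY = "video_url"  # item_properties.text_key in config.yaml


def validate_items(items: list[dict]) -> list[str]:
    """Return human-readable problems; an empty list means the data looks fine."""
    problems = []
    seen_ids = set()
    for index, item in enumerate(items):
        # Both configured keys should be present as non-empty strings.
        for key in (ID_KEY, TEXT_KEY):
            value = item.get(key)
            if not isinstance(value, str) or not value:
                problems.append(f"item {index}: missing or empty '{key}'")
        # Ids should be unique so annotations can be matched back to clips.
        item_id = item.get(ID_KEY)
        if isinstance(item_id, str) and item_id:
            if item_id in seen_ids:
                problems.append(f"item {index}: duplicate id '{item_id}'")
            seen_ids.add(item_id)
    return problems


# Inline copy of the sample data so the sketch runs without any files.
SAMPLE = json.loads("""
[
  {"id": "lsmdc_001", "video_url": "https://example.com/videos/movie_clip_001.mp4",
   "movie": "Sample Movie", "clip_duration": 5},
  {"id": "lsmdc_002", "video_url": "https://example.com/videos/movie_clip_002.mp4",
   "movie": "Sample Movie", "clip_duration": 8}
]
""")

if __name__ == "__main__":
    issues = validate_items(SAMPLE)
    for issue in issues:
        print(issue)
    if not issues:
        print("all items look valid")
```

To check a real file instead of the inline sample, the list could be loaded with `json.load` from whichever file `data_files` names in the config.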
Details
Annotation types
radio, video_annotation
Domain
Computer Vision, Film Studies
Use cases
Keyframe Selection, Video Summarization, Movie Description
Tags
video, keyframe, movie, description, lsmdc, representative
Found a problem or want to improve this design?
Open an issue
Related designs
MovieScenes Scene Detection
Detect and annotate scene boundaries in movies. Identify where semantic scene changes occur based on location, time, or narrative shifts.
radio, video_annotation
Charades-STA Temporal Grounding
Ground natural language descriptions to video segments. Given a sentence describing an action, identify the exact temporal boundaries where that action occurs.
radio, video_annotation
DiDeMo Moment Retrieval
Localize natural language descriptions to specific video moments. Given a text query, annotators identify the corresponding temporal segment in the video.
radio, video_annotation