intermediate, video
SumMe Video Summarization
Create video summaries by selecting key segments that best represent the content. Annotators identify important moments for automatic video summarization research.
Configuration file: config.yaml
# SumMe Video Summarization Configuration
# Based on Gygli et al., ECCV 2014
# Task: Select important segments for video summarization
annotation_task_name: "SumMe Video Summarization"
task_dir: "."
data_files:
  - data.json
item_properties:
  id_key: "id"
  text_key: "video_url"
output_annotation_dir: "annotation_output/"
output_annotation_format: "json"
annotation_schemes:
  - name: "summary_segments"
    description: |
      Mark segments that should be INCLUDED in a summary of this video.
      Select the most important/interesting parts that capture the essence.
    annotation_type: "video_annotation"
    mode: "segment"
    labels:
      - name: "include_in_summary"
        color: "#22C55E"
        key_value: "s"
      - name: "highly_important"
        color: "#EF4444"
        key_value: "h"
    frame_stepping: true
    show_timecode: true
    playback_rate_control: true
    video_fps: 30
  - name: "importance_score"
    description: "Overall, how interesting/important is this video's content?"
    annotation_type: radio
    labels:
      - "5 - Very interesting/important"
      - "4 - Interesting"
      - "3 - Moderately interesting"
      - "2 - Somewhat boring"
      - "1 - Not interesting"
  - name: "video_category"
    description: "What category best describes this video?"
    annotation_type: radio
    labels:
      - "Sports/Action"
      - "Travel/Scenery"
      - "Events/Celebrations"
      - "Animals/Nature"
      - "Tutorial/How-to"
      - "Social/People"
      - "Other"
  - name: "summary_difficulty"
    description: "How difficult was it to select summary segments?"
    annotation_type: radio
    labels:
      - "Easy - clear highlights"
      - "Moderate - some judgment needed"
      - "Difficult - many equally important parts"
      - "Very difficult - no clear structure"
allow_all_users: true
instances_per_annotator: 25
annotation_per_instance: 3
annotation_instructions: |
  ## Video Summarization Task
  Select segments that should be included in a summary of each video.
  ### Goal:
  If the video were shortened to ~15% of its length, what parts should remain?
  ### What makes a good summary segment?
  - Captures key moments or highlights
  - Shows the main subject/action clearly
  - Is visually interesting or informative
  - Would make sense to someone who hasn't seen the full video
  ### Guidelines:
  - Mark all segments you think should be in the summary
  - Use "highly_important" for the absolute best moments
  - Try to select 10-20% of the video
  - Avoid: repetitive content, blurry/unclear footage, transitions
  ### Tips:
  - Watch the whole video first before marking
  - Consider what would be lost if a segment were removed
  - For action videos, focus on peak moments
  - For scenic videos, focus on the best views
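The instructions ask annotators to select roughly 10-20% of each video. A minimal Python sketch of how that budget could be checked after annotation is shown below. It assumes segments are available as `(start_sec, end_sec)` pairs and that the video length in seconds is known; both are assumptions for illustration, since the actual output format written to `annotation_output/` is not shown here, and the helper names are hypothetical.

```python
from typing import List, Tuple

# Hypothetical segment format: (start_sec, end_sec). The real annotation
# output may differ and would need to be converted to this shape first.
Segment = Tuple[float, float]


def merge_segments(segments: List[Segment]) -> List[Segment]:
    """Merge overlapping segments so shared time is only counted once."""
    merged: List[Segment] = []
    for start, end in sorted(segments):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous segment: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def selected_fraction(segments: List[Segment], duration: float) -> float:
    """Fraction of the video covered by the marked segments."""
    covered = sum(end - start for start, end in merge_segments(segments))
    return covered / duration


def within_budget(segments: List[Segment], duration: float,
                  low: float = 0.10, high: float = 0.20) -> bool:
    """True if the selection falls inside the 10-20% guideline."""
    return low <= selected_fraction(segments, duration) <= high


# Made-up example: three marked segments in a 180-second video.
marked = [(10.0, 20.0), (15.0, 25.0), (100.0, 112.0)]
fraction = selected_fraction(marked, 180.0)
print(f"{fraction:.1%} selected; within budget: {within_budget(marked, 180.0)}")
```

Overlapping segments are merged before summing because an annotator may mark the same span with both `include_in_summary` and `highly_important`; counting that time twice would overstate the selection.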
Sample data: sample-data.json
[
  {
    "id": "summe_001",
    "video_url": "https://example.com/videos/user_video_cooking.mp4",
    "category": "cooking",
    "duration": 180
  },
  {
    "id": "summe_002",
    "video_url": "https://example.com/videos/user_video_travel.mp4",
    "category": "travel",
    "duration": 240
  }
]
Get this design
View on GitHub
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/video/summarization/summe-summarization
potato start config.yaml
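Before starting the server, it can help to confirm that every data item carries the keys named under `item_properties` in config.yaml (`id` and `video_url`). The following is a small sketch of such a check, not part of Potato itself; the function name `check_items` is hypothetical, and the inline sample mirrors the sample data shown on this page.

```python
import json
from typing import Any, Dict, List

# Keys taken from item_properties in config.yaml (id_key and text_key).
REQUIRED_KEYS = ("id", "video_url")


def check_items(items: List[Dict[str, Any]]) -> List[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems: List[str] = []
    seen_ids = set()
    for index, item in enumerate(items):
        for key in REQUIRED_KEYS:
            if not item.get(key):
                problems.append(f"item {index}: missing '{key}'")
        item_id = item.get("id")
        if item_id is not None:
            if item_id in seen_ids:
                problems.append(f"item {index}: duplicate id '{item_id}'")
            seen_ids.add(item_id)
    return problems


# Inline copy of the sample items; in practice this would be read from the
# data file with json.load(open(path)).
sample = json.loads("""
[
  {"id": "summe_001",
   "video_url": "https://example.com/videos/user_video_cooking.mp4",
   "category": "cooking", "duration": 180},
  {"id": "summe_002",
   "video_url": "https://example.com/videos/user_video_travel.mp4",
   "category": "travel", "duration": 240}
]
""")

print(check_items(sample) or "no problems found")
```

Note that config.yaml lists `data.json` under `data_files`, while the sample shown on this page is named `sample-data.json`; the check applies to whichever file the config actually points to.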
Details
Annotation types
radio, video_annotation
Domain
Computer Vision, Video Summarization
Use cases
Video Summarization, Highlight Detection, Importance Scoring
Tags
video, summarization, highlights, importance, user-generated
Found an issue or want to improve this design?
Open an issue
Related designs
Charades-STA Temporal Grounding
Ground natural language descriptions to video segments. Given a sentence describing an action, identify the exact temporal boundaries where that action occurs.
radio, video_annotation
DiDeMo Moment Retrieval
Localizing natural language descriptions to specific video moments. Given a text query, annotators identify the corresponding temporal segment in the video.
radio, video_annotation
HowTo100M Instructional Video Annotation
Annotate instructional video clips with step descriptions and visual grounding. Link narrated instructions to visual actions for video-language understanding.
radio, text