TVSum Video Summarization
Frame-level importance scoring for video summarization. Annotators rate 2-second shots on a 1-5 importance scale to identify key moments worth including in a summary.
Configuration file: config.yaml
# TVSum Video Summarization Configuration
# Based on Song et al., CVPR 2015
# Task: Rate 2-second shots on importance scale for video summarization

annotation_task_name: "TVSum Video Summarization"
task_dir: "."

# Data configuration
data_files:
  - data.json

item_properties:
  id_key: "id"
  text_key: "video_url"

# Output
output_annotation_dir: "annotation_output/"
output_annotation_format: "json"

# Annotation schemes
annotation_schemes:
  # Video player for viewing the content
  - name: "video_player"
    description: "Watch the video and rate the importance of each segment"
    annotation_type: "video"
    video_path: "{{video_url}}"
    controls: true
    autoplay: false

  # Current segment importance rating (1-5 scale as per TVSum methodology)
  - name: "segment_importance"
    description: |
      Rate how important the current 2-second segment is for a video summary.
      Consider: Would someone watching only a summary want to see this moment?
    annotation_type: likert
    size: 5
    min_label: "Not Important"
    max_label: "Very Important"
    labels:
      - "1 - Not important at all"
      - "2 - Slightly important"
      - "3 - Moderately important"
      - "4 - Important"
      - "5 - Very important (must include)"

  # Optional: Mark specific highlight moments
  - name: "highlight_moments"
    description: "Mark any particularly memorable or highlight-worthy moments"
    annotation_type: "video_annotation"
    mode: "keyframe"
    labels:
      - name: "key_moment"
        color: "#22C55E"
        key_value: "k"
      - name: "climax"
        color: "#EF4444"
        key_value: "c"
      - name: "transition"
        color: "#F59E0B"
        key_value: "t"
    frame_stepping: true
    show_timecode: true

# User configuration
allow_all_users: true

# Task assignment
instances_per_annotator: 50
annotation_per_instance: 3

# Instructions
surveyflow:
  on: true
  order:
    - prolific_id
  prolific_id:
    type: text
    question: "Please enter your annotator ID:"

annotation_instructions: |
  ## Video Summarization Task

  Your goal is to help create automatic video summaries by rating the importance
  of video segments.

  ### Instructions:
  1. Watch the video carefully
  2. For each 2-second segment, rate its importance on a 1-5 scale:
     - **1**: Not important - Can be skipped entirely
     - **2**: Slightly important - Background/filler content
     - **3**: Moderately important - Relevant but not essential
     - **4**: Important - Should probably be in a summary
     - **5**: Very important - Must be included in any summary

  ### What makes a segment important?
  - Key events or actions
  - Emotional high points
  - Information that's essential to understanding the video
  - Visually striking or memorable moments

  ### Tips:
  - Compare segments relative to each other within the same video
  - Think: "If I only had 15 seconds, would I include this?"
  - Use keyboard shortcuts for faster annotation
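Once the ratings are collected (three per instance, per the `annotation_per_instance` setting above), they can be turned into a summary the way TVSum-style evaluations usually do it: average the 1-5 scores across annotators per segment, then greedily keep the highest-scoring 2-second segments until a length budget (commonly 15% of the video) is reached. A minimal sketch; the `ratings` dict shape and the `summarize` helper are illustrative assumptions, not Potato's actual output format:

```python
from statistics import mean

def summarize(ratings, seg_duration=2.0, budget_frac=0.15):
    """Greedy TVSum-style selection: average each segment's 1-5 ratings
    across annotators, then pick the highest-scoring segments until the
    summary hits the length budget (a fraction of total duration)."""
    # ratings: {segment_index: [score_annotator1, score_annotator2, ...]}
    avg = {seg: mean(scores) for seg, scores in ratings.items()}
    budget = budget_frac * len(ratings) * seg_duration
    chosen, used = [], 0.0
    for seg in sorted(avg, key=avg.get, reverse=True):
        if used + seg_duration <= budget:
            chosen.append(seg)
            used += seg_duration
    return sorted(chosen)

# Hypothetical scores for a six-segment clip, three annotators each
ratings = {0: [1, 2, 1], 1: [5, 4, 5], 2: [3, 3, 2],
           3: [4, 5, 5], 4: [2, 1, 1], 5: [5, 5, 4]}
print(summarize(ratings, budget_frac=0.4))  # [1, 3]
```

With a 40% budget over six 2-second segments (4.8 s), only the two top-rated segments fit; the default 15% matches the budget typically used when evaluating against TVSum.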
Sample data: sample-data.json
[
  {
    "id": "tvsum_001",
    "video_url": "https://example.com/videos/changing_tire.mp4",
    "title": "Changing a Tire Tutorial",
    "category": "HowTo",
    "duration_seconds": 180,
    "description": "A step-by-step tutorial on how to change a flat tire on your car."
  },
  {
    "id": "tvsum_002",
    "video_url": "https://example.com/videos/dog_show.mp4",
    "title": "Best Dog Show Moments",
    "category": "Entertainment",
    "duration_seconds": 240,
    "description": "Highlights from a local dog show competition featuring various breeds."
  }
]
// ... and 3 more items
Get this design
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/video/summarization/tvsum-summarization
potato start config.yaml
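Before launching, it can help to sanity-check that every item in `data.json` carries the fields `config.yaml` points at (`id_key` and `text_key`), since an item missing its `video_url` will render an empty player. A minimal sketch; `check_items` is a hypothetical helper, not part of Potato:

```python
import json

def check_items(items, id_key="id", text_key="video_url"):
    """Return the ids (or indices) of items missing the keys that
    config.yaml's item_properties declares (id_key / text_key)."""
    bad = []
    for i, item in enumerate(items):
        if id_key not in item or text_key not in item:
            bad.append(item.get(id_key, f"index {i}"))
    return bad

# Illustrative payload: the second item is missing its video_url
items = json.loads("""[
  {"id": "tvsum_001", "video_url": "https://example.com/videos/changing_tire.mp4"},
  {"id": "tvsum_002"}
]""")
print(check_items(items))  # ['tvsum_002']
```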
Found an issue or want to improve this design? Open an issue.
Related designs
VBench Video Generation Quality Assessment
Quality assessment of AI-generated videos. Annotators rate generated videos on multiple dimensions (temporal consistency, motion smoothness, aesthetic quality) and compare pairs of generated videos.
Video-ChatGPT - Video QA Display and Evaluation
Video question answering evaluation based on the Video-ChatGPT benchmark (Maaz et al., ACL 2024). Annotators watch a video, review a model-generated response to a question, and evaluate correctness and quality.
ActivityNet Captions Dense Annotation
Dense temporal annotation with natural language descriptions. Annotators segment videos into events and write descriptive captions for each temporal segment.