YouTube Highlights Detection

Detect highlight-worthy moments in domain-specific videos. Annotators identify the most engaging segments for automatic highlight generation.

Configuration Fileconfig.yaml

yaml

# YouTube Highlights Detection Configuration
# Based on Sun et al., ECCV 2014
# Task: Identify highlight-worthy moments in domain-specific videos

annotation_task_name: "YouTube Highlights Detection"
task_dir: "."

data_files:
  - data.json
item_properties:
  id_key: "id"
  text_key: "video_url"

output_annotation_dir: "annotation_output/"
output_annotation_format: "json"

annotation_schemes:
  - name: "highlights"
    description: |
      Mark the HIGHLIGHT moments in this video.
      These are the parts viewers would most want to see.
    annotation_type: "video_annotation"
    mode: "segment"
    labels:
      - name: "highlight"
        color: "#F59E0B"
        key_value: "h"
      - name: "best_moment"
        color: "#EF4444"
        key_value: "b"
    frame_stepping: true
    show_timecode: true
    playback_rate_control: true
    video_fps: 30

  - name: "highlight_type"
    description: "What type of highlight is the BEST moment in this video?"
    annotation_type: radio
    labels:
      - "Skill/Trick - impressive technique"
      - "Fail/Funny - entertaining mistake"
      - "Scenic/Beautiful - visually stunning"
      - "Dramatic - emotional peak"
      - "Informative - key learning moment"
      - "Social - interesting interaction"

  - name: "domain_relevance"
    description: "How relevant is this video to its category?"
    annotation_type: radio
    labels:
      - "Highly relevant - typical example"
      - "Relevant - fits the category"
      - "Somewhat relevant - partial fit"
      - "Not very relevant - miscategorized"

allow_all_users: true
instances_per_annotator: 50
annotation_per_instance: 3

annotation_instructions: |
  ## YouTube Highlights Detection

  Identify the most highlight-worthy moments in each video.

  ### What is a highlight?
  - The moment you'd share with a friend
  - The part that makes the video worth watching
  - Peak action, emotion, or visual interest

  ### Domain-specific guidance:

  **Surfing**: Big waves, successful rides, wipeouts
  **Skating**: Tricks landing, impressive moves, falls
  **Gymnastics**: Difficult moves, perfect landings
  **Parkour**: Creative moves, big jumps, close calls
  **Dog**: Cute moments, tricks, funny behavior

  ### Guidelines:
  - Mark ALL highlight-worthy segments
  - Use "best_moment" for the single best highlight
  - Highlights are typically 2-10 seconds
  - Consider what would make a good thumbnail/preview

  ### Tips:
  - Put yourself in the viewer's shoes
  - Think about what would get the most engagement
  - Domain expertise helps identify skill-based highlights

Sample Datasample-data.json

json

[
  {
    "id": "yt_highlight_001",
    "video_url": "https://example.com/videos/surfing_clip.mp4",
    "domain": "surfing",
    "duration": 120
  },
  {
    "id": "yt_highlight_002",
    "video_url": "https://example.com/videos/skating_clip.mp4",
    "domain": "skating",
    "duration": 90
  }
]

Get This Design

View on GitHub

Clone or download from the repository

Quick start:

git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/video/summarization/youtube-highlights
potato start config.yaml

Details

Annotation Types

radiovideo_annotation

Domain

Computer VisionVideo Understanding

Use Cases

Highlight DetectionVideo SummarizationContent Curation

Related Designs

DiDeMo Moment Retrieval

Localizing natural language descriptions to specific video moments. Given a text query, annotators identify the corresponding temporal segment in the video.

radiovideo_annotation

Scene Boundary Detection

Identify scene boundaries in documentary and narrative videos. Annotators mark transitions between semantically coherent scenes based on visual, audio, and narrative cues.

radiovideo_annotation

VSTAR Video-grounded Dialogue

Video-grounded dialogue annotation. Annotators watch videos and answer questions requiring situated understanding, write dialogue turns grounded in specific video moments, and mark relevant temporal segments.

video_annotationtext

YouTube Highlights Detection

Configuration Fileconfig.yaml

Sample Datasample-data.json

Get This Design

Details

Annotation Types

Domain

Use Cases

Tags

Related Designs

DiDeMo Moment Retrieval

Scene Boundary Detection

VSTAR Video-grounded Dialogue