advanced, audio

Sound Event Detection

Temporal sound event annotation with strong labels following DCASE Challenge protocols.

audio annotation

Configuration File: config.yaml

task_name: "Sound Event Detection"

# Server configuration
server:
  port: 8000

# Audio settings
audio:
  enabled: true
  display: waveform
  waveform_color: "#F59E0B"
  progress_color: "#FBBF24"
  speed_control: true
  speed_options: [0.5, 0.75, 1.0]
  keyboard_controls:
    play_pause: "space"
    rewind_1s: ","
    forward_1s: "."

# Data configuration
data_files:
  - path: data/audio_recordings.json
    audio_field: audio_file

# Annotation schemes
annotation_schemes:
  # Event annotations (temporal spans)
  # Format: describe events with timestamps
  - annotation_type: text
    name: event_annotations
    description: "List all sound events with start/end times. Format: 'start-end: event_class'"
    textarea: true
    placeholder: "0.0-2.5: dog_bark\n3.1-4.0: car_horn\n2.0-5.5: speech\n..."

  # Events detected (for quick summary)
  - annotation_type: multiselect
    name: events_present
    description: "Which sound events are present in this clip? (Select all)"
    labels:
      - Speech
      - Dog bark
      - Cat meow
      - Car horn
      - Car passing
      - Siren
      - Alarm
      - Door/knock
      - Footsteps
      - Music
      - Bird sounds
      - Rain
      - Wind
      - Construction/drilling
      - Glass breaking
      - Gunshot
      - Scream
      - Baby cry
      - Applause
      - Laughter

  # Number of distinct events
  - annotation_type: radio
    name: event_count
    description: "How many distinct sound events did you annotate?"
    labels:
      - "0 (silence/background only)"
      - "1-2 events"
      - "3-5 events"
      - "6-10 events"
      - "More than 10 events"

  # Event overlap
  - annotation_type: radio
    name: event_overlap
    description: "Are there overlapping sound events?"
    labels:
      - No overlap (events are sequential)
      - Some overlap
      - Heavy overlap (many simultaneous sounds)

  # Annotation difficulty
  - annotation_type: likert
    name: difficulty
    description: "How difficult was it to determine event boundaries?"
    size: 5
    min_label: "Very easy (clear boundaries)"
    max_label: "Very difficult (ambiguous)"

  # Background noise level
  - annotation_type: likert
    name: noise_level
    description: "How much background noise is present?"
    size: 5
    min_label: "Silent/clean"
    max_label: "Very noisy"

  # Confidence
  - annotation_type: likert
    name: confidence
    description: "Confidence in your temporal boundaries"
    size: 5
    min_label: "Low"
    max_label: "High"

# User settings
allow_all_users: true
instances_per_annotator: 75

# Output
output:
  path: annotations/
  format: json
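
The event_annotations field stores events as free text, one 'start-end: event_class' entry per line, so downstream tooling has to parse it. The following is a minimal sketch of such a parser, not part of the annotation tool itself: the names SoundEvent, parse_events, and has_overlap are hypothetical, and it assumes times are non-negative decimal seconds. The overlap check can be compared against the annotator's event_overlap answer.

```python
import re
from typing import List, NamedTuple


class SoundEvent(NamedTuple):
    start: float  # onset in seconds
    end: float    # offset in seconds
    label: str    # event class, e.g. "dog_bark"


# One event per line: "<start>-<end>: <event_class>" (assumed format)
_EVENT_LINE = re.compile(
    r"^\s*(\d+(?:\.\d+)?)\s*-\s*(\d+(?:\.\d+)?)\s*:\s*(\S.*?)\s*$"
)


def parse_events(text: str) -> List[SoundEvent]:
    """Parse the free-text event list into SoundEvent tuples sorted by onset."""
    events = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        match = _EVENT_LINE.match(line)
        if match is None:
            raise ValueError(
                f"line {line_no}: expected 'start-end: event_class', got {line!r}"
            )
        start, end = float(match.group(1)), float(match.group(2))
        if end <= start:
            raise ValueError(f"line {line_no}: end must be greater than start")
        events.append(SoundEvent(start, end, match.group(3)))
    return sorted(events)


def has_overlap(events: List[SoundEvent]) -> bool:
    """Return True if any two events are active at the same time."""
    latest_end = float("-inf")
    for event in sorted(events):
        if event.start < latest_end:
            return True
        latest_end = max(latest_end, event.end)
    return False


# The three example lines from the placeholder text in the config
events = parse_events("0.0-2.5: dog_bark\n3.1-4.0: car_horn\n2.0-5.5: speech")
for event in events:
    print(f"{event.start:>5.1f} {event.end:>5.1f}  {event.label}")
print("overlapping events:", has_overlap(events))
```

Raising on malformed lines rather than skipping them is a deliberate choice in this sketch: a silently dropped event would make the parsed list disagree with the annotator's event_count answer.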

Get This Design

This design is available in our showcase. Copy the configuration above to get started.

Quick start:

# Create your project folder
mkdir sound-event-detection
cd sound-event-detection
# Copy config.yaml from above
potato start config.yaml
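
The config reads its items from data/audio_recordings.json and takes the audio path from the audio_file field, so that file has to exist before the server starts. The sketch below shows one way to generate it. The record layout is an assumption: a JSON array of objects with a hypothetical "id" key next to "audio_file", and the helper names build_manifest and write_manifest are illustrative. Check the tool's documentation for the exact data format it expects.

```python
import json
from pathlib import Path
from typing import Dict, List


def build_manifest(audio_paths: List[str]) -> List[Dict[str, str]]:
    """Build one record per clip, in sorted path order.

    "audio_file" matches audio_field in config.yaml; "id" is a
    hypothetical key added here for illustration.
    """
    records = []
    for index, path in enumerate(sorted(audio_paths), start=1):
        records.append({"id": f"clip_{index:03d}", "audio_file": path})
    return records


def write_manifest(records: List[Dict[str, str]],
                   out_path: str = "data/audio_recordings.json") -> None:
    """Write the records as a JSON array, creating the data/ folder if needed."""
    out = Path(out_path)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(records, indent=2) + "\n", encoding="utf-8")


# Example clip paths (placeholders, not files shipped with the design)
records = build_manifest(["audio/street_02.wav", "audio/park_01.wav"])
print(json.dumps(records, indent=2))
```

Calling write_manifest(records) from inside the project folder would then create data/audio_recordings.json at the path the config points to.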

Details

Annotation Types

span, multiselect

Domain

Audio

Use Cases

sound event detection, temporal annotation, acoustic monitoring

Tags

audio, event detection, temporal, dcase, segmentation