advanced, audio
Sound Event Detection
Temporal sound event annotation with strong labels following DCASE Challenge protocols.
Configuration File: config.yaml
annotation_task_name: "Sound Event Detection"
port: 8000

# Data configuration
data_files:
  - "data/audio_recordings.json"

item_properties:
  id_key: "id"
  text_key: "text"

# Annotation schemes
annotation_schemes:
  # Event annotations (temporal spans)
  # Format: describe events with timestamps
  - annotation_type: text
    name: event_annotations
    description: "List all sound events with start/end times. Format: 'start-end: event_class'"
    textarea: true
    placeholder: "0.0-2.5: dog_bark\n3.1-4.0: car_horn\n2.0-5.5: speech\n..."

  # Events detected (for quick summary)
  - annotation_type: multiselect
    name: events_present
    description: "Which sound events are present in this clip? (Select all)"
    labels:
      - Speech
      - Dog bark
      - Cat meow
      - Car horn
      - Car passing
      - Siren
      - Alarm
      - Door/knock
      - Footsteps
      - Music
      - Bird sounds
      - Rain
      - Wind
      - Construction/drilling
      - Glass breaking
      - Gunshot
      - Scream
      - Baby cry
      - Applause
      - Laughter

  # Number of distinct events
  - annotation_type: radio
    name: event_count
    description: "How many distinct sound events did you annotate?"
    labels:
      - "0 (silence/background only)"
      - "1-2 events"
      - "3-5 events"
      - "6-10 events"
      - "More than 10 events"

  # Event overlap
  - annotation_type: radio
    name: event_overlap
    description: "Are there overlapping sound events?"
    labels:
      - No overlap (events are sequential)
      - Some overlap
      - Heavy overlap (many simultaneous sounds)

  # Annotation difficulty
  - annotation_type: likert
    name: difficulty
    description: "How difficult was it to determine event boundaries?"
    size: 5
    min_label: "Very easy (clear boundaries)"
    max_label: "Very difficult (ambiguous)"

  # Background noise level
  - annotation_type: likert
    name: noise_level
    description: "How much background noise is present?"
    size: 5
    min_label: "Silent/clean"
    max_label: "Very noisy"

  # Confidence
  - annotation_type: likert
    name: confidence
    description: "Confidence in your temporal boundaries"
    size: 5
    min_label: "Low"
    max_label: "High"

# User settings
allow_all_users: true
instances_per_annotator: 75

# Output
output_annotation_dir: "annotation_output/"
output_annotation_format: "json"
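The event_annotations field in the configuration above collects free text in the form 'start-end: event_class', so the timestamps need parsing before they can be used as strong labels. The following is a minimal sketch of such a parser, not part of Potato itself. The names SoundEvent, parse_event_annotations, and has_overlap are hypothetical, and the sketch assumes times are non-negative decimal seconds, one event per line.

```python
import re
from typing import List, NamedTuple


class SoundEvent(NamedTuple):
    """One annotated event (hypothetical helper type, times in seconds)."""
    start: float
    end: float
    label: str


# Matches lines such as "0.0-2.5: dog_bark" (assumed format, see the
# description of the event_annotations field).
_LINE_RE = re.compile(
    r"^\s*(\d+(?:\.\d+)?)\s*-\s*(\d+(?:\.\d+)?)\s*:\s*(\S.*?)\s*$"
)


def parse_event_annotations(text: str) -> List[SoundEvent]:
    """Parse the annotator's textarea content into events sorted by start time."""
    events = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        match = _LINE_RE.match(line)
        if match is None:
            raise ValueError(
                f"line {line_no}: expected 'start-end: event_class', got {line!r}"
            )
        start, end = float(match.group(1)), float(match.group(2))
        if end <= start:
            raise ValueError(f"line {line_no}: end time must be after start time")
        events.append(SoundEvent(start, end, match.group(3)))
    return sorted(events)


def has_overlap(events: List[SoundEvent]) -> bool:
    """Return True if any two events share a stretch of time."""
    latest_end = float("-inf")
    for event in sorted(events):
        if event.start < latest_end:
            return True
        latest_end = max(latest_end, event.end)
    return False
```

A check like has_overlap could be compared against the annotator's event_overlap answer to flag inconsistent submissions; events that merely touch (one ends exactly when the next starts) are treated as sequential here, which is a design choice rather than a DCASE requirement.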
Get This Design
This design is available in our showcase. Copy the configuration above to get started.
Quick start:
# Create your project folder
mkdir sound-event-detection
cd sound-event-detection

# Copy config.yaml from above
potato start config.yaml
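The configuration reads items from data/audio_recordings.json and looks up each item's "id" and "text" keys, so that file has to exist before the server starts. The sketch below builds a small placeholder file. It assumes one JSON object per line and that "text" holds a reference to the audio clip; both are assumptions to check against your Potato version, and the clip paths and the helper names build_records, to_json_lines, and write_data_file are hypothetical.

```python
import json
from pathlib import Path, PurePosixPath
from typing import Dict, List


def build_records(clip_paths: List[str]) -> List[Dict[str, str]]:
    """Create one record per clip using the id_key/text_key names from config.yaml."""
    return [
        {"id": PurePosixPath(clip).stem, "text": clip}
        for clip in clip_paths
    ]


def to_json_lines(records: List[Dict[str, str]]) -> str:
    """Serialize records as one JSON object per line (assumed input layout)."""
    return "".join(json.dumps(record) + "\n" for record in records)


def write_data_file(records: List[Dict[str, str]],
                    path: str = "data/audio_recordings.json") -> Path:
    """Write the records to the path listed under data_files in config.yaml."""
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(to_json_lines(records), encoding="utf-8")
    return out
```

For example, calling write_data_file(build_records(["audio/clip_001.wav", "audio/clip_002.wav"])) from the project folder would create the data file with two placeholder items whose ids are derived from the file names.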
Details
Annotation Types
span, multiselect
Domain
Audio
Use Cases
sound event detection, temporal annotation, acoustic monitoring
Tags
audio, event detection, temporal, dcase, segmentation
Related Designs
Audio Transcription Review
Review and correct automatic speech recognition transcriptions with waveform visualization.
likert, multiselect
AudioSet Event Classification
Multi-label audio event tagging following the AudioSet ontology for weak supervision.
multiselect
Detecting Persuasion Techniques in News
Identification of propaganda and persuasion techniques in news articles through both multi-label classification and span-level detection. Based on SemEval-2023 Task 3 (Piskorski et al.).
multiselect, span