intermediate, audio

AudioSet Event Classification

Multi-label audio event tagging that follows the AudioSet ontology and produces clip-level labels for weak supervision.

audio annotation

Configuration File: config.yaml

task_name: "AudioSet Event Classification"

# Server configuration
server:
  port: 8000

# Audio settings
audio:
  enabled: true
  display: waveform
  waveform_color: "#6E56CF"
  progress_color: "#A18FFF"
  speed_control: true

# Data configuration
data_files:
  - path: data/audio_clips.json
    audio_field: audio_file

# Annotation schemes
annotation_schemes:
  # Human sounds
  - annotation_type: multiselect
    name: human_sounds
    description: "Human sounds present (select all that apply)"
    labels:
      - Speech
      - Singing
      - Shout
      - Whisper
      - Laughter
      - Crying/sobbing
      - Cough
      - Sneeze
      - Breathing
      - Footsteps
      - Clapping
      - None

  # Animal sounds
  - annotation_type: multiselect
    name: animal_sounds
    description: "Animal sounds present (select all that apply)"
    labels:
      - Dog bark
      - Cat meow
      - Bird chirp/song
      - Rooster crow
      - Insect buzz
      - Horse neigh
      - Cow moo
      - None

  # Music and instruments
  - annotation_type: multiselect
    name: music_sounds
    description: "Music and instruments present (select all that apply)"
    labels:
      - Music
      - Guitar
      - Piano
      - Drums
      - Violin
      - Singing (musical)
      - Electronic music
      - None

  # Environmental sounds
  - annotation_type: multiselect
    name: environment_sounds
    description: "Environmental sounds present (select all that apply)"
    labels:
      - Wind
      - Rain
      - Thunder
      - Water (stream/river)
      - Fire crackling
      - Traffic noise
      - Siren
      - Bell
      - Door slam
      - None

  # Mechanical sounds
  - annotation_type: multiselect
    name: mechanical_sounds
    description: "Mechanical/vehicle sounds present (select all that apply)"
    labels:
      - Car engine
      - Motorcycle
      - Train
      - Aircraft
      - Power tools
      - Keyboard typing
      - Phone ringing
      - None

  # Confidence rating
  - annotation_type: likert
    name: confidence
    description: "How confident are you in your labels?"
    size: 5
    min_label: "Not confident"
    max_label: "Very confident"

  # Notes
  - annotation_type: text
    name: notes
    description: "Additional sounds or notes (optional)"
    textarea: false
    required: false

# User settings
allow_all_users: true
instances_per_annotator: 200

# Output
output:
  path: annotations/
  format: json
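
The config points `data_files` at data/audio_clips.json and names `audio_file` as the `audio_field`, but the page does not show what that file contains. The following is a minimal sketch of one way to generate it, assuming one JSON object per line with an `id` key alongside `audio_file`. The `id` key, the line-delimited layout, the `data/audio` folder, and the helper names `build_records` and `write_records` are all assumptions for illustration, so check the data-format documentation for your Potato version before relying on them.

```python
import json
from pathlib import Path


def build_records(audio_dir, extensions=(".wav", ".mp3", ".flac")):
    """Return one record per audio file found directly under audio_dir.

    Each record has an "id" (assumed key) and the "audio_file" key that
    config.yaml names in `audio_field`.
    """
    records = []
    for path in sorted(Path(audio_dir).iterdir()):
        if path.is_file() and path.suffix.lower() in extensions:
            records.append({"id": path.stem, "audio_file": path.as_posix()})
    return records


def write_records(records, out_path):
    """Write records as one JSON object per line (assumed layout)."""
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")
```

For example, `write_records(build_records("data/audio"), "data/audio_clips.json")` would list every clip in a hypothetical data/audio folder and write the file at the path the config expects.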

Get This Design

This design is available in our showcase. Copy the configuration above to get started.

Quick start:

# Create your project folder
mkdir audioset-event-classification
cd audioset-event-classification
# Copy config.yaml from above
potato start config.yaml
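
Once clips are annotated, the five multiselect schemes are typically flattened into one multi-hot target vector per clip for weak-label training. The sketch below shows that step using the label lists from the config above. It assumes each clip's annotations have already been loaded into a plain dict mapping scheme name to the list of selected labels; that input shape and the function name `to_multi_hot` are hypothetical, since the exact layout of the output files is not shown here. The "None" option is treated as "no event in this category" and is rejected when combined with other labels from the same scheme.

```python
NONE_LABEL = "None"

# Label lists copied from config.yaml, without the "None" option.
SCHEMES = {
    "human_sounds": ["Speech", "Singing", "Shout", "Whisper", "Laughter",
                     "Crying/sobbing", "Cough", "Sneeze", "Breathing",
                     "Footsteps", "Clapping"],
    "animal_sounds": ["Dog bark", "Cat meow", "Bird chirp/song",
                      "Rooster crow", "Insect buzz", "Horse neigh", "Cow moo"],
    "music_sounds": ["Music", "Guitar", "Piano", "Drums", "Violin",
                     "Singing (musical)", "Electronic music"],
    "environment_sounds": ["Wind", "Rain", "Thunder", "Water (stream/river)",
                           "Fire crackling", "Traffic noise", "Siren", "Bell",
                           "Door slam"],
    "mechanical_sounds": ["Car engine", "Motorcycle", "Train", "Aircraft",
                          "Power tools", "Keyboard typing", "Phone ringing"],
}

# Flat, ordered vocabulary of (scheme, label) pairs; the order fixes the
# position of each label in the output vector.
VOCAB = [(scheme, label) for scheme, labels in SCHEMES.items() for label in labels]


def to_multi_hot(selection):
    """Convert {scheme_name: [selected labels]} into a 0/1 list over VOCAB.

    Keys that are not multiselect schemes (e.g. confidence, notes) are ignored.
    Raises ValueError on unknown labels or on "None" combined with other labels.
    """
    chosen = set()
    for scheme, labels in selection.items():
        if scheme not in SCHEMES:
            continue
        picked = set(labels)
        if NONE_LABEL in picked and len(picked) > 1:
            raise ValueError(f"{scheme}: 'None' selected together with other labels")
        unknown = picked - set(SCHEMES[scheme]) - {NONE_LABEL}
        if unknown:
            raise ValueError(f"{scheme}: unknown labels {sorted(unknown)}")
        chosen.update((scheme, label) for label in picked if label != NONE_LABEL)
    return [1 if pair in chosen else 0 for pair in VOCAB]
```

Keeping the scheme name in each vocabulary entry avoids collisions if two categories ever share a label name, and validating the "None" option at conversion time catches contradictory selections before they reach training.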

Details

Annotation Types

multiselect, likert, text

Domain

Audio, Speech

Use Cases

audio classification, sound event detection, weak labeling

Tags

audio, audioset, multi-label, event classification