Music Genre Classification Annotation
Create a music annotation task in Potato with waveform playback of audio previews, nested genre taxonomies, and multi-label genre tagging.
Potato Team
Genre labels are what recommendation systems, auto-generated playlists, and music discovery features run on. This post shows how to build genre-tagging interfaces in Potato, with audio previews and nested genre trees.
Basic genre classification
yaml
annotation_task_name: "Music Genre Classification"
data_files:
  - data/tracks.json
item_properties:
  audio_path: audio_preview
  metadata_fields: [artist, title, album]
# Show track metadata
display:
  show_metadata: true
  metadata_template: "{{artist}} - {{title}}"
annotation_schemes:
  - annotation_type: audio_annotation
    audio_display: waveform
    waveform_color: "#EC4899"
    progress_color: "#F472B6"
    speed_control: true
    show_duration: true
    name: primary_genre
    description: "Select the primary genre"
    labels:
      - Rock
      - Pop
      - Hip-Hop
      - Electronic
      - Jazz
      - Classical
      - R&B
      - Country
      - Metal
      - Folk

Nested genre taxonomy
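A nested taxonomy ties each subgenre to one parent genre, so a subgenre choice is only meaningful under that parent. The sketch below is plain Python, independent of Potato, and shows that parent-child rule as a lookup you could run over exported annotations. `TAXONOMY` and `is_valid_pair` are hypothetical names, and the mapping is a trimmed copy of the labels used in this section.

```python
# Hypothetical helper, not part of Potato: a trimmed genre -> subgenre mapping.
TAXONOMY = {
    "Rock": {"Classic Rock", "Alternative", "Indie Rock", "Hard Rock", "Punk", "Grunge"},
    "Electronic": {"House", "Techno", "Trance", "Drum & Bass", "Ambient", "Dubstep"},
    "Jazz": {"Bebop", "Smooth Jazz", "Fusion", "Big Band", "Cool Jazz"},
}


def is_valid_pair(genre: str, subgenre: str) -> bool:
    """Return True when `subgenre` is listed under `genre` in TAXONOMY."""
    # Unknown genres map to an empty set, so they never validate.
    return subgenre in TAXONOMY.get(genre, set())


print(is_valid_pair("Rock", "Grunge"))
print(is_valid_pair("Jazz", "Techno"))
```

The Potato configuration for the full two-level tree follows.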
yaml
annotation_schemes:
  # Top-level genre
  - annotation_type: radio
    name: genre
    description: "Primary genre"
    labels:
      - Rock
      - Electronic
      - Hip-Hop
      - Pop
      - Jazz
      - Classical
  # Subgenre (conditional)
  - annotation_type: radio
    name: subgenre
    description: "Subgenre"
    conditional:
      depends_on: genre
      options:
        Rock:
          - Classic Rock
          - Alternative
          - Indie Rock
          - Hard Rock
          - Punk
          - Grunge
        Electronic:
          - House
          - Techno
          - Trance
          - Drum & Bass
          - Ambient
          - Dubstep
        "Hip-Hop":
          - East Coast
          - West Coast
          - Trap
          - Conscious
          - Lo-fi
        Pop:
          - Synth Pop
          - Dance Pop
          - Indie Pop
          - K-Pop
          - Teen Pop
        Jazz:
          - Bebop
          - Smooth Jazz
          - Fusion
          - Big Band
          - Cool Jazz
        Classical:
          - Baroque
          - Romantic
          - Contemporary
          - Opera
          - Chamber

Multi-label genre tagging
Plenty of tracks sit in more than one genre, so let annotators pick several:
yaml
annotation_schemes:
  - annotation_type: multiselect
    name: genres
    description: "Select ALL applicable genres (up to 3)"
    labels:
      - Rock
      - Pop
      - Electronic
      - Hip-Hop
      - R&B
      - Jazz
      - Classical
      - Country
      - Folk
      - Metal
      - Punk
      - Indie
      - World
      - Latin
    min_selections: 1
    max_selections: 3
  - annotation_type: radio
    name: primary_genre
    description: "Which is the PRIMARY genre?"
    dynamic_from: genres  # Only show selected genres as options

Complete music annotation config
yaml
annotation_task_name: "Music Tagging and Classification"
data_files:
  - data/music_library.json
item_properties:
  audio_path: preview_url
  metadata_fields:
    - artist
    - title
    - album
    - year
    - duration
display:
  show_metadata: true
  metadata_layout: card
  fields:
    - label: "Artist"
      field: artist
    - label: "Track"
      field: title
    - label: "Album"
      field: album
    - label: "Year"
      field: year
annotation_schemes:
  # Audio playback configuration
  - annotation_type: audio_annotation
    audio_display: waveform
    waveform_color: "#8B5CF6"
    progress_color: "#A78BFA"
    cursor_color: "#F59E0B"
    height: 100
    show_duration: true
    show_current_time: true
    speed_control: true
    speed_options: [0.75, 1.0, 1.25, 1.5]
    volume_control: true
    default_volume: 0.7
    autoplay: false
    loop: true
  # Genre classification
  - annotation_type: multiselect
    name: genres
    description: "Select all applicable genres"
    labels:
      - Rock
      - Pop
      - Electronic/Dance
      - Hip-Hop/Rap
      - R&B/Soul
      - Jazz
      - Classical
      - Country
      - Folk/Acoustic
      - Metal
      - Punk
      - Indie
      - World/International
      - Latin
      - Blues
      - Reggae
    min_selections: 1
    max_selections: 3
  # Mood/Energy
  - annotation_type: multiselect
    name: mood
    description: "Select the mood(s) of this track"
    labels:
      - Happy/Uplifting
      - Sad/Melancholic
      - Energetic
      - Calm/Relaxing
      - Aggressive
      - Romantic
      - Dark/Moody
      - Nostalgic
      - Motivational
    min_selections: 1
  # Energy level
  - annotation_type: likert
    name: energy
    description: "Energy level"
    size: 5
    min_label: "Very low energy"
    max_label: "Very high energy"
  # Danceability
  - annotation_type: likert
    name: danceability
    description: "How danceable is this track?"
    size: 5
    min_label: "Not danceable"
    max_label: "Very danceable"
  # Vocal content
  - annotation_type: radio
    name: vocals
    description: "Vocal content"
    labels:
      - Instrumental only
      - Mostly instrumental
      - Balanced
      - Mostly vocals
      - Vocals dominant
  # Era/Style
  - annotation_type: radio
    name: era
    description: "Musical era/style"
    labels:
      - Pre-1970s (Classic)
      - 1970s
      - 1980s
      - 1990s
      - 2000s
      - 2010s
      - 2020s/Current
  # Quality flags
  - annotation_type: multiselect
    name: flags
    description: "Special flags (if applicable)"
    labels:
      - Explicit content
      - Live recording
      - Remix/Cover
      - Holiday/Seasonal
      - Soundtrack
      - Spoken word sections
    required: false
  # Confidence
  - annotation_type: likert
    name: confidence
    description: "Confidence in your genre classification"
    size: 5
    min_label: "Uncertain"
    max_label: "Very confident"
annotation_guidelines:
  title: "Music Tagging Guidelines"
  content: |
    ## Genre Selection
    - Choose 1-3 genres that best describe the track
    - Select the most specific applicable genre
    - When unsure, lean toward broader categories
    ## Mood Assessment
    - Consider the overall emotional tone
    - A track can have multiple moods
    - Think about how the music makes you feel
    ## Energy & Danceability
    - Energy: tempo, intensity, power
    - Danceability: rhythm, beat clarity, groove
    ## Listening Tips
    - Listen to at least 30 seconds
    - Pay attention to drops, changes, and transitions
    - Consider instrumentation and production style
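The config above expects every track record to carry a `preview_url` plus the listed metadata fields, so it can help to check the input before annotators see it. The sketch below is one way to do that in plain Python. It is not a Potato feature: `REQUIRED_FIELDS` and `find_incomplete_tracks` are hypothetical names, the records are assumed to be already loaded as a list of dicts (the on-disk layout of `data/music_library.json` is not specified here), and the sample values are placeholders.

```python
# Fields named under item_properties in the config above.
REQUIRED_FIELDS = ("preview_url", "artist", "title", "album", "year", "duration")


def find_incomplete_tracks(tracks):
    """Map each incomplete record (by id, else list index) to its missing fields."""
    problems = {}
    for index, track in enumerate(tracks):
        # A field counts as missing when it is absent, None, or an empty string.
        missing = [f for f in REQUIRED_FIELDS if track.get(f) in (None, "")]
        if missing:
            problems[track.get("id", index)] = missing
    return problems


# Placeholder records for illustration only.
tracks = [
    {"id": "track_001", "preview_url": "/audio/preview_001.mp3",
     "artist": "The Beatles", "title": "Here Comes the Sun",
     "album": "Abbey Road", "year": 1969, "duration": 30},
    {"id": "track_002", "preview_url": "", "artist": "Unknown", "title": "Untitled"},
]

for track_id, missing in find_incomplete_tracks(tracks).items():
    print(f"{track_id}: missing {', '.join(missing)}")
```

Running a check like this before launching the task catches records that would otherwise show up as blank metadata cards or unplayable previews.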
Output format
json
{
  "id": "track_001",
  "audio_url": "/audio/preview_001.mp3",
  "metadata": {
    "artist": "The Beatles",
    "title": "Here Comes the Sun",
    "album": "Abbey Road",
    "year": 1969
  },
  "annotations": {
    "genres": ["Rock", "Pop"],
    "mood": ["Happy/Uplifting", "Nostalgic"],
    "energy": 3,
    "danceability": 2,
    "vocals": "Balanced",
    "era": "Pre-1970s (Classic)",
    "flags": [],
    "confidence": 5
  }
}

Tips for music annotation
- Good headphones matter. You will miss genre cues on laptop speakers.
- Keep the volume steady so you are not constantly adjusting and tiring yourself out.
- Listen to enough of each preview to catch the changes, not just the intro.
- Brush up on what each genre actually means before you start.
- Take breaks. Tired ears make worse calls.
Where to go next
- Add audio event detection for picking out instruments
- Set up crowdsourcing to get a wider range of ears on the data
- Learn about agreement metrics for music
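As a first look at agreement on a multi-label field such as `genres`, one simple option is the Jaccard overlap between two annotators' label sets, averaged over the tracks both of them tagged. The sketch below assumes each annotator's output has already been reduced to a dict of track id to genre list. The function names and sample data are hypothetical, and this is plain Python rather than anything built into Potato.

```python
def jaccard(labels_a, labels_b):
    """Jaccard overlap of two label collections (1.0 when both are empty)."""
    a, b = set(labels_a), set(labels_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def mean_jaccard(annotator_a, annotator_b):
    """Average Jaccard overlap over the tracks both annotators labeled."""
    shared = annotator_a.keys() & annotator_b.keys()
    if not shared:
        raise ValueError("annotators have no tracks in common")
    total = sum(jaccard(annotator_a[t], annotator_b[t]) for t in shared)
    return total / len(shared)


# Hypothetical labels from two annotators.
ann_a = {"track_001": ["Rock", "Pop"], "track_002": ["Jazz"]}
ann_b = {"track_001": ["Rock"], "track_002": ["Jazz"]}

print(mean_jaccard(ann_a, ann_b))  # 0.75
```

Jaccard overlap is not corrected for chance agreement. For single-label fields such as `vocals` or `era`, a chance-corrected statistic like Cohen's kappa is the more usual choice.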
For the full set of audio annotation options (waveform display, playback controls, region selection), see the audio documentation at /docs/features/audio-annotation.