# Music Genre Classification Annotation

Source: https://www.potatoannotator.com/blog/music-genre-classification

Genre labels are what recommendation systems, auto-generated playlists, and music discovery features run on. This post shows how to build genre-tagging interfaces in Potato, with audio previews and nested genre trees.

## Basic genre classification

```yaml
annotation_task_name: "Music Genre Classification"

data_files:
  - data/tracks.json

item_properties:
  audio_path: audio_preview
  metadata_fields: [artist, title, album]

# Show track metadata
display:
  show_metadata: true
  metadata_template: "{{artist}} - {{title}}"

annotation_schemes:
  - annotation_type: audio_annotation
    audio_display: waveform
    waveform_color: "#EC4899"
    progress_color: "#F472B6"
    speed_control: true
    show_duration: true
    name: primary_genre
    description: "Select the primary genre"
    labels:
      - Rock
      - Pop
      - Hip-Hop
      - Electronic
      - Jazz
      - Classical
      - R&B
      - Country
      - Metal
      - Folk
```
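
Before loading a file like `data/tracks.json`, it can help to confirm that every item carries the fields the config refers to. The sketch below is not part of Potato: the record layout, `REQUIRED_FIELDS`, `missing_fields`, and `render_template` are hypothetical names used only for illustration, and the `{{field}}` substitution is a plain regex stand-in for whatever templating the tool actually uses.

```python
import re

# Hypothetical track record. Field names mirror the config above
# (audio_preview, artist, title, album); the file layout is an assumption.
track = {
    "id": "track_001",
    "audio_preview": "/audio/preview_001.mp3",
    "artist": "The Beatles",
    "title": "Here Comes the Sun",
    "album": "Abbey Road",
}

REQUIRED_FIELDS = ["audio_preview", "artist", "title", "album"]


def missing_fields(item: dict) -> list[str]:
    """Return the required fields that are absent or empty in an item."""
    return [field for field in REQUIRED_FIELDS if not item.get(field)]


def render_template(template: str, item: dict) -> str:
    """Fill {{field}} placeholders; unknown fields are left untouched."""
    def substitute(match: re.Match) -> str:
        key = match.group(1)
        return str(item[key]) if key in item else match.group(0)

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


print(missing_fields(track))
print(render_template("{{artist}} - {{title}}", track))
```

Running a check like this over the whole data file before starting a task catches tracks with no preview or no title before annotators ever see them.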

## Nested genre taxonomy

Here the subgenre question depends on the answer to the genre question: `depends_on` names the parent scheme, and `options` maps each top-level genre to the subgenres offered for it.

```yaml
annotation_schemes:
  # Top-level genre
  - annotation_type: radio
    name: genre
    description: "Primary genre"
    labels:
      - Rock
      - Electronic
      - Hip-Hop
      - Pop
      - Jazz
      - Classical

  # Subgenre (conditional)
  - annotation_type: radio
    name: subgenre
    description: "Subgenre"
    conditional:
      depends_on: genre
      options:
        Rock:
          - Classic Rock
          - Alternative
          - Indie Rock
          - Hard Rock
          - Punk
          - Grunge
        Electronic:
          - House
          - Techno
          - Trance
          - Drum & Bass
          - Ambient
          - Dubstep
        "Hip-Hop":
          - East Coast
          - West Coast
          - Trap
          - Conscious
          - Lo-fi
        Pop:
          - Synth Pop
          - Dance Pop
          - Indie Pop
          - K-Pop
          - Teen Pop
        Jazz:
          - Bebop
          - Smooth Jazz
          - Fusion
          - Big Band
          - Cool Jazz
        Classical:
          - Baroque
          - Romantic
          - Contemporary
          - Opera
          - Chamber
```
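
The same genre-to-subgenre mapping is useful after annotation, for example to catch a record whose subgenre does not belong to its genre. This is a minimal sketch under the assumption that you keep a copy of the taxonomy as a Python dict; `SUBGENRES`, `subgenre_options`, and `is_consistent` are hypothetical names, and only part of the tree is reproduced.

```python
# A subset of the taxonomy from the config above, as a plain dict.
SUBGENRES = {
    "Rock": ["Classic Rock", "Alternative", "Indie Rock", "Hard Rock", "Punk", "Grunge"],
    "Electronic": ["House", "Techno", "Trance", "Drum & Bass", "Ambient", "Dubstep"],
    "Jazz": ["Bebop", "Smooth Jazz", "Fusion", "Big Band", "Cool Jazz"],
}


def subgenre_options(genre: str) -> list[str]:
    """Subgenres to offer once a top-level genre is chosen (empty if unknown)."""
    return SUBGENRES.get(genre, [])


def is_consistent(genre: str, subgenre: str) -> bool:
    """True when the subgenre sits under the chosen genre."""
    return subgenre in subgenre_options(genre)


print(subgenre_options("Jazz"))
print(is_consistent("Rock", "Techno"))
```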

## Multi-label genre tagging

Plenty of tracks sit in more than one genre, so let annotators pick several:

```yaml
annotation_schemes:
  - annotation_type: multiselect
    name: genres
    description: "Select ALL applicable genres (up to 3)"
    labels:
      - Rock
      - Pop
      - Electronic
      - Hip-Hop
      - R&B
      - Jazz
      - Classical
      - Country
      - Folk
      - Metal
      - Punk
      - Indie
      - World
      - Latin
    min_selections: 1
    max_selections: 3

  - annotation_type: radio
    name: primary_genre
    description: "Which is the PRIMARY genre?"
    dynamic_from: genres  # Only show selected genres as options
```
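
The two questions imply three rules: between one and three genres, no duplicates, and a primary genre drawn from the selected ones. A sketch of those rules as a post-hoc check follows; `check_genre_answers` is a hypothetical helper, not a Potato function, and it assumes the answers arrive as a list of strings plus a single string.

```python
def check_genre_answers(
    genres: list[str],
    primary_genre: str,
    min_sel: int = 1,
    max_sel: int = 3,
) -> list[str]:
    """Return a list of problems; an empty list means the answers are valid."""
    problems = []
    if len(set(genres)) != len(genres):
        problems.append("duplicate genre selected")
    if not min_sel <= len(genres) <= max_sel:
        problems.append(f"expected {min_sel}-{max_sel} genres, got {len(genres)}")
    if primary_genre not in genres:
        problems.append("primary genre is not among the selected genres")
    return problems


print(check_genre_answers(["Rock", "Pop"], "Rock"))
print(check_genre_answers(["Rock", "Pop", "Jazz", "Folk"], "Metal"))
```

Even if the interface enforces these limits, rerunning the check on exported data guards against records that were edited or merged outside the tool.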

## Complete music annotation config

```yaml
annotation_task_name: "Music Tagging and Classification"

data_files:
  - data/music_library.json

item_properties:
  audio_path: preview_url
  metadata_fields:
    - artist
    - title
    - album
    - year
    - duration

display:
  show_metadata: true
  metadata_layout: card
  fields:
    - label: "Artist"
      field: artist
    - label: "Track"
      field: title
    - label: "Album"
      field: album
    - label: "Year"
      field: year

annotation_schemes:
  # Audio playback configuration
  - annotation_type: audio_annotation
    audio_display: waveform
    waveform_color: "#8B5CF6"
    progress_color: "#A78BFA"
    cursor_color: "#F59E0B"
    height: 100
    show_duration: true
    show_current_time: true
    speed_control: true
    speed_options: [0.75, 1.0, 1.25, 1.5]
    volume_control: true
    default_volume: 0.7
    autoplay: false
    loop: true

  # Genre classification
  - annotation_type: multiselect
    name: genres
    description: "Select all applicable genres (up to 3)"
    labels:
      - Rock
      - Pop
      - Electronic/Dance
      - Hip-Hop/Rap
      - R&B/Soul
      - Jazz
      - Classical
      - Country
      - Folk/Acoustic
      - Metal
      - Punk
      - Indie
      - World/International
      - Latin
      - Blues
      - Reggae
    min_selections: 1
    max_selections: 3

  # Mood/Energy
  - annotation_type: multiselect
    name: mood
    description: "Select the mood(s) of this track"
    labels:
      - Happy/Uplifting
      - Sad/Melancholic
      - Energetic
      - Calm/Relaxing
      - Aggressive
      - Romantic
      - Dark/Moody
      - Nostalgic
      - Motivational
    min_selections: 1

  # Energy level
  - annotation_type: likert
    name: energy
    description: "Energy level"
    size: 5
    min_label: "Very low energy"
    max_label: "Very high energy"

  # Danceability
  - annotation_type: likert
    name: danceability
    description: "How danceable is this track?"
    size: 5
    min_label: "Not danceable"
    max_label: "Very danceable"

  # Vocal content
  - annotation_type: radio
    name: vocals
    description: "Vocal content"
    labels:
      - Instrumental only
      - Mostly instrumental
      - Balanced
      - Mostly vocals
      - Vocals dominant

  # Era/Style
  - annotation_type: radio
    name: era
    description: "Musical era/style"
    labels:
      - Pre-1970s (Classic)
      - 1970s
      - 1980s
      - 1990s
      - 2000s
      - 2010s
      - 2020s/Current

  # Quality flags
  - annotation_type: multiselect
    name: flags
    description: "Special flags (if applicable)"
    labels:
      - Explicit content
      - Live recording
      - Remix/Cover
      - Holiday/Seasonal
      - Soundtrack
      - Spoken word sections
    required: false

  # Confidence
  - annotation_type: likert
    name: confidence
    description: "Confidence in your genre classification"
    size: 5
    min_label: "Uncertain"
    max_label: "Very confident"

annotation_guidelines:
  title: "Music Tagging Guidelines"
  content: |
    ## Genre Selection
    - Choose 1-3 genres that best describe the track
    - Prefer the most specific genre that applies
    - When unsure between specific genres, fall back to the broader category

    ## Mood Assessment
    - Consider the overall emotional tone
    - A track can have multiple moods
    - Think about how the music makes you feel

    ## Energy & Danceability
    - Energy: tempo, intensity, power
    - Danceability: rhythm, beat clarity, groove

    ## Listening Tips
    - Listen to at least 30 seconds
    - Pay attention to drops, changes, and transitions
    - Consider instrumentation and production style

```
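
With this many schemes, a small consistency check on collected answers is worth having. The sketch below assumes answers are stored as a dict keyed by scheme name, that Likert answers are integers from 1 to 5, and that the radio questions are mandatory; none of that is confirmed by the config itself. `MULTISELECT_LIMITS`, `LIKERT_FIELDS`, `RADIO_FIELDS`, and `find_problems` are hypothetical names.

```python
# Constraints copied by hand from the config above (assumed, not read from it).
MULTISELECT_LIMITS = {"genres": (1, 3), "mood": (1, None), "flags": (0, None)}
LIKERT_FIELDS = ["energy", "danceability", "confidence"]  # size: 5
RADIO_FIELDS = ["vocals", "era"]


def find_problems(answers: dict) -> list[str]:
    """Return human-readable problems found in one set of answers."""
    problems = []
    for name, (low, high) in MULTISELECT_LIMITS.items():
        count = len(answers.get(name, []))
        if count < low or (high is not None and count > high):
            problems.append(f"{name}: {count} selections outside allowed range")
    for name in LIKERT_FIELDS:
        value = answers.get(name)
        if not isinstance(value, int) or not 1 <= value <= 5:
            problems.append(f"{name}: expected an integer from 1 to 5")
    for name in RADIO_FIELDS:
        if not answers.get(name):
            problems.append(f"{name}: no option selected")
    return problems


answers = {
    "genres": ["Rock", "Pop"],
    "mood": ["Nostalgic"],
    "energy": 3,
    "danceability": 2,
    "vocals": "Balanced",
    "era": "1970s",
    "flags": [],
    "confidence": 5,
}
print(find_problems(answers))
```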

## Output format

```json
{
  "id": "track_001",
  "audio_url": "/audio/preview_001.mp3",
  "metadata": {
    "artist": "The Beatles",
    "title": "Here Comes the Sun",
    "album": "Abbey Road",
    "year": 1969
  },
  "annotations": {
    "genres": ["Rock", "Pop"],
    "mood": ["Happy/Uplifting", "Nostalgic"],
    "energy": 3,
    "danceability": 2,
    "vocals": "Balanced",
    "era": "Pre-1970s (Classic)",
    "flags": [],
    "confidence": 5
  }
}
```
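
To feed records like this into a classifier or recommender, the multi-label fields usually need to become fixed-length vectors. The following is a minimal sketch of multi-hot encoding, assuming the record shape shown above and the genre list from the complete config; `GENRES`, `multi_hot`, and `to_row` are hypothetical names.

```python
# Label order fixes the vector positions; taken from the complete config.
GENRES = [
    "Rock", "Pop", "Electronic/Dance", "Hip-Hop/Rap", "R&B/Soul", "Jazz",
    "Classical", "Country", "Folk/Acoustic", "Metal", "Punk", "Indie",
    "World/International", "Latin", "Blues", "Reggae",
]


def multi_hot(selected: list[str], vocabulary: list[str]) -> list[int]:
    """Encode selected labels as 0/1 flags in vocabulary order."""
    unknown = set(selected) - set(vocabulary)
    if unknown:
        raise ValueError(f"labels not in vocabulary: {sorted(unknown)}")
    return [1 if label in selected else 0 for label in vocabulary]


def to_row(record: dict) -> dict:
    """Flatten one annotated record into a training-friendly row."""
    answers = record["annotations"]
    return {
        "id": record["id"],
        "genres": multi_hot(answers["genres"], GENRES),
        "energy": answers["energy"],
        "danceability": answers["danceability"],
        "confidence": answers["confidence"],
    }


record = {
    "id": "track_001",
    "annotations": {
        "genres": ["Rock", "Pop"],
        "energy": 3,
        "danceability": 2,
        "confidence": 5,
    },
}
print(to_row(record))
```

Raising on unknown labels rather than silently dropping them makes taxonomy drift visible, for instance when an older export still uses `Electronic` instead of `Electronic/Dance`.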

## Tips for music annotation

1. Good headphones matter. You will miss genre cues on laptop speakers.
2. Keep the volume steady so you are not constantly adjusting and tiring yourself out.
3. Listen to enough of each preview to catch the changes, not just the intro.
4. Brush up on what each genre actually means before you start.
5. Take breaks. Tired ears make worse calls.

## Where to go next

- Add [audio event detection](/blog/audio-event-detection) for picking out instruments
- Set up [crowdsourcing](/blog/prolific-integration) to get a wider range of ears on the data
- Learn about [agreement metrics](/blog/inter-annotator-agreement) for music

For the full set of audio annotation options (waveform display, playback controls, region selection), see the [source documentation](https://github.com/davidjurgens/potato/blob/master/docs/annotation-types/multimedia/audio_annotation.md).

---

*Full audio documentation at [/docs/features/audio-annotation](/docs/features/audio-annotation).*
