Music Genre Classification Annotation

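This tutorial shows how to build a music annotation task in Potato with waveform visualization, 30-second preview playback, a hierarchical genre label tree, and pairwise preference comparison.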

Potato Team

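Genre labels are the foundation that recommendation systems, auto-generated playlists, and music discovery features run on. This article shows how to build a genre tagging interface in Potato using audio previews and a nested genre tree.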

Basic Genre Classification

yaml
annotation_task_name: "Music Genre Classification"
 
data_files:
  - data/tracks.json
 
item_properties:
  audio_path: audio_preview
  metadata_fields: [artist, title, album]
 
# Show track metadata
display:
  show_metadata: true
  metadata_template: "{{artist}} - {{title}}"
 
annotation_schemes:
  - annotation_type: audio_annotation
    audio_display: waveform
    waveform_color: "#EC4899"
    progress_color: "#F472B6"
    speed_control: true
    show_duration: true
    name: primary_genre
    description: "Select the primary genre"
    labels:
      - Rock
      - Pop
      - Hip-Hop
      - Electronic
      - Jazz
      - Classical
      - R&B
      - Country
      - Metal
      - Folk
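
The config above reads items from `data/tracks.json` and maps `audio_preview`, `artist`, `title`, and `album` onto each item. As a rough sketch, an input file with those fields could be produced as follows. The `id` field, the JSON-array layout, and the helper name `build_items` are assumptions made for illustration, so check them against the data format your Potato version expects.

```python
import json

def build_items(tracks):
    """Turn (id, artist, title, album, preview path) tuples into item dicts."""
    items = []
    for track_id, artist, title, album, preview in tracks:
        items.append({
            "id": track_id,            # assumed identifier field
            "audio_preview": preview,  # the key named by item_properties.audio_path
            "artist": artist,
            "title": title,
            "album": album,
        })
    return items

# Hypothetical catalogue entry used only to show the shape of one item.
tracks = [
    ("track_001", "The Beatles", "Here Comes the Sun", "Abbey Road",
     "/audio/preview_001.mp3"),
]
items = build_items(tracks)
serialized = json.dumps(items, ensure_ascii=False, indent=2)
print(serialized)
```

Redirecting the printed text into `data/tracks.json` would give the task one item per track, with the metadata fields available to the `{{artist}} - {{title}}` template.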

Nested Genre Taxonomy

yaml
annotation_schemes:
  # Top-level genre
  - annotation_type: radio
    name: genre
    description: "Primary genre"
    labels:
      - Rock
      - Electronic
      - Hip-Hop
      - Pop
      - Jazz
      - Classical
 
  # Subgenre (conditional)
  - annotation_type: radio
    name: subgenre
    description: "Subgenre"
    conditional:
      depends_on: genre
      options:
        Rock:
          - Classic Rock
          - Alternative
          - Indie Rock
          - Hard Rock
          - Punk
          - Grunge
        Electronic:
          - House
          - Techno
          - Trance
          - Drum & Bass
          - Ambient
          - Dubstep
        "Hip-Hop":
          - East Coast
          - West Coast
          - Trap
          - Conscious
          - Lo-fi
        Pop:
          - Synth Pop
          - Dance Pop
          - Indie Pop
          - K-Pop
          - Teen Pop
        Jazz:
          - Bebop
          - Smooth Jazz
          - Fusion
          - Big Band
          - Cool Jazz
        Classical:
          - Baroque
          - Romantic
          - Contemporary
          - Opera
          - Chamber
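
When the annotations are exported, it can be worth confirming that every subgenre really belongs to the chosen top-level genre. The sketch below mirrors part of the tree from the config above as a plain dictionary and checks a pair against it. The function name `is_valid_pair` is hypothetical, and the check runs on exported values rather than inside Potato.

```python
# A subset of the genre tree from the config, copied by hand for the check.
SUBGENRES = {
    "Rock": ["Classic Rock", "Alternative", "Indie Rock", "Hard Rock", "Punk", "Grunge"],
    "Electronic": ["House", "Techno", "Trance", "Drum & Bass", "Ambient", "Dubstep"],
    "Hip-Hop": ["East Coast", "West Coast", "Trap", "Conscious", "Lo-fi"],
    "Jazz": ["Bebop", "Smooth Jazz", "Fusion", "Big Band", "Cool Jazz"],
}

def is_valid_pair(genre, subgenre):
    """Return True when the subgenre is listed under the given top-level genre."""
    return subgenre in SUBGENRES.get(genre, [])

print(is_valid_pair("Jazz", "Bebop"))
print(is_valid_pair("Rock", "Bebop"))
```

An unknown top-level genre simply yields `False`, which keeps the check safe to run over records that predate a change to the taxonomy.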

Multi-Label Genre Tagging

Many tracks span more than one genre, so let annotators pick several:

yaml
annotation_schemes:
  - annotation_type: multiselect
    name: genres
    description: "Select ALL applicable genres (up to 3)"
    labels:
      - Rock
      - Pop
      - Electronic
      - Hip-Hop
      - R&B
      - Jazz
      - Classical
      - Country
      - Folk
      - Metal
      - Punk
      - Indie
      - World
      - Latin
    min_selections: 1
    max_selections: 3
 
  - annotation_type: radio
    name: primary_genre
    description: "Which is the PRIMARY genre?"
    dynamic_from: genres  # Only show selected genres as options
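
The two rules in this config (one to three genres, and a primary genre drawn from those selections) can also be checked after export. The following is a minimal sketch of such a check; `check_selection` is a hypothetical helper, and the default limits simply repeat `min_selections` and `max_selections` from the config.

```python
def check_selection(genres, primary, min_selections=1, max_selections=3):
    """Return a list of problems with one annotator's genre choices."""
    problems = []
    if not min_selections <= len(genres) <= max_selections:
        problems.append(
            f"expected {min_selections}-{max_selections} genres, got {len(genres)}"
        )
    if len(set(genres)) != len(genres):
        problems.append("duplicate genre in selection")
    if primary not in genres:
        problems.append(f"primary genre {primary!r} is not among the selected genres")
    return problems

print(check_selection(["Rock", "Pop"], "Rock"))
print(check_selection(["Rock", "Pop", "Jazz", "Folk"], "Metal"))
```

An empty list means the record satisfies both rules; anything else is a message that can be logged or shown to a reviewer.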

Complete Music Annotation Configuration

yaml
annotation_task_name: "Music Tagging and Classification"
 
data_files:
  - data/music_library.json
 
item_properties:
  audio_path: preview_url
  metadata_fields:
    - artist
    - title
    - album
    - year
    - duration
 
display:
  show_metadata: true
  metadata_layout: card
  fields:
    - label: "Artist"
      field: artist
    - label: "Track"
      field: title
    - label: "Album"
      field: album
    - label: "Year"
      field: year
 
annotation_schemes:
  # Audio playback configuration
  - annotation_type: audio_annotation
    audio_display: waveform
    waveform_color: "#8B5CF6"
    progress_color: "#A78BFA"
    cursor_color: "#F59E0B"
    height: 100
    show_duration: true
    show_current_time: true
    speed_control: true
    speed_options: [0.75, 1.0, 1.25, 1.5]
    volume_control: true
    default_volume: 0.7
    autoplay: false
    loop: true
 
  # Genre classification
  - annotation_type: multiselect
    name: genres
    description: "Select all applicable genres"
    labels:
      - Rock
      - Pop
      - Electronic/Dance
      - Hip-Hop/Rap
      - R&B/Soul
      - Jazz
      - Classical
      - Country
      - Folk/Acoustic
      - Metal
      - Punk
      - Indie
      - World/International
      - Latin
      - Blues
      - Reggae
    min_selections: 1
    max_selections: 3
 
  # Mood/Energy
  - annotation_type: multiselect
    name: mood
    description: "Select the mood(s) of this track"
    labels:
      - Happy/Uplifting
      - Sad/Melancholic
      - Energetic
      - Calm/Relaxing
      - Aggressive
      - Romantic
      - Dark/Moody
      - Nostalgic
      - Motivational
    min_selections: 1
 
  # Energy level
  - annotation_type: likert
    name: energy
    description: "Energy level"
    size: 5
    min_label: "Very low energy"
    max_label: "Very high energy"
 
  # Danceability
  - annotation_type: likert
    name: danceability
    description: "How danceable is this track?"
    size: 5
    min_label: "Not danceable"
    max_label: "Very danceable"
 
  # Vocal content
  - annotation_type: radio
    name: vocals
    description: "Vocal content"
    labels:
      - Instrumental only
      - Mostly instrumental
      - Balanced
      - Mostly vocals
      - Vocals dominant
 
  # Era/Style
  - annotation_type: radio
    name: era
    description: "Musical era/style"
    labels:
      - Pre-1970s (Classic)
      - 1970s
      - 1980s
      - 1990s
      - 2000s
      - 2010s
      - 2020s/Current
 
  # Quality flags
  - annotation_type: multiselect
    name: flags
    description: "Special flags (if applicable)"
    labels:
      - Explicit content
      - Live recording
      - Remix/Cover
      - Holiday/Seasonal
      - Soundtrack
      - Spoken word sections
    required: false
 
  # Confidence
  - annotation_type: likert
    name: confidence
    description: "Confidence in your genre classification"
    size: 5
    min_label: "Uncertain"
    max_label: "Very confident"
 
annotation_guidelines:
  title: "Music Tagging Guidelines"
  content: |
    ## Genre Selection
    - Choose 1-3 genres that best describe the track
    - Select the most specific applicable genre
    - When unsure, lean toward broader categories
 
    ## Mood Assessment
    - Consider the overall emotional tone
    - A track can have multiple moods
    - Think about how the music makes you feel
 
    ## Energy & Danceability
    - Energy: tempo, intensity, power
    - Danceability: rhythm, beat clarity, groove
 
    ## Listening Tips
    - Listen to at least 30 seconds
    - Pay attention to drops, changes, and transitions
    - Consider instrumentation and production style
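
With this many schemes in one task, a quick sanity check on collected records can catch out-of-range values early. The sketch below assumes that Likert answers are stored as integers from 1 to 5 and multiselect answers as lists, keyed by the scheme `name`; that storage format and the helper `find_issues` are assumptions, not something the config itself guarantees.

```python
LIKERT_FIELDS = ("energy", "danceability", "confidence")  # each has size: 5
SELECTION_LIMITS = {"genres": (1, 3), "mood": (1, None)}   # (min, max) per scheme

def find_issues(annotations):
    """Return readable problems for one record; an empty list means it passes."""
    issues = []
    for field in LIKERT_FIELDS:
        value = annotations.get(field)
        if not isinstance(value, int) or not 1 <= value <= 5:
            issues.append(f"{field}: expected an integer from 1 to 5, got {value!r}")
    for field, (low, high) in SELECTION_LIMITS.items():
        count = len(annotations.get(field, []))
        if count < low or (high is not None and count > high):
            issues.append(f"{field}: {count} selections is outside the allowed range")
    return issues

# Hypothetical record using scheme names from the config above.
sample = {
    "genres": ["Rock", "Pop"],
    "mood": ["Happy/Uplifting", "Nostalgic"],
    "energy": 3,
    "danceability": 2,
    "confidence": 5,
}
print(find_issues(sample))
```

The optional `flags` scheme is deliberately left out of the checks because the config marks it `required: false`.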
 

Output Format

json
{
  "id": "track_001",
  "audio_url": "/audio/preview_001.mp3",
  "metadata": {
    "artist": "The Beatles",
    "title": "Here Comes the Sun",
    "album": "Abbey Road",
    "year": 1969
  },
  "annotations": {
    "genres": ["Rock", "Pop"],
    "mood": ["Happy/Uplifting", "Nostalgic"],
    "energy": 3,
    "danceability": 2,
    "vocals": "Balanced",
    "era": "Pre-1970s (Classic)",
    "flags": [],
    "confidence": 5
  }
}
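
Once several annotators have labeled the same track, records in this shape can be combined. The sketch below keeps genres chosen by at least two annotators and averages the energy rating. The three records are made-up illustration data, and `aggregate` and its `min_votes` threshold are hypothetical choices rather than anything Potato provides.

```python
from collections import Counter
from statistics import mean

def aggregate(records, min_votes=2):
    """Combine several annotators' records for one track."""
    # Count each genre once per annotator, even if it somehow appears twice.
    votes = Counter(g for r in records for g in set(r["annotations"]["genres"]))
    return {
        "genres": sorted(g for g, n in votes.items() if n >= min_votes),
        "energy": mean(r["annotations"]["energy"] for r in records),
    }

# Made-up records from three annotators for the same track.
records = [
    {"id": "track_001", "annotations": {"genres": ["Rock", "Pop"], "energy": 3}},
    {"id": "track_001", "annotations": {"genres": ["Rock", "Folk/Acoustic"], "energy": 2}},
    {"id": "track_001", "annotations": {"genres": ["Pop", "Rock"], "energy": 4}},
]
result = aggregate(records)
print(result)
```

A vote threshold is only one way to resolve disagreement; for a small annotator pool it may be preferable to keep every label and use the `confidence` rating as a weight.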

Music Annotation Tips

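  1. Good headphones matter. Laptop speakers will make you miss genre cues.
  2. Keep the volume consistent so that constant adjusting does not wear you out.
  3. Do not stop at the intro; listen to enough of each preview to catch changes.
  4. Before you start, refresh your understanding of what each genre actually means.
  5. Take breaks. Tired ears make worse judgments.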

Where to Go Next

For the full set of audio annotation options (waveform display, playback controls, segment selection), see the full audio documentation at /docs/features/audio-annotation.