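Music Genre Classification Annotation
How to build a music annotation task in Potato with waveform visualization, 30-second preview playback, a hierarchical genre label tree, and pairwise preference comparison.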
Potato Team
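Genre labels are the foundation that recommendation systems, auto-generated playlists, and music discovery features run on. This post shows how to build a genre tagging interface in Potato using audio previews and a nested genre tree.
Basic Genre Classification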
yaml
annotation_task_name: "Music Genre Classification"
data_files:
  - data/tracks.json
item_properties:
  audio_path: audio_preview
  metadata_fields: [artist, title, album]
# Show track metadata
display:
  show_metadata: true
  metadata_template: "{{artist}} - {{title}}"
annotation_schemes:
  - annotation_type: audio_annotation
    audio_display: waveform
    waveform_color: "#EC4899"
    progress_color: "#F472B6"
    speed_control: true
    show_duration: true
    name: primary_genre
    description: "Select the primary genre"
    labels:
      - Rock
      - Pop
      - Hip-Hop
      - Electronic
      - Jazz
      - Classical
      - R&B
      - Country
      - Metal
      - Folk
Nested Genre Taxonomy
yaml
annotation_schemes:
  # Top-level genre
  - annotation_type: radio
    name: genre
    description: "Primary genre"
    labels:
      - Rock
      - Electronic
      - Hip-Hop
      - Pop
      - Jazz
      - Classical
  # Subgenre (conditional)
  - annotation_type: radio
    name: subgenre
    description: "Subgenre"
    conditional:
      depends_on: genre
      options:
        Rock:
          - Classic Rock
          - Alternative
          - Indie Rock
          - Hard Rock
          - Punk
          - Grunge
        Electronic:
          - House
          - Techno
          - Trance
          - Drum & Bass
          - Ambient
          - Dubstep
        "Hip-Hop":
          - East Coast
          - West Coast
          - Trap
          - Conscious
          - Lo-fi
        Pop:
          - Synth Pop
          - Dance Pop
          - Indie Pop
          - K-Pop
          - Teen Pop
        Jazz:
          - Bebop
          - Smooth Jazz
          - Fusion
          - Big Band
          - Cool Jazz
        Classical:
          - Baroque
          - Romantic
          - Contemporary
          - Opera
          - Chamber
Multi-Label Genre Tagging
Many tracks span more than one genre, so let annotators select several:
yaml
annotation_schemes:
  - annotation_type: multiselect
    name: genres
    description: "Select ALL applicable genres (up to 3)"
    labels:
      - Rock
      - Pop
      - Electronic
      - Hip-Hop
      - R&B
      - Jazz
      - Classical
      - Country
      - Folk
      - Metal
      - Punk
      - Indie
      - World
      - Latin
    min_selections: 1
    max_selections: 3
  - annotation_type: radio
    name: primary_genre
    description: "Which is the PRIMARY genre?"
    dynamic_from: genres # Only show selected genres as options
Complete Music Annotation Configuration
yaml
annotation_task_name: "Music Tagging and Classification"
data_files:
  - data/music_library.json
item_properties:
  audio_path: preview_url
  metadata_fields:
    - artist
    - title
    - album
    - year
    - duration
display:
  show_metadata: true
  metadata_layout: card
  fields:
    - label: "Artist"
      field: artist
    - label: "Track"
      field: title
    - label: "Album"
      field: album
    - label: "Year"
      field: year
annotation_schemes:
  # Audio playback configuration
  - annotation_type: audio_annotation
    audio_display: waveform
    waveform_color: "#8B5CF6"
    progress_color: "#A78BFA"
    cursor_color: "#F59E0B"
    height: 100
    show_duration: true
    show_current_time: true
    speed_control: true
    speed_options: [0.75, 1.0, 1.25, 1.5]
    volume_control: true
    default_volume: 0.7
    autoplay: false
    loop: true
  # Genre classification
  - annotation_type: multiselect
    name: genres
    description: "Select all applicable genres"
    labels:
      - Rock
      - Pop
      - Electronic/Dance
      - Hip-Hop/Rap
      - R&B/Soul
      - Jazz
      - Classical
      - Country
      - Folk/Acoustic
      - Metal
      - Punk
      - Indie
      - World/International
      - Latin
      - Blues
      - Reggae
    min_selections: 1
    max_selections: 3
  # Mood/Energy
  - annotation_type: multiselect
    name: mood
    description: "Select the mood(s) of this track"
    labels:
      - Happy/Uplifting
      - Sad/Melancholic
      - Energetic
      - Calm/Relaxing
      - Aggressive
      - Romantic
      - Dark/Moody
      - Nostalgic
      - Motivational
    min_selections: 1
  # Energy level
  - annotation_type: likert
    name: energy
    description: "Energy level"
    size: 5
    min_label: "Very low energy"
    max_label: "Very high energy"
  # Danceability
  - annotation_type: likert
    name: danceability
    description: "How danceable is this track?"
    size: 5
    min_label: "Not danceable"
    max_label: "Very danceable"
  # Vocal content
  - annotation_type: radio
    name: vocals
    description: "Vocal content"
    labels:
      - Instrumental only
      - Mostly instrumental
      - Balanced
      - Mostly vocals
      - Vocals dominant
  # Era/Style
  - annotation_type: radio
    name: era
    description: "Musical era/style"
    labels:
      - Pre-1970s (Classic)
      - 1970s
      - 1980s
      - 1990s
      - 2000s
      - 2010s
      - 2020s/Current
  # Quality flags
  - annotation_type: multiselect
    name: flags
    description: "Special flags (if applicable)"
    labels:
      - Explicit content
      - Live recording
      - Remix/Cover
      - Holiday/Seasonal
      - Soundtrack
      - Spoken word sections
    required: false
  # Confidence
  - annotation_type: likert
    name: confidence
    description: "Confidence in your genre classification"
    size: 5
    min_label: "Uncertain"
    max_label: "Very confident"
annotation_guidelines:
  title: "Music Tagging Guidelines"
  content: |
    ## Genre Selection
    - Choose 1-3 genres that best describe the track
    - Select the most specific applicable genre
    - When unsure, lean toward broader categories
    ## Mood Assessment
    - Consider the overall emotional tone
    - A track can have multiple moods
    - Think about how the music makes you feel
    ## Energy & Danceability
    - Energy: tempo, intensity, power
    - Danceability: rhythm, beat clarity, groove
    ## Listening Tips
    - Listen to at least 30 seconds
    - Pay attention to drops, changes, and transitions
    - Consider instrumentation and production style
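A configuration this long is easy to get subtly wrong, for example by duplicating a scheme name or setting a selection limit that the label list cannot satisfy. The following is a minimal Python sketch of a sanity check you could run before launching a task. It is not part of Potato, and Potato may perform its own validation; the function name `check_schemes` is hypothetical, and it assumes the `annotation_schemes` list has already been loaded into plain dictionaries (for instance with PyYAML's `yaml.safe_load`). Only the keys that appear in the configuration above are inspected.

```python
def check_schemes(schemes):
    """Return a list of human-readable problems found in annotation_schemes.

    Hypothetical helper, not a Potato API. `schemes` is the list of dicts
    you would get after loading the YAML configuration.
    """
    problems = []
    seen_names = set()
    for index, scheme in enumerate(schemes):
        name = scheme.get("name")
        label = name or f"scheme #{index}"

        # Scheme names become output keys, so they should be unique.
        if name is not None:
            if name in seen_names:
                problems.append(f"{label}: duplicate name")
            seen_names.add(name)

        # Repeated labels usually indicate a copy-paste mistake.
        labels = scheme.get("labels")
        if labels is not None and len(set(labels)) != len(labels):
            problems.append(f"{label}: duplicate labels")

        # Selection limits must be consistent with each other and the labels.
        lo = scheme.get("min_selections")
        hi = scheme.get("max_selections")
        if lo is not None and hi is not None and lo > hi:
            problems.append(f"{label}: min_selections exceeds max_selections")
        if hi is not None and labels is not None and hi > len(labels):
            problems.append(f"{label}: max_selections exceeds number of labels")
    return problems


# Two schemes from the configuration above, with the label list abbreviated.
schemes = [
    {"annotation_type": "multiselect", "name": "genres",
     "labels": ["Rock", "Pop", "Jazz"],
     "min_selections": 1, "max_selections": 3},
    {"annotation_type": "likert", "name": "energy", "size": 5},
]
print(check_schemes(schemes))
```

An empty list means none of these particular checks failed; it says nothing about whether Potato accepts every key.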
Output Format
json
{
  "id": "track_001",
  "audio_url": "/audio/preview_001.mp3",
  "metadata": {
    "artist": "The Beatles",
    "title": "Here Comes the Sun",
    "album": "Abbey Road",
    "year": 1969
  },
  "annotations": {
    "genres": ["Rock", "Pop"],
    "mood": ["Happy/Uplifting", "Nostalgic"],
    "energy": 3,
    "danceability": 2,
    "vocals": "Balanced",
    "era": "Pre-1970s (Classic)",
    "flags": [],
    "confidence": 5
  }
}
Music Annotation Tips
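- Good headphones matter; laptop speakers will make you miss genre cues.
- Keep the volume consistent so you don't tire yourself out by constantly adjusting it.
- Don't listen only to the intro; listen to enough of each preview to catch changes.
- Before you start, refresh your memory of what each genre actually means.
- Take breaks. Tired ears make worse judgments.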
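Where to Go Next
- Add audio event detection to identify instruments
- Set up crowdsourcing to bring a wider range of ears to your data
- Learn about agreement metrics for music
For the full set of audio annotation options (waveform display, playback controls, segment selection), see the full audio documentation at /docs/features/audio-annotation.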
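As a follow-up to the Output Format section, here is a minimal Python sketch that checks one exported record against the limits set in the complete configuration: one to three genres, at least one mood, and 5-point scales for energy, danceability, and confidence. The function name `validate_record` is hypothetical, and the sketch assumes Likert answers are stored as integers from 1 to 5, which matches the sample record but is not confirmed for every Potato export.

```python
import json


def validate_record(record):
    """Return a list of problems for one exported record (empty if none).

    Hypothetical helper, not a Potato API. The limits mirror the
    configuration in this post: 1-3 genres, at least 1 mood, 5-point scales.
    """
    ann = record.get("annotations", {})
    problems = []

    genres = ann.get("genres", [])
    if not 1 <= len(genres) <= 3:
        problems.append("genres: expected 1 to 3 selections")

    if len(ann.get("mood", [])) < 1:
        problems.append("mood: expected at least 1 selection")

    # Assumption: Likert answers are integers in the range 1..5.
    for scale in ("energy", "danceability", "confidence"):
        value = ann.get(scale)
        if not isinstance(value, int) or not 1 <= value <= 5:
            problems.append(f"{scale}: expected an integer from 1 to 5")
    return problems


# The annotations from the sample record in the Output Format section.
record = json.loads("""
{
  "id": "track_001",
  "annotations": {
    "genres": ["Rock", "Pop"],
    "mood": ["Happy/Uplifting", "Nostalgic"],
    "energy": 3,
    "danceability": 2,
    "confidence": 5
  }
}
""")
print(validate_record(record))
```

Running a check like this over every exported record is a cheap way to catch skipped questions or out-of-range values before computing agreement or training on the labels.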