intermediate, audio
Speech Emotion Recognition
Classify emotional states from speech audio including happiness, sadness, anger, fear, and more.
Configuration file: config.yaml
# Speech Emotion Recognition Configuration
# Classify emotional states from speech audio
task_dir: "."
annotation_task_name: "Speech Emotion Recognition"
data_files:
  - "data/speech_clips.json"
item_properties:
  id_key: "id"
  text_key: "transcript"
  audio_key: "audio_url"
  text_display_key: "transcript"
user_config:
  allow_all_users: true
annotation_schemes:
  - annotation_type: "radio"
    name: "primary_emotion"
    description: "What is the primary emotion expressed?"
    labels:
      - name: "Neutral"
        tooltip: "No strong emotion detected"
        key_value: "n"
      - name: "Happy"
        tooltip: "Joy, excitement, contentment"
        key_value: "h"
      - name: "Sad"
        tooltip: "Sorrow, disappointment, grief"
        key_value: "s"
      - name: "Angry"
        tooltip: "Frustration, irritation, rage"
        key_value: "a"
      - name: "Fearful"
        tooltip: "Anxiety, worry, terror"
        key_value: "f"
      - name: "Disgusted"
        tooltip: "Revulsion, distaste"
        key_value: "d"
      - name: "Surprised"
        tooltip: "Shock, amazement"
        key_value: "u"
  - annotation_type: "likert"
    name: "emotion_intensity"
    description: "How intense is the emotion?"
    size: 5
    min_label: "Very mild"
    max_label: "Very intense"
  - annotation_type: "likert"
    name: "confidence"
    description: "How confident are you in your judgment?"
    size: 5
    min_label: "Not confident"
    max_label: "Very confident"
  - annotation_type: "multiselect"
    name: "secondary_emotions"
    description: "Any secondary emotions present?"
    labels:
      - name: "Contempt"
      - name: "Boredom"
      - name: "Interest"
      - name: "Confusion"
      - name: "Embarrassment"
      - name: "Pride"
  - annotation_type: "radio"
    name: "valence"
    description: "Overall emotional valence"
    labels:
      - name: "Positive"
      - name: "Neutral"
      - name: "Negative"
audio_display:
  show_waveform: true
  playback_controls: true
  allow_speed_control: true
output: "annotation_output/"
output_annotation_dir: "annotation_output/"
output_annotation_format: "json"
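The `primary_emotion` scheme gives each label a `key_value`, which this sketch assumes acts as a per-label keyboard shortcut. Under that assumption, two labels sharing a key would be ambiguous, which is presumably why "Surprised" uses "u" rather than "s". The helper below, `find_duplicate_shortcuts`, is a hypothetical stand-alone check and not part of Potato; the label list is copied by hand from the configuration above.

```python
from collections import defaultdict

# Labels and shortcuts copied from the primary_emotion scheme in config.yaml.
PRIMARY_EMOTION_LABELS = [
    {"name": "Neutral", "key_value": "n"},
    {"name": "Happy", "key_value": "h"},
    {"name": "Sad", "key_value": "s"},
    {"name": "Angry", "key_value": "a"},
    {"name": "Fearful", "key_value": "f"},
    {"name": "Disgusted", "key_value": "d"},
    {"name": "Surprised", "key_value": "u"},
]


def find_duplicate_shortcuts(labels):
    """Return {shortcut: [label names]} for shortcuts used by more than one label.

    Hypothetical helper: assumes key_value is a keyboard shortcut and
    compares shortcuts case-insensitively.
    """
    names_by_key = defaultdict(list)
    for label in labels:
        key = label.get("key_value")
        if key is not None:  # labels without a shortcut are skipped
            names_by_key[key.lower()].append(label["name"])
    return {key: names for key, names in names_by_key.items() if len(names) > 1}


print(find_duplicate_shortcuts(PRIMARY_EMOTION_LABELS))  # prints {}
```

An empty result means every shortcut in the scheme is unique. Running the same check after adding a label is a cheap way to catch a clash before annotators hit it.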
Sample data: sample-data.json
[
  {
    "id": "emo_001",
    "audio_url": "https://example.com/audio/emotion_sample1.wav",
    "transcript": "I can't believe this is happening! This is the best day ever!",
    "duration": 3.2
  },
  {
    "id": "emo_002",
    "audio_url": "https://example.com/audio/emotion_sample2.wav",
    "transcript": "I'm so disappointed. I really thought things would be different.",
    "duration": 4.1
  }
]
// ... and 1 more item
Get this design
View on GitHub
Clone or download from the repository
Quick start:
git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/audio/emotion-recognition
potato start config.yaml
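Before starting the server, it can help to confirm that every data item carries the fields named under `item_properties` (`id`, `transcript`, `audio_url`). The sketch below is a hypothetical pre-flight check, not something Potato provides. `find_incomplete_items` is an illustrative name, and the two inline items are the ones shown in the sample data above.

```python
import json

# Field names copied from item_properties in config.yaml.
ITEM_PROPERTIES = {
    "id_key": "id",
    "text_key": "transcript",
    "audio_key": "audio_url",
}


def find_incomplete_items(items, item_properties=ITEM_PROPERTIES):
    """Return {item id (or list index): [missing fields]} for incomplete items.

    Hypothetical helper: a field counts as missing when it is absent or empty.
    """
    required = list(item_properties.values())
    problems = {}
    for index, item in enumerate(items):
        missing = [field for field in required if not item.get(field)]
        if missing:
            # Fall back to the list position when the item has no id.
            problems[item.get("id", index)] = missing
    return problems


sample = json.loads("""
[
  {
    "id": "emo_001",
    "audio_url": "https://example.com/audio/emotion_sample1.wav",
    "transcript": "I can't believe this is happening! This is the best day ever!",
    "duration": 3.2
  },
  {
    "id": "emo_002",
    "audio_url": "https://example.com/audio/emotion_sample2.wav",
    "transcript": "I'm so disappointed. I really thought things would be different.",
    "duration": 4.1
  }
]
""")

print(find_incomplete_items(sample))  # prints {}
```

In practice the list would be loaded from the file named in `data_files` rather than from an inline string. The `duration` field is ignored here because the configuration does not reference it.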
Details
Annotation types
radio, likert
Domain
speech, affective-computing
Use cases
emotion-detection, sentiment
Tags
audio, emotion, speech, affective, sentiment
Found a problem or want to improve this design?
Open an issue
Related designs
Acoustic Scene Classification
Classify audio recordings by acoustic environment following the TUT/DCASE dataset format.
radio, likert
Audio Transcription Review
Review and correct automatic speech recognition transcriptions with waveform visualization.
likert, multiselect
Audio-Visual Sentiment Analysis
Rate sentiment in speech segments following CMU-MOSI and CMU-MOSEI multimodal annotation protocols.
likert, radio