intermediate, audio
Audio-Visual Sentiment Analysis
Rate sentiment in speech segments following CMU-MOSI and CMU-MOSEI multimodal annotation protocols.
audio annotation
Configuration File: config.yaml
task_name: "Audio-Visual Sentiment Analysis"

# Server configuration
server:
  port: 8000

# Audio settings
audio:
  enabled: true
  display: waveform
  waveform_color: "#6366F1"
  progress_color: "#818CF8"
  speed_control: true

# Data configuration
data_files:
  - path: data/speech_segments.json
    audio_field: audio_file
    text_field: transcript

# Annotation schemes
annotation_schemes:
  # Sentiment polarity (7-point scale, CMU-MOSI standard)
  - annotation_type: likert
    name: sentiment
    description: "Rate the overall sentiment expressed"
    size: 7
    labels:
      - "-3: Strongly negative"
      - "-2: Negative"
      - "-1: Weakly negative"
      - "0: Neutral"
      - "+1: Weakly positive"
      - "+2: Positive"
      - "+3: Strongly positive"

  # Sentiment intensity
  - annotation_type: likert
    name: intensity
    description: "How intensely is the sentiment expressed?"
    size: 5
    min_label: "Very subtle"
    max_label: "Very intense"

  # Primary sentiment source
  - annotation_type: radio
    name: sentiment_source
    description: "What conveys the sentiment most strongly?"
    labels:
      - Tone of voice
      - Word choice (from transcript)
      - Both equally
      - Sentiment unclear

  # Subjectivity
  - annotation_type: likert
    name: subjectivity
    description: "How subjective/opinionated is this statement?"
    size: 5
    min_label: "Purely factual"
    max_label: "Highly opinionated"

  # Emotion detected
  - annotation_type: radio
    name: emotion
    description: "What emotion (if any) accompanies the sentiment?"
    labels:
      - Happiness/joy
      - Anger/frustration
      - Sadness/disappointment
      - Fear/anxiety
      - Surprise
      - Disgust
      - No clear emotion
      - Mixed emotions

  # Speaker certainty
  - annotation_type: likert
    name: speaker_certainty
    description: "How certain does the speaker sound about their opinion?"
    size: 5
    min_label: "Very uncertain"
    max_label: "Very certain"

  # Sarcasm detection
  - annotation_type: radio
    name: sarcasm
    description: "Does the speaker appear sarcastic?"
    labels:
      - No sarcasm detected
      - Possibly sarcastic
      - Clearly sarcastic

  # Annotation confidence
  - annotation_type: likert
    name: confidence
    description: "How confident are you in your sentiment rating?"
    size: 5
    min_label: "Not confident"
    max_label: "Very confident"

# User settings
allow_all_users: true
instances_per_annotator: 150

# Output
output:
  path: annotations/
  format: json
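The seven `sentiment` labels carry the CMU-MOSI score range from -3 to +3 in their prefixes. As a minimal sketch, the numeric score can be recovered from a selected label and averaged across annotators. This assumes the exported annotation stores the label text as written in the config; the names `label_to_score` and `mean_sentiment` are hypothetical helpers, not part of Potato, and the actual layout of the files under `annotations/` should be checked before relying on it.

```python
from statistics import mean

# The seven sentiment labels from config.yaml, in order.
SENTIMENT_LABELS = [
    "-3: Strongly negative",
    "-2: Negative",
    "-1: Weakly negative",
    "0: Neutral",
    "+1: Weakly positive",
    "+2: Positive",
    "+3: Strongly positive",
]


def label_to_score(label: str) -> int:
    """Parse the numeric prefix of a label such as "+2: Positive"."""
    prefix, _, _ = label.partition(":")
    return int(prefix)  # int() accepts a leading "+" or "-"


def mean_sentiment(labels: list[str]) -> float:
    """Average the scores that several annotators gave one segment."""
    return mean(label_to_score(label) for label in labels)


if __name__ == "__main__":
    # Hypothetical ratings from three annotators for one segment.
    ratings = ["+1: Weakly positive", "+2: Positive", "+3: Strongly positive"]
    print(mean_sentiment(ratings))
```

Keeping the score in the label prefix means the numeric value can be derived from the label text alone, without a separate lookup table.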
Get This Design
This design is available in our showcase. Copy the configuration above to get started.
Quick start:
# Create your project folder
mkdir audio-sentiment-analysis
cd audio-sentiment-analysis

# Copy config.yaml from above
potato start config.yaml
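The config points `data_files` at `data/speech_segments.json`, so that file has to exist before the server starts. The sketch below writes a small sample file using the `audio_file` and `transcript` field names from the config. The `id` field, the one-JSON-object-per-line layout, the audio paths, and the transcripts are all assumptions made for illustration; check the Potato documentation for the exact input format it expects.

```python
import json
from pathlib import Path

# Hypothetical sample segments; replace with your own audio paths and transcripts.
SAMPLE_SEGMENTS = [
    {"id": "seg_001", "audio_file": "audio/seg_001.wav",
     "transcript": "I really enjoyed this movie."},
    {"id": "seg_002", "audio_file": "audio/seg_002.wav",
     "transcript": "The ending was kind of disappointing."},
]


def write_segments(path: Path, segments: list[dict]) -> int:
    """Write one JSON object per line and return the number of lines written."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as handle:
        for segment in segments:
            handle.write(json.dumps(segment) + "\n")
    return len(segments)


if __name__ == "__main__":
    count = write_segments(Path("data/speech_segments.json"), SAMPLE_SEGMENTS)
    print(f"Wrote {count} segments")
```

Run it from inside the project folder so the relative `data/` path matches the one in `config.yaml`.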
Details
Annotation Types
likert, radio
Domain
Audio, Speech
Use Cases
sentiment analysis, multimodal analysis, opinion mining
Tags
audio, sentiment, multimodal, opinion, cmu-mosi
Related Designs
Speech Emotion Recognition
Classify emotional content in speech following IEMOCAP and CREMA-D annotation schemes.
radio, likert
Speech Intelligibility Rating
Rate speech intelligibility for pathological speech following TORGO database annotation protocols.
likert, radio
Speech Quality MOS Rating
Rate speech quality using Mean Opinion Score following ITU-T P.800 and Blizzard Challenge protocols.
likert, radio