intermediate, audio

Audio-Visual Sentiment Analysis

Rate sentiment in speech segments following CMU-MOSI and CMU-MOSEI multimodal annotation protocols.

Configuration File: config.yaml

task_name: "Audio-Visual Sentiment Analysis"

# Server configuration
server:
  port: 8000

# Audio settings
audio:
  enabled: true
  display: waveform
  waveform_color: "#6366F1"
  progress_color: "#818CF8"
  speed_control: true

# Data configuration
data_files:
  - path: data/speech_segments.json
    audio_field: audio_file
    text_field: transcript

# Annotation schemes
annotation_schemes:
  # Sentiment polarity (7-point scale, CMU-MOSI standard)
  - annotation_type: likert
    name: sentiment
    description: "Rate the overall sentiment expressed"
    size: 7
    labels:
      - "-3: Strongly negative"
      - "-2: Negative"
      - "-1: Weakly negative"
      - "0: Neutral"
      - "+1: Weakly positive"
      - "+2: Positive"
      - "+3: Strongly positive"

  # Sentiment intensity
  - annotation_type: likert
    name: intensity
    description: "How intensely is the sentiment expressed?"
    size: 5
    min_label: "Very subtle"
    max_label: "Very intense"

  # Primary sentiment source
  - annotation_type: radio
    name: sentiment_source
    description: "What conveys the sentiment most strongly?"
    labels:
      - Tone of voice
      - Word choice (from transcript)
      - Both equally
      - Sentiment unclear

  # Subjectivity
  - annotation_type: likert
    name: subjectivity
    description: "How subjective/opinionated is this statement?"
    size: 5
    min_label: "Purely factual"
    max_label: "Highly opinionated"

  # Emotion detected
  - annotation_type: radio
    name: emotion
    description: "What emotion (if any) accompanies the sentiment?"
    labels:
      - Happiness/joy
      - Anger/frustration
      - Sadness/disappointment
      - Fear/anxiety
      - Surprise
      - Disgust
      - No clear emotion
      - Mixed emotions

  # Speaker certainty
  - annotation_type: likert
    name: speaker_certainty
    description: "How certain does the speaker sound about their opinion?"
    size: 5
    min_label: "Very uncertain"
    max_label: "Very certain"

  # Sarcasm detection
  - annotation_type: radio
    name: sarcasm
    description: "Does the speaker appear sarcastic?"
    labels:
      - No sarcasm detected
      - Possibly sarcastic
      - Clearly sarcastic

  # Annotation confidence
  - annotation_type: likert
    name: confidence
    description: "How confident are you in your sentiment rating?"
    size: 5
    min_label: "Not confident"
    max_label: "Very confident"

# User settings
allow_all_users: true
instances_per_annotator: 150

# Output
output:
  path: annotations/
  format: json
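
The sentiment scheme above lists its seven labels in order from -3 to +3. A minimal sketch of turning a recorded scale position back into that signed score, and averaging it across annotators, might look like the following. It assumes the position is stored as an integer from 1 to 7 in label order; the actual layout of the output files is not shown here, and the helper names `likert_to_mosi` and `mean_sentiment` are hypothetical.

```python
from statistics import mean

# Assumption: a 7-point likert answer is recorded as its position, 1..7,
# in the same order as the labels in config.yaml (-3 first, +3 last).
SCALE_SIZE = 7
MIDPOINT = 4  # position of the "0: Neutral" label


def likert_to_mosi(position: int) -> int:
    """Map a 1..7 scale position to the signed -3..+3 sentiment score."""
    if not 1 <= position <= SCALE_SIZE:
        raise ValueError(f"position must be in 1..{SCALE_SIZE}, got {position}")
    return position - MIDPOINT


def mean_sentiment(positions):
    """Average the signed scores from several annotators for one segment."""
    return mean(likert_to_mosi(p) for p in positions)
```

For example, positions 5, 6, and 7 from three annotators map to +1, +2, and +3. Averaging is only one way to combine ratings; a median or majority vote may suit some projects better.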

Get This Design

This design is available in our showcase. Copy the configuration above to get started.

Quick start:

# Create your project folder
mkdir audio-sentiment-analysis
cd audio-sentiment-analysis
# Copy config.yaml from above
potato start config.yaml
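
The configuration reads data/speech_segments.json and looks up the `audio_file` and `transcript` fields in each record, so that file needs to exist before the server starts. The sketch below shows one way to generate it. Writing the file as a JSON array and including an `id` key per record are both assumptions; check the data format your Potato version expects. The function name `write_segments` and the `segment_0001` ID pattern are hypothetical.

```python
import json
from pathlib import Path


def write_segments(segments, out_path="data/speech_segments.json"):
    """Write (audio_path, transcript) pairs as annotation records.

    The field names audio_file and transcript match audio_field and
    text_field in config.yaml. The id key and the JSON-array layout
    are assumptions, not confirmed by the configuration above.
    """
    records = [
        {
            "id": f"segment_{index:04d}",
            "audio_file": audio_path,
            "transcript": transcript,
        }
        for index, (audio_path, transcript) in enumerate(segments, start=1)
    ]
    path = Path(out_path)
    # Create the data/ folder if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(records, indent=2), encoding="utf-8")
    return records
```

Calling `write_segments([("audio/clip_01.wav", "I really enjoyed this movie.")])` from the project folder would create data/speech_segments.json with a single record. The clip path and transcript here are placeholders.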

Details

Annotation Types

likert, radio

Domain

Audio, Speech

Use Cases

sentiment analysis, multimodal analysis, opinion mining

Tags

audio, sentiment, multimodal, opinion, cmu-mosi