intermediate, audio

Speech Intelligibility Rating

Rate the intelligibility of pathological speech, following the annotation protocols of the TORGO database.

audio annotation

Configuration File: config.yaml

task_name: "Speech Intelligibility Rating"

# Server configuration
server:
  port: 8000

# Audio settings
audio:
  enabled: true
  display: waveform
  waveform_color: "#14B8A6"
  progress_color: "#2DD4BF"
  speed_control: true
  speed_options: [0.5, 0.75, 1.0, 1.25]

# Data configuration
data_files:
  - path: data/speech_samples.json
    audio_field: audio_file
    text_field: target_text  # What was supposed to be said

# Annotation schemes
annotation_schemes:
  # Transcription (orthographic)
  - annotation_type: text
    name: transcription
    description: "Write exactly what you hear (orthographic transcription)"
    textarea: false
    placeholder: "Type what you hear..."

  # Overall intelligibility
  - annotation_type: likert
    name: intelligibility
    description: "Overall speech intelligibility"
    size: 5
    labels:
      - "1: Unintelligible (cannot understand)"
      - "2: Mostly unintelligible"
      - "3: Partially intelligible"
      - "4: Mostly intelligible"
      - "5: Fully intelligible (clear)"

  # Severity rating (clinical scale)
  - annotation_type: radio
    name: severity
    description: "Speech disorder severity (if applicable)"
    labels:
      - Normal (no apparent disorder)
      - Mild (noticeable but easily understood)
      - Moderate (requires effort to understand)
      - Severe (very difficult to understand)
      - Profound (essentially unintelligible)

  # Articulation issues
  - annotation_type: multiselect
    name: articulation_issues
    description: "What articulation issues are present? (Select all)"
    labels:
      - Slurred consonants
      - Vowel distortions
      - Sound substitutions
      - Sound omissions
      - Hypernasality
      - Breathy voice
      - Strained voice
      - Monopitch (flat intonation)
      - Slow rate
      - Fast/rushed rate
      - Irregular rhythm
      - None apparent

  # Speech rate
  - annotation_type: radio
    name: speech_rate
    description: "How would you characterize the speech rate?"
    labels:
      - Much too slow
      - Somewhat slow
      - Normal rate
      - Somewhat fast
      - Much too fast/rushed

  # Effort to understand
  - annotation_type: likert
    name: listener_effort
    description: "How much effort was required to understand?"
    size: 5
    min_label: "No effort"
    max_label: "Extreme effort"

  # Speaker consistency
  - annotation_type: radio
    name: consistency
    description: "Was intelligibility consistent throughout?"
    labels:
      - Yes, consistent throughout
      - Variable (some parts clearer than others)
      - Progressively worse
      - Progressively better

  # Audio quality impact
  - annotation_type: radio
    name: quality_impact
    description: "Did recording quality affect your rating?"
    labels:
      - No (good quality recording)
      - Minor impact
      - Significant impact
      - Cannot rate due to poor quality

  # Confidence
  - annotation_type: likert
    name: confidence
    description: "Confidence in your intelligibility rating"
    size: 5
    min_label: "Low"
    max_label: "High"

# User settings
allow_all_users: true
instances_per_annotator: 100

# Output
output:
  path: annotations/
  format: json
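
Each item pairs a reference prompt (`target_text`) with the annotator's orthographic `transcription`, so the two can be compared to get an objective score alongside the subjective ratings. The sketch below is not part of Potato or the TORGO protocol. It assumes you have already extracted both strings from your data and annotation files, and the function names (`words`, `word_edit_distance`, `word_accuracy`) are hypothetical. It computes the fraction of target words recovered, using a word-level edit distance.

```python
import re


def words(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_edit_distance(ref: list[str], hyp: list[str]) -> int:
    """Levenshtein distance counted in whole words (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(target_text: str, transcription: str) -> float:
    """Return 1 - (word errors / target length), clipped at 0."""
    ref = words(target_text)
    hyp = words(transcription)
    if not ref:
        # Nothing was supposed to be said: only an empty transcript is correct.
        return 1.0 if not hyp else 0.0
    return max(0.0, 1.0 - word_edit_distance(ref, hyp) / len(ref))
```

A score like this complements the `intelligibility` and `listener_effort` ratings rather than replacing them. It depends on how consistently annotators spell what they hear, so treat it as a rough cross-check.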

Get This Design

This design is available in our showcase. Copy the configuration above to get started.

Quick start:

# Create your project folder
mkdir speech-intelligibility-rating
cd speech-intelligibility-rating
# Save the config.yaml from above into this folder
# Place your data at data/speech_samples.json (the path set under data_files)
potato start config.yaml
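
The configuration expects a data file at `data/speech_samples.json` whose records carry an `audio_file` field and a `target_text` field. Below is a minimal sketch for generating a placeholder file to test the setup. It assumes the file is a JSON array of objects and that each record has an `id` key. Both are assumptions, so check the Potato documentation for the input format your version expects. The helper name `write_sample_data`, the audio paths, and the prompts are all hypothetical.

```python
import json
from pathlib import Path


def write_sample_data(project_dir: str) -> Path:
    """Write a tiny placeholder data file using the config's field names."""
    # Placeholder records: the ids, audio paths, and prompts are made up.
    samples = [
        {
            "id": "sample_001",
            "audio_file": "audio/sample_001.wav",
            "target_text": "The quick brown fox",
        },
        {
            "id": "sample_002",
            "audio_file": "audio/sample_002.wav",
            "target_text": "She sells sea shells",
        },
    ]
    data_path = Path(project_dir) / "data" / "speech_samples.json"
    data_path.parent.mkdir(parents=True, exist_ok=True)
    data_path.write_text(json.dumps(samples, indent=2), encoding="utf-8")
    return data_path
```

Calling `write_sample_data(".")` from inside the project folder creates `data/speech_samples.json`. Replace the placeholder records with your own recordings before annotating.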

Details

Annotation Types

likert, radio, text, multiselect

Domain

Audio, Speech, Medical

Use Cases

speech pathology, intelligibility rating, dysarthria assessment

Tags

audio, speech pathology, intelligibility, dysarthria, clinical