intermediate, audio
Speech Intelligibility Rating
Rate the intelligibility of pathological speech, following the annotation protocols of the TORGO database.
Configuration File: config.yaml
task_name: "Speech Intelligibility Rating"
# Server configuration
server:
  port: 8000
# Audio settings
audio:
  enabled: true
  display: waveform
  waveform_color: "#14B8A6"
  progress_color: "#2DD4BF"
  speed_control: true
  speed_options: [0.5, 0.75, 1.0, 1.25]
# Data configuration
data_files:
  - path: data/speech_samples.json
    audio_field: audio_file
    text_field: target_text  # What was supposed to be said
# Annotation schemes
annotation_schemes:
  # Transcription (orthographic)
  - annotation_type: text
    name: transcription
    description: "Write exactly what you hear (orthographic transcription)"
    textarea: false
    placeholder: "Type what you hear..."
  # Overall intelligibility
  - annotation_type: likert
    name: intelligibility
    description: "Overall speech intelligibility"
    size: 5
    labels:
      - "1: Unintelligible (cannot understand)"
      - "2: Mostly unintelligible"
      - "3: Partially intelligible"
      - "4: Mostly intelligible"
      - "5: Fully intelligible (clear)"
  # Severity rating (clinical scale)
  - annotation_type: radio
    name: severity
    description: "Speech disorder severity (if applicable)"
    labels:
      - Normal (no apparent disorder)
      - Mild (noticeable but easily understood)
      - Moderate (requires effort to understand)
      - Severe (very difficult to understand)
      - Profound (essentially unintelligible)
  # Articulation issues
  - annotation_type: multiselect
    name: articulation_issues
    description: "What articulation issues are present? (Select all)"
    labels:
      - Slurred consonants
      - Vowel distortions
      - Sound substitutions
      - Sound omissions
      - Hypernasality
      - Breathy voice
      - Strained voice
      - Monopitch (flat intonation)
      - Slow rate
      - Fast/rushed rate
      - Irregular rhythm
      - None apparent
  # Speech rate
  - annotation_type: radio
    name: speech_rate
    description: "How would you characterize the speech rate?"
    labels:
      - Much too slow
      - Somewhat slow
      - Normal rate
      - Somewhat fast
      - Much too fast/rushed
  # Effort to understand
  - annotation_type: likert
    name: listener_effort
    description: "How much effort was required to understand?"
    size: 5
    min_label: "No effort"
    max_label: "Extreme effort"
  # Speaker consistency
  - annotation_type: radio
    name: consistency
    description: "Was intelligibility consistent throughout?"
    labels:
      - Yes, consistent throughout
      - Variable (some parts clearer than others)
      - Progressively worse
      - Progressively better
  # Audio quality impact
  - annotation_type: radio
    name: quality_impact
    description: "Did recording quality affect your rating?"
    labels:
      - No (good quality recording)
      - Minor impact
      - Significant impact
      - Cannot rate due to poor quality
  # Confidence
  - annotation_type: likert
    name: confidence
    description: "Confidence in your intelligibility rating"
    size: 5
    min_label: "Low"
    max_label: "High"
# User settings
allow_all_users: true
instances_per_annotator: 100
# Output
output:
  path: annotations/
  format: json
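Because the config collects an orthographic transcription alongside each item's target_text, the two can be compared to get a rough, objective complement to the listener ratings. The following is a minimal sketch in Python of a word-level accuracy score based on edit distance. The function names are hypothetical, the tokenization rule is an assumption, and nothing here is part of the annotation tool or the TORGO protocol.

```python
import re


def tokenize(text):
    """Lowercase the text and keep only word-like tokens (assumed rule)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_errors(reference, hypothesis):
    """Return (edit distance over words, number of reference words)."""
    ref, hyp = tokenize(reference), tokenize(hypothesis)
    # prev[j] holds the distance between the processed reference prefix
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from transcription
                            curr[j - 1] + 1,      # extra word in transcription
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1], len(ref)


def word_accuracy(reference, hypothesis):
    """Fraction of target words recovered, clipped at 0; None if no target."""
    errors, n_ref = word_errors(reference, hypothesis)
    if n_ref == 0:
        return None
    return max(0.0, 1.0 - errors / n_ref)


if __name__ == "__main__":
    # Hypothetical target text and annotator transcription.
    print(word_accuracy("The quick brown fox", "the quick fox"))
```

A score like this only reflects how many target words the annotator recovered; it does not replace the intelligibility, severity, or effort ratings defined in the schemes above.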
Get This Design
This design is available in our showcase. Copy the configuration above to get started.
Quick start:
# Create your project folder
mkdir speech-intelligibility-rating
cd speech-intelligibility-rating
# Copy config.yaml from above
potato start config.yaml
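The config reads its items from data/speech_samples.json, using the audio_file and target_text fields, so that file has to exist before the server starts. The sketch below, in Python, shows one way such a file might be prepared. The "id" key, the audio paths, the sample sentences, and the helper names are all hypothetical, and the choice of one JSON object per line is an assumption; set json_lines=False to write a single JSON array if that is what your setup expects.

```python
import json
from pathlib import Path

# Hypothetical items: the "id" key, paths, and sentences are placeholders.
SAMPLES = [
    {"id": "sample_001", "audio_file": "audio/sample_001.wav",
     "target_text": "The quick brown fox"},
    {"id": "sample_002", "audio_file": "audio/sample_002.wav",
     "target_text": "She had your dark suit"},
]


def missing_fields(items, required=("audio_file", "target_text")):
    """List (item index, field name) pairs for empty or absent fields."""
    return [(index, field)
            for index, item in enumerate(items)
            for field in required
            if not item.get(field)]


def write_samples(path, items, json_lines=True):
    """Write items to path, creating parent folders as needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    if json_lines:
        # Assumed layout: one JSON object per line.
        text = "\n".join(json.dumps(item) for item in items) + "\n"
    else:
        text = json.dumps(items, indent=2) + "\n"
    path.write_text(text, encoding="utf-8")
    return path


if __name__ == "__main__":
    problems = missing_fields(SAMPLES)
    if problems:
        raise SystemExit(f"Incomplete items: {problems}")
    print(write_samples("data/speech_samples.json", SAMPLES))
```

Run it from inside the project folder so the relative path matches the data_files entry in config.yaml.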
Details
Annotation Types
likert, radio, text
Domain
Audio, Speech, Medical
Use Cases
speech pathology, intelligibility rating, dysarthria assessment
Tags
audio, speech pathology, intelligibility, dysarthria, clinical
Related Designs
Audio-Visual Sentiment Analysis
Rate sentiment in speech segments following CMU-MOSI and CMU-MOSEI multimodal annotation protocols.
likert, radio
Speech Emotion Recognition
Classify emotional content in speech following IEMOCAP and CREMA-D annotation schemes.
radio, likert
Speech Quality MOS Rating
Rate speech quality using Mean Opinion Score following ITU-T P.800 and Blizzard Challenge protocols.
likert, radio