Speech Emotion Classification
Build an audio emotion classification task with waveform display, playback speed controls, and Likert scales.
Potato Team
Speech emotion recognition (SER) powers virtual assistants, mental health applications, and customer service analytics. This tutorial shows how to build annotation interfaces for categorical emotions, dimensional ratings, and mixed-emotion detection.
Emotion Annotation Approaches
There are several ways to annotate emotion in speech:
- Categorical: discrete labels (happy, sad, angry)
- Dimensional: continuous scales (valence, arousal, dominance)
- Mixed: multiple emotions, each with an intensity rating
- Segment-level: different emotions at different timestamps
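To make the difference between the four approaches concrete, here is a minimal Python sketch of what a single annotation might look like under each one. The class names, field names, and scale ranges are illustrative assumptions, not Potato's output schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CategoricalLabel:
    """Categorical: one discrete label per utterance."""
    emotion: str  # e.g. "happy", "sad", "angry"


@dataclass
class DimensionalRating:
    """Dimensional: one rating per axis (a 1-7 scale is assumed here)."""
    valence: int
    arousal: int
    dominance: int


@dataclass
class MixedLabel:
    """Mixed: several emotions, each with its own intensity (1-5 assumed)."""
    intensities: dict[str, int] = field(default_factory=dict)

    def primary(self) -> str:
        # Return the emotion with the highest intensity rating.
        return max(self.intensities, key=self.intensities.get)


@dataclass
class SegmentLabel:
    """Segment-level: an emotion attached to a time span, in seconds."""
    start: float
    end: float
    emotion: str


# One hypothetical annotation per approach.
examples = [
    CategoricalLabel("angry"),
    DimensionalRating(valence=2, arousal=6, dominance=5),
    MixedLabel({"happy": 2, "surprised": 4}),
    [SegmentLabel(0.0, 3.2, "neutral"), SegmentLabel(3.2, 7.5, "angry")],
]
```

The shapes differ in what downstream analysis they allow: categorical labels suit classification, dimensional ratings suit regression, and the mixed and segment-level forms carry more detail but take longer to annotate.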
Categorical Emotion Classification
Basic Setup
yaml
annotation_task_name: "Speech Emotion Recognition"
data_files:
  - data/utterances.json
item_properties:
  id_key: id
  audio_key: audio_path
  text_key: transcript  # Optional transcript
audio:
  enabled: true
  display: waveform
  waveform_color: "#8B5CF6"
  progress_color: "#A78BFA"
  speed_control: true
  speed_options: [0.75, 1.0, 1.25]
annotation_schemes:
  - annotation_type: radio
    name: emotion
    description: "What emotion is expressed in this speech?"
    labels:
      - name: Happy
        description: "Joy, excitement, amusement"
        keyboard_shortcut: "h"
      - name: Sad
        description: "Sorrow, disappointment, grief"
        keyboard_shortcut: "s"
      - name: Angry
        description: "Frustration, irritation, rage"
        keyboard_shortcut: "a"
      - name: Fearful
        description: "Anxiety, worry, terror"
        keyboard_shortcut: "f"
      - name: Surprised
        description: "Astonishment, shock"
        keyboard_shortcut: "u"
      - name: Disgusted
        description: "Revulsion, distaste"
        keyboard_shortcut: "d"
      - name: Neutral
        description: "No clear emotion"
        keyboard_shortcut: "n"
    required: true

Adding Intensity
yaml
annotation_schemes:
  - annotation_type: radio
    name: emotion
    labels: [Happy, Sad, Angry, Fearful, Surprised, Disgusted, Neutral]
    required: true
  - annotation_type: likert
    name: intensity
    description: "How intense is this emotion?"
    size: 5
    min_label: "Very weak"
    max_label: "Very strong"
    conditional:
      depends_on: emotion
      hide_when: ["Neutral"]

Dimensional Emotion Annotation
The VAD (valence-arousal-dominance) model:
yaml
annotation_task_name: "Dimensional Emotion Rating"
annotation_schemes:
  # Valence: negative to positive
  - annotation_type: likert
    name: valence
    description: "Valence: How positive or negative?"
    size: 7
    min_label: "Very negative"
    max_label: "Very positive"
  # Arousal: calm to excited
  - annotation_type: likert
    name: arousal
    description: "Arousal: How calm or excited?"
    size: 7
    min_label: "Very calm"
    max_label: "Very excited"
  # Dominance: submissive to dominant
  - annotation_type: likert
    name: dominance
    description: "Dominance: How submissive or dominant?"
    size: 7
    min_label: "Very submissive"
    max_label: "Very dominant"

Visual Scales (SAM)
Self-Assessment Manikin (SAM) style:
yaml
annotation_schemes:
  - annotation_type: image_scale
    name: valence
    description: "Select the figure that matches the emotional valence"
    images:
      - path: /images/sam_valence_1.png
        value: 1
      - path: /images/sam_valence_2.png
        value: 2
      # ... etc
    size: 9

Mixed Emotion Detection
For speech that contains multiple emotions:
yaml
annotation_schemes:
  - annotation_type: multiselect
    name: emotions_present
    description: "Select ALL emotions you detect (can be multiple)"
    labels:
      - Happy
      - Sad
      - Angry
      - Fearful
      - Surprised
      - Disgusted
      - Contempt
    min_selections: 1
  - annotation_type: radio
    name: primary_emotion
    description: "Which emotion is MOST prominent?"
    labels:
      - Happy
      - Sad
      - Angry
      - Fearful
      - Surprised
      - Disgusted
      - Contempt
      - Mixed (no dominant)

Comprehensive Emotion Annotation
yaml
annotation_task_name: "Comprehensive Speech Emotion Annotation"
data_files:
  - data/speech_samples.json
item_properties:
  id_key: id
  audio_key: audio_url
  text_key: transcript
audio:
  enabled: true
  display: waveform
  waveform_color: "#EC4899"
  progress_color: "#F472B6"
  height: 120
  speed_control: true
  speed_options: [0.5, 0.75, 1.0, 1.25]
  show_duration: true
  autoplay: false
# Show transcript if available
display:
  show_text: true
  text_field: transcript
  text_label: "Transcript (for reference)"
annotation_schemes:
  # Primary categorical emotion
  - annotation_type: radio
    name: primary_emotion
    description: "Primary emotion expressed"
    labels:
      - name: Happiness
        color: "#FCD34D"
        keyboard_shortcut: "1"
      - name: Sadness
        color: "#60A5FA"
        keyboard_shortcut: "2"
      - name: Anger
        color: "#F87171"
        keyboard_shortcut: "3"
      - name: Fear
        color: "#A78BFA"
        keyboard_shortcut: "4"
      - name: Surprise
        color: "#34D399"
        keyboard_shortcut: "5"
      - name: Disgust
        color: "#FB923C"
        keyboard_shortcut: "6"
      - name: Neutral
        color: "#9CA3AF"
        keyboard_shortcut: "7"
    required: true
  # Emotional intensity
  - annotation_type: likert
    name: intensity
    description: "Emotional intensity"
    size: 5
    min_label: "Very mild"
    max_label: "Very intense"
    required: true
  # Dimensional ratings
  - annotation_type: likert
    name: valence
    description: "Valence (negative to positive)"
    size: 7
    min_label: "Negative"
    max_label: "Positive"
  - annotation_type: likert
    name: arousal
    description: "Arousal (calm to excited)"
    size: 7
    min_label: "Calm"
    max_label: "Excited"
  # Voice quality
  - annotation_type: multiselect
    name: voice_qualities
    description: "Voice characteristics (select all that apply)"
    labels:
      - Trembling voice
      - Raised pitch
      - Lowered pitch
      - Loud/shouting
      - Soft/whisper
      - Fast speech rate
      - Slow speech rate
      - Breathy
      - Tense/strained
      - Crying
      - Laughing
  # Genuineness
  - annotation_type: radio
    name: authenticity
    description: "Does the emotion seem genuine?"
    labels:
      - Clearly genuine
      - Likely genuine
      - Uncertain
      - Likely acted/fake
      - Clearly acted/fake
  # Confidence
  - annotation_type: likert
    name: confidence
    description: "How confident are you in your annotation?"
    size: 5
    min_label: "Guessing"
    max_label: "Certain"
annotation_guidelines:
  title: "Emotion Annotation Guidelines"
  content: |
    ## Listening Instructions
    1. Listen to the entire clip before annotating
    2. You may replay as many times as needed
    3. Focus on the VOICE, not just the words
    ## Emotion Categories
    - **Happiness**: Joy, amusement, contentment
    - **Sadness**: Sorrow, disappointment, melancholy
    - **Anger**: Frustration, irritation, rage
    - **Fear**: Anxiety, nervousness, terror
    - **Surprise**: Astonishment, startle
    - **Disgust**: Revulsion, contempt
    - **Neutral**: Calm, matter-of-fact
    ## Tips
    - Consider tone, pitch, speaking rate
    - The transcript may not match the emotion
    - When unsure between two emotions, choose the stronger one
    - Use the intensity scale for unclear cases
output_annotation_dir: annotations/
output_annotation_format: jsonl

Output Format
json
{
  "id": "utt_001",
  "audio_url": "/audio/sample_001.wav",
  "transcript": "I can't believe this happened!",
  "annotations": {
    "primary_emotion": "Surprise",
    "intensity": 4,
    "valence": 2,
    "arousal": 6,
    "voice_qualities": ["Raised pitch", "Fast speech rate"],
    "authenticity": "Clearly genuine",
    "confidence": 4
  },
  "annotator": "rater_01",
  "timestamp": "2024-12-05T10:30:00Z"
}

Segment-Level Emotions
For long audio recordings in which the emotion changes over time:
yaml
annotation_schemes:
  - annotation_type: audio_segments
    name: emotion_segments
    description: "Mark time segments with different emotions"
    labels:
      - name: Happy
        color: "#FCD34D"
      - name: Sad
        color: "#60A5FA"
      - name: Angry
        color: "#F87171"
      - name: Neutral
        color: "#9CA3AF"
    segment_attributes:
      - name: intensity
        type: likert
        size: 5

Quality Control
yaml
quality_control:
  attention_checks:
    enabled: true
    gold_items:
      - audio: "/audio/gold/clearly_happy.wav"
        expected:
          primary_emotion: "Happiness"
          intensity: [4, 5]  # Accept 4 or 5
      - audio: "/audio/gold/clearly_angry.wav"
        expected:
          primary_emotion: "Anger"

Tips for Emotion Annotation
- Listen fully: always listen to the entire clip
- Focus on the voice: the emotional information lies in how things are said
- Cultural awareness: norms of emotional expression differ across cultures
- Fatigue management: take breaks, because emotion annotation is draining
- Calibration: regular team discussions improve consistency
Next Steps
- Add speaker diarization to track the emotions of multiple speakers
- Set up crowdsourcing to collect annotations at scale
- Compute inter-annotator agreement for emotion tasks
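For the agreement step, a minimal Python sketch follows. It reads JSONL records shaped like the output example in this tutorial (the `id`, `annotator`, and `annotations.primary_emotion` fields) and computes Cohen's kappa for two annotators. The helper names, the sample ratings, and the assumption that each line holds one annotator's record for one item are illustrative, so check them against your actual output files.

```python
import json
from collections import Counter


def load_labels(lines, scheme="primary_emotion"):
    """Map annotator -> {item id -> label} from JSONL lines."""
    labels = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        rec = json.loads(line)
        labels.setdefault(rec["annotator"], {})[rec["id"]] = rec["annotations"][scheme]
    return labels


def cohen_kappa(a, b):
    """Cohen's kappa over the items that both annotators labeled."""
    shared = sorted(set(a) & set(b))
    if not shared:
        raise ValueError("the two annotators have no items in common")
    n = len(shared)
    # Observed agreement: fraction of shared items with identical labels.
    observed = sum(a[i] == b[i] for i in shared) / n
    # Chance agreement from each annotator's own label distribution.
    count_a = Counter(a[i] for i in shared)
    count_b = Counter(b[i] for i in shared)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical ratings, serialized the way the JSONL output is assumed to look.
ratings = {
    "rater_01": ["Surprise", "Anger", "Anger", "Neutral"],
    "rater_02": ["Surprise", "Anger", "Neutral", "Neutral"],
}
sample_lines = [
    json.dumps({"id": f"utt_{i:03d}", "annotator": who,
                "annotations": {"primary_emotion": label}})
    for who, labels in ratings.items()
    for i, label in enumerate(labels, start=1)
]

by_annotator = load_labels(sample_lines)
kappa = cohen_kappa(by_annotator["rater_01"], by_annotator["rater_02"])
print(f"kappa = {kappa:.3f}")  # kappa = 0.636
```

Cohen's kappa treats labels as unordered categories, so it fits `primary_emotion`. For Likert ratings such as `valence` or `intensity`, an ordinal-aware measure (for example weighted kappa or Krippendorff's alpha) is usually a better fit.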
Documentation is at /docs/features/audio-annotation.