CHILDES Child Language Multi-Tier Annotation
Multi-tier ELAN-style annotation of child-adult interaction videos for language acquisition research. Annotators segment utterance boundaries on the timeline, provide morphological and syntactic annotations, and classify communicative context and error types. Based on the CHILDES/TalkBank project.
Configuration File: config.yaml
# CHILDES Child Language Multi-Tier Annotation Configuration
# Based on MacWhinney, Journal of Child Language 2000
# Paper: https://doi.org/10.1017/S0305000900003581
# Task: ELAN-style multi-tier annotation of child-adult interaction for language acquisition
annotation_task_name: "CHILDES Child Language Multi-Tier Annotation"
task_dir: "."

# Data configuration
data_files:
  - sample-data.json

item_properties:
  id_key: "id"
  text_key: "video_url"

# Output
output_annotation_dir: "annotation_output/"
output_annotation_format: "json"

# Annotation schemes - ELAN-style parallel tiers aligned to the video timeline
annotation_schemes:
  # Tier 1: Utterance boundary segmentation
  - name: "utterance_tier"
    description: |
      Segment the video timeline into individual utterances. Mark who is
      speaking (child or adult) and identify overlapping speech, non-verbal
      vocalizations, and unintelligible segments.
    annotation_type: "video_annotation"
    mode: "segment"
    labels:
      - name: "child-utterance"
        color: "#3B82F6"
        tooltip: "Clear, interpretable utterance produced by the child"
      - name: "adult-utterance"
        color: "#10B981"
        tooltip: "Utterance produced by the adult caregiver or interlocutor"
      - name: "overlap"
        color: "#F59E0B"
        tooltip: "Child and adult speaking simultaneously"
      - name: "vocalization"
        color: "#A855F7"
        tooltip: "Non-linguistic vocalization (babbling, cooing, crying, laughing)"
      - name: "unintelligible"
        color: "#EF4444"
        tooltip: "Speech that cannot be reliably transcribed"
    show_timecode: true
    video_fps: 25
  # Tier 2: Morphological annotation (free text)
  - name: "morphology"
    description: |
      Provide a morphological annotation of the child's utterance using CHAT
      conventions. Break words into morphemes and mark inflectional morphology
      (e.g., want-3SG, go-PAST, dog-PL).
    annotation_type: "text"
    textarea: true

  # Tier 3: Syntactic annotation (free text)
  - name: "syntax"
    description: |
      Provide a syntactic annotation of the child's utterance. Note phrase
      structure, word order patterns, and any syntactic constructions
      (e.g., SVO, Wh-question, negation, relative clause).
    annotation_type: "text"
    textarea: true

  # Tier 4: Communicative context classification
  - name: "communicative_context"
    description: "Classify the communicative context or function of the child's utterance."
    annotation_type: "radio"
    labels:
      - "spontaneous"
      - "imitation"
      - "routine"
      - "response"
      - "question"
      - "self-talk"
      - "directed-speech"
    keyboard_shortcuts:
      spontaneous: "1"
      imitation: "2"
      routine: "3"
      response: "4"
      question: "5"
      self-talk: "6"
      directed-speech: "7"

  # Tier 5: Error type classification
  - name: "error_type"
    description: "Classify the type of linguistic error in the child's utterance, if any."
    annotation_type: "radio"
    labels:
      - "none"
      - "phonological"
      - "morphological"
      - "syntactic"
      - "lexical"
      - "pragmatic"
    keyboard_shortcuts:
      none: "q"
      phonological: "w"
      morphological: "e"
      syntactic: "r"
      lexical: "t"
      pragmatic: "y"
# HTML layout
html_layout: |
  <div style="max-width: 900px; margin: 0 auto;">
    <h3 style="margin-bottom: 8px;">CHILDES: Multi-Tier Child Language Annotation</h3>
    <p style="color: #666; font-size: 14px; margin-bottom: 16px;">
      Annotate child-adult interaction videos across multiple tiers for utterance
      boundaries, morphology, syntax, communicative context, and error analysis.
    </p>
    <div style="text-align: center; margin-bottom: 20px;">
      <video controls width="720" style="max-width: 100%; border-radius: 8px; border: 1px solid #ddd;">
        <source src="{{video_url}}" type="video/mp4">
        Your browser does not support video playback.
      </video>
    </div>
    <div style="background: #f8f9fa; padding: 12px; border-radius: 6px; margin-bottom: 16px; font-size: 13px;">
      <strong>Multi-Tier Instructions:</strong> Annotate the child-adult interaction across
      five parallel tiers: utterance segmentation, morphological coding, syntactic structure,
      communicative context, and error classification. Focus primarily on the child's productions.
    </div>
  </div>

# User configuration
allow_all_users: true

# Task assignment
instances_per_annotator: 30
annotation_per_instance: 2
# Instructions
annotation_instructions: |
  ## CHILDES Child Language Multi-Tier Annotation

  This task uses ELAN-style multi-tier annotation for child-adult interaction
  videos following CHILDES/TalkBank conventions.

  ### Tier 1: Utterance Boundary Segmentation
  - Segment the timeline into individual utterances:
    - **Child utterance**: Interpretable speech produced by the child
    - **Adult utterance**: Speech from the caregiver or adult interlocutor
    - **Overlap**: Simultaneous speech from both parties
    - **Vocalization**: Non-linguistic sounds (babbling, cooing, crying, laughing)
    - **Unintelligible**: Speech that cannot be reliably understood
  - Use intonation contours and pauses to determine utterance boundaries
  - An utterance is defined as a single communicative unit with one intonation contour

  ### Tier 2: Morphological Annotation
  - Code the child's utterance morphologically using CHAT-style notation:
    - Separate morphemes with hyphens: "want-3SG", "dog-PL", "go-PAST"
    - Mark overregularizations: "go-ed" for "went" (overregularized past)
    - Use standard glosses: PL (plural), PAST (past tense), PROG (progressive),
      3SG (third person singular), POSS (possessive), NEG (negation)
  - Leave blank for adult utterances or unintelligible segments

  ### Tier 3: Syntactic Annotation
  - Note the syntactic structure of the child's utterance:
    - Word order pattern (SVO, SV, VO, single word, etc.)
    - Sentence type (declarative, interrogative, imperative, exclamatory)
    - Notable constructions (negation, questions, relative clauses, coordination)
    - Missing obligatory elements (e.g., "want cookie" = missing determiner)

  ### Tier 4: Communicative Context
  - Classify the communicative function:
    - **Spontaneous**: Self-initiated utterance, not prompted
    - **Imitation**: Direct repetition of an adult model
    - **Routine**: Part of a practiced routine (counting, song, greeting)
    - **Response**: Answer to an adult question or prompt
    - **Question**: Child asking a question
    - **Self-talk**: Speech directed to self or toys, not to the adult
    - **Directed speech**: Speech clearly addressed to a specific person

  ### Tier 5: Error Classification
  - Identify the primary error type, if any:
    - **None**: Target-like production
    - **Phonological**: Sound substitution, deletion, or addition
    - **Morphological**: Missing or incorrect inflection (e.g., "goed" for "went")
    - **Syntactic**: Word order error, missing function words
    - **Lexical**: Wrong word choice or neologism
    - **Pragmatic**: Contextually inappropriate utterance

  ### Developmental Notes
  - Child age is provided in the metadata; keep developmental expectations in mind
  - What counts as an "error" depends on the child's age and stage
  - At early stages (12-24 months), single words and babbling are expected
  - By 36+ months, expect more complex multi-word utterances
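The Tier 2 conventions can also be checked mechanically. Below is a minimal sketch of a gloss validator: the gloss inventory (PL, PAST, PROG, 3SG, POSS, NEG) comes from the instructions above, while the function name, tokenization rules, and the lowercase-segment convention for overregularizations like "go-ed" are illustrative assumptions, not part of CHAT or Potato.

```python
# Hypothetical validator for the CHAT-style morpheme glosses listed above.
# Gloss inventory from the Tier 2 instructions; all other details are assumptions.
VALID_GLOSSES = {"PL", "PAST", "PROG", "3SG", "POSS", "NEG"}

def check_morphology(annotation: str) -> list[str]:
    """Return the unknown glosses in a CHAT-style annotation string.

    Words are space-separated; morphemes within a word are hyphen-separated,
    with the first segment treated as the stem (e.g. "want-3SG go-PAST").
    Lowercase segments (surface overregularizations such as "go-ed") are
    not treated as glosses and are left unchecked.
    """
    unknown = []
    for word in annotation.split():
        stem, *glosses = word.split("-")
        for gloss in glosses:
            if gloss.isupper() and gloss not in VALID_GLOSSES:
                unknown.append(gloss)
    return unknown
```

A validator like this could flag typos in the free-text morphology tier before annotations are accepted, e.g. `check_morphology("run-FUT")` returns `["FUT"]` while `"want-3SG dog-PL go-ed"` passes cleanly.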
Sample Data: sample-data.json
[
{
"id": "childes_001",
"video_url": "https://example.com/videos/childes/adam_freeplay_24m.mp4",
"child_id": "adam",
"child_age_months": 24,
"language": "English",
"recording_context": "free-play"
},
{
"id": "childes_002",
"video_url": "https://example.com/videos/childes/sarah_mealtime_30m.mp4",
"child_id": "sarah",
"child_age_months": 30,
"language": "English",
"recording_context": "mealtime"
}
]
// ... and 8 more items

Get This Design
Clone or download from the repository.

Quick start:

git clone https://github.com/davidjurgens/potato-showcase.git
cd potato-showcase/video/childes-child-language
potato start config.yaml
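Before launching the task, the sample items can be sanity-checked against the fields the config expects. This is a sketch: the required-key set mirrors `item_properties` (`id`, `video_url`) plus the metadata fields shown in sample-data.json, and the `load_items` helper is an illustration, not part of Potato.

```python
import json

# Keys assumed from sample-data.json and config.yaml's item_properties;
# the helper itself is hypothetical, not a Potato API.
REQUIRED = {"id", "video_url", "child_id", "child_age_months"}

def load_items(path: str = "sample-data.json") -> list[dict]:
    """Load annotation items and verify each has the expected keys."""
    with open(path) as f:
        items = json.load(f)
    for item in items:
        missing = REQUIRED - item.keys()
        if missing:
            raise ValueError(f"{item.get('id', '?')}: missing keys {sorted(missing)}")
    return items
```

Running this before `potato start` catches malformed items early, e.g. an item without a `video_url` raises a `ValueError` naming the offending id.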
Found an issue or want to improve this design? Open an Issue.

Related Designs
DGS Corpus Sign Language Multi-Tier Annotation
Multi-tier ELAN-style annotation of German Sign Language (DGS) corpus videos. Annotators segment sign types, mouth gestures, non-manual signals, classify discourse functions, and provide German translations across parallel tiers aligned to the video timeline.
SaGA Gesture-Speech Alignment Multi-Tier Annotation
Multi-tier ELAN-style annotation of co-speech gestures and their alignment with spoken language. Annotators segment gesture phases and types on parallel timeline tiers, classify handedness and spatial reference frames, and transcribe concurrent speech. Based on the SaGA corpus.
CMU-MOSEI Multimodal Sentiment Multi-Tier Annotation
Multi-tier ELAN-style annotation of multimodal sentiment and emotion in YouTube opinion videos. Annotators segment visual behaviors and acoustic events on parallel timeline tiers, classify emotions and sentiment polarity, and transcribe speech for the CMU-MOSEI dataset.