Dialogue Annotation
Annotate conversations and multi-item text with special display options.
Dialogue and List Annotation
Potato supports annotation of multi-item data where each instance contains a list of text elements. This is commonly used for:
- Dialogue annotation: Conversations with multiple turns
- Pairwise comparison: Comparing two or more text variants
- Multi-document tasks: Rating or labeling multiple related texts
Data Format
Input Data
Multi-item data is represented as a list of strings in the text field:
{"id": "conv_001", "text": ["Tom: Isn't this awesome?!", "Sam: Yes! I like you!", "Tom: Great!", "Sam: Awesome! Let's party!"]}
{"id": "conv_002", "text": ["Tom: I am so sorry for that", "Sam: No worries", "Tom: Thanks for your understanding!"]}
Each string in the list represents one item (e.g., a dialogue turn, a document variant).
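The format above can be checked before a file is handed to the annotation server. The following is a minimal sketch, not part of Potato itself: the helper name `load_instances` is hypothetical, and it assumes one JSON record per line with the `id` and `text` keys shown above.

```python
import json

def load_instances(lines, id_key="id", text_key="text"):
    """Parse JSON-lines records and check the multi-item text field.

    Hypothetical helper for illustration; not a Potato API.
    """
    instances = []
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        if id_key not in record:
            raise ValueError(f"line {line_no}: missing '{id_key}'")
        items = record.get(text_key)
        # The text field must be a list in which every element is a string.
        if not isinstance(items, list) or not all(isinstance(s, str) for s in items):
            raise ValueError(f"line {line_no}: '{text_key}' must be a list of strings")
        instances.append(record)
    return instances

# Build two sample lines in the same shape as the records above.
sample_lines = [
    json.dumps({"id": "conv_001", "text": ["Tom: Great!", "Sam: Awesome! Let's party!"]}),
    json.dumps({"id": "conv_002", "text": ["Tom: I am so sorry for that", "Sam: No worries"]}),
]
instances = load_instances(sample_lines)
print([len(inst["text"]) for inst in instances])
```

A record whose `text` value is a plain string rather than a list raises `ValueError`, which catches the most common formatting mistake early.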
Configuration
Basic Setup
# Data configuration
data_files:
  - data/dialogues.json
item_properties:
  id_key: id
  text_key: text
# Configure list display
list_as_text:
  text_list_prefix_type: none  # No prefix since speaker names are in text
  alternating_shading: true    # Shade every other turn for readability
# Annotation schemes
annotation_schemes:
  - annotation_type: radio
    name: sentiment
    description: "What is the overall sentiment of this conversation?"
    labels:
      - positive
      - neutral
      - negative
Display Options
The list_as_text configuration controls how list items are displayed:
list_as_text:
  text_list_prefix_type: alphabet  # Prefix type for items
  horizontal: false                # Layout direction
  alternating_shading: false       # Shade alternate turns
Prefix Types
| Option | Example | Best For |
|---|---|---|
| alphabet | A. B. C. | Pairwise comparisons, options |
| number | 1. 2. 3. | Sequential turns, ordered lists |
| bullet | . . . | Unordered items |
| none | (no prefix) | Dialogue with speaker names in text |
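To make the effect of each prefix type concrete, here is a small sketch of how the labels in the table could be generated. It illustrates the table only and is not Potato's rendering code: the function names are hypothetical, and the bullet character is an assumption.

```python
import string

BULLET = "\u2022"  # assumed bullet glyph; the actual character Potato renders may differ

def item_prefix(index, prefix_type):
    """Return the label shown before the item at a zero-based index (illustrative only)."""
    if prefix_type == "alphabet":
        if index >= len(string.ascii_uppercase):
            raise ValueError("this sketch only covers the first 26 items")
        return f"{string.ascii_uppercase[index]}. "
    if prefix_type == "number":
        return f"{index + 1}. "
    if prefix_type == "bullet":
        return f"{BULLET} "
    if prefix_type == "none":
        return ""
    raise ValueError(f"unknown prefix type: {prefix_type}")

def render_items(items, prefix_type):
    """Prepend the chosen prefix to every item in the list."""
    return [item_prefix(i, prefix_type) + item for i, item in enumerate(items)]

turns = ["Tom: I am so sorry for that", "Sam: No worries"]
for line in render_items(turns, "number"):
    print(line)
```

With `none`, the items are returned unchanged, which is why that option suits dialogue whose speaker names are already part of each string.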
Layout Options
| Option | Description |
|---|---|
| horizontal: false | Vertical layout (default) - items stacked |
| horizontal: true | Side-by-side layout - for pairwise comparison |
| alternating_shading: true | Shades every other turn for dialogue |
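A typo in one of these options is easy to miss in a long config file. The sketch below checks a list_as_text block, represented as a Python dict, against the options documented on this page. The helper `check_list_as_text` is hypothetical, and the set of known keys is an assumption based only on this page, so an option reported as undocumented may still be valid in Potato.

```python
ALLOWED_PREFIX_TYPES = {"alphabet", "number", "bullet", "none"}
BOOLEAN_OPTIONS = ("horizontal", "alternating_shading")
DOCUMENTED_KEYS = {"text_list_prefix_type", *BOOLEAN_OPTIONS}

def check_list_as_text(options):
    """Return a list of problems found in a list_as_text block (illustrative only)."""
    problems = []
    prefix = options.get("text_list_prefix_type")
    if prefix is not None and prefix not in ALLOWED_PREFIX_TYPES:
        problems.append(f"text_list_prefix_type: unexpected value {prefix!r}")
    for key in BOOLEAN_OPTIONS:
        value = options.get(key)
        if value is not None and not isinstance(value, bool):
            problems.append(f"{key}: expected true or false, got {value!r}")
    # Flag keys that this page does not document; they may be typos.
    for key in sorted(set(options) - DOCUMENTED_KEYS):
        problems.append(f"{key}: not documented on this page")
    return problems

print(check_list_as_text({"text_list_prefix_type": "alphabet", "horizontal": True}))
print(check_list_as_text({"text_list_prefix_type": "roman", "horizontl": True}))
```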
Example Configurations
Dialogue Annotation
annotation_task_name: Dialogue Analysis
data_files:
  - data/dialogues.json
item_properties:
  id_key: id
  text_key: text
list_as_text:
  text_list_prefix_type: none
  alternating_shading: true
annotation_schemes:
  - annotation_type: span
    name: certainty
    description: Highlight phrases expressing certainty or uncertainty
    labels:
      - certain
      - uncertain
    sequential_key_binding: true
  - annotation_type: radio
    name: sentiment
    description: What sentiment does the conversation hold?
    labels:
      - positive
      - neutral
      - negative
    sequential_key_binding: true
Pairwise Text Comparison
annotation_task_name: Text Comparison
data_files:
  - data/pairs.json
item_properties:
  id_key: id
  text_key: text
list_as_text:
  text_list_prefix_type: alphabet
  horizontal: true
annotation_schemes:
  - annotation_type: radio
    name: preference
    description: Which text is better?
    labels:
      - A is better
      - B is better
      - Equal
Working Example
A complete working example is available at project-hub/dialogue_analysis/:
python potato/flask_server.py start project-hub/dialogue_analysis/configs/dialogue-analysis.yaml -p 8000
Sample data format:
{"id":"1","text":["Tom: Isn't this awesome?!", "Sam: Yes! I like you!", "Tom: great!", "Sam: Awesome! Let's party!"]}
{"id":"2","text":["Tom: I am so sorry for that", "Sam: No worries", "Tom: thanks for your understanding!"]}
Tips
- Speaker Names: Include speaker names in the text (e.g., "Tom: Hello") when using text_list_prefix_type: none for dialogue.
- Span Annotation: When using span annotation with dialogue data, annotators can highlight text within any of the displayed turns.
- Prefix Choice:
  - Use none for dialogue where speaker names are embedded in the text
  - Use number when sequence order matters
  - Use alphabet for pairwise/comparison tasks
- Readability: Enable alternating_shading for long dialogues to help annotators track which turn they're reading.
- Comparison Tasks: Use horizontal: true with alphabet prefixes for side-by-side comparison.
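The tips above amount to a small lookup from task type to display settings. As a rough sketch, they can be written as a function that returns the suggested list_as_text options. The function name and the task labels ("dialogue", "ordered", "pairwise") are hypothetical and exist only for this illustration.

```python
def suggest_list_display(task):
    """Map a task type to the list_as_text options suggested in the tips (illustrative only)."""
    if task == "dialogue":
        # Speaker names are already in the text, so no prefix; shading helps track turns.
        return {"text_list_prefix_type": "none", "alternating_shading": True}
    if task == "ordered":
        # Numbered prefixes make the sequence order explicit.
        return {"text_list_prefix_type": "number"}
    if task == "pairwise":
        # Lettered prefixes with a side-by-side layout for comparison.
        return {"text_list_prefix_type": "alphabet", "horizontal": True}
    raise ValueError(f"unknown task type: {task}")

for task in ("dialogue", "ordered", "pairwise"):
    print(task, suggest_list_display(task))
```

The returned dict mirrors the keys of the list_as_text block, so its contents can be copied into a config file by hand.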
Further Reading
- Pairwise Comparison - Side-by-side comparison annotation
- Span Annotation - Highlighting text in dialogue turns
- Radio & Multiselect - Classification of conversations
For implementation details, see the source documentation.