
Understanding Potato Data Formats

A deep dive into JSON and JSONL data formats, with examples for text, image, audio, and multimodal annotation.

Potato Team

Potato uses JSON and JSONL formats for input data and output annotations. This guide covers format specifications, examples, and best practices for all data types.

Input Data Formats

JSONL (JSON Lines)

One JSON object per line:

json
{"id": "001", "text": "First document text here."}
{"id": "002", "text": "Second document text here."}
{"id": "003", "text": "Third document text here."}

Advantages:

  • Stream processing: the file can be read line by line, which keeps memory use low
  • Easy to append: new items are added as new lines
  • A corrupted line does not invalidate the rest of the file
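The advantages above can be illustrated with a minimal Python sketch that reads JSONL one line at a time and skips lines that fail to parse. The helper name `iter_jsonl` is hypothetical and not part of Potato; it only shows the general technique.

```python
import io
import json
from typing import Iterator, TextIO


def iter_jsonl(handle: TextIO) -> Iterator[dict]:
    """Yield one parsed object per non-empty line, skipping unparseable lines."""
    for line_number, line in enumerate(handle, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        try:
            yield json.loads(line)
        except json.JSONDecodeError as err:
            # A bad line is reported and skipped; the remaining lines stay usable.
            print(f"skipping line {line_number}: {err.msg}")


# Hypothetical sample: the second line is deliberately corrupted.
sample = io.StringIO(
    '{"id": "001", "text": "First document text here."}\n'
    '{broken\n'
    '{"id": "003", "text": "Third document text here."}\n'
)
items = list(iter_jsonl(sample))
```

Because the function is a generator, only one line is held in memory at a time, so the same code works on an open file handle of any size.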

JSON Array

Standard JSON array:

json
[
  {"id": "001", "text": "First document."},
  {"id": "002", "text": "Second document."},
  {"id": "003", "text": "Third document."}
]

Configuration:

yaml
data_files:
  - data/items.json
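If an existing JSON array file needs to become JSONL, the conversion is mechanical. The following is a sketch using only the Python standard library; `array_to_jsonl` is a hypothetical helper name, not a Potato utility.

```python
import json


def array_to_jsonl(array_text: str) -> str:
    """Convert the text of a JSON array into JSONL text (one object per line)."""
    items = json.loads(array_text)
    if not isinstance(items, list):
        raise ValueError("expected a top-level JSON array")
    # json.dumps without an indent keeps each item on a single line;
    # ensure_ascii=False leaves non-ASCII text readable.
    return "".join(json.dumps(item, ensure_ascii=False) + "\n" for item in items)


converted = array_to_jsonl(
    '[{"id": "001", "text": "First document."},'
    ' {"id": "002", "text": "Second document."}]'
)
```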

Text Annotation Data

Basic Text

json
{"id": "doc_001", "text": "The product quality exceeded my expectations."}

With Metadata

json
{
  "id": "review_001",
  "text": "Great product, fast shipping!",
  "metadata": {
    "source": "amazon",
    "date": "2024-01-15",
    "author": "user123",
    "rating": 5
  }
}

With Pre-annotations

json
{
  "id": "ner_001",
  "text": "Apple announced new products in Cupertino.",
  "pre_annotations": {
    "entities": [
      {"start": 0, "end": 5, "label": "ORG", "text": "Apple"},
      {"start": 32, "end": 41, "label": "LOC", "text": "Cupertino"}
    ]
  }
}
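Span offsets are easy to get wrong by one character, so it can help to check them before loading. The sketch below assumes offsets are 0-based character positions with an exclusive end, so that `text[start:end]` reproduces the span's `text` field. That convention is inferred from the examples in this guide rather than confirmed, and `find_misaligned_spans` is a hypothetical helper.

```python
def find_misaligned_spans(item: dict) -> list:
    """Return the pre-annotated entities whose offsets do not match their text."""
    text = item["text"]
    entities = item.get("pre_annotations", {}).get("entities", [])
    # Assumption: start is inclusive, end is exclusive, both count characters.
    return [e for e in entities if text[e["start"]:e["end"]] != e["text"]]


# Hypothetical item: the second entity is deliberately shifted by one character.
example = {
    "id": "check_001",
    "text": "Tim Cook visited Paris.",
    "pre_annotations": {
        "entities": [
            {"start": 0, "end": 8, "label": "PERSON", "text": "Tim Cook"},
            {"start": 16, "end": 21, "label": "LOC", "text": "Paris"},
        ]
    },
}
misaligned = find_misaligned_spans(example)
```

Here the second entity is reported because the slice starts on the preceding space; the correct offsets for "Paris" in this sentence are 17 and 22.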

Configuration:

yaml
data_files:
  - data/texts.json
 
item_properties:
  id_key: id
  text_key: text

Image Annotation Data

Local Images

json
{
  "id": "img_001",
  "image_path": "/data/images/photo_001.jpg",
  "caption": "Street scene in Paris"
}

Remote Images

json
{
  "id": "img_002",
  "image_url": "https://example.com/images/photo.jpg"
}

With Bounding Boxes

json
{
  "id": "detection_001",
  "image_path": "/images/street.jpg",
  "pre_annotations": {
    "objects": [
      {"bbox": [100, 150, 200, 300], "label": "person"},
      {"bbox": [350, 200, 450, 280], "label": "car"}
    ]
  }
}
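This guide does not state which bounding box convention the `bbox` field uses. The sketch below assumes `[x_min, y_min, x_max, y_max]` in pixels, which is only one of several common conventions, so confirm it against the full documentation before relying on it. `bbox_size` is a hypothetical helper.

```python
def bbox_size(bbox: list) -> tuple:
    """Return (width, height), ASSUMING bbox is [x_min, y_min, x_max, y_max]."""
    if len(bbox) != 4:
        raise ValueError("bbox must have exactly four numbers")
    x_min, y_min, x_max, y_max = bbox
    if x_max <= x_min or y_max <= y_min:
        # Under the corner assumption, a box must have positive width and height.
        raise ValueError("bbox corners are not ordered as min/max")
    return (x_max - x_min, y_max - y_min)


person_size = bbox_size([100, 150, 200, 300])
car_size = bbox_size([350, 200, 450, 280])
```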

Configuration:

yaml
data_files:
  - data/images.json
 
item_properties:
  id_key: id
  image_key: image_path  # or image_url

Audio Annotation Data

Local Audio

json
{
  "id": "audio_001",
  "audio_path": "/data/audio/recording.wav",
  "duration": 45.5,
  "transcript": "Hello, how are you today?"
}

With Segments

json
{
  "id": "audio_002",
  "audio_path": "/audio/meeting.mp3",
  "segments": [
    {"start": 0.0, "end": 5.5, "speaker": "Speaker1"},
    {"start": 5.5, "end": 12.0, "speaker": "Speaker2"}
  ]
}
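Segment lists can be sanity-checked in a similar way. This sketch assumes `start` and `end` are in seconds and that segments are expected to be ordered and non-overlapping. Neither assumption is stated by this guide (overlapping speech may be perfectly valid in your data), and `segment_problems` is a hypothetical helper.

```python
def segment_problems(segments: list) -> list:
    """Describe segments that are empty, reversed, or overlap an earlier one."""
    problems = []
    previous_end = float("-inf")
    for index, segment in enumerate(segments):
        if segment["start"] >= segment["end"]:
            problems.append(f"segment {index}: start is not before end")
        if segment["start"] < previous_end:
            problems.append(f"segment {index}: overlaps an earlier segment")
        previous_end = max(previous_end, segment["end"])
    return problems


clean = segment_problems([
    {"start": 0.0, "end": 5.5, "speaker": "Speaker1"},
    {"start": 5.5, "end": 12.0, "speaker": "Speaker2"},
])
overlapping = segment_problems([
    {"start": 0.0, "end": 5.5, "speaker": "Speaker1"},
    {"start": 5.0, "end": 12.0, "speaker": "Speaker2"},
])
```

Segments that share a boundary, as in the first call, are treated as adjacent rather than overlapping.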

Configuration:

yaml
data_files:
  - data/audio.json
 
item_properties:
  audio_key: audio_path
  text_key: transcript

Multimodal Data

Text + Image

json
{
  "id": "mm_001",
  "text": "What is shown in this image?",
  "image_path": "/images/scene.jpg"
}

Text + Audio

json
{
  "id": "mm_002",
  "text": "Transcribe this audio:",
  "audio_path": "/audio/clip.wav",
  "reference_transcript": "Expected transcription here"
}

Output Annotation Format

Basic Output

json
{
  "id": "doc_001",
  "text": "Great product!",
  "annotations": {
    "sentiment": "Positive",
    "confidence": 5
  },
  "annotator": "user123",
  "timestamp": "2024-11-05T10:30:00Z"
}
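Once annotations are exported, a quick label tally is often the first check. The sketch below assumes the output is stored one record per line with an `annotations` object shaped like the example above; the on-disk layout is an assumption, and `label_counts` is a hypothetical helper.

```python
import json
from collections import Counter


def label_counts(lines, scheme: str) -> Counter:
    """Count the values recorded for one annotation scheme across output records."""
    counts = Counter()
    for line in lines:
        if not line.strip():
            continue
        record = json.loads(line)
        value = record.get("annotations", {}).get(scheme)
        if value is not None:  # records without this scheme are ignored
            counts[value] += 1
    return counts


# Hypothetical output records for illustration.
output_lines = [
    '{"id": "doc_001", "annotations": {"sentiment": "Positive", "confidence": 5}}',
    '{"id": "doc_002", "annotations": {"sentiment": "Negative", "confidence": 3}}',
    '{"id": "doc_003", "annotations": {"sentiment": "Positive", "confidence": 4}}',
]
sentiment_counts = label_counts(output_lines, "sentiment")
```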

Span Annotations

json
{
  "id": "ner_001",
  "text": "Apple CEO Tim Cook visited Paris.",
  "annotations": {
    "entities": [
      {"start": 0, "end": 5, "label": "ORG", "text": "Apple"},
      {"start": 10, "end": 18, "label": "PERSON", "text": "Tim Cook"},
      {"start": 27, "end": 32, "label": "LOC", "text": "Paris"}
    ]
  }
}

Multiple Annotators

json
{
  "id": "item_001",
  "text": "Sample text",
  "annotations": [
    {
      "annotator": "ann1",
      "labels": {"sentiment": "Positive"},
      "timestamp": "2024-11-05T10:00:00Z"
    },
    {
      "annotator": "ann2",
      "labels": {"sentiment": "Positive"},
      "timestamp": "2024-11-05T11:00:00Z"
    }
  ],
  "aggregated": {
    "sentiment": "Positive",
    "agreement": 1.0
  }
}
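The `aggregated` block can be reproduced with a simple majority vote. In this sketch, `agreement` is taken to be the fraction of annotators who chose the winning label. That definition matches the example above (2 of 2 gives 1.0) but is an assumption, not a documented formula, and `aggregate` is a hypothetical helper.

```python
from collections import Counter


def aggregate(annotations: list, scheme: str) -> dict:
    """Majority-vote one scheme and report the share of annotators who agree."""
    labels = [a["labels"][scheme] for a in annotations if scheme in a["labels"]]
    if not labels:
        raise ValueError(f"no annotations found for scheme {scheme!r}")
    # On a tie, Counter.most_common returns the label that was seen first.
    label, votes = Counter(labels).most_common(1)[0]
    return {scheme: label, "agreement": votes / len(labels)}


unanimous = aggregate(
    [
        {"annotator": "ann1", "labels": {"sentiment": "Positive"}},
        {"annotator": "ann2", "labels": {"sentiment": "Positive"}},
    ],
    "sentiment",
)
```

For studies that need chance-corrected agreement, a statistic such as Cohen's kappa or Krippendorff's alpha would be a better fit than this raw fraction.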

Configuration Reference

yaml
data_files:
  - data/items.json
 
item_properties:
  id_key: id
  text_key: text
  image_key: image_path
  audio_key: audio_path
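To make the role of `item_properties` concrete, the sketch below shows how such a key mapping could be applied to a single item. It illustrates the idea only and is not Potato's actual loading code; `resolve_fields` is a hypothetical helper and requires Python 3.9 or later for `str.removesuffix`.

```python
def resolve_fields(item: dict, item_properties: dict) -> dict:
    """Map config roles (id, text, image, audio) to the values found in one item."""
    resolved = {}
    for config_key, field_name in item_properties.items():
        role = config_key.removesuffix("_key")  # "text_key" -> "text"
        if field_name in item:  # roles whose field is absent are simply omitted
            resolved[role] = item[field_name]
    return resolved


properties = {
    "id_key": "id",
    "text_key": "text",
    "image_key": "image_path",
    "audio_key": "audio_path",
}
item = {
    "id": "mm_001",
    "text": "What is shown in this image?",
    "image_path": "/images/scene.jpg",
}
fields = resolve_fields(item, properties)
```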

Best Practices

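  1. Always include IDs: a unique identifier per item lets you trace each annotation back to its source
  2. Use JSONL for large datasets: it can be read line by line instead of loading the whole file into memory
  3. Validate before loading: check that every file parses as valid JSON
  4. Include metadata: fields such as source, date, and author help with debugging
  5. Use consistent field names: this simplifies downstream processing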
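Several of these practices can be checked automatically before a file is loaded. The following is a minimal sketch of a pre-load check for JSONL input, covering JSON syntax, the presence of an ID, and ID uniqueness. `validate_items` is a hypothetical helper, and the default `id` key mirrors the examples in this guide.

```python
import json


def validate_items(lines, id_key: str = "id") -> list:
    """Return human-readable problems found in JSONL lines (empty list if none)."""
    errors = []
    seen_ids = set()
    for line_number, line in enumerate(lines, start=1):
        if not line.strip():
            continue
        try:
            item = json.loads(line)
        except json.JSONDecodeError as err:
            errors.append(f"line {line_number}: invalid JSON ({err.msg})")
            continue
        if not isinstance(item, dict) or id_key not in item:
            errors.append(f"line {line_number}: missing '{id_key}'")
            continue
        if item[id_key] in seen_ids:
            errors.append(f"line {line_number}: duplicate id {item[id_key]!r}")
        seen_ids.add(item[id_key])
    return errors


# Hypothetical input with one duplicate, one missing ID, and one syntax error.
problems = validate_items([
    '{"id": "001", "text": "a"}',
    '{"id": "001", "text": "b"}',
    '{"text": "c"}',
    'not json',
])
```

The function accepts any iterable of lines, so it can be called directly on an open file handle.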

For the full data format documentation, see /docs/core-concepts/data-formats.