Soft Label

Configure soft label annotation in Potato for probability distribution allocation across categories using sliders that must sum to a fixed total.

The soft label annotation schema allows annotators to assign probability distributions across multiple categories rather than making a single hard classification decision. Annotators use sliders to distribute a fixed total (e.g., 100 points) across labels, capturing the degree of uncertainty or overlap between categories.

Overview

Soft labeling is useful when items may partially belong to multiple categories. Instead of forcing annotators to choose one label, this schema lets them express relative confidence across all options. The sliders are linked so they always sum to the configured total, and an optional distribution chart provides visual feedback.
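To make the idea of linked sliders concrete, here is a minimal sketch of one way such linking could work: when one slider moves, the others are rescaled proportionally so the sum stays fixed. This is an illustration only. The function name `rebalance` is hypothetical, and this page does not describe how Potato itself redistributes points, so the actual interface may behave differently.

```python
def rebalance(values: dict, changed: str, new_value: int, total: int = 100) -> dict:
    """Set one slider, then rescale the others so everything sums to `total`.

    Hypothetical illustration, not Potato's implementation.
    Ignores min_per_label for simplicity.
    """
    new_value = max(0, min(new_value, total))      # clamp to the valid range
    others = [k for k in values if k != changed]
    remaining = total - new_value                  # points left for the others
    old_sum = sum(values[k] for k in others)

    if old_sum == 0:
        # Nothing to scale proportionally, so split the remainder evenly.
        shares = {k: remaining // len(others) for k in others}
    else:
        # Scale each other slider in proportion to its previous value.
        shares = {k: values[k] * remaining // old_sum for k in others}

    # Integer division can leave a few points unassigned; hand them out
    # one at a time, in label order.
    leftover = remaining - sum(shares.values())
    for k in others[:leftover]:
        shares[k] += 1

    shares[changed] = new_value
    return {k: shares[k] for k in values}          # keep the original label order


before = {"Positive": 45, "Neutral": 30, "Negative": 25}
after = rebalance(before, "Positive", 60)
print(after, sum(after.values()))
```

Proportional rescaling preserves the relative weights of the untouched labels, which is one reasonable design choice; other schemes (for example, taking points only from the largest slider) would also keep the sum fixed.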

Quick Start

yaml
annotation_schemes:
  - annotation_type: soft_label
    name: sentiment_distribution
    description: Distribute 100 points across sentiment categories based on how much each applies.
    labels: ["Positive", "Neutral", "Negative"]
    total: 100

Configuration Options

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| annotation_type | string | Required | Must be "soft_label" |
| name | string | Required | Unique identifier for this schema |
| description | string | Required | Instructions displayed to annotators |
| labels | array | Required | List of category labels (minimum 2) |
| total | integer | 100 | The fixed sum that all sliders must add up to |
| min_per_label | integer | 0 | Minimum value each label must receive |
| show_distribution_chart | boolean | true | Display a pie or bar chart showing the current distribution |
| label_requirement.required | boolean | false | Whether the annotation must be completed before moving on |
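The options above can be checked before a project is launched. The following is a minimal sketch, assuming the scheme has already been loaded from YAML into a plain Python dict. The helper `resolve_soft_label_scheme` is hypothetical and not part of Potato; it only mirrors the required fields and defaults listed in the table.

```python
def resolve_soft_label_scheme(scheme: dict) -> dict:
    """Fill in the documented defaults and check the documented constraints.

    Hypothetical helper for illustration; not part of Potato.
    """
    # Fields the table marks as Required.
    for field in ("annotation_type", "name", "description", "labels"):
        if field not in scheme:
            raise ValueError(f"missing required field: {field}")
    if scheme["annotation_type"] != "soft_label":
        raise ValueError('annotation_type must be "soft_label"')
    if len(scheme["labels"]) < 2:
        raise ValueError("labels needs at least 2 entries")

    # Defaults taken from the table above.
    defaults = {"total": 100, "min_per_label": 0, "show_distribution_chart": True}
    resolved = {**defaults, **scheme}

    # label_requirement.required defaults to false.
    requirement = dict(resolved.get("label_requirement") or {})
    requirement.setdefault("required", False)
    resolved["label_requirement"] = requirement
    return resolved


scheme = {
    "annotation_type": "soft_label",
    "name": "sentiment_distribution",
    "description": "Distribute 100 points across sentiment categories.",
    "labels": ["Positive", "Neutral", "Negative"],
}
print(resolve_soft_label_scheme(scheme))
```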

Examples

Sentiment Distribution

yaml
annotation_schemes:
  - annotation_type: soft_label
    name: sentiment_distribution
    description: How much does each sentiment apply to this text?
    labels: ["Positive", "Neutral", "Negative"]
    total: 100
    show_distribution_chart: true

Emotion Intensity

yaml
annotation_schemes:
  - annotation_type: soft_label
    name: emotion_mix
    description: Distribute points to reflect the mix of emotions in this utterance.
    labels: ["Joy", "Sadness", "Anger", "Fear", "Surprise", "Disgust"]
    total: 100
    min_per_label: 0
    show_distribution_chart: true

Topic Relevance

yaml
annotation_schemes:
  - annotation_type: soft_label
    name: topic_relevance
    description: How relevant is this document to each topic?
    labels: ["Politics", "Sports", "Technology", "Entertainment"]
    total: 100
    label_requirement:
      required: true

Forced Minimum Allocation

yaml
annotation_schemes:
  - annotation_type: soft_label
    name: genre_mix
    description: Allocate points across genres. Each genre must receive at least 5 points.
    labels: ["Rock", "Pop", "Jazz", "Classical", "Electronic"]
    total: 100
    min_per_label: 5
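A minimum per label reserves part of the total before the annotator moves any slider. The arithmetic can be sketched as follows; `allocation_budget` is a hypothetical helper, and whether Potato rejects an infeasible configuration is an assumption not confirmed on this page.

```python
def allocation_budget(n_labels: int, total: int = 100, min_per_label: int = 0) -> dict:
    """Work out how many points are reserved, free, and available to one label."""
    reserved = n_labels * min_per_label
    if reserved > total:
        # The minimums alone would exceed the total, so no valid allocation exists.
        raise ValueError(
            f"{n_labels} labels x {min_per_label} minimum = {reserved} > total {total}"
        )
    return {
        "reserved": reserved,                      # points locked in by the minimums
        "free": total - reserved,                  # points the annotator can move
        "max_per_label": total - (n_labels - 1) * min_per_label,
    }


genres = ["Rock", "Pop", "Jazz", "Classical", "Electronic"]
print(allocation_budget(len(genres), total=100, min_per_label=5))
# {'reserved': 25, 'free': 75, 'max_per_label': 80}
```

For the genre example, 25 of the 100 points are fixed in advance, so no single genre can receive more than 80.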

Output Format

json
{
  "sentiment_distribution": {
    "labels": {
      "Positive": 45,
      "Neutral": 30,
      "Negative": 25
    }
  }
}

Values always sum to the configured total.
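Because the values sum to the total, dividing each by the total yields a probability distribution that downstream code can consume. A minimal sketch, assuming the record has the shape shown above; `to_probabilities` is a hypothetical helper name, not a Potato function.

```python
import json


def to_probabilities(record: dict, schema_name: str, total: int = 100) -> dict:
    """Convert one soft label annotation from points to probabilities."""
    points = record[schema_name]["labels"]
    if sum(points.values()) != total:
        raise ValueError(f"values sum to {sum(points.values())}, expected {total}")
    return {label: value / total for label, value in points.items()}


raw = """
{
  "sentiment_distribution": {
    "labels": {"Positive": 45, "Neutral": 30, "Negative": 25}
  }
}
"""
record = json.loads(raw)
print(to_probabilities(record, "sentiment_distribution"))
# {'Positive': 0.45, 'Neutral': 0.3, 'Negative': 0.25}
```

The explicit sum check is a safeguard for hand-edited or merged files; pass the same `total` that the scheme was configured with.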

Best Practices

  1. Use when categories overlap: soft labels are ideal when items genuinely belong to multiple categories to varying degrees.
  2. Keep the label count manageable: more than 6-7 labels makes the slider interface unwieldy.
  3. Set a meaningful total: 100 reads naturally as percentages, but smaller totals work for simpler tasks.
  4. Use min_per_label sparingly: a forced minimum can bias results when a label truly does not apply.
  5. Enable the distribution chart: visual feedback helps annotators see their allocation at a glance.

Further Reading

For implementation details, see the source documentation.