Integrating LLMs for Smart Annotation Hints

AI-assisted annotation can improve both annotation speed and quality. This guide covers configuring Potato to use OpenAI, Anthropic Claude, Google Gemini, and local models (via Ollama) to provide intelligent suggestions to your annotators.

What LLM Integration Enables

Pre-annotation suggestions: AI provides initial labels for review
Keyword highlighting: Automatically highlight relevant terms
Quality hints: Flag potential annotation errors
Explanation generation: Help annotators understand difficult cases

Basic OpenAI Integration

yaml

annotation_task_name: "AI-Assisted Sentiment Analysis"
 
# AI configuration
ai_support:
  enabled: true
  endpoint_type: openai
 
  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}
    temperature: 0.3
    max_tokens: 500
 
  features:
    hints:
      enabled: true
    keyword_highlighting:
      enabled: true
    label_suggestions:
      enabled: true
 
# ... rest of config
annotation_schemes:
  - annotation_type: radio
    name: sentiment
    labels: [Positive, Negative, Neutral]
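
The configuration above reads the API key from a `${OPENAI_API_KEY}` placeholder, so a missing environment variable is an easy mistake to make before launch. The following is a minimal pre-flight sketch in Python, not part of Potato itself: the function name `missing_env_vars` is hypothetical, and it assumes only that placeholders use the `${NAME}` form shown in the examples.

```python
import os
import re

# Matches ${NAME} placeholders such as ${OPENAI_API_KEY}.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def missing_env_vars(config_text, env=None):
    """Return sorted names of ${VAR} placeholders that are unset or empty.

    Hypothetical helper: it scans raw config text and does not parse YAML.
    """
    env = os.environ if env is None else env
    names = set(PLACEHOLDER.findall(config_text))
    return sorted(name for name in names if not env.get(name))


if __name__ == "__main__":
    sample = "api_key: ${OPENAI_API_KEY}\nmodel: gpt-4\n"
    # Pass an explicit env mapping here; omit it to check os.environ.
    print(missing_env_vars(sample, env={}))
```

Running a check like this against your config file before starting the server surfaces a missing key early rather than on the first AI request.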

Supported Providers

OpenAI

yaml

ai_support:
  enabled: true
  endpoint_type: openai
 
  ai_config:
    model: gpt-4  # or gpt-4o, gpt-3.5-turbo
    api_key: ${OPENAI_API_KEY}
    temperature: 0.3
    max_tokens: 500

Anthropic Claude

yaml

ai_support:
  enabled: true
  endpoint_type: anthropic
 
  ai_config:
    model: claude-3-sonnet-20240229
    api_key: ${ANTHROPIC_API_KEY}
    temperature: 0.3
    max_tokens: 500

Google Gemini

yaml

ai_support:
  enabled: true
  endpoint_type: google
 
  ai_config:
    model: gemini-1.5-pro
    api_key: ${GOOGLE_API_KEY}

Local Models (Ollama)

yaml

ai_support:
  enabled: true
  endpoint_type: ollama
 
  ai_config:
    model: llama2  # or mistral, mixtral, etc.
    base_url: http://localhost:11434

Feature: Label Suggestions

AI models can suggest labels for annotator consideration:

yaml

ai_support:
  enabled: true
  endpoint_type: openai
 
  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}
 
  features:
    label_suggestions:
      enabled: true
      show_confidence: true
 
annotation_schemes:
  - annotation_type: radio
    name: category
    labels: [News, Opinion, Satire, Other]

Feature: Keyword Highlighting

Automatically highlight important terms:

yaml

ai_support:
  enabled: true
  endpoint_type: openai
 
  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}
 
  features:
    keyword_highlighting:
      enabled: true

Feature: Intelligent Hints

Provide contextual guidance to annotators:

yaml

ai_support:
  enabled: true
  endpoint_type: openai
 
  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}
 
  features:
    hints:
      enabled: true

Hints appear as contextual guidance without revealing the answer, helping annotators think through difficult cases.

Complete AI-Assisted Configuration

yaml

annotation_task_name: "AI-Assisted NER Annotation"
 
# AI Configuration
ai_support:
  enabled: true
  endpoint_type: openai
 
  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}
    temperature: 0.2
    max_tokens: 500
 
  features:
    hints:
      enabled: true
    keyword_highlighting:
      enabled: true
    label_suggestions:
      enabled: true
      show_confidence: true
 
  cache_config:
    disk_cache:
      enabled: true
      path: "ai_cache/cache.json"
    prefetch:
      warm_up_page_count: 50
      on_next: 5
      on_prev: 2
 
data_files:
  - data/texts.json
 
item_properties:
  id_key: id
  text_key: content
 
annotation_schemes:
  - annotation_type: span
    name: entities
    description: "Label named entities (AI suggestions provided)"
    labels:
      - name: PERSON
        color: "#FF6B6B"
      - name: ORG
        color: "#4ECDC4"
      - name: LOC
        color: "#45B7D1"
      - name: DATE
        color: "#96CEB4"
 
output_annotation_dir: "annotation_output/"
output_annotation_format: "json"
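
The `prefetch` settings in the configuration above are not explained further in this guide. The Python sketch below shows one plausible reading, stated as an assumption: `warm_up_page_count` is the number of items prepared at startup, `on_next` is how many upcoming items are prepared when the annotator moves forward, and `on_prev` is how many earlier items are prepared when they move back. The function names are hypothetical and this is not Potato's implementation.

```python
def warm_up_indices(total, warm_up_page_count=50):
    """Indices of the items assumed to be prefetched at startup."""
    return list(range(min(total, warm_up_page_count)))


def prefetch_indices(current, total, direction, on_next=5, on_prev=2):
    """Indices assumed to be prefetched after navigating to `current`.

    direction is "next" (moved forward) or "prev" (moved back).
    Indices outside [0, total) are dropped.
    """
    if direction == "next":
        start, stop = current + 1, current + 1 + on_next
    elif direction == "prev":
        start, stop = current - on_prev, current
    else:
        raise ValueError("direction must be 'next' or 'prev'")
    return [i for i in range(start, stop) if 0 <= i < total]
```

Under this reading, larger values trade more up-front API calls for less waiting while annotating.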

Working with AI Suggestions

When AI support is enabled, annotators see suggestions alongside the annotation interface. They can accept, modify, or ignore the AI's recommendations. The final annotation always reflects the annotator's decision, ensuring human oversight.

When caching is enabled, AI responses are stored and reused, so revisiting the same instance does not trigger another API call.
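
To illustrate the caching idea, here is a minimal Python sketch of a JSON-backed disk cache. It is an assumption-laden illustration rather than Potato's cache: the class name `JsonDiskCache`, the key layout, and the `compute` callback are all hypothetical.

```python
import hashlib
import json
from pathlib import Path


class JsonDiskCache:
    """Tiny JSON-file cache keyed by feature, instance id, and text hash."""

    def __init__(self, path):
        self.path = Path(path)
        # Load previous results if the cache file already exists.
        self.data = json.loads(self.path.read_text()) if self.path.exists() else {}

    @staticmethod
    def key(instance_id, feature, text):
        # Hashing the text means an edited instance gets a fresh entry.
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]
        return f"{feature}:{instance_id}:{digest}"

    def get_or_compute(self, instance_id, feature, text, compute):
        """Return the cached value, calling compute(text) only on a miss."""
        k = self.key(instance_id, feature, text)
        if k not in self.data:
            self.data[k] = compute(text)  # must return JSON-serializable data
            self.path.parent.mkdir(parents=True, exist_ok=True)
            self.path.write_text(json.dumps(self.data))
        return self.data[k]
```

Here `compute` stands in for whatever function calls the model. Rewriting the whole file on every miss keeps the sketch short but would be slow for large caches.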

Custom Prompts

Potato includes default prompts for each annotation type, stored in potato/ai/prompt/. You can customize these by editing the prompt files:

Annotation Type	Prompt File
Radio buttons	`radio_prompt.txt`
Likert scales	`likert_prompt.txt`
Checkboxes	`checkbox_prompt.txt`
Span annotation	`span_prompt.txt`
Text input	`text_prompt.txt`

Prompts support variable substitution with {text}, {labels}, and {description}.
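
As a rough illustration of the substitution, the Python sketch below fills `{text}`, `{labels}`, and `{description}` in a template string. How Potato performs the substitution internally is not documented here, so treat `render_prompt` as a hypothetical helper. It replaces only those three names in a single pass, which leaves other braces (for example a JSON sample in the prompt) untouched.

```python
import re

# Only the three documented variables are substituted.
VARIABLE = re.compile(r"\{(text|labels|description)\}")


def render_prompt(template, text, labels, description=""):
    """Fill {text}, {labels}, and {description} in a prompt template."""
    values = {
        "text": text,
        "labels": ", ".join(labels),
        "description": description,
    }
    # A single regex pass avoids re-substituting braces inside the item text.
    return VARIABLE.sub(lambda m: values[m.group(1)], template)


if __name__ == "__main__":
    template = "Task: {description}\nOptions: {labels}\nText: {text}"
    print(render_prompt(template, "Great phone!", ["Positive", "Negative"], "Rate sentiment"))
```

Rendering a few real items this way before editing a prompt file makes it easier to see exactly what the model will receive.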

Tips for AI-Assisted Annotation

Start conservative: Review all suggestions initially
Monitor acceptance rates: Low rates often point to prompt issues
Iterate on prompts: Refine based on common errors
Maintain human oversight: AI assists, humans decide
Track AI vs human labels: Measure AI accuracy over time
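
The monitoring tips above can be sketched in Python as follows. The record layout (`ai_label` and `human_label` keys) is an assumption made for illustration and is not Potato's output format, so adapt the field names to your exported annotations.

```python
from collections import Counter


def acceptance_rate(records):
    """Fraction of AI-suggested items where the final label matches the suggestion.

    Returns None when no record carries an AI suggestion.
    """
    suggested = [r for r in records if r.get("ai_label") is not None]
    if not suggested:
        return None
    accepted = sum(1 for r in suggested if r["ai_label"] == r["human_label"])
    return accepted / len(suggested)


def top_disagreements(records, n=3):
    """Most common (ai_label, human_label) pairs where the annotator overrode the AI."""
    pairs = Counter(
        (r["ai_label"], r["human_label"])
        for r in records
        if r.get("ai_label") is not None and r["ai_label"] != r["human_label"]
    )
    return pairs.most_common(n)
```

The disagreement pairs are often more useful than the rate itself, since a recurring confusion between two labels suggests what to change in the prompt.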

New in v2.2: Option Highlighting

Potato 2.2 adds Option Highlighting, an AI feature that analyzes each item and highlights the most likely correct options for discrete annotation tasks (radio, multiselect, likert). The top-k options are marked with a star indicator and the less likely options are dimmed; all options remain fully clickable.

yaml

ai_support:
  option_highlighting:
    enabled: true
    top_k: 3
    dim_opacity: 0.4
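
To make the effect of `top_k` and `dim_opacity` concrete, here is a minimal Python sketch of the selection logic. It assumes the model yields a likelihood score per option, which this guide does not specify, and `highlight_options` is a hypothetical name rather than a Potato function.

```python
def highlight_options(scores, top_k=3, dim_opacity=0.4):
    """Map each option to display hints based on hypothetical likelihood scores.

    scores: dict of option label -> score (higher means more likely).
    The top_k options are starred at full opacity and the rest are dimmed.
    Every option stays clickable.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    top = set(ranked[:top_k])
    return {
        label: {
            "starred": label in top,
            "opacity": 1.0 if label in top else dim_opacity,
            "clickable": True,
        }
        for label in scores
    }
```

Because dimmed options remain clickable, the highlighting guides attention without restricting the annotator's choice.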

Read the full Option Highlighting documentation →

Next Steps

Enable active learning to prioritize uncertain items
Set up quality control with AI metrics
Learn about local models for privacy
Explore option highlighting for guided annotation

The full AI documentation is at /docs/features/ai-support.