Integrating LLMs for Smart Annotation Hints

Learn how to use OpenAI, Claude, or Gemini to provide intelligent hints and suggestions to your annotators.

By Potato Team

AI-assisted annotation can improve both speed and label quality by giving annotators a model-generated starting point to review. This guide covers integrating OpenAI, Claude, Gemini, and local models to provide those suggestions.

What LLM Integration Enables

  • Pre-annotation suggestions: AI provides initial labels for review
  • Keyword highlighting: Automatically highlight relevant terms
  • Quality hints: Flag potential annotation errors
  • Explanation generation: Help annotators understand difficult cases

Basic OpenAI Integration

annotation_task_name: "AI-Assisted Sentiment Analysis"
 
# AI configuration
ai_support:
  enabled: true
  endpoint_type: openai
  model: gpt-4
  api_key: "${OPENAI_API_KEY}"  # Read from environment variable
 
  features:
    suggestions: true
    keyword_highlighting: true
    quality_hints: false
 
# ... rest of config
annotation_schemes:
  - annotation_type: radio
    name: sentiment
    labels: [Positive, Negative, Neutral]
    ai_suggestions: true  # Enable for this scheme
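
The api_key value above is a placeholder that is resolved from the environment, so the key never lands in the config file. As a rough illustration of that kind of substitution (the helper name expand_env is hypothetical, and this is not Potato's actual loader), a minimal Python sketch might look like this:

```python
import os
import re

# Matches ${VAR_NAME} placeholders like the one in the api_key field.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value, env=None):
    """Replace each ${VAR} in value with the matching environment value.

    Raises KeyError when a referenced variable is unset, so a missing key
    fails at startup instead of at the first API call.
    """
    env = os.environ if env is None else env

    def _substitute(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(_substitute, value)


# Usage with an explicit mapping, so the example does not need a real key.
resolved = expand_env("${OPENAI_API_KEY}", env={"OPENAI_API_KEY": "sk-example"})
```

Failing loudly on an unset variable is a design choice for this sketch; a real loader may instead fall back to an empty string or a default.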

Supported Providers

OpenAI

ai_support:
  endpoint_type: openai
  model: gpt-4  # or gpt-3.5-turbo, gpt-4-turbo
  api_key: "${OPENAI_API_KEY}"
  temperature: 0.3
  max_tokens: 100

Anthropic Claude

ai_support:
  endpoint_type: anthropic
  model: claude-3-opus  # or claude-3-sonnet, claude-3-haiku
  api_key: "${ANTHROPIC_API_KEY}"
  temperature: 0.3

Google Gemini

ai_support:
  endpoint_type: gemini
  model: gemini-pro
  api_key: "${GOOGLE_API_KEY}"

Local Models (Ollama)

ai_support:
  endpoint_type: ollama
  model: llama2  # or mistral, mixtral, etc.
  endpoint: http://localhost:11434

Custom API

ai_support:
  endpoint_type: custom
  endpoint: https://your-api.com/v1/chat
  api_key: "${CUSTOM_API_KEY}"
  request_format: openai  # or anthropic
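
The provider snippets above differ mainly in which keys they need: hosted providers take an api_key, Ollama takes an endpoint, and a custom API takes both plus a request_format. The following sketch checks a parsed ai_support mapping against that pattern. The REQUIRED_KEYS table is read off the examples in this guide and the function name missing_keys is hypothetical; Potato's own validation may differ.

```python
# Required keys per endpoint_type, inferred from the snippets in this guide.
# This table is an illustration, not Potato's own validation logic.
REQUIRED_KEYS = {
    "openai": {"model", "api_key"},
    "anthropic": {"model", "api_key"},
    "gemini": {"model", "api_key"},
    "ollama": {"model", "endpoint"},
    "custom": {"endpoint", "api_key", "request_format"},
}


def missing_keys(ai_support):
    """Return the sorted list of required keys absent from ai_support."""
    endpoint_type = ai_support.get("endpoint_type")
    if endpoint_type not in REQUIRED_KEYS:
        raise ValueError(f"unknown endpoint_type: {endpoint_type!r}")
    return sorted(REQUIRED_KEYS[endpoint_type] - set(ai_support))


# An Ollama config without an endpoint is flagged before any request is made.
problems = missing_keys({"endpoint_type": "ollama", "model": "llama2"})
```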

Feature: AI Suggestions

Pre-fill annotations with AI predictions:

ai_support:
  features:
    suggestions:
      enabled: true
      show_confidence: true
      confidence_threshold: 0.7
      allow_override: true
      highlight_uncertain: true
 
annotation_schemes:
  - annotation_type: radio
    name: category
    labels: [News, Opinion, Satire, Other]
    ai_suggestions:
      enabled: true
      prompt: |
        Classify this text into one of these categories:
        - News: Factual reporting of events
        - Opinion: Personal viewpoint or editorial
        - Satire: Humorous or ironic commentary
        - Other: Doesn't fit above categories
 
        Text: {{text}}
 
        Respond with just the category name.
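
Even with "Respond with just the category name", a model may return extra whitespace, quotes, or a trailing period. The sketch below shows one way a reply could be mapped back to the configured labels before it is used as a pre-fill. The function normalize_label is hypothetical and only illustrates the idea; how Potato parses replies is not specified here.

```python
LABELS = ["News", "Opinion", "Satire", "Other"]


def normalize_label(reply, labels=LABELS):
    """Map a raw model reply to one of the configured labels, or None.

    Matching is case-insensitive and tolerates surrounding whitespace,
    quotes, and a trailing period.
    """
    cleaned = reply.strip().strip("\"'").rstrip(".").strip().lower()
    for label in labels:
        if cleaned == label.lower():
            return label
    return None  # Unrecognized reply: show no suggestion.


suggestion = normalize_label(" opinion.\n")
```

Returning None for anything that is not an exact label keeps free-form replies from being silently coerced into a category.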

Feature: Keyword Highlighting

Automatically highlight important terms:

ai_support:
  features:
    keyword_highlighting:
      enabled: true
      style: background  # or underline, bold
      color: "#FEF3C7"
 
      prompt: |
        Extract key entities and important terms from this text.
        Return as JSON array of objects with "text" and "type" fields.
 
        Text: {{text}}
 
        Types: person, organization, location, date, key_term
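
The prompt above asks for a JSON array of objects with "text" and "type" fields. To highlight those terms, they have to be located in the original text. The sketch below shows one possible way to do that, assuming the reply is valid JSON in that shape; keyword_spans is a hypothetical helper, not part of Potato.

```python
import json


def keyword_spans(text, reply):
    """Turn a JSON keyword list into character spans within text.

    Items whose "text" does not occur verbatim in the source are dropped,
    which guards against terms the model paraphrased or invented.
    """
    try:
        items = json.loads(reply)
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []

    spans = []
    for item in items:
        if not isinstance(item, dict):
            continue
        term = item.get("text")
        if not isinstance(term, str) or not term:
            continue
        # Record every non-overlapping occurrence of the term.
        start = text.find(term)
        while start != -1:
            end = start + len(term)
            spans.append({"start": start, "end": end,
                          "type": item.get("type", "key_term")})
            start = text.find(term, end)
    return sorted(spans, key=lambda span: span["start"])


sample_text = "Tim Cook visited Paris. Paris was sunny."
sample_reply = '[{"text": "Paris", "type": "location"}, {"text": "Berlin", "type": "location"}]'
highlights = keyword_spans(sample_text, sample_reply)
```

Here "Berlin" is discarded because it never appears in the text, while both occurrences of "Paris" are kept.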

Feature: Quality Hints

Warn annotators about potential issues:

ai_support:
  features:
    quality_hints:
      enabled: true
      show_as: tooltip  # or banner, sidebar
 
      checks:
        - name: sarcasm_detection
          prompt: "Is this text sarcastic? Answer yes/no with brief explanation."
          show_when: "yes"
          message: "This text may be sarcastic - consider the intended meaning"
 
        - name: ambiguity_check
          prompt: "Is the sentiment in this text ambiguous? Answer yes/no."
          show_when: "yes"
          message: "This text has ambiguous sentiment - read carefully"
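
Each check above pairs a yes/no prompt with a show_when value and a message. One plausible reading is that the message appears when the model's reply begins with the show_when value. The sketch below implements that reading; the prefix-matching rule and the function name hints_to_show are assumptions, not documented behavior.

```python
CHECKS = [
    {"name": "sarcasm_detection", "show_when": "yes",
     "message": "This text may be sarcastic - consider the intended meaning"},
    {"name": "ambiguity_check", "show_when": "yes",
     "message": "This text has ambiguous sentiment - read carefully"},
]


def hints_to_show(replies, checks=CHECKS):
    """Return the messages whose check reply starts with its show_when value.

    replies maps a check name to the raw model reply for that check.
    """
    messages = []
    for check in checks:
        reply = replies.get(check["name"], "").strip().lower()
        if reply.startswith(check["show_when"].lower()):
            messages.append(check["message"])
    return messages


shown = hints_to_show({
    "sarcasm_detection": "Yes, the praise is clearly ironic.",
    "ambiguity_check": "No.",
})
```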

Complete AI-Assisted Configuration

annotation_task_name: "AI-Assisted NER Annotation"
 
# AI Configuration
ai_support:
  enabled: true
  endpoint_type: openai
  model: gpt-4
  api_key: "${OPENAI_API_KEY}"
  temperature: 0.2
 
  # Features
  features:
    suggestions:
      enabled: true
      show_confidence: true
      auto_accept_threshold: 0.95
      review_threshold: 0.7
 
    keyword_highlighting:
      enabled: true
      style: background
 
    quality_hints:
      enabled: true
 
data_files:
  - data/texts.json
 
item_properties:
  id_key: id
  text_key: content
 
annotation_schemes:
  - annotation_type: span
    name: entities
    description: "Label named entities (AI suggestions provided)"
    labels:
      - name: PERSON
        color: "#FF6B6B"
      - name: ORG
        color: "#4ECDC4"
      - name: LOC
        color: "#45B7D1"
      - name: DATE
        color: "#96CEB4"
 
    ai_suggestions:
      enabled: true
      prompt: |
        Extract named entities from the following text.
        Return as JSON array with fields: text, start, end, label
 
        Labels:
        - PERSON: Names of people
        - ORG: Organizations, companies
        - LOC: Locations, places
        - DATE: Dates and times
 
        Text: {{text}}
 
# Display AI suggestions
display:
  ai_panel:
    show: true
    position: right
    title: "AI Suggestions"
    show_accept_all: true
    show_reject_all: true
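
The NER prompt above asks the model for text, start, end, and label fields. Character offsets produced by a language model are often unreliable, so it can help to verify them against the source text before showing a span. The following sketch is one way to do that; validate_entities is a hypothetical helper, and the realignment rule is a design choice for this example rather than Potato's behavior.

```python
def validate_entities(text, entities, labels):
    """Keep entities whose label is known and whose offsets match the text.

    If the offsets are wrong but the entity text occurs exactly once,
    the span is realigned to that occurrence; otherwise it is dropped.
    """
    valid = []
    for ent in entities:
        if ent.get("label") not in labels:
            continue
        surface = ent.get("text", "")
        if not surface:
            continue
        start, end = ent.get("start"), ent.get("end")
        offsets_ok = (isinstance(start, int) and isinstance(end, int)
                      and text[start:end] == surface)
        if offsets_ok:
            valid.append(ent)
        elif text.count(surface) == 1:
            pos = text.index(surface)
            valid.append({**ent, "start": pos, "end": pos + len(surface)})
    return valid


doc = "Apple Inc. CEO Tim Cook announced..."
raw = [
    {"text": "Apple Inc.", "start": 0, "end": 10, "label": "ORG"},
    {"text": "Tim Cook", "start": 14, "end": 22, "label": "PERSON"},   # off by one
    {"text": "Cupertino", "start": 30, "end": 39, "label": "LOC"},     # not in text
]
checked = validate_entities(doc, raw, {"PERSON", "ORG", "LOC", "DATE"})
```

In this example the first entity passes unchanged, the second is realigned to its single occurrence, and the third is dropped.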

Handling AI Responses

Accepting Suggestions

ai_suggestions:
  interaction:
    accept_key: "a"
    reject_key: "r"
    accept_all_key: "ctrl+a"
 
    # Auto-acceptance
    auto_accept:
      enabled: false  # Require human review
      threshold: 0.95
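
To make the interplay of the auto-accept and review thresholds concrete, the sketch below routes a single suggestion by its confidence score. The parameter names mirror the config keys used in this guide, but the function triage and the "hide" outcome for low-confidence suggestions are assumptions made for illustration.

```python
def triage(confidence, auto_accept_enabled=False,
           auto_accept_threshold=0.95, review_threshold=0.7):
    """Decide how to present one suggestion to the annotator."""
    if auto_accept_enabled and confidence >= auto_accept_threshold:
        return "auto_accept"   # applied without a keypress
    if confidence >= review_threshold:
        return "review"        # pre-filled, waits for accept or reject
    return "hide"              # assumed: too uncertain to pre-fill


# With auto-accept disabled, even a 0.97 suggestion still needs review.
decision = triage(0.97)
```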

Logging AI Performance

ai_support:
  logging:
    enabled: true
    log_suggestions: true
    log_acceptances: true
    log_modifications: true
    output_file: ai_performance.jsonl
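
Once logging is on, the JSONL file can be summarized offline. The record layout is not documented in this guide, so the sketch below assumes a hypothetical schema in which every line carries a "scheme" name and an "action" of "accepted", "rejected", or "modified". Adjust the field names to whatever the log actually contains.

```python
import json
from collections import Counter


def acceptance_rates(lines):
    """Compute the share of accepted suggestions per scheme.

    lines is any iterable of JSON strings, such as an open file handle.
    The "scheme" and "action" fields are an assumed log schema.
    """
    totals, accepted = Counter(), Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        scheme = record["scheme"]
        totals[scheme] += 1
        if record["action"] == "accepted":
            accepted[scheme] += 1
    return {scheme: accepted[scheme] / totals[scheme] for scheme in totals}


sample_log = [
    '{"scheme": "entities", "action": "accepted"}',
    '{"scheme": "entities", "action": "accepted"}',
    '{"scheme": "entities", "action": "rejected"}',
    '{"scheme": "entities", "action": "accepted"}',
]
rates = acceptance_rates(sample_log)
```

With a real log, the same function can be called as acceptance_rates(open("ai_performance.jsonl")).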

Custom Prompts

Create task-specific prompts:

ai_support:
  custom_prompts:
    entity_extraction: |
      You are an expert NER annotator. Extract entities from the text.
      Be precise with boundaries. Return JSON array.
      Text: {{text}}
 
    sentiment_hint: |
      Analyze this text for sentiment indicators.
      Note: Consider sarcasm, negation, and context.
      Text: {{text}}
 
annotation_schemes:
  - annotation_type: span
    name: entities
    ai_suggestions:
      prompt_key: entity_extraction
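
The prompt_key indirection above lets several schemes share one prompt. The sketch below shows how such a lookup and the {{text}} substitution could work, assuming an inline prompt takes precedence over prompt_key. Both that precedence and the helper name render_prompt are assumptions for illustration.

```python
def render_prompt(scheme, custom_prompts, text):
    """Resolve a scheme's prompt (inline or via prompt_key) and fill in {{text}}."""
    ai = scheme.get("ai_suggestions", {})
    if "prompt" in ai:
        template = ai["prompt"]
    elif "prompt_key" in ai:
        key = ai["prompt_key"]
        if key not in custom_prompts:
            raise KeyError(f"prompt_key {key!r} not found in custom_prompts")
        template = custom_prompts[key]
    else:
        raise ValueError("scheme has neither prompt nor prompt_key")
    return template.replace("{{text}}", text)


prompts = {"entity_extraction": "Extract entities from the text.\nText: {{text}}"}
scheme = {"name": "entities", "ai_suggestions": {"prompt_key": "entity_extraction"}}
rendered = render_prompt(scheme, prompts, "Tim Cook visited Paris.")
```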

Output with AI Metadata

{
  "id": "doc_001",
  "text": "Apple Inc. CEO Tim Cook announced...",
  "annotations": {
    "entities": [
      {
        "text": "Apple Inc.",
        "start": 0,
        "end": 10,
        "label": "ORG",
        "ai_suggested": true,
        "ai_confidence": 0.95,
        "human_modified": false
      }
    ]
  },
  "ai_metadata": {
    "provider": "openai",
    "model": "gpt-4",
    "suggestions_accepted": 3,
    "suggestions_rejected": 1,
    "suggestions_modified": 0
  }
}
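
The ai_suggested and human_modified flags in this output make it possible to see where each final label came from. As a small sketch that assumes span-style annotations shaped like the record above (the function name provenance_counts is hypothetical), the spans can be tallied like this:

```python
from collections import Counter


def provenance_counts(record):
    """Count spans by origin: accepted AI, modified AI, or human-only."""
    counts = Counter()
    for spans in record.get("annotations", {}).values():
        if not isinstance(spans, list):
            continue  # this sketch only handles list-valued (span) schemes
        for span in spans:
            if not span.get("ai_suggested", False):
                counts["human_only"] += 1
            elif span.get("human_modified", False):
                counts["ai_modified"] += 1
            else:
                counts["ai_accepted"] += 1
    return counts


record = {
    "id": "doc_001",
    "annotations": {
        "entities": [
            {"text": "Apple Inc.", "start": 0, "end": 10, "label": "ORG",
             "ai_suggested": True, "ai_confidence": 0.95, "human_modified": False},
        ]
    },
}
origins = provenance_counts(record)
```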

Tips for AI-Assisted Annotation

  1. Start conservative: Review every suggestion at first and keep auto-accept disabled
  2. Monitor acceptance rates: A low rate usually points to a prompt problem
  3. Iterate on prompts: Refine them based on the errors annotators correct most often
  4. Maintain human oversight: AI assists, humans decide
  5. Track AI vs. human labels: Measure AI accuracy over time
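
For the last tip, one simple measure is how often the AI's label matches the label the human finally chose, broken down by label. The sketch below assumes you have already extracted (ai_label, human_label) pairs from your exports; the function per_label_agreement is a hypothetical helper, not a Potato feature.

```python
from collections import defaultdict


def per_label_agreement(pairs):
    """For each final human label, return the fraction the AI also chose.

    pairs is an iterable of (ai_label, human_label) tuples.
    """
    totals = defaultdict(int)
    matches = defaultdict(int)
    for ai_label, human_label in pairs:
        totals[human_label] += 1
        if ai_label == human_label:
            matches[human_label] += 1
    return {label: matches[label] / totals[label] for label in totals}


pairs = [
    ("Positive", "Positive"),
    ("Positive", "Neutral"),
    ("Negative", "Negative"),
    ("Neutral", "Neutral"),
]
agreement = per_label_agreement(pairs)
```

A label with persistently low agreement is a good candidate for a clearer definition in the prompt.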

Next Steps

See the full AI documentation at /docs/features/ai-support.