# AI Support

Source: https://www.potatoannotator.com/docs/features/ai-support

Potato 2.0 includes built-in support for Large Language Models (LLMs) to assist annotators with intelligent hints, keyword highlighting, and label suggestions.

## Supported Providers

Potato supports multiple LLM providers:

**Cloud Providers:**
- **OpenAI** (GPT-4, GPT-4 Turbo, GPT-3.5)
- **Anthropic** (Claude 3, Claude 3.5)
- **Google** (Gemini 1.5 Pro, Gemini 2.0 Flash)
- **Hugging Face**
- **OpenRouter**

**Local/Self-Hosted:**
- **Ollama** (run models locally)
- **vLLM** (high-performance self-hosted inference)

## Configuration

### Basic Setup

Add an `ai_support` section to your configuration file:

```yaml
ai_support:
  enabled: true
  endpoint_type: openai

  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}
    temperature: 0.3
    max_tokens: 500
```

### Provider-Specific Configuration

#### OpenAI

```yaml
ai_support:
  enabled: true
  endpoint_type: openai

  ai_config:
    model: gpt-4o
    api_key: ${OPENAI_API_KEY}
    temperature: 0.3
    max_tokens: 500
```

#### Anthropic Claude

```yaml
ai_support:
  enabled: true
  endpoint_type: anthropic

  ai_config:
    model: claude-3-sonnet-20240229
    api_key: ${ANTHROPIC_API_KEY}
    temperature: 0.3
    max_tokens: 500
```

#### Google Gemini

```yaml
ai_support:
  enabled: true
  endpoint_type: google

  ai_config:
    model: gemini-1.5-pro
    api_key: ${GOOGLE_API_KEY}
```

#### Local Ollama

```yaml
ai_support:
  enabled: true
  endpoint_type: ollama

  ai_config:
    model: llama2
    base_url: http://localhost:11434
```

#### vLLM (Self-Hosted)

```yaml
ai_support:
  enabled: true
  endpoint_type: vllm

  ai_config:
    model: meta-llama/Llama-2-7b-chat-hf
    base_url: http://localhost:8000/v1
```
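Judging only from the examples above, the cloud providers take an `api_key` while the local providers take a `base_url`. A small pre-flight check along those lines can catch a missing key before the server starts. This is a minimal sketch, not part of Potato: `REQUIRED_KEYS` and `missing_ai_keys` are hypothetical names, and the required-key sets are inferred from the snippets in this section rather than from Potato's own validation logic.

```python
# Required ai_config keys per endpoint_type, inferred from the examples above.
# Assumption: these sets are illustrative, not Potato's actual validation rules.
REQUIRED_KEYS = {
    "openai": {"model", "api_key"},
    "anthropic": {"model", "api_key"},
    "google": {"model", "api_key"},
    "ollama": {"model", "base_url"},
    "vllm": {"model", "base_url"},
}


def missing_ai_keys(ai_support):
    """Return the sorted list of ai_config keys that appear to be missing."""
    endpoint = ai_support.get("endpoint_type")
    if endpoint not in REQUIRED_KEYS:
        # Other providers (e.g. Hugging Face, OpenRouter) are outside this sketch.
        raise ValueError(f"endpoint_type not covered by this sketch: {endpoint!r}")
    present = set(ai_support.get("ai_config", {}))
    return sorted(REQUIRED_KEYS[endpoint] - present)


ollama_cfg = {
    "enabled": True,
    "endpoint_type": "ollama",
    "ai_config": {"model": "llama2"},  # base_url intentionally left out
}
print(missing_ai_keys(ollama_cfg))
```

In this example the Ollama config is reported as lacking `base_url`. Whether Potato falls back to a default URL in that case is not stated here, so treat the check as a convenience rather than a rule.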

### Visual AI Endpoints

*New in v2.1.0*

For image and video annotation tasks, Potato supports dedicated vision endpoints including YOLO, Ollama Vision, OpenAI Vision, and Anthropic Vision. These enable object detection, pre-annotation, and visual classification.

See [Visual AI Support](/docs/features/visual-ai-support) for full configuration details.

## AI Features

Potato's AI support provides five primary capabilities:

### 1. Intelligent Hints

Provide contextual guidance to annotators without revealing the answer:

```yaml
ai_support:
  enabled: true
  endpoint_type: openai

  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}

  # Hints appear as tooltips or sidebars
  features:
    hints:
      enabled: true
```

### 2. Keyword Highlighting

Automatically highlight relevant keywords in the text:

```yaml
ai_support:
  enabled: true
  endpoint_type: openai

  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}

  features:
    keyword_highlighting:
      enabled: true
      # Highlights are rendered as box overlays on the text
```
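To render box overlays, the keywords returned by the model have to be mapped back to positions in the text. The sketch below shows one plausible way to do that with case-insensitive matching. `keyword_spans` is a hypothetical helper, and the assumption that the model returns a plain list of keyword strings is mine, not something this page specifies.

```python
import re


def keyword_spans(text, keywords):
    """Return (start, end, matched_text) for every occurrence of each keyword."""
    spans = []
    for keyword in keywords:
        # re.escape keeps punctuation in a keyword from being read as regex syntax.
        for match in re.finditer(re.escape(keyword), text, flags=re.IGNORECASE):
            spans.append((match.start(), match.end(), match.group(0)))
    return sorted(spans)  # ordered by start offset


text = "Great screen, but the battery is terrible. Battery life matters."
spans = keyword_spans(text, ["battery", "terrible"])
for start, end, matched in spans:
    print(start, end, matched)
```

Each tuple gives character offsets that a front end could turn into an overlay; overlapping keywords would need extra merging logic that this sketch leaves out.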

### 3. Label Suggestions

Suggest labels for annotator consideration (shown with confidence indicators):

```yaml
ai_support:
  enabled: true
  endpoint_type: openai

  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}

  features:
    label_suggestions:
      enabled: true
      show_confidence: true
```

### 4. Label Rationales

*New in v2.1.0*

Generate balanced explanations for why each label might apply to the text, helping annotators understand the reasoning behind different classifications:

```yaml
ai_support:
  enabled: true
  endpoint_type: openai

  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}

  features:
    rationales:
      enabled: true
```

Rationales appear as a tooltip listing each available label with an explanation of why it might apply. This is useful for training annotators or when annotation decisions are difficult.

### 5. Option Highlighting

*New in v2.2.0*

AI-assisted highlighting of the most likely options for discrete annotation tasks (radio, multiselect, likert, select). The system analyzes the content, highlights the `top_k` most likely options, and dims the rest. All options remain fully clickable.

```yaml
ai_support:
  enabled: true
  endpoint_type: openai

  ai_config:
    model: gpt-4o-mini
    api_key: ${OPENAI_API_KEY}

  option_highlighting:
    enabled: true
    top_k: 3
    dim_opacity: 0.4
    auto_apply: true
```
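The interaction between `top_k` and `dim_opacity` can be sketched as follows. This is an illustration under assumptions, not Potato's implementation: `option_opacities` is a hypothetical function, and the per-option `scores` dictionary stands in for whatever ranking the model actually returns.

```python
def option_opacities(scores, top_k=3, dim_opacity=0.4):
    """Map each option to an opacity: 1.0 for the top_k scores, dim_opacity otherwise."""
    # sorted() is stable, so ties keep the order in which options were listed.
    ranked = sorted(scores, key=scores.get, reverse=True)
    highlighted = set(ranked[:top_k])
    return {label: (1.0 if label in highlighted else dim_opacity) for label in scores}


# Hypothetical model scores for a five-point likert item.
likert_scores = {"1": 0.05, "2": 0.10, "3": 0.20, "4": 0.40, "5": 0.25}
opacities = option_opacities(likert_scores, top_k=3, dim_opacity=0.4)
print(opacities)
```

Dimmed options only change appearance in this model; nothing is removed from the returned mapping, which mirrors the point that every option stays selectable.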

See [Option Highlighting](/docs/features/option-highlighting) for full configuration details.

### Complementary: Diversity Ordering

*New in v2.2.0*

While not strictly an AI feature, [Diversity Ordering](/docs/features/diversity-ordering) uses sentence-transformer embeddings to cluster items and present them in a diverse order, reducing annotator fatigue and improving coverage. It integrates with AI support by automatically prefetching AI hints for reordered items.
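The ordering step can be pictured as a round-robin over clusters. The sketch below assumes the embedding and clustering have already happened and that each item carries a cluster id. `diverse_order` and the sample ids are hypothetical, and Potato's actual ordering strategy may differ.

```python
from collections import defaultdict, deque


def diverse_order(cluster_of):
    """Interleave item ids so consecutive items come from different clusters when possible.

    cluster_of maps item id -> cluster id (assumed to be computed elsewhere).
    """
    queues = defaultdict(deque)
    for item_id, cluster in cluster_of.items():
        queues[cluster].append(item_id)

    ordered = []
    pending = deque(queues.values())
    while pending:
        queue = pending.popleft()
        ordered.append(queue.popleft())
        if queue:  # clusters with items left go to the back of the line
            pending.append(queue)
    return ordered


cluster_of = {"a1": 0, "a2": 0, "a3": 0, "b1": 1, "b2": 1, "c1": 2}
print(diverse_order(cluster_of))
# → ['a1', 'b1', 'c1', 'a2', 'b2', 'a3']
```

Once the order is fixed, the items near the front of this list are the natural candidates for hint prefetching.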

## Caching and Performance

AI responses can be cached to improve performance and reduce API costs:

```yaml
ai_support:
  enabled: true
  endpoint_type: openai

  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}

  cache_config:
    disk_cache:
      enabled: true
      path: "ai_cache/cache.json"

    # Pre-generate hints on startup and prefetch hints for nearby instances
    prefetch:
      warm_up_page_count: 100
      on_next: 5
      on_prev: 2
```

### Caching Strategies

1. **Warmup**: Pre-generates AI hints for an initial batch of instances when the server starts (`warm_up_page_count`)
2. **Prefetch**: Generates hints for upcoming instances as annotators navigate forward (`on_next`) or backward (`on_prev`)
3. **Disk Persistence**: Caches are saved to disk and persist across server restarts
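The three strategies above can be sketched together in a few lines. This is a simplified model, not Potato's cache: `HintCache`, `prefetch_window`, and `fake_llm` are hypothetical names, the cache is a single JSON file keyed by instance index, and reading `on_next`/`on_prev` as "this many instances on each side of the current one" is an assumption.

```python
import json
import os
import tempfile


class HintCache:
    """Disk-backed hint cache; entries survive a restart because they live in a JSON file."""

    def __init__(self, path):
        self.path = path
        self.entries = {}
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                self.entries = json.load(f)

    def get_or_generate(self, instance_id, generate):
        key = str(instance_id)  # JSON object keys are always strings
        if key not in self.entries:
            self.entries[key] = generate(instance_id)
            self._save()
        return self.entries[key]

    def _save(self):
        os.makedirs(os.path.dirname(self.path) or ".", exist_ok=True)
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(self.entries, f)


def prefetch_window(current, total, on_next, on_prev):
    """Indices to prefetch around the current instance, clipped to the dataset bounds."""
    start = max(0, current - on_prev)
    end = min(total, current + on_next + 1)
    return [i for i in range(start, end) if i != current]


calls = []


def fake_llm(instance_id):
    """Stand-in for a real provider call; records which instances triggered a request."""
    calls.append(instance_id)
    return f"hint for {instance_id}"


cache_path = os.path.join(tempfile.mkdtemp(), "ai_cache", "cache.json")
cache = HintCache(cache_path)

for i in range(3):  # warmup, as if warm_up_page_count were 3
    cache.get_or_generate(i, fake_llm)

for i in prefetch_window(current=2, total=10, on_next=5, on_prev=2):
    cache.get_or_generate(i, fake_llm)  # only uncached instances hit the model

restarted = HintCache(cache_path)  # simulates a server restart
print(len(calls), len(restarted.entries))
```

Here the warmup covers instances 0 to 2, the prefetch adds 3 to 7, and the reloaded cache answers all eight without another model call.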

## Custom Prompts

Potato includes default prompts for each annotation type, stored in `potato/ai/prompt/`. You can customize these for your specific task:

| Annotation Type | Prompt File |
|-----------------|-------------|
| Radio buttons | `radio_prompt.txt` |
| Likert scales | `likert_prompt.txt` |
| Checkboxes | `checkbox_prompt.txt` |
| Span annotation | `span_prompt.txt` |
| Sliders | `slider_prompt.txt` |
| Dropdowns | `dropdown_prompt.txt` |
| Number input | `number_prompt.txt` |
| Text input | `text_prompt.txt` |

Prompts support variable substitution:
- `{text}` - The document text
- `{labels}` - Available labels for the scheme
- `{description}` - The scheme description
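As a rough picture of how these placeholders get filled, the sketch below renders a template with Python's `str.format`. The template text and `render_prompt` are hypothetical, and whether Potato uses `str.format` or another substitution mechanism is an assumption.

```python
def render_prompt(template, text, labels, description):
    """Fill the {text}, {labels}, and {description} placeholders in a prompt template."""
    return template.format(
        text=text,
        labels=", ".join(labels),
        description=description,
    )


# Hypothetical template; the shipped prompt files will read differently.
template = (
    "Task: {description}\n"
    "Labels: {labels}\n"
    "Text: {text}\n"
    "Give a short hint without naming a label."
)

prompt = render_prompt(
    template,
    text="The battery died after two days.",
    labels=["Positive", "Negative", "Neutral"],
    description="What is the sentiment of this review?",
)
print(prompt)
```

Under `str.format`, any literal braces in a template (for example a JSON snippet) would need to be doubled as `{{` and `}}`; check a customized prompt against real data before relying on it.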

## Multi-Schema Support

For tasks with multiple annotation schemes, you can enable AI support selectively:

```yaml
ai_support:
  enabled: true
  endpoint_type: openai

  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}

  # Only enable for specific schemes
  special_include:
    - page: 1
      schema: sentiment
    - page: 1
      schema: topics
```
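The filtering implied by `special_include` amounts to a lookup on (page, schema) pairs. The sketch below is one possible reading: `ai_enabled_for` is a hypothetical helper, and treating a missing list as "enabled everywhere" is an assumption rather than documented behavior.

```python
def ai_enabled_for(page, schema, special_include=None):
    """Return True if AI support should run for this page/schema pair."""
    if not special_include:
        # Assumption: with no special_include list, every scheme gets AI support.
        return True
    return any(
        entry.get("page") == page and entry.get("schema") == schema
        for entry in special_include
    )


special_include = [
    {"page": 1, "schema": "sentiment"},
    {"page": 1, "schema": "topics"},
]

print(ai_enabled_for(1, "sentiment", special_include))  # listed pair
print(ai_enabled_for(1, "toxicity", special_include))   # schema not listed
```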

## Full Example

Complete configuration for AI-assisted sentiment analysis:

```yaml
annotation_task_name: "AI-Assisted Sentiment Analysis"
task_dir: "."
port: 8000

data_files:
  - "data/reviews.json"

item_properties:
  id_key: id
  text_key: text

annotation_schemes:
  - annotation_type: radio
    name: sentiment
    description: "What is the sentiment of this review?"
    labels:
      - Positive
      - Negative
      - Neutral

ai_support:
  enabled: true
  endpoint_type: openai

  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}
    temperature: 0.3
    max_tokens: 500

  features:
    hints:
      enabled: true
    keyword_highlighting:
      enabled: true
      # Highlights are rendered as box overlays on the text
    label_suggestions:
      enabled: true
      show_confidence: true

  cache_config:
    disk_cache:
      enabled: true
      path: "ai_cache/cache.json"
    prefetch:
      warm_up_page_count: 50
      on_next: 3
      on_prev: 2

output_annotation_dir: "output/"
export_annotation_format: "json"
user_config:
  allow_all_users: true
```

## Environment Variables

Store API keys securely using environment variables:

```bash
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GOOGLE_API_KEY="..."
```

Reference them in your config with `${VARIABLE_NAME}` syntax.
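For intuition, `${VARIABLE_NAME}` substitution over a loaded config can be sketched as below. This is not Potato's loader: `expand_env` and the regular expression are hypothetical, and raising an error on an unset variable is a design choice made for the example.

```python
import os
import re

# Matches ${VARIABLE_NAME}; the exact pattern Potato accepts is an assumption.
_ENV_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value, environ=None):
    """Recursively replace ${VAR} references in strings, lists, and dicts."""
    environ = os.environ if environ is None else environ
    if isinstance(value, str):
        def substitute(match):
            name = match.group(1)
            if name not in environ:
                raise KeyError(f"environment variable {name} is not set")
            return environ[name]
        return _ENV_PATTERN.sub(substitute, value)
    if isinstance(value, dict):
        return {key: expand_env(item, environ) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env(item, environ) for item in value]
    return value  # numbers, booleans, and None pass through unchanged


config = {
    "ai_config": {
        "model": "gpt-4",
        "api_key": "${OPENAI_API_KEY}",
        "temperature": 0.3,
    }
}
# A fake environment keeps the example independent of the real shell.
resolved = expand_env(config, {"OPENAI_API_KEY": "sk-example"})
print(resolved["ai_config"]["api_key"])
# → sk-example
```

Failing loudly on a missing variable makes a forgotten `export` show up at startup rather than as an authentication error on the first AI call.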

## Cost Considerations

- AI calls are made per instance by default
- Enable caching to reduce repeated API calls
- Use warmup and prefetch to pre-generate hints
- Consider using smaller/cheaper models for simple tasks
- Local providers (Ollama, vLLM) have no per-call API fees, though you supply the hardware
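A back-of-the-envelope estimate can make the effect of caching concrete. The model below is deliberately crude and rests on assumptions: without a cache, every annotator view of an instance triggers one call per enabled feature, and with a cache each (instance, feature) pair is generated once. The function names, token count, and price are placeholders, not real provider rates.

```python
def estimated_calls(instances, annotators_per_instance, features, cached):
    """Estimate API calls under the simplified model described above."""
    views = instances if cached else instances * annotators_per_instance
    return views * features


def estimated_cost(calls, tokens_per_call, price_per_1k_tokens):
    """Convert a call count into a cost, given placeholder token and price figures."""
    return calls * tokens_per_call / 1000 * price_per_1k_tokens


uncached = estimated_calls(instances=1000, annotators_per_instance=3, features=2, cached=False)
cached = estimated_calls(instances=1000, annotators_per_instance=3, features=2, cached=True)
print(uncached, cached)

# Placeholder figures only; substitute your provider's real pricing.
print(estimated_cost(uncached, tokens_per_call=500, price_per_1k_tokens=0.01))
print(estimated_cost(cached, tokens_per_call=500, price_per_1k_tokens=0.01))
```

Under these assumptions the savings scale with the number of annotators per instance, which is why caching matters most on multiply-annotated datasets.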

## Best Practices

1. **Use AI as assistance, not replacement** - Let annotators make final decisions
2. **Enable caching for production** - Reduces latency and costs
3. **Test prompts thoroughly** - Custom prompts should be validated
4. **Monitor API costs** - Track usage especially with cloud providers
5. **Consider local providers** - Ollama or vLLM for high-volume annotation
6. **Protect API credentials** - Use environment variables, never commit keys

## Further Reading

- [Option Highlighting](/docs/features/option-highlighting) - AI-assisted option guidance
- [Diversity Ordering](/docs/features/diversity-ordering) - Embedding-based item diversification
- [Visual AI Support](/docs/features/visual-ai-support) - AI for image and video annotation
- [ICL Labeling](/docs/features/icl-labeling) - AI-assisted in-context learning
- [Active Learning](/docs/features/active-learning) - ML-based instance prioritization
- [Productivity Features](/docs/features/productivity) - Keyword highlights and suggestions

For implementation details and custom prompt templates, see the [source documentation](https://github.com/davidjurgens/potato/blob/main/docs/ai_support.md).