# Automatic Keyword Highlighting

Source: https://www.potatoannotator.com/blog/keyword-highlighting-setup

AI-powered keyword highlighting pulls the annotator's eye toward the terms, entities, or patterns that matter in a piece of text. This guide walks through Potato's built-in AI support and how to configure it so relevant keywords are highlighted automatically.

## Why use keyword highlighting?

Keyword highlighting guides annotators to the parts of the text that matter, so they find key information faster and are less likely to skim past an important term. Because the highlighting comes from an LLM, it can adapt to the context of each item instead of relying on a fixed word list.

For how Potato's option and keyword highlighting works under the hood, see the [source documentation](https://github.com/davidjurgens/potato/blob/master/docs/ai-intelligence/option_highlighting.md).

## Basic AI-powered highlighting

Potato leans on its AI support system to find and highlight important keywords. Here is a minimal config:

```yaml
annotation_task_name: "Keyword Highlighted Annotation"

data_files:
  - path: "data/reviews.json"
    format: json

item_properties:
  id_key: id
  text_key: text

annotation_schemes:
  - annotation_type: radio
    name: sentiment
    description: "What is the overall sentiment?"
    labels:
      - Positive
      - Negative
      - Neutral

ai_support:
  enabled: true
  endpoint_type: openai

  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}
    temperature: 0.3
    max_tokens: 500

  features:
    keyword_highlighting:
      enabled: true
      # Highlights are rendered as box overlays on the text
```

When AI keyword highlighting is enabled, relevant terms are automatically highlighted in the annotation text:

![AI-powered keyword highlighting in the annotation interface](/images/blog/keyword-highlights.png "Important keywords and entities are automatically highlighted to guide annotator attention")
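
The config above expects every item in `data/reviews.json` to carry an `id` and a `text` field, matching `id_key` and `text_key`. As a minimal sketch, the following writes a small sample file in that shape. It assumes one JSON object per line, which is an assumption about the loader rather than something this guide states, so check it against the Potato data format documentation. The `write_items` helper and the sample reviews are purely illustrative.

```python
import json
from pathlib import Path


def write_items(path, items, id_key="id", text_key="text"):
    """Write items as one JSON object per line (assumed layout, verify against the Potato docs)."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as handle:
        for item in items:
            # Every item needs both keys named under item_properties.
            missing = [key for key in (id_key, text_key) if key not in item]
            if missing:
                raise ValueError(f"item {item!r} is missing {missing}")
            handle.write(json.dumps(item, ensure_ascii=False) + "\n")
    return path


# Illustrative sample data, not taken from a real dataset.
SAMPLE_REVIEWS = [
    {"id": "review_001", "text": "The battery lasts two days and the screen is superb."},
    {"id": "review_002", "text": "Shipping was slow and the case arrived cracked."},
]
```

Calling `write_items("data/reviews.json", SAMPLE_REVIEWS)` produces a file that the `data_files` entry above can point at.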

## Using different AI providers

### OpenAI

```yaml
ai_support:
  enabled: true
  endpoint_type: openai

  ai_config:
    model: gpt-4o
    api_key: ${OPENAI_API_KEY}
    temperature: 0.3
    max_tokens: 500

  features:
    keyword_highlighting:
      enabled: true
```

### Anthropic Claude

```yaml
ai_support:
  enabled: true
  endpoint_type: anthropic

  ai_config:
    model: claude-3-sonnet-20240229
    api_key: ${ANTHROPIC_API_KEY}
    temperature: 0.3
    max_tokens: 500

  features:
    keyword_highlighting:
      enabled: true
      # Highlights are rendered as box overlays on the text
```
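
Both hosted-provider configs read the API key from a `${...}` placeholder, so an unset environment variable is an easy way to end up with failing AI calls. The sketch below scans a config file's text for such placeholders and reports any that are unset. It assumes the placeholders name environment variables; `missing_env_vars` is a hypothetical helper, not part of Potato.

```python
import os
import re

# Matches ${NAME} placeholders such as ${OPENAI_API_KEY}.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def missing_env_vars(config_text, env=None):
    """Return placeholder names in config_text that are unset or empty in env, sorted."""
    env = os.environ if env is None else env
    names = set(PLACEHOLDER.findall(config_text))
    return sorted(name for name in names if not env.get(name))
```

Running it on the config text before launch is enough: an empty list means every referenced variable is set, and anything else names the variables still to export.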

### Local Ollama (no API costs)

```yaml
ai_support:
  enabled: true
  endpoint_type: ollama

  ai_config:
    model: llama2
    base_url: http://localhost:11434

  features:
    keyword_highlighting:
      enabled: true
      # Highlights are rendered as box overlays on the text
```
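
With a local endpoint, the usual failure is that the Ollama server is not running or the model has not been pulled yet. As a rough preflight sketch, the code below queries Ollama's `/api/tags` endpoint, which lists locally available models, and checks for the configured one. The helper names are hypothetical, and treating a bare name such as `llama2` as `llama2:latest` is an assumption based on Ollama's default tag.

```python
import json
from urllib.request import urlopen


def fetch_tags(base_url="http://localhost:11434", timeout=5):
    """Fetch the list of locally available models from a running Ollama server."""
    with urlopen(f"{base_url.rstrip('/')}/api/tags", timeout=timeout) as response:
        return json.load(response)


def model_is_available(tags_payload, model):
    """Check whether `model` appears in an /api/tags payload."""
    # Assumption: a name without a tag refers to the ":latest" tag.
    wanted = model if ":" in model else f"{model}:latest"
    names = {entry.get("name") for entry in tags_payload.get("models", [])}
    return wanted in names
```

If `model_is_available(fetch_tags(), "llama2")` returns `False`, running `ollama pull llama2` first is the likely fix. A connection error from `fetch_tags` suggests the server is not running at `base_url`.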

## Combining features

The AI features stack, and they tend to be more useful together than alone:

```yaml
ai_support:
  enabled: true
  endpoint_type: openai

  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}
    temperature: 0.3
    max_tokens: 500

  features:
    # Highlight important keywords
    keyword_highlighting:
      enabled: true
      # Highlights are rendered as box overlays on the text

    # Show contextual hints
    hints:
      enabled: true

    # Suggest labels for consideration
    label_suggestions:
      enabled: true
      show_confidence: true
```

## Complete configuration example

Here is a full config for entity-aware annotation with AI highlighting:

```yaml
annotation_task_name: "Entity-Aware Annotation"

data_files:
  - path: "data/documents.json"
    format: json

item_properties:
  id_key: id
  text_key: text

annotation_schemes:
  - annotation_type: span
    name: entities
    labels:
      - name: PERSON
        color: "#FECACA"
      - name: ORG
        color: "#BBF7D0"
      - name: LOCATION
        color: "#BFDBFE"

ai_support:
  enabled: true
  endpoint_type: openai

  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}
    temperature: 0.3
    max_tokens: 500

  features:
    keyword_highlighting:
      enabled: true
      # Highlights are rendered as box overlays on the text
    hints:
      enabled: true
    label_suggestions:
      enabled: true
      show_confidence: true

  cache_config:
    disk_cache:
      enabled: true
      path: "ai_cache/cache.json"
    prefetch:
      warm_up_page_count: 50
      on_next: 3
      on_prev: 2

output_annotation_dir: "output/"
export_annotation_format: json
allow_all_users: true
```

## Caching for performance

Turn on caching to cut down on API calls and speed up responses:

```yaml
ai_support:
  enabled: true
  endpoint_type: openai

  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}

  features:
    keyword_highlighting:
      enabled: true

  cache_config:
    disk_cache:
      enabled: true
      path: "ai_cache/cache.json"

    # Pre-generate highlights on startup and prefetch nearby items
    prefetch:
      warm_up_page_count: 100
      on_next: 5
      on_prev: 2
```
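
To see why a disk cache saves API calls, here is a conceptual sketch of one. Results are keyed by a hash of the model name and item text and stored in a single JSON file, so a repeated item never triggers a second request, even after a restart. This illustrates the idea only; it is not Potato's cache implementation, and `DiskCache` and the `compute` callback are hypothetical names.

```python
import hashlib
import json
from pathlib import Path


class DiskCache:
    """Tiny JSON-file cache keyed by a hash of (model, text). Illustrative only."""

    def __init__(self, path):
        self.path = Path(path)
        # Load earlier results if the cache file already exists.
        if self.path.exists():
            self.entries = json.loads(self.path.read_text(encoding="utf-8"))
        else:
            self.entries = {}

    @staticmethod
    def key(model, text):
        """Derive a stable key so the same model and text always map to one entry."""
        return hashlib.sha256(f"{model}\n{text}".encode("utf-8")).hexdigest()

    def get_or_compute(self, model, text, compute):
        """Return the cached result, calling compute(text) only on a miss."""
        cache_key = self.key(model, text)
        if cache_key not in self.entries:
            self.entries[cache_key] = compute(text)
            # Persist after every miss so results survive a restart.
            self.path.parent.mkdir(parents=True, exist_ok=True)
            self.path.write_text(json.dumps(self.entries), encoding="utf-8")
        return self.entries[cache_key]
```

Including the model name in the key means that switching models produces fresh results instead of silently reusing highlights generated by a different model.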

## Tips

Pick highlight colors that sit well next to your annotation scheme rather than fighting it. Keep caching on so you are not paying for the same content twice. If you are annotating at high volume, Ollama runs locally and skips the API bill entirely. And remember the features stack: keyword highlighting pairs naturally with hints and label suggestions.

---

*Full documentation at [/docs/features/ai-support](/docs/features/ai-support).*
