Automatic Keyword Highlighting

Configure AI-powered keyword highlighting to draw annotator attention to important terms and phrases.

By Potato Team

AI-powered keyword highlighting draws annotator attention to important terms, entities, or patterns in text. This guide covers how to configure Potato's built-in AI support to automatically highlight relevant keywords.

Why Use Keyword Highlighting?

  • Focus attention: Guide annotators to relevant content
  • Improve speed: Faster identification of key information
  • Reduce errors: Less likely to miss important terms
  • Leverage AI: Let LLMs identify context-specific keywords

Basic AI-Powered Highlighting

Potato's AI support system identifies important keywords in each item and highlights them for the annotator. The basic configuration below enables highlighting for a sentiment classification task:

annotation_task_name: "Keyword Highlighted Annotation"
 
data_files:
  - path: "data/reviews.json"
    format: json
 
item_properties:
  id_key: id
  text_key: text
 
annotation_schemes:
  - annotation_type: radio
    name: sentiment
    description: "What is the overall sentiment?"
    labels:
      - Positive
      - Negative
      - Neutral
 
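ai_support:
  enabled: true
  endpoint_type: openai            # which AI provider to call
 
  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}     # placeholder for the OPENAI_API_KEY environment variable
    temperature: 0.3               # low temperature for more consistent output
    max_tokens: 500                # upper bound on the length of each response
 
  features:
    keyword_highlighting:
      enabled: true
      highlight_color: "#FEF3C7"   # hex color used for highlighted keywords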
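
The configuration above expects every item in the data file to carry the fields named by id_key and text_key. Before starting a task, it can help to check the data for missing or duplicate values. The following is a minimal sketch, not part of Potato: the function names and the sample records are hypothetical, and it assumes the file is either a single JSON array or one JSON object per line.

```python
import json


def load_items(raw: str) -> list[dict]:
    """Parse a JSON array, or fall back to one JSON object per line."""
    raw = raw.strip()
    if raw.startswith("["):
        return json.loads(raw)
    return [json.loads(line) for line in raw.splitlines() if line.strip()]


def find_problems(items, id_key="id", text_key="text"):
    """Return readable problems; an empty list means the data looks usable."""
    problems = []
    seen_ids = set()
    for index, item in enumerate(items):
        if id_key not in item:
            problems.append(f"item {index}: missing '{id_key}'")
        elif item[id_key] in seen_ids:
            problems.append(f"item {index}: duplicate id {item[id_key]!r}")
        else:
            seen_ids.add(item[id_key])
        text = item.get(text_key)
        if not isinstance(text, str) or not text.strip():
            problems.append(f"item {index}: missing or empty '{text_key}'")
    return problems


# Made-up sample records: the second one reuses an id and has no text.
sample = '[{"id": "r1", "text": "Great battery life."}, {"id": "r1", "text": ""}]'
problems = find_problems(load_items(sample))
for problem in problems:
    print(problem)
```

To check a real file, pass its contents instead of the sample string, for example load_items(open("data/reviews.json", encoding="utf-8").read()).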

Using Different AI Providers

OpenAI

ai_support:
  enabled: true
  endpoint_type: openai
 
  ai_config:
    model: gpt-4-turbo-preview
    api_key: ${OPENAI_API_KEY}
    temperature: 0.3
    max_tokens: 500
 
  features:
    keyword_highlighting:
      enabled: true
      highlight_color: "#FEF3C7"

Anthropic Claude

ai_support:
  enabled: true
  endpoint_type: anthropic
 
  ai_config:
    model: claude-3-sonnet-20240229
    api_key: ${ANTHROPIC_API_KEY}
    temperature: 0.3
    max_tokens: 500
 
  features:
    keyword_highlighting:
      enabled: true
      highlight_color: "#FEF3C7"
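
Both hosted-provider configurations above reference the API key through a ${...} placeholder. A quick pre-flight check can catch an unset variable before the server starts. This is a sketch under the assumption that such placeholders are resolved from the process environment; the helper name missing_env_vars is hypothetical and not part of Potato.

```python
import os
import re

# Matches ${NAME} placeholders such as ${OPENAI_API_KEY}.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def missing_env_vars(config_text, environ=None):
    """Return placeholder names in config_text that are unset or empty."""
    env = os.environ if environ is None else environ
    names = sorted(set(PLACEHOLDER.findall(config_text)))
    return [name for name in names if not env.get(name)]


config_text = """
ai_config:
  model: claude-3-sonnet-20240229
  api_key: ${ANTHROPIC_API_KEY}
"""

# A stand-in environment for demonstration; omit `environ` to check the real one.
print(missing_env_vars(config_text, environ={"ANTHROPIC_API_KEY": ""}))
```

Running the check against the contents of the actual config file, with no environ argument, lists any keys that still need to be exported in the shell.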

Local Ollama (No API Costs)

ai_support:
  enabled: true
  endpoint_type: ollama
 
  ai_config:
    model: llama2                      # must already be pulled locally (ollama pull llama2)
    base_url: http://localhost:11434   # Ollama's default local address
 
  features:
    keyword_highlighting:
      enabled: true
      highlight_color: "#FEF3C7"
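
With a local model, a common failure is that the Ollama server is not running or the model has not been pulled yet. The sketch below checks both by calling Ollama's /api/tags endpoint, which lists locally available models. The helper names are hypothetical and the check is independent of Potato.

```python
import json
import urllib.request


def installed_models(tags_payload: dict) -> set[str]:
    """Extract model names, without the ':tag' suffix, from an /api/tags response."""
    return {m["name"].split(":")[0] for m in tags_payload.get("models", [])}


def fetch_tags(base_url="http://localhost:11434", timeout=5.0) -> dict:
    """Ask a running Ollama server which models are available locally."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)


def check_model(model, base_url="http://localhost:11434"):
    """Return True if the model is pulled locally.

    Raises OSError (for example URLError) if the server cannot be reached.
    """
    return model.split(":")[0] in installed_models(fetch_tags(base_url))
```

Calling check_model("llama2") against a running server should return True once the model has been pulled with ollama pull llama2.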

Combining Features

Keyword highlighting can be enabled alongside the other AI support features, hints and label suggestions. All three share the same provider settings:

ai_support:
  enabled: true
  endpoint_type: openai
 
  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}
    temperature: 0.3
    max_tokens: 500
 
  features:
    # Highlight important keywords
    keyword_highlighting:
      enabled: true
      highlight_color: "#fef08a"
 
    # Show contextual hints
    hints:
      enabled: true
 
    # Suggest labels for consideration
    label_suggestions:
      enabled: true
      show_confidence: true

Complete Configuration Example

Here's a complete configuration for entity-aware annotation with AI highlighting:

annotation_task_name: "Entity-Aware Annotation"
 
data_files:
  - path: "data/documents.json"
    format: json
 
item_properties:
  id_key: id
  text_key: text
 
annotation_schemes:
  - annotation_type: span
    name: entities
    labels:
      - name: PERSON
        color: "#FECACA"
      - name: ORG
        color: "#BBF7D0"
      - name: LOCATION
        color: "#BFDBFE"
 
ai_support:
  enabled: true
  endpoint_type: openai
 
  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}
    temperature: 0.3
    max_tokens: 500
 
  features:
    keyword_highlighting:
      enabled: true
      highlight_color: "#fef08a"
    hints:
      enabled: true
    label_suggestions:
      enabled: true
      show_confidence: true
 
  cache_config:
    enabled: true
    cache_dir: "ai_cache/"
    warmup:
      enabled: true
      num_instances: 50
    prefetch:
      enabled: true
      lookahead: 3
 
output_annotation_dir: "output/"
output_annotation_format: json
allow_all_users: true

Caching for Performance

Enable caching to reduce API calls and improve response time:

ai_support:
  enabled: true
  endpoint_type: openai
 
  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}
 
  features:
    keyword_highlighting:
      enabled: true
 
  cache_config:
    enabled: true
    cache_dir: "ai_cache/"
 
    # Pre-generate highlights on startup
    warmup:
      enabled: true
      num_instances: 100
 
    # Generate for upcoming instances
    prefetch:
      enabled: true
      lookahead: 5
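
To make the warmup and prefetch settings concrete, here is a conceptual sketch of a disk-backed cache with the same two behaviors. It is an illustration only and not Potato's implementation: the HighlightCache class, the hash-based file naming, and the fake_generate stand-in for the LLM call are all assumptions.

```python
import hashlib
import json
import tempfile
from pathlib import Path


class HighlightCache:
    """Disk-backed cache keyed by a hash of the instance text."""

    def __init__(self, cache_dir, generate):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.generate = generate  # callable: text -> list of keywords

    def _path(self, text):
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        return self.cache_dir / f"{digest}.json"

    def get(self, text):
        """Return cached keywords, generating and storing them on a miss."""
        path = self._path(text)
        if path.exists():
            return json.loads(path.read_text(encoding="utf-8"))
        keywords = self.generate(text)
        path.write_text(json.dumps(keywords), encoding="utf-8")
        return keywords

    def warmup(self, texts, num_instances):
        """Pre-generate results for the first num_instances items."""
        for text in texts[:num_instances]:
            self.get(text)

    def prefetch(self, texts, current_index, lookahead):
        """Generate results for the next `lookahead` items after the current one."""
        for text in texts[current_index + 1 : current_index + 1 + lookahead]:
            self.get(text)


calls = []


def fake_generate(text):
    """Stand-in for the model call; records each invocation."""
    calls.append(text)
    return [word for word in text.split() if len(word) > 5]


texts = [f"document number {i}" for i in range(10)]
with tempfile.TemporaryDirectory() as tmp:
    cache = HighlightCache(tmp, fake_generate)
    cache.warmup(texts, num_instances=3)                 # items 0-2
    cache.prefetch(texts, current_index=2, lookahead=5)  # items 3-7
    cache.get(texts[0])                                  # served from disk
    print(len(calls))  # 8 generator calls for 9 lookups
```

In this sketch a larger lookahead trades more upfront API calls for fewer waits while annotating, which is the same trade-off to weigh when choosing num_instances and lookahead in the configuration.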

Tips

  1. Match colors to your task: Use highlight colors that complement your annotation scheme
  2. Enable caching: Avoid repeated API calls for the same content
  3. Consider local models: Use Ollama for high-volume annotation without API costs
  4. Combine features: Keyword highlighting works well with hints and label suggestions

See the full documentation at /docs/features/ai-support.