Skip to content
Guides3 min read

Automatic Keyword Highlighting

Configure AI-powered keyword highlighting to draw annotator attention to important terms and phrases.

Potato Team·
यह पृष्ठ अभी आपकी भाषा में उपलब्ध नहीं है। अंग्रेज़ी संस्करण दिखाया जा रहा है।

Automatic Keyword Highlighting

AI-powered keyword highlighting draws annotator attention to important terms, entities, or patterns in text. This guide covers how to configure Potato's built-in AI support to automatically highlight relevant keywords.

Why Use Keyword Highlighting?

  • Focus attention: Guide annotators to relevant content
  • Improve speed: Faster identification of key information
  • Reduce errors: Less likely to miss important terms
  • Leverage AI: Let LLMs identify context-specific keywords

Basic AI-Powered Highlighting

Potato uses its AI support system to identify and highlight important keywords. Here's a basic configuration:

yaml
annotation_task_name: "Keyword Highlighted Annotation"
 
data_files:
  - path: "data/reviews.json"
    format: json
 
item_properties:
  id_key: id
  text_key: text
 
annotation_schemes:
  - annotation_type: radio
    name: sentiment
    description: "What is the overall sentiment?"
    labels:
      - Positive
      - Negative
      - Neutral
 
ai_support:
  enabled: true
  endpoint_type: openai
 
  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}
    temperature: 0.3
    max_tokens: 500
 
  features:
    keyword_highlighting:
      enabled: true
      # Highlights are rendered as box overlays on the text

Using Different AI Providers

OpenAI

yaml
ai_support:
  enabled: true
  endpoint_type: openai
 
  ai_config:
    model: gpt-4o
    api_key: ${OPENAI_API_KEY}
    temperature: 0.3
    max_tokens: 500
 
  features:
    keyword_highlighting:
      enabled: true
 

Anthropic Claude

yaml
ai_support:
  enabled: true
  endpoint_type: anthropic
 
  ai_config:
    model: claude-3-sonnet-20240229
    api_key: ${ANTHROPIC_API_KEY}
    temperature: 0.3
    max_tokens: 500
 
  features:
    keyword_highlighting:
      enabled: true
      # Highlights are rendered as box overlays on the text

Local Ollama (No API Costs)

yaml
ai_support:
  enabled: true
  endpoint_type: ollama
 
  ai_config:
    model: llama2
    base_url: http://localhost:11434
 
  features:
    keyword_highlighting:
      enabled: true
      # Highlights are rendered as box overlays on the text

Combining Features

AI support offers multiple features that work well together:

yaml
ai_support:
  enabled: true
  endpoint_type: openai
 
  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}
    temperature: 0.3
    max_tokens: 500
 
  features:
    # Highlight important keywords
    keyword_highlighting:
      enabled: true
      # Highlights are rendered as box overlays on the text
 
    # Show contextual hints
    hints:
      enabled: true
 
    # Suggest labels for consideration
    label_suggestions:
      enabled: true
      show_confidence: true

Complete Configuration Example

Here's a complete configuration for entity-aware annotation with AI highlighting:

yaml
annotation_task_name: "Entity-Aware Annotation"
 
data_files:
  - path: "data/documents.json"
    format: json
 
item_properties:
  id_key: id
  text_key: text
 
annotation_schemes:
  - annotation_type: span
    name: entities
    labels:
      - name: PERSON
        color: "#FECACA"
      - name: ORG
        color: "#BBF7D0"
      - name: LOCATION
        color: "#BFDBFE"
 
ai_support:
  enabled: true
  endpoint_type: openai
 
  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}
    temperature: 0.3
    max_tokens: 500
 
  features:
    keyword_highlighting:
      enabled: true
      # Highlights are rendered as box overlays on the text
    hints:
      enabled: true
    label_suggestions:
      enabled: true
      show_confidence: true
 
  cache_config:
    disk_cache:
      enabled: true
      path: "ai_cache/cache.json"
    prefetch:
      warm_up_page_count: 50
      on_next: 3
      on_prev: 2
 
output_annotation_dir: "output/"
output_annotation_format: json
allow_all_users: true

Caching for Performance

Enable caching to reduce API calls and improve response time:

yaml
ai_support:
  enabled: true
  endpoint_type: openai
 
  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}
 
  features:
    keyword_highlighting:
      enabled: true
 
  cache_config:
    disk_cache:
      enabled: true
      path: "ai_cache/cache.json"
 
    # Pre-generate highlights on startup and prefetch upcoming
    prefetch:
      warm_up_page_count: 100
      on_next: 5
      on_prev: 2

Tips

  1. Match colors to your task: Use highlight colors that complement your annotation scheme
  2. Enable caching: Avoid repeated API calls for the same content
  3. Consider local models: Use Ollama for high-volume annotation without API costs
  4. Combine features: Keyword highlighting works well with hints and label suggestions

Full documentation at /docs/features/ai-support.