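AI Support

Integrate large language models for intelligent annotation assistance.

Potato 2.0 has built-in support for large language models (LLMs), giving annotators intelligent hints, keyword highlighting, and label suggestions.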

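Supported Providers

Potato supports several LLM providers:

Cloud providers:

OpenAI (GPT-4, GPT-4 Turbo, GPT-3.5)
Anthropic (Claude 3, Claude 3.5)
Google (Gemini 1.5 Pro, Gemini 2.0 Flash)
Hugging Face
OpenRouter

Local/self-hosted:

Ollama (runs models locally)
vLLM (high-performance self-hosted inference)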

Configuration

Basic Setup

Add an ai_support section to your configuration file:

yaml

ai_support:
  enabled: true
  endpoint_type: openai
 
  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}
    temperature: 0.3
    max_tokens: 500

Provider-Specific Configuration

OpenAI

yaml

ai_support:
  enabled: true
  endpoint_type: openai
 
  ai_config:
    model: gpt-4o
    api_key: ${OPENAI_API_KEY}
    temperature: 0.3
    max_tokens: 500

Anthropic Claude

yaml

ai_support:
  enabled: true
  endpoint_type: anthropic
 
  ai_config:
    model: claude-3-sonnet-20240229
    api_key: ${ANTHROPIC_API_KEY}
    temperature: 0.3
    max_tokens: 500

Google Gemini

yaml

ai_support:
  enabled: true
  endpoint_type: google
 
  ai_config:
    model: gemini-1.5-pro
    api_key: ${GOOGLE_API_KEY}

Local Ollama

yaml

ai_support:
  enabled: true
  endpoint_type: ollama
 
  ai_config:
    model: llama2
    base_url: http://localhost:11434

vLLM (Self-Hosted)

yaml

ai_support:
  enabled: true
  endpoint_type: vllm
 
  ai_config:
    model: meta-llama/Llama-2-7b-chat-hf
    base_url: http://localhost:8000/v1
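
Before pointing Potato at a local server, it can help to confirm that the server is reachable. The sketch below is not part of Potato. It assumes Ollama's model-listing route (/api/tags) and the OpenAI-compatible /models route that vLLM serves under its /v1 base URL; the helper names are hypothetical.

```python
import json
import urllib.error
import urllib.request

# Model-listing routes, relative to the base_url values used in the configs above.
# Ollama lists local models at /api/tags; vLLM's OpenAI-compatible server
# lists them at /models under the /v1 prefix.
MODEL_LIST_PATHS = {"ollama": "/api/tags", "vllm": "/models"}


def model_list_url(endpoint_type: str, base_url: str) -> str:
    """Build the URL that lists the models served by a local endpoint."""
    return base_url.rstrip("/") + MODEL_LIST_PATHS[endpoint_type]


def is_reachable(endpoint_type: str, base_url: str, timeout: float = 2.0) -> bool:
    """Return True if the endpoint answers its model-listing request with JSON."""
    url = model_list_url(endpoint_type, base_url)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            json.load(resp)  # both servers are expected to answer with JSON
            return True
    except (urllib.error.URLError, OSError, ValueError):
        return False
```

Calling is_reachable("ollama", "http://localhost:11434") before starting an annotation session gives an early signal that the local server is down, rather than discovering it when the first hint fails to load.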

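Vision AI Endpoints

New in v2.1.0

For image and video annotation tasks, Potato supports dedicated vision endpoints, including YOLO, Ollama Vision, OpenAI Vision, and Anthropic Vision. These endpoints support object detection, pre-annotation, and visual classification.

See Vision AI Support for the full configuration reference.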

AI Features

Potato's AI support provides five main features:

1. Intelligent Hints

Give annotators contextual guidance without revealing the answer:

yaml

ai_support:
  enabled: true
  endpoint_type: openai
 
  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}
 
  # Hints appear as tooltips or sidebars
  features:
    hints:
      enabled: true

2. Keyword Highlighting

Automatically highlight relevant keywords in the text:

yaml

ai_support:
  enabled: true
  endpoint_type: openai
 
  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}
 
  features:
    keyword_highlighting:
      enabled: true
      # Highlights are rendered as box overlays on the text

3. Label Suggestions

Suggest labels to annotators, shown with a confidence indicator:

yaml

ai_support:
  enabled: true
  endpoint_type: openai
 
  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}
 
  features:
    label_suggestions:
      enabled: true
      show_confidence: true

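4. Label Rationales

New in v2.1.0

Generate balanced explanations of why each label might apply to the text, helping annotators understand the reasoning behind different classifications: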

yaml

ai_support:
  enabled: true
  endpoint_type: openai
 
  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}
 
  features:
    rationales:
      enabled: true

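Rationales appear as tooltips that list each available label with an explanation of why it might apply. This is useful for training annotators and for difficult annotation decisions.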

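5. Option Highlighting

New in v2.2.0

AI-assisted highlighting of the most likely correct options in discrete annotation tasks (radio, multiselect, likert, select). The system analyzes the content and highlights the top-k most likely options while dimming the less likely ones; all options remain fully clickable.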

yaml

ai_support:
  enabled: true
  endpoint_type: openai
 
  ai_config:
    model: gpt-4o-mini
    api_key: ${OPENAI_API_KEY}
 
  option_highlighting:
    enabled: true
    top_k: 3
    dim_opacity: 0.4
    auto_apply: true

See Option Highlighting for the full configuration reference.
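
The effect of top_k and dim_opacity can be pictured with a small sketch. This is an illustration, not Potato's implementation: the per-option scores and the function name are hypothetical stand-ins for whatever the model returns.

```python
def option_opacities(
    scores: dict[str, float], top_k: int = 3, dim_opacity: float = 0.4
) -> dict[str, float]:
    """Map each option to a display opacity: 1.0 for the top_k highest-scoring
    options, dim_opacity for the rest. Nothing is disabled, only dimmed."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    highlighted = set(ranked[:top_k])
    return {option: 1.0 if option in highlighted else dim_opacity for option in scores}


# Hypothetical model scores for a five-point Likert item.
scores = {"1": 0.05, "2": 0.10, "3": 0.40, "4": 0.30, "5": 0.15}
opacities = option_opacities(scores, top_k=3, dim_opacity=0.4)
```

With these scores, options 3, 4, and 5 keep full opacity and options 1 and 2 are dimmed to 0.4, which matches the description above: less likely options fade but stay selectable.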

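Complementary Feature: Diversity Ordering

New in v2.2.0

Although not strictly an AI feature, diversity ordering uses sentence-transformer embeddings to cluster items and present them in a diverse order, which reduces annotator fatigue and improves coverage. It integrates with AI support by automatically prefetching AI hints for the reordered items.

Caching and Performance

AI responses can be cached to improve performance and reduce API costs: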

yaml

ai_support:
  enabled: true
  endpoint_type: openai
 
  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}
 
  cache_config:
    disk_cache:
      enabled: true
      path: "ai_cache/cache.json"
 
    # Pre-generate hints on startup and prefetch upcoming
    prefetch:
      warm_up_page_count: 100
      on_next: 5
      on_prev: 2

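Caching Strategies

Warm-up: on server startup, AI hints are pre-generated for an initial batch of instances (warm_up_page_count)
Prefetch: as annotators navigate forward (on_next) or backward (on_prev), hints are generated for the instances they are about to reach
Disk persistence: the cache is saved to disk and survives server restarts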
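
A rough sketch of how the three strategies fit together, assuming instances are addressed by integer index and that a generate_hint callable wraps the LLM request. HintCache and its methods are hypothetical names used for illustration; Potato's internals may differ.

```python
import json
from pathlib import Path
from typing import Callable, Iterable


class HintCache:
    """Disk-backed hint cache with warm-up and directional prefetch."""

    def __init__(self, path: str, generate_hint: Callable[[int], str], total: int):
        self.path = Path(path)
        self.generate_hint = generate_hint
        self.total = total
        # Disk persistence: reload whatever an earlier run saved.
        self.hints = json.loads(self.path.read_text()) if self.path.exists() else {}

    def _fill(self, indices: Iterable[int]) -> None:
        """Generate hints for uncached, in-range indices, then save to disk."""
        for i in indices:
            if 0 <= i < self.total and str(i) not in self.hints:
                self.hints[str(i)] = self.generate_hint(i)  # JSON keys are strings
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(json.dumps(self.hints))

    def warm_up(self, warm_up_page_count: int) -> None:
        """Warm-up: pre-generate hints for the first batch at startup."""
        self._fill(range(warm_up_page_count))

    def on_navigate(self, current: int, forward: bool, on_next: int = 5, on_prev: int = 2) -> None:
        """Prefetch: cover the instances ahead of (or behind) the annotator."""
        if forward:
            self._fill(range(current + 1, current + 1 + on_next))
        else:
            self._fill(range(current - on_prev, current))

    def get(self, index: int) -> str:
        """Return the hint for an instance, generating it on a cache miss."""
        self._fill([index])
        return self.hints[str(index)]
```

In this sketch a second HintCache opened on the same file serves earlier hints without calling generate_hint again, which is the behavior the disk persistence item describes.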

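Custom Prompts

Potato includes a default prompt for each annotation type, stored in potato/ai/prompt/. You can customize these prompts for specific tasks:

Annotation type	Prompt file
Radio buttons	`radio_prompt.txt`
Likert scale	`likert_prompt.txt`
Checkboxes	`checkbox_prompt.txt`
Span annotation	`span_prompt.txt`
Slider	`slider_prompt.txt`
Dropdown	`dropdown_prompt.txt`
Number input	`number_prompt.txt`
Text input	`text_prompt.txt`

Prompts support variable substitution:

{text} - the document text
{labels} - the labels available for the scheme
{description} - the scheme description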
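
As an illustration of the substitution, the sketch below fills the three variables with Python's str.format. The template wording and the render_prompt helper are hypothetical; only the placeholder names come from the list above, and Potato's own substitution mechanism may differ.

```python
# Hypothetical template text; only the three placeholders are from the docs.
TEMPLATE = (
    "Task: {description}\n"
    "Allowed labels: {labels}\n"
    "Text: {text}\n"
    "Give a short hint that guides the annotator without stating the answer."
)


def render_prompt(template: str, text: str, labels: list[str], description: str) -> str:
    """Fill {text}, {labels}, and {description} in a prompt template."""
    return template.format(text=text, labels=", ".join(labels), description=description)


prompt = render_prompt(
    TEMPLATE,
    text="The battery died after two days.",
    labels=["Positive", "Negative", "Neutral"],
    description="What is the sentiment of this review?",
)
```

One caveat of this particular approach: str.format treats every brace as a placeholder, so a template containing literal braces (a JSON example, say) would need them doubled.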

Multi-Scheme Support

For tasks with multiple annotation schemes, you can enable AI support selectively:

yaml

ai_support:
  enabled: true
  endpoint_type: openai
 
  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}
 
  # Only enable for specific schemes
  special_include:
    - page: 1
      schema: sentiment
    - page: 1
      schema: topics
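
Conceptually, special_include reads as an allow-list of (page, schema) pairs. The sketch below assumes exactly that reading, and further assumes that AI support applies everywhere when the list is absent. The helper name is hypothetical and Potato's actual matching rules may differ.

```python
def ai_enabled_for(ai_support: dict, page: int, schema: str) -> bool:
    """Return True if AI assistance should run for this page/schema pair."""
    if not ai_support.get("enabled", False):
        return False
    include = ai_support.get("special_include")
    if include is None:  # assumption: no allow-list means enabled everywhere
        return True
    return any(entry["page"] == page and entry["schema"] == schema for entry in include)


# The ai_support section from the YAML above, as a parsed dictionary.
config = {
    "enabled": True,
    "special_include": [
        {"page": 1, "schema": "sentiment"},
        {"page": 1, "schema": "topics"},
    ],
}
```

Under these assumptions, ai_enabled_for(config, 1, "sentiment") is true, while a scheme that is not listed, or the same scheme on another page, gets no AI assistance.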

Complete Example

A complete configuration for AI-assisted sentiment analysis:

yaml

annotation_task_name: "AI-Assisted Sentiment Analysis"
task_dir: "."
port: 8000
 
data_files:
  - "data/reviews.json"
 
item_properties:
  id_key: id
  text_key: text
 
annotation_schemes:
  - annotation_type: radio
    name: sentiment
    description: "What is the sentiment of this review?"
    labels:
      - Positive
      - Negative
      - Neutral
 
ai_support:
  enabled: true
  endpoint_type: openai
 
  ai_config:
    model: gpt-4
    api_key: ${OPENAI_API_KEY}
    temperature: 0.3
    max_tokens: 500
 
  features:
    hints:
      enabled: true
    keyword_highlighting:
      enabled: true
      # Highlights are rendered as box overlays on the text
    label_suggestions:
      enabled: true
      show_confidence: true
 
  cache_config:
    disk_cache:
      enabled: true
      path: "ai_cache/cache.json"
    prefetch:
      warm_up_page_count: 50
      on_next: 3
      on_prev: 2
 
output_annotation_dir: "output/"
output_annotation_format: "json"
allow_all_users: true

Environment Variables

Store API keys securely using environment variables:

bash

export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GOOGLE_API_KEY="..."

Reference them in your configuration with the ${VARIABLE_NAME} syntax.
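
To show what the ${VARIABLE_NAME} syntax implies, here is a minimal sketch of expanding such references from the process environment. It is an assumption about the behavior, not Potato's code; expand_env is a hypothetical helper that raises on unset variables so that a missing key fails loudly instead of being sent as a literal string.

```python
import os
import re

# Matches ${NAME}, where NAME is a conventional environment variable name.
_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: str) -> str:
    """Replace each ${NAME} in value with the environment variable NAME."""

    def lookup(match: re.Match) -> str:
        name = match.group(1)
        if name not in os.environ:
            raise KeyError(f"environment variable {name} is not set")
        return os.environ[name]

    return _VAR.sub(lookup, value)
```

Strings without a ${...} reference pass through unchanged, so the same helper could be applied to every string value in a parsed configuration.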

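Cost Considerations

By default, the AI is called for every instance
Enable caching to reduce repeated API calls
Use warm-up and prefetch to pre-generate hints
Consider smaller, cheaper models for simple tasks
Local providers (Ollama, vLLM) incur no API costs

Best Practices

Use AI as an aid, not a replacement - let annotators make the final decision
Enable caching in production - it reduces latency and cost
Test prompts thoroughly - custom prompts should be validated
Monitor API costs - track usage, especially with cloud providers
Consider local providers - use Ollama or vLLM for high-volume annotation
Protect API credentials - use environment variables and never commit keys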
