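Sentiment analysis is a fundamental NLP task, and Potato makes it easy to collect high-quality sentiment labels. This tutorial builds a full-featured, production-ready sentiment annotation interface.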

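Project Overview

We will create an annotation interface for social media posts with the following features:

Three-way sentiment classification (Positive, Negative, Neutral)
A confidence rating for each annotation
An optional free-text explanation
Keyboard shortcuts for efficiency
Quality control measures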

Full Configuration

The complete config.yaml is shown below:

yaml

annotation_task_name: "Social Media Sentiment Analysis"
 
# Data configuration
data_files:
  - "data/tweets.json"
 
item_properties:
  id_key: id
  text_key: text
 
# Annotation interface
annotation_schemes:
  # Primary sentiment label
  - annotation_type: radio
    name: sentiment
    description: "What is the overall sentiment of this post?"
    labels:
      - name: Positive
        tooltip: "Expresses happiness, satisfaction, or approval"
        keyboard_shortcut: "1"
      - name: Negative
        tooltip: "Expresses sadness, frustration, or disapproval"
        keyboard_shortcut: "2"
      - name: Neutral
        tooltip: "Factual, objective, or lacks emotional content"
        keyboard_shortcut: "3"
    required: true
 
  # Confidence rating
  - annotation_type: likert
    name: confidence
    description: "How confident are you in your sentiment label?"
    size: 5
    min_label: "Not confident"
    max_label: "Very confident"
    required: true
 
  # Optional explanation
  - annotation_type: text
    name: explanation
    description: "Why did you choose this label? (Optional)"
    multiline: true
    required: false
    placeholder: "Explain your reasoning..."
 
# Guidelines
annotation_guidelines:
  title: "Sentiment Annotation Guidelines"
  content: |
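    ## Task
    Classify the sentiment expressed in each social media post.
 
    ## Labels
 
    **Positive**: The author expresses a positive emotion or opinion
    - Happiness, excitement, gratitude
    - Praise, recommendation, approval
    - Examples: "Love this!", "Best day ever!", "Highly recommend"
 
    **Negative**: The author expresses a negative emotion or opinion
    - Anger, frustration, sadness
    - Complaints, criticism, disapproval
    - Examples: "Terrible service", "So disappointed", "Worst experience"
 
    **Neutral**: Factual, or without a clear sentiment
    - News, announcements, questions
    - Mixed or balanced opinions
    - Examples: "The store opens at 9am", "Has anyone tried this?"
 
    ## Tips
    - Focus on the author's sentiment, not the topic
    - Label sarcasm according to its intended meaning
    - When in doubt, lower your confidence rating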
 
# User management
automatic_assignment:
  on: true
  sampling_strategy: random
  labels_per_instance: 1
  instance_per_annotator: 100

Sample Data Format

Create data/tweets.json with one JSON object per line:

json

{"id": "t001", "text": "Just got my new laptop and I'm absolutely loving it! Best purchase of the year! #happy"}
{"id": "t002", "text": "Waited 2 hours for customer service and they still couldn't help me. Never shopping here again."}
{"id": "t003", "text": "The new coffee shop on Main Street opens tomorrow at 7am."}
{"id": "t004", "text": "This movie was okay I guess. Some good parts, some boring parts."}
{"id": "t005", "text": "Can't believe how beautiful the sunset was tonight! Nature is amazing."}
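Before starting the server, it can help to sanity-check the data file. The following is a minimal sketch, not part of Potato itself: `validate_items` is a hypothetical helper that assumes one JSON object per line and the `id` and `text` keys named under `item_properties` above.

```python
import json


def validate_items(lines, id_key="id", text_key="text"):
    """Return a list of (line_number, message) problems in JSON Lines input."""
    problems = []
    seen_ids = set()
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # blank lines are ignored
        try:
            item = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((lineno, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(item, dict):
            problems.append((lineno, "not a JSON object"))
            continue
        for key in (id_key, text_key):
            if key not in item:
                problems.append((lineno, f"missing key '{key}'"))
        if id_key in item:
            item_id = str(item[id_key])  # compare ids as strings
            if item_id in seen_ids:
                problems.append((lineno, f"duplicate id '{item_id}'"))
            seen_ids.add(item_id)
    return problems


# In-memory demo; the second and third lines are deliberately broken.
sample = [
    '{"id": "t001", "text": "Just got my new laptop"}',
    '{"id": "t001", "text": "Same id again"}',
    '{"id": "t003"}',
]
for lineno, message in validate_items(sample):
    print(f"line {lineno}: {message}")
```

To check the real file, pass an open file handle instead of the list, for example `validate_items(open("data/tweets.json", encoding="utf-8"))`.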

Running the Task

Start the annotation server:

bash

potato start config.yaml

Open http://localhost:8000, log in, and start annotating.

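Understanding the Interface

Main Annotation Area

The interface displays:

The text to annotate (with URLs, mentions, and hashtags highlighted)
Sentiment radio buttons with tooltips
A Likert scale for confidence
An optional explanation text box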

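Keyboard Workflow

For maximum efficiency:

Read the text
Press 1, 2, or 3 to select the sentiment
Click a confidence level with the mouse (the configuration above defines no keyboard shortcut for it)
Press Enter to submit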

Progress Tracking

The interface also shows:

Current progress (e.g., "15 / 100")
Estimated time remaining
Session statistics

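Output Format

Annotations are saved to annotations/username.jsonl, one JSON object per line. A single record is shown here pretty-printed for readability: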

json

{
  "id": "t001",
  "text": "Just got my new laptop and I'm absolutely loving it!...",
  "annotations": {
    "sentiment": "Positive",
    "confidence": 5,
    "explanation": "Clear expression of happiness with the purchase"
  },
  "annotator": "john_doe",
  "timestamp": "2026-01-15T14:30:00Z"
}
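A record like the one above can be checked against the annotation schemes before analysis. The sketch below is illustrative only: `check_record` is a hypothetical helper, the allowed labels and Likert size are copied by hand from config.yaml, and it assumes the Likert value is stored as an integer from 1 to 5, as the sample record suggests.

```python
import json

# Assumed to mirror the labels and Likert size defined in config.yaml.
ALLOWED_SENTIMENTS = {"Positive", "Negative", "Neutral"}
LIKERT_SIZE = 5


def check_record(record):
    """Return a list of problems found in one parsed annotation record."""
    errors = []
    for key in ("id", "annotations", "annotator"):
        if key not in record:
            errors.append(f"missing field '{key}'")
    ann = record.get("annotations")
    if not isinstance(ann, dict):
        ann = {}
    if ann.get("sentiment") not in ALLOWED_SENTIMENTS:
        errors.append("sentiment is missing or not a configured label")
    conf = ann.get("confidence")
    # Assumption: Likert values are integers from 1 to LIKERT_SIZE.
    valid = (
        isinstance(conf, int)
        and not isinstance(conf, bool)
        and 1 <= conf <= LIKERT_SIZE
    )
    if not valid:
        errors.append(f"confidence is missing or outside 1..{LIKERT_SIZE}")
    return errors


# One line as it might appear in the .jsonl file (shortened for the demo).
line = (
    '{"id": "t001", "annotations": {"sentiment": "Positive", '
    '"confidence": 5, "explanation": ""}, "annotator": "john_doe"}'
)
print(check_record(json.loads(line)))
```

The optional explanation is not checked, since the scheme marks it `required: false`.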

Adding Quality Control

Attention Checks

Add gold-standard items to verify that annotators are paying attention:

yaml

quality_control:
  attention_checks:
    enabled: true
    frequency: 10  # Every 10th item
    items:
      - text: "I am extremely happy and satisfied! This is the best!"
        expected:
          sentiment: "Positive"
      - text: "This is absolutely terrible and I hate it completely."
        expected:
          sentiment: "Negative"
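Once annotations are collected, the gold items can be used to score each annotator. This tutorial does not show how attention-check items are marked in the output, so the following sketch assumes they can be recognized by their exact text. `GOLD` and `attention_pass_rate` are hypothetical names introduced for illustration.

```python
# Gold items copied from the quality_control section of the config.
GOLD = {
    "I am extremely happy and satisfied! This is the best!": "Positive",
    "This is absolutely terrible and I hate it completely.": "Negative",
}


def attention_pass_rate(records, gold=GOLD):
    """Return (passed, total) over the records whose text matches a gold item."""
    passed = 0
    total = 0
    for record in records:
        expected = gold.get(record.get("text"))
        if expected is None:
            continue  # a regular item, not an attention check
        total += 1
        if record.get("annotations", {}).get("sentiment") == expected:
            passed += 1
    return passed, total


# In-memory demo with one passed check, one failed check, and one regular item.
records = [
    {"text": "I am extremely happy and satisfied! This is the best!",
     "annotations": {"sentiment": "Positive"}},
    {"text": "This is absolutely terrible and I hate it completely.",
     "annotations": {"sentiment": "Neutral"}},
    {"text": "The new coffee shop on Main Street opens tomorrow at 7am.",
     "annotations": {"sentiment": "Neutral"}},
]
passed, total = attention_pass_rate(records)
print(f"Attention checks passed: {passed} of {total}")
```

An annotator who fails a large share of checks is a candidate for review; the threshold is a project decision rather than something the tool dictates.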

Inter-Annotator Agreement

For research projects, enable multiple annotations per item:

yaml

automatic_assignment:
  on: true
  sampling_strategy: random
  labels_per_instance: 3  # Each item annotated by 3 people
  instance_per_annotator: 50
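With three labels per item, agreement can be summarized with Fleiss' kappa. The sketch below is a plain-Python implementation of the standard formula, not a Potato feature. `fleiss_kappa` is a hypothetical helper that assumes every item has the same number of labels.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of per-item label lists of equal length."""
    n_items = len(ratings)
    if n_items == 0:
        raise ValueError("need at least one item")
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(labels) != n_raters for labels in ratings):
        raise ValueError("every item needs the same number (>= 2) of labels")

    totals = Counter()   # how often each category was assigned overall
    agreement_sum = 0.0  # sum of per-item agreement P_i
    for labels in ratings:
        counts = Counter(labels)
        totals.update(counts)
        squares = sum(c * c for c in counts.values())
        agreement_sum += (squares - n_raters) / (n_raters * (n_raters - 1))

    p_bar = agreement_sum / n_items  # mean observed agreement
    n_total = n_items * n_raters
    p_e = sum((c / n_total) ** 2 for c in totals.values())  # chance agreement
    if p_e == 1.0:
        return float("nan")  # only one category was ever used; kappa undefined
    return (p_bar - p_e) / (1.0 - p_e)


# In-memory demo: two items, three annotators each.
ratings = [
    ["Positive", "Positive", "Negative"],
    ["Negative", "Negative", "Negative"],
]
print(f"Fleiss' kappa: {fleiss_kappa(ratings):.2f}")
```

Values near 1 indicate strong agreement and values near 0 indicate agreement no better than chance. Low values on a three-way sentiment task often point to unclear guidelines for the Neutral class.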

Analyzing Results

Export and analyze the annotations:

python

import json
from collections import Counter
 
# Load annotations (one JSON object per line; blank lines are skipped)
annotations = []
with open('annotations/annotator1.jsonl', encoding='utf-8') as f:
    for line in f:
        if line.strip():
            annotations.append(json.loads(line))
 
# Sentiment distribution
sentiments = Counter(a['annotations']['sentiment'] for a in annotations)
print(f"Sentiment distribution: {dict(sentiments)}")
 
# Average confidence (guarded so an empty file does not divide by zero)
confidences = [a['annotations']['confidence'] for a in annotations]
if confidences:
    print(f"Average confidence: {sum(confidences)/len(confidences):.2f}")
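When each item has been labeled by several annotators, a common next step is to merge their labels into one label per item. The following is a minimal majority-vote sketch, assuming records shaped like the output format shown earlier. `majority_vote` is a hypothetical helper, and ties are returned as `None` so they can be adjudicated by hand.

```python
from collections import Counter, defaultdict


def majority_vote(records):
    """Map each item id to its majority sentiment, or None when the vote ties."""
    votes = defaultdict(list)
    for record in records:
        votes[record["id"]].append(record["annotations"]["sentiment"])

    result = {}
    for item_id, labels in votes.items():
        ranked = Counter(labels).most_common()
        # A tie exists when the two most frequent labels have equal counts.
        tie = len(ranked) > 1 and ranked[0][1] == ranked[1][1]
        result[item_id] = None if tie else ranked[0][0]
    return result


# In-memory demo: t001 has a clear majority, t004 is a two-way tie.
records = [
    {"id": "t001", "annotations": {"sentiment": "Positive"}},
    {"id": "t001", "annotations": {"sentiment": "Positive"}},
    {"id": "t001", "annotations": {"sentiment": "Neutral"}},
    {"id": "t004", "annotations": {"sentiment": "Neutral"}},
    {"id": "t004", "annotations": {"sentiment": "Negative"}},
]
print(majority_vote(records))
```

In practice the records would be gathered by reading every .jsonl file in the annotations directory and concatenating the parsed lines.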

Next Steps

Set up crowdsourcing for large-scale annotation
Add AI suggestions to speed up labeling
Implement active learning to prioritize difficult cases

See the documentation for other annotation types.