Dialogue Annotation

Annotate conversations and multi-item text with dedicated display options.

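Potato supports annotation of multi-item data, where each instance contains a list of text elements. Common use cases:

Dialogue annotation: conversations with multiple turns
Pairwise comparison: comparing two or more text variants
Multi-document tasks: rating or labeling several related texts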

Data Format

Input Data

Multi-item data is represented as a list of strings in the text field, with one JSON object per line:

json

{"id": "conv_001", "text": ["Tom: Isn't this awesome?!", "Sam: Yes! I like you!", "Tom: Great!", "Sam: Awesome! Let's party!"]}
{"id": "conv_002", "text": ["Tom: I am so sorry for that", "Sam: No worries", "Tom: Thanks for your understanding!"]}

Each string in the list represents one item (a dialogue turn, a document variant, and so on).
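If your conversations are stored as (speaker, utterance) pairs, a short script can turn them into this format. The following is a minimal sketch, not part of Potato itself: the helper names `make_record` and `to_json_lines` are hypothetical, and the sketch assumes you want the speaker name embedded in each turn as in the examples above.

```python
import json

def make_record(conv_id, turns):
    """Build one multi-item instance; `text` is a list of "Speaker: utterance" strings."""
    return {
        "id": conv_id,
        "text": [f"{speaker}: {utterance}" for speaker, utterance in turns],
    }

def to_json_lines(records):
    """Serialize records as one JSON object per line."""
    return "\n".join(json.dumps(record, ensure_ascii=False) for record in records)

# Hypothetical in-memory conversation, mirroring conv_002 above.
records = [
    make_record("conv_002", [("Tom", "I am so sorry for that"), ("Sam", "No worries")]),
]
print(to_json_lines(records))
```

Write the resulting string to the file listed under data_files in your configuration.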

Configuration

Basic Setup

yaml

# Data configuration
data_files:
  - data/dialogues.json
 
item_properties:
  id_key: id
  text_key: text
 
# Configure list display
list_as_text:
  text_list_prefix_type: none  # No prefix since speaker names are in text
  alternating_shading: true    # Shade every other turn for readability
 
# Annotation schemes
annotation_schemes:
  - annotation_type: radio
    name: sentiment
    description: "What is the overall sentiment of this conversation?"
    labels:
      - positive
      - neutral
      - negative

Display Options

The list_as_text setting controls how list items are displayed:

yaml

list_as_text:
  text_list_prefix_type: alphabet  # Prefix type for items
  horizontal: false                # Layout direction
  alternating_shading: false       # Shade alternate turns

Prefix Types

Option	Example	Best for
`alphabet`	A. B. C.	Pairwise comparison, options
`number`	1. 2. 3.	Sequential turns, ordered lists
`bullet`	. . .	Unordered items
`none`	(no prefix)	Dialogues where the text includes speaker names

Layout Options

Option	Description
`horizontal: false`	Vertical layout (default): items are stacked
`horizontal: true`	Side-by-side layout, for pairwise comparison
`alternating_shading: true`	Shades alternate turns in a dialogue

Configuration Examples

Dialogue Annotation

yaml

annotation_task_name: Dialogue Analysis
 
data_files:
  - data/dialogues.json
 
item_properties:
  id_key: id
  text_key: text
 
list_as_text:
  text_list_prefix_type: none
  alternating_shading: true
 
annotation_schemes:
  - annotation_type: span
    name: certainty
    description: Highlight phrases expressing certainty or uncertainty
    labels:
      - certain
      - uncertain
    sequential_key_binding: true
 
  - annotation_type: radio
    name: sentiment
    description: What sentiment does the conversation hold?
    labels:
      - positive
      - neutral
      - negative
    sequential_key_binding: true

Pairwise Text Comparison

yaml

annotation_task_name: Text Comparison
 
data_files:
  - data/pairs.json
 
item_properties:
  id_key: id
  text_key: text
 
list_as_text:
  text_list_prefix_type: alphabet
  horizontal: true
 
annotation_schemes:
  - annotation_type: radio
    name: preference
    description: Which text is better?
    labels:
      - A is better
      - B is better
      - Equal

Working Example

A complete working example is available in project-hub/dialogue_analysis/:

bash

python potato/flask_server.py start project-hub/dialogue_analysis/configs/dialogue-analysis.yaml -p 8000

Sample data format:

json

{"id":"1","text":["Tom: Isn't this awesome?!", "Sam: Yes! I like you!", "Tom: great!", "Sam: Awesome! Let's party!"]}
{"id":"2","text":["Tom: I am so sorry for that", "Sam: No worries", "Tom: thanks for your understanding!"]}
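Before starting the server, it can help to check that every line of a data file has this shape. The sketch below is an independent sanity check, not a Potato utility: `validate_instance` is a hypothetical helper, and treating an empty `text` list as an error is an assumption made here rather than a documented Potato rule.

```python
import json

def validate_instance(line):
    """Return a list of problems found in one JSON line; an empty list means it looks valid."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(record, dict):
        return ["record is not a JSON object"]

    problems = []
    if "id" not in record:
        problems.append("missing 'id'")
    text = record.get("text")
    if not isinstance(text, list) or not text:
        # Assumption: multi-item instances need at least one item.
        problems.append("'text' must be a non-empty list")
    elif not all(isinstance(item, str) for item in text):
        problems.append("every item in 'text' must be a string")
    return problems

sample = '{"id":"2","text":["Tom: I am so sorry for that", "Sam: No worries", "Tom: thanks for your understanding!"]}'
print(validate_instance(sample))
```

Running the function over each line of the data file and reporting the line numbers with non-empty results gives a quick pre-flight check.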

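Tips

Speaker names: For dialogues that use text_list_prefix_type: none, include the speaker name in the text (e.g., "Tom: Hello")
Span annotation: When you use span annotation with dialogue data, annotators can highlight text within the displayed turns
Choosing a prefix:
- Use none for dialogues with speaker names embedded in the text
- Use number when the order of the sequence matters
- Use alphabet for pairwise and comparison tasks
Readability: For long dialogues, enable alternating_shading so annotators can keep track of which turn they are reading
Comparison tasks: For side-by-side comparison, use horizontal: true with the alphabet prefix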
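The guidance above can be summarized as a small lookup from task type to display settings. This is an illustrative sketch only: the task names `dialogue`, `sequence`, and `pairwise` and the helper `list_as_text_for` are hypothetical, while the option names and values are the ones documented on this page.

```python
def list_as_text_for(task):
    """Return a list_as_text block following the tips above for a given task type."""
    presets = {
        # Speaker names are embedded in the text, so no prefix; shading aids readability.
        "dialogue": {"text_list_prefix_type": "none", "alternating_shading": True},
        # Order matters, so number the items.
        "sequence": {"text_list_prefix_type": "number"},
        # Side-by-side comparison with A/B labels.
        "pairwise": {"text_list_prefix_type": "alphabet", "horizontal": True},
    }
    if task not in presets:
        raise ValueError(f"unknown task type: {task!r}")
    return {"list_as_text": dict(presets[task])}

print(list_as_text_for("pairwise"))
```

The returned dictionary mirrors the structure of the list_as_text block in the YAML configuration, so it can be merged into a configuration that is generated programmatically.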
