MTurk Integration

How to deploy annotation tasks to Amazon Mechanical Turk.

This guide walks through deploying a Potato annotation task to Amazon Mechanical Turk (MTurk).

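Overview

Potato integrates with MTurk through the External Question HIT type:

1. You create an External Question HIT on MTurk that points to your Potato server.
2. When a worker clicks the HIT, they are redirected to the Potato server.
3. Potato extracts the worker ID and other parameters from the URL.
4. The worker completes the annotation task.
5. On completion, the worker clicks "Submit HIT to MTurk".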

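URL Parameters

MTurk passes four parameters to the External Question URL:

Parameter	Description
`workerId`	The worker's unique MTurk identifier
`assignmentId`	Unique ID for this worker-HIT pair
`hitId`	The HIT identifier
`turkSubmitTo`	URL to which the completion form is POSTed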
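
To make the table concrete, here is a minimal sketch of reading these four parameters from a request URL with the Python standard library. The helper names `parse_mturk_params` and `is_preview` are hypothetical and are not part of Potato's API; the sketch only illustrates the idea, including the special `ASSIGNMENT_ID_NOT_AVAILABLE` value that marks preview mode.

```python
from urllib.parse import urlparse, parse_qs

# Value MTurk sends as assignmentId while a worker is only previewing a HIT.
PREVIEW_ASSIGNMENT_ID = "ASSIGNMENT_ID_NOT_AVAILABLE"

MTURK_PARAMS = ("workerId", "assignmentId", "hitId", "turkSubmitTo")


def parse_mturk_params(url):
    """Return the four MTurk parameters from the query string (missing ones map to None)."""
    query = parse_qs(urlparse(url).query)
    return {name: query.get(name, [None])[0] for name in MTURK_PARAMS}


def is_preview(params):
    """True when the worker is previewing and has not yet accepted the HIT."""
    return params["assignmentId"] == PREVIEW_ASSIGNMENT_ID


example = (
    "https://your-server.com:8080/?workerId=TEST_WORKER"
    "&assignmentId=TEST_ASSIGNMENT&hitId=TEST_HIT"
)
params = parse_mturk_params(example)
print(params["workerId"], is_preview(params))
```

Returning `None` for absent parameters keeps the caller's checks simple, for example when `turkSubmitTo` is missing during local testing.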

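Prerequisites

Server Requirements

A publicly accessible server:
- An open port (typically 8080 or 443)
- HTTPS recommended (some browsers require it)
- A stable internet connection
A Python environment with Potato installed

MTurk Requirements

MTurk Requester account: sign up at requester.mturk.com
Funded account: add funds for production (the sandbox is free)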

Quick Start

Step 1: Create the Potato Configuration

yaml

# mturk_task.yaml
annotation_task_name: "Sentiment Classification"
task_description: "Classify the sentiment of short text snippets."
 
# MTurk login configuration
login:
  type: url_direct
  url_argument: workerId
 
# Optional completion code
completion_code: "TASK_COMPLETE"
 
# Crowdsourcing settings
hide_navbar: true
jumping_to_id_disabled: true
assignment_strategy: random
max_annotations_per_user: 10
max_annotations_per_item: 3
 
# Data files
data_files:
  - data/items.json
 
# Annotation scheme
annotation_schemes:
  - annotation_type: radio
    name: sentiment
    description: "What is the sentiment of this text?"
    labels:
      - positive
      - neutral
      - negative

Step 2: Start the Server

bash

# Start the server
potato start mturk_task.yaml -p 8080
 
# Or with HTTPS (recommended)
potato start mturk_task.yaml -p 443 --ssl-cert cert.pem --ssl-key key.pem

Step 3: Create the HIT on MTurk

Use this XML template to create an External Question HIT:

xml

<?xml version="1.0" encoding="UTF-8"?>
<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
  <ExternalURL>https://your-server.com:8080/?workerId=${workerId}&amp;assignmentId=${assignmentId}&amp;hitId=${hitId}&amp;turkSubmitTo=${turkSubmitTo}</ExternalURL>
  <FrameHeight>800</FrameHeight>
</ExternalQuestion>

Important: in XML, write `&amp;` instead of a bare `&`.
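
The `&` escaping can also be done automatically rather than by hand. The following is a rough sketch, assuming you generate the XML in Python; `build_external_question` is a hypothetical helper, and the URL is the same placeholder used in the template above.

```python
from xml.sax.saxutils import escape

EXTERNAL_QUESTION_XMLNS = (
    "http://mechanicalturk.amazonaws.com/"
    "AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd"
)


def build_external_question(url, frame_height=800):
    """Wrap a task URL in ExternalQuestion XML, escaping &, < and > in the URL."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<ExternalQuestion xmlns="{EXTERNAL_QUESTION_XMLNS}">\n'
        f"  <ExternalURL>{escape(url)}</ExternalURL>\n"
        f"  <FrameHeight>{frame_height}</FrameHeight>\n"
        "</ExternalQuestion>"
    )


# Pass the URL with plain "&"; escape() turns each one into "&amp;".
question_xml = build_external_question(
    "https://your-server.com:8080/?workerId=${workerId}&assignmentId=${assignmentId}"
)
print(question_xml)
```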

Configuration Reference

Required Settings

yaml

login:
  type: url_direct      # Required: enables URL-based authentication
  url_argument: workerId  # Required: MTurk uses 'workerId' parameter

Recommended Settings

yaml

hide_navbar: true           # Prevent workers from skipping
jumping_to_id_disabled: true
assignment_strategy: random
max_annotations_per_user: 10
max_annotations_per_item: 3
task_description: "Brief description for the preview page."
completion_code: "YOUR_CODE"
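
The two limits above interact: with `max_annotations_per_item: 3` and `max_annotations_per_user: 10`, the number of workers you need grows with the dataset size. The sketch below is a back-of-the-envelope estimate, assuming every worker completes their full quota; `min_workers_needed` is a hypothetical helper, and real tasks usually need some headroom because workers drop out early.

```python
import math


def min_workers_needed(num_items, max_annotations_per_item=3, max_annotations_per_user=10):
    """Lower bound on the workers required to annotate every item the requested number of times."""
    total_annotations = num_items * max_annotations_per_item
    return math.ceil(total_annotations / max_annotations_per_user)


# 500 items x 3 annotations each = 1500 annotations, at most 10 per worker.
print(min_workers_needed(500))  # 150
```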

Testing in the Sandbox

Always test in the MTurk sandbox before moving to production.

Sandbox URLs

Service	URL
Requester	https://requestersandbox.mturk.com
Worker	https://workersandbox.mturk.com
API endpoint	https://mturk-requester-sandbox.us-east-1.amazonaws.com

Local Testing

To test the MTurk URL parameters locally:

bash

# Test normal workflow
curl "http://localhost:8080/?workerId=TEST_WORKER&assignmentId=TEST_ASSIGNMENT&hitId=TEST_HIT"
 
# Test preview mode
curl "http://localhost:8080/?workerId=TEST_WORKER&assignmentId=ASSIGNMENT_ID_NOT_AVAILABLE&hitId=TEST_HIT"
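
If you prefer to generate the test URLs from a script, the same two requests can be built with the standard library. This is only a sketch; `local_test_url` is a hypothetical helper and the IDs are the placeholder values from the curl commands above.

```python
from urllib.parse import urlencode


def local_test_url(assignment_id, base="http://localhost:8080/",
                   worker_id="TEST_WORKER", hit_id="TEST_HIT"):
    """Build a local URL carrying the parameters MTurk would normally supply."""
    query = urlencode({
        "workerId": worker_id,
        "assignmentId": assignment_id,
        "hitId": hit_id,
    })
    return f"{base}?{query}"


normal_url = local_test_url("TEST_ASSIGNMENT")
# MTurk uses this sentinel assignmentId while a worker is previewing the HIT.
preview_url = local_test_url("ASSIGNMENT_ID_NOT_AVAILABLE")
print(normal_url)
print(preview_url)
```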

MTurk API Integration (Optional)

To use advanced features, enable MTurk API integration. First install boto3:

bash

pip install boto3

Create configs/mturk_config.yaml:

yaml

aws_access_key_id: "YOUR_ACCESS_KEY"
aws_secret_access_key: "YOUR_SECRET_KEY"
sandbox: true  # Set to false for production
hit_id: "YOUR_HIT_ID"

Enable it in the main configuration:

yaml

mturk:
  enabled: true
  config_file_path: configs/mturk_config.yaml
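
As an illustration of what the `sandbox` flag means at the API level, the sketch below maps a loaded `mturk_config.yaml` dictionary to keyword arguments for `boto3.client`. How Potato itself reads this file is not shown in this guide, so `client_kwargs` is a hypothetical helper, and defaulting to the sandbox when the flag is absent is a choice made here for safety.

```python
SANDBOX_ENDPOINT = "https://mturk-requester-sandbox.us-east-1.amazonaws.com"
PRODUCTION_ENDPOINT = "https://mturk-requester.us-east-1.amazonaws.com"


def client_kwargs(config):
    """Translate a config dict into keyword arguments for boto3.client('mturk', ...)."""
    # Default to the sandbox so a missing flag cannot spend real money.
    use_sandbox = config.get("sandbox", True)
    return {
        "region_name": "us-east-1",
        "endpoint_url": SANDBOX_ENDPOINT if use_sandbox else PRODUCTION_ENDPOINT,
        "aws_access_key_id": config["aws_access_key_id"],
        "aws_secret_access_key": config["aws_secret_access_key"],
    }


config = {
    "aws_access_key_id": "YOUR_ACCESS_KEY",
    "aws_secret_access_key": "YOUR_SECRET_KEY",
    "sandbox": True,
    "hit_id": "YOUR_HIT_ID",
}
kwargs = client_kwargs(config)
print(kwargs["endpoint_url"])
# With boto3 installed: mturk = boto3.client("mturk", **kwargs)
```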

Creating HITs Programmatically

python

import boto3
 
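# Credentials are read from the standard AWS configuration (environment or ~/.aws).
# This endpoint is the sandbox; production uses
# https://mturk-requester.us-east-1.amazonaws.com
mturk = boto3.client(
    'mturk',
    region_name='us-east-1',
    endpoint_url='https://mturk-requester-sandbox.us-east-1.amazonaws.com'
)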
 
question_xml = '''<?xml version="1.0" encoding="UTF-8"?>
<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
  <ExternalURL>https://your-server.com:8080/?workerId=${workerId}&amp;assignmentId=${assignmentId}&amp;hitId=${hitId}&amp;turkSubmitTo=${turkSubmitTo}</ExternalURL>
  <FrameHeight>800</FrameHeight>
</ExternalQuestion>'''
 
response = mturk.create_hit(
    Title='Sentiment Classification Task',
    Description='Classify the sentiment of short text snippets.',
    Keywords='sentiment, classification, text',
    Reward='0.50',                        # USD, passed as a string
    MaxAssignments=100,                   # number of distinct workers
    LifetimeInSeconds=86400,              # HIT stays available for 24 hours
    AssignmentDurationInSeconds=3600,     # each worker has 1 hour to finish
    AutoApprovalDelayInSeconds=604800,    # auto-approve after 7 days
    Question=question_xml
)
 
print(f"Created HIT: {response['HIT']['HITId']}")

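Best Practices

Task Design

Clear instructions: provide detailed examples
Adequate time: do not rush workers
Fair pay: at least the equivalent of minimum wage ($12-15 per hour)
Appropriate length: 5-15 minutes per HIT is ideal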
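
The pay and length guidelines combine into a simple calculation: reward = hourly rate x minutes / 60. The sketch below works this out with `Decimal` to avoid floating-point rounding and rounds up to the next cent so the worker is never underpaid. `reward_for` is a hypothetical helper; it returns a string because the `Reward` field of `create_hit` takes one.

```python
from decimal import Decimal, ROUND_UP


def reward_for(minutes_per_hit, hourly_rate_usd="12.00"):
    """Reward in USD for one HIT, rounded up to the next cent and returned as a string."""
    reward = Decimal(hourly_rate_usd) * Decimal(minutes_per_hit) / Decimal(60)
    return str(reward.quantize(Decimal("0.01"), rounding=ROUND_UP))


print(reward_for(10))           # 12.00 * 10 / 60 = 2.00
print(reward_for(5, "15.00"))   # 15.00 * 5 / 60 = 1.25
print(reward_for(8, "13.00"))   # 1.7333... rounds up to 1.74
```

This covers the worker's reward only; any platform fees charged to the requester are separate.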

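Quality Control

Qualification tests: screen workers in advance
Attention checks: include verification questions
Redundancy: multiple workers per item (three or more recommended)
Spot checks: manually review a subset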
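
One common way to use redundant annotations is a majority vote over the labels each item received. The sketch below is one possible aggregation, not Potato's built-in behavior; the `annotations` dictionary and its item IDs are made-up illustration data, and `majority_label` is a hypothetical helper.

```python
from collections import Counter


def majority_label(labels, min_votes=3):
    """Return the label chosen by a strict majority, or None if there is no consensus."""
    if len(labels) < min_votes:
        return None  # not enough redundancy yet
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) / 2 else None


annotations = {
    "item_1": ["positive", "positive", "neutral"],
    "item_2": ["positive", "neutral", "negative"],
    "item_3": ["negative", "negative"],
}
consensus = {item: majority_label(labels) for item, labels in annotations.items()}
print(consensus)
```

Items that come back as `None` (a three-way split, or too few votes) are natural candidates for the manual spot check.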

Technical

Handle edge cases: workers may reload the page or navigate back
Save progress: auto-save where possible
Handle errors gracefully: show helpful error messages

Troubleshooting

Preview page still appears after accepting the HIT

Check that the assignmentId parameter is passed correctly
The preview page refreshes automatically; ask workers to wait

Submit button does not work

Check the browser console for errors
Check that the turkSubmitTo parameter is present
Check for CORS or mixed-content issues
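
Two of these checks can be scripted. The sketch below flags a missing `turkSubmitTo` and a non-HTTPS server URL (MTurk pages are served over HTTPS, so an HTTP task page can be blocked as mixed content), and shows the address the completion form is posted to, which is `turkSubmitTo` followed by `/mturk/externalSubmit`. Both helper names are hypothetical.

```python
from urllib.parse import urlparse


def submit_problems(params, server_url):
    """Return a list of likely reasons the submit button would fail."""
    problems = []
    if not params.get("turkSubmitTo"):
        problems.append("turkSubmitTo is missing from the URL")
    if urlparse(server_url).scheme != "https":
        problems.append("server is not served over HTTPS (possible mixed content)")
    return problems


def external_submit_url(params):
    """Address the completion form should POST to."""
    return params["turkSubmitTo"].rstrip("/") + "/mturk/externalSubmit"


sandbox_params = {"turkSubmitTo": "https://workersandbox.mturk.com"}
print(submit_problems(sandbox_params, "https://your-server.com/"))
print(external_submit_url(sandbox_params))
```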

Workers cannot log in

Check that login.url_argument is set to workerId
Check that login.type is url_direct
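
Both login checks can be run against the configuration before the server starts. This is a minimal sketch that assumes the YAML file has already been loaded into a dictionary (for example with a YAML parser); `login_problems` is a hypothetical helper, not a Potato command.

```python
def login_problems(config):
    """Return a list of login settings that would prevent MTurk workers from logging in."""
    login = config.get("login") or {}
    problems = []
    if login.get("type") != "url_direct":
        problems.append("login.type must be 'url_direct'")
    if login.get("url_argument") != "workerId":
        problems.append("login.url_argument must be 'workerId'")
    return problems


good = {"login": {"type": "url_direct", "url_argument": "workerId"}}
bad = {"login": {"type": "password"}}
print(login_problems(good))
print(login_problems(bad))
```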
