Deploying to Amazon Mechanical Turk
Step-by-step instructions for running Potato annotation tasks on MTurk with qualification tests and approval workflows.
Amazon Mechanical Turk (MTurk) provides access to a large, on-demand workforce for annotation tasks. Potato integrates with MTurk through the ExternalQuestion HIT type: your Potato server acts as the annotation interface, and MTurk handles worker recruitment, assignment tracking, and payment. This guide covers the complete setup process.
Prerequisites
- AWS account with MTurk enabled
- MTurk Requester account (production or sandbox) at requester.mturk.com
- Potato server accessible via a public URL (HTTPS recommended)
- Python environment with Potato installed
- Basic familiarity with MTurk concepts (HITs, Workers, Assignments)
How the Integration Works
Potato does not manage MTurk HITs directly. Instead, the integration follows this flow:
1. You create an ExternalQuestion HIT on MTurk that points to your Potato server URL
2. A worker accepts the HIT on MTurk and is redirected to your Potato server with query parameters (`workerId`, `assignmentId`, `hitId`, `turkSubmitTo`)
3. Potato uses the `workerId` parameter to identify the worker (via `login.type: url_direct`)
4. The worker completes the annotation task on your Potato server
5. Upon completion, Potato redirects the worker back to MTurk's submission endpoint
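For reference, the parameters in step 2 arrive as an ordinary query string. This sketch (with placeholder hostname and IDs) shows how the `workerId` that Potato's `url_direct` login reads can be extracted using Python's standard library:

```python
from urllib.parse import urlparse, parse_qs

# Example of the URL a worker lands on after accepting the HIT
# (hostname and IDs are placeholders)
url = ("https://your-server.com:8080/?workerId=A1B2C3"
       "&assignmentId=3XYZ&hitId=3ABC"
       "&turkSubmitTo=https://www.mturk.com")

# parse_qs maps each parameter name to a list of values
params = parse_qs(urlparse(url).query)

# Potato's url_direct login uses the workerId parameter as the user identity
worker_id = params["workerId"][0]
print(worker_id)  # A1B2C3
```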
Configuration
The key to MTurk integration is setting the login type to `url_direct` with `url_argument: workerId`. This tells Potato to extract the worker's identity from the URL query parameter that MTurk passes automatically.
```yaml
login:
  type: url_direct
  url_argument: workerId
```

That is the only MTurk-specific configuration in Potato. Everything else -- HIT creation, qualifications, payment, approval -- is managed on the MTurk side.
Complete Configuration Example
```yaml
annotation_task_name: "Sentiment Classification"
task_description: "Classify the sentiment of short text snippets."

# MTurk login: extract worker ID from URL parameter
login:
  type: url_direct
  url_argument: workerId

# UI settings recommended for crowdsourcing
hide_navbar: true
jumping_to_id_disabled: true

# Assignment settings
assignment_strategy: random
max_annotations_per_user: 20
max_annotations_per_item: 3

# Data
data_files:
  - data/items.json
item_properties:
  id_key: id
  text_key: text

# Annotation scheme
annotation_schemes:
  - annotation_type: radio
    name: sentiment
    description: "What is the sentiment of this text?"
    labels:
      - Positive
      - Negative
      - Neutral

# Output
output_annotation_dir: annotation_output
output_annotation_format: json
```

Setting Up on MTurk
Step 1: Start Your Potato Server
Launch your Potato server on a publicly accessible machine:
```shell
potato start config.yaml -p 8080
```

Make sure the server is reachable from the internet (e.g., https://your-server.com:8080/).
Step 2: Create the ExternalQuestion XML
MTurk uses an XML format called ExternalQuestion to embed external websites in a HIT. Create the following XML:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
  <ExternalURL>https://your-server.com:8080/?workerId=${workerId}&amp;assignmentId=${assignmentId}&amp;hitId=${hitId}&amp;turkSubmitTo=${turkSubmitTo}</ExternalURL>
  <FrameHeight>800</FrameHeight>
</ExternalQuestion>
```

Important: Use `&amp;` instead of a bare `&` in the XML, since `&` alone is not valid XML. MTurk will substitute the `${...}` placeholders with actual values when a worker accepts the HIT.
Step 3: Create the HIT on MTurk
You can create HITs through the MTurk Requester Console or programmatically using the AWS SDK (boto3). HIT settings like title, description, reward, duration, and qualifications are all configured on the MTurk side, not in Potato.
Using boto3 (Python)
```python
import boto3

# Use sandbox for testing
mturk = boto3.client(
    'mturk',
    region_name='us-east-1',
    endpoint_url='https://mturk-requester-sandbox.us-east-1.amazonaws.com'
)
# For production, omit endpoint_url or use:
# endpoint_url='https://mturk-requester.us-east-1.amazonaws.com'

question_xml = '''<?xml version="1.0" encoding="UTF-8"?>
<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
  <ExternalURL>https://your-server.com:8080/?workerId=${workerId}&amp;assignmentId=${assignmentId}&amp;hitId=${hitId}&amp;turkSubmitTo=${turkSubmitTo}</ExternalURL>
  <FrameHeight>800</FrameHeight>
</ExternalQuestion>'''

response = mturk.create_hit(
    Title='Sentiment Classification Task',
    Description='Read short texts and classify their sentiment as positive, negative, or neutral.',
    Keywords='sentiment, classification, text, NLP',
    Reward='0.50',
    MaxAssignments=100,
    LifetimeInSeconds=86400,            # 1 day
    AssignmentDurationInSeconds=3600,   # 1 hour
    AutoApprovalDelayInSeconds=604800,  # 7 days
    Question=question_xml,
    QualificationRequirements=[
        {
            'QualificationTypeId': '000000000000000000L0',  # Approval rate
            'Comparator': 'GreaterThanOrEqualTo',
            'IntegerValues': [97]
        },
        {
            'QualificationTypeId': '00000000000000000040',  # Number approved
            'Comparator': 'GreaterThanOrEqualTo',
            'IntegerValues': [500]
        },
        {
            'QualificationTypeId': '00000000000000000071',  # Locale
            'Comparator': 'In',
            'LocaleValues': [
                {'Country': 'US'},
                {'Country': 'GB'},
                {'Country': 'CA'},
                {'Country': 'AU'}
            ]
        }
    ]
)
print(f"Created HIT: {response['HIT']['HITId']}")
```

Step 4: Set Qualifications (on MTurk)
Worker qualifications are configured entirely on the MTurk side when creating your HIT. Common qualification filters include:
- Approval rate: Require a minimum HIT approval percentage (e.g., 97%+)
- HITs approved: Require a minimum number of previously approved HITs (e.g., 500+)
- Locale: Restrict to workers from specific countries
- Masters: Use MTurk's pre-vetted Masters workers (higher fees apply)
- Custom qualifications: Create your own qualification tests through the MTurk console
Completion Handling
When a worker finishes annotating all assigned items, Potato needs to redirect them back to MTurk so the assignment can be submitted. MTurk passes a turkSubmitTo URL parameter that tells Potato where to send the completion POST request.
The worker sees a "Submit HIT" button after completing the task. Clicking it submits the assignment back to MTurk for your review and approval.
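For reference, a sketch of how that submission target is formed, assuming the standard `/mturk/externalSubmit` path that MTurk's ExternalQuestion flow expects. The hostname and IDs are placeholders, and the actual submission is a form POST from the worker's browser, which Potato handles for you:

```python
from urllib.parse import urlencode

def build_submit_url(turk_submit_to: str, assignment_id: str) -> str:
    """Build the MTurk external-submit target for a completed assignment.

    MTurk's ExternalQuestion flow expects a POST to
    {turkSubmitTo}/mturk/externalSubmit that includes the assignmentId.
    """
    base = turk_submit_to.rstrip("/") + "/mturk/externalSubmit"
    return base + "?" + urlencode({"assignmentId": assignment_id})

# Sandbox HITs pass turkSubmitTo=https://workersandbox.mturk.com
url = build_submit_url("https://workersandbox.mturk.com", "3XYZ")
print(url)
```

The query-string form here is just for illustration; in practice the `assignmentId` travels in the POST body of the submitted form.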
Testing in the MTurk Sandbox
Always test your setup in the MTurk Sandbox before going to production.
| Service | URL |
|---|---|
| Requester Sandbox | https://requestersandbox.mturk.com |
| Worker Sandbox | https://workersandbox.mturk.com |
| API Endpoint (Sandbox) | https://mturk-requester-sandbox.us-east-1.amazonaws.com |
Local Testing
You can test the URL parameter flow locally without MTurk:
```shell
# Simulate a worker accessing your task
curl "http://localhost:8080/?workerId=TEST_WORKER&assignmentId=TEST_ASSIGN&hitId=TEST_HIT"

# Simulate preview mode (before a worker accepts the HIT)
curl "http://localhost:8080/?workerId=TEST_WORKER&assignmentId=ASSIGNMENT_ID_NOT_AVAILABLE&hitId=TEST_HIT"
```

When `assignmentId` is `ASSIGNMENT_ID_NOT_AVAILABLE`, the worker is previewing the HIT and has not yet accepted it.
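If you script checks around this flow, the sentinel comparison is all there is to it; a minimal helper (the function name is illustrative):

```python
# MTurk's fixed sentinel value for the preview state
PREVIEW_SENTINEL = "ASSIGNMENT_ID_NOT_AVAILABLE"

def is_preview(assignment_id: str) -> bool:
    """True when the worker is previewing the HIT and has not yet accepted it."""
    return assignment_id == PREVIEW_SENTINEL

print(is_preview("ASSIGNMENT_ID_NOT_AVAILABLE"))  # True
print(is_preview("TEST_ASSIGN"))                  # False
```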
Managing HITs and Approvals
HIT management -- monitoring progress, approving or rejecting assignments, issuing bonuses -- is done through MTurk's own tools:
- MTurk Requester Console: Web interface for managing HITs, reviewing assignments, and communicating with workers
- boto3 (AWS SDK for Python): Programmatic access for batch operations
```python
# Example: List assignments for a HIT
assignments = mturk.list_assignments_for_hit(
    HITId='YOUR_HIT_ID',
    AssignmentStatuses=['Submitted']
)

# Approve an assignment
mturk.approve_assignment(AssignmentId='ASSIGNMENT_ID')

# Reject an assignment (use sparingly)
mturk.reject_assignment(
    AssignmentId='ASSIGNMENT_ID',
    RequesterFeedback='Did not complete all items.'
)
```

Cost Calculation
MTurk charges fees on top of the reward you pay workers:
- Base fee: 20% of the reward amount
- Masters qualification: Additional 5% fee
- 10+ assignments per HIT: Additional 20% fee
Example: If you pay $0.50 per assignment for 100 assignments:
- Spread across HITs with fewer than 10 assignments each: 100 x $0.50 x 1.20 = $60.00
- All 100 assignments on a single HIT (the 10+ assignment fee applies): 100 x $0.50 x 1.40 = $70.00
- With Masters (fewer than 10 assignments per HIT): 100 x $0.50 x 1.25 = $62.50
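The fee tiers above are easy to encode; a quick sketch (the function name is illustrative, and you should verify current pricing on MTurk before budgeting):

```python
def mturk_cost(reward: float, n_assignments: int,
               assignments_per_hit: int = 1, masters: bool = False) -> float:
    """Estimate total MTurk cost: worker rewards plus requester fees.

    Fee schedule assumed here: 20% base, +20% for HITs with 10 or more
    assignments, +5% for Masters workers.
    """
    fee = 0.20                      # base fee on the reward amount
    if assignments_per_hit >= 10:
        fee += 0.20                 # surcharge for 10+ assignments per HIT
    if masters:
        fee += 0.05                 # Masters qualification surcharge
    return round(n_assignments * reward * (1 + fee), 2)

print(mturk_cost(0.50, 100))                           # 60.0
print(mturk_cost(0.50, 100, assignments_per_hit=100))  # 70.0
print(mturk_cost(0.50, 100, masters=True))             # 62.5
```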
Best Practices
- Start with Sandbox: Always test the full workflow in the sandbox before spending money
- Fair pay: Calculate an hourly rate (reward / estimated minutes per task x 60) and aim for at least $12-15/hour
- Clear HIT descriptions: Well-written titles and descriptions attract better workers
- Quick approval: Workers appreciate fast payment -- approve promptly when quality is acceptable
- Handle rejections carefully: Rejections hurt workers' approval rates and affect your requester reputation
- Use HTTPS: Some browsers block mixed content; HTTPS ensures the iframe works reliably
- Set `hide_navbar: true`: Prevents workers from navigating away from the task within Potato
- Monitor your server: Ensure your Potato server stays up for the duration of the HIT
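The fair-pay arithmetic from the list above, as a tiny helper (the function name is illustrative):

```python
def hourly_rate(reward: float, minutes_per_task: float) -> float:
    """Effective hourly rate: reward per task scaled up to 60 minutes."""
    return reward / minutes_per_task * 60

# A $0.50 task taking about 2 minutes pays $15/hour, meeting the target
print(hourly_rate(0.50, 2))  # 15.0
```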
Comparison: MTurk vs Prolific
| Aspect | MTurk | Prolific |
|---|---|---|
| Worker pool | Large, diverse | Smaller, research-focused |
| Quality | Variable | Generally higher |
| Pricing | Lower base, + fees | Higher, transparent |
| Setup | More complex | Simpler |
| Best for | Large scale, budget | Research, quality |
| Potato config | url_argument: workerId | url_argument: PROLIFIC_PID |
Next Steps
- Compare with Prolific integration
- Set up quality control
- Calculate inter-annotator agreement
Full MTurk documentation at /docs/deployment/mturk-integration. For implementation details, see the source documentation.