MTurk Integration

Deploy annotation tasks on Amazon Mechanical Turk.

Amazon Mechanical Turk Integration

This guide explains how to deploy Potato annotation tasks on Amazon Mechanical Turk (MTurk).

Overview

Potato integrates with MTurk through the External Question HIT type:

1. You create an External Question HIT on MTurk that points to your Potato server.
2. Workers click your HIT and are redirected to your Potato server.
3. Potato extracts the worker ID and other parameters from the URL.
4. Workers complete the annotation task.
5. When finished, workers click "Submit HIT to MTurk".

URL Parameters

MTurk passes four parameters to your External Question URL:

Parameter	Description
`workerId`	The worker's unique MTurk identifier
`assignmentId`	Unique ID for this worker-HIT pair
`hitId`	The HIT identifier
`turkSubmitTo`	URL to which the completion form should be POSTed
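
As a minimal sketch of how a server could read these four parameters, the snippet below uses only the Python standard library. It is not Potato's own parsing code, which may differ. The helper name `parse_mturk_params` and the sample URL are hypothetical. It assumes two MTurk conventions: `assignmentId` is set to `ASSIGNMENT_ID_NOT_AVAILABLE` while a worker is only previewing the HIT, and the completion form is posted to the `/mturk/externalSubmit` path on the `turkSubmitTo` host.

```python
from urllib.parse import urlparse, parse_qs

# Value MTurk sends as assignmentId while the worker is only previewing the HIT.
PREVIEW_ASSIGNMENT_ID = "ASSIGNMENT_ID_NOT_AVAILABLE"
MTURK_PARAMS = ("workerId", "assignmentId", "hitId", "turkSubmitTo")


def parse_mturk_params(url):
    """Extract the MTurk parameters from an External Question URL (hypothetical helper)."""
    query = parse_qs(urlparse(url).query)
    # parse_qs maps each name to a list of values; keep the first, or None if absent.
    params = {name: query.get(name, [None])[0] for name in MTURK_PARAMS}
    params["is_preview"] = params["assignmentId"] == PREVIEW_ASSIGNMENT_ID
    host = params["turkSubmitTo"]
    # The completion form is posted to this path on the turkSubmitTo host.
    params["submit_url"] = (host.rstrip("/") + "/mturk/externalSubmit") if host else None
    return params


example = parse_mturk_params(
    "https://your-server.com/?workerId=TEST_WORKER&assignmentId=TEST_ASSIGNMENT"
    "&hitId=TEST_HIT&turkSubmitTo=https%3A%2F%2Fworkersandbox.mturk.com"
)
print(example["workerId"], example["is_preview"], example["submit_url"])
```

Checking `is_preview` before creating a user session is one way to avoid recording annotations from workers who have not yet accepted the HIT.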

Prerequisites

Server Requirements

A publicly accessible server with:
- An open port (typically 8080 or 443)
- HTTPS recommended (required by some browsers)
- A stable internet connection
A Python environment with Potato installed

MTurk Requirements

MTurk Requester Account: sign up at requester.mturk.com
Funded account: add funds for production (the sandbox is free)

Quick Start

Step 1: Create your Potato configuration

yaml

# mturk_task.yaml
annotation_task_name: "Sentiment Classification"
task_description: "Classify the sentiment of short text snippets."
 
# MTurk login configuration
login:
  type: url_direct
  url_argument: workerId
 
# Optional completion code
completion_code: "TASK_COMPLETE"
 
# Crowdsourcing settings
hide_navbar: true
jumping_to_id_disabled: true
assignment_strategy: random
max_annotations_per_user: 10
max_annotations_per_item: 3
 
# Data files
data_files:
  - data/items.json
 
# Annotation scheme
annotation_schemes:
  - annotation_type: radio
    name: sentiment
    description: "What is the sentiment of this text?"
    labels:
      - positive
      - neutral
      - negative

Step 2: Start your server

bash

# Start the server
potato start mturk_task.yaml -p 8080
 
# Or with HTTPS (recommended)
potato start mturk_task.yaml -p 443 --ssl-cert cert.pem --ssl-key key.pem

Step 3: Create your HIT on MTurk

Create an External Question HIT using this XML template:

xml

<?xml version="1.0" encoding="UTF-8"?>
<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
  <ExternalURL>https://your-server.com:8080/?workerId=${workerId}&amp;assignmentId=${assignmentId}&amp;hitId=${hitId}&amp;turkSubmitTo=${turkSubmitTo}</ExternalURL>
  <FrameHeight>800</FrameHeight>
</ExternalQuestion>

Important: inside the XML, write `&amp;` instead of a bare `&`.
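
To avoid escaping the URL by hand, the XML can be generated from the raw URL. The following is a sketch using only the standard library. The helper name `build_external_question` is hypothetical, and the URL is the placeholder from the template above. It escapes the URL, then parses the result back to confirm the XML is well-formed and that the URL survives the round trip unchanged.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

SCHEMA = (
    "http://mechanicalturk.amazonaws.com/"
    "AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd"
)


def build_external_question(url, frame_height=800):
    """Return ExternalQuestion XML with the URL escaped (hypothetical helper)."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<ExternalQuestion xmlns="{SCHEMA}">\n'
        f"  <ExternalURL>{escape(url)}</ExternalURL>\n"  # escape() turns & into &amp;
        f"  <FrameHeight>{frame_height}</FrameHeight>\n"
        "</ExternalQuestion>"
    )


# Raw URL with bare ampersands, as you would type it in a browser.
raw_url = (
    "https://your-server.com:8080/?workerId=${workerId}&assignmentId=${assignmentId}"
    "&hitId=${hitId}&turkSubmitTo=${turkSubmitTo}"
)
question_xml = build_external_question(raw_url)

# Parsing raises xml.etree.ElementTree.ParseError if the XML is malformed.
root = ET.fromstring(question_xml.encode("utf-8"))
decoded_url = root.find(f"{{{SCHEMA}}}ExternalURL").text
print(question_xml)
```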

Configuration Reference

Required Settings

yaml

login:
  type: url_direct      # Required: enables URL-based authentication
  url_argument: workerId  # Required: MTurk uses 'workerId' parameter

Recommended Settings

yaml

hide_navbar: true           # Prevent workers from skipping
jumping_to_id_disabled: true
assignment_strategy: random
max_annotations_per_user: 10
max_annotations_per_item: 3
task_description: "Brief description for the preview page."
completion_code: "YOUR_CODE"

Testing in the Sandbox

Always test in the MTurk Sandbox before going to production.

Sandbox URLs

Service	URL
Requester	https://requestersandbox.mturk.com
Worker	https://workersandbox.mturk.com
API Endpoint	https://mturk-requester-sandbox.us-east-1.amazonaws.com
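
When scripting against the API, it helps to choose the endpoint from a single flag so that a sandbox run cannot reach production by accident. The sketch below is one way to do that. The helper names `mturk_endpoint` and `make_client` are hypothetical. The production endpoint shown is the standard MTurk requester endpoint for `us-east-1` and does not appear in the table above.

```python
SANDBOX_ENDPOINT = "https://mturk-requester-sandbox.us-east-1.amazonaws.com"
# Assumption: standard production endpoint, not listed in the table above.
PRODUCTION_ENDPOINT = "https://mturk-requester.us-east-1.amazonaws.com"


def mturk_endpoint(sandbox=True):
    """Return the API endpoint; defaults to the sandbox as a safety measure."""
    return SANDBOX_ENDPOINT if sandbox else PRODUCTION_ENDPOINT


def make_client(sandbox=True):
    """Create a boto3 MTurk client for the chosen environment (hypothetical helper)."""
    import boto3  # imported lazily so mturk_endpoint() works without boto3 installed

    return boto3.client(
        "mturk",
        region_name="us-east-1",
        endpoint_url=mturk_endpoint(sandbox),
    )


print(mturk_endpoint(sandbox=True))
```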

Local Testing

Test the MTurk URL parameters locally:

bash

# Test normal workflow
curl "http://localhost:8080/?workerId=TEST_WORKER&assignmentId=TEST_ASSIGNMENT&hitId=TEST_HIT"
 
# Test preview mode
curl "http://localhost:8080/?workerId=TEST_WORKER&assignmentId=ASSIGNMENT_ID_NOT_AVAILABLE&hitId=TEST_HIT"

MTurk API Integration (Optional)

For advanced features, enable MTurk API integration:

bash

pip install boto3

Create configs/mturk_config.yaml:

yaml

aws_access_key_id: "YOUR_ACCESS_KEY"
aws_secret_access_key: "YOUR_SECRET_KEY"
sandbox: true  # Set to false for production
hit_id: "YOUR_HIT_ID"

Enable it in your main config:

yaml

mturk:
  enabled: true
  config_file_path: configs/mturk_config.yaml

Create HITs Programmatically

python

import boto3
 
mturk = boto3.client(
    'mturk',
    region_name='us-east-1',
    endpoint_url='https://mturk-requester-sandbox.us-east-1.amazonaws.com'
)
 
question_xml = '''<?xml version="1.0" encoding="UTF-8"?>
<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
  <ExternalURL>https://your-server.com:8080/?workerId=${workerId}&amp;assignmentId=${assignmentId}&amp;hitId=${hitId}&amp;turkSubmitTo=${turkSubmitTo}</ExternalURL>
  <FrameHeight>800</FrameHeight>
</ExternalQuestion>'''
 
response = mturk.create_hit(
    Title='Sentiment Classification Task',
    Description='Classify the sentiment of short text snippets.',
    Keywords='sentiment, classification, text',
    Reward='0.50',
    MaxAssignments=100,
    LifetimeInSeconds=86400,
    AssignmentDurationInSeconds=3600,
    AutoApprovalDelayInSeconds=604800,
    Question=question_xml
)
 
print(f"Created HIT: {response['HIT']['HITId']}")

Best Practices

Task Design

Clear instructions: provide detailed examples
Reasonable time: don't rush workers
Fair pay: at least the equivalent of minimum wage ($12-15/hour)
Manageable length: 5-15 minutes per HIT is ideal
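
The pay and length guidelines combine into a simple calculation: reward per HIT = hourly rate × minutes per HIT / 60. The sketch below works this through with `Decimal` and returns a string, since the `Reward` field of `create_hit` takes a string. The helper name `reward_for_hit` is hypothetical, and the minutes per HIT are an estimate you would need to measure yourself, for example in a pilot run.

```python
from decimal import Decimal, ROUND_HALF_UP


def reward_for_hit(hourly_rate_usd, minutes_per_hit):
    """Return the per-HIT reward as a string with two decimals (hypothetical helper)."""
    reward = Decimal(str(hourly_rate_usd)) * Decimal(str(minutes_per_hit)) / Decimal(60)
    # Round to whole cents.
    return str(reward.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))


print(reward_for_hit(12, 5))   # 12 * 5 / 60  = 1.00
print(reward_for_hit(15, 15))  # 15 * 15 / 60 = 3.75
```

By the same arithmetic, a reward of 0.50 at $12/hour corresponds to about 2.5 minutes of work.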

Quality Control

Qualification tests: screen workers in advance
Attention checks: include validation questions
Redundancy: multiple workers per item (3+ recommended)
Review samples: manually check a subset
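
Redundant annotations only help if they are aggregated afterwards. One common approach, shown here as a sketch and not as something Potato does for you, is a majority vote per item. The function name `majority_labels`, the `(item_id, worker_id, label)` tuple layout, and the sample data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def majority_labels(annotations):
    """Aggregate (item_id, worker_id, label) tuples by majority vote (hypothetical helper)."""
    by_item = defaultdict(list)
    for item_id, _worker_id, label in annotations:
        by_item[item_id].append(label)

    results = {}
    for item_id, labels in by_item.items():
        top_label, top_count = Counter(labels).most_common(1)[0]
        results[item_id] = {
            # Keep a label only if a strict majority of workers chose it.
            "label": top_label if top_count > len(labels) / 2 else None,
            "agreement": top_count / len(labels),
        }
    return results


sample = [
    ("item_a", "W1", "positive"),
    ("item_a", "W2", "positive"),
    ("item_a", "W3", "neutral"),
    ("item_b", "W1", "positive"),
    ("item_b", "W2", "neutral"),
    ("item_b", "W3", "negative"),
]
for item_id, result in majority_labels(sample).items():
    print(item_id, result)
```

Items with no majority label (`None`) are natural candidates for the manual review step.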

Technical

Handle edge cases: workers may reload or go back
Save progress: autosave if possible
Graceful errors: show helpful error messages

Troubleshooting

Workers see the preview page after accepting

Verify that the assignmentId parameter is being passed correctly
The preview page auto-refreshes; ask workers to wait

Submit button doesn't work

Check the browser console for errors
Verify that the turkSubmitTo parameter is present
Check for CORS or mixed-content issues

Workers can't log in

Verify that login.url_argument is set to workerId
Make sure login.type is url_direct
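
The two login checks can be automated before the server is started. The sketch below assumes the YAML config has already been loaded into a plain dictionary, for example with a YAML parser of your choice. The function name `check_mturk_login` is hypothetical and is not part of Potato.

```python
def check_mturk_login(config):
    """Return a list of problems with the MTurk login settings (hypothetical helper)."""
    problems = []
    login = config.get("login") or {}
    if login.get("type") != "url_direct":
        problems.append("login.type should be 'url_direct'")
    if login.get("url_argument") != "workerId":
        problems.append("login.url_argument should be 'workerId'")
    return problems


good = {"login": {"type": "url_direct", "url_argument": "workerId"}}
bad = {"login": {"type": "password"}}

print(check_mturk_login(good))  # no problems found
print(check_mturk_login(bad))   # both settings are reported
```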

Further Reading

Crowdsourcing Integration - general crowdsourcing setup
Quality Control - attention checks and gold standards
Task Assignment - assignment strategies

For implementation details, see the source documentation.