
User Simulator

Automated testing tool for simulating multiple annotators with configurable behaviors.

The User Simulator enables automated testing of Potato annotation tasks by simulating multiple users with configurable behaviors and competence levels.

Overview

The simulator is useful for:

  • Quality control testing: Test attention checks, gold standards, and blocking behavior
  • Dashboard testing: Generate realistic annotation data for the admin dashboard
  • Scalability testing: Stress test the server with many concurrent users
  • AI assistance evaluation: Compare LLM accuracy against human-like behaviors
  • Active learning testing: Simulate iterative annotation workflows

Quick Start

```bash
# Basic random simulation with 10 users
python -m potato.simulator --server http://localhost:8000 --users 10

# With a configuration file
python -m potato.simulator --config simulator-config.yaml --server http://localhost:8000

# Fast scalability test (no waiting between annotations)
python -m potato.simulator --server http://localhost:8000 --users 50 --parallel 10 --fast-mode
```

Configuration

YAML Configuration File

Create a YAML file with simulator settings:

```yaml
simulator:
  # User configuration
  users:
    count: 20
    competence_distribution:
      good: 0.5      # 50% will be "good" annotators (80-90% accuracy)
      average: 0.3   # 30% "average" (60-70% accuracy)
      poor: 0.2      # 20% "poor" (40-50% accuracy)

  # Annotation strategy
  strategy: random  # random, biased, llm, pattern

  # Timing configuration
  timing:
    annotation_time:
      min: 2.0
      max: 45.0
      mean: 12.0
      std: 6.0
      distribution: normal  # uniform, normal, exponential

  # Execution
  execution:
    parallel_users: 5
    delay_between_users: 0.5
    max_annotations_per_user: 50

server:
  url: http://localhost:8000
```
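
The timing block controls how long each simulated user "thinks" before submitting an annotation. A minimal sketch of how a per-annotation delay might be sampled from these settings; the clamping to [min, max] is an assumption for illustration, not documented simulator behavior:

```python
import random

def sample_annotation_time(cfg: dict) -> float:
    """Draw a simulated think-time; clamp to [min, max] (assumed behavior)."""
    dist = cfg.get("distribution", "uniform")
    if dist == "normal":
        t = random.gauss(cfg["mean"], cfg["std"])
    elif dist == "exponential":
        t = random.expovariate(1.0 / cfg["mean"])
    else:  # uniform
        t = random.uniform(cfg["min"], cfg["max"])
    return min(max(t, cfg["min"]), cfg["max"])

# Example: the settings from the YAML above
timing = {"min": 2.0, "max": 45.0, "mean": 12.0, "std": 6.0, "distribution": "normal"}
print(sample_annotation_time(timing))
```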

Competence Levels

| Level | Accuracy | Description |
|-------|----------|-------------|
| perfect | 100% | Always matches gold standard |
| good | 80-90% | High-quality annotator |
| average | 60-70% | Typical crowdworker |
| poor | 40-50% | Low-quality annotator |
| random | ~1/N | Random selection from labels |
| adversarial | 0% | Intentionally wrong (for testing QC) |
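
One way to read these levels: a user with accuracy p answers with the gold label with probability p and otherwise picks a different label at random. A minimal sketch under that assumption; the function and names are illustrative, not the simulator's API:

```python
import random

def simulate_answer(gold: str, labels: list[str], accuracy: float) -> str:
    """Return the gold label with probability `accuracy`, else a wrong label."""
    if random.random() < accuracy:
        return gold
    wrong = [label for label in labels if label != gold]
    return random.choice(wrong) if wrong else gold

# A "good" annotator at ~85% accuracy
print(simulate_answer("positive", ["positive", "negative", "neutral"], 0.85))
```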

Annotation Strategies

Random Strategy (default)

Selects labels uniformly at random:

```yaml
strategy: random
```

Biased Strategy

Weighted selection based on label preferences:

```yaml
strategy: biased
biased_config:
  label_weights:
    positive: 0.6
    negative: 0.3
    neutral: 0.1
```
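
Weighted selection of this kind can be expressed directly with `random.choices`; a minimal sketch mirroring the config above, not the simulator's internals:

```python
import random

label_weights = {"positive": 0.6, "negative": 0.3, "neutral": 0.1}
labels, weights = zip(*label_weights.items())

# Draw one biased label per instance
label = random.choices(labels, weights=weights, k=1)[0]
print(label)
```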

LLM Strategy

Uses an LLM to generate annotations based on text content:

```yaml
strategy: llm
llm_config:
  endpoint_type: openai
  model: gpt-4o-mini
  api_key: ${OPENAI_API_KEY}
  temperature: 0.1
  add_noise: true
  noise_rate: 0.05
```
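
The add_noise and noise_rate options make the LLM behave less than perfectly. A plausible reading, shown here as an assumption rather than the documented semantics, is that each LLM answer is replaced by a random label with probability noise_rate:

```python
import random

def add_label_noise(llm_label: str, labels: list[str], noise_rate: float = 0.05) -> str:
    """With probability `noise_rate`, discard the LLM's answer for a random label (assumed semantics)."""
    if random.random() < noise_rate:
        return random.choice(labels)
    return llm_label
```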

For local LLMs with Ollama:

```yaml
strategy: llm
llm_config:
  endpoint_type: ollama
  model: llama3.2
  base_url: http://localhost:11434
```

CLI Options

```text
Usage: python -m potato.simulator [OPTIONS]

Required:
  --server, -s URL        Potato server URL

User Configuration:
  --users, -u NUM         Number of simulated users (default: 10)
  --competence DIST       Competence distribution

Strategy:
  --strategy TYPE         Strategy: random, biased, llm, pattern
  --llm-endpoint TYPE     LLM endpoint: openai, anthropic, ollama
  --llm-model NAME        LLM model name

Execution:
  --parallel, -p NUM      Max concurrent users (default: 5)
  --max-annotations, -m   Max annotations per user
  --fast-mode             Disable waiting between annotations

Output:
  --output-dir, -o DIR    Output directory (default: simulator_output)
```
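
The CLI can also be driven from a script or CI job. A minimal sketch using subprocess, passing only flags documented above:

```python
import subprocess
import sys

# Run a fast 50-user scalability test against a local server
subprocess.run(
    [
        sys.executable, "-m", "potato.simulator",
        "--server", "http://localhost:8000",
        "--users", "50",
        "--parallel", "10",
        "--fast-mode",
        "--output-dir", "simulator_output",
    ],
    check=True,
)
```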

Quality Control Testing

Test attention check detection:

```yaml
simulator:
  users:
    count: 10
    competence_distribution:
      adversarial: 1.0  # All users will fail
  quality_control:
    attention_check_fail_rate: 0.5
    respond_fast_rate: 0.3
```

Output Files

After simulation, results are exported to the output directory:

  • summary_{timestamp}.json - Aggregate statistics
  • user_results_{timestamp}.json - Per-user detailed results
  • annotations_{timestamp}.csv - All annotations in flat format

Summary Example

```json
{
  "user_count": 20,
  "total_annotations": 400,
  "total_time_seconds": 125.3,
  "attention_checks": {
    "passed": 18,
    "failed": 2,
    "pass_rate": 0.9
  }
}
```
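
Standard JSON tooling is enough to consume these files. A minimal sketch that loads the most recently written summary, assuming the default output directory and the filename patterns listed above:

```python
import glob
import json
import os

# Pick the newest summary file in the output directory
latest = max(glob.glob("simulator_output/summary_*.json"), key=os.path.getmtime)
with open(latest) as f:
    summary = json.load(f)

print(summary["total_annotations"], summary["attention_checks"]["pass_rate"])
```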

Programmatic Usage

```python
from potato.simulator import SimulatorManager, SimulatorConfig

# Create configuration
config = SimulatorConfig(
    user_count=10,
    strategy="random",
    competence_distribution={"good": 0.5, "average": 0.5}
)

# Create and run simulator
manager = SimulatorManager(config, "http://localhost:8000")
results = manager.run_parallel(max_annotations_per_user=20)

# Print summary and export
manager.print_summary()
manager.export_results()
```

Integration with Tests

The simulator can be used in pytest fixtures:

```python
import pytest
import requests

from potato.simulator import SimulatorManager, SimulatorConfig

@pytest.fixture
def simulated_annotations(flask_test_server):
    config = SimulatorConfig(user_count=5, strategy="random")
    manager = SimulatorManager(config, flask_test_server.base_url)
    return manager.run_parallel(max_annotations_per_user=10)

def test_dashboard_shows_annotations(simulated_annotations, flask_test_server):
    response = requests.get(f"{flask_test_server.base_url}/admin/api/overview")
    assert response.json()["total_annotations"] > 0
```
Troubleshooting

Login failures

  • Ensure the server allows anonymous registration or has require_password: false
  • Check server logs for authentication errors

No instances available

  • Verify data files are loaded correctly
  • Check assignment strategy settings

LLM strategy not working

  • Verify API key is set
  • For Ollama, ensure the server is running
  • Check model name is correct

Further Reading

For implementation details, see the source documentation.