Frequently Asked Questions
Find answers to common questions about Potato. Can't find what you're looking for? Join our Discord or check the documentation.
Getting Started
What is Potato?
Potato (Portable Text Annotation Tool) is a free, open-source annotation tool for creating high-quality datasets. It supports text, image, audio, and video annotation with a simple YAML-based configuration system.

Is Potato free to use?
Yes, Potato is free and open-source under the PolyForm Shield License 1.0.0. This license allows free use for research, education, and non-commercial purposes. It also includes a non-compete clause that restricts using Potato to build competing annotation platforms. See the LICENSE file in the GitHub repository for full details.

Do I need to know how to code?
No coding is required. Potato uses YAML configuration files that are human-readable and easy to write. Our Playground provides a visual interface for building configurations without writing any code.

How do I install Potato?
Install it via pip: `pip install potato-annotation`. Then run `potato start my_project -c config.yaml` to launch your annotation server. See our Quick Start guide for detailed instructions.

Which Python versions are supported?
Potato requires Python 3.7 or higher. We recommend Python 3.10+ for the best experience.
Data & Privacy
Where is my data stored?
Your data stays on your machine. Potato runs entirely locally and never sends your data to external servers. This makes it ideal for sensitive data such as medical records or proprietary content.

Can I use Potato with data subject to HIPAA or GDPR?
Yes. Since Potato is self-hosted and runs locally, you maintain complete control over your data. No data ever leaves your infrastructure, making Potato suitable for HIPAA, GDPR, and other compliance regimes.

What file formats does Potato support?
Potato accepts a variety of input formats: plain text, JSON, JSONL, CSV, TSV, images (PNG, JPG, GIF, WebP), audio (MP3, WAV, OGG), and video files. Annotations can be exported to JSON, JSONL, CSV, and specialized formats such as CoNLL, spaCy, COCO, and HuggingFace datasets.
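For example, a few lines of Python will flatten JSONL records into CSV; the field names here are illustrative rather than Potato's fixed export schema:

```python
# Flatten JSONL-style records (one JSON object per line) into CSV.
import csv
import io
import json

jsonl = "\n".join([
    '{"id": 1, "text": "great movie", "label": "pos"}',
    '{"id": 2, "text": "too long", "label": "neg"}',
])

# Parse each line into a dict, then write all rows with a shared header.
rows = [json.loads(line) for line in jsonl.splitlines()]

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["id", "text", "label"])
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```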
Annotation Features
Which annotation schemes are available?
Potato supports radio buttons (single choice), checkboxes (multi-select), Likert scales, text input, span annotation (highlighting), bounding boxes, polygons, pairwise comparison, Best-Worst Scaling, and more. See our Showcase for examples.

Can I combine multiple annotation schemes in one task?
Yes. A single annotation task can include any combination of annotation schemes. For example, you might have annotators highlight entities (span annotation), classify sentiment (radio buttons), and provide comments (text input), all on the same item.

Does Potato support image annotation?
Yes. Potato supports image classification with radio/checkbox labels, bounding box annotation for object detection, and polygon annotation for segmentation tasks.

Can I annotate audio and video?
Yes. Potato can display audio waveforms and video players alongside annotation controls. This is useful for transcription review, speaker diarization, emotion detection, and similar tasks.

How do I set up span annotation for tasks like NER?
Add a span annotation scheme to your config. Annotators can then select spans of text and assign labels to them. See our NER and span annotation examples in the Showcase.
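As a rough sketch, such a scheme might look like the fragment below; the key names are illustrative, so check the Showcase examples for the exact schema your Potato version expects:

```yaml
# Illustrative config fragment; key names may differ from your Potato version.
annotation_schemes:
  - annotation_type: "highlight"   # span selection over the displayed text
    name: "entities"
    description: "Highlight named entities"
    labels: ["Person", "Organization", "Location"]
```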
Annotator Management
How does Potato handle multiple annotators?
Potato supports multiple annotators out of the box. Each annotator logs in with a unique ID, and their annotations are tracked separately. You can configure overlap so that multiple annotators label the same items for quality control.

Can I use crowdsourcing platforms like Prolific or Mechanical Turk?
Yes. Potato integrates with Prolific and Amazon Mechanical Turk. Annotators are redirected from the platform, complete tasks in Potato, and are returned with completion codes.

How do I measure inter-annotator agreement?
Potato tracks which items have been annotated by multiple annotators. You can export the annotations and calculate agreement metrics (Cohen's Kappa, Krippendorff's Alpha, etc.) using standard Python libraries.
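For example, once each annotator's labels are aligned by item, scikit-learn can compute Cohen's Kappa directly; the label values below are made up for illustration:

```python
# Compute inter-annotator agreement between two annotators' label lists,
# aligned so that position i is the same item for both annotators.
from sklearn.metrics import cohen_kappa_score

annotator_a = ["pos", "neg", "pos", "neu", "pos", "neg"]
annotator_b = ["pos", "neg", "neu", "neu", "pos", "neg"]

kappa = cohen_kappa_score(annotator_a, annotator_b)
print(f"Cohen's kappa: {kappa:.2f}")  # 0.75 for these lists
```

For Krippendorff's Alpha, which handles missing annotations and more than two annotators, third-party packages such as `krippendorff` on PyPI work in much the same way.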
Can I add quality control measures?
Yes. You can add attention-check items, configure required annotation overlap, and use the admin dashboard to monitor annotator progress and identify potential issues.
LLM Integration
Can Potato pre-annotate data with LLMs?
Yes. Potato integrates with OpenAI, Anthropic Claude, Google Gemini, and local LLMs via Ollama. You can configure AI pre-annotation to speed up human annotation workflows.

How do I configure LLM pre-annotation?
Add an `llm` section to your config specifying the provider, model, and prompt template. Potato will call the LLM API for each item and pre-fill annotation fields, which annotators can then accept or correct.
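A sketch of such a section, with illustrative key names and a placeholder model (consult the LLM integration docs for the exact schema):

```yaml
# Illustrative only; the exact keys may differ in your Potato version.
llm:
  provider: "openai"        # or anthropic, gemini, ollama
  model: "gpt-4o-mini"      # placeholder model name
  prompt_template: "Classify the sentiment of this text as pos, neg, or neu: {text}"
```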
Can I use a local LLM instead of a cloud API?
Yes. Potato supports Ollama for running local LLMs. This keeps your data completely private while still benefiting from AI assistance.

Can I use Potato to collect RLHF preference data?
Yes. Potato is well suited to collecting human preference data for RLHF. Use pairwise comparison to have annotators choose between model outputs, or Likert scales to rate response quality.
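A hypothetical pairwise scheme for preference collection might look like this (key names are illustrative):

```yaml
# Illustrative config fragment for collecting preference judgments.
annotation_schemes:
  - annotation_type: "pairwise"
    name: "preference"
    description: "Which response is better?"
    labels: ["Response A", "Response B", "Tie"]
```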
Deployment
Can I deploy Potato on a server?
Yes. While Potato runs locally by default, you can deploy it on any server. Run it behind nginx or Apache for HTTPS, or use Docker for containerized deployment.

Is there an official Docker image?
Not yet. While we don't currently provide official Docker images, Potato can be containerized with a custom Dockerfile built on a standard Python base image. See our deployment documentation for examples.
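As a minimal sketch, assuming the pip package and the `potato start` command from the installation instructions (adjust paths to your project layout):

```dockerfile
# Minimal illustrative Dockerfile; not an official image.
FROM python:3.11-slim

WORKDIR /app
RUN pip install --no-cache-dir potato-annotation

# Copy your project (config.yaml plus data files) into the image.
COPY my_project/ ./my_project/

EXPOSE 8000
CMD ["potato", "start", "my_project", "-c", "config.yaml"]
```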
Can multiple annotators work at the same time?
Yes. When deployed on a server, multiple annotators can access the same Potato instance simultaneously. Each annotator's work is tracked separately.

How do I enable HTTPS?
Deploy Potato behind a reverse proxy such as nginx or Caddy that handles SSL termination. See our deployment guide for configuration examples.
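For nginx, a minimal sketch might look like this; the hostname and certificate paths are placeholders, and Potato is assumed to be listening on its default port 8000:

```nginx
# Illustrative nginx site config: TLS termination in front of a local
# Potato instance on port 8000. Hostname and cert paths are placeholders.
server {
    listen 443 ssl;
    server_name annotate.example.com;

    ssl_certificate     /etc/letsencrypt/live/annotate.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/annotate.example.com/privkey.pem;

    location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto https;
    }
}
```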
Troubleshooting
Why won't my annotation server start?
Common issues: 1) Check that your config.yaml is valid YAML syntax. 2) Ensure your data file exists and is properly formatted. 3) Check that the port (default 8000) isn't already in use. 4) Look at the terminal output for specific error messages.
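The first check can be automated: PyYAML (`pip install pyyaml`) reports the line and column of any syntax error. The helper below is a small illustration, not part of Potato itself:

```python
# Validate a config's YAML syntax before launching the server.
import yaml

def check_yaml(text):
    """Return (True, parsed data) for valid YAML, (False, error) otherwise."""
    try:
        return True, yaml.safe_load(text)
    except yaml.YAMLError as err:
        return False, str(err)

ok, result = check_yaml("port: 8000\ntask_name: demo\n")
print(ok, result)  # True {'port': 8000, 'task_name': 'demo'}
```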
Why aren't my annotations being saved?
Check that: 1) You clicked the Save/Submit button. 2) The output directory is writable. 3) All required fields are filled in. Also check the browser console for JavaScript errors.

How do I reset an annotator's assignments?
Delete the annotator's annotation file from the output directory. Their assignment will be regenerated on their next login.

Where can I get more help?
Join our Discord community for real-time help, check GitHub Issues for known problems, or browse our documentation. The community is friendly and responsive!
Still Have Questions?
Our community is here to help. Join Discord for real-time support or browse the documentation for detailed guides.