Multilingual and Low-Resource Annotation
Annotating in languages beyond English: the diversity gap, participatory methods with native speakers, and how to localize the Potato interface with right-to-left support, fonts, and translated labels.
Annotating in a language other than English is two problems at once. The scientific one: most NLP resources cover a handful of languages, categories don't transfer cleanly across cultures, and you need native speakers, not just bilinguals, to produce good labels. The practical one: getting the tool to display the language correctly, including right-to-left scripts and non-Latin fonts. Potato handles the second through config; the first is on you. This guide covers both.
The diversity gap is real and large
There are roughly 7,000 languages in the world and NLP meaningfully serves a few dozen. Joshi et al. (2020) quantified this: a small number of languages have abundant labeled data while the vast majority, spoken by billions of people, have almost none, and the gap is self-reinforcing because resources flow to languages that already have them. Annotation is usually the bottleneck. If you want a model to work in a low-resource language, someone has to label data in it, and that labeling is where the quality is won or lost.
Annotate with the language community, not just in the language
The instinct is to hire cheap bilingual crowdworkers and translate an English guideline. Both parts of that are risky. Bird (2020) argues against extractive approaches that treat a language community as a data source, and the Masakhane project's participatory model (Nekoto et al., 2020) shows the alternative working at scale for African languages: native speakers help design the task, author the guidelines, and own the labels, rather than validating decisions made elsewhere. Two practical consequences:
- Recruit fluent speakers, ideally of the right variety. A language is not monolithic: dialect, region, and register all matter, and a speaker of one variety may mislabel another. Bilingual is not the same as native.
- Don't assume categories transfer. Sentiment, offensiveness, politeness, and even named-entity types are culturally specific. A guideline that works for English politeness can quietly break in a language where politeness is grammatically encoded. Have speakers adapt the scheme, not just translate the words. This is a scheme-design decision, not a translation task.
Localizing the Potato interface
The practical half. Potato localizes the annotator-facing interface through configuration alone, with no code changes, though it's worth knowing the limits up front.
Interface text. The ui_language block is a string table for the interface chrome. Set the document language and translate the buttons and headings the annotator sees:
ui_language:
  html_lang: ar
  html_dir: rtl  # right-to-left for Arabic, Hebrew, etc.
  submit_button: "إرسال"
  instructions_heading: "التعليمات"

Setting html_dir: rtl flips the whole document for right-to-left scripts using the browser's native bidirectional handling. One honest limitation: ui_language covers the core annotation and login screens, but several admin-side pages (the dashboard, adjudication, training, and logout screens) are still English-only, so plan for that if your annotators will see them.
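If you maintain configs for several languages, the direction setting is easy to get wrong by hand. The following is a minimal sketch, not part of Potato: the helper names and the RTL_LANGUAGES set are assumptions made for illustration, and it assumes html_dir accepts the standard HTML dir values ("rtl" and "ltr"). It only inspects the primary language subtag, so a language written in more than one script (for example a tag with an explicit script subtag) still needs a manual check.

```python
# Hypothetical helpers, not Potato APIs: derive html_dir from a language code
# and assemble a ui_language mapping using the keys shown in the config above.

# Primary language subtags that are normally written right-to-left.
# Illustrative and deliberately short; extend it for your own languages.
RTL_LANGUAGES = {"ar", "he", "fa", "ur", "ps", "yi", "dv"}


def text_direction(lang_code: str) -> str:
    """Return "rtl" or "ltr" based on the primary subtag of a language code."""
    primary = lang_code.replace("_", "-").split("-")[0].lower()
    return "rtl" if primary in RTL_LANGUAGES else "ltr"


def ui_language_block(lang_code: str, strings: dict) -> dict:
    """Build a ui_language mapping: language, direction, then translated strings."""
    block = {"html_lang": lang_code, "html_dir": text_direction(lang_code)}
    block.update(strings)
    return block


if __name__ == "__main__":
    print(ui_language_block("ar", {"submit_button": "إرسال"}))
```

The resulting mapping can be written into the config by whatever YAML tooling you already use; the point is that the direction is derived once from the language code rather than typed per file.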
Fonts for non-Latin scripts. Browsers don't always ship a good default font for CJK, Arabic, or Indic scripts. Load one through a project stylesheet with base_css:
base_css: "css/noto_font.css"

Non-English data. Potato reads data files as UTF-8 by default and renders arbitrary Unicode in the text field, so non-English content displays as-is. If a file uses a different encoding, override it per file (encoding:) or globally (data_directory_encoding:).
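Before loading a file, it can help to confirm it really is UTF-8, since data collected from older sources often is not. This is a minimal standalone sketch using only the Python standard library; the function names are hypothetical and nothing here is part of Potato. It reads line by line, so it suits line-oriented files in ASCII-compatible encodings. Converting the file to UTF-8 is shown as an alternative to the encoding overrides described above, not a replacement for them.

```python
from pathlib import Path


def find_non_utf8_lines(path):
    """Return the 1-based numbers of lines whose bytes are not valid UTF-8."""
    bad_lines = []
    with open(path, "rb") as handle:
        for number, raw in enumerate(handle, start=1):
            try:
                raw.decode("utf-8")
            except UnicodeDecodeError:
                bad_lines.append(number)
    return bad_lines


def transcode_to_utf8(src, dst, source_encoding):
    """Rewrite src as UTF-8 at dst, given the encoding src is actually in."""
    text = Path(src).read_bytes().decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))
```

An empty result from find_non_utf8_lines means the file decodes cleanly; it does not prove the text is in the language you expect, so a quick visual check in the interface is still worthwhile.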
Translated labels. This is the one subtle trap. By default Potato "humanizes" label names, reformatting and title-casing them, which can mangle non-Latin scripts. Keep machine-readable English names for your data, and show the annotator a translated label with displayed_label, turning humanizing off:
annotation_schemes:
  - name: sentiment
    annotation_type: radio
    description: "ما هو شعور هذا النص؟"
    humanize_labels: false
    labels:
      - name: positive
        displayed_label: "إيجابي"
      - name: negative
        displayed_label: "سلبي"
      - name: neutral
        displayed_label: "محايد"

The stored annotation is still positive, so your data stays clean while the annotator works entirely in Arabic.
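When the same scheme is translated into several languages, generating the labels list from a translation table keeps the stored names identical across configs. The sketch below is an illustration only: build_radio_scheme is a hypothetical helper, not a Potato function, and it uses just the keys that appear in the config above. The ASCII-identifier check on stored names is a design choice of this sketch, not a Potato requirement.

```python
def build_radio_scheme(name, description, translations):
    """Build a radio scheme dict from {stored_name: displayed_label}.

    Stored names must be ASCII identifiers (a rule of this sketch) so the
    saved annotations stay machine-readable whatever the display language.
    """
    for stored in translations:
        if not (stored.isascii() and stored.isidentifier()):
            raise ValueError(f"stored label name is not an ASCII identifier: {stored!r}")
    return {
        "name": name,
        "annotation_type": "radio",
        "description": description,
        "humanize_labels": False,  # keep displayed labels exactly as written
        "labels": [
            {"name": stored, "displayed_label": shown}
            for stored, shown in translations.items()
        ],
    }


if __name__ == "__main__":
    arabic = {"positive": "إيجابي", "negative": "سلبي", "neutral": "محايد"}
    print(build_radio_scheme("sentiment", "ما هو شعور هذا النص؟", arabic))
```

Swapping in a different translation table yields the same stored names with different displayed labels, which is what lets annotations from several languages be merged later.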
Instructions and consent. Author your surveyflow consent and instruction pages directly in the target language: they are HTML files you write, and nothing forces English. For running the same task across several languages, Potato ships a setup_multilingual_config.py helper that generates per-language configs from a template.
Further reading
- Choosing an Annotation Scheme, because categories rarely transfer across languages unchanged.
- Writing Annotation Guidelines, which native speakers should help author, not just translate.
- Running a Study on Prolific and MTurk, for recruiting speakers of the right language and variety.
- Layout Customization, for fonts and custom templates.