We sell translation infrastructure. Our own marketing site ships in five locales. When we audited the JSON we actually serve, the first hit was not a tone issue. It was a single wrong character inside a German compound that looked almost fine at a glance: a Cyrillic letter where a Latin one belongs. The word read as "glossary" until you inspected the code point. If our own translations are broken, why should anyone trust the product?


Five failures that should never have shipped

Cyrillic inside German UI copy. One glossary string mixed scripts: the ending read as Latin “ar”, but one of its letters was Cyrillic. Native readers notice immediately. Spell check does not.

**The string:** …"Übersetzungsglossар"… (Cyrillic "р" in the compound)
**What it should be:** "Übersetzungsglossar"
**Why it matters:** It reads as corrupted text. It signals automated output nobody proofread in the target language.
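
A mixed-script character like this is easy to catch mechanically. The following is a minimal sketch, not the check we actually run: `findCyrillic` and `ScriptHit` are hypothetical names, and the function only covers the two basic Cyrillic blocks, which is an assumption about what a Latin-script locale file needs to exclude.

```typescript
// Hypothetical helper, not part of the site's actual tooling.
interface ScriptHit {
  index: number; // position of the character in the string
  char: string; // the offending character
  codePoint: string; // formatted as "U+XXXX"
}

// Returns every character in the Cyrillic (U+0400 to U+04FF) or
// Cyrillic Supplement (U+0500 to U+052F) blocks. Both blocks sit in the
// Basic Multilingual Plane, so charCodeAt is sufficient here.
function findCyrillic(text: string): ScriptHit[] {
  const hits: ScriptHit[] = [];
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code >= 0x0400 && code <= 0x052f) {
      const hex = ("0000" + code.toString(16).toUpperCase()).slice(-4);
      hits.push({ index: i, char: text.charAt(i), codePoint: "U+" + hex });
    }
  }
  return hits;
}

// The sample is built from an escape so the Cyrillic letter survives
// copy-paste: "Glossa" followed by CYRILLIC SMALL LETTER ER (U+0440).
const suspicious = "Glossa\u0440";
for (const hit of findCyrillic(suspicious)) {
  console.log(`index ${hit.index}: ${hit.char} (${hit.codePoint})`);
}
```

Reporting the code point rather than the glyph matters because the glyph is exactly what fools a reviewer reading a diff.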

Register inversion across German and French. The site is a developer tool. The voice should match how engineers actually write in product UI. We had shipped formal address everywhere in de and fr. That is not a nit. It means the default translation path picked the wrong register for the entire surface area, not a handful of edge strings.

**The string:** "… wie Sie …" / "vous pouvez …"
**What it should be:** informal "du" / "tu" voice for product and FAQ
**Why it matters:** Every screen sounded like enterprise procurement, not like the product we sell to vibe coders.
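
Register can also be spot-checked with a crude heuristic. The sketch below is an illustration under stated assumptions, not a linguistic classifier: the marker lists and the `findFormalAddress` name are hypothetical, French "vous" is also a legitimate plural, and a sentence-initial German "Sie" is skipped because capitalization alone cannot distinguish it from "sie".

```typescript
// Hypothetical marker lists; a real policy would define its own per locale.
const FORMAL_MARKERS: Record<string, string[]> = {
  de: ["Sie", "Ihnen", "Ihr", "Ihre", "Ihren", "Ihrem", "Ihrer"],
  fr: ["vous", "votre", "vos"],
};

// Returns the formal-address words found in `text` for the given locale.
function findFormalAddress(locale: string, text: string): string[] {
  const markers = FORMAL_MARKERS[locale];
  if (!markers) return [];
  // German formal pronouns are recognized by their capital letter,
  // so German matching is case-sensitive; French matching is not.
  const caseSensitive = locale === "de";
  const hits: string[] = [];
  for (const sentence of text.split(/[.!?]+/)) {
    const words = sentence
      .split(/[^A-Za-z\u00C0-\u00FF]+/)
      .filter((w) => w.length > 0);
    words.forEach((word, i) => {
      // The first word of a sentence is always capitalized, so it is ambiguous.
      if (caseSensitive && i === 0) return;
      const candidate = caseSensitive ? word : word.toLowerCase();
      if (markers.indexOf(candidate) !== -1) hits.push(word);
    });
  }
  return hits;
}

console.log(findFormalAddress("de", "So sehen Sie, wie Sie starten."));
console.log(findFormalAddress("fr", "Ici, tu peux commencer."));
```

A check like this will produce false positives, so it is better suited to flagging strings for human review than to failing a build on its own.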

Invented currency. English said dollars. German showed euros with a monthly label. We do not bill a different price per country. The JSON implied we did.

**The string:** "20 €/Monat" (and similar) next to "$20/mo" elsewhere
**What it should be:** one USD amount, formatted per locale in code (e.g. Intl.NumberFormat), not hand-invented symbols in JSON
**Why it matters:** Pricing trust. Mixed symbols read as either a bug or a hidden price change.

False friends for “coding.” In three languages, “coding environment” was translated with the word for encoding. The FAQ told people they could stay in their “encoding environment.” That is wrong in any reading of the product.

**The string:** Spanish "codificación", Italian "codifica", German "Codierung" in the wrong sense
**What it should be:** development environment wording (e.g. entorno de desarrollo / ambiente di sviluppo / Entwicklungsumgebung)
**Why it matters:** It is a semantic bug. It reads like we do not understand our own domain.

“Ship” read as postal shipping. German and Italian strings used verbs people associate with parcels, not with releasing software. Same English source, wrong lexical field. That came from uncorrected machine suggestions, not from a deliberate style choice.

**The string:** parcel-shipping verbs applied to "ship your app"
**What it should be:** release / deploy / deliver software wording
**Why it matters:** Developer trust. One wrong verb tells the reader the copy was never read by a developer in that language.

Why auto-generated translations drift

This is not a rant against translators. Much of the site never had a human pass per string. General-purpose models default to formal register because that is what bulk bilingual text looks like on the open web. They also mix near-homographs across scripts when the objective is “plausible characters,” not “correct script.” Without a glossary, word choice drifts between adjacent keys. Without a currency rule, the model invents plausible local symbols. Without domain checks, encoding and coding collide. That is the failure mode our product exists to reduce. Our own site still shipped it until we audited the way a customer would.


What we built after the cleanup

One-time copy edits are not a system. The durable part is a small policy file the whole team (and any future auto-translation batch) can treat as law, plus a verification script we can run before merge.

Policy. i18n/LOCALIZATION_POLICY.md records register per locale, USD-only pricing in JSON, loanword decisions (for example keeping vibe coders in English), false-friend rules for coding vs encoding, and typography for inline literals. It is short on purpose. If a rule is not written down, a model will not infer it consistently.

## Currency
- USD only in all locales. Use {monthlyPrice} and {overageRate} placeholders…
## False friends
- Coding environment is not encoding: Spanish desarrollo / Italian sviluppo / German Entwicklung…

Verification. scripts/verify-translations.sh greps for Cyrillic in Latin-script locale files, euro symbols, and the old false-friend substrings, and checks that pricing placeholders match messages/en.json. Optional checks use ripgrep when installed. The script is a floor, not a ceiling. It exists so “we fixed it once” does not decay quietly on the next bulk edit.
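
To make the shape of those checks concrete, here is a minimal TypeScript sketch of three of them: euro symbols, false-friend substrings, and placeholder parity. It is not the shell script itself. It assumes flat key-to-string message objects, applies the parity check to every key rather than only pricing keys, and reports missing keys, which goes beyond what the script is described as doing. `verifyLocale`, `placeholders`, and the sample messages are hypothetical.

```typescript
type Messages = Record<string, string>;

// Substrings taken from the false-friend failures described in this post.
const FALSE_FRIENDS: Record<string, string[]> = {
  es: ["codificación"],
  it: ["codifica"],
  de: ["Codierung"],
};

// Extracts placeholder names such as {monthlyPrice}, sorted for comparison.
function placeholders(message: string): string[] {
  const names: string[] = [];
  const re = /\{(\w+)\}/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(message)) !== null) names.push(m[1]);
  return names.sort();
}

// Returns a list of human-readable problems; an empty list means the locale passes.
function verifyLocale(locale: string, source: Messages, target: Messages): string[] {
  const problems: string[] = [];
  for (const key of Object.keys(source)) {
    const translated = target[key];
    if (translated === undefined) {
      problems.push(`${key}: missing in ${locale}`);
      continue;
    }
    if (translated.indexOf("\u20AC") !== -1) {
      problems.push(`${key}: euro symbol in ${locale}`);
    }
    for (const bad of FALSE_FRIENDS[locale] || []) {
      if (translated.toLowerCase().indexOf(bad.toLowerCase()) !== -1) {
        problems.push(`${key}: false friend "${bad}"`);
      }
    }
    if (placeholders(source[key]).join(",") !== placeholders(translated).join(",")) {
      problems.push(`${key}: placeholders differ from source`);
    }
  }
  return problems;
}

// Hypothetical sample data that reproduces two of the failures above.
const sampleEn: Messages = {
  "pricing.plan": "{monthlyPrice} per month",
  "faq.env": "Stay in your coding environment.",
};
const sampleDe: Messages = {
  "pricing.plan": "20 €/Monat",
  "faq.env": "Bleib in deiner Codierung.",
};
const sampleProblems = verifyLocale("de", sampleEn, sampleDe);
sampleProblems.forEach((p) => console.log(p));
```

In a pre-merge gate, a non-empty problem list would translate into a non-zero exit code so the batch fails instead of being eyeballed later.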

Code, not more JSON guesses. We centralized surfaced USD amounts in lib/pricing.ts and wired formatUsdPerMonth and formatUsdOveragePer1k into the homepage and FAQ so locale files stop inventing numerals. The bug was not only the wrong strings. The bug was that the policy and the formatter were not in place before we scaled.
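
As an illustration of what such a formatter can look like, here is a sketch built on the standard Intl.NumberFormat API. The function names come from the text above, but their signatures, the rounding choices, and the sample template are assumptions, not the contents of our actual lib/pricing.ts.

```typescript
// Assumed signature: formats a whole-dollar monthly price for a locale.
// The currency is fixed to USD; only digit grouping and symbol placement
// follow the locale.
function formatUsdPerMonth(locale: string, amountUsd: number): string {
  return new Intl.NumberFormat(locale, {
    style: "currency",
    currency: "USD",
    minimumFractionDigits: 0,
    maximumFractionDigits: 0,
  }).format(amountUsd);
}

// Assumed signature: formats an overage rate (per 1,000 units) with cents.
// The rate is a parameter because no concrete value is stated in this post.
function formatUsdOveragePer1k(locale: string, rateUsd: number): string {
  return new Intl.NumberFormat(locale, {
    style: "currency",
    currency: "USD",
    minimumFractionDigits: 2,
    maximumFractionDigits: 2,
  }).format(rateUsd);
}

// Hypothetical template: the locale file carries only the placeholder,
// never the numeral or the currency symbol.
const germanTemplate = "{monthlyPrice}/Monat";
const germanLabel = germanTemplate.replace(
  "{monthlyPrice}",
  formatUsdPerMonth("de-DE", 20)
);
console.log(germanLabel);
console.log(formatUsdPerMonth("en-US", 20));
```

The point of the design is that a translator or a model can reorder or reword the label, but cannot change the amount or swap the currency.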


What you should steal for your own AI-assisted i18n

If you generate translations with a model, register is the highest-variance knob. Lock it per locale before you add more languages. Lock currency in code, not in prose. Lock brand tokens and developer jargon. Run automated checks that fail the batch instead of “we will eyeball it later.” Later never happens at the same quality bar as the first pass.

For the architecture story underneath this audit, read How to Localize an AI-Generated App and the tooling comparison in globalize.now vs Lokalise vs Crowdin. If you are wiring i18n on your own repo and want the same guardrails, npx globalize-skills is the entry point we document for skills that install next to your code. The honest close: we ate our own dog food, found ugly bugs, wrote the policy, and added a gate. That is the bar we want every multilingual launch to meet.