Commit d301312

Copilot and brunoborges committed
docs: rewrite Plan B as unified Internationalization Specification
Co-authored-by: brunoborges <129743+brunoborges@users.noreply.github.com>
1 parent ce99798 commit d301312

2 files changed: +131 -168 lines changed

specs/i18n/plan-b-externalized-strings-full-translations.md renamed to specs/i18n/i18n-spec.md

Lines changed: 131 additions & 103 deletions
Original file line number | Diff line number | Diff line change
@@ -1,36 +1,37 @@
1-
# Plan B — Externalized UI Strings + Full Translation Files
1+
# Internationalization Specification
22

33
## Overview
44

5-
Separate concerns into two distinct layers:
5+
This document specifies how java.evolved supports multiple languages.
6+
Internationalization is implemented via two distinct layers:
67

78
1. **UI strings layer** — every piece of hard-coded copy in the templates
8-
(labels, button text, nav, footer, etc.) is moved into a per-locale
9-
`strings/{locale}.json` file and injected at build time.
9+
(labels, button text, nav, footer, etc.) is extracted into a per-locale
10+
`translations/strings/{locale}.json` file and injected at build time.
1011

11-
2. **Content translation layer** — translated pattern JSON files are
12-
complete, stand-alone replacements (not overlays) stored next to the
13-
English originals. The generator falls back to the English file for any
14-
pattern that has not yet been translated.
12+
2. **Content translation layer** — translated pattern JSON files are complete,
13+
stand-alone replacements stored under `translations/content/{locale}/`.
14+
The generator falls back to the English file for any pattern that has not yet
15+
been translated.
1516

16-
This approach treats English as just another locale and unifies the build
17-
pipeline so all locales are first-class citizens.
17+
English is a first-class locale. All locales — including English — go through
18+
the same build pipeline.
1819

1920
---
2021

2122
## Directory Layout
2223

2324
```
24-
content/ # Unchanged English content
25+
content/ # English content (source of truth)
2526
language/
2627
type-inference-with-var.json
2728
collections/
2829
...
2930
30-
translations/ # New top-level directory
31+
translations/ # All i18n artefacts
3132
strings/
3233
en.json # English UI strings (extracted from templates)
33-
pt-BR.json
34+
pt-BR.json # Partial — missing keys fall back to en.json
3435
ja.json
3536
content/
3637
pt-BR/
@@ -42,17 +43,17 @@ translations/ # New top-level directory
4243
language/
4344
...
4445
45-
templates/ # Templates gain {{…}} tokens for every
46-
slug-template.html # hard-coded UI string (no literal English left)
46+
templates/ # Templates use {{…}} tokens for every UI string
47+
slug-template.html
4748
index.html
4849
...
4950
5051
html-generators/
5152
locales.properties # Ordered list of supported locales + display names
52-
generate.java / generate.py # Rewritten to iterate all locales
53+
generate.java / generate.py # Extended to iterate all locales
5354
5455
site/ # Generated output
55-
index.html # English home (locale = en, path = /)
56+
index.html # English home (path = /)
5657
language/
5758
type-inference-with-var.html
5859
data/
@@ -69,9 +70,26 @@ site/ # Generated output
6970

7071
---
7172

72-
## `strings/{locale}.json` Schema
73+
## `locales.properties` — Supported Locales Registry
74+
75+
```properties
76+
# html-generators/locales.properties
77+
# format: locale=Display name (first entry is the default/primary locale)
78+
en=English
79+
pt-BR=Português (Brasil)
80+
ja=日本語
81+
```
82+
83+
The generator reads this file to know which locales to build and what label
84+
to show in the language selector.
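
A minimal sketch of how a generator might read this registry, assuming plain `locale=Display name` lines and `#` comments only (no `.properties` escape sequences). The function name `parse_locales` is hypothetical and not part of the commit:

```python
def parse_locales(text: str) -> dict[str, str]:
    """Parse `locale=Display name` lines, preserving file order."""
    locales: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        code, sep, name = line.partition("=")
        if not sep:
            continue  # assumption: silently ignore lines without '='
        locales[code.strip()] = name.strip()
    return locales


SAMPLE = """\
# html-generators/locales.properties
# format: locale=Display name (first entry is the default/primary locale)
en=English
pt-BR=Português (Brasil)
ja=日本語
"""

locales = parse_locales(SAMPLE)
# Python dicts keep insertion order, so the first key is the default locale.
default_locale = next(iter(locales))
```

Keeping the mapping ordered lets the same structure drive both the build loop and the option order in the language selector.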
85+
86+
---
87+
88+
## `translations/strings/{locale}.json` Schema
7389

74-
Every user-visible string in the templates is assigned a dot-separated key:
90+
Every user-visible string in the templates is assigned a dot-separated key.
91+
The English file is the complete reference; locale files are partial and only
92+
need to include keys that differ from English.
7593

7694
```json
7795
// translations/strings/en.json
@@ -130,8 +148,7 @@ Every user-visible string in the templates is assigned a dot-separated key:
130148
```
131149

132150
```json
133-
// translations/strings/pt-BR.json
134-
// Strings files support partial translation: missing keys fall back to en.json at build time.
151+
// translations/strings/pt-BR.json (partial — only translated keys required)
135152
{
136153
"site": {
137154
"tagline": "O Java evoluiu. Seu código também pode.",
@@ -151,22 +168,15 @@ Every user-visible string in the templates is assigned a dot-separated key:
151168
}
152169
```
153170

154-
Missing keys in a locale's strings file fall back to the `en.json` value —
155-
strings files are partial by design and only require the keys that need
156-
translation.
157-
158-
> **Note**: This partial-fallback behaviour applies to `strings/{locale}.json`
159-
> **only**. Translated *content* files (see next section) are always complete
160-
> replacements, not overlays.
171+
Missing keys fall back to `en.json` at build time.
161172

162173
---
163174

164-
## Full-Replacement Content Files
175+
## Content Translation Files
165176

166-
Unlike Plan A's overlay approach, translated content files are **complete**
167-
copies of the pattern JSON with every field in the target language. This
168-
avoids partial-merge edge cases and lets translators work with a self-contained
169-
file.
177+
Translated content files are **complete** copies of the English pattern JSON
178+
with translatable fields rendered in the target language. This avoids
179+
partial-merge edge cases and makes each file self-contained.
170180

171181
```json
172182
// translations/content/pt-BR/language/type-inference-with-var.json
@@ -181,8 +191,8 @@ file.
181191
"modernLabel": "Java 10+",
182192
"oldApproach": "Tipos explícitos",
183193
"modernApproach": "Palavra-chave var",
184-
"oldCode": "String nome = \"Alice\";\nString saudacao = \"Olá, \" + nome + \"!\";",
185-
"modernCode": "var nome = \"Alice\";\nvar saudacao = \"Olá, %s!\".formatted(nome);",
194+
"oldCode": "...",
195+
"modernCode": "...",
186196
"summary": "Use var para deixar o compilador inferir o tipo local.",
187197
"explanation": "...",
188198
"whyModernWins": [
@@ -201,50 +211,33 @@ file.
201211
}
202212
```
203213

204-
`oldCode` and `modernCode` are always copied from the English file at build
205-
time; translators must not provide them (or they are silently ignored). This
206-
keeps code snippets language-neutral.
214+
`oldCode` and `modernCode` are **always overwritten** with the English values at
215+
build time, regardless of what appears in the translation file. Translators may
216+
leave those fields empty or copy the English values verbatim — neither causes
217+
any harm.
207218

208219
---
209220

210-
## `locales.properties` Supported Locales Registry
221+
## Generator Resolution Order
211222

212-
```properties
213-
# html-generators/locales.properties
214-
# format: locale=Display name (first entry is the default/primary locale)
215-
en=English
216-
pt-BR=Português (Brasil)
217-
ja=日本語
218-
```
219-
220-
The generator reads this file to know which locales to build and what label
221-
to show in the language selector.
222-
223-
---
224-
225-
## Generator Changes
226-
227-
### Resolution Order
228-
229-
For each pattern the generator:
223+
For each pattern and locale the generator:
230224

231225
1. Loads the English baseline from `content/<cat>/<slug>.json`.
232-
2. Checks if `translations/content/<locale>/<cat>/<slug>.json` exists.
233-
- If **yes**: use the translated file but override `oldCode`/`modernCode`
226+
2. Checks whether `translations/content/<locale>/<cat>/<slug>.json` exists.
227+
- **Yes** → use the translated file, then overwrite `oldCode`/`modernCode`
234228
with the English values.
235-
- If **no**: use the English file and optionally mark the page with a
236-
banner ("This pattern has not yet been translated").
237-
3. Loads `translations/strings/<locale>.json`, deep-merged over
238-
`translations/strings/en.json`.
239-
4. Renders the template, substituting both content tokens (`{{title}}`, …)
240-
and UI-string tokens (`{{nav.allPatterns}}`, …).
241-
5. Writes the output to `site/<locale>/<cat>/<slug>.html`
242-
(or `site/<cat>/<slug>.html` for `en`).
229+
- **No** → use the English file and inject an "untranslated" banner
230+
(see next section).
231+
3. Loads `translations/strings/<locale>.json` deep-merged over `en.json`.
232+
4. Renders the template, substituting content tokens (`{{title}}`, …) and
233+
UI-string tokens (`{{nav.allPatterns}}`, …).
234+
5. Writes output to `site/<locale>/<cat>/<slug>.html`
235+
(or `site/<cat>/<slug>.html` for English).
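
Steps 1, 2 and 5 above could look roughly like the following. This is a sketch under assumptions, not the code in `generate.py`: the function names and the `(pattern, untranslated)` return shape are hypothetical, and only the directory layout comes from the spec:

```python
import json
from pathlib import Path

# Fields that always come from the English file, per the spec.
CODE_FIELDS = ("oldCode", "modernCode")


def resolve_pattern(root: Path, locale: str, cat: str, slug: str) -> tuple[dict, bool]:
    """Return (pattern, untranslated) for one pattern in one locale."""
    english_path = root / "content" / cat / f"{slug}.json"
    english = json.loads(english_path.read_text(encoding="utf-8"))
    if locale == "en":
        return english, False

    translated_path = root / "translations" / "content" / locale / cat / f"{slug}.json"
    if not translated_path.exists():
        # Fall back to English; the caller injects the "untranslated" banner.
        return english, True

    pattern = json.loads(translated_path.read_text(encoding="utf-8"))
    for field in CODE_FIELDS:
        if field in english:
            pattern[field] = english[field]  # ignore whatever the translator wrote
    return pattern, False


def output_path(root: Path, locale: str, cat: str, slug: str) -> Path:
    """English pages live at the site root; other locales under site/<locale>/."""
    base = root / "site" if locale == "en" else root / "site" / locale
    return base / cat / f"{slug}.html"
```

Returning the `untranslated` flag alongside the data keeps the banner decision in one place instead of re-checking the filesystem in the template step.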
243236

244-
### Untranslated Pattern Banner (optional)
237+
### Untranslated Pattern Banner
245238

246239
When falling back to English content for a non-English locale, the generator
247-
can inject a `<div class="untranslated-notice">` banner:
240+
injects:
248241

249242
```html
250243
<div class="untranslated-notice" lang="en">
@@ -259,8 +252,8 @@ The banner is suppressed when the locale is `en` or a translation file exists.
259252

260253
## Template Changes
261254

262-
Every hard-coded English string in the templates is replaced with a token.
263-
The token naming convention mirrors the key path in `strings/{locale}.json`:
255+
Every hard-coded English string in the templates is replaced with a token whose
256+
name mirrors the dot-separated key path in `strings/{locale}.json`:
264257

265258
| Before | After |
266259
|---|---|
@@ -272,7 +265,7 @@ The token naming convention mirrors the key path in `strings/{locale}.json`:
272265
| `Copied!` | `{{copy.copied}}` |
273266
| `Search patterns…` | `{{search.placeholder}}` |
274267

275-
The `<html>` tag becomes `<html lang="{{locale}}">`.
268+
The `<html>` opening tag becomes `<html lang="{{locale}}">`.
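
One way the token substitution might work is sketched below. The spec does not describe the generator's actual rendering code, so the regex, the `render` and `lookup` names, and the choice to leave unknown tokens untouched are all assumptions:

```python
import re

# Matches {{key}} or {{dotted.key.path}}; illustrative token grammar.
TOKEN = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def lookup(data: dict, dotted: str):
    """Walk a nested dict along a dot-separated key; return None if absent."""
    node = data
    for part in dotted.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node if isinstance(node, str) else None


def render(template: str, strings: dict, content: dict) -> str:
    """Replace tokens from content first, then UI strings; keep unknown tokens."""
    def replace(match: re.Match) -> str:
        key = match.group(1)
        value = lookup(content, key)
        if value is None:
            value = lookup(strings, key)
        return match.group(0) if value is None else value
    return TOKEN.sub(replace, template)


page = render(
    '<html lang="{{locale}}"><button>{{copy.copied}}</button>{{missing.key}}',
    strings={"copy": {"copied": "Copied!"}},
    content={"locale": "en"},
)
```

Leaving unresolved tokens visible in the output makes a missing key easy to spot during review; a stricter generator could raise an error instead.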
276269

277270
---
278271

@@ -281,8 +274,8 @@ The `<html>` tag becomes `<html lang="{{locale}}">`.
281274
`hreflang` alternate links are generated for every supported locale:
282275

283276
```html
284-
<link rel="alternate" hreflang="en" href="https://javaevolved.github.io/language/type-inference-with-var.html">
285-
<link rel="alternate" hreflang="pt-BR" href="https://javaevolved.github.io/pt-BR/language/type-inference-with-var.html">
277+
<link rel="alternate" hreflang="en" href="https://javaevolved.github.io/language/type-inference-with-var.html">
278+
<link rel="alternate" hreflang="pt-BR" href="https://javaevolved.github.io/pt-BR/language/type-inference-with-var.html">
286279
<link rel="alternate" hreflang="x-default" href="https://javaevolved.github.io/language/type-inference-with-var.html">
287280
```
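
As an illustration, the alternate links could be produced from the ordered locale list like this. The helper names are hypothetical; the base URL and path scheme are taken from the example above, and the first locale is assumed to be the `x-default` target:

```python
from html import escape

BASE_URL = "https://javaevolved.github.io"


def page_url(locale: str, cat: str, slug: str, default_locale: str) -> str:
    """Default-locale pages sit at the root; others under /<locale>/."""
    prefix = "" if locale == default_locale else f"/{locale}"
    return f"{BASE_URL}{prefix}/{cat}/{slug}.html"


def hreflang_links(locales: list[str], cat: str, slug: str) -> list[str]:
    """One <link> per locale, plus x-default pointing at the first (default) locale."""
    default_locale = locales[0]
    links = [
        f'<link rel="alternate" hreflang="{escape(loc)}" '
        f'href="{escape(page_url(loc, cat, slug, default_locale))}">'
        for loc in locales
    ]
    links.append(
        f'<link rel="alternate" hreflang="x-default" '
        f'href="{escape(page_url(default_locale, cat, slug, default_locale))}">'
    )
    return links


links = hreflang_links(["en", "pt-BR"], "language", "type-inference-with-var")
```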
288281

@@ -295,18 +288,19 @@ The list of locales is rendered at build time from `locales.properties`:
295288

296289
```html
297290
<select id="localePicker" aria-label="Select language">
298-
<option value="en" >English</option>
291+
<option value="en">English</option>
299292
<option value="pt-BR" selected>Português (Brasil)</option>
300293
</select>
301294
```
302295

303-
`app.js` handles path-rewriting when the user picks a different locale.
296+
`app.js` rewrites the current URL path to the equivalent page in the selected
297+
locale when the user changes the selection.
304298

305299
---
306300

307301
## `app.js` Changes
308302

309-
The search index path and the locale picker rewrite must both be locale-aware:
303+
The search index path and locale picker must both be locale-aware:
310304

311305
```js
312306
// Detect current locale from path prefix
@@ -320,8 +314,8 @@ const indexPath = locale === 'en'
320314
: `/${locale}/data/snippets.json`;
321315
```
322316

323-
Localised strings needed by JavaScript (search placeholder, "no results"
324-
message, "Copied!") are embedded as a `<script>` block by the generator:
317+
Localised strings consumed by JavaScript are embedded as a `<script>` block by
318+
the generator so `app.js` doesn't need to fetch them separately:
325319

326320
```html
327321
<script>
@@ -333,51 +327,85 @@ message, "Copied!") are embedded as a `<script>` block by the generator:
333327
</script>
334328
```
335329

336-
`app.js` reads from `window.i18n` instead of hard-coded strings.
330+
`app.js` reads from `window.i18n` instead of hard-coded literals.
337331

338332
---
339333

340334
## GitHub Actions Changes
341335

342-
The deploy workflow builds the English site first, then iterates every entry
343-
in `locales.properties` (skipping `en`) to build locale subtrees:
336+
The deploy workflow iterates all entries in `locales.properties`:
337+
338+
```yaml
339+
- name: Build site
340+
run: python3 html-generators/generate.py --all-locales
341+
```
342+
343+
Or explicitly, to support incremental locale addition:
344344
345345
```yaml
346346
- name: Build site
347347
run: |
348-
python3 html-generators/generate.py # English
348+
python3 html-generators/generate.py
349349
python3 html-generators/generate.py --locale pt-BR
350350
python3 html-generators/generate.py --locale ja
351351
```
352352
353-
Or, with a build-all mode added to the generator:
353+
---
354354
355-
```yaml
356-
- name: Build site
357-
run: python3 html-generators/generate.py --all-locales
355+
## AI-Driven Translation Workflow
356+
357+
When a new slug is added, AI generates translations automatically:
358+
359+
```
360+
New English slug → AI prompt → Translated JSON file → Schema validation → Commit
358361
```
359362

363+
### Why this architecture suits AI translation
364+
365+
- The AI receives the full English JSON and returns a complete translated JSON —
366+
no special field-filtering rules in the prompt.
367+
- `oldCode`/`modernCode` are overwritten by the build tooling, so AI can copy
368+
them verbatim without risk of hallucinated code shipping to users.
369+
- The translated file passes the same JSON schema validation as English files —
370+
no separate validation logic needed.
371+
- If the AI file does not exist yet, the fallback is an explicit "untranslated"
372+
banner rather than a silent gap.
373+
374+
### Automation steps
375+
376+
1. **Trigger** — GitHub Actions detects a new or modified
377+
`content/<cat>/<slug>.json` (push event or workflow dispatch).
378+
2. **Translate** — For each supported locale, call the translation model with:
379+
```
380+
Translate the following Java pattern JSON from English to {locale}.
381+
- Keep unchanged: slug, id, category, difficulty, jdkVersion, oldLabel,
382+
modernLabel, oldCode, modernCode, docs, related, prev, next, support.state
383+
- Translate: title, summary, explanation, oldApproach, modernApproach,
384+
whyModernWins[*].title, whyModernWins[*].desc, support.description
385+
- Return valid JSON only.
386+
```
387+
3. **Validate** — Run JSON schema validation (same rules as English content).
388+
4. **Commit** — Write the output to
389+
`translations/content/{locale}/<cat>/<slug>.json` and commit.
390+
5. **Deploy** — The generator picks it up on next build; the "untranslated"
391+
banner disappears automatically.
392+
393+
### Keeping translations in sync
394+
395+
When an English file is **modified**, the same automation regenerates the
396+
translated file or opens a PR flagging the diff for human review. A CI check
397+
can compare `id`, `slug`, and `jdkVersion` between the English and translated
398+
files to detect stale translations.
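
The CI comparison described here might be as small as the sketch below. The function name and the sample values are hypothetical. Note that comparing only these three fields would catch structural drift but not edits to English prose such as `summary` or `explanation`, so it would likely need to be paired with another signal (for example, file modification history) to flag every stale translation:

```python
# Fields the spec proposes comparing between English and translated files.
IDENTITY_FIELDS = ("id", "slug", "jdkVersion")


def mismatched_fields(english: dict, translated: dict) -> list[str]:
    """Return the identity fields whose values differ between the two files."""
    return [
        field
        for field in IDENTITY_FIELDS
        if english.get(field) != translated.get(field)
    ]


# Hypothetical sample data for illustration only.
english = {"id": 1, "slug": "type-inference-with-var", "jdkVersion": "10"}
in_sync = {"id": 1, "slug": "type-inference-with-var", "jdkVersion": "10"}
stale = {"id": 1, "slug": "type-inference-with-var", "jdkVersion": "9"}

ok = mismatched_fields(english, in_sync)
drift = mismatched_fields(english, stale)
```

A CI job could fail the build, or open the review PR mentioned above, whenever the returned list is non-empty.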
399+
360400
---
361401

362402
## Migration Path
363403

364404
| Phase | Work |
365405
|---|---|
366-
| 1 | Create `translations/strings/en.json` by extracting every hard-coded string from templates; replace literals with `{{…}}` tokens; verify English output is byte-identical |
367-
| 2 | Add `locales.properties`; extend generator to load strings, support `--locale`, fall back gracefully |
368-
| 3 | Add language selector to nav + `app.js` locale detection and path rewrite |
406+
| 1 | Extract every hard-coded string from templates into `translations/strings/en.json`; replace literals with `{{…}}` tokens; verify English output is unchanged |
407+
| 2 | Add `locales.properties`; extend generator to load strings, support `--locale`, and fall back gracefully |
408+
| 3 | Add language selector to nav; implement `app.js` locale detection and path rewrite |
369409
| 4 | Translate `strings/pt-BR.json` and 2–3 content files as a proof-of-concept; verify fallback banner |
370410
| 5 | Update GitHub Actions; add `hreflang` alternate links |
371-
| 6 | Open translation contribution guide; document `translations/` schema |
372-
373-
---
374-
375-
## Trade-offs
376-
377-
| Pros | Cons |
378-
|---|---|
379-
| English is a first-class locale — no special-cased code paths | Larger initial refactor (all template strings must be extracted) |
380-
| Complete translation files are easy for translators to understand | Full content files are larger and harder to keep in sync with English originals |
381-
| Untranslated-page fallback is explicit and user-visible | `translations/` directory adds a new top-level location to learn |
382-
| UI strings fall back to English automatically — no silent gaps | `app.js` must be updated to consume `window.i18n` instead of literals |
383-
| Scales cleanly to many locales | Build time grows linearly with locale count × pattern count |
411+
| 6 | Wire up AI translation automation; add `translations/` schema documentation |
