fix(tokenizer): fall back to direct fast-tokenizer load when model config build fails by hallerite · Pull Request #72 · PrimeIntellect-ai/renderers

hallerite · 2026-05-27T22:00:48Z

Problem

Loading a tokenizer for poolside/Laguna-XS.2 (and any model with nested per-layer rope_parameters) crashes renderers:

KeyError: "Missing required keys in `rope_parameters` for 'rope_type'='default': {'rope_theta'}"

The cause is entirely a modeling-layer concern leaking into tokenizer loading:

AutoTokenizer.from_pretrained always constructs the model config first (to resolve the tokenizer class) — even for a plain PreTrainedTokenizerFast.
Building Laguna's config runs HF's RoPE validator. Laguna's rope_parameters are nested (full_attention / sliding_attention) with no top-level rope_theta. vLLM's patch_rope_parameters injects rope_theta via standardize_rope_params() before validating, so vLLM loads fine — but plain transformers validates during __init__ without that step and raises.
AutoTokenizer only catches ValueError/OSError, so the KeyError escapes and kills the load.
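
The three steps above can be illustrated with a small stand-in. This is a sketch, not transformers' code: `validate_rope` and `resolve_tokenizer_class` are hypothetical names, the validator only mimics the reported check for a top-level `rope_theta`, and the `rope_theta` values are placeholders. It shows why the `KeyError` passes straight through a handler that catches only `ValueError`/`OSError`: `KeyError` derives from `LookupError`, not from either of them.

```python
# Hypothetical stand-ins; this is not the actual transformers implementation.

def validate_rope(rope_parameters: dict) -> None:
    """Mimic a validator that expects a top-level 'rope_theta' key."""
    missing = {"rope_theta"} - set(rope_parameters)
    if missing:
        raise KeyError(f"Missing required keys in `rope_parameters`: {missing}")


def resolve_tokenizer_class(rope_parameters: dict) -> str:
    """Mimic a loader that builds the config first and only guards
    against ValueError/OSError, as described above."""
    try:
        validate_rope(rope_parameters)  # stands in for config construction
    except (ValueError, OSError):
        return "fallback-class"  # never reached for a KeyError
    return "PreTrainedTokenizerFast"


# Nested per-layer parameters with no top-level rope_theta (shape assumed,
# values are placeholders).
nested = {
    "full_attention": {"rope_type": "default", "rope_theta": 10000.0},
    "sliding_attention": {"rope_type": "default", "rope_theta": 10000.0},
}

try:
    resolve_tokenizer_class(nested)
    escaped = None
except KeyError as exc:
    escaped = exc  # the narrow handler did not catch the KeyError
```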

renderers needs the tokenizer, not the model — it should never have been dragged through RoPE validation.

Fix

When AutoTokenizer.from_pretrained fails while building the model config, fall back to loading the repo's self-contained tokenizer.json directly via PreTrainedTokenizerFast, which never touches the model config. The fallback:

is modeling-agnostic — no Laguna/RoPE-specific knowledge, just "if the model config blew up but the tokenizer is self-describing, load it directly";
runs under the fastokens patch, so Laguna keeps the Rust fast-path speedup (verified: backend is the fastokens shim, encode output byte-identical to vanilla);
excludes custom auto_map tokenizers (e.g. Kimi-K2), which must keep going through AutoTokenizer + trust_remote_code;
re-raises the original error if there's no usable fast tokenizer, so genuine failures still surface.
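
The behaviour listed above can be sketched as a small wrapper. This is an illustration under assumptions, not the code in renderers/base.py: `load_with_fallback` and its two callable parameters are hypothetical, standing in for the `AutoTokenizer` call and the direct `tokenizer.json` loader. The direct loader is assumed to return `None` when there is no usable fast tokenizer or the repo declares a custom `auto_map`.

```python
from typing import Any, Callable, Optional


def load_with_fallback(
    load_via_auto: Callable[[], Any],
    load_directly: Callable[[], Optional[Any]],
) -> Any:
    """Try the normal loader; on failure, try the direct loader.

    Hypothetical sketch: both loaders are injected so the control flow
    can be exercised without transformers installed.
    """
    try:
        return load_via_auto()
    except Exception as original:
        try:
            tokenizer = load_directly()
        except Exception:
            tokenizer = None  # a failing fallback must not hide `original`
        if tokenizer is None:
            raise original  # no usable fast tokenizer: surface the real error
        return tokenizer
```

Injecting the loaders keeps the wrapper free of any model-specific knowledge, which matches the "modeling-agnostic" goal: the wrapper only knows that one path failed and whether the other produced a tokenizer.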

Laguna now loads via the fastokens fast path and routes to LagunaXS2Renderer, with a single clear INFO line and no misleading "fastokens could not load" warning.

Verification

poolside/Laguna-XS.2: load_tokenizer → fastokens-backed TokenizersBackend, create_renderer → LagunaXS2Renderer, encode matches vanilla.
Qwen/Qwen3-0.6B and the existing tests/test_load_tokenizer_fastokens.py suite (9 tests): no regression.

🤖 Generated with Claude Code

Note

Medium Risk
Changes central tokenizer loading for every model. Behavior is gated on auto_map and the original error is re-raised when no fast tokenizer exists, but the broad except on the fallback path could mask unrelated load failures in edge cases.

Overview
Tokenizer loading no longer dies when Hugging Face builds the model config during AutoTokenizer.from_pretrained (e.g. RoPE validation on poolside/Laguna-XS.2). A shared _load_tokenizer_via_auto wrapper tries AutoTokenizer first; on failure it loads PreTrainedTokenizerFast from tokenizer.json via _load_fast_tokenizer_directly, skipping config construction when the repo has no custom auto_map tokenizer.

_patched_load and all load_tokenizer paths (vanilla, fastokens, fastokens fallback) now go through that wrapper so the fallback applies under the fastokens patch too. Custom remote tokenizers still require AutoTokenizer; if direct load isn’t safe, the original exception is re-raised.

Reviewed by Cursor Bugbot for commit 30655d6.

Note

Fix tokenizer loading to fall back to direct `tokenizer.json` load when `AutoTokenizer` fails

Adds _load_fast_tokenizer_directly in renderers/base.py to load a PreTrainedTokenizerFast straight from tokenizer.json, bypassing model config construction.
Adds _load_tokenizer_via_auto as a wrapper around AutoTokenizer.from_pretrained that catches failures and retries via the new direct loader, re-raising the original exception only if the fallback also fails.
Updates load_tokenizer and _patched_load to use _load_tokenizer_via_auto so the fallback applies in all tokenizer load paths.
The direct fallback is skipped if the tokenizer config declares an auto_map entry, since those custom tokenizers must keep going through AutoTokenizer with trust_remote_code.
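
The `auto_map` gate can be pictured as a check on the parsed `tokenizer_config.json`. A hedged sketch: `declares_custom_tokenizer` is a hypothetical helper, and the `auto_map` layout shown (an `AutoTokenizer` entry naming a class in the repo's own Python module) is an assumption about how such repos are typically configured, not something taken from this PR's diff.

```python
from typing import Any, Mapping


def declares_custom_tokenizer(tokenizer_config: Mapping[str, Any]) -> bool:
    """Return True when the config maps AutoTokenizer to custom code."""
    auto_map = tokenizer_config.get("auto_map") or {}
    if not isinstance(auto_map, Mapping):
        return False
    return bool(auto_map.get("AutoTokenizer"))


# Plain fast tokenizer: a direct load from tokenizer.json would be allowed.
plain = {"tokenizer_class": "PreTrainedTokenizerFast"}

# Custom tokenizer shipped as repo code (module and class names invented).
custom = {"auto_map": {"AutoTokenizer": ["my_tokenization.MyTokenizer", None]}}
```

A loader following the summary above would skip the direct path whenever this check returns `True` and let the original exception surface.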

Macroscope summarized 30655d6.

…nfig build fails

`AutoTokenizer.from_pretrained` eagerly constructs the *model* config to resolve the tokenizer class, even for a plain `PreTrainedTokenizerFast`. That construction runs HF's RoPE validator, which rejects configs carrying nested `rope_parameters` (e.g. poolside/Laguna-XS.2: `full_attention` / `sliding_attention` blocks with no top-level `rope_theta`) when the config is built outside vLLM's `patch_rope_parameters`. The resulting `KeyError` escapes (AutoTokenizer only catches `ValueError`/`OSError`) and kills the tokenizer load: a modeling-only concern breaking something the tokenizer never needed.

renderers needs the tokenizer, not the model. When `AutoTokenizer` fails while building the config, fall back to loading the repo's self-contained `tokenizer.json` directly via `PreTrainedTokenizerFast`, which never touches the model config. The fallback runs under the fastokens patch, so models like Laguna keep the Rust fast-path speedup. Custom `auto_map` tokenizers and repos without a fast tokenizer are left to surface the original error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.


cursor · 2026-05-27T22:12:15Z

+    surface its original error instead.
+    """
+    from transformers import PreTrainedTokenizerFast
+    from transformers.models.auto.tokenization_auto import get_tokenizer_config


Private import outside try/except masks original errors

Low Severity

The imports in _load_fast_tokenizer_directly — especially the private from transformers.models.auto.tokenization_auto import get_tokenizer_config — sit outside the try/except block at line 1132. Since this function is called from within _load_tokenizer_via_auto's except handler, an ImportError from the private API path would propagate upward and effectively replace the original meaningful exception (e.g. the KeyError from RoPE validation). Moving both imports inside the existing try block would let any import failure return None gracefully, preserving the original error for re-raise.

Additional Locations (1)

renderers/base.py#L1131-L1139
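
The suggested change amounts to moving the imports under the same guard as the load. A minimal sketch, assuming the direct loader signals failure by returning `None`; `load_directly_or_none` is a hypothetical name, and `importlib.import_module` stands in for the function-local `from ... import` statements shown in the diff.

```python
import importlib
from typing import Any, Optional


def load_directly_or_none(module_name: str, attr: str) -> Optional[Any]:
    """Resolve `attr` from `module_name`, returning None on any failure.

    Because the import happens inside the try block, an ImportError (for
    example from a private module path that moved) becomes None rather
    than an exception raised from inside the caller's except handler.
    """
    try:
        module = importlib.import_module(module_name)
        return getattr(module, attr)
    except Exception:
        return None
```

With this shape the caller can treat `None` uniformly as "fallback unavailable" and re-raise the error it was already handling.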


macroscopeapp · 2026-05-27T22:12:59Z

Approvability

Verdict: Needs human review

This PR introduces a new fallback code path for tokenizer loading (~60 lines of new logic) that changes runtime behavior when AutoTokenizer fails. The use of a private transformers API and the non-trivial error-handling changes warrant human verification of the approach.


Bumps the deps/renderers submodule 2ec28a8 (v0.1.8.dev28) -> 89ab3f0 (v0.1.8.dev35), pulling in PrimeIntellect-ai/renderers#72: when AutoTokenizer.from_pretrained fails while building the model config (e.g. HF RoPE validation rejecting nested rope_parameters for poolside/Laguna-XS.2), fall back to loading the repo's self-contained tokenizer.json directly.

Fixes the tokenizer load crash for the Laguna model series; loads under the fastokens fast path.

Re-locks uv.lock: renderers now floors openai-harmony at >=0.0.4 (renderers#69).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>


hallerite marked this pull request as ready for review May 27, 2026 22:01

hallerite force-pushed the fix/tokenizer-config-rope-fallback branch from aec2346 to 30655d6 Compare May 27, 2026 22:02

rasdani self-requested a review May 27, 2026 22:06

rasdani approved these changes May 27, 2026

cursor Bot reviewed May 27, 2026

hallerite merged commit 89ab3f0 into main May 27, 2026
11 checks passed

hallerite deleted the fix/tokenizer-config-rope-fallback branch May 27, 2026 22:19

hallerite mentioned this pull request May 27, 2026

chore(deps): bump renderers (tokenizer config-build fallback for Laguna) PrimeIntellect-ai/prime-rl#2657

Merged

This was referenced May 28, 2026

fix(tokenizer): apply config-build fallback to offset tokenizer too #75

Merged

chore(deps): bump renderers (offset-tokenizer config-build fallback for Laguna) PrimeIntellect-ai/prime-rl#2663

Merged
