
Rust PyO3 renderer port and native performance paths #73

Draft
ThomAub wants to merge 35 commits into PrimeIntellect-ai:main from ThomAub:rust-pyo3-port

@ThomAub ThomAub commented May 28, 2026

Summary

Adds the Rust/PyO3 native renderer port with native parity paths.

  • NumPy zero-copy token APIs
  • Renderer sessions and prepared tools

Current benchmark perf

Latest local qwen3 benchmark on this branch:

  • native list geomean: 6.25x over Python
  • native NumPy geomean: 6.81x over Python
  • all-family smoke benchmark completed 327 measured rows

Command used:

uv run python benchmarks/native_vs_python_qwen3.py --families qwen3 --min-time 0.03 --repeats 3 --memory-loops 50
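
The geomean figures above summarize many per-row speedups as a single number. How benchmarks/native_vs_python_qwen3.py aggregates its rows is not shown in this PR description, so the following is only a minimal sketch of the usual calculation, assuming each measured row yields one Python time and one native time. The function names and the timings are hypothetical.

```python
import math


def speedup(python_seconds, native_seconds):
    """Speedup of native over Python for one benchmark row (hypothetical helper)."""
    if python_seconds <= 0 or native_seconds <= 0:
        raise ValueError("timings must be positive")
    return python_seconds / native_seconds


def geomean(ratios):
    """Geometric mean of positive ratios, computed in log space for stability."""
    ratios = list(ratios)
    if not ratios:
        raise ValueError("need at least one ratio")
    if any(r <= 0 for r in ratios):
        raise ValueError("ratios must be positive")
    return math.exp(math.fsum(math.log(r) for r in ratios) / len(ratios))


# Made-up timings in seconds, NOT taken from the benchmark in this PR.
rows = [(0.80, 0.40), (1.60, 0.20)]
ratios = [speedup(py, native) for py, native in rows]  # 2x and 8x
print(f"geomean speedup: {geomean(ratios):.2f}x")
```

A geometric mean is the conventional choice for averaging ratios because one very fast row cannot dominate the result the way it would in an arithmetic mean.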

Note

Add Rust/PyO3 native renderer implementations for multiple model families

  • Introduces a Rust workspace (crates/) with three members: renderers-core (pure Rust rendering/parsing logic), renderers-py (PyO3 Python bindings compiled as renderers_native), and renderers-cli (a CLI for diffing/benchmarking).
  • Adds native renderer implementations for Qwen3, Qwen3.5, Qwen3.6, DeepSeekV3, GLM-4.5/5/5.1, Kimi K2/K2.5, MiniMax M2, Nemotron3, GPT-OSS, and a Jinja-based default renderer, all sharing a common Renderer trait with render, parse_response, stop_token_ids, and bridge_to_next_turn.
  • Python renderer classes (Qwen3Renderer, KimiK25Renderer, etc.) now override __new__ to return native Rust instances when the RENDERERS_NATIVE environment variable selects the family and the compiled extension is importable; otherwise they fall back to pure-Python construction.
  • Adds _native_router.py for lazy extension loading and tokenizer path resolution, and _native_vision.py for native Qwen-VL image preprocessing with optional NumPy output.
  • Adds parity, routing, vision, and NumPy test suites under tests/ and a Rust CI workflow covering fmt, clippy, cargo test, and Miri.
  • Risk: __new__ overrides mean instantiating any patched renderer class may return an opaque Rust object; code that introspects instance type or calls Python-only attributes on the returned object will break silently when the native path is active.

Macroscope summarized fa47618.
