Rust PyO3 renderer port and native performance paths by ThomAub · Pull Request #2 · ThomAub/renderers

ThomAub · 2026-05-21T08:01:13Z

Summary

Add the Rust/PyO3 renderer port with native parity coverage across the supported families.
Add native runtime performance paths: prepared tools, sessions, NumPy/packed outputs, fast role/content inputs, tool text caching, and TokenPlanBuf batch dynamic encoding.
Update SGLang and vLLM examples to use native prepared tools and sessions when available while keeping token-ID engine contracts unchanged.

Verification

cargo fmt --all -- --check
cargo clippy --workspace --all-targets --locked
cargo test --workspace
uv run ruff check examples/sglang/multiturn_generate_sglang.py examples/sglang/online_multiturn_sglang.py examples/vllm/multiturn_generate_vllm.py benchmarks/native_vs_python_qwen3.py tests/test_native_numpy.py
uv run ruff format --check examples/sglang/multiturn_generate_sglang.py examples/sglang/online_multiturn_sglang.py examples/vllm/multiturn_generate_vllm.py benchmarks/native_vs_python_qwen3.py tests/test_native_numpy.py
uv run maturin develop --manifest-path crates/renderers-py/Cargo.toml --release
uv run pytest -m parity tests/test_native_parity.py -q -rs
env RENDERERS_NATIVE=all uv run pytest tests/test_render_ids.py tests/test_bridge.py tests/test_roundtrip.py tests/test_message_indices.py tests/test_native_router.py tests/test_native_vision.py tests/test_native_numpy.py -q -rs

Benchmark Artifacts

/private/tmp/renderers-native-tokenplan-family-smoke-002.json
/private/tmp/renderers-native-tokenplan-family-smoke-002.md

ThomAub · 2026-05-21T08:04:26Z

Completion audit passed for the native runtime performance pass.

Evidence:

TokenPlanBuf and encode_batch_no_special are in emit.rs / tokenizer.rs.
Long no-tool render_ids uses TokenPlanBuf in DeepSeek V3, Qwen35/Qwen36, MiniMax M2, and GLM.
vLLM and SGLang examples now use prepare_tools(...) plus new_session(...) when native APIs exist.
render_fast_ids(...) is exposed in PyO3 and documented for role/content serving loops.
Verification passed: clippy, cargo tests, full native parity, and native-forced suite.
Matched benchmark artifact: /private/tmp/renderers-native-tokenplan-family-smoke-002.json.

Measured long-history direct render_ids native list improvements versus the prior smoke artifact:

family	prior native list	current native list	prior speedup vs Python	current speedup vs Python
DeepSeek V3	907.148 us	123.459 us	1.57x	4.96x
Qwen35	823.732 us	232.878 us	2.71x	9.20x
Qwen36	824.496 us	252.372 us	2.81x	9.35x
MiniMax M2	818.452 us	218.835 us	2.24x	8.10x
GLM5	678.772 us	200.545 us	2.68x	8.88x
GLM5.1	665.796 us	222.906 us	3.02x	7.94x
GLM4.5	682.401 us	202.465 us	2.80x	9.43x

Focused family geomeans from the matched benchmark run:

family	list geomean	NumPy geomean
DeepSeek V3	3.28x	3.51x
Qwen35	3.85x	4.19x
Qwen36	3.82x	4.16x
MiniMax M2	6.16x	6.75x
GLM5	2.86x	3.11x
GLM5.1	2.85x	3.11x
GLM4.5	3.33x	3.65x

ThomAub · 2026-05-28T12:09:20Z

Keep in mind PrimeIntellect-ai#70

ThomAub · 2026-05-28T12:35:00Z

Bench:

render_batch_ids short_batch: native list 95.246us -> 87.370us, native np 92.104us -> 84.731us.
render_batch_ids short_batch_prepared_tools: native np 137.099us -> 130.805us.
session_render_ids long_history_gen_prompt: native list 228.157us -> 212.276us.

ThomAub force-pushed the rust-pyo3-port branch 2 times, most recently from f048690 to 23d82c3 Compare May 21, 2026 09:32

ThomAub added 15 commits May 28, 2026 14:07

Add Rust native renderer foundation

e8d8abe

Add Qwen3 native parity path

62625b6

Add Qwen3.5 native parity path

57fc44a

Add DeepSeek V3 native parity path

930775f

Add Qwen3.6 native parity path

bd95817

Add Nemotron 3 native parity path

98c9ffa

Add GLM native parity paths

63713c3

Add Kimi K2 native parity path

6804f55

Add MiniMax M2 native parity path

3067ed3

Add Kimi K2.5 native parity path

6d303bb

Add GPT-OSS native parity path

ee68dd4

Add DefaultRenderer native parity path

31939db

Add Qwen-VL native multimodal parity

32df079

Add native Qwen-VL image processing parity

1b5c7cb

Harden native family parity

9899d2c

ThomAub added 11 commits May 28, 2026 14:09

Fix native review regressions

966dad4

Fix native token parity

69ba647

Tighten Rust native lint coverage

6527b65

Harden native parity tests

54270ff

Enable Kimi native tokenizer parity

0f5dd50

Add native runtime benchmark

df2a5f2

Expand native runtime benchmark

9042447

Trim native token bridge overhead

61925c3

Add native NumPy token fast paths

05e170e

Fix native workspace manifest

5152854

Restore native multimodal type surface

8a53b30

ThomAub added 7 commits May 28, 2026 14:09

Fix DeepSeek native tool parity

5608d57

Extend native runtime benchmark families

cf4196e

Improve native runtime performance paths

db2905a

Align native Python API surface

25eb78a

Fix Ruff import skip references

6c945be

Apply Ruff formatting

bad105c

Fix native renderer config constructors

d7e8a58

ThomAub force-pushed the rust-pyo3-port branch from f4e39b6 to d7e8a58 Compare May 28, 2026 12:17

Optimize native binding batch conversions

7be0ed7

Fix renderer compatibility and clippy

fa47618

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rust PyO3 renderer port and native performance paths#2

Rust PyO3 renderer port and native performance paths#2
ThomAub wants to merge 35 commits into
mainfrom
rust-pyo3-port

ThomAub commented May 21, 2026

Uh oh!

ThomAub commented May 21, 2026

Uh oh!

ThomAub commented May 28, 2026

Uh oh!

ThomAub commented May 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ThomAub commented May 21, 2026

Summary

Verification

Benchmark Artifacts

Uh oh!

ThomAub commented May 21, 2026

Uh oh!

ThomAub commented May 28, 2026

Uh oh!

ThomAub commented May 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant