
Compose split routed experts from vLLM responses #1349

Open
S1ro1 wants to merge 2 commits into main from feat/split-routed-experts

Conversation

@S1ro1 (Contributor) commented on May 11, 2026

Summary

  • add a split routed-experts composer for prompt_routed_experts plus completion routed_experts
  • decode the new base64 routed-experts payload emitted by patched vLLM when vllm_xargs.routed_experts_encoding = "base64" (see the sketch after this list)
  • update chat, completions, and renderer clients to consume vLLM's new split routed-experts response shape
  • remove the old base85 routed_experts decode path
  • preserve prompt+completion routing when truncating response tokens
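
A minimal sketch of what the new decode path could look like, based only on the base64 encoding named above; the payload field names ("data", "num_tokens") and the int32 dtype are assumptions for illustration, not the PR's actual schema:

```python
import base64

import numpy as np


def _decode_routed_experts(payload: dict) -> np.ndarray:
    """Decode a base64 routed-experts payload into a (num_tokens, top_k) array.

    Field names and dtype here are illustrative assumptions; the PR only
    establishes that patched vLLM emits base64 when
    vllm_xargs.routed_experts_encoding = "base64".
    """
    raw = base64.b64decode(payload["data"])
    experts = np.frombuffer(raw, dtype=np.int32)
    # One row per token, top-k expert ids per row (shape assumed).
    return experts.reshape(payload["num_tokens"], -1)
```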

Validation

  • uvx ruff@0.15.12 format --isolated --check .
  • uvx ruff@0.15.12 check --isolated .
  • uv run --no-sync ty check verifiers/clients/openai_chat_completions_client.py verifiers/clients/openai_completions_client.py verifiers/clients/renderer_client.py verifiers/clients/routed_experts.py verifiers/utils/response_utils.py
  • uv run ruff check verifiers/clients/routed_experts.py
  • uv run python -m py_compile verifiers/clients/routed_experts.py

Note

Medium Risk
Updates token parsing across multiple clients to the new vLLM prompt_routed_experts/routed_experts shapes, which can affect downstream telemetry and analysis if alignment or decoding assumptions are wrong.

Overview
Adds compose_split_routed_experts() to decode and merge vLLM’s split routing outputs (prompt_routed_experts + per-choice routed_experts) into a single ResponseTokens.routed_experts array aligned to prompt+completion tokens (including padding the final completion token).
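
A rough sketch of the happy path described above, where both halves are present, reusing the hypothetical _decode_routed_experts from the sketch under Summary; the zero pad for the final completion token (and its dtype) is an assumption, since the PR only says that token is padded:

```python
import numpy as np


def compose_split_routed_experts(prompt_routed_experts, completion_routed_experts, prompt_len):
    # Sketch: both halves present; guard mirrors the snippet quoted in
    # the review below.
    if prompt_routed_experts is None and completion_routed_experts is None:
        return None
    prompt = _decode_routed_experts(prompt_routed_experts)
    assert prompt.shape[0] == prompt_len
    completion = _decode_routed_experts(completion_routed_experts)
    # Pad the final completion token, which has no routing row of its
    # own (zero-padding is an assumption, not confirmed by the PR).
    pad = np.zeros((1, completion.shape[1]), dtype=completion.dtype)
    return np.concatenate([prompt, completion, pad], axis=0)
```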

Updates the chat, completions, and renderer clients to consume the new split fields and removes the prior inline base85/np.frombuffer decode path. Fixes truncation handling so routed_experts is truncated consistently with the combined prompt+completion token window.

Reviewed by Cursor Bugbot for commit b76a6bb.

@S1ro1 force-pushed the feat/split-routed-experts branch from 277ab6e to 8dbd674 on May 11, 2026 at 22:28
@S1ro1 marked this pull request as ready for review on May 11, 2026 at 22:42
@cursor (Bot) left a comment


Cursor Bugbot has reviewed your changes and found 3 potential issues.


Reviewed by Cursor Bugbot for commit 8dbd674.

```diff
  completion_logprobs = tokens.completion_logprobs[: max_seq_len - prompt_len]
  if routed_experts is not None:
-     routed_experts = routed_experts[: max_seq_len - prompt_len]
+     routed_experts = routed_experts[:max_seq_len]
```

Overlong prompt truncation discards prompt routing data

High Severity

The semantics of routed_experts changed from completion-only to prompt+completion combined, but the overlong-prompt truncation path at line 50 still sets routed_experts = []. This discards all prompt routing data when the prompt exceeds max_seq_len. The author correctly updated line 57 from routed_experts[: max_seq_len - prompt_len] to routed_experts[:max_seq_len] for the normal truncation case, but missed the analogous fix here — it needs to be routed_experts[:max_seq_len] instead of [].
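
A sketch of the fix being suggested, using the variable names visible in the quoted diff; the shape of the surrounding overlong-prompt branch is a reconstruction, not quoted code:

```python
# Overlong-prompt path (reconstructed): the prompt alone fills or
# exceeds max_seq_len, so no completion tokens survive truncation.
if prompt_len >= max_seq_len:
    completion_tokens = []
    completion_logprobs = []
    if routed_experts is not None:
        # routed_experts now spans prompt + completion, so keep the
        # first max_seq_len rows instead of discarding everything:
        routed_experts = routed_experts[:max_seq_len]  # was: routed_experts = []
```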

Additional Locations (1)

```python
        return None

    prompt = _decode_routed_experts(prompt_routed_experts)
    assert prompt.shape[0] == prompt_len
```

Compose function crashes when only completion routing is present

Medium Severity

compose_split_routed_experts unconditionally calls _decode_routed_experts(prompt_routed_experts) after the early-return guard. If prompt_routed_experts is None while completion_routed_experts is not, the early return is skipped (it only fires when both are None) and _decode_routed_experts(None) crashes with a TypeError on len(None). The function handles the prompt-present/completion-absent case but not the reverse.
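
One way to make the guard symmetric, sketched under the assumption that each half can be decoded independently (np and the hypothetical _decode_routed_experts as in the earlier sketches):

```python
def compose_split_routed_experts(prompt_routed_experts, completion_routed_experts, prompt_len):
    # Decode each half only when present, so a lone completion payload
    # never reaches _decode_routed_experts(None).
    parts = []
    if prompt_routed_experts is not None:
        prompt = _decode_routed_experts(prompt_routed_experts)
        assert prompt.shape[0] == prompt_len
        parts.append(prompt)
    if completion_routed_experts is not None:
        parts.append(_decode_routed_experts(completion_routed_experts))
    if not parts:
        return None
    return np.concatenate(parts, axis=0)
```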


```python
    choice_any, "routed_experts"
):
    prompt_routed_experts = response_any.prompt_routed_experts
    completion_routed_experts = choice_any.routed_experts
```

or guard accesses both fields when one is missing

Medium Severity

All three callers use or to check whether either routing field exists, then unconditionally access both. If only prompt_routed_experts exists on the response but not routed_experts on the choice (or vice versa), the missing attribute access raises AttributeError in the OpenAI clients or KeyError in the renderer's dict client. Using and, or individual getattr()/.get() fallbacks, would be safer.
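
A sketch of the safer caller-side check, using the names from the quoted snippet; whether prompt_len is in scope at the call sites is an assumption:

```python
# Attribute-style clients (chat / completions):
prompt_routed_experts = getattr(response_any, "prompt_routed_experts", None)
completion_routed_experts = getattr(choice_any, "routed_experts", None)
if prompt_routed_experts is not None or completion_routed_experts is not None:
    routed_experts = compose_split_routed_experts(
        prompt_routed_experts, completion_routed_experts, prompt_len
    )

# The renderer's dict client would use .get() instead:
# prompt_routed_experts = response_any.get("prompt_routed_experts")
# completion_routed_experts = choice_any.get("routed_experts")
```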

Additional Locations (2)
