Feat/tito dynamo transport #1289
Conversation
…okenClient
The verifiers TITO client previously only spoke vLLM's TITO surface
(/v1/chat/completions/tokens for the final POST, /tokenize for bridge
tokenization). Dynamo bis/dynamo-rl serves neither route, so multi-turn
TITO against Dynamo silently degraded to MITO on every turn from turn 2
onward via the existing fallback path.
This commit teaches the TITO client to read ClientConfig.renderer_transport
(same field RendererClient consults) and route accordingly:
- prime_vllm_generate (default): unchanged; posts to
  /v1/chat/completions/tokens and uses /tokenize over HTTP.
- dynamo_chat_nvext: bridge tokenization runs locally via the renderers
  package (zero round trips); the final POST goes to /v1/chat/completions
  with placeholder messages plus nvext.token_data carrying the stitched
  prompt_ids and explicit stop_token_ids from renderer.get_stop_token_ids().
Wire shape matches what RendererClient already produces for the
same transport, so a Dynamo deployment validated against renderer
mode automatically accepts TITO traffic too.
Adds two unit tests that assert the dynamo-transport wire shape end-to-
end via a recording client + stub renderer (no real tokenizer download).
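The dynamo_chat_nvext request shape described above can be sketched as follows. This is a minimal illustration, not the client's actual code: the function name `build_dynamo_tito_body` is hypothetical, and the exact layout of the fields inside `nvext.token_data` (and which sampling keys get promoted) is assumed from the commit message.

```python
def build_dynamo_tito_body(
    model: str,
    prompt_ids: list[int],
    stop_token_ids: list[int],
    sampling_args: dict,
) -> dict:
    """Sketch of a /v1/chat/completions body carrying stitched prompt
    ids via nvext.token_data, per the dynamo_chat_nvext transport.

    The messages field is a placeholder: the engine is expected to read
    prompt_ids from nvext.token_data instead of re-rendering the chat
    template.
    """
    body = {
        "model": model,
        "messages": [{"role": "user", "content": ""}],  # placeholder
        "nvext": {
            "token_data": {
                "prompt_ids": prompt_ids,
                "stop_token_ids": stop_token_ids,
            }
        },
    }
    # Promote known sampling knobs to the top level, skipping unset ones.
    for key in ("temperature", "top_p", "max_tokens"):
        value = sampling_args.get(key)
        if value is not None:
            body[key] = value
    return body


body = build_dynamo_tito_body(
    model="demo-model",
    prompt_ids=[1, 2, 3],
    stop_token_ids=[0],
    sampling_args={"temperature": 0.7, "max_tokens": None},
)
print(body["nvext"]["token_data"]["prompt_ids"])  # [1, 2, 3]
```

Because this wire shape matches what RendererClient already emits for the same transport, a backend that validates one path accepts the other.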
```python
value = sampling_args.get(key, extra_body.get(key))
if value is not None:
    body[key] = value
path = "/chat/completions"
```
Dynamo transport silently drops cache_salt in renderer path
High Severity
The dynamo_chat_nvext transport path rebuilds the request body, silently dropping cache_salt (and other sampling arguments like prompt_logprobs) from sampling_args or extra_body. This prevents proper prefix-cache KV invalidation after policy updates, causing the engine to reuse stale KV and leading to training regression.
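A minimal fix sketch: extend the set of promoted keys so cache_salt (and prompt_logprobs) survive the body rebuild. The tuple contents and the helper name `promote_sampling_args` are assumptions mirroring the snippet above, not the project's exact code.

```python
# Keys forwarded from sampling_args/extra_body into the rebuilt body.
# cache_salt and prompt_logprobs are the ones the review flags as dropped.
PROMOTED_KEYS = (
    "temperature",
    "top_p",
    "max_tokens",
    "cache_salt",       # required so prefix-cache KV is invalidated after policy updates
    "prompt_logprobs",  # preserved for training-side logprob consumers
)


def promote_sampling_args(body: dict, sampling_args: dict, extra_body: dict) -> dict:
    # Same lookup order as the snippet: sampling_args wins over extra_body.
    for key in PROMOTED_KEYS:
        value = sampling_args.get(key, extra_body.get(key))
        if value is not None:
            body[key] = value
    return body


body = promote_sampling_args({}, {"cache_salt": "step-42"}, {"temperature": 0.2})
```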
Reviewed by Cursor Bugbot for commit c1ecc7a.
```python
renderer: str = "auto"
renderer_model_name: str | None = None
renderer_pool_size: int | None = None
renderer_transport: RendererTransport = "prime_vllm_generate"
```
New renderer_transport config field missing from docs
Low Severity
The PR adds renderer_transport to ClientConfig, which is a user-facing configuration class documented in docs/reference.md. The ClientConfig section in that file does not reflect the new field. Per project rules, any PR that modifies core user-facing functionality described in docs/ must update the relevant documentation.
Triggered by project rule: BugBot Instructions
Reviewed by Cursor Bugbot for commit c1ecc7a.
```python
passthrough = {
    k: v
    for k, v in extra_body.items()
    if k not in promotable and v is not None and k not in body
}
body.update(passthrough)
```
priority in extra_body leaks through passthrough
Low Severity
When priority is supplied via extra_body, it is placed in nvext["agent_hints"]["priority"] (lines 248-250) and also leaks through the passthrough dict as a top-level body["priority"] (lines 286-291), because "priority" is neither in promotable nor already a top-level key in body. The renderer client's completions_request has no passthrough logic, so there priority ends up only in nvext, leaving the two transports with inconsistent wire shapes.
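One way to make the two transports consistent is to exclude nvext-routed keys from the passthrough. A sketch, assuming the `promotable`/`passthrough` names from the snippet; `NVEXT_ONLY_KEYS` and `build_passthrough` are hypothetical helpers:

```python
# Keys that are already routed into nvext (e.g. agent_hints) and must
# never also appear at the top level of the request body.
NVEXT_ONLY_KEYS = {"priority"}


def build_passthrough(extra_body: dict, promotable: set, body: dict) -> dict:
    return {
        k: v
        for k, v in extra_body.items()
        if k not in promotable
        and k not in NVEXT_ONLY_KEYS  # the fix: keep nvext-only keys out
        and v is not None
        and k not in body
    }


pt = build_passthrough(
    extra_body={"priority": 5, "seed": 7, "top_p": 0.9},
    promotable={"top_p"},
    body={"model": "m"},
)
```

With this change, priority ends up only in nvext on both transports instead of doubling as a top-level field on one of them.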
Reviewed by Cursor Bugbot for commit c1ecc7a.
OpenAIChatCompletionsTokenClient.get_prompt_ids' prefix match between the caller-supplied prompt_messages and the trajectory step messages was asymmetric:

- prompt_messages went straight through normalize_for_comparison, which picks up vf.AssistantMessage.model_dump's exhaustive view, including thinking_blocks=None and other defaulted fields.
- step_messages went through to_native_prompt FIRST, which produces the slimmer OpenAI-format dict that omits thinking_blocks entirely.

The two normalized forms therefore never compared equal whenever the caller handed the client Pydantic vf.Message types -- the form MultiTurnEnv produces after maybe_normalize_messages -- so the prefix match always returned None and TITO silently fell back to MITO on every turn from turn 2 onward. Probe-3 and the upstream test suite both used raw dict input, so the asymmetry only showed up under real orchestrator rollouts.

Fix: drop None-valued keys in normalize_for_comparison. Both sides land on the same shape regardless of whether the input arrived as Pydantic models or as plain OpenAI dicts.

Validated end-to-end against bis-dev/5/always-continue-tito (multi-turn TITO + Dynamo bis/dynamo-rl smoke): 348 /v1/chat/completions requests, 21 SIDECAR-SKIP-TOKENIZE markers, 0 fallback warnings. The same SIDECAR token prefix appears across turns, confirming the engine reuses prior-turn ids verbatim.

The existing 6 unit tests (4 vanilla TITO + 2 dynamo_chat_nvext) all still pass; their dict-shaped input always normalized to the same shape on both sides, so the symmetric drop-None does not change them.
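The symmetric drop-None fix can be sketched as below. This is a simplification: the real normalize_for_comparison presumably also canonicalizes roles/content and handles nesting, which is omitted here.

```python
def normalize_for_comparison(message: dict) -> dict:
    # Drop None-valued keys so both input shapes normalize identically:
    # a Pydantic model_dump carries defaulted fields (thinking_blocks=None),
    # while a plain OpenAI-format dict omits them entirely.
    return {k: v for k, v in message.items() if v is not None}


# The two shapes that previously never compared equal:
pydantic_style = {"role": "assistant", "content": "hi", "thinking_blocks": None}
openai_style = {"role": "assistant", "content": "hi"}
```

After the drop-None pass, both forms collapse to the same dict, so the prefix match succeeds regardless of which shape the caller supplied.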
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 4 total unresolved issues (including 3 from previous reviews).
Reviewed by Cursor Bugbot for commit e8dc4d2.
```python
completion_logprobs = [
    float(item.get("logprob") or 0.0)
    for item in (choice.get("logprobs") or {}).get("content") or []
    if isinstance(item, dict)
]
```
Dynamo logprobs filter breaks token-logprob length alignment
Medium Severity
The dynamo_chat_nvext logprobs parsing uses if isinstance(item, dict) as a filter in the list comprehension, which silently drops non-dict entries (e.g., None). This can make completion_logprobs shorter than completion_ids, breaking the expected 1:1 alignment. The prime_vllm_generate path correctly preserves alignment by substituting 0.0 for None entries instead of filtering them out. Misaligned logprobs can corrupt training signal computation downstream.
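An alignment-preserving variant, mirroring the prime_vllm_generate behavior the review describes, substitutes 0.0 for malformed entries instead of filtering them out. The helper name `parse_completion_logprobs` is hypothetical:

```python
def parse_completion_logprobs(choice: dict) -> list[float]:
    content = (choice.get("logprobs") or {}).get("content") or []
    # Substitute 0.0 for non-dict (e.g. None) entries rather than
    # dropping them, so len(completion_logprobs) == len(completion_ids)
    # is preserved for downstream training-signal computation.
    return [
        float(item.get("logprob") or 0.0) if isinstance(item, dict) else 0.0
        for item in content
    ]


# A None entry keeps its slot instead of shrinking the list.
choice = {"logprobs": {"content": [{"logprob": -0.5}, None, {"logprob": -1.0}]}}
```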
Reviewed by Cursor Bugbot for commit e8dc4d2.


Description

Type of Change

Testing

- Ran uv run pytest locally.

Checklist

Additional Notes
Note

Medium Risk

Adds an alternate request/response wire format and local tokenization path that changes how prompts/completions are sent and parsed when renderer_transport is enabled, which could affect inference correctness and tool-call handling. Default behavior remains unchanged, but misconfiguration or backend mismatches could cause failures.

Overview

Adds a new configurable RendererTransport (prime_vllm_generate vs dynamo_chat_nvext) and threads it through ClientConfig, RendererClient, and OpenAIChatCompletionsTokenClient so deployments can target Dynamo's /chat/completions nvext.token_data token-in/token-out interface.

Updates renderers.client.completions_request to build and POST the Dynamo nvext request shape, parse completion token ids/logprobs from Dynamo-style responses, and normalize finish_reason to tool_calls when tool calls are detected client-side.

Extends TITO (OpenAIChatCompletionsTokenClient) to support the Dynamo transport by posting stitched prompt_ids to /chat/completions, performing bridge tokenization locally via a cached renderer/tokenizer (no /tokenize round-trip), and making prefix matching more robust by dropping None-valued keys during message normalization. Adds targeted tests covering the new transport and local tokenization behavior.

Reviewed by Cursor Bugbot for commit e8dc4d2.