Forward-merge release/1.7 into develop by rapids-bot[bot] · Pull Request #1983 · NVIDIA/NeMo-Agent-Toolkit

rapids-bot · 2026-05-20T23:13:21Z

Forward-merge triggered by push to release/1.7 that creates a PR to keep develop up-to-date. If this PR is unable to be immediately merged due to conflicts, it will remain open for the team to manually merge. See forward-merger docs for more info.

…ll (#1980)

Restores the wire format that `/generate/full` emits for the workflow's output, which silently regressed between 1.6 and 1.7.

**The contract** (documented by the eval client `nat.plugins.eval.runtime.remote_workflow.py:79` and pinned by `test_remote_evaluate.py`'s server fixture):

```
data: {"value": "<final answer>"}
```

**What broke.** PR #1851 (`feat: token streaming support for ReAct Agent`) added a `_stream_fn` to `react_agent` that yields `ChatResponseChunk` (OpenAI shape). Once a stream_fn is registered, `generate_streaming_response_full` takes the streaming branch and wraps each chunk in `ResponsePayloadOutput`, whose `get_stream_data()` dumps the chunk's full OpenAI envelope. There is no top-level `value` field, so the eval client's `chunk_data.get("value")` returns `None` and every eval scores 0. The producer (`react_agent`) and consumer (eval client) ship in the same NAT release and disagree on the wire shape.

`tool_calling_agent` is exposed to the same regression for the same reason: both yield `ChatResponseChunk` from their `_stream_fn`. The fix is in the shared `ResponsePayloadOutput.get_stream_data()`, so both code paths get covered.

**The fix.** `ResponsePayloadOutput.get_stream_data()` now normalizes any payload (string, primitive, `ChatResponseChunk`, `ChatResponse`, other `BaseModel`) into the canonical `data: {"value": "<str>"}\n\n` envelope. Scoped to `/generate/full`; `/v1/chat/completions` is unaffected because that path yields `ChatResponseChunk` directly through `ResponseBaseModelOutput.get_stream_data()`, never wrapped in `ResponsePayloadOutput`. WebSocket consumers do their own payload coercion in `MessageValidator.convert_data_to_message_content()` and don't call `get_stream_data()` either.

**Tests.** Parametrized unit tests pin the wire format per payload type. A new integration test in `test_remote_evaluate.py` round-trips real `ResponsePayloadOutput` lines through the real `EvaluationRemoteWorkflowHandler`, so a future change that desynchronizes producer and consumer fails CI rather than silently scoring zero on every eval.

## How to verify

The bug surface is two pure functions on NAT data models: the producer (`ResponsePayloadOutput.get_stream_data`) and the consumer (`chunk_data.get("value")` in the eval client). You can reproduce both the bug and the fix with no FastAPI server, no LLM, and no external services:

```python
import json
from nat.data_models.api_server import ResponsePayloadOutput, ChatResponseChunk

# What react_agent's _stream_fn yields after PR #1851:
chunk = ChatResponseChunk.create_streaming_chunk("21")

# What /generate/full puts on the wire:
sse_line = ResponsePayloadOutput(payload=chunk).get_stream_data()

# What nat.plugins.eval.runtime.remote_workflow extracts:
data = json.loads(sse_line[len("data: "):-2])
print("eval client extracts:", repr(data.get("value")))
```

Expected output:

| | Output |
|---|---|
| On `release/1.7` (without this PR) | `eval client extracts: None` |
| With this PR applied | `eval client extracts: '21'` |

The full `nvidia_nat_core` and `nvidia_nat_eval` test suites pass on this branch with no regressions, including the new parametrized unit tests and the producer/consumer integration test added here.

## By Submitting this PR I confirm:

- I am familiar with the [Contributing Guidelines](https://github.com/NVIDIA/NeMo-Agent-Toolkit/blob/develop/docs/source/resources/contributing/index.md).
- We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
  - Any contribution which contains commits that are not Signed-Off will not be accepted.
- When the PR is ready for review, new or existing tests cover these changes.
- When the PR is ready for review, the documentation is up to date with these changes.

## Summary by CodeRabbit

* **New Features**
  * Standardized the /generate/full SSE output to always emit responses as a consistent JSON "value" envelope for all payload types.
* **Bug Fixes**
  * Remote evaluation now correctly accumulates streamed token/value segments into the final output instead of only capturing a single chunk.
* **Tests**
  * Added unit and integration tests verifying the SSE envelope format and correct reconstruction of streamed responses.

[![Review Change Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/NVIDIA/NeMo-Agent-Toolkit/pull/1980?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

Authors:
  - Matthew Grossman (https://github.com/matthewgrossman)

Approvers:
  - Will Killian (https://github.com/willkill07)

URL: #1980
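
For readers without a NAT checkout, the producer/consumer contract described in the commit message can be sketched with the standard library alone. This is not the code from #1980: `FakeChunk`, `FakeChoice`, `FakeDelta`, `to_value_envelope`, and `accumulate_values` are hypothetical names, the chunk class is only a stand-in for an OpenAI-shaped `ChatResponseChunk`, and plain concatenation of `value` segments is an assumption about how the eval client reconstructs the final answer.

```python
import json
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class FakeDelta:
    content: Optional[str]


@dataclass
class FakeChoice:
    delta: FakeDelta


@dataclass
class FakeChunk:
    """Hypothetical stand-in for an OpenAI-shaped streaming chunk."""
    choices: List[FakeChoice]


def to_value_envelope(payload: Any) -> str:
    """Producer side: coerce a payload to text and wrap it as one SSE data line."""
    if isinstance(payload, str):
        text = payload
    elif isinstance(payload, FakeChunk):
        # Concatenate the delta text of every choice; treat empty deltas as "".
        text = "".join(c.delta.content or "" for c in payload.choices)
    else:
        # Primitives and anything else fall back to str() in this sketch.
        text = str(payload)
    return "data: " + json.dumps({"value": text}) + "\n\n"


def accumulate_values(sse_lines: List[str]) -> str:
    """Consumer side: join every `value` segment seen on the stream."""
    parts = []
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data: "):
            continue  # ignore blank lines and non-data fields
        chunk_data = json.loads(line[len("data: "):])
        value = chunk_data.get("value")
        if value is not None:
            parts.append(value)
    return "".join(parts)


# Two streamed token chunks that together spell the final answer.
stream = [
    to_value_envelope(FakeChunk([FakeChoice(FakeDelta("2"))])),
    to_value_envelope(FakeChunk([FakeChoice(FakeDelta("1"))])),
]
final_answer = accumulate_values(stream)
print(repr(final_answer))  # prints '21'
```

Because both sides agree on a single top-level `value` key, the consumer never needs to know which payload type the producer started from, which is the property the integration test in the commit message is described as pinning.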

rapids-bot · 2026-05-20T23:13:24Z

SUCCESS - forward-merge complete.

rapids-bot Bot requested a review from a team as a code owner May 20, 2026 23:13

GPUtester merged commit 561276b into develop May 20, 2026
1 check passed
