Forward-merge release/1.7 into develop #1983
Merged
…ll (#1980)

Restores the wire format that `/generate/full` emits for the workflow's output, which silently regressed between 1.6 and 1.7.

**The contract** (documented by the eval client `nat.plugins.eval.runtime.remote_workflow.py:79` and pinned by `test_remote_evaluate.py`'s server fixture):

```
data: {"value": "<final answer>"}
```

**What broke.** PR #1851 (`feat: token streaming support for ReAct Agent`) added a `_stream_fn` to `react_agent` that yields `ChatResponseChunk` (OpenAI shape). Once a stream_fn is registered, `generate_streaming_response_full` takes the streaming branch and wraps each chunk in `ResponsePayloadOutput`, whose `get_stream_data()` dumps the chunk's full OpenAI envelope. There is no top-level `value` field, so the eval client's `chunk_data.get("value")` returns `None` and every eval scores 0. The producer (`react_agent`) and consumer (eval client) ship in the same NAT release and disagree on the wire shape.

`tool_calling_agent` is exposed to the same regression for the same reason: both yield `ChatResponseChunk` from their `_stream_fn`. The fix is in the shared `ResponsePayloadOutput.get_stream_data()`, so both code paths get covered.

**The fix.** `ResponsePayloadOutput.get_stream_data()` now normalizes any payload (string, primitive, `ChatResponseChunk`, `ChatResponse`, other `BaseModel`) into the canonical `data: {"value": "<str>"}\n\n` envelope. Scoped to `/generate/full`; `/v1/chat/completions` is unaffected because that path yields `ChatResponseChunk` directly through `ResponseBaseModelOutput.get_stream_data()`, never wrapped in `ResponsePayloadOutput`. WebSocket consumers do their own payload coercion in `MessageValidator.convert_data_to_message_content()` and don't call `get_stream_data()` either.

**Tests.** Parametrized unit tests pin the wire format per payload type. A new integration test in `test_remote_evaluate.py` round-trips real `ResponsePayloadOutput` lines through the real `EvaluationRemoteWorkflowHandler`, so a future change that desynchronizes producer and consumer fails CI rather than silently scoring zero on every eval.

## How to verify

The bug surface is two pure functions on NAT data models: the producer (`ResponsePayloadOutput.get_stream_data`) and the consumer (`chunk_data.get("value")` in the eval client). You can reproduce both the bug and the fix with no FastAPI server, no LLM, and no external services:

```python
import json
from nat.data_models.api_server import ResponsePayloadOutput, ChatResponseChunk

# What react_agent's _stream_fn yields after PR #1851:
chunk = ChatResponseChunk.create_streaming_chunk("21")

# What /generate/full puts on the wire:
sse_line = ResponsePayloadOutput(payload=chunk).get_stream_data()

# What nat.plugins.eval.runtime.remote_workflow extracts:
data = json.loads(sse_line[len("data: "):-2])
print("eval client extracts:", repr(data.get("value")))
```

Expected output:

| | Output |
|---|---|
| On `release/1.7` (without this PR) | `eval client extracts: None` |
| With this PR applied | `eval client extracts: '21'` |

The full `nvidia_nat_core` and `nvidia_nat_eval` test suites pass on this branch with no regressions, including the new parametrized unit tests and the producer/consumer integration test added here.

## By Submitting this PR I confirm:

- I am familiar with the [Contributing Guidelines](https://github.com/NVIDIA/NeMo-Agent-Toolkit/blob/develop/docs/source/resources/contributing/index.md).
- We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
  - Any contribution which contains commits that are not Signed-Off will not be accepted.
- When the PR is ready for review, new or existing tests cover these changes.
- When the PR is ready for review, the documentation is up to date with these changes.

## Summary by CodeRabbit

* **New Features**
  * Standardized the /generate/full SSE output to always emit responses as a consistent JSON "value" envelope for all payload types.
* **Bug Fixes**
  * Remote evaluation now correctly accumulates streamed token/value segments into the final output instead of only capturing a single chunk.
* **Tests**
  * Added unit and integration tests verifying the SSE envelope format and correct reconstruction of streamed responses.

Authors:
- Matthew Grossman (https://github.com/matthewgrossman)

Approvers:
- Will Killian (https://github.com/willkill07)

URL: #1980
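
For readers without a NAT checkout, the producer/consumer contract described in the commit message above can be approximated with the standard library alone. This is a rough sketch, not the code from #1980: `to_value_envelope` and `collect_values` are hypothetical names, and the OpenAI-style chunk is modeled as a plain dict with `choices[0].delta.content` rather than a real `ChatResponseChunk`.

```python
import json
from typing import Any, Iterable


def to_value_envelope(payload: Any) -> str:
    """Hypothetical producer: render any payload as one SSE line with a top-level "value"."""
    if isinstance(payload, str):
        value = payload
    elif isinstance(payload, dict) and "choices" in payload:
        # Assumed OpenAI-style shape: streamed text sits under delta.content,
        # a non-streamed response under message.content.
        choice = payload["choices"][0] if payload["choices"] else {}
        part = choice.get("delta") or choice.get("message") or {}
        value = part.get("content") or ""
    elif isinstance(payload, (dict, list)):
        value = json.dumps(payload)
    else:
        # Primitives such as int, float, bool.
        value = str(payload)
    body = json.dumps({"value": value})
    return f"data: {body}\n\n"


def collect_values(sse_lines: Iterable[str]) -> str:
    """Hypothetical consumer: concatenate the "value" of every data line."""
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue
        chunk_data = json.loads(line[len("data: "):].strip())
        value = chunk_data.get("value")
        if value is not None:
            parts.append(value)
    return "".join(parts)


# Two streamed chunks whose text together forms the final answer.
chunks = [
    {"choices": [{"delta": {"content": "2"}}]},
    {"choices": [{"delta": {"content": "1"}}]},
]
wire = [to_value_envelope(c) for c in chunks]
print(wire[0], end="")
print("eval client extracts:", repr(collect_values(wire)))
```

Because every payload type funnels into the same envelope, the consumer only ever needs `chunk_data.get("value")`; concatenating the values is what lets a token-by-token stream reconstruct the full answer.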
SUCCESS - forward-merge complete.
Forward-merge triggered by push to release/1.7 that creates a PR to keep develop up-to-date. If this PR is unable to be immediately merged due to conflicts, it will remain open for the team to manually merge. See forward-merger docs for more info.