Skip to content

Conversation

@willkill07
Copy link
Member

@willkill07 willkill07 commented Jan 14, 2026

  • Introduced api_versioning.py for handling API versioning and deprecation.
  • Added VersioningConfig to fastapi_front_end_config.py for versioning settings.
  • Implemented HitlHttpHandler in hitl_http.py for Human-in-the-Loop interactions via HTTP.
  • Updated FastApiFrontEndPluginWorker to integrate versioning and HITL HTTP functionalities, including middleware for version headers and route registration for HITL endpoints.
  • Restructure FastAPI code to be more route centric
  • Add Distributed Workflow Observability

Breaking Changes

  • Removes /chat and /chat/stream outright due to it already being deprecated. Migration strategy: switch to /v1/chat/completions with "stream": (true|false) as part of request

Description

Closes

By Submitting this PR I confirm:

  • I am familiar with the Contributing Guidelines.
  • We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
    • Any contribution which contains commits that are not Signed-Off will not be accepted.
  • When the PR is ready for review, new or existing tests cover these changes.
  • When the PR is ready for review, the documentation is up to date with these changes.

Summary by CodeRabbit

  • New Features

    • v1 API: unified workflow endpoints and OpenAI v1-compatible chat; versioning header support and legacy-route aliases
    • HITL: HTTP polling + SSE interfaces for human review/response
    • Per-user monitoring, MCP tool discovery, static file store API, job-based evaluation, and websocket routing
  • Improvements

    • Observability: header propagation, trace embedding/merging, and enhanced streaming trace/usage emission
    • Robust streaming, batching, lazy tool init, and parent-child invocation metadata propagation
  • Documentation

    • Updated examples, migration guidance, and observability/HITL docs
  • Tests

    • New/updated tests covering versioning, streaming, HITL, and OpenAI v1 paths

✏️ Tip: You can customize this high-level summary in your review settings.

@willkill07 willkill07 self-assigned this Jan 14, 2026
@willkill07 willkill07 requested a review from a team as a code owner January 14, 2026 14:34
@willkill07 willkill07 added feature request New feature or request breaking Breaking change DO NOT MERGE PR should not be merged; see PR for details labels Jan 14, 2026
@coderabbitai
Copy link

coderabbitai bot commented Jan 14, 2026

Walkthrough

Adds API versioning and OpenAI v1-compatible endpoints, HTTP/SSE HITL interface, observability header propagation and trace-merge utilities, modular FastAPI route registration (workflow/auth/evaluate/monitor/static/websocket/mcp), streaming/trace-enhanced response helpers, and parent-child invocation metadata propagation.

Changes

Cohort / File(s) Summary
Front-end config & versioning
src/nat/front_ends/fastapi/fastapi_front_end_config.py, src/nat/front_ends/fastapi/api_versioning.py
New versioning, HITL, and observability config models; validation enforcing endpoint paths; VersioningOptions and header wrapper utilities; default endpoint paths moved to /v1/....
Route registration helpers
src/nat/front_ends/fastapi/routes/*
.../workflow.py, .../auth.py, .../hitl.py, .../evaluate.py, .../monitor.py, .../static_files.py, .../websocket.py, .../__init__.py
Add modular register_* helpers to wire auth, workflow (OpenAI v1 + legacy), HITL (polling + SSE), evaluate, monitor, static-file, and websocket routes; session resolution and route metadata included.
FastAPI worker & routing
src/nat/front_ends/fastapi/fastapi_front_end_plugin_worker.py
Refactor routing to use registration helpers; integrate versioning header application, HITL handler, route registration helpers, and expose EvaluationRun; new helper methods for versioned routing.
Streaming & response helpers
src/nat/front_ends/fastapi/response_helpers.py, src/nat/front_ends/fastapi/intermediate_steps_subscriber.py
Streaming generators extended to accept trace collectors, headers, include_trace/include_usage/send_done; pull_intermediate supports an on_step callback to capture/emmit intermediate steps.
Observability & trace merging
src/nat/observability/trace_merge.py, src/nat/runtime/session.py
New merge_trace_events() to convert remote trace payloads to IntermediateSteps; build/inject observability headers; parse incoming NAT trace headers into context state.
Runtime / invocation hierarchy
src/nat/runtime/runner.py, src/nat/builder/workflow.py, src/nat/runtime/session.py
Runner, Workflow, and Session/SessionManager run signatures accept parent_id and parent_name; root InvocationNode propagation updated to include parent metadata.
Remote workflow client
src/nat/eval/remote_workflow.py
Inject observability headers on outbound requests to /v1/workflow/full; merge remote _trace events into intermediate steps; improved streaming chunk handling and progress-bar safety.
MCP tool-list endpoint
packages/nvidia_nat_mcp/src/nat/plugins/mcp/server/routes.py
New register_mcp_routes() registering GET /mcp/client/tool/list with Pydantic models and aggregated per-group tool metadata.
Schema & message handling
src/nat/data_models/api_server.py, src/nat/front_ends/fastapi/message_handler.py
Rename WorkflowSchemaType enum members to WORKFLOW, WORKFLOW_STREAM, OPENAI_API_V1, OPENAI_API_V1_STREAM and update mappings/WS handling accordingly.
Observability processor
src/nat/observability/processor/batching_processor.py
Use per-batch start time for batching/flush logic; adjust flush scheduling and lifecycle handling.
Tests
tests/nat/front_ends/fastapi/*, tests/nat/server/*, tests/...
Add/update tests for version headers, config models, HITL, OpenAI v1 compatibility, streaming traces; many fixtures and tests updated to /v1/... paths.
Docs & examples
docs/..., examples/...
Update docs/examples and UI labels to /v1/...; add observability propagation docs and front-end controls; migrate many curl examples.
Miscellaneous
assorted files (autogen callback, weave exporter, test utils, examples, packages)
Lazy retriever initialization in examples, autogen step-manager retrieval helper, Weave exporter isinstance fix, wait_for_port test util, CLI help text updates, small enum/mapping and test flag renames (use_openai_v1_api).

Sequence Diagram(s)

sequenceDiagram
  participant Client as Client
  participant Frontend as FastAPI Front-End
  participant Router as Workflow Router
  participant HITL as HitlHttpHandler
  participant Runner as Runner/Workflow
  participant JobStore as JobStore/Remote

  Client->>Frontend: POST /v1/workflow (with headers)
  Frontend->>Frontend: apply_version_headers(), parse observability headers
  Frontend->>Router: dispatch to workflow handler
  alt HITL required
    Router->>HITL: enqueue InteractionPrompt (session_id)
    HITL->>Client: SSE / polling delivers prompt
    Client->>HITL: respond -> resolves future
    HITL-->>Router: human response returned
  end
  Router->>Runner: start workflow (parent_id/parent_name, injected headers)
  Runner->>JobStore: optional async submission / remote call
  JobStore-->>Runner: job result / stream chunks (may include _trace)
  Runner->>Router: return/stream responses (include_usage / include_trace)
  Router->>Frontend: emit chunks / final trace merged
  Frontend->>Client: SSE / chunked HTTP response
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.71% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title clearly and concisely summarizes the three main features introduced: FastAPI refactoring, HITL HTTP support, and distributed workflow observability, using imperative mood within the 72-character guideline.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@willkill07 willkill07 added the skip-ci Optionally Skip CI for this PR label Jan 14, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

🤖 Fix all issues with AI agents
In `@src/nat/front_ends/fastapi/fastapi_front_end_plugin_worker.py`:
- Around line 306-310: The _ensure_usage function currently assigns a default
Usage when result.usage is not the default (inverted logic); change the
condition in _ensure_usage so it only sets result.usage = Usage(prompt_tokens=0,
completion_tokens=0, total_tokens=0) when result is a ChatResponse and
result.usage is missing or equal to the default empty Usage (e.g., check
result.usage is None or result.usage == Usage()); update the condition in the
_ensure_usage staticmethod accordingly to avoid overwriting valid usage data.

In `@src/nat/front_ends/fastapi/routes/auth.py`:
- Around line 63-75: The handler currently always returns
AUTH_REDIRECT_SUCCESS_HTML even when exceptions occur; change the control flow
so that when you call flow_state.future.set_exception(...) you also return an
error response (e.g., AUTH_REDIRECT_FAILURE_HTML or a generated error HTML) with
an appropriate non-200 status (e.g., 400 or 500) and Cache-Control header,
instead of falling through to the success response; if
AUTH_REDIRECT_FAILURE_HTML doesn't exist, add a simple failure HTML constant and
use it in the exception branches so success and failure responses are distinct
and reflect the real outcome.

In `@src/nat/front_ends/fastapi/routes/evaluate.py`:
- Around line 275-277: Replace the f-string in the logger.exception call inside
the exception handler so logging uses lazy formatting: change the call in the
except block that currently uses logger.exception(f"Error evaluating item with
{request.evaluator_name}") to use logger.exception("Error evaluating item with
%s", request.evaluator_name) in the Evaluate route (the except block that
returns EvaluateItemResponse on error) so the logger defers string
interpolation.

In `@src/nat/front_ends/fastapi/routes/hitl.py`:
- Line 1: Update the SPDX copyright header year range to match the project's
convention: replace the single year "2026" with the range "2025-2026" in the
file-level header (the SPDX/ copyright comment at the top of
src/nat/front_ends/fastapi/routes/hitl.py) so it matches the other files like
workflow.py.

In `@src/nat/front_ends/fastapi/routes/workflow.py`:
- Around line 94-95: Fix the typo in the route description string used in the
workflow endpoint: change "sectonds" to "seconds" in the description literal
(the multi-line string passed as description in the route definition in
workflow.py). Ensure the concatenated string still reads correctly: "Attempt to
perform the job synchronously up until `sync_timeout` seconds, if the job hasn't
been completed by then a job_id will be returned with a status code of 202."
- Around line 416-418: The route's description string is malformed with extra
leading whitespace in the second concatenated literal; update the description
argument (the description= parameter) to be a single properly spaced string (or
properly trimmed concatenated literals) so the "or set to 'none' to suppress all
intermediate steps." clause has no leading spaces and spacing/punctuation is
consistent; replace the current split literals with one cleaned string literal
for clarity.
- Line 1: Update the copyright header line at the top of the file (the SPDX
header) to use the year range "2025-2026" instead of "2026" so it matches the
project's copyright format (e.g., change the SPDX-FileCopyrightText line to
include "2025-2026").
- Around line 452-454: The route decorator's description parameter contains two
adjacent/malformed string literals (the "Stream raw intermediate steps..." and
the "Use filter_steps..." fragments) causing the same formatting issue as the
GET endpoint; fix by replacing these fragments with a single well-formed string
(either one-line or a triple-quoted multi-line string) for the description
argument in the streaming endpoint decorator (the description parameter in the
FastAPI route in src/nat/front_ends/fastapi/routes/workflow.py) so the text
reads: "Stream raw intermediate steps without any step adaptor translations. Use
filter_steps query parameter to filter steps by type (comma-separated list) or
set to 'none' to suppress all intermediate steps."

In `@src/nat/runtime/session.py`:
- Around line 674-694: The try-except blocks that parse "x-nat-trace-id" and
update _context_state.workflow_trace_id and that append to
_context_state._active_span_id_stack currently swallow all exceptions; change
them to catch Exception as e and emit a debug-level log (including the header
key and exception details) via the module/class logger before continuing so
invalid headers don't crash but diagnostic info is preserved; update both
locations (the block handling nat_trace_id and the block handling nat_span_id)
to log the caught exception and the offending header value.
- Around line 692-694: The code unconditionally sets workflow_run_id from
request.headers.get("x-nat-workflow-run-id") which can overwrite a value
previously set from the "workflow-run-id" header; update the block that handles
nat_workflow_run so it only calls
self._context_state.workflow_run_id.set(nat_workflow_run) if workflow_run_id is
not already present (e.g., check self._context_state.workflow_run_id.get() or
equivalent) mirroring the guard used earlier when setting from
"workflow-run-id".
🧹 Nitpick comments (35)
src/nat/front_ends/fastapi/api_versioning.py (3)

1-6: SPDX header format inconsistency.

The header format differs from other files in the codebase. Most files start directly with # SPDX-FileCopyrightText: on line 1, not with a bare # comment.

Suggested fix
-#
-# SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 # SPDX-License-Identifier: Apache-2.0
 #
-# Helpers for API versioning and deprecation handling in the FastAPI front end.
-#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.

34-37: Unused is_legacy parameter.

The is_legacy parameter is accepted but never used. If this is intentional scaffolding for future deprecation warnings, prefix it with _ to suppress the linter warning; otherwise, remove it.

Suggested fix (if keeping for future use)
-def apply_version_headers(response: Response, opts: VersioningOptions, *, is_legacy: bool = False) -> None:
+def apply_version_headers(response: Response, opts: VersioningOptions, *, _is_legacy: bool = False) -> None:

60-62: Consider a more precise version path check.

The current check startswith("/v") may match unintended paths like /validate or /videos. If version prefixes follow /v{digit}/, a regex or stricter check could reduce false positives.

Suggested refinement
+import re
+
+_VERSION_PREFIX_RE = re.compile(r"^/v\d+/")
+
 def is_versioned_request(request: Request) -> bool:
     """Quick check to detect /v{n}/ prefix on the path."""
-    return request.url.path.startswith("/v")
+    return bool(_VERSION_PREFIX_RE.match(request.url.path))
src/nat/observability/trace_merge.py (1)

56-57: Redundant exception object in logger.exception call.

logger.exception() automatically captures and logs the current exception context. Passing exc explicitly is redundant.

Suggested fix
         except (ValidationError, Exception) as exc:  # broad by design to avoid breaking trace merge
-            logger.exception("Failed to merge trace event: %s", exc)
+            logger.exception("Failed to merge trace event")
src/nat/front_ends/fastapi/fastapi_front_end_config.py (1)

196-200: Consider using Pydantic's deprecated field parameter.

The openai_api_path deprecation is noted in the description text. Pydantic's Field supports a deprecated parameter that can provide better tooling integration (e.g., IDE warnings, OpenAPI schema annotations).

Suggested enhancement
         openai_api_path: str | None = Field(
             default=None,
+            deprecated="Use openai_api_v1_path instead.",
             description=("(deprecated) Path for the default workflow using the OpenAI API Specification. "
                          "If None, no workflow endpoint with the OpenAI API Specification is created."),
         )
src/nat/runtime/session.py (2)

53-77: Accessing private _active_span_id_stack directly is fragile.

Line 60 accesses the internal _active_span_id_stack ContextVar directly rather than using the active_span_id_stack property. While this avoids auto-initialization side effects, it couples this utility to internal implementation details. Consider whether the property accessor would be more appropriate or document why bypassing it is intentional.

Additionally, the span_id fallback generation at line 61 truncates a 128-bit UUID to 16 hex characters, which matches the W3C trace context span-id format (64-bit). This is correct.


570-595: Docstring format should follow Google style consistently.

The docstring at lines 577-585 uses an Args: section which is correct for Google style, but it's missing the trailing period on the method description line. Per coding guidelines, the first line of docstrings must be a concise description ending with a period.

Proposed fix
         """
-        Start a workflow run
+        Start a workflow run.

         Args:
             message: Input message for the workflow
src/nat/eval/remote_workflow.py (1)

78-79: Redundant exception object in logger.exception call.

Per the static analysis hint (TRY401), when using logger.exception(), the exception is automatically included in the log output. Passing e explicitly is redundant.

Proposed fix
                         except json.JSONDecodeError as e:
-                            logger.exception("Failed to parse generate response chunk: %s", e)
+                            logger.exception("Failed to parse generate response chunk")
                             continue
src/nat/front_ends/fastapi/routes/__init__.py (1)

1-16: Copyright year inconsistency.

Line 1 uses Copyright (c) 2026 while other files in this PR use Copyright (c) 2025-2026. For consistency and to reflect the project's history, consider updating to the range format.

Proposed fix
-# SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
src/nat/front_ends/fastapi/intermediate_steps_subscriber.py (1)

26-26: Missing type hints on new parameter.

Per coding guidelines, all public APIs require Python 3.11+ type hints on parameters. The on_step parameter lacks a type annotation.

Proposed fix
-async def pull_intermediate(_q, adapter, on_step=None):
+async def pull_intermediate(_q, adapter, on_step: Callable[[ResponseIntermediateStep], None] | None = None):

You'll also need to add the import:

from collections.abc import Callable
tests/nat/front_ends/fastapi/test_openai_compatibility.py (1)

284-286: Redundant sse.json() calls.

sse.json() is called on line 281 for validation and again on line 285 for usage detection. Consider caching the parsed JSON to avoid redundant parsing.

Proposed fix
                 if sse.data != "[DONE]":
-                    chunk = ChatResponseChunk.model_validate(sse.json())
+                    chunk_json = sse.json()
+                    chunk = ChatResponseChunk.model_validate(chunk_json)
                     assert chunk.object == "chat.completion.chunk"
                     chunks_received += 1
                     # usage summary may appear as plain JSON dict; detect by keys
-                    if isinstance(sse.json(), dict) and "prompt_tokens" in sse.json():
+                    if isinstance(chunk_json, dict) and "prompt_tokens" in chunk_json:
                         usage_chunks += 1
tests/nat/front_ends/fastapi/test_api_versioning.py (1)

23-30: LGTM! Consider adding edge case tests.

The test correctly verifies that apply_version_headers sets the X-API-Version header when api_version_header=True. The assertions are clear and the test structure follows conventions.

Consider expanding test coverage to include:

  • api_version_header=False (header should not be set)
  • is_legacy=True parameter behavior (if it affects Deprecation header)
src/nat/front_ends/fastapi/routes/auth.py (2)

49-49: Return a proper Response object for invalid state.

Returning a plain string works due to FastAPI's auto-conversion, but explicitly returning an HTMLResponse or PlainTextResponse with an appropriate status code (e.g., 400) would be more consistent with the rest of the handler.

Suggested fix
-            return "Invalid state. Please restart the authentication process."
+            return HTMLResponse(
+                content="Invalid state. Please restart the authentication process.",
+                status_code=400,
+                headers={"Content-Type": "text/html; charset=utf-8", "Cache-Control": "no-cache"}
+            )

36-44: Add return type hint to inner function.

Per coding guidelines, all public APIs require type hints. While redirect_uri is an inner function, adding a return type hint improves clarity and IDE support.

Suggested fix
-    async def redirect_uri(request: Request):
+    async def redirect_uri(request: Request) -> HTMLResponse:
packages/nvidia_nat_mcp/src/nat/plugins/mcp/server/routes.py (5)

74-75: Remove unused noqa directive and redundant exception in log.

Static analysis indicates the # noqa: BLE001 is unused (the exception catch is valid here), and including the exception object in logger.exception() is redundant since it automatically includes the exception info.

Proposed fix
-                except Exception as exc:  # noqa: BLE001
-                    logger.exception("Failed to connect to MCP server %s: %s", getattr(client, "server_name", "?"), exc)
+                except Exception:
+                    logger.exception("Failed to connect to MCP server %s", getattr(client, "server_name", "?"))

88-89: Remove unused noqa directive and redundant exception in log.

Same issue as above - the noqa directive is unused and the exception object is redundant in logger.exception().

Proposed fix
-                except Exception as exc:  # noqa: BLE001
-                    logger.exception("Failed to get accessible functions for group %s: %s", group_name, exc)
+                except Exception:
+                    logger.exception("Failed to get accessible functions for group %s", group_name)

103-104: Remove unused noqa directive and redundant exception in log.

Same pattern - clean up the unused directive and redundant exception parameter.

Proposed fix
-                except Exception as exc:  # noqa: BLE001
-                    logger.exception("Failed to process overrides for group %s: %s", group_name, exc)
+                except Exception:
+                    logger.exception("Failed to process overrides for group %s", group_name)

148-150: Use logger.exception and remove unused noqa directive.

Per coding guidelines, when catching and logging exceptions without re-raising, use logger.exception() to capture the full stack trace. Additionally, the noqa directive is flagged as unused.

Proposed fix
-        except Exception as exc:  # noqa: BLE001
-            logger.error("Error in MCP client tool list endpoint: %s", exc)
+        except Exception as exc:
+            logger.exception("Error in MCP client tool list endpoint")
             raise HTTPException(status_code=500, detail=f"Failed to retrieve MCP client information: {exc}") from exc

40-47: Consider moving Pydantic models to module level.

Defining MCPToolInfo and MCPClientToolListResponse inside the register_mcp_routes function limits reusability and makes them harder to import for testing or documentation. Moving them to module level would improve discoverability.

src/nat/front_ends/fastapi/routes/monitor.py (1)

36-42: Consider enhancing the docstring to Google-style format.

Per coding guidelines, public functions should have Google-style docstrings with Args and Returns sections. The current one-liner is functional but could be expanded.

📝 Suggested docstring improvement
 async def register_monitor_routes(worker: FastApiFrontEndPluginWorker, app: FastAPI) -> None:
-    """Add the per-user monitoring endpoint to the FastAPI app."""
+    """Add the per-user monitoring endpoint to the FastAPI app.
+
+    Args:
+        worker: The FastAPI front-end plugin worker instance.
+        app: The FastAPI application to register routes on.
+    """
     # Check if monitoring is enabled in config
src/nat/front_ends/fastapi/routes/static_files.py (4)

41-48: Add a docstring for the public function.

Per coding guidelines, public functions should have Google-style docstrings.

📝 Suggested docstring
 async def register_static_file_routes(worker: FastApiFrontEndPluginWorker, app: FastAPI,
                                       builder: WorkflowBuilder) -> None:
+    """Register static file object store endpoints on the FastAPI app.
+
+    Args:
+        worker: The FastAPI front-end plugin worker instance.
+        app: The FastAPI application to register routes on.
+        builder: The workflow builder for obtaining the object store client.
+    """

     if not worker.front_end_config.object_store:

59-70: Add type hints for endpoint functions.

Per coding guidelines, all public APIs require type hints on parameters and return values.

💡 Proposed type hints
-    async def add_static_file(file_path: str, file: UploadFile):
+    async def add_static_file(file_path: str, file: UploadFile) -> dict[str, str]:

72-80: Add type hints for consistency.

Same recommendation as add_static_file - add return type hint -> dict[str, str].


82-97: Quote filename in Content-Disposition header for safety.

Filenames containing special characters (spaces, quotes) could break the header parsing. Consider quoting the filename.

🔧 Proposed fix
-        return StreamingResponse(reader(),
-                                 media_type=file_data.content_type,
-                                 headers={"Content-Disposition": f"attachment; filename={filename}"})
+        return StreamingResponse(reader(),
+                                 media_type=file_data.content_type,
+                                 headers={"Content-Disposition": f'attachment; filename="{filename}"'})
src/nat/front_ends/fastapi/routes/websocket.py (2)

38-46: Consider expanding docstring to Google-style format.

Per coding guidelines, public functions should have Args and Returns documentation.


53-76: Simplify redundant condition in cookie header logic.

On line 74, the condition not cookie_exists and not existing_session_cookie is redundant because existing_session_cookie can only be True when cookie_exists is True. If cookie_exists is False, then existing_session_cookie is always False.

♻️ Simplified condition
         # Add new cookie header only if no cookies exist and no session cookie found
-        if not cookie_exists and not existing_session_cookie:
+        if not cookie_exists:
             headers.append((b"cookie", cookie_header.encode()))
             logger.info("WebSocket: Added new session cookie header: %s", session_id[:10] + "...")
src/nat/front_ends/fastapi/routes/workflow.py (2)

299-302: Refine exception handling for OpenAI-compatible endpoint.

The return result on Line 298 should be in an else block per TRY300. Also, catching bare Exception is broad. Consider logging the exception using logger.exception() before returning the error response, per coding guidelines for catching without re-raising.

Proposed fix
                 if stream_requested:
                     # ... streaming logic ...
                     return StreamingResponse(...)

-                result = await generate_single_response(payload, session, result_type=ChatResponse)
-                add_context_headers_to_response(response)
-                worker._ensure_usage(result)
-                return result
-            except HTTPException:
-                raise
-            except Exception as exc:  # OpenAI-compatible error shape
-                return worker._openai_error_response(exc)
+                else:
+                    result = await generate_single_response(payload, session, result_type=ChatResponse)
+                    add_context_headers_to_response(response)
+                    worker._ensure_usage(result)
+                    return result
+            except HTTPException:
+                raise
+            except Exception as exc:  # OpenAI-compatible error shape
+                logger.exception("Unexpected error in OpenAI-compatible endpoint")
+                return worker._openai_error_response(exc)

312-315: Use logger.exception() when catching without re-raising.

Per coding guidelines, when catching and logging exceptions without re-raising, use logger.exception() to capture full stack trace information.

Proposed fix
             try:
                 job_output = json.loads(job_output)
             except json.JSONDecodeError:
-                logger.error("Failed to parse job output as JSON: %s", job_output)
+                logger.exception("Failed to parse job output as JSON: %s", job_output)
                 job_output = {"error": "Output parsing failed"}
src/nat/front_ends/fastapi/routes/hitl.py (2)

55-56: Use asyncio.get_running_loop() instead of deprecated get_event_loop().

asyncio.get_event_loop() is deprecated in Python 3.10+ and may not return the running loop in all contexts. Use asyncio.get_running_loop() since this code is always called from within an async context.

Proposed fix
-            fut: asyncio.Future[HumanResponse] = asyncio.get_event_loop().create_future()
+            fut: asyncio.Future[HumanResponse] = asyncio.get_running_loop().create_future()

156-159: Add exception chaining with from None or from exc.

Per B904, when raising a new exception in an except block, use raise ... from to properly chain exceptions or explicitly suppress chaining.

Proposed fix
                 try:
                     await self._queue.respond(interaction_id, payload)
                 except KeyError:
-                    raise HTTPException(status_code=404, detail=f"Interaction {interaction_id} not found")
+                    raise HTTPException(status_code=404, detail=f"Interaction {interaction_id} not found") from None
src/nat/front_ends/fastapi/response_helpers.py (4)

40-41: Consider removing or documenting unused parameters.

send_done and include_usage are declared but unused. The comment on Line 56-57 indicates these are reserved for future extension. Consider either:

  1. Removing them until needed (YAGNI)
  2. Adding a # noqa: ARG001 comment to suppress the warning with intent documentation

Also applies to: 56-57


64-64: Mutable default argument in function signature.

Creating StepAdaptor(StepAdaptorConfig()) as a default argument instantiates new objects at function definition time, not call time. While this may work correctly since StepAdaptor appears stateless, it's a risky pattern. Consider using None with in-function defaulting.

Proposed fix
 async def generate_streaming_response(
         payload: typing.Any,
         *,
         session: Session,
         streaming: bool,
-        step_adaptor: StepAdaptor = StepAdaptor(StepAdaptorConfig()),
+        step_adaptor: StepAdaptor | None = None,
         result_type: type | None = None,
         output_type: type | None = None,
         trace_collector: list[ResponseSerializable] | None = None,
         http_headers: dict[str, str] | None = None) -> AsyncGenerator[ResponseSerializable]:
+
+    if step_adaptor is None:
+        step_adaptor = StepAdaptor(StepAdaptorConfig())

53-54: Use TypeError instead of ValueError for type mismatches.

Per TRY004, when an item has an unexpected type, TypeError is the appropriate exception.

Proposed fix
         else:
-            raise ValueError("Unexpected item type in stream. Expected ChatResponseSerializable, got: " +
+            raise TypeError("Unexpected item type in stream. Expected ChatResponseSerializable, got: " +
                              str(type(item)))

68-68: Document or remove unused http_headers parameter.

The http_headers parameter is declared but never used. If this is for future use, add a comment; otherwise, remove it.

src/nat/front_ends/fastapi/fastapi_front_end_plugin_worker.py (1)

437-437: Sort __all__ for consistency.

Per RUF022, __all__ should be sorted.

Proposed fix
-__all__ = ["FastApiFrontEndPluginWorkerBase", "FastApiFrontEndPluginWorker", "RouteInfo", "EvaluationRun"]
+__all__ = ["EvaluationRun", "FastApiFrontEndPluginWorker", "FastApiFrontEndPluginWorkerBase", "RouteInfo"]
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0b721cc and 668a85a.

📒 Files selected for processing (25)
  • docs/source/reference/rest-api/api-server-endpoints.md
  • packages/nvidia_nat_mcp/src/nat/plugins/mcp/server/routes.py
  • src/nat/builder/workflow.py
  • src/nat/eval/remote_workflow.py
  • src/nat/front_ends/fastapi/api_versioning.py
  • src/nat/front_ends/fastapi/fastapi_front_end_config.py
  • src/nat/front_ends/fastapi/fastapi_front_end_plugin_worker.py
  • src/nat/front_ends/fastapi/intermediate_steps_subscriber.py
  • src/nat/front_ends/fastapi/response_helpers.py
  • src/nat/front_ends/fastapi/routes/__init__.py
  • src/nat/front_ends/fastapi/routes/auth.py
  • src/nat/front_ends/fastapi/routes/evaluate.py
  • src/nat/front_ends/fastapi/routes/hitl.py
  • src/nat/front_ends/fastapi/routes/monitor.py
  • src/nat/front_ends/fastapi/routes/static_files.py
  • src/nat/front_ends/fastapi/routes/websocket.py
  • src/nat/front_ends/fastapi/routes/workflow.py
  • src/nat/observability/trace_merge.py
  • src/nat/runtime/runner.py
  • src/nat/runtime/session.py
  • tests/nat/front_ends/fastapi/test_api_versioning.py
  • tests/nat/front_ends/fastapi/test_fastapi_front_end_config.py
  • tests/nat/front_ends/fastapi/test_mcp_client_endpoint.py
  • tests/nat/front_ends/fastapi/test_openai_compatibility.py
  • tests/nat/front_ends/fastapi/test_response_helpers.py
🧰 Additional context used
📓 Path-based instructions (12)
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

**/*.py: Follow PEP 20 and PEP 8 for Python style guidelines
Run yapf with PEP 8 base and 'column_limit = 120' for code formatting
Use 'ruff check --fix' for linting with configuration from 'pyproject.toml', fix warnings unless explicitly ignored
Use snake_case for functions and variables, PascalCase for classes, UPPER_CASE for constants
All public APIs require Python 3.11+ type hints on parameters and return values
Prefer 'collections.abc' / 'typing' abstractions (e.g., 'Sequence' over 'list') for type hints
Use 'typing.Annotated' for units or extra metadata when useful
Treat 'pyright' warnings (configured in 'pyproject.toml') as errors during development
Preserve stack traces and prevent duplicate logging when handling exceptions; use bare 'raise' statements when re-raising, and use 'logger.error()' for logging (not 'logger.exception()') to avoid duplicate stack trace output
When catching and logging exceptions without re-raising, always use 'logger.exception()' (equivalent to 'logger.error(exc_info=True)') to capture full stack trace information
Pydantic models using 'SecretStr', 'SerializableSecretStr', or 'OptionalSecretStr' should use 'default=None' for optional fields and 'default_factory=lambda: SerializableSecretStr("")' for non-optional fields to avoid initialization bugs
Provide Google-style docstrings for every public module, class, function and CLI command
The first line of docstrings must be a concise description ending with a period
Surround code entities in docstrings with backticks to avoid Vale false-positives
Validate and sanitise all user input, especially in web or CLI interfaces
Prefer 'httpx' with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use 'async'/'await' for I/O-bound work (HTTP, DB, file reads)
Cache expensive computations with 'functools.lru_cache' or an external cache when appropriate
Leverage NumPy vectorised operations whenever beneficial and feasible

Files:

  • src/nat/front_ends/fastapi/routes/__init__.py
  • packages/nvidia_nat_mcp/src/nat/plugins/mcp/server/routes.py
  • src/nat/builder/workflow.py
  • src/nat/front_ends/fastapi/routes/auth.py
  • src/nat/front_ends/fastapi/routes/workflow.py
  • src/nat/front_ends/fastapi/routes/static_files.py
  • tests/nat/front_ends/fastapi/test_openai_compatibility.py
  • src/nat/front_ends/fastapi/routes/monitor.py
  • tests/nat/front_ends/fastapi/test_response_helpers.py
  • src/nat/eval/remote_workflow.py
  • src/nat/observability/trace_merge.py
  • src/nat/front_ends/fastapi/response_helpers.py
  • src/nat/front_ends/fastapi/routes/hitl.py
  • tests/nat/front_ends/fastapi/test_mcp_client_endpoint.py
  • src/nat/front_ends/fastapi/routes/evaluate.py
  • src/nat/runtime/session.py
  • src/nat/front_ends/fastapi/routes/websocket.py
  • tests/nat/front_ends/fastapi/test_fastapi_front_end_config.py
  • src/nat/runtime/runner.py
  • src/nat/front_ends/fastapi/intermediate_steps_subscriber.py
  • src/nat/front_ends/fastapi/fastapi_front_end_plugin_worker.py
  • src/nat/front_ends/fastapi/fastapi_front_end_config.py
  • tests/nat/front_ends/fastapi/test_api_versioning.py
  • src/nat/front_ends/fastapi/api_versioning.py
**/*.{py,yaml,yml,json,toml}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Indent with 4 spaces (never tabs) and ensure every file ends with a single newline

Files:

  • src/nat/front_ends/fastapi/routes/__init__.py
  • packages/nvidia_nat_mcp/src/nat/plugins/mcp/server/routes.py
  • src/nat/builder/workflow.py
  • src/nat/front_ends/fastapi/routes/auth.py
  • src/nat/front_ends/fastapi/routes/workflow.py
  • src/nat/front_ends/fastapi/routes/static_files.py
  • tests/nat/front_ends/fastapi/test_openai_compatibility.py
  • src/nat/front_ends/fastapi/routes/monitor.py
  • tests/nat/front_ends/fastapi/test_response_helpers.py
  • src/nat/eval/remote_workflow.py
  • src/nat/observability/trace_merge.py
  • src/nat/front_ends/fastapi/response_helpers.py
  • src/nat/front_ends/fastapi/routes/hitl.py
  • tests/nat/front_ends/fastapi/test_mcp_client_endpoint.py
  • src/nat/front_ends/fastapi/routes/evaluate.py
  • src/nat/runtime/session.py
  • src/nat/front_ends/fastapi/routes/websocket.py
  • tests/nat/front_ends/fastapi/test_fastapi_front_end_config.py
  • src/nat/runtime/runner.py
  • src/nat/front_ends/fastapi/intermediate_steps_subscriber.py
  • src/nat/front_ends/fastapi/fastapi_front_end_plugin_worker.py
  • src/nat/front_ends/fastapi/fastapi_front_end_config.py
  • tests/nat/front_ends/fastapi/test_api_versioning.py
  • src/nat/front_ends/fastapi/api_versioning.py
**/*.{py,js,ts,tsx,jsx,sh,yaml,yml,json,toml,md,mdx,rst}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

**/*.{py,js,ts,tsx,jsx,sh,yaml,yml,json,toml,md,mdx,rst}: Every file must start with the standard SPDX Apache-2.0 header
Confirm that copyright years are up-to-date whenever a file is changed
All source files must include the SPDX Apache-2.0 header template

Files:

  • src/nat/front_ends/fastapi/routes/__init__.py
  • packages/nvidia_nat_mcp/src/nat/plugins/mcp/server/routes.py
  • src/nat/builder/workflow.py
  • src/nat/front_ends/fastapi/routes/auth.py
  • src/nat/front_ends/fastapi/routes/workflow.py
  • src/nat/front_ends/fastapi/routes/static_files.py
  • tests/nat/front_ends/fastapi/test_openai_compatibility.py
  • src/nat/front_ends/fastapi/routes/monitor.py
  • tests/nat/front_ends/fastapi/test_response_helpers.py
  • src/nat/eval/remote_workflow.py
  • docs/source/reference/rest-api/api-server-endpoints.md
  • src/nat/observability/trace_merge.py
  • src/nat/front_ends/fastapi/response_helpers.py
  • src/nat/front_ends/fastapi/routes/hitl.py
  • tests/nat/front_ends/fastapi/test_mcp_client_endpoint.py
  • src/nat/front_ends/fastapi/routes/evaluate.py
  • src/nat/runtime/session.py
  • src/nat/front_ends/fastapi/routes/websocket.py
  • tests/nat/front_ends/fastapi/test_fastapi_front_end_config.py
  • src/nat/runtime/runner.py
  • src/nat/front_ends/fastapi/intermediate_steps_subscriber.py
  • src/nat/front_ends/fastapi/fastapi_front_end_plugin_worker.py
  • src/nat/front_ends/fastapi/fastapi_front_end_config.py
  • tests/nat/front_ends/fastapi/test_api_versioning.py
  • src/nat/front_ends/fastapi/api_versioning.py
**/*.{py,md,mdx,rst}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Version numbers are derived automatically by 'setuptools-scm'; never hard-code them in code or docs

Files:

  • src/nat/front_ends/fastapi/routes/__init__.py
  • packages/nvidia_nat_mcp/src/nat/plugins/mcp/server/routes.py
  • src/nat/builder/workflow.py
  • src/nat/front_ends/fastapi/routes/auth.py
  • src/nat/front_ends/fastapi/routes/workflow.py
  • src/nat/front_ends/fastapi/routes/static_files.py
  • tests/nat/front_ends/fastapi/test_openai_compatibility.py
  • src/nat/front_ends/fastapi/routes/monitor.py
  • tests/nat/front_ends/fastapi/test_response_helpers.py
  • src/nat/eval/remote_workflow.py
  • docs/source/reference/rest-api/api-server-endpoints.md
  • src/nat/observability/trace_merge.py
  • src/nat/front_ends/fastapi/response_helpers.py
  • src/nat/front_ends/fastapi/routes/hitl.py
  • tests/nat/front_ends/fastapi/test_mcp_client_endpoint.py
  • src/nat/front_ends/fastapi/routes/evaluate.py
  • src/nat/runtime/session.py
  • src/nat/front_ends/fastapi/routes/websocket.py
  • tests/nat/front_ends/fastapi/test_fastapi_front_end_config.py
  • src/nat/runtime/runner.py
  • src/nat/front_ends/fastapi/intermediate_steps_subscriber.py
  • src/nat/front_ends/fastapi/fastapi_front_end_plugin_worker.py
  • src/nat/front_ends/fastapi/fastapi_front_end_config.py
  • tests/nat/front_ends/fastapi/test_api_versioning.py
  • src/nat/front_ends/fastapi/api_versioning.py
**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions

  • Ensure the code follows best practices and coding standards. - For Python code, follow
    PEP 20 and
    PEP 8 for style guidelines.
  • Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values (except for return values of None,
    in that situation no return type hint is needed).
    Example:
    def my_function(param1: int, param2: str) -> bool:
        pass
  • For Python exception handling, ensure proper stack trace preservation:
    • When re-raising exceptions: use bare raise statements to maintain the original stack trace,
      and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.
    • When catching and logging exceptions without re-raising: always use logger.exception()
      to capture the full stack trace information.

Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any

words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be
spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

  • Documentation in Markdown files should not contain usage of a possessive 's with inanimate objects
    (ex: "the system's performance" should be "the performance of the system").
  • Documentation in Markdown files should not use NAT as an acronym, always spell out NeMo Agent Toolkit.
    The exception to this rule is when referring to package names or code identifiers that contain "nat", th...

Files:

  • src/nat/front_ends/fastapi/routes/__init__.py
  • packages/nvidia_nat_mcp/src/nat/plugins/mcp/server/routes.py
  • src/nat/builder/workflow.py
  • src/nat/front_ends/fastapi/routes/auth.py
  • src/nat/front_ends/fastapi/routes/workflow.py
  • src/nat/front_ends/fastapi/routes/static_files.py
  • tests/nat/front_ends/fastapi/test_openai_compatibility.py
  • src/nat/front_ends/fastapi/routes/monitor.py
  • tests/nat/front_ends/fastapi/test_response_helpers.py
  • src/nat/eval/remote_workflow.py
  • docs/source/reference/rest-api/api-server-endpoints.md
  • src/nat/observability/trace_merge.py
  • src/nat/front_ends/fastapi/response_helpers.py
  • src/nat/front_ends/fastapi/routes/hitl.py
  • tests/nat/front_ends/fastapi/test_mcp_client_endpoint.py
  • src/nat/front_ends/fastapi/routes/evaluate.py
  • src/nat/runtime/session.py
  • src/nat/front_ends/fastapi/routes/websocket.py
  • tests/nat/front_ends/fastapi/test_fastapi_front_end_config.py
  • src/nat/runtime/runner.py
  • src/nat/front_ends/fastapi/intermediate_steps_subscriber.py
  • src/nat/front_ends/fastapi/fastapi_front_end_plugin_worker.py
  • src/nat/front_ends/fastapi/fastapi_front_end_config.py
  • tests/nat/front_ends/fastapi/test_api_versioning.py
  • src/nat/front_ends/fastapi/api_versioning.py
src/nat/**/*

⚙️ CodeRabbit configuration file

This directory contains the core functionality of the toolkit. Changes should prioritize backward compatibility.

Files:

  • src/nat/front_ends/fastapi/routes/__init__.py
  • src/nat/builder/workflow.py
  • src/nat/front_ends/fastapi/routes/auth.py
  • src/nat/front_ends/fastapi/routes/workflow.py
  • src/nat/front_ends/fastapi/routes/static_files.py
  • src/nat/front_ends/fastapi/routes/monitor.py
  • src/nat/eval/remote_workflow.py
  • src/nat/observability/trace_merge.py
  • src/nat/front_ends/fastapi/response_helpers.py
  • src/nat/front_ends/fastapi/routes/hitl.py
  • src/nat/front_ends/fastapi/routes/evaluate.py
  • src/nat/runtime/session.py
  • src/nat/front_ends/fastapi/routes/websocket.py
  • src/nat/runtime/runner.py
  • src/nat/front_ends/fastapi/intermediate_steps_subscriber.py
  • src/nat/front_ends/fastapi/fastapi_front_end_plugin_worker.py
  • src/nat/front_ends/fastapi/fastapi_front_end_config.py
  • src/nat/front_ends/fastapi/api_versioning.py
packages/**/*

⚙️ CodeRabbit configuration file

packages/**/*: - This directory contains optional plugin packages for the toolkit, each should contain a pyproject.toml file. - The pyproject.toml file should declare a dependency on nvidia-nat or another package with a name starting
with nvidia-nat-. This dependency should be declared using ~=<version>, and the version should be a two
digit version (ex: ~=1.0).

  • Not all packages contain Python code, if they do they should also contain their own set of tests, in a
    tests/ directory at the same level as the pyproject.toml file.
  • When adding a new package, that new package name (as defined in the pyproject.toml file) should
    be added as a dependency to the nvidia-nat-all package in packages/nvidia_nat_all/pyproject.toml

Files:

  • packages/nvidia_nat_mcp/src/nat/plugins/mcp/server/routes.py
**/test_*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

**/test_*.py: Use pytest with pytest-asyncio for asynchronous code testing
Test functions should be named with 'test_' prefix using snake_case
Extract frequently repeated code in tests into pytest fixtures with the 'fixture_' prefix on the function name and a 'name' argument in the decorator
Mock external services with 'pytest_httpserver' or 'unittest.mock' instead of hitting live endpoints in tests
Mark slow tests with '@pytest.mark.slow' so they can be skipped in the default test suite
Mark integration tests requiring external services with '@pytest.mark.integration' so they can be skipped in the default test suite

Files:

  • tests/nat/front_ends/fastapi/test_openai_compatibility.py
  • tests/nat/front_ends/fastapi/test_response_helpers.py
  • tests/nat/front_ends/fastapi/test_mcp_client_endpoint.py
  • tests/nat/front_ends/fastapi/test_fastapi_front_end_config.py
  • tests/nat/front_ends/fastapi/test_api_versioning.py
tests/**/*.py

⚙️ CodeRabbit configuration file

tests/**/*.py: - Ensure that tests are comprehensive, cover edge cases, and validate the functionality of the code. - Test functions should be named using the test_ prefix, using snake_case. - Any frequently repeated code should be extracted into pytest fixtures. - Pytest fixtures should define the name argument when applying the pytest.fixture decorator. The fixture
function being decorated should be named using the fixture_ prefix, using snake_case. Example:
@pytest.fixture(name="my_fixture")
def fixture_my_fixture():
pass

  • Do NOT add @pytest.mark.asyncio to any test. Async tests are automatically detected and run by the
    async runner - the decorator is unnecessary clutter.

Files:

  • tests/nat/front_ends/fastapi/test_openai_compatibility.py
  • tests/nat/front_ends/fastapi/test_response_helpers.py
  • tests/nat/front_ends/fastapi/test_mcp_client_endpoint.py
  • tests/nat/front_ends/fastapi/test_fastapi_front_end_config.py
  • tests/nat/front_ends/fastapi/test_api_versioning.py
**/*.{md,mdx}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

**/*.{md,mdx}: Use 'NVIDIA NeMo Agent toolkit' for full name (first use), 'NeMo Agent toolkit' or 'the toolkit' for subsequent references, and 'Toolkit' (capital T) in titles/headings, 'toolkit' (lowercase t) in body text
Never use deprecated names: 'Agent Intelligence toolkit', 'aiqtoolkit', 'AgentIQ', 'AIQ', or 'aiq' in documentation; update any occurrences unless intentionally referring to deprecated versions or implementing compatibility layers

Files:

  • docs/source/reference/rest-api/api-server-endpoints.md
**/*.{md,mdx,rst}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

**/*.{md,mdx,rst}: Documentation must be clear, comprehensive, and free of TODOs, FIXMEs, placeholder text, offensive or outdated terms, and spelling mistakes
Do not use words listed in 'ci/vale/styles/config/vocabularies/nat/reject.txt' in documentation
Words listed in 'ci/vale/styles/config/vocabularies/nat/accept.txt' are acceptable even if they appear to be spelling mistakes

Files:

  • docs/source/reference/rest-api/api-server-endpoints.md
docs/source/**/*

⚙️ CodeRabbit configuration file

docs/source/**/*: This directory contains the source code for the documentation. All documentation should be written in Markdown format. Any image files should be placed in the docs/source/_static directory.

Documentation Categories

Ensure documentation is placed in the correct category:

  • get-started/: Introductory documentation for new users
    • get-started/tutorials/: Step-by-step learning guides
  • build-workflows/: Workflow creation, configuration, adding remote MCP tools or A2A agents - run-workflows/: Execution, observability, serving workflows via MCP and A2A protocols - improve-workflows/: Evaluation and optimization guides - components/: Specific component implementations (agents, tools, connectors) - extend/: Custom component development and testing (not core library contributions) - reference/: Python and REST API documentation only - resources/: Project information (licensing, FAQs)
    • resources/contributing/: Development environment and contribution guides

Placement rules:

  1. Component implementations always belong in components/, not build-workflows/ 2. API documentation belongs only in reference/ 3. Using remote MCP tools or A2A agents should be placed in build-workflows/ 4. Serving workflows via MCP/A2A should be placed in run-workflows/

Files:

  • docs/source/reference/rest-api/api-server-endpoints.md
🧠 Learnings (2)
📚 Learning: 2026-01-05T15:46:49.677Z
Learnt from: CR
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 0
File: .cursor/rules/general.mdc:0-0
Timestamp: 2026-01-05T15:46:49.677Z
Learning: Applies to **/*.py : Use 'async'/'await' for I/O-bound work (HTTP, DB, file reads)

Applied to files:

  • src/nat/front_ends/fastapi/fastapi_front_end_plugin_worker.py
📚 Learning: 2025-09-16T00:35:35.247Z
Learnt from: zhongxuanwang-nv
Repo: NVIDIA/NeMo-Agent-Toolkit PR: 794
File: src/nat/data_models/config.py:190-198
Timestamp: 2025-09-16T00:35:35.247Z
Learning: Pydantic Field supports the `deprecated` parameter which can be set to a string (deprecation message), boolean, or warnings.deprecated decorator instance. The AI suggestion claiming Field(deprecated=...) is invalid was incorrect.

Applied to files:

  • src/nat/front_ends/fastapi/fastapi_front_end_config.py
🧬 Code graph analysis (17)
packages/nvidia_nat_mcp/src/nat/plugins/mcp/server/routes.py (4)
src/nat/front_ends/fastapi/fastapi_front_end_plugin_worker.py (3)
  • FastApiFrontEndPluginWorker (213-433)
  • config (106-107)
  • _register_api_route (335-341)
packages/nvidia_nat_mcp/src/nat/plugins/mcp/client/client_base.py (1)
  • description (611-617)
packages/nvidia_nat_mcp/src/nat/plugins/mcp/client/client_config.py (1)
  • MCPClientConfig (126-137)
tests/nat/front_ends/fastapi/test_mcp_client_endpoint.py (2)
  • get_tools (41-44)
  • get_accessible_functions (60-61)
src/nat/builder/workflow.py (1)
src/nat/runtime/runner.py (1)
  • Runner (47-314)
src/nat/front_ends/fastapi/routes/auth.py (3)
src/nat/front_ends/fastapi/fastapi_front_end_plugin_worker.py (3)
  • config (106-107)
  • front_end_config (110-111)
  • _register_api_route (335-341)
src/nat/runtime/session.py (1)
  • config (272-273)
tests/nat/front_ends/auth_flow_handlers/mock_oauth2_server.py (1)
  • token_url (143-144)
src/nat/front_ends/fastapi/routes/static_files.py (2)
src/nat/object_store/models.py (1)
  • ObjectStoreItem (21-38)
src/nat/front_ends/fastapi/fastapi_front_end_plugin_worker.py (2)
  • front_end_config (110-111)
  • _register_api_route (335-341)
tests/nat/front_ends/fastapi/test_openai_compatibility.py (4)
src/nat/data_models/intermediate_step.py (1)
  • data (320-321)
src/nat/data_models/api_server.py (1)
  • ChatResponseChunk (391-479)
tests/nat/authentication/test_bearer_token_validator.py (1)
  • json (111-112)
external/dynamo/test_dynamo_integration.sh (1)
  • usage (21-38)
src/nat/front_ends/fastapi/routes/monitor.py (2)
src/nat/runtime/metrics.py (5)
  • PerUserMetricsCollector (75-180)
  • PerUserMonitorResponse (67-72)
  • PerUserResourceUsage (58-64)
  • collect_user_metrics (90-104)
  • collect_all_metrics (106-126)
src/nat/runtime/session.py (3)
  • user_id (154-155)
  • session_manager (162-163)
  • is_workflow_per_user (296-297)
tests/nat/front_ends/fastapi/test_response_helpers.py (2)
src/nat/data_models/api_server.py (1)
  • ResponseIntermediateStep (482-494)
src/nat/front_ends/fastapi/response_helpers.py (1)
  • generate_streaming_response_full_as_str (190-224)
src/nat/eval/remote_workflow.py (2)
src/nat/observability/trace_merge.py (1)
  • merge_trace_events (31-59)
src/nat/runtime/session.py (2)
  • session (475-567)
  • inject_observability_headers (80-86)
src/nat/observability/trace_merge.py (2)
src/nat/data_models/api_server.py (1)
  • ResponseIntermediateStep (482-494)
src/nat/data_models/intermediate_step.py (3)
  • IntermediateStep (262-337)
  • IntermediateStepPayload (149-259)
  • UUID (328-329)
src/nat/front_ends/fastapi/response_helpers.py (1)
src/nat/data_models/api_server.py (3)
  • ResponseSerializable (315-323)
  • ResponseIntermediateStep (482-494)
  • ResponsePayloadOutput (509-518)
tests/nat/front_ends/fastapi/test_mcp_client_endpoint.py (1)
packages/nvidia_nat_mcp/src/nat/plugins/mcp/server/routes.py (1)
  • register_mcp_routes (37-187)
src/nat/runtime/session.py (2)
src/nat/builder/context.py (5)
  • ContextState (67-120)
  • workflow_trace_id (222-226)
  • get (119-120)
  • get (339-349)
  • workflow_run_id (215-219)
src/nat/builder/workflow.py (1)
  • run (98-128)
src/nat/front_ends/fastapi/routes/websocket.py (5)
src/nat/front_ends/fastapi/auth_flow_handlers/websocket_flow_handler.py (1)
  • WebSocketAuthenticationFlowHandler (47-146)
src/nat/front_ends/fastapi/message_handler.py (1)
  • WebSocketMessageHandler (61-367)
src/nat/runtime/session.py (5)
  • session (475-567)
  • SessionManager (191-729)
  • session_manager (162-163)
  • run (166-188)
  • run (570-595)
src/nat/front_ends/fastapi/fastapi_front_end_config.py (2)
  • FastApiFrontEndConfig (179-324)
  • EndpointBase (184-207)
src/nat/front_ends/fastapi/fastapi_front_end_plugin_worker.py (6)
  • FastApiFrontEndPluginWorker (213-433)
  • get_step_adaptor (204-205)
  • get_step_adaptor (293-295)
  • _add_flow (427-429)
  • _remove_flow (431-433)
  • _register_websocket_route (343-348)
tests/nat/front_ends/fastapi/test_fastapi_front_end_config.py (1)
src/nat/front_ends/fastapi/fastapi_front_end_config.py (5)
  • HitlHttpConfig (160-168)
  • ObservabilityPropagationConfig (171-176)
  • VersioningConfig (151-157)
  • FastApiFrontEndConfig (179-324)
  • EndpointBase (184-207)
src/nat/front_ends/fastapi/fastapi_front_end_plugin_worker.py (2)
src/nat/front_ends/fastapi/api_versioning.py (3)
  • VersioningOptions (20-31)
  • apply_version_headers (34-37)
  • from_config (28-31)
src/nat/front_ends/fastapi/routes/hitl.py (4)
  • HitlHttpHandler (112-182)
  • resolve_session_id (175-182)
  • user_callback (119-131)
  • register_routes (133-172)
tests/nat/front_ends/fastapi/test_api_versioning.py (1)
src/nat/front_ends/fastapi/api_versioning.py (2)
  • VersioningOptions (20-31)
  • apply_version_headers (34-37)
src/nat/front_ends/fastapi/api_versioning.py (1)
src/nat/front_ends/fastapi/fastapi_front_end_config.py (1)
  • FastApiFrontEndConfig (179-324)
🪛 Ruff (0.14.11)
packages/nvidia_nat_mcp/src/nat/plugins/mcp/server/routes.py

66-66: Abstract raise to an inner function

(TRY301)


66-66: Avoid specifying long messages outside the exception class

(TRY003)


74-74: Unused noqa directive (unused: BLE001)

Remove unused noqa directive

(RUF100)


75-75: Redundant exception object included in logging.exception call

(TRY401)


88-88: Unused noqa directive (unused: BLE001)

Remove unused noqa directive

(RUF100)


89-89: Redundant exception object included in logging.exception call

(TRY401)


103-103: Unused noqa directive (unused: BLE001)

Remove unused noqa directive

(RUF100)


104-104: Redundant exception object included in logging.exception call

(TRY401)


148-148: Unused noqa directive (unused: BLE001)

Remove unused noqa directive

(RUF100)


149-149: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

src/nat/front_ends/fastapi/routes/auth.py

68-68: Do not catch blind exception: Exception

(BLE001)

src/nat/front_ends/fastapi/routes/workflow.py

298-298: Consider moving this statement to an else block

(TRY300)


301-301: Do not catch blind exception: Exception

(BLE001)


314-314: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


471-471: Avoid specifying long messages outside the exception class

(TRY003)


553-553: Avoid specifying long messages outside the exception class

(TRY003)

src/nat/eval/remote_workflow.py

79-79: Redundant exception object included in logging.exception call

(TRY401)

src/nat/observability/trace_merge.py

57-57: Redundant exception object included in logging.exception call

(TRY401)

src/nat/front_ends/fastapi/response_helpers.py

40-40: Unused function argument: send_done

(ARG001)


41-41: Unused function argument: include_usage

(ARG001)


53-54: Prefer TypeError exception for invalid type

(TRY004)


64-64: Do not perform function call StepAdaptor in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


64-64: Do not perform function call StepAdaptorConfig in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)


68-68: Unused function argument: http_headers

(ARG001)


197-197: Unused function argument: send_done

(ARG001)


198-198: Unused function argument: include_usage

(ARG001)


215-216: Prefer TypeError exception for invalid type

(TRY004)

src/nat/front_ends/fastapi/routes/hitl.py

64-64: Avoid specifying long messages outside the exception class

(TRY003)


126-126: Avoid specifying long messages outside the exception class

(TRY003)


159-159: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

src/nat/front_ends/fastapi/routes/evaluate.py

277-277: Use explicit conversion flag

Replace with conversion flag

(RUF010)

src/nat/runtime/session.py

588-589: Avoid specifying long messages outside the exception class

(TRY003)


679-680: try-except-pass detected, consider logging the exception

(S110)


679-679: Do not catch blind exception: Exception

(BLE001)


689-690: try-except-pass detected, consider logging the exception

(S110)


689-689: Do not catch blind exception: Exception

(BLE001)

src/nat/front_ends/fastapi/fastapi_front_end_plugin_worker.py

437-437: __all__ is not sorted

Apply an isort-style sorting to __all__

(RUF022)

src/nat/front_ends/fastapi/api_versioning.py

34-34: Unused function argument: is_legacy

(ARG001)

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

@willkill07 willkill07 marked this pull request as draft January 14, 2026 14:43
@willkill07 willkill07 changed the title feat: add API versioning; refactor FastAPI; add HITL HTTP support feat: FastAPI Refactor; HITL HTTP; Distributed Workflow Observability Jan 14, 2026
- Introduced `api_versioning.py` for handling API versioning and deprecation.
- Added `VersioningConfig` to `fastapi_front_end_config.py` for versioning settings.
- Implemented `HitlHttpHandler` in `hitl_http.py` for Human-in-the-Loop interactions via HTTP.
- Updated `FastApiFrontEndPluginWorker` to integrate versioning and HITL HTTP functionalities, including middleware for version headers and route registration for HITL endpoints.
- Restructure FastAPI code to be more route centric

This enhances the FastAPI front end with structured versioning and interactive capabilities.

Signed-off-by: Will Killian <wkillian@nvidia.com>
@willkill07 willkill07 force-pushed the wkk_fastapi-refactor branch from bfb7a78 to cd87e6b Compare January 16, 2026 21:07
Signed-off-by: Will Killian <wkillian@nvidia.com>
@willkill07 willkill07 removed skip-ci Optionally Skip CI for this PR DO NOT MERGE PR should not be merged; see PR for details labels Jan 21, 2026
@willkill07 willkill07 marked this pull request as ready for review January 21, 2026 15:30
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 20

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (11)
examples/custom_functions/plot_charts/README.md (2)

25-25: Use the full toolkit name on first use.

First mention must be “NVIDIA NeMo Agent toolkit” in Markdown docs.

As per coding guidelines.


37-37: Avoid possessive with inanimate objects.

Replace “Python's matplotlib library” with “the matplotlib library for Python”.

As per coding guidelines.

src/nat/cli/commands/finetune.py (1)

130-132: Use logger.exception() to preserve stack trace before raising ClickException.

Since you're raising a new ClickException rather than using a bare raise, the original stack trace is lost. Per coding guidelines, use logger.exception() to capture the full stack trace in logs for debugging purposes.

Proposed fix
     except Exception as e:
-        logger.error("Finetuning failed: %s", e)
+        logger.exception("Finetuning failed: %s", e)
         raise click.ClickException(str(e))
docs/source/components/auth/mcp-auth/index.md (1)

133-133: Replace "AIQ" prefix in example Jira ticket IDs.

The examples use "AIQ-" as the Jira ticket prefix (e.g., "AIQ-1935", "AIQ-2342", "AIQ-2507"). Since "AIQ" is a deprecated product name per the coding guidelines, using it in examples could cause confusion. Consider using a generic project key instead (e.g., "PROJ-1935", "ISSUE-2342", or "TICKET-2507").

Based on coding guidelines, which state: "Never use deprecated names: 'Agent Intelligence toolkit', 'aiqtoolkit', 'AgentIQ', 'AIQ', or 'aiq' in documentation."

📝 Suggested replacement examples

Line 133:

-nat run --config_file examples/MCP/simple_auth_mcp/configs/config-mcp-auth-jira.yml --input "What is Jira ticket AIQ-1935 about"
+nat run --config_file examples/MCP/simple_auth_mcp/configs/config-mcp-auth-jira.yml --input "What is Jira ticket PROJ-1935 about"

Line 179:

-What is ticket AIQ-1935 about
+What is ticket PROJ-1935 about

Line 272:

-  -d '{"messages": [{"role": "user", "content": "What is status of AIQ-2342?"}]}'
+  -d '{"messages": [{"role": "user", "content": "What is status of PROJ-2342?"}]}'

Line 280:

-  -d '{"messages": [{"role": "user", "content": "What is status of AIQ-2507?"}]}'
+  -d '{"messages": [{"role": "user", "content": "What is status of PROJ-2507?"}]}'

Also applies to: 179-179, 272-272, 280-280

examples/agents/mixture_of_agents/README.md (1)

1-21: Remove duplicate SPDX header and fix copyright years.

The file contains two SPDX Apache-2.0 headers when only one should be present. Additionally, the first header has an incorrect copyright year range (2025-2026), while the second has 2024-2026.

Per coding guidelines, each file must start with a single standard SPDX Apache-2.0 header, and copyright years should be up-to-date when files are changed.

🔧 Proposed fix

Remove the duplicate header and use the correct copyright year range:

-<!--
-SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
-SPDX-License-Identifier: Apache-2.0
-
-Licensed under the Apache License, Version 2.0 (the "License");
-you may not use this file except in compliance with the License.
-You may obtain a copy of the License at
-
-http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing, software
-distributed under the License is distributed on an "AS IS" BASIS,
-WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-See the License for the specific language governing permissions and
-limitations under the License.
--->
-
 <!--
   SPDX-FileCopyrightText: Copyright (c) 2024-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
   SPDX-License-Identifier: Apache-2.0
 -->
examples/agents/tool_calling/README.md (1)

18-21: Remove duplicate SPDX license header.

This file contains two SPDX license headers. Lines 1-16 already contain the complete Apache-2.0 header with copyright years 2025-2026. This second partial header (with different copyright years 2024-2026) is redundant and inconsistent.

Proposed fix

Remove lines 18-21:

-<!--
-  SPDX-FileCopyrightText: Copyright (c) 2024-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
-  SPDX-License-Identifier: Apache-2.0
--->
-
 # Tool Calling Agent
examples/front_ends/simple_calculator_custom_routes/README.md (1)

12-12: Minor license text inconsistency: "AS-IS" should be "AS IS".

The Apache 2.0 license standard text uses "AS IS" (two words without hyphen). This line has "AS-IS" which differs from the standard template and other files in the repository.

Proposed fix
-distributed under the License is distributed on an "AS-IS" BASIS,
+distributed under the License is distributed on an "AS IS" BASIS,
src/nat/observability/processor/batching_processor.py (1)

152-178: Time-based flush can be skipped after an early size flush.

If a size-based batch flush happens before the scheduled task wakes, the existing _flush_task can wake, see elapsed < interval, exit, and leave the new batch without any time-based flush unless another item arrives. Reschedule the flush when a new batch starts (including after overflow).

🐛 Suggested fix
-                if forced_batch:
-                    # Add current item to queue and return the forced batch
-                    self._batch_queue.append(item)
-                    self._batch_start_time = time.time()
-                    self._items_processed += 1
-                    if self._flush_task is None or self._flush_task.done():
-                        self._flush_task = asyncio.create_task(self._schedule_flush())
-                    return forced_batch
+                if forced_batch:
+                    # Add current item to queue and return the forced batch
+                    self._batch_queue.append(item)
+                    self._batch_start_time = time.time()
+                    self._items_processed += 1
+                    if self._flush_task and not self._flush_task.done():
+                        self._flush_task.cancel()
+                    self._flush_task = asyncio.create_task(self._schedule_flush())
+                    return forced_batch
@@
-            if queue_was_empty:
-                self._batch_start_time = time.time()
+            if queue_was_empty:
+                self._batch_start_time = time.time()
+                if self._flush_task and not self._flush_task.done():
+                    self._flush_task.cancel()
+                self._flush_task = asyncio.create_task(self._schedule_flush())

Also applies to: 188-210

src/nat/eval/remote_workflow.py (1)

69-78: Use key presence checks to avoid dropping falsy responses.

if (val := chunk_data.get("value")): skips valid falsy values (e.g., empty string). Use key presence to avoid losing the final response.

🐛 Suggested fix
-                            if (val := chunk_data.get("value")):
-                                final_response = val
+                            if "value" in chunk_data:
+                                final_response = chunk_data["value"]
@@
-                            if (trace := chunk_data.get("_trace")):
-                                intermediate_steps.extend(merge_trace_events(trace))
+                            if "_trace" in chunk_data:
+                                intermediate_steps.extend(merge_trace_events(chunk_data["_trace"]))
src/nat/data_models/api_server.py (1)

544-552: Typo in docstring: "Workkflow" should be "Workflow".

The docstring has a typo on line 546.

🔧 Proposed fix
 class WorkflowSchemaType(str, Enum):
     """
-    WorkflowSchemaType is an Enum that represents Workkflow response types.
+    WorkflowSchemaType is an Enum that represents Workflow response types.
     """
     WORKFLOW = "workflow"
     WORKFLOW_STREAM = "workflow_stream"
     OPENAI_API_V1 = "openai_api_v1"
     OPENAI_API_V1_STREAM = "openai_api_v1_stream"
src/nat/front_ends/fastapi/response_helpers.py (1)

33-38: Fix shared StepAdaptor default and remove unused parameters.

Instantiating StepAdaptor in the function signature creates a shared instance across all calls. Since StepAdaptor._history accumulates steps via append() and is queried to correlate events by UUID, reusing the same instance causes steps from different requests to contaminate each other. Additionally, send_done, include_usage (in generate_streaming_response_as_str), and http_headers (in generate_streaming_response) are unused and trigger ruff ARG001.

🐛 Proposed fix
async def generate_streaming_response_as_str(payload: typing.Any,
                                             *,
                                             session: Session,
                                             streaming: bool,
-                                            step_adaptor: StepAdaptor = StepAdaptor(StepAdaptorConfig()),
+                                            step_adaptor: StepAdaptor | None = None,
                                             result_type: type | None = None,
-                                            output_type: type | None = None,
-                                            send_done: bool = False,
-                                            include_usage: bool = False) -> AsyncGenerator[str]:
+                                            output_type: type | None = None) -> AsyncGenerator[str]:
+    if step_adaptor is None:
+        step_adaptor = StepAdaptor(StepAdaptorConfig())
async def generate_streaming_response(
        payload: typing.Any,
         *,
         session: Session,
         streaming: bool,
-        step_adaptor: StepAdaptor = StepAdaptor(StepAdaptorConfig()),
+        step_adaptor: StepAdaptor | None = None,
         result_type: type | None = None,
         output_type: type | None = None,
         trace_collector: list[ResponseSerializable] | None = None,
-        http_headers: dict[str, str] | None = None) -> AsyncGenerator[ResponseSerializable]:
+        ) -> AsyncGenerator[ResponseSerializable]:
+    if step_adaptor is None:
+        step_adaptor = StepAdaptor(StepAdaptorConfig())
🤖 Fix all issues with AI agents
In `@docs/source/reference/rest-api/api-server-endpoints.md`:
- Around line 20-25: Fix inconsistent list indentation in the documentation by
normalizing all bullet lists to the same top-level indentation (remove leading
spaces before the "-" for the "Workflow Interface" and "Chat Interface" items
and apply the same change to the other affected lists referenced in the
comment), ensuring each bullet starts at column 0 and nested lists (if any) use
a consistent additional indent level; update the other occurrences reported (the
lists around the "Workflow Interface" and "Chat Interface" bullets and the lists
at the other mentioned sections) to match this style so markdownlint MD005/MD007
pass.
- Around line 399-403: Update the SSE Event Example code fence to include a
language identifier (use "text") so the fenced block starts with ```text; locate
the "SSE Event Example" snippet that currently begins with a plain
triple-backtick and change it to ```text to satisfy markdownlint MD040 and
ensure the example is treated as plain text.
- Around line 31-37: The documentation body currently uses inconsistent product
capitalization; update all occurrences in this section (and nearby explanatory
lines) to use "NeMo Agent toolkit" (lowercase "t")—specifically change the
heading/body text that references the removed endpoints `/chat` and
`/chat/completions` and the note about `/v1/chat/completions` (and the config
name `openai_api_v1_path`) so the product name appears as "NeMo Agent toolkit".

In
`@examples/evaluation_and_profiling/simple_web_query_eval/src/nat_simple_web_query_eval/scripts/evaluate_single_item_simple.py`:
- Around line 39-41: Update the docstring that contains the sentence mentioning
/v1/workflow/full so the endpoint path is wrapped in backticks; locate the
module or function docstring in evaluate_single_item_simple.py that currently
lists "Sends a question to /v1/workflow/full endpoint (without intermediate
steps)" and change it to use backticks around `/v1/workflow/full` to follow
docstring formatting guidelines.

In `@packages/nvidia_nat_mcp/src/nat/plugins/mcp/server/routes.py`:
- Around line 85-88: The current computation of configured_short_names uses
name.split('.', 1)[1] which will raise IndexError if a key in
accessible_functions lacks a dot; update the logic around the
accessible_functions result (from group_instance.get_accessible_functions) to
safely derive short names by checking for a delimiter before indexing (e.g., use
str.partition or rsplit with maxsplit=1 and fall back to the whole name or skip
entries without a dot), so configured_short_names is built without risking
IndexError when processing keys from accessible_functions.

In `@src/nat/front_ends/fastapi/response_helpers.py`:
- Around line 40-41: The parameters send_done, include_usage and http_headers in
the async generator functions in response_helpers.py are never used and trigger
ruff ARG001; fix by explicitly marking them unused — either rename the
parameters to _send_done/_include_usage/_http_headers or add a single
unused-tuple assignment (e.g. _ = (send_done, include_usage, http_headers)) at
the top of the affected async generator functions so the linter is satisfied
while preserving the API.

In `@src/nat/front_ends/fastapi/routes/auth.py`:
- Around line 48-49: The handler currently returns a plain string when the OAuth
state is missing or not found (check the `state` variable and
`worker._outstanding_flows`), change this to return an HTMLResponse with an
appropriate status code (e.g., 400 Bad Request) and a short HTML body matching
the error message so the failure path mirrors the success path; ensure you
import and use `fastapi.responses.HTMLResponse` (or
`starlette.responses.HTMLResponse`) and replace the plain string return with
`HTMLResponse("Invalid state. Please restart the authentication process.",
status_code=400)`.

In `@src/nat/front_ends/fastapi/routes/evaluate.py`:
- Line 1: Update the SPDX copyright header at the top of the file to use the
year range "2025-2026" instead of a single year; locate the existing header line
that starts with "# SPDX-FileCopyrightText:" and replace the year portion so it
reads "2025-2026" while preserving the rest of the header text and formatting.

In `@src/nat/front_ends/fastapi/routes/hitl.py`:
- Around line 52-71: The enqueue/respond flow leaves stale entries in
_pending_by_session and _response_waiters when a HITL future times out; update
enqueue (where asyncio.Future is created) to attach a cleanup callback (or await
with timeout and catch asyncio.TimeoutError) that removes the prompt from
_pending_by_session[session_id] and pops the prompt.id from _response_waiters,
and ensure respond also performs the same safe-removal logic; reference the
enqueue, respond, _response_waiters, _pending_by_session, and
_notify_subscribers symbols and implement the cleanup in the same module so
timed-out futures are removed and cannot be re-surfaced in /pending.
- Around line 152-160: The except block in post_response that catches KeyError
from self._queue.respond should preserve the original exception context when
raising HTTPException; change the handler to capture the KeyError as a variable
(e.g., except KeyError as e) and re-raise HTTPException(status_code=404,
detail=...) using exception chaining (raise ... from e) so the original
traceback is retained for debugging.

In `@src/nat/front_ends/fastapi/routes/static_files.py`:
- Line 1: Update the copyright header line that currently reads "#
SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All
rights reserved." to use the year range "2025-2026" so it becomes "#
SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION &
AFFILIATES. All rights reserved."; this change should be made in
src/nat/front_ends/fastapi/routes/static_files.py by replacing the existing
header string.
- Around line 99-105: The delete_static_file function currently uses the
incoming file_path directly; call sanitize_path(file_path) first and use the
sanitized path for the object_store_client.delete_object call and any subsequent
logic to ensure consistent path normalization and security. Locate the
delete_static_file async function and replace usages of the raw file_path with
the sanitized variable (e.g., sanitized_path = sanitize_path(file_path)) before
calling object_store_client.delete_object and before raising HTTPException for
NoSuchKeyError. Ensure the rest of the function (returning
Response(status_code=204)) remains unchanged.
- Around line 83-97: Sanitize the incoming file_path in get_static_file the same
way as add_static_file and upsert_static_file by calling
sanitize_path(file_path) before using it with object_store_client.get_object to
prevent path traversal; also derive filename from the sanitized path and ensure
the Content-Disposition header safely quotes the filename (e.g., use a
quoted/escaped filename value) when returning the StreamingResponse so special
characters are handled correctly.

In `@src/nat/front_ends/fastapi/routes/websocket.py`:
- Line 1: Update the copyright header at the top of the module websocket.py to
use the year range "2025-2026" instead of "2026"; locate the SPDX/ copyright
comment block in src/nat/front_ends/fastapi/routes/websocket.py (the file-level
header) and change the year to the two-year range so it matches project
guidelines.

In `@src/nat/front_ends/fastapi/routes/workflow.py`:
- Around line 535-537: The legacy route registered at "/generate/full" has a
malformed multi-line description string with embedded newline/indentation;
update the description argument for that route to a single properly formatted
string (matching the other route) that reads: "Stream raw intermediate steps
without any step adaptor translations. Use filter_steps query parameter to
filter steps by type (comma-separated list) or set to 'none' to suppress all
intermediate steps." Locate the "/generate/full" route registration in
src/nat/front_ends/fastapi/routes/workflow.py and replace the broken description
value accordingly.
- Around line 310-314: The JSON parsing exception handler in the job output
processing currently calls logger.error and should capture the stack trace;
update the exception block that handles json.JSONDecodeError (around the
job_output parsing) to call logger.exception with a descriptive message (e.g.,
"Failed to parse job output as JSON") so the full traceback is logged, then
leave job_output fallback assignment (job_output = {"error": "Output parsing
failed"}) unchanged.

In `@src/nat/runtime/session.py`:
- Around line 674-694: The header values must be validated before storing to
avoid corrupting context: for x-nat-trace-id (handled around
self._context_state.workflow_trace_id.set), ensure the header is a valid hex
string that int(nat_trace_id, 16) succeeds and falls within the expected
bit/size range before calling set; for x-nat-span-id (manipulated via
self._context_state._active_span_id_stack.set) validate each incoming span id is
the expected format (e.g., only hex digits or digits, reasonable length) before
appending to the copied stack; for x-nat-workflow-run-id
(self._context_state.workflow_run_id.set) validate it matches the expected ID
pattern (e.g., UUID or allowed chars) before setting. If validation fails, skip
setting the context (and optionally log a debug/warn) instead of silently
catching broad exceptions.

In `@tests/nat/front_ends/fastapi/test_fastapi_front_end_plugin.py`:
- Around line 87-92: In test_generate_and_openai_stream, remove the dead/unused
assignment to values created inside the fn_use_openai_v1_api branch (the
ChatRequest(...).model_dump() assigned to values) because values is immediately
overwritten; instead only keep the intended test array assignment (values =
["a","b","c","d"]) and if needed store the ChatRequest result in a differently
named variable or move it into a separate assertion path tied to
fn_use_openai_v1_api; reference: test_generate_and_openai_stream,
fn_use_openai_v1_api, values, ChatRequest, Message.

In `@tests/nat/front_ends/fastapi/test_openai_compatibility.py`:
- Around line 170-172: The test function test_openai_v1_mode_endpoints currently
has a docstring claiming it covers both legacy and OpenAI v1 compatible modes
but only exercises the v1-compatible endpoint; update the docstring to
accurately state it verifies creation of OpenAI v1-compatible endpoints (e.g.,
"Test that endpoints are created correctly for the OpenAI v1-compatible mode")
and adjust any nearby comments or documentation in the same function to remove
references to legacy coverage so the test description matches actual behavior.

In `@tests/nat/front_ends/fastapi/test_response_helpers.py`:
- Line 1: Update the SPDX copyright header string from "Copyright (c) 2026,
NVIDIA CORPORATION & AFFILIATES. All rights reserved." to the range format
"Copyright (c) 2025-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved."
so it matches other files; edit the top-of-file SPDX line in
tests/nat/front_ends/fastapi/test_response_helpers.py (the string beginning with
"SPDX-FileCopyrightText") to use "2025-2026".
♻️ Duplicate comments (8)
src/nat/front_ends/fastapi/routes/auth.py (1)

63-75: Success HTML returned even when authentication fails.

The exception handlers (lines 63-69) set the exception on flow_state.future but then fall through to return AUTH_REDIRECT_SUCCESS_HTML. This misleads users into thinking authentication succeeded. Each exception branch should return an error HTML response with an appropriate status code.

src/nat/front_ends/fastapi/routes/evaluate.py (1)

275-277: Use lazy string formatting in logger call.

Per static analysis (RUF010) and best practices, use %s formatting for logger calls instead of f-strings to enable lazy evaluation.

🔧 Proposed fix
             except Exception as e:
-                logger.exception(f"Error evaluating item with {request.evaluator_name}")
+                logger.exception("Error evaluating item with %s", request.evaluator_name)
                 return EvaluateItemResponse(success=False, result=None, error=f"Evaluation failed: {str(e)}")
src/nat/front_ends/fastapi/routes/workflow.py (4)

1-1: Copyright header format inconsistency.

The copyright header uses "2026" only, but should use "2025-2026" for consistency with existing files.

🔧 Proposed fix
-# SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

93-94: Typo: "sectonds" should be "seconds".

🔧 Proposed fix
-                description="Attempt to perform the job synchronously up until `sync_timeout` sectonds, "
+                description="Attempt to perform the job synchronously up until `sync_timeout` seconds, "

415-417: Malformed description string with extra whitespace.

The description has inconsistent formatting from line continuation artifacts.

🔧 Proposed fix
                 description="Stream raw intermediate steps without any step adaptor translations.\n"
-                "Use filter_steps query parameter to filter steps by type (comma-separated list) or"
-                "                        set to 'none' to suppress all intermediate steps.",
+                "Use filter_steps query parameter to filter steps by type (comma-separated list) or "
+                "set to 'none' to suppress all intermediate steps.",

451-453: Same malformed description string issue.

🔧 Proposed fix
                 description="Stream raw intermediate steps without any step adaptor translations.\n"
-                "Use filter_steps query parameter to filter steps by type (comma-separated list) or "
-                "                        set to 'none' to suppress all intermediate steps.",
+                "Use filter_steps query parameter to filter steps by type (comma-separated list) or "
+                "set to 'none' to suppress all intermediate steps.",
src/nat/front_ends/fastapi/routes/hitl.py (1)

1-1: Update SPDX year range to 2025-2026 for consistency.

As per coding guidelines, copyright years should be up-to-date when files are changed.

✏️ Proposed fix
-# SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
src/nat/front_ends/fastapi/fastapi_front_end_plugin_worker.py (1)

306-310: Fix inverted _ensure_usage condition to avoid overwriting valid usage.

Current logic replaces non-empty usage. It should only default when usage is missing/empty.

🐛 Proposed fix
     def _ensure_usage(result: typing.Any) -> None:
         """Ensure ChatResponse includes usage per OpenAI spec."""
-        if isinstance(result, ChatResponse) and result.usage != Usage():
+        if isinstance(result, ChatResponse) and (result.usage is None or result.usage == Usage()):
             result.usage = Usage(prompt_tokens=0, completion_tokens=0, total_tokens=0)
🧹 Nitpick comments (14)
src/nat/cli/commands/optimize.py (1)

62-64: Nitpick: Docstring formatting.

The docstring has a leading space and doesn't end with a period. As per coding guidelines, the first line of docstrings must end with a period.

Suggested fix
 def optimizer_command(ctx, **kwargs) -> None:
-    """ Optimize workflow with the specified dataset"""
+    """Optimize workflow with the specified dataset."""
     pass
src/nat/cli/commands/finetune.py (2)

110-112: Address TODO comment for config override logic.

There's an unimplemented TODO for config override logic. This could lead to silent failures where users specify --override options that are acknowledged in logs but never applied.

Would you like me to generate the config override implementation or open an issue to track this task?


84-96: Consider adding | None to optional parameter type hints.

Several parameters with required=False or default=None in their Click options are typed without None:

  • dataset: Pathdataset: Path | None
  • endpoint: strendpoint: str | None
  • validation_dataset: Pathvalidation_dataset: Path | None
  • validation_config_file: Pathvalidation_config_file: Path | None

This would improve type safety and enable static analysis tools like pyright to catch potential None access issues.

Proposed fix
 def finetune_command(
     processors,  # pylint: disable=unused-argument
     *,
     config_file: Path,
-    dataset: Path,
+    dataset: Path | None,
     result_json_path: str,
-    endpoint: str,
+    endpoint: str | None,
     endpoint_timeout: int,
     override: tuple[tuple[str, str], ...],
-    validation_dataset: Path,
+    validation_dataset: Path | None,
     validation_interval: int,
-    validation_config_file: Path,
+    validation_config_file: Path | None,
 ):
docs/source/run-workflows/observe/observability-propagation.md (2)

35-47: Add language specifier to fenced code block.

The ASCII diagram code block lacks a language specifier. Use text or plaintext for ASCII art diagrams to satisfy markdown linting rules.

Proposed fix
-```
+```text
 ┌─────────────────────────────────────────────────────────────────┐
 │                      Parent Workflow                            │

192-194: Add language specifier to fenced code block.

The HTTP header example lacks a language specifier. Use http or text for HTTP header examples.

Proposed fix
-```
+```http
 Observability-Trace-Id: 0123456789abcdef0123456789abcdef
</details>

</blockquote></details>
<details>
<summary>src/nat/front_ends/fastapi/routes/auth.py (1)</summary><blockquote>

`68-69`: **Consider narrowing the exception catch or adding logging.**

The bare `Exception` catch is flagged by Ruff (BLE001). While it serves as a reasonable fallback, consider either narrowing to specific expected exceptions or adding `logger.exception()` to capture unexpected failure details for debugging.

</blockquote></details>
<details>
<summary>src/nat/front_ends/fastapi/api_versioning.py (2)</summary><blockquote>

`34-38`: **Unused `is_legacy` parameter.**

The `is_legacy` parameter is defined but not used, as flagged by Ruff (ARG001). This appears intended for future deprecation header support (e.g., adding a `Deprecation` header for legacy routes). Consider either implementing the planned functionality or prefixing with underscore to indicate intentional non-use.


<details>
<summary>Option 1: Implement deprecation header</summary>

```diff
 def apply_version_headers(response: Response, opts: VersioningOptions, *, is_legacy: bool = False) -> None:
     """Inject version header if enabled."""
     if opts.api_version_header:
         response.headers["X-API-Version"] = opts.version
+    if is_legacy:
+        response.headers["Deprecation"] = "true"
Option 2: Mark as intentionally unused
-def apply_version_headers(response: Response, opts: VersioningOptions, *, is_legacy: bool = False) -> None:
+def apply_version_headers(response: Response, opts: VersioningOptions, *, _is_legacy: bool = False) -> None:

40-57: Consider using functools.wraps to preserve handler metadata.

The wrapper function _wrapped doesn't preserve the original handler's __name__, __doc__, and other metadata. This can affect debugging, logging, and OpenAPI documentation generation.

Proposed fix
+import functools
+
 def wrap_with_headers(handler: Callable, opts: VersioningOptions, *, is_legacy: bool) -> Callable:
     """Wrap a route handler to add version headers."""

+    `@functools.wraps`(handler)
     async def _wrapped(*args, **kwargs):
         response: Response | None = kwargs.get("response")
src/nat/front_ends/fastapi/routes/evaluate.py (1)

71-77: TODO comment present - consider tracking for follow-up.

There's a TODO comment about finding another way to limit concurrent evaluations. This might warrant a follow-up issue to track this technical debt.

Would you like me to open a new issue to track the concurrent evaluation limiting concern?

src/nat/front_ends/fastapi/routes/workflow.py (1)

298-301: Consider narrowing the exception type or documenting the intentional broad catch.

The broad Exception catch is flagged by static analysis (BLE001). While this appears intentional for OpenAI-compatible error response handling, consider adding a comment explaining why this is necessary, or narrowing to more specific exception types if possible.

💡 Suggested documentation
                 except HTTPException:
                     raise
-                except Exception as exc:  # OpenAI-compatible error shape
+                except Exception as exc:  # Catch-all for OpenAI-compatible error shape
+                    # Intentionally broad to ensure all errors are converted to OpenAI error format
                     return worker._openai_error_response(exc)
src/nat/front_ends/fastapi/fastapi_front_end_config.py (1)

151-175: Add Google‑style docstrings for new public config models.

These public models should use Google‑style class docstrings (e.g., Attributes:) and wrap code entities in backticks to keep the doc tooling consistent. As per coding guidelines, please align these new docstrings.

✍️ Example pattern
 class VersioningConfig(BaseModel):
-    """Configuration for API versioning and legacy route handling."""
+    """Configuration for API versioning and legacy route handling.
+
+    Attributes:
+        version: Current API version for versioned routes.
+        enable_legacy_routes: Enable unversioned legacy routes when `True`.
+        api_version_header: Emit `X-API-Version` response header when `True`.
+    """
src/nat/runtime/session.py (2)

53-86: Use Google-style docstrings for the new observability helpers.

These are public helpers and should include Args/Returns sections. As per coding guidelines, please update to Google-style docstrings.

✏️ Proposed docstring update
 def build_observability_headers(context_state: ContextState) -> dict[str, str]:
-    """
-    Construct observability propagation headers (W3C + NAT) from the current context state.
-    """
+    """Construct observability propagation headers (W3C + NAT) from the current context state.
+
+    Args:
+        context_state: Context state used to build propagation headers.
+
+    Returns:
+        Mapping of header names to values.
+    """
@@
 def inject_observability_headers(headers: dict[str, str] | None = None) -> dict[str, str]:
-    """
-    Convenience helper to inject observability headers into an outbound request headers dict.
-    """
+    """Inject observability headers into an outbound request headers mapping.
+
+    Args:
+        headers: Existing headers to extend.
+
+    Returns:
+        New headers dictionary including observability propagation headers.
+    """

166-187: Add explicit type hints for message and return values in run methods.

These are public APIs and should have complete type hints. As per coding guidelines, please add parameter and return types.

✏️ Proposed type-hint additions
-    async def run(self,
-                  message,
+    async def run(self,
+                  message: typing.Any,
                   runtime_type: RuntimeTypeEnum = RuntimeTypeEnum.RUN_OR_SERVE,
                   parent_id: str | None = None,
-                  parent_name: str | None = None):
+                  parent_name: str | None = None) -> typing.AsyncIterator[typing.Any]:
@@
-    async def run(self,
-                  message,
+    async def run(self,
+                  message: typing.Any,
                   runtime_type: RuntimeTypeEnum = RuntimeTypeEnum.RUN_OR_SERVE,
                   parent_id: str | None = None,
-                  parent_name: str | None = None):
+                  parent_name: str | None = None) -> typing.AsyncIterator[typing.Any]:

Also applies to: 570-595

src/nat/front_ends/fastapi/routes/hitl.py (1)

42-183: Convert HITL public API docstrings to Google-style format.

This module introduces multiple public classes/functions; please add Args/Returns sections accordingly. As per coding guidelines, public APIs must use Google-style docstrings.

Comment on lines +39 to 41
- Sends a question to /v1/workflow/full endpoint (without intermediate steps)
- Captures the agent's response
- Evaluates using /evaluate/item endpoint
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Wrap endpoint paths in backticks in docstrings.

Please format /v1/workflow/full with backticks to match docstring guidance. As per coding guidelines, please wrap code entities in docstrings.

Proposed diff
-- Sends a question to /v1/workflow/full endpoint (without intermediate steps)
+- Sends a question to `/v1/workflow/full` endpoint (without intermediate steps)
🤖 Prompt for AI Agents
In
`@examples/evaluation_and_profiling/simple_web_query_eval/src/nat_simple_web_query_eval/scripts/evaluate_single_item_simple.py`
around lines 39 - 41, Update the docstring that contains the sentence mentioning
/v1/workflow/full so the endpoint path is wrapped in backticks; locate the
module or function docstring in evaluate_single_item_simple.py that currently
lists "Sends a question to /v1/workflow/full endpoint (without intermediate
steps)" and change it to use backticks around `/v1/workflow/full` to follow
docstring formatting guidelines.

Comment on lines +85 to +88
accessible_functions = await group_instance.get_accessible_functions(filter_fn=pass_through_filter)
configured_full_to_fn = accessible_functions
configured_short_names = {name.split('.', 1)[1] for name in accessible_functions.keys()}
except Exception as exc: # noqa: BLE001
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Guard against function names without a group prefix.

name.split(".", 1)[1] raises IndexError when a key lacks a dot, which would 500 this endpoint. A safe split avoids that runtime risk.

🐛 Suggested fix
-                    configured_short_names = {name.split('.', 1)[1] for name in accessible_functions.keys()}
+                    configured_short_names = {
+                        name.partition(".")[2] or name for name in accessible_functions.keys()
+                    }
🧰 Tools
🪛 Ruff (0.14.13)

88-88: Unused noqa directive (unused: BLE001)

Remove unused noqa directive

(RUF100)

🤖 Prompt for AI Agents
In `@packages/nvidia_nat_mcp/src/nat/plugins/mcp/server/routes.py` around lines 85
- 88, The current computation of configured_short_names uses name.split('.',
1)[1] which will raise IndexError if a key in accessible_functions lacks a dot;
update the logic around the accessible_functions result (from
group_instance.get_accessible_functions) to safely derive short names by
checking for a delimiter before indexing (e.g., use str.partition or rsplit with
maxsplit=1 and fall back to the whole name or skip entries without a dot), so
configured_short_names is built without risking IndexError when processing keys
from accessible_functions.

Comment on lines +40 to +41
send_done: bool = False,
include_usage: bool = False) -> AsyncGenerator[str]:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Unused send_done / include_usage / http_headers will trigger ruff ARG001.

These new arguments are never read; either implement their behavior or explicitly mark them unused to keep linting clean. As per coding guidelines, please address the unused-argument warnings.

✅ Minimal lint-safe placeholder
 async def generate_streaming_response_as_str(payload: typing.Any,
                                              *,
                                              session: Session,
                                              streaming: bool,
                                              step_adaptor: StepAdaptor | None = None,
                                              result_type: type | None = None,
                                              output_type: type | None = None,
                                              send_done: bool = False,
                                              include_usage: bool = False) -> AsyncGenerator[str]:
+    _ = send_done, include_usage
 
 async def generate_streaming_response(
         payload: typing.Any,
         *,
         session: Session,
         streaming: bool,
         step_adaptor: StepAdaptor | None = None,
         result_type: type | None = None,
         output_type: type | None = None,
         trace_collector: list[ResponseSerializable] | None = None,
         http_headers: dict[str, str] | None = None) -> AsyncGenerator[ResponseSerializable]:
+    _ = http_headers
 
 async def generate_streaming_response_full_as_str(payload: typing.Any,
                                                   *,
                                                   session: Session,
                                                   streaming: bool,
                                                   result_type: type | None = None,
                                                   output_type: type | None = None,
                                                   filter_steps: str | None = None,
                                                   send_done: bool = False,
                                                   include_usage: bool = False,
                                                   include_trace: bool = False) -> AsyncGenerator[str]:
+    _ = send_done, include_usage

Also applies to: 67-68, 197-198

🧰 Tools
🪛 Ruff (0.14.13)

40-40: Unused function argument: send_done

(ARG001)


41-41: Unused function argument: include_usage

(ARG001)

🤖 Prompt for AI Agents
In `@src/nat/front_ends/fastapi/response_helpers.py` around lines 40 - 41, The
parameters send_done, include_usage and http_headers in the async generator
functions in response_helpers.py are never used and trigger ruff ARG001; fix by
explicitly marking them unused — either rename the parameters to
_send_done/_include_usage/_http_headers or add a single unused-tuple assignment
(e.g. _ = (send_done, include_usage, http_headers)) at the top of the affected
async generator functions so the linter is satisfied while preserving the API.

Comment on lines +535 to +537
description="Stream raw intermediate steps without any step adaptor translations.\n"
"Use filter_steps query parameter to filter steps by type (comma-separated list) or "
" set to 'none' to suppress all intermediate steps.",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Same malformed description string issue in legacy route.

The legacy /generate/full route has the same whitespace formatting issue.

🔧 Proposed fix
             description="Stream raw intermediate steps without any step adaptor translations.\n"
-            "Use filter_steps query parameter to filter steps by type (comma-separated list) or "
-            "                        set to 'none' to suppress all intermediate steps.",
+            "Use filter_steps query parameter to filter steps by type (comma-separated list) or "
+            "set to 'none' to suppress all intermediate steps.",
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
description="Stream raw intermediate steps without any step adaptor translations.\n"
"Use filter_steps query parameter to filter steps by type (comma-separated list) or "
" set to 'none' to suppress all intermediate steps.",
description="Stream raw intermediate steps without any step adaptor translations.\n"
"Use filter_steps query parameter to filter steps by type (comma-separated list) or "
"set to 'none' to suppress all intermediate steps.",
🤖 Prompt for AI Agents
In `@src/nat/front_ends/fastapi/routes/workflow.py` around lines 535 - 537, The
legacy route registered at "/generate/full" has a malformed multi-line
description string with embedded newline/indentation; update the description
argument for that route to a single properly formatted string (matching the
other route) that reads: "Stream raw intermediate steps without any step adaptor
translations. Use filter_steps query parameter to filter steps by type
(comma-separated list) or set to 'none' to suppress all intermediate steps."
Locate the "/generate/full" route registration in
src/nat/front_ends/fastapi/routes/workflow.py and replace the broken description
value accordingly.

Comment on lines +674 to +694
# NAT observability headers (do not overwrite if already set)
nat_trace_id = request.headers.get("x-nat-trace-id")
if nat_trace_id and not self._context_state.workflow_trace_id.get():
try:
self._context_state.workflow_trace_id.set(int(nat_trace_id, 16))
except Exception:
pass

nat_span_id = request.headers.get("x-nat-span-id")
if nat_span_id:
try:
# Copy the current stack to avoid mutating a shared list from ContextVar
current = list(self._context_state._active_span_id_stack.get() or [])
current.append(nat_span_id)
self._context_state._active_span_id_stack.set(current)
except Exception:
pass

nat_workflow_run = request.headers.get("x-nat-workflow-run-id")
if nat_workflow_run:
self._context_state.workflow_run_id.set(nat_workflow_run)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Validate NAT trace/span header values before storing.

These headers are user input; invalid values can corrupt trace propagation. As per coding guidelines, please sanitize/validate before use.

🛡️ Proposed validation
         nat_trace_id = request.headers.get("x-nat-trace-id")
         if nat_trace_id and not self._context_state.workflow_trace_id.get():
             try:
-                self._context_state.workflow_trace_id.set(int(nat_trace_id, 16))
-            except Exception:
+                if len(nat_trace_id) == 32:
+                    self._context_state.workflow_trace_id.set(int(nat_trace_id, 16))
+            except ValueError:
                 pass
@@
         nat_span_id = request.headers.get("x-nat-span-id")
         if nat_span_id:
             try:
-                # Copy the current stack to avoid mutating a shared list from ContextVar
-                current = list(self._context_state._active_span_id_stack.get() or [])
-                current.append(nat_span_id)
-                self._context_state._active_span_id_stack.set(current)
-            except Exception:
+                if len(nat_span_id) == 16:
+                    int(nat_span_id, 16)
+                    # Copy the current stack to avoid mutating a shared list from ContextVar
+                    current = list(self._context_state._active_span_id_stack.get() or [])
+                    current.append(nat_span_id)
+                    self._context_state._active_span_id_stack.set(current)
+            except ValueError:
                 pass
🧰 Tools
🪛 Ruff (0.14.13)

679-680: try-except-pass detected, consider logging the exception

(S110)


679-679: Do not catch blind exception: Exception

(BLE001)


689-690: try-except-pass detected, consider logging the exception

(S110)


689-689: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
In `@src/nat/runtime/session.py` around lines 674 - 694, The header values must be
validated before storing to avoid corrupting context: for x-nat-trace-id
(handled around self._context_state.workflow_trace_id.set), ensure the header is
a valid hex string that int(nat_trace_id, 16) succeeds and falls within the
expected bit/size range before calling set; for x-nat-span-id (manipulated via
self._context_state._active_span_id_stack.set) validate each incoming span id is
the expected format (e.g., only hex digits or digits, reasonable length) before
appending to the copied stack; for x-nat-workflow-run-id
(self._context_state.workflow_run_id.set) validate it matches the expected ID
pattern (e.g., UUID or allowed chars) before setting. If validation fails, skip
setting the context (and optionally log a debug/warn) instead of silently
catching broad exceptions.

Comment on lines +87 to 92
@pytest.mark.parametrize("fn_use_openai_v1_api", [True, False])
async def test_generate_and_openai_stream(fn_use_openai_v1_api: bool):

if (fn_use_openai_api):
values = ChatRequest(messages=[Message(content="Hello", role="user")]).model_dump()
if (fn_use_openai_v1_api):
values = ChatRequest(messages=[Message(content="Hello", role="user")], stream=True).model_dump()
values = ["a", "b", "c", "d"]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Remove the dead assignment in the OpenAI v1 stream test.

values is overwritten unconditionally, so the first assignment is unused and confusing.

✏️ Proposed cleanup
-    if (fn_use_openai_v1_api):
-        values = ChatRequest(messages=[Message(content="Hello", role="user")], stream=True).model_dump()
     values = ["a", "b", "c", "d"]
🤖 Prompt for AI Agents
In `@tests/nat/front_ends/fastapi/test_fastapi_front_end_plugin.py` around lines
87 - 92, In test_generate_and_openai_stream, remove the dead/unused assignment
to values created inside the fn_use_openai_v1_api branch (the
ChatRequest(...).model_dump() assigned to values) because values is immediately
overwritten; instead only keep the intended test array assignment (values =
["a","b","c","d"]) and if needed store the ChatRequest result in a differently
named variable or move it into a separate assertion path tied to
fn_use_openai_v1_api; reference: test_generate_and_openai_stream,
fn_use_openai_v1_api, values, ChatRequest, Message.

Comment on lines +170 to 172
async def test_openai_v1_mode_endpoints():
"""Test that endpoints are created correctly for both legacy and OpenAI v1 compatible modes"""

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Update the docstring to reflect v1-only coverage.

The test now exercises only the v1-compatible endpoint; the legacy reference is misleading.

📝 Suggested tweak
-    """Test that endpoints are created correctly for both legacy and OpenAI v1 compatible modes"""
+    """Test that endpoints are created correctly for the OpenAI v1 compatible mode."""
🤖 Prompt for AI Agents
In `@tests/nat/front_ends/fastapi/test_openai_compatibility.py` around lines 170 -
172, The test function test_openai_v1_mode_endpoints currently has a docstring
claiming it covers both legacy and OpenAI v1 compatible modes but only exercises
the v1-compatible endpoint; update the docstring to accurately state it verifies
creation of OpenAI v1-compatible endpoints (e.g., "Test that endpoints are
created correctly for the OpenAI v1-compatible mode") and adjust any nearby
comments or documentation in the same function to remove references to legacy
coverage so the test description matches actual behavior.

@willkill07 willkill07 force-pushed the wkk_fastapi-refactor branch from 6688813 to 40462c3 Compare January 21, 2026 16:54
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@docs/source/run-workflows/observe/observability-propagation.md`:
- Line 20: Update the first occurrence of the product name in the sentence that
currently reads "NeMo Agent toolkit" to include the "NVIDIA" prefix so it
becomes "NVIDIA NeMo Agent toolkit"; locate the phrase in the paragraph that
begins "Observability propagation enables distributed tracing..." and replace
the first mention only (keep subsequent mentions unchanged).

In `@src/nat/front_ends/fastapi/api_versioning.py`:
- Around line 43-64: The is_legacy parameter is unused and triggers lint ARG001;
remove it from wrap_with_headers and stop passing it into apply_version_headers
(adjust the call in _wrapped to call apply_version_headers(response, opts)), and
also remove is_legacy from apply_version_headers signature if it is unused
elsewhere; update any callers that previously passed is_legacy to
wrap_with_headers/apply_version_headers so they no longer pass that argument.
- Around line 29-40: The VersioningOptions public API is missing type
annotations and Google-style docstrings; update VersioningOptions.__init__ to
annotate parameters as api_version_header: bool = True and version: str = "1"
and add a Google-style docstring for the class and __init__, and update the
classmethod from_config signature to accept cfg: FastApiFrontEndConfig and
return -> "VersioningOptions" (or VersioningOptions with a forward
reference/import), add a Google-style docstring for from_config, and ensure any
necessary imports or forward-references are added so the type hints resolve.
♻️ Duplicate comments (3)
docs/source/reference/rest-api/api-server-endpoints.md (3)

20-25: Fix list indentation to satisfy markdownlint (MD007).

Top‑level bullets are still indented, which trips markdownlint. Please align them to column 0 (same as previously reported).

Also applies to: 148-152, 208-212


33-37: Use “NeMo Agent toolkit” (lowercase “t”) in body text.

These sections still use “NeMo Agent Toolkit” in body text; please normalize to “NeMo Agent toolkit.” As per coding guidelines, ...

Also applies to: 250-257, 498-502, 555-557


399-403: Add a language identifier to the SSE example fence (MD040).

The SSE block still lacks a language tag; use text to satisfy markdownlint.

🧹 Nitpick comments (3)
docs/source/run-workflows/observe/observability-propagation.md (3)

35-47: Add language specifier to fenced code block.

The ASCII diagram should include a language specifier for proper rendering and to satisfy linting rules.

🎨 Proposed fix
-```
+```text
 ┌─────────────────────────────────────────────────────────────────┐
 │                      Parent Workflow                            │

Based on static analysis hints.


107-107: Consider using "the toolkit" for subsequent references.

Per the coding guidelines, subsequent references to the product can use "the toolkit" for improved readability.

📝 Proposed changes

Line 107:

-When using NeMo Agent Toolkit programmatically (not through HTTP), you can explicitly pass parent information to the workflow runner:
+When using the toolkit programmatically (not through HTTP), you can explicitly pass parent information to the workflow runner:

Line 126:

-- Integrating NeMo Agent Toolkit workflows into existing applications
+- Integrating toolkit workflows into existing applications

Based on coding guidelines.

Also applies to: 126-126


192-194: Add language specifier to fenced code block.

The HTTP header example should include a language specifier for proper syntax highlighting.

🎨 Proposed fix
-```
+```http
 Observability-Trace-Id: 0123456789abcdef0123456789abcdef
</details>

Based on static analysis hints.

</blockquote></details>

</blockquote></details>

<!-- This is an auto-generated comment by CodeRabbit for review status -->


# Observability Propagation

Observability propagation enables distributed tracing across multiple NeMo Agent toolkit workflows. When a parent workflow calls a child workflow (either via HTTP or programmatically), trace context flows from parent to child. After execution, trace data from child workflows can be aggregated back into the parent's trace.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

First use should include "NVIDIA" prefix.

Per the coding guidelines, the first use of the product name should be "NVIDIA NeMo Agent toolkit".

📝 Proposed fix
-Observability propagation enables distributed tracing across multiple NeMo Agent toolkit workflows. When a parent workflow calls a child workflow (either via HTTP or programmatically), trace context flows from parent to child. After execution, trace data from child workflows can be aggregated back into the parent's trace.
+Observability propagation enables distributed tracing across multiple NVIDIA NeMo Agent toolkit workflows. When a parent workflow calls a child workflow (either via HTTP or programmatically), trace context flows from parent to child. After execution, trace data from child workflows can be aggregated back into the parent's trace.

Based on coding guidelines.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
Observability propagation enables distributed tracing across multiple NeMo Agent toolkit workflows. When a parent workflow calls a child workflow (either via HTTP or programmatically), trace context flows from parent to child. After execution, trace data from child workflows can be aggregated back into the parent's trace.
Observability propagation enables distributed tracing across multiple NVIDIA NeMo Agent toolkit workflows. When a parent workflow calls a child workflow (either via HTTP or programmatically), trace context flows from parent to child. After execution, trace data from child workflows can be aggregated back into the parent's trace.
🤖 Prompt for AI Agents
In `@docs/source/run-workflows/observe/observability-propagation.md` at line 20,
Update the first occurrence of the product name in the sentence that currently
reads "NeMo Agent toolkit" to include the "NVIDIA" prefix so it becomes "NVIDIA
NeMo Agent toolkit"; locate the phrase in the paragraph that begins
"Observability propagation enables distributed tracing..." and replace the first
mention only (keep subsequent mentions unchanged).

Comment on lines +29 to +40
class VersioningOptions:
"""Options for version headers."""

def __init__(self, *, api_version_header: bool = True, version: str = "1"):
self.api_version_header = api_version_header
self.version = version

@classmethod
def from_config(cls, cfg: FastApiFrontEndConfig): # pragma: no cover - simple wiring helper
version_cfg = getattr(cfg, "versioning", None)
return cls(api_version_header=bool(getattr(version_cfg, "api_version_header", True)),
version=str(getattr(version_cfg, "version", 1)))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add missing type hints (and Google-style docstrings) for public API.

VersioningOptions.__init__ lacks parameter annotations and from_config lacks a return type, which violates the repo’s public API typing requirements. Please also ensure public module/class/function docstrings follow Google style. As per coding guidelines, ...

♻️ Proposed type-hint fix
 class VersioningOptions:
     """Options for version headers."""
 
-    def __init__(self, *, api_version_header: bool = True, version: str = "1"):
+    def __init__(self, *, api_version_header: bool = True, version: str = "1") -> None:
         self.api_version_header = api_version_header
         self.version = version
 
     `@classmethod`
-    def from_config(cls, cfg: FastApiFrontEndConfig):  # pragma: no cover - simple wiring helper
+    def from_config(cls, cfg: FastApiFrontEndConfig) -> "VersioningOptions":  # pragma: no cover - simple wiring helper
         version_cfg = getattr(cfg, "versioning", None)
         return cls(api_version_header=bool(getattr(version_cfg, "api_version_header", True)),
                    version=str(getattr(version_cfg, "version", 1)))
🤖 Prompt for AI Agents
In `@src/nat/front_ends/fastapi/api_versioning.py` around lines 29 - 40, The
VersioningOptions public API is missing type annotations and Google-style
docstrings; update VersioningOptions.__init__ to annotate parameters as
api_version_header: bool = True and version: str = "1" and add a Google-style
docstring for the class and __init__, and update the classmethod from_config
signature to accept cfg: FastApiFrontEndConfig and return -> "VersioningOptions"
(or VersioningOptions with a forward reference/import), add a Google-style
docstring for from_config, and ensure any necessary imports or
forward-references are added so the type hints resolve.

Comment on lines +43 to +64
def apply_version_headers(response: Response, opts: VersioningOptions, *, is_legacy: bool = False) -> None:
"""Inject version header if enabled."""
if opts.api_version_header:
response.headers["X-API-Version"] = opts.version


def wrap_with_headers(handler: Callable, opts: VersioningOptions, *, is_legacy: bool) -> Callable:
"""Wrap a route handler to add version headers."""

async def _wrapped(*args, **kwargs):
response: Response | None = kwargs.get("response")
# Some handlers accept Response as first positional arg; detect by type
if response is None:
for arg in args:
if isinstance(arg, Response):
response = arg
break

result = await handler(*args, **kwargs)
if response is not None:
apply_version_headers(response, opts, is_legacy=is_legacy)
return result
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Fix unused is_legacy parameter to satisfy Ruff (ARG001).

is_legacy is currently unused, which will fail linting. Either remove it (and update call sites) or use it to drive legacy/deprecation behavior. The minimal fix is to drop it here and adjust callers accordingly.

🧹 Minimal fix (remove unused parameter)
-def apply_version_headers(response: Response, opts: VersioningOptions, *, is_legacy: bool = False) -> None:
+def apply_version_headers(response: Response, opts: VersioningOptions) -> None:
     """Inject version header if enabled."""
     if opts.api_version_header:
         response.headers["X-API-Version"] = opts.version
 
 
-def wrap_with_headers(handler: Callable, opts: VersioningOptions, *, is_legacy: bool) -> Callable:
+def wrap_with_headers(handler: Callable, opts: VersioningOptions) -> Callable:
     """Wrap a route handler to add version headers."""
 
     async def _wrapped(*args, **kwargs):
         response: Response | None = kwargs.get("response")
@@
-        if response is not None:
-            apply_version_headers(response, opts, is_legacy=is_legacy)
+        if response is not None:
+            apply_version_headers(response, opts)
         return result
🧰 Tools
🪛 Ruff (0.14.13)

43-43: Unused function argument: is_legacy

(ARG001)

🤖 Prompt for AI Agents
In `@src/nat/front_ends/fastapi/api_versioning.py` around lines 43 - 64, The
is_legacy parameter is unused and triggers lint ARG001; remove it from
wrap_with_headers and stop passing it into apply_version_headers (adjust the
call in _wrapped to call apply_version_headers(response, opts)), and also remove
is_legacy from apply_version_headers signature if it is unused elsewhere; update
any callers that previously passed is_legacy to
wrap_with_headers/apply_version_headers so they no longer pass that argument.

@willkill07 willkill07 force-pushed the wkk_fastapi-refactor branch from 40462c3 to 5884408 Compare January 21, 2026 17:27
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Fix all issues with AI agents
In `@docs/source/reference/rest-api/api-server-endpoints.md`:
- Around line 20-25: Replace the first in-body occurrence of "NeMo Agent
toolkit" with the full product name "NVIDIA NeMo Agent toolkit" (e.g., in the
sentence that currently reads "There are currently four workflow transactions
that can be initiated using HTTP or WebSocket when the NeMo Agent toolkit server
is running: ...") so the initial mention uses the full name while subsequent
mentions may remain shortened.

In `@docs/source/run-workflows/observe/observability-propagation.md`:
- Around line 192-194: The fenced code example containing the header line
"Observability-Trace-Id: 0123456789abcdef0123456789abcdef" lacks a language
identifier; update the opening triple-backtick fence to include a language tag
(for example use "text") so the block reads ```text and ensures compliance with
MD040 and proper syntax highlighting for the Observability-Trace-Id example.
- Around line 35-47: The fenced diagram in observability-propagation.md is
missing a language identifier (triggering MD040); update the opening code fence
for the ASCII art diagram (the triple-backtick block containing the Parent
Workflow/Child Workflow diagram) to include a language identifier like "text"
(i.e., change the opening ``` to ```text) and leave the diagram contents
unchanged.

In `@src/nat/front_ends/fastapi/fastapi_front_end_plugin_worker.py`:
- Around line 306-310: The _ensure_usage function currently overwrites valid
usage because the condition is inverted; change the check in _ensure_usage so it
only assigns a default Usage() when result.usage is missing/empty (i.e.,
result.usage is None or result.usage == Usage()), not when it already contains
real data; update the isinstance(result, ChatResponse) check to combine with
(result.usage is None or result.usage == Usage()) before setting result.usage =
Usage(prompt_tokens=0, completion_tokens=0, total_tokens=0).
- Around line 335-348: The route registration logic in _register_api_route and
_register_websocket_route skips registering any route when the incoming path is
already versioned and enable_legacy_routes is False; update both functions
(referencing _register_api_route, _register_websocket_route and _versioned_path)
so the versioned path is always registered (call
app.add_api_route/app.add_websocket_route for versioned) and only register the
legacy/original path when front_end_config.versioning.enable_legacy_routes is
True (and avoid double-registering the same path). Ensure at least one app.add_*
call happens: always register the computed versioned path, and conditionally
register the original path when legacy routes are enabled.
♻️ Duplicate comments (6)
docs/source/run-workflows/observe/observability-propagation.md (1)

18-21: Use full product name on first mention and avoid inanimate possessive.
This first use should include “NVIDIA”, and “parent’s trace” should be rephrased. As per coding guidelines.

✏️ Proposed fix
-Observability propagation enables distributed tracing across multiple NeMo Agent toolkit workflows. When a parent workflow calls a child workflow (either via HTTP or programmatically), trace context flows from parent to child. After execution, trace data from child workflows can be aggregated back into the parent's trace.
+Observability propagation enables distributed tracing across multiple NVIDIA NeMo Agent toolkit workflows. When a parent workflow calls a child workflow (either via HTTP or programmatically), trace context flows from parent to child. After execution, trace data from child workflows can be aggregated back into the trace of the parent workflow.
src/nat/front_ends/fastapi/api_versioning.py (2)

29-41: Add missing type hints + Google-style docstrings for public APIs.
Public APIs here still lack required type hints and Google-style docstrings. As per coding guidelines.

♻️ Proposed fix
+"""API versioning utilities for the FastAPI front end."""
+
 class VersioningOptions:
-    """Options for version headers."""
+    """Options for version headers.
+
+    Args:
+        api_version_header: Whether to emit the `X-API-Version` header.
+        version: API version string to emit.
+    """
 
-    def __init__(self, *, api_version_header: bool = True, version: str = "1"):
+    def __init__(self, *, api_version_header: bool = True, version: str = "1") -> None:
+        """Initialize version header options.
+
+        Args:
+            api_version_header: Whether to emit the version header.
+            version: API version string.
+        """
         self.api_version_header = api_version_header
         self.version = version
 
     `@classmethod`
-    def from_config(cls, cfg: FastApiFrontEndConfig):  # pragma: no cover - simple wiring helper
+    def from_config(cls, cfg: FastApiFrontEndConfig) -> "VersioningOptions":  # pragma: no cover - simple wiring helper
+        """Build options from a FastAPI front-end config.
+
+        Args:
+            cfg: FastAPI front-end configuration.
+
+        Returns:
+            Versioning options instance.
+        """
         version_cfg = getattr(cfg, "versioning", None)
         return cls(api_version_header=bool(getattr(version_cfg, "api_version_header", True)),
                    version=str(getattr(version_cfg, "version", 1)))
 ...
-def is_versioned_request(request: Request) -> bool:
-    """Quick check to detect /v{n}/ prefix on the path."""
+def is_versioned_request(request: Request) -> bool:
+    """Check whether the request path starts with a version prefix."""
     return request.url.path.startswith("/v")

Also applies to: 69-71


43-64: Remove unused is_legacy parameter (Ruff ARG001).
This unused parameter will fail linting; either remove it or use it for behavior. As per coding guidelines.

🧹 Minimal fix (remove unused parameter)
-def apply_version_headers(response: Response, opts: VersioningOptions, *, is_legacy: bool = False) -> None:
+def apply_version_headers(response: Response, opts: VersioningOptions) -> None:
     """Inject version header if enabled."""
     if opts.api_version_header:
         response.headers["X-API-Version"] = opts.version
 
 
-def wrap_with_headers(handler: Callable, opts: VersioningOptions, *, is_legacy: bool) -> Callable:
+def wrap_with_headers(handler: Callable, opts: VersioningOptions) -> Callable:
     """Wrap a route handler to add version headers."""
@@
-        if response is not None:
-            apply_version_headers(response, opts, is_legacy=is_legacy)
+        if response is not None:
+            apply_version_headers(response, opts)
         return result

If you remove the parameter, please verify call sites:

#!/bin/bash
# Locate call sites that pass is_legacy.
rg -n "wrap_with_headers\(|apply_version_headers\(" src/nat/front_ends/fastapi
docs/source/reference/rest-api/api-server-endpoints.md (3)

31-36: Use “NeMo Agent toolkit” in body text (lowercase “t”).
As per coding guidelines.

✏️ Suggested tweak
-The endpoints `/chat` and `/chat/completions` have been **removed** in NeMo Agent Toolkit v1.5 and later.
+The endpoints `/chat` and `/chat/completions` have been **removed** in NeMo Agent toolkit v1.5 and later.

20-25: Normalize list indentation to satisfy markdownlint (MD005/MD007).
Apply consistent top-level indentation for these lists.

✏️ Example fix (apply similarly to other lists)
-  - **Workflow Interface:** Uses the transaction schema defined by your workflow. The interface documentation is accessible
+ - **Workflow Interface:** Uses the transaction schema defined by your workflow. The interface documentation is accessible
...
-  - **Chat Interface:** [OpenAI API Documentation] provides
+ - **Chat Interface:** [OpenAI API Documentation] provides

Also applies to: 149-152, 208-212


399-403: Add a language identifier to the SSE example fence (MD040).

✏️ Suggested tweak
-  ```
+  ```text
  data: {"id": "interaction-uuid-5678", "prompt": "Approve data export?", "options": ["approve", "reject"], "metadata": {}}
 
</details>

</blockquote></details>

</blockquote></details>

<details>
<summary>🧹 Nitpick comments (1)</summary><blockquote>

<details>
<summary>src/nat/front_ends/fastapi/fastapi_front_end_plugin_worker.py (1)</summary><blockquote>

`437-437`: **Consider sorting `__all__` for consistency.**

Ruff flagged that `__all__` is not sorted (RUF022). Alphabetical ordering improves readability and reduces merge conflicts.



<details>
<summary>Suggested fix</summary>

```diff
-__all__ = ["FastApiFrontEndPluginWorkerBase", "FastApiFrontEndPluginWorker", "RouteInfo", "EvaluationRun"]
+__all__ = ["EvaluationRun", "FastApiFrontEndPluginWorker", "FastApiFrontEndPluginWorkerBase", "RouteInfo"]

Comment on lines +20 to +25
There are currently four workflow transactions that can be initiated using HTTP or WebSocket when the NeMo Agent toolkit server is running: `workflow non-streaming`, `workflow async`, `workflow streaming`, and `workflow streaming full`. The following are types of interfaces you can use to interact with your running workflows.
- **Workflow Interface:** Uses the transaction schema defined by your workflow. The interface documentation is accessible
using Swagger while the server is running [`http://localhost:8000/docs`](http://localhost:8000/docs).
- **Chat Interface:** [OpenAI API Documentation](https://platform.openai.com/docs/guides/text?api-mode=chat) provides
details on chat formats compatible with the NeMo Agent toolkit server.
details on chat formats compatible with the NeMo Agent toolkit server. Use the OpenAI-compatible endpoint at
`/v1/chat/completions`, which supports streaming and non-streaming requests based on the `stream` parameter.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Use the full product name on first mention.
The first mention in body text should be “NVIDIA NeMo Agent toolkit.” As per coding guidelines.

✏️ Suggested tweak
-There are currently four workflow transactions that can be initiated using HTTP or WebSocket when the NeMo Agent toolkit server is running: `workflow non-streaming`, `workflow async`, `workflow streaming`, and `workflow streaming full`. The following are types of interfaces you can use to interact with your running workflows.
+There are currently four workflow transactions that can be initiated using HTTP or WebSocket when the NVIDIA NeMo Agent toolkit server is running: `workflow non-streaming`, `workflow async`, `workflow streaming`, and `workflow streaming full`. The following are types of interfaces you can use to interact with your running workflows.
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)

21-21: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)


23-23: Unordered list indentation
Expected: 0; Actual: 2

(MD007, ul-indent)

🤖 Prompt for AI Agents
In `@docs/source/reference/rest-api/api-server-endpoints.md` around lines 20 - 25,
Replace the first in-body occurrence of "NeMo Agent toolkit" with the full
product name "NVIDIA NeMo Agent toolkit" (e.g., in the sentence that currently
reads "There are currently four workflow transactions that can be initiated
using HTTP or WebSocket when the NeMo Agent toolkit server is running: ...") so
the initial mention uses the full name while subsequent mentions may remain
shortened.

Comment on lines +35 to +47
```
┌─────────────────────────────────────────────────────────────────┐
│ Parent Workflow │
│ ┌───────────────┐ ┌───────────────────┐ │
│ │ Start Request │──── HTTP Headers ───▶│ Child Workflow │ │
│ │ │ (traceparent, │ │ │
│ │ │ X-NAT-Trace-ID) │ │ │
│ │ │ │ │ │
│ │ │◀── Response with ────│ │ │
│ │ Merge Steps │ _trace payload │ │ │
│ └───────────────┘ └───────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
```
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add a language identifier to the diagram fence (MD040).

✏️ Suggested tweak
-```
+```text
┌─────────────────────────────────────────────────────────────────┐
│                      Parent Workflow                            │
│  ┌───────────────┐                      ┌───────────────────┐   │
│  │ Start Request │──── HTTP Headers ───▶│  Child Workflow   │   │
│  │               │    (traceparent,     │                   │   │
│  │               │    X-NAT-Trace-ID)   │                   │   │
│  │               │                      │                   │   │
│  │               │◀── Response with ────│                   │   │
│  │  Merge Steps  │    _trace payload    │                   │   │
│  └───────────────┘                      └───────────────────┘   │
└─────────────────────────────────────────────────────────────────┘
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)

35-35: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
In `@docs/source/run-workflows/observe/observability-propagation.md` around lines
35 - 47, The fenced diagram in observability-propagation.md is missing a
language identifier (triggering MD040); update the opening code fence for the
ASCII art diagram (the triple-backtick block containing the Parent
Workflow/Child Workflow diagram) to include a language identifier like "text"
(i.e., change the opening ``` to ```text) and leave the diagram contents
unchanged.

Comment on lines +192 to +194
```
Observability-Trace-Id: 0123456789abcdef0123456789abcdef
```
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add a language identifier to the header example fence (MD040).

✏️ Suggested tweak
-```
+```text
Observability-Trace-Id: 0123456789abcdef0123456789abcdef
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)

192-192: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
In `@docs/source/run-workflows/observe/observability-propagation.md` around lines
192 - 194, The fenced code example containing the header line
"Observability-Trace-Id: 0123456789abcdef0123456789abcdef" lacks a language
identifier; update the opening triple-backtick fence to include a language tag
(for example use "text") so the block reads ```text and ensures compliance with
MD040 and proper syntax highlighting for the Observability-Trace-Id example.

Comment on lines +306 to +310
@staticmethod
def _ensure_usage(result: typing.Any) -> None:
"""Ensure ChatResponse includes usage per OpenAI spec."""
if isinstance(result, ChatResponse) and result.usage != Usage():
result.usage = Usage(prompt_tokens=0, completion_tokens=0, total_tokens=0)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Logic bug: condition is inverted and will overwrite valid usage data.

The condition result.usage != Usage() evaluates to True when result.usage differs from the default empty Usage(). This causes the method to overwrite valid usage data (e.g., actual token counts) with zeros, which is the opposite of the intended behavior.

Proposed fix
 `@staticmethod`
 def _ensure_usage(result: typing.Any) -> None:
     """Ensure ChatResponse includes usage per OpenAI spec."""
-    if isinstance(result, ChatResponse) and result.usage != Usage():
+    if isinstance(result, ChatResponse) and result.usage is None:
         result.usage = Usage(prompt_tokens=0, completion_tokens=0, total_tokens=0)

If you also need to handle the case where result.usage is an empty Usage() instance that should be replaced:

if isinstance(result, ChatResponse) and (result.usage is None or result.usage == Usage()):
    result.usage = Usage(prompt_tokens=0, completion_tokens=0, total_tokens=0)
🤖 Prompt for AI Agents
In `@src/nat/front_ends/fastapi/fastapi_front_end_plugin_worker.py` around lines
306 - 310, The _ensure_usage function currently overwrites valid usage because
the condition is inverted; change the check in _ensure_usage so it only assigns
a default Usage() when result.usage is missing/empty (i.e., result.usage is None
or result.usage == Usage()), not when it already contains real data; update the
isinstance(result, ChatResponse) check to combine with (result.usage is None or
result.usage == Usage()) before setting result.usage = Usage(prompt_tokens=0,
completion_tokens=0, total_tokens=0).

Comment on lines +335 to +348
def _register_api_route(self, app: FastAPI, *, path: str, **kwargs) -> None:
"""Register route on legacy + versioned paths depending on config."""
if self.front_end_config.versioning.enable_legacy_routes:
app.add_api_route(path=path, **kwargs)
versioned = self._versioned_path(path)
if versioned != path:
app.add_api_route(path=versioned, **kwargs)

def _register_websocket_route(self, app: FastAPI, *, path: str, endpoint) -> None:
if self.front_end_config.versioning.enable_legacy_routes:
app.add_websocket_route(path, endpoint)
versioned = self._versioned_path(path)
if versioned != path:
app.add_websocket_route(versioned, endpoint)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for usages of _register_api_route and _register_websocket_route
echo "=== Usages of _register_api_route ==="
rg -n '_register_api_route' --type=py -B2 -A2

echo ""
echo "=== Usages of _register_websocket_route ==="
rg -n '_register_websocket_route' --type=py -B2 -A2

Repository: NVIDIA/NeMo-Agent-Toolkit

Length of output: 7829


🏁 Script executed:

#!/bin/bash
# Find the file to understand the context better
echo "=== Checking the full file structure and imports ==="
head -60 src/nat/front_ends/fastapi/fastapi_front_end_plugin_worker.py | cat -n

echo ""
echo "=== Check the register_workflow_route method ==="
rg -A 30 'def register_workflow_route' src/nat/front_ends/fastapi/fastapi_front_end_plugin_worker.py

Repository: NVIDIA/NeMo-Agent-Toolkit

Length of output: 3087


🏁 Script executed:

#!/bin/bash
# Check how paths are constructed/passed to _register_api_route
echo "=== Search for direct app.add_api_route calls (not through helper) ==="
rg -n 'app\.add_api_route|app\.add_websocket_route' --type=py -B2 -A2

echo ""
echo "=== Check config/versioning setup ==="
rg -n '_versioned_path|versioning.enable_legacy' --type=py -B1 -A1 | head -100

Repository: NVIDIA/NeMo-Agent-Toolkit

Length of output: 4740


🏁 Script executed:

#!/bin/bash
# Check the _versioned_path implementation to understand path handling
echo "=== Full _versioned_path method ==="
sed -n '327,334p' src/nat/front_ends/fastapi/fastapi_front_end_plugin_worker.py

echo ""
echo "=== Check config defaults for path values ==="
rg -n 'class.*Config|path.*=' src/nat/front_ends/fastapi/fastapi_front_end_config.py | head -50

echo ""
echo "=== Check if evaluate.path has default value ==="
rg -A2 'evaluate.*:' src/nat/front_ends/fastapi/fastapi_front_end_config.py | head -20

Repository: NVIDIA/NeMo-Agent-Toolkit

Length of output: 1770


🏁 Script executed:

#!/bin/bash
# Check workflow routes more carefully to see if they're already versioned
echo "=== Check workflow route defaults ==="
rg -n 'workflow.*path|/v1/|/v2/' src/nat/front_ends/fastapi/routes/workflow.py | head -30

echo ""
echo "=== Check the evaluate config defaults ==="
rg -B2 -A5 'class EvaluateRouteConfig' src/nat/front_ends/fastapi/fastapi_front_end_config.py

Repository: NVIDIA/NeMo-Agent-Toolkit

Length of output: 150


🏁 Script executed:

#!/bin/bash
# Verify the actual behavior with a concrete example
echo "=== Check the version number ==="
rg -A2 'class VersioningConfig' src/nat/front_ends/fastapi/fastapi_front_end_config.py

echo ""
echo "=== Double-check workflow endpoint defaults ==="
rg -B5 'workflow.*=' src/nat/front_ends/fastapi/fastapi_front_end_config.py | grep -A5 'path='

echo ""
echo "=== Verify the full route registration flow in workflow.py ==="
sed -n '120,150p' src/nat/front_ends/fastapi/routes/workflow.py

Repository: NVIDIA/NeMo-Agent-Toolkit

Length of output: 1467


🏁 Script executed:

#!/bin/bash
# Test the logic: if path is already /v1/workflow, what does _versioned_path return?
python3 << 'EOF'
def _versioned_path(path: str, version: int = 1) -> str:
    """Prefix a path with /v{n} if not already versioned."""
    prefix = f"/v{version}"
    normalized = path if path.startswith("/") else f"/{path}"
    if normalized.startswith(prefix):
        return normalized
    return f"{prefix}{normalized}"

# Test cases
test_paths = [
    "/workflow",
    "/v1/workflow", 
    "/v2/workflow",
    "workflow",
    "/v1/evaluate",
    "/monitor/users"
]

print("Testing _versioned_path logic with version=1:")
for path in test_paths:
    result = _versioned_path(path)
    equals = result == path
    print(f"  Input: {path:20} → Output: {result:20} | versioned==path: {equals}")
    
print("\nKey insight: When path is already /v1/x:")
print("  - versioned == path is TRUE")
print("  - So line 340-341 condition (if versioned != path) is SKIPPED")
print("  - If enable_legacy_routes=False, line 337-338 is also SKIPPED")
print("  - Result: NO ROUTE REGISTERED!")
EOF

Repository: NVIDIA/NeMo-Agent-Toolkit

Length of output: 155


🏁 Script executed:

#!/bin/bash
# Get the version number from VersioningConfig
echo "=== VersioningConfig full definition ==="
rg -A10 'class VersioningConfig' src/nat/front_ends/fastapi/fastapi_front_end_config.py

echo ""
echo "=== Check workflow endpoint config ==="
rg -B2 -A5 'workflow.*EndpointBase' src/nat/front_ends/fastapi/fastapi_front_end_config.py

Repository: NVIDIA/NeMo-Agent-Toolkit

Length of output: 1120


Routes fail to register when enable_legacy_routes=False with versioned default paths.

The configuration defaults use versioned paths (/v1/workflow, /v1/evaluate, etc.). When enable_legacy_routes=False, these routes do not register because:

  • _versioned_path("/v1/workflow") returns "/v1/workflow" unchanged (path already starts with /v1)
  • Line 337–338 is skipped due to enable_legacy_routes=False
  • Line 340–341 is skipped because versioned == path evaluates to True

Result: no route is registered. The conditional logic assumes input paths are unversioned, but the defaults are already versioned. At minimum, ensure the versioned path is always registered when it differs from the input, or revise the logic to guarantee at least one path is always registered.

🤖 Prompt for AI Agents
In `@src/nat/front_ends/fastapi/fastapi_front_end_plugin_worker.py` around lines
335 - 348, The route registration logic in _register_api_route and
_register_websocket_route skips registering any route when the incoming path is
already versioned and enable_legacy_routes is False; update both functions
(referencing _register_api_route, _register_websocket_route and _versioned_path)
so the versioned path is always registered (call
app.add_api_route/app.add_websocket_route for versioned) and only register the
legacy/original path when front_end_config.versioning.enable_legacy_routes is
True (and avoid double-registering the same path). Ensure at least one app.add_*
call happens: always register the computed versioned path, and conditionally
register the original path when legacy routes are enabled.

Signed-off-by: Will Killian <wkillian@nvidia.com>
@willkill07 willkill07 force-pushed the wkk_fastapi-refactor branch from 5884408 to 4e5eec0 Compare January 21, 2026 18:54
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
tests/nat/front_ends/fastapi/test_evaluate_endpoints.py (2)

118-118: Inconsistent endpoint path - missing /v1 prefix.

Line 118 uses /evaluate/job/{job_id} while other tests in this file use /v1/evaluate/... paths. This inconsistency may cause test failures if legacy routes are disabled.

✏️ Suggested fix
-    status_response = test_client.get(f"/evaluate/job/{job_id}")
+    status_response = test_client.get(f"/v1/evaluate/job/{job_id}")

229-230: Missing comma causes unintended string concatenation.

There's a missing comma between "/abolute/path" and "../relative", causing Python's implicit string concatenation. The test will receive /abolute/path../relative as a single value instead of two separate test cases. Additionally, "abolute" appears to be a typo for "absolute".

✏️ Suggested fix
-@pytest.mark.parametrize("job_id", ["test/job/123", "..", ".", "/abolute/path"
-                                    "../relative", "/"])
+@pytest.mark.parametrize("job_id", ["test/job/123", "..", ".", "/absolute/path",
+                                    "../relative", "/"])
🤖 Fix all issues with AI agents
In `@docs/source/run-workflows/observe/observability-propagation.md`:
- Line 62: Change the body text in the table cell that currently reads "Stable
NeMo Agent Toolkit workflow run identifier (UUID string)" to use lowercase
"toolkit" (i.e., "Stable NeMo Agent toolkit workflow run identifier (UUID
string)"); update the string associated with the `X-NAT-Workflow-Run-ID` table
entry so the word "Toolkit" is lowercase to comply with the coding guideline.
- Line 60: The table cell currently uses "NeMo Agent Toolkit" with a capital
"Toolkit"; update the body text to use lowercase "toolkit" so the cell reads
"NeMo Agent toolkit" (search for the string `X-NAT-Trace-ID` or `NeMo Agent
Toolkit` in the document and change only the word "Toolkit" to "toolkit" while
keeping the rest of the text unchanged).
- Line 107: Replace the capitalized word "Toolkit" with lowercase "toolkit" in
the sentence containing "NeMo Agent Toolkit programmatically (not through HTTP),
you can explicitly pass parent information to the workflow runner:" so the body
text reads "NeMo Agent toolkit" to conform to the guideline that "Toolkit" is
only capitalized in titles/headings; update the phrase wherever the exact string
"NeMo Agent Toolkit" appears in body text in this document.
♻️ Duplicate comments (10)
docs/source/run-workflows/observe/observability-propagation.md (3)

20-20: First use should include "NVIDIA" prefix.

Per the coding guidelines, the first use of the product name should be "NVIDIA NeMo Agent toolkit".

📝 Proposed fix
-Observability propagation enables distributed tracing across multiple NeMo Agent toolkit workflows. When a parent workflow calls a child workflow (either via HTTP or programmatically), trace context flows from parent to child. After execution, trace data from child workflows can be aggregated back into the parent's trace.
+Observability propagation enables distributed tracing across multiple NVIDIA NeMo Agent toolkit workflows. When a parent workflow calls a child workflow (either via HTTP or programmatically), trace context flows from parent to child. After execution, trace data from child workflows can be aggregated back into the parent's trace.

Based on coding guidelines.


35-47: Add a language identifier to the diagram fence (MD040).

Fenced code blocks should have a language specified for proper rendering.

📝 Proposed fix
-```
+```text
 ┌─────────────────────────────────────────────────────────────────┐
 │                      Parent Workflow                            │
 │  ┌───────────────┐                      ┌───────────────────┐   │
 │  │ Start Request │──── HTTP Headers ───▶│  Child Workflow   │   │
 │  │               │    (traceparent,     │                   │   │
 │  │               │    X-NAT-Trace-ID)   │                   │   │
 │  │               │                      │                   │   │
 │  │               │◀── Response with ────│                   │   │
 │  │  Merge Steps  │    _trace payload    │                   │   │
 │  └───────────────┘                      └───────────────────┘   │
 └─────────────────────────────────────────────────────────────────┘
</details>

---

`192-194`: **Add a language identifier to the header example fence (MD040).**

Fenced code blocks should have a language specified for proper rendering.




<details>
<summary>📝 Proposed fix</summary>

```diff
-```
+```text
 Observability-Trace-Id: 0123456789abcdef0123456789abcdef
</details>

</blockquote></details>
<details>
<summary>src/nat/front_ends/fastapi/api_versioning.py (2)</summary><blockquote>

`29-40`: **Add missing type hints for public API compliance.**

`VersioningOptions.__init__` lacks a return type annotation and `from_config` lacks a return type hint, which violates the repo's public API typing requirements. Based on learnings requiring Python 3.11+ type hints on all public APIs.


<details>
<summary>♻️ Proposed fix</summary>

```diff
 class VersioningOptions:
     """Options for version headers."""

-    def __init__(self, *, api_version_header: bool = True, version: str = "1"):
+    def __init__(self, *, api_version_header: bool = True, version: str = "1") -> None:
         self.api_version_header = api_version_header
         self.version = version

     `@classmethod`
-    def from_config(cls, cfg: FastApiFrontEndConfig):  # pragma: no cover - simple wiring helper
+    def from_config(cls, cfg: FastApiFrontEndConfig) -> "VersioningOptions":  # pragma: no cover - simple wiring helper
         version_cfg = getattr(cfg, "versioning", None)
         return cls(api_version_header=bool(getattr(version_cfg, "api_version_header", True)),
                    version=str(getattr(version_cfg, "version", 1)))

43-46: Unused is_legacy parameter will fail linting (ARG001).

The is_legacy parameter is declared but never used. Either remove it or implement the intended deprecation behavior (e.g., adding a deprecation warning header for legacy routes).

🧹 Option 1: Remove unused parameter
-def apply_version_headers(response: Response, opts: VersioningOptions, *, is_legacy: bool = False) -> None:
+def apply_version_headers(response: Response, opts: VersioningOptions) -> None:
     """Inject version header if enabled."""
     if opts.api_version_header:
         response.headers["X-API-Version"] = opts.version
💡 Option 2: Use the parameter for deprecation warnings
 def apply_version_headers(response: Response, opts: VersioningOptions, *, is_legacy: bool = False) -> None:
     """Inject version header if enabled."""
     if opts.api_version_header:
         response.headers["X-API-Version"] = opts.version
+    if is_legacy:
+        response.headers["Deprecation"] = "true"
+        response.headers["X-Deprecated-Message"] = "This endpoint path is deprecated. Use /v1/... paths instead."
docs/source/reference/rest-api/api-server-endpoints.md (3)

20-20: Use full product name on first mention.

The first mention in body text should be "NVIDIA NeMo Agent toolkit" per coding guidelines.

✏️ Suggested fix
-There are currently four workflow transactions that can be initiated using HTTP or WebSocket when the NeMo Agent toolkit server is running: `workflow non-streaming`, `workflow async`, `workflow streaming`, and `workflow streaming full`. The following are types of interfaces you can use to interact with your running workflows.
+There are currently four workflow transactions that can be initiated using HTTP or WebSocket when the NVIDIA NeMo Agent toolkit server is running: `workflow non-streaming`, `workflow async`, `workflow streaming`, and `workflow streaming full`. The following are types of interfaces you can use to interact with your running workflows.

33-33: Use lowercase "toolkit" in body text.

Per coding guidelines, body text should use "NeMo Agent toolkit" (lowercase "t").

✏️ Suggested fix
-The endpoints `/chat` and `/chat/completions` have been **removed** in NeMo Agent Toolkit v1.5 and later.
+The endpoints `/chat` and `/chat/completions` have been **removed** in NeMo Agent toolkit v1.5 and later.

400-403: Add language identifier to fenced code block (MD040).

The SSE example code block should have a language specifier for consistency with markdownlint rules.

✏️ Suggested fix
-- **SSE Event Example:**
-  ```
+- **SSE Event Example:**
+  ```text
   data: {"id": "interaction-uuid-5678", "prompt": "Approve data export?", "options": ["approve", "reject"], "metadata": {}}

   ```
src/nat/front_ends/fastapi/fastapi_front_end_plugin_worker.py (2)

306-310: Logic bug: condition is inverted and will overwrite valid usage data.

The condition result.usage != Usage() evaluates to True when result.usage differs from the default empty Usage(). This causes the method to overwrite valid usage data (e.g., actual token counts) with zeros, which is the opposite of the intended behavior.

🐛 Proposed fix
 `@staticmethod`
 def _ensure_usage(result: typing.Any) -> None:
     """Ensure ChatResponse includes usage per OpenAI spec."""
-    if isinstance(result, ChatResponse) and result.usage != Usage():
+    if isinstance(result, ChatResponse) and result.usage is None:
         result.usage = Usage(prompt_tokens=0, completion_tokens=0, total_tokens=0)

335-348: Routes fail to register when enable_legacy_routes=False with default versioned paths.

The default config uses versioned paths (/v1/workflow, /v1/evaluate). When enable_legacy_routes=False:

  • _versioned_path("/v1/workflow") returns "/v1/workflow" unchanged
  • Line 337-338 is skipped due to enable_legacy_routes=False
  • Line 340-341 is skipped because versioned == path

Result: no route is registered.

🐛 Proposed fix - always register versioned path
 def _register_api_route(self, app: FastAPI, *, path: str, **kwargs) -> None:
     """Register route on legacy + versioned paths depending on config."""
+    versioned = self._versioned_path(path)
+    # Always register the versioned path
+    app.add_api_route(path=versioned, **kwargs)
+    # Register legacy path only if enabled and different from versioned
     if self.front_end_config.versioning.enable_legacy_routes:
-        app.add_api_route(path=path, **kwargs)
-    versioned = self._versioned_path(path)
-    if versioned != path:
-        app.add_api_route(path=versioned, **kwargs)
+        if path != versioned:
+            app.add_api_route(path=path, **kwargs)

 def _register_websocket_route(self, app: FastAPI, *, path: str, endpoint) -> None:
+    versioned = self._versioned_path(path)
+    # Always register the versioned path
+    app.add_websocket_route(versioned, endpoint)
+    # Register legacy path only if enabled and different from versioned
     if self.front_end_config.versioning.enable_legacy_routes:
-        app.add_websocket_route(path, endpoint)
-    versioned = self._versioned_path(path)
-    if versioned != path:
-        app.add_websocket_route(versioned, endpoint)
+        if path != versioned:
+            app.add_websocket_route(path, endpoint)
🧹 Nitpick comments (3)
examples/agents/auto_memory_wrapper/README.md (1)

146-151: Endpoint migration looks correct.

The curl example correctly updates to the new /v1/chat/completions endpoint, aligning with the migration strategy in the PR objectives.

💡 Optional: Add explicit stream parameter for clarity

Since the migration guide emphasizes "stream": (true|false), consider making this explicit in the example to better demonstrate the new endpoint's capabilities:

 curl -X POST http://localhost:8000/v1/chat/completions \
   -H "Content-Type: application/json" \
   -H "X-User-ID: test_user_123" \
   -H "conversation-id: test_conv_001" \
-  -d '{"messages": [{"role": "user", "content": "Hello!"}]}'
+  -d '{"messages": [{"role": "user", "content": "Hello!"}], "stream": false}'

This would help readers understand that they can control streaming behavior via the request body.

tests/nat/front_ends/fastapi/test_evaluate_endpoints.py (1)

32-38: Consider aligning fixture path with test expectations.

The fixture configures path="/evaluate" but tests use /v1/evaluate. This works because enable_legacy_routes=True by default, but for clarity and robustness, consider updating the fixture to use the versioned path directly.

♻️ Optional alignment
 `@pytest.fixture`(name="test_config")
 def test_config_fixture() -> Config:
     config = Config()
     config.general.front_end = FastApiFrontEndConfig(evaluate=FastApiFrontEndConfig.EndpointBase(
-        path="/evaluate", method="POST", description="Test evaluate endpoint"))
+        path="/v1/evaluate", method="POST", description="Test evaluate endpoint"))

     return config
src/nat/front_ends/fastapi/fastapi_front_end_plugin_worker.py (1)

437-437: Sort __all__ for consistency (Ruff RUF022).

♻️ Sorted __all__
-__all__ = ["FastApiFrontEndPluginWorkerBase", "FastApiFrontEndPluginWorker", "RouteInfo", "EvaluationRun"]
+__all__ = ["EvaluationRun", "FastApiFrontEndPluginWorker", "FastApiFrontEndPluginWorkerBase", "RouteInfo"]

| Header | Description |
|--------|-------------|
| `traceparent` | W3C Trace Context header: `00-<trace-id>-<span-id>-<flags>` |
| `X-NAT-Trace-ID` | NeMo Agent Toolkit specific trace identifier (32-character hex string) |
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Use lowercase "toolkit" in body text.

Per the coding guidelines, "Toolkit" (capital T) should only be used in titles/headings. In body text, use lowercase "toolkit".

📝 Proposed fix
-| `X-NAT-Trace-ID` | NeMo Agent Toolkit specific trace identifier (32-character hex string) |
+| `X-NAT-Trace-ID` | NeMo Agent toolkit specific trace identifier (32-character hex string) |

Based on coding guidelines.

🤖 Prompt for AI Agents
In `@docs/source/run-workflows/observe/observability-propagation.md` at line 60,
The table cell currently uses "NeMo Agent Toolkit" with a capital "Toolkit";
update the body text to use lowercase "toolkit" so the cell reads "NeMo Agent
toolkit" (search for the string `X-NAT-Trace-ID` or `NeMo Agent Toolkit` in the
document and change only the word "Toolkit" to "toolkit" while keeping the rest
of the text unchanged).

| `traceparent` | W3C Trace Context header: `00-<trace-id>-<span-id>-<flags>` |
| `X-NAT-Trace-ID` | NeMo Agent Toolkit specific trace identifier (32-character hex string) |
| `X-NAT-Span-ID` | NeMo Agent Toolkit specific span identifier for parent-child linking |
| `X-NAT-Workflow-Run-ID` | Stable NeMo Agent Toolkit workflow run identifier (UUID string) |
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Use lowercase "toolkit" in body text.

Per the coding guidelines, "Toolkit" (capital T) should only be used in titles/headings. In body text, use lowercase "toolkit".

📝 Proposed fix
-| `X-NAT-Workflow-Run-ID` | Stable NeMo Agent Toolkit workflow run identifier (UUID string) |
+| `X-NAT-Workflow-Run-ID` | Stable NeMo Agent toolkit workflow run identifier (UUID string) |

Based on coding guidelines.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
| `X-NAT-Workflow-Run-ID` | Stable NeMo Agent Toolkit workflow run identifier (UUID string) |
| `X-NAT-Workflow-Run-ID` | Stable NeMo Agent toolkit workflow run identifier (UUID string) |
🤖 Prompt for AI Agents
In `@docs/source/run-workflows/observe/observability-propagation.md` at line 62,
Change the body text in the table cell that currently reads "Stable NeMo Agent
Toolkit workflow run identifier (UUID string)" to use lowercase "toolkit" (i.e.,
"Stable NeMo Agent toolkit workflow run identifier (UUID string)"); update the
string associated with the `X-NAT-Workflow-Run-ID` table entry so the word
"Toolkit" is lowercase to comply with the coding guideline.


### Explicit Parent ID in Workflow Runner

When using NeMo Agent Toolkit programmatically (not through HTTP), you can explicitly pass parent information to the workflow runner:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Use lowercase "toolkit" in body text.

Per the coding guidelines, "Toolkit" (capital T) should only be used in titles/headings. In body text, use lowercase "toolkit".

📝 Proposed fix
-When using NeMo Agent Toolkit programmatically (not through HTTP), you can explicitly pass parent information to the workflow runner:
+When using NeMo Agent toolkit programmatically (not through HTTP), you can explicitly pass parent information to the workflow runner:

Based on coding guidelines.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
When using NeMo Agent Toolkit programmatically (not through HTTP), you can explicitly pass parent information to the workflow runner:
When using NeMo Agent toolkit programmatically (not through HTTP), you can explicitly pass parent information to the workflow runner:
🤖 Prompt for AI Agents
In `@docs/source/run-workflows/observe/observability-propagation.md` at line 107,
Replace the capitalized word "Toolkit" with lowercase "toolkit" in the sentence
containing "NeMo Agent Toolkit programmatically (not through HTTP), you can
explicitly pass parent information to the workflow runner:" so the body text
reads "NeMo Agent toolkit" to conform to the guideline that "Toolkit" is only
capitalized in titles/headings; update the phrase wherever the exact string
"NeMo Agent Toolkit" appears in body text in this document.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

breaking Breaking change feature request New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant