@rapids-bot rapids-bot bot commented Jan 23, 2026

Forward-merge triggered by push to release/1.4 that creates a PR to keep develop up-to-date. If this PR is unable to be immediately merged due to conflicts, it will remain open for the team to manually merge. See forward-merger docs for more info.

#1464)

## Description

This PR completes the model migration started in PR #1425 by updating
the remaining 6 examples that were causing CI test failures:
- Updated 6 config files to use Nemotron 30B.
- Updated READMEs to reflect the model changes.

## Fixes 20 CI test failures:
- mixture_of_agents (2 failures - timeout/hang)
- tool_calling-reasoning (1 failure - parsing error)
- simple_calculator (13 failures - recursion/parsing errors)
- simple_calculator_eval (1 failure - workflow interrupted)
- simple_web_query (1 failure - missing Action)
- simple_web_query_eval (1 failure - missing intermediate_steps)


## Root Cause
Llama 3.1/3.3 models struggle with the current ReAct prompt format,
causing parsing errors, recursion limits, and timeouts.
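
To illustrate the kind of parsing failure described above, here is a minimal sketch of a ReAct-style step parser. It is a hypothetical, simplified stand-in and not the toolkit's actual parser: the `Action:` / `Action Input:` / `Final Answer:` labels follow the common ReAct convention, and the names `parse_react_step` and `ReActParseError` are invented for this example.

```python
import re

# Hypothetical, simplified ReAct step parser (NOT the toolkit's real parser).
# It assumes the conventional "Action:" / "Action Input:" / "Final Answer:" labels.
ACTION_RE = re.compile(
    r"Action:\s*(?P<tool>.+?)\s*\nAction Input:\s*(?P<tool_input>.*)",
    re.DOTALL,
)
FINAL_RE = re.compile(r"Final Answer:\s*(?P<answer>.*)", re.DOTALL)


class ReActParseError(ValueError):
    """Raised when the model output has neither an Action nor a Final Answer."""


def parse_react_step(text: str) -> dict:
    """Classify one model response as a final answer or a tool call."""
    final = FINAL_RE.search(text)
    if final:
        return {"type": "final", "answer": final.group("answer").strip()}

    action = ACTION_RE.search(text)
    if action:
        return {
            "type": "action",
            "tool": action.group("tool").strip(),
            "tool_input": action.group("tool_input").strip(),
        }

    # Free-form text with no recognizable labels ends up here.
    raise ReActParseError("missing Action or Final Answer")
```

In a sketch like this, a response that drifts from the expected labels raises an error on every step. If the agent loop retries after each error, repeated failures could plausibly exhaust a step or recursion limit, or run until a timeout.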

## Solution
Migrate to Nemotron models (proven to work in PR #1425):
- `meta/llama-3.1-405b-instruct` -> `nvidia/nemotron-3-nano-30b-a3b`
- `meta/llama-3.3-70b-instruct` -> `nvidia/nemotron-3-nano-30b-a3b`
- `meta/llama-3.1-70b-instruct` -> `nvidia/nemotron-3-nano-30b-a3b`
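
The mapping above can be sketched as a small substitution helper. This is an illustrative sketch only, not the script used for this PR: the helper name `migrate_model_names` is hypothetical, and the `model_name` key in the usage example is an assumption about the config layout.

```python
# Old -> new model names, taken from the list above.
MODEL_MIGRATION = {
    "meta/llama-3.1-405b-instruct": "nvidia/nemotron-3-nano-30b-a3b",
    "meta/llama-3.3-70b-instruct": "nvidia/nemotron-3-nano-30b-a3b",
    "meta/llama-3.1-70b-instruct": "nvidia/nemotron-3-nano-30b-a3b",
}


def migrate_model_names(config_text: str) -> str:
    """Replace every legacy model name in a config file's text.

    Hypothetical helper: it does plain string replacement and leaves all
    other settings (temperature, token limits, and so on) untouched.
    """
    for old_name, new_name in MODEL_MIGRATION.items():
        config_text = config_text.replace(old_name, new_name)
    return config_text


if __name__ == "__main__":
    # "model_name" is an assumed key, used here only for illustration.
    sample = "model_name: meta/llama-3.1-70b-instruct\ntemperature: 0.0\n"
    print(migrate_model_names(sample))
```

Plain string replacement is enough here because none of the legacy names is a substring of another, so the replacement order does not matter.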

## Testing
Tested locally with the mixture_of_agents example, which now completes successfully instead of hanging or timing out.

## By Submitting this PR I confirm:
- I am familiar with the [Contributing
Guidelines](https://github.com/NVIDIA/NeMo-Agent-Toolkit/blob/develop/docs/source/resources/contributing/index.md).
- We require that all contributors "sign-off" on their commits. This
certifies that the contribution is your original work, or you have
rights to submit it under the same license, or a compatible license.
- Any contribution which contains commits that are not Signed-Off will
not be accepted.
- When the PR is ready for review, new or existing tests cover these
changes.
- When the PR is ready for review, the documentation is up to date with
these changes.


## Summary by CodeRabbit

* **Chores**
  * Updated example configurations to use NVIDIA Nemotron models across
    multiple examples; other settings (temperature, token limits, and
    related parameters) remain unchanged.
* **Documentation**
  * Updated example README content and evaluation docs to replace
    references to previous Llama variants with NVIDIA Nemotron models and
    removed corresponding legacy evaluation entries.


---------

Signed-off-by: mnajafian-nv <mnajafian@nvidia.com>
@rapids-bot rapids-bot bot requested a review from a team as a code owner January 23, 2026 20:01
@GPUtester GPUtester merged commit 009563d into develop Jan 23, 2026
rapids-bot bot commented Jan 23, 2026

SUCCESS - forward-merge complete.
