Skip to content

Conversation

@ananthsub
Copy link
Contributor

@ananthsub ananthsub commented Jan 12, 2026

What does this PR do ?

Part of NVIDIA-NeMo/Gym#292

This PR documents the NeMo RL + Gym integration, which includes:

  1. The Ray actor bridge code in RL that initializes & launches Gym, and how Gym re-uses the Ray cluster info
  2. How RL prepares its vLLM servers for Gym to proxy through to, so inference logic is contained within RL
  3. The training loop flow for how RL sends request data to Gym and how the data is translated between Gym and RL formats

Issues

NVIDIA-NeMo/Gym#292

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

  • ...

Signed-off-by: Ananth Subramaniam <ansubramania@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant