add reward manager #71
Conversation
Pull request overview
This PR introduces a reward manager system for reinforcement learning training, enabling modular reward computation through configurable reward functors. The reward manager follows the same design pattern as the existing observation manager and event manager.
Changes:
- Added `RewardManager` class to orchestrate reward computation with support for multiple weighted reward terms (see the sketch after this list)
- Implemented 11 reusable reward functions covering distance-based rewards, penalties, and success bonuses
- Integrated reward manager into `EmbodiedEnv` and `BaseEnv` for automatic reward computation
- Refactored `PushCubeEnv` to use the reward manager instead of manual reward calculation
- Added `randomize_target_pose` function for virtual goal poses without physical scene objects
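To make the weighted-term pattern concrete, here is a minimal sketch of how a functor-based reward manager typically works. The functor name `distance_to_target` and all signatures below are illustrative assumptions, not the exact API introduced by this PR:

```python
import torch

# Hypothetical reward functor (signature is an assumption, not the PR's API):
# dense reward equal to the negative end-effector-to-target distance.
def distance_to_target(env) -> torch.Tensor:
    ee_pos = env.get_robot_ee_pose()[:, :3]  # assumed helper, cf. observations.py
    target = env.target_position             # assumed attribute
    return -torch.norm(ee_pos - target, dim=-1)

class RewardManager:
    """Sketch: evaluates each configured functor and accumulates weighted terms."""

    def __init__(self, cfgs, env):
        # cfgs: mapping of term name -> (functor, weight), parsed from config
        self.cfgs = cfgs
        self.env = env

    def compute(self) -> torch.Tensor:
        # One scalar reward per parallel environment.
        total = torch.zeros(self.env.num_envs, device=self.env.device)
        for name, (func, weight) in self.cfgs.items():
            total += weight * func(self.env)
        return total
```

This mirrors the design of the existing observation and event managers: each term is an independent, reusable function, and the environment only sees the aggregated result.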
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 8 comments.
Summary per file:
| File | Description |
|---|---|
| embodichain/lab/gym/envs/managers/reward_manager.py | New reward manager class for orchestrating reward computation |
| embodichain/lab/gym/envs/managers/rewards.py | New module with 11 reward functor implementations |
| embodichain/lab/gym/envs/managers/cfg.py | Added RewardCfg configuration class |
| embodichain/lab/gym/envs/managers/__init__.py | Exported RewardCfg and RewardManager |
| embodichain/lab/gym/envs/embodied_env.py | Integrated reward manager initialization and reset |
| embodichain/lab/gym/envs/base_env.py | Added _extend_reward hook in get_reward method |
| embodichain/lab/gym/utils/gym_utils.py | Added reward parsing logic in load_gym_cfg |
| embodichain/lab/gym/envs/tasks/rl/push_cube.py | Refactored to use reward manager, removed manual reward code |
| embodichain/lab/gym/envs/managers/randomization/spatial.py | Added randomize_target_pose function |
| embodichain/lab/gym/envs/managers/observations.py | Added get_robot_ee_pose and target_position observation functions |
| configs/agents/rl/push_cube/gym_config.json | Updated to use reward manager configuration |
| configs/agents/rl/push_cube/train_config.json | Changed eval_freq from 2 to 200 |
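The exact reward schema used in gym_config.json is not visible on this page. As a rough illustration of what a functor-based reward configuration often looks like, here is a hypothetical reward section shown as the equivalent Python dict; the field names (`func`, `weight`, `params`) and term names are assumptions about the `RewardCfg` schema, not confirmed by this PR:

```python
# Hypothetical reward section of gym_config.json, as a Python dict.
# All keys and functor paths below are illustrative assumptions.
reward_cfg = {
    "reach_target": {
        "func": "embodichain.lab.gym.envs.managers.rewards:distance_to_target",
        "weight": 1.0,     # dense shaping term
        "params": {},
    },
    "action_penalty": {
        "func": "embodichain.lab.gym.envs.managers.rewards:action_rate_penalty",
        "weight": -0.01,   # small penalty to discourage jerky actions
        "params": {},
    },
    "success_bonus": {
        "func": "embodichain.lab.gym.envs.managers.rewards:success_bonus",
        "weight": 10.0,    # sparse bonus on task completion
        "params": {"threshold": 0.02},
    },
}
```

Keeping weights in the config rather than in code is what lets `PushCubeEnv` drop its manual reward calculation: tuning a reward only requires editing gym_config.json.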
Co-authored-by: chenjian <chenjian@dexforce.com>
Description
This document summarizes the usage of `RewardManager` and `ObservationManager` in our RL training pipeline, in a simple, practical format for contributors.
Summary of Change
Motivation & Context
Usage in RL Training
RewardManager
Reward terms are configured in the gym configuration file (gym_config.json).
ObservationManager
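The usage details for `ObservationManager` did not survive in this description. By analogy with the reward terms above, a functor-based observation manager is typically used roughly as follows; `get_robot_ee_pose` and `target_position` appear in observations.py per the file table, but every signature here is an assumption:

```python
import torch

# Hypothetical observation functors, analogous to the reward functors above.
def get_robot_ee_pose(env) -> torch.Tensor:
    return env.robot.ee_pose   # assumed shape (num_envs, 7): position + quaternion

def target_position(env) -> torch.Tensor:
    return env.target_pos      # assumed shape (num_envs, 3)

class ObservationManager:
    """Sketch: concatenates the outputs of the configured observation functors."""

    def __init__(self, funcs):
        self.funcs = funcs  # ordered list of functors parsed from gym_config.json

    def compute(self, env) -> torch.Tensor:
        # Flat observation vector per environment, in config order.
        return torch.cat([f(env) for f in self.funcs], dim=-1)
```

Because both managers share the same functor pattern, adding a new observation or reward term is a matter of writing one function and referencing it from gym_config.json.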