Add changes from vcha/stable#436
Open
vcharraut wants to merge 8 commits into
… clarity
- Added 'amp' option to default.ini for automatic mixed precision support.
- Introduced 'resume_state_path' in default.ini for state restoration.
- Updated compilation settings in default.ini for better compatibility.
- Refined Waypoint structure in datatypes.h for clarity.
- Modified Drive class in drive.h to improve collision handling and agent initialization.
- Enhanced observation handling in drive.py, including padded observations and traffic control features.
- Implemented utility functions in pufferl.py for better device management and state handling.
- Improved training state loading and saving mechanisms in PuffeRL class.
- Adjusted training logic to support advanced features like mixed precision and dynamic batching.
…d training evaluation
…resource management
Pull request overview
This PR appears to merge in “stable” changes that extend PufferDrive’s training loop with improved checkpoint/resume support, additional evaluation utilities (multi-scenario evaluation + CSV export), and several Drive environment/config updates.
Changes:
- Extend PuffeRL with precision/AMP handling, state dict key cleaning, richer checkpoint state, and resume-from-state support.
- Add standalone multi-scenario evaluation helpers (config merging, overrides, CSV export, coverage verification, logging).
- Update Drive env observation construction/padding and configs (including new INI defaults and new weight config YAMLs).
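The "state dict key cleaning" item can be illustrated with a minimal sketch. It assumes the keys being cleaned carry the wrapper prefixes that torch.compile (`_orig_mod.`) and DistributedDataParallel (`module.`) add to parameter names; the helper name `clean_state_dict_keys` and the prefix list are hypothetical, and the PR's actual implementation may differ.

```python
from collections import OrderedDict

# Assumed wrapper prefixes: torch.compile stores the wrapped module under
# "_orig_mod." and DistributedDataParallel under "module.". The list used
# in the PR is not shown, so this tuple is illustrative only.
WRAPPER_PREFIXES = ("_orig_mod.", "module.")


def clean_state_dict_keys(state_dict):
    """Return a copy of state_dict with leading wrapper prefixes removed."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        stripped = True
        # Repeat until no prefix matches, so nested wrappers are handled too.
        while stripped:
            stripped = False
            for prefix in WRAPPER_PREFIXES:
                if key.startswith(prefix):
                    key = key[len(prefix):]
                    stripped = True
        cleaned[key] = value
    return cleaned


# Hypothetical keys; plain integers stand in for tensors.
raw = {
    "_orig_mod.encoder.weight": 1,
    "module._orig_mod.policy.bias": 2,
    "value.weight": 3,
}
cleaned = clean_state_dict_keys(raw)
print(list(cleaned))
```

Cleaning keys at load time lets a checkpoint saved from a compiled or wrapped policy be loaded into a plain one, which matters for resume-from-state.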
Reviewed changes
Copilot reviewed 9 out of 45 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| weigths/tomate/config.yaml | Adds a new experiment/config preset for training/eval. |
| weigths/salade/config.yaml | Adds another experiment/config preset for training/eval. |
| pufferlib/pufferl.py | Major training/eval refactor: AMP/precision validation, compile tweaks, checkpoint state v2 + RNG capture/restore, resume, and new multi-scenario eval utilities. |
| pufferlib/ocean/torch.py | Refactors encoder+pooling and aligns one-hot dtypes with continuous features. |
| pufferlib/ocean/drive/drive.py | Adjusts control_mode error message text. |
| pufferlib/ocean/drive/drive.h | Changes observation padding strategy and removes a zero-drivable-cells guard; minor control logic tweak. |
| pufferlib/ocean/drive/datatypes.h | Edits a struct field comment. |
| pufferlib/config/ocean/drive.ini | Updates map_dir and adds an [eval] section with multi-scenario eval config. |
| pufferlib/config/default.ini | Adds amp and resume_state_path defaults; changes torch.compile defaults. |
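The "RNG capture/restore" entry for pufferlib/pufferl.py can be sketched as follows. This is a minimal illustration using only Python's standard `random` module; the function names and the layout of the state dictionary are hypothetical. A real training loop would presumably also capture the NumPy and PyTorch generator states, which is omitted here to keep the example dependency-free.

```python
import random


def capture_rng_state():
    """Snapshot the Python RNG so a resumed run continues the same stream."""
    # Hypothetical layout: one entry per generator being tracked.
    return {"python": random.getstate()}


def restore_rng_state(state):
    """Restore a snapshot produced by capture_rng_state."""
    random.setstate(state["python"])


random.seed(0)
snapshot = capture_rng_state()
first_draws = [random.random() for _ in range(3)]

# Restoring the snapshot replays exactly the same draws.
restore_rng_state(snapshot)
second_draws = [random.random() for _ in range(3)]
print(first_draws == second_draws)
```

Storing such a snapshot next to the model and optimizer state is what makes a resumed run follow the same random stream as an uninterrupted one.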
Comment on lines +227 to +240
    [eval]
    ; Set to True to enable periodic multi-scenario evaluation during training
    multi_scenario_eval = False
    ; Frequency of evaluation during training (in epochs)
    eval_interval = 25
    num_agents = 512
    ; Batch size for eval_multi_scenarios (number of scenarios per batch)
    ; Path to dataset used for evaluation
    map_dir = "pufferlib/resources/drive/binaries/eval"
    ; Simulation mode for evaluation: "gigaflow" or "replay"
    multi_scenario_simulation_mode = "replay"
    ; Total number of scenarios to evaluate
    multi_scenario_num_scenarios = 250
    backend = PufferEnv
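To show how the [eval] keys above might be read and validated, here is a minimal sketch using the standard-library `configparser`. PufferLib's own config loader is not shown in this excerpt, so the parsing approach, the helper name `load_eval_config`, and the shortened output key names are all assumptions; only the keys and values come from the excerpt.

```python
import configparser

# Keys and values copied from the [eval] excerpt; everything else is illustrative.
EVAL_INI = """
[eval]
; Set to True to enable periodic multi-scenario evaluation during training
multi_scenario_eval = False
; Frequency of evaluation during training (in epochs)
eval_interval = 25
num_agents = 512
map_dir = "pufferlib/resources/drive/binaries/eval"
multi_scenario_simulation_mode = "replay"
multi_scenario_num_scenarios = 250
backend = PufferEnv
"""

# The config comment names these two modes as the accepted values.
VALID_SIMULATION_MODES = ("gigaflow", "replay")


def load_eval_config(ini_text):
    """Parse the [eval] section into typed Python values."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    section = parser["eval"]
    cfg = {
        "multi_scenario_eval": section.getboolean("multi_scenario_eval"),
        "eval_interval": section.getint("eval_interval"),
        "num_agents": section.getint("num_agents"),
        # configparser keeps the surrounding quotes, so strip them.
        "map_dir": section["map_dir"].strip('"'),
        "simulation_mode": section["multi_scenario_simulation_mode"].strip('"'),
        "num_scenarios": section.getint("multi_scenario_num_scenarios"),
        "backend": section["backend"],
    }
    if cfg["simulation_mode"] not in VALID_SIMULATION_MODES:
        raise ValueError(
            f"simulation mode must be one of {VALID_SIMULATION_MODES}, "
            f"got {cfg['simulation_mode']!r}"
        )
    return cfg


eval_cfg = load_eval_config(EVAL_INI)
print(eval_cfg["eval_interval"], eval_cfg["map_dir"])
```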
    else:
        raise ValueError(
    -       f"control_mode must be one of 'control_vehicles', 'control_agents', 'control_wosac', or 'control_sdc_only'. Got: {self.control_mode_str}"
    +       f"control_mode must be one of 'control_vehicles', 'control_wosac', or 'control_agents'. Got: {self.control_mode_str}"
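One way to keep an error message like this in sync with the accepted values is to build it from a single tuple of mode names. The sketch below is an illustration only: the constant `VALID_CONTROL_MODES` and the function `validate_control_mode` are hypothetical names, and the four mode strings are the ones that appear in the messages above.

```python
# Mode names taken from the error messages above; the tuple itself is hypothetical.
VALID_CONTROL_MODES = (
    "control_vehicles",
    "control_agents",
    "control_wosac",
    "control_sdc_only",
)


def validate_control_mode(control_mode_str):
    """Return the mode unchanged, or raise with a message listing every valid mode."""
    if control_mode_str not in VALID_CONTROL_MODES:
        options = ", ".join(repr(mode) for mode in VALID_CONTROL_MODES)
        raise ValueError(
            f"control_mode must be one of {options}. Got: {control_mode_str}"
        )
    return control_mode_str


try:
    validate_control_mode("control_everything")
except ValueError as exc:
    error_text = str(exc)
    print(error_text)
```

Because the message is generated from the tuple, adding or removing a mode cannot leave the text out of date.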
        float sin_heading; // Cached sinf(heading) - set in build_path
        float kappa; // Curvature at this point
    -   int lane_idx; // Index of the lane this waypoint belongs to (for GT path) or closest to (for expert path)
    +   int lane_idx; // Index of the lane this waypoint
    if model_path:
        experiment_dir = os.path.dirname(os.path.dirname(model_path))
        config_yaml_path = os.path.join(experiment_dir, "config.yaml")
        EXCLUDE_KEYS = eval_overrides["env"].keys()
Comment on lines +2152 to +2166
    # Multi-worker backend returns infos as list of lists (one per worker)
    if infos and infos[0]:
        for sub_env in infos:
            for env_idx, summary in enumerate(sub_env):
                env_map_name = summary["map_name"].split("/")[-1].split(".")[0]
                summary["episode_id"] = env_idx
                summary["map_name"] = env_map_name
                scenarios_processed += 1
                pbar.update(1)

                for k, v in summary.items():
                    if k not in global_infos:
                        global_infos[k] = []
                    global_infos[k].append(v)
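The aggregation step above can be exercised in isolation with a small sketch. It deliberately varies two things, both assumptions rather than the PR's behavior: episode IDs come from a running counter so they stay unique across workers, and results are kept as a list of row dictionaries instead of a dictionary of columns. The sample `infos` data and the `score` key are made up for illustration.

```python
def aggregate_infos(infos):
    """Flatten per-worker lists of episode summaries into one list of rows."""
    rows = []
    for sub_env in infos:
        for summary in sub_env:
            row = dict(summary)  # copy so the caller's dict is not mutated
            # Running counter: unique across workers, unlike a per-worker index.
            row["episode_id"] = len(rows)
            # Same map-name reduction as the excerpt: drop directories and extension.
            row["map_name"] = summary["map_name"].split("/")[-1].split(".")[0]
            rows.append(row)
    return rows


# Hypothetical output of a two-worker backend.
infos = [
    [{"map_name": "maps/map_000.bin", "score": 0.5}],
    [
        {"map_name": "maps/map_001.bin", "score": 0.7},
        {"map_name": "maps/map_002.bin", "score": 0.9},
    ],
]
rows = aggregate_infos(infos)
print([(row["episode_id"], row["map_name"]) for row in rows])
```

Keeping one dictionary per episode also tolerates summaries whose key sets differ, whereas per-key lists would end up with unequal lengths.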
Comment on lines +1947 to +1953
    try:
        df_episodes = pd.DataFrame(global_infos)
        first_cols = ["episode_id", "map_name"]
        other_cols = [col for col in df_episodes.columns if col not in first_cols]
        new_col_order = first_cols + other_cols
        df_episodes = df_episodes[new_col_order]
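The column-ordering idea above (identifier columns first, everything else after) can also be sketched without pandas. The example below uses the standard-library `csv` module as a stand-in; it is not the PR's implementation, and `write_episode_csv` and the sample row are hypothetical.

```python
import csv
import io


def write_episode_csv(rows, stream, first_cols=("episode_id", "map_name")):
    """Write rows as CSV with first_cols leading and remaining keys in first-seen order."""
    other_cols = []
    for row in rows:
        for key in row:
            if key not in first_cols and key not in other_cols:
                other_cols.append(key)
    fieldnames = list(first_cols) + other_cols
    # restval fills cells for rows that lack one of the columns.
    writer = csv.DictWriter(stream, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return fieldnames


# Hypothetical episode row; keys arrive in arbitrary order.
buffer = io.StringIO()
columns = write_episode_csv(
    [{"score": 0.5, "map_name": "map_000", "episode_id": 0}], buffer
)
header_line = buffer.getvalue().splitlines()[0]
print(header_line)
```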
        return;
    }
    int num_agents_to_create = env->num_controllable_agents;
    static inline void fill_padded_observation_rows(float *obs, int rows, int features) {
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < features; c++) {
                obs[r * features + c] = PADDED_OBSERVATION_VALUE;
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add one-line comments to fill_padded_observation_rows / fill_padded_traffic_control_rows, and pull the road-edge heading fold into a reusable wrap_heading(angle) helper (folds a heading into [-pi/2, pi/2] so opposite directions map to one orientation). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
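The heading fold described in this commit message can be sketched in Python. The actual `wrap_heading` helper lives in the C header and is not shown here, so the arithmetic below is an assumption about one way to fold an angle into [-pi/2, pi/2] such that a heading and its opposite map to the same value.

```python
import math


def wrap_heading(angle):
    """Fold a heading (radians) into [-pi/2, pi/2], treating opposite directions as equal."""
    # fmod keeps the sign of angle and yields a value in (-pi, pi).
    wrapped = math.fmod(angle, math.pi)
    if wrapped > math.pi / 2:
        wrapped -= math.pi
    elif wrapped < -math.pi / 2:
        wrapped += math.pi
    return wrapped


# A heading and the same heading rotated by pi describe one road-edge orientation.
forward = wrap_heading(0.3)
backward = wrap_heading(0.3 + math.pi)
print(math.isclose(forward, backward, abs_tol=1e-9))
```

Folding by pi rather than 2*pi is what makes the result orientation-only: a road edge traversed in either direction yields the same feature value.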
The message omitted control_sdc_only (a valid mode → control_mode=3); list all four accepted values. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
No description provided.