
Add changes from vcha/stable#436

Open
vcharraut wants to merge 8 commits into emerge/temp_training from vcha/update
Conversation

@vcharraut
Collaborator

No description provided.

vcharraut added 5 commits May 21, 2026 18:00
… clarity

- Added 'amp' option to default.ini for automatic mixed precision support.
- Introduced 'resume_state_path' in default.ini for state restoration.
- Updated compilation settings in default.ini for better compatibility.
- Refined Waypoint structure in datatypes.h for clarity.
- Modified Drive class in drive.h to improve collision handling and agent initialization.
- Enhanced observation handling in drive.py, including padded observations and traffic control features.
- Implemented utility functions in pufferl.py for better device management and state handling.
- Improved training state loading and saving mechanisms in PuffeRL class.
- Adjusted training logic to support advanced features like mixed precision and dynamic batching.
Copilot AI review requested due to automatic review settings May 21, 2026 16:25

Copilot AI left a comment

Pull request overview

This PR appears to merge in “stable” changes that extend PufferDrive’s training loop with improved checkpoint/resume support, additional evaluation utilities (multi-scenario evaluation + CSV export), and several Drive environment/config updates.

Changes:

  • Extend PuffeRL with precision/AMP handling, state dict key cleaning, richer checkpoint state, and resume-from-state support.
  • Add standalone multi-scenario evaluation helpers (config merging, overrides, CSV export, coverage verification, logging).
  • Update Drive env observation construction/padding and configs (including new INI defaults and new weight config YAMLs).

Reviewed changes

Copilot reviewed 9 out of 45 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| weigths/tomate/config.yaml | Adds a new experiment/config preset for training/eval. |
| weigths/salade/config.yaml | Adds another experiment/config preset for training/eval. |
| pufferlib/pufferl.py | Major training/eval refactor: AMP/precision validation, compile tweaks, checkpoint state v2 + RNG capture/restore, resume, and new multi-scenario eval utilities. |
| pufferlib/ocean/torch.py | Refactors encoder+pooling and aligns one-hot dtypes with continuous features. |
| pufferlib/ocean/drive/drive.py | Adjusts control_mode error message text. |
| pufferlib/ocean/drive/drive.h | Changes observation padding strategy and removes a zero-drivable-cells guard; minor control logic tweak. |
| pufferlib/ocean/drive/datatypes.h | Edits a struct field comment. |
| pufferlib/config/ocean/drive.ini | Updates map_dir and adds an [eval] section with multi-scenario eval config. |
| pufferlib/config/default.ini | Adds amp and resume_state_path defaults; changes torch.compile defaults. |
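The RNG capture/restore mentioned for pufferlib/pufferl.py can be illustrated with the standard library alone. This is only a sketch of the pattern: it covers Python's `random` module, whereas a training loop would presumably also save the NumPy and torch generator states, which are not shown here. The function names are hypothetical.

```python
import random


def capture_rng_state():
    """Snapshot the Python RNG so it can be stored alongside a checkpoint."""
    return {"python": random.getstate()}


def restore_rng_state(state):
    """Restore a snapshot produced by capture_rng_state()."""
    random.setstate(state["python"])


random.seed(0)
saved = capture_rng_state()
first = [random.random() for _ in range(3)]

# Restoring the snapshot replays exactly the same random sequence.
restore_rng_state(saved)
second = [random.random() for _ in range(3)]
```

Restoring RNG state on resume is what makes a resumed run follow the same sampling sequence as an uninterrupted one.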


Comment on lines +227 to +240
[eval]
; Set to True to enable periodic multi-scenario evaluation during training
multi_scenario_eval = False
; Frequency of evaluation during training (in epochs)
eval_interval = 25
num_agents = 512
; Batch size for eval_multi_scenarios (number of scenarios per batch)
; Path to dataset used for evaluation
map_dir = "pufferlib/resources/drive/binaries/eval"
; Simulation mode for evaluation: "gigaflow" or "replay"
multi_scenario_simulation_mode = "replay"
; Total number of scenarios to evaluate
multi_scenario_num_scenarios = 250
backend = PufferEnv
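To show how the `[eval]` keys above could be read and type-checked, here is a minimal sketch using the standard-library `configparser`. PufferLib has its own config loading, so this is an illustration rather than the project's code path; stripping the double quotes from string values and the `eval_cfg` key names are assumptions.

```python
import configparser

EVAL_INI = """
[eval]
; Set to True to enable periodic multi-scenario evaluation during training
multi_scenario_eval = False
eval_interval = 25
num_agents = 512
map_dir = "pufferlib/resources/drive/binaries/eval"
multi_scenario_simulation_mode = "replay"
multi_scenario_num_scenarios = 250
backend = PufferEnv
"""

parser = configparser.ConfigParser()
parser.read_string(EVAL_INI)
section = parser["eval"]

eval_cfg = {
    "multi_scenario_eval": section.getboolean("multi_scenario_eval"),
    "eval_interval": section.getint("eval_interval"),
    "num_agents": section.getint("num_agents"),
    # configparser keeps surrounding quotes, so strip them explicitly.
    "map_dir": section.get("map_dir").strip('"'),
    "simulation_mode": section.get("multi_scenario_simulation_mode").strip('"'),
    "num_scenarios": section.getint("multi_scenario_num_scenarios"),
}

# The INI comment documents two accepted simulation modes.
if eval_cfg["simulation_mode"] not in ("gigaflow", "replay"):
    raise ValueError(f"Unknown simulation mode: {eval_cfg['simulation_mode']}")
```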
Comment thread pufferlib/ocean/drive/drive.py Outdated
      else:
          raise ValueError(
-             f"control_mode must be one of 'control_vehicles', 'control_agents', 'control_wosac', or 'control_sdc_only'. Got: {self.control_mode_str}"
+             f"control_mode must be one of 'control_vehicles', 'control_wosac', or 'control_agents'. Got: {self.control_mode_str}"

      float sin_heading; // Cached sinf(heading) - set in build_path
      float kappa; // Curvature at this point
-     int lane_idx; // Index of the lane this waypoint belongs to (for GT path) or closest to (for expert path)
+     int lane_idx; // Index of the lane this waypoint
Comment thread pufferlib/pufferl.py
if model_path:
experiment_dir = os.path.dirname(os.path.dirname(model_path))
config_yaml_path = os.path.join(experiment_dir, "config.yaml")
EXCLUDE_KEYS = eval_overrides["env"].keys()
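The snippet above locates a saved `config.yaml` two directories above the model file and collects the keys that the eval overrides set. One plausible reading is that the saved env config is merged in while the overridden keys are left alone. The sketch below shows that reading only; `merge_env_config` and the example paths are hypothetical, and the actual merge logic is not visible in this excerpt.

```python
import os


def experiment_config_path(model_path):
    """Mirror the path logic above: config.yaml sits two levels above the model file."""
    experiment_dir = os.path.dirname(os.path.dirname(model_path))
    return os.path.join(experiment_dir, "config.yaml")


def merge_env_config(current_env, saved_env, exclude_keys):
    """Copy saved settings into the current env config, skipping excluded keys.

    Assumption: excluded keys are the ones eval overrides must keep control of.
    """
    merged = dict(current_env)
    for key, value in saved_env.items():
        if key not in exclude_keys:
            merged[key] = value
    return merged


eval_overrides = {"env": {"map_dir": "eval_maps", "num_agents": 512}}
saved_env = {"map_dir": "train_maps", "num_agents": 1024, "reward_scale": 0.5}
merged = merge_env_config(eval_overrides["env"], saved_env, eval_overrides["env"].keys())
```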
Comment thread pufferlib/pufferl.py
Comment on lines +2152 to +2166
# Multi-worker backend returns infos as list of lists (one per worker)
if infos and infos[0]:
    for sub_env in infos:
        for env_idx, summary in enumerate(sub_env):
            env_map_name = summary["map_name"].split("/")[-1].split(".")[0]
            summary["episode_id"] = env_idx
            summary["map_name"] = env_map_name
            scenarios_processed += 1
            pbar.update(1)

            for k, v in summary.items():
                if k not in global_infos:
                    global_infos[k] = []
                global_infos[k].append(v)
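The aggregation above can be exercised in isolation. The following is a sketch rather than the PR's code: it uses `collections.defaultdict` in place of the explicit key check, copies each summary instead of mutating it, and drops the progress bar. `aggregate_infos` and the sample data are hypothetical.

```python
from collections import defaultdict


def aggregate_infos(infos):
    """Flatten per-worker lists of episode summaries into a dict of column lists."""
    global_infos = defaultdict(list)
    scenarios_processed = 0
    for sub_env in infos:
        for env_idx, summary in enumerate(sub_env):
            record = dict(summary)  # copy so the caller's dict is left untouched
            record["episode_id"] = env_idx
            # Keep only the file stem, e.g. "a/b/map_001.bin" -> "map_001".
            record["map_name"] = summary["map_name"].split("/")[-1].split(".")[0]
            scenarios_processed += 1
            for k, v in record.items():
                global_infos[k].append(v)
    return dict(global_infos), scenarios_processed


sample = [
    [{"map_name": "a/b/map_001.bin", "score": 1.0}],
    [{"map_name": "x/map_002.bin", "score": 0.5}],
]
columns, count = aggregate_infos(sample)
```

Because `env_idx` comes from `enumerate(sub_env)`, it restarts at 0 for each worker's list, so `episode_id` is unique only within one worker. The column lists also stay equal in length only if every summary carries the same keys.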

Comment thread pufferlib/pufferl.py
Comment on lines +1947 to +1953
try:
    df_episodes = pd.DataFrame(global_infos)
    first_cols = ["episode_id", "map_name"]
    other_cols = [col for col in df_episodes.columns if col not in first_cols]
    new_col_order = first_cols + other_cols
    df_episodes = df_episodes[new_col_order]
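The same column ordering can be sketched without pandas. The version below swaps in the standard-library `csv` module so it runs with no third-party dependencies; it is not the PR's export code, and `write_episodes_csv` is a hypothetical name. It assumes every column list has the same length.

```python
import csv
import io


def write_episodes_csv(global_infos, stream, first_cols=("episode_id", "map_name")):
    """Write a dict of equal-length column lists as CSV, identifier columns first."""
    other_cols = [col for col in global_infos if col not in first_cols]
    columns = list(first_cols) + other_cols
    num_rows = len(global_infos[columns[0]])
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(columns)
    for i in range(num_rows):
        writer.writerow([global_infos[col][i] for col in columns])


buffer = io.StringIO()
write_episodes_csv(
    {"score": [1.0, 0.5], "map_name": ["map_001", "map_002"], "episode_id": [0, 1]},
    buffer,
)
csv_lines = buffer.getvalue().splitlines()
```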

return;
}
int num_agents_to_create = env->num_controllable_agents;

static inline void fill_padded_observation_rows(float *obs, int rows, int features) {
    for (int r = 0; r < rows; r++) {
        for (int c = 0; c < features; c++) {
            obs[r * features + c] = PADDED_OBSERVATION_VALUE;
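For readers following the Python side, the row-major fill in the C fragment above can be mirrored as below. This is an illustrative sketch: the value of `PADDED_OBSERVATION_VALUE` is not visible in this excerpt, so `-1.0` is a placeholder assumption, and the `offset` parameter is a hypothetical stand-in for the pointer arithmetic a C caller would do.

```python
PADDED_OBSERVATION_VALUE = -1.0  # placeholder assumption; the real constant lives in the C sources


def fill_padded_observation_rows(obs, rows, features, offset=0):
    """Overwrite `rows` consecutive rows of a flat, row-major buffer with the pad value."""
    for r in range(rows):
        for c in range(features):
            obs[offset + r * features + c] = PADDED_OBSERVATION_VALUE


# A buffer of 4 rows x 3 features where only the first 2 rows hold real data.
observation = [0.0] * 12
fill_padded_observation_rows(observation, rows=2, features=3, offset=6)
```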
Eugene Vinitsky and others added 3 commits May 21, 2026 12:07
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Add one-line comments to fill_padded_observation_rows /
fill_padded_traffic_control_rows, and pull the road-edge heading fold into a
reusable wrap_heading(angle) helper (folds a heading into [-pi/2, pi/2] so
opposite directions map to one orientation).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
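The `wrap_heading(angle)` helper described in this commit is C code that is not shown here. A Python sketch of the folding it describes, assuming "fold" means reducing the heading modulo pi so that a direction and its opposite land on the same value in [-pi/2, pi/2]:

```python
import math


def wrap_heading(angle):
    """Fold a heading (radians) into [-pi/2, pi/2]; opposite directions coincide."""
    # First normalize to [-pi, pi] so the two range checks below are sufficient.
    wrapped = math.atan2(math.sin(angle), math.cos(angle))
    if wrapped > math.pi / 2:
        wrapped -= math.pi
    elif wrapped < -math.pi / 2:
        wrapped += math.pi
    return wrapped


forward = wrap_heading(0.3)
backward = wrap_heading(0.3 + math.pi)  # same road edge, opposite direction
```

Treating a road edge as an undirected orientation in this way means the observation does not depend on the order in which the edge's points happen to be stored.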
The message omitted control_sdc_only (a valid mode → control_mode=3); list
all four accepted values.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
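One way to keep such an error message from drifting out of sync with the accepted values is to build it from a single tuple. The sketch below is not the code in drive.py; `VALID_CONTROL_MODES` and `validate_control_mode` are hypothetical names, and only the four mode strings come from this PR.

```python
VALID_CONTROL_MODES = (
    "control_vehicles",
    "control_agents",
    "control_wosac",
    "control_sdc_only",
)


def validate_control_mode(control_mode_str):
    """Return the mode unchanged if valid, otherwise raise listing every accepted value."""
    if control_mode_str not in VALID_CONTROL_MODES:
        accepted = ", ".join(repr(mode) for mode in VALID_CONTROL_MODES)
        raise ValueError(f"control_mode must be one of {accepted}. Got: {control_mode_str}")
    return control_mode_str


try:
    validate_control_mode("control_everything")
    error_message = ""
except ValueError as exc:
    error_message = str(exc)
```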

2 participants