Append trajectory row on cache hits #888
Conversation
When `CloudAIGymEnv.step()` returns a previously-evaluated result from the trajectory cache, the row was not written to `trajectory.csv` or appended to `self.current_trajectory`. The agent observed `agent_steps` interactions, but only the executed (non-cached) ones appeared in the visible trajectory, breaking step counts, downstream reports, and any analysis that assumes one row per agent step. Write a `TrajectoryEntry` for cached hits using the current step index. The cached step's reward and observation are reused, and a log entry indicates the source step the cached result came from. Add a unit test that primes the in-memory cache, runs `step()`, and asserts a new row appears in both `env.trajectory[iteration]` and `trajectory.csv`.
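The fix described above can be sketched in isolation. This is a hypothetical, self-contained model of the cache-hit branch, not the actual CloudAI implementation: the `TrajectoryEntry` fields, the cache layout, and the `_execute` helper are all assumptions for illustration.

```python
# Sketch of "append a trajectory row even on cache hits".
# All names here (TrajectoryEntry fields, cache shape, _execute) are
# illustrative assumptions, not the real CloudAIGymEnv internals.
from dataclasses import dataclass
import logging


@dataclass
class TrajectoryEntry:
    step: int            # index of the *current* agent step, not the cached one
    action: tuple
    reward: float
    observation: list


class TrajectorySketch:
    def __init__(self) -> None:
        self.cache: dict[tuple, TrajectoryEntry] = {}   # action -> first evaluation
        self.current_trajectory: list[TrajectoryEntry] = []
        self.step_index = 0

    def step(self, action: tuple):
        cached = self.cache.get(action)
        if cached is not None:
            # Reuse the cached reward/observation, but still record a row for
            # the current step so the trajectory stays one-row-per-agent-step.
            logging.info("Cache hit: reusing result from step %d", cached.step)
            entry = TrajectoryEntry(self.step_index, action, cached.reward, cached.observation)
        else:
            reward, observation = self._execute(action)  # the expensive run
            entry = TrajectoryEntry(self.step_index, action, reward, observation)
            self.cache[action] = entry
        self.current_trajectory.append(entry)
        self.step_index += 1
        return entry.observation, entry.reward

    def _execute(self, action: tuple):
        # Stand-in for the real workload execution.
        return float(sum(action)), list(action)
```

With this shape, stepping twice on the same action yields two trajectory rows with distinct step indices but identical reward/observation, which is exactly the invariant the unit test checks.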
@alexmanle FYI
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/test_cloudaigym.py (1)
397-423: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Assert that cache hits truly short-circuit execution.

This test verifies row appending, but it should also assert `runner.run()` is not invoked on cache hits to lock in the "skip execution" contract.

Suggested test hardening:

```diff
 def test_cached_step_appends_trajectory_row(
     nemorun: NeMoRunTestDefinition, tmp_path: Path
 ) -> None:
@@
     runner = MagicMock(spec=BaseRunner)
     runner.scenario_root = tmp_path / "scenario"
     runner.system = MagicMock()
+    runner.run = MagicMock()
@@
     env.test_run.step = 5
     obs, reward, done, _info = env.step(cached_action)
+    runner.run.assert_not_called()
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_cloudaigym.py` around lines 397 - 423, The test is missing an assertion that cached actions skip execution; update the test for CloudAIGymEnv so that after creating the MagicMock runner and populating env.trajectory with the cached_action, you assert runner.run was not called when invoking env.step(cached_action) (use runner.run.assert_not_called() or equivalent) to ensure the cache hit short-circuits execution; keep existing assertions about reward/obs/trajectory/csv intact and place the runner.run assertion right after env.step(cached_action) and before any further state checks.
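The `assert_not_called()` pattern the review suggests can be shown standalone. This is a minimal illustration of the short-circuit contract with a stand-in `step` function and a plain dict cache; it is not the actual CloudAIGymEnv test, and the cache key/value shapes are assumptions.

```python
# Minimal illustration of CodeRabbit's suggested hardening: on a cache hit,
# the runner must never be invoked. The env here is a stand-in dict, not
# CloudAIGymEnv; only the MagicMock/assert_not_called pattern is the point.
from unittest.mock import MagicMock


def step(env, runner, action):
    if action in env["cache"]:           # cache hit: short-circuit execution
        return env["cache"][action]
    result = runner.run(action)          # only uncached actions reach the runner
    env["cache"][action] = result
    return result


runner = MagicMock()
env = {"cache": {(1, 2): ("obs", 0.5)}}  # pre-primed cache, as in the test

result = step(env, runner, (1, 2))
runner.run.assert_not_called()           # locks in the "skip execution" contract
```

Placing `assert_not_called()` immediately after the `step()` call, before any further state checks, is what pins the contract: a later regression that re-runs cached actions fails the test at the exact line that documents the guarantee.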
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: 301fa07a-ed11-4f7d-b249-bfeff73f44f6
📒 Files selected for processing (2)
src/cloudai/configurator/cloudai_gym.py, tests/test_cloudaigym.py
…e short-circuit
- tests/test_cloudaigym.py: collapse the `test_cached_step_appends_trajectory_row` signature onto one line to satisfy `ruff format`; add `runner.run.assert_not_called()` to lock in the cache short-circuit contract (CodeRabbit suggestion).
- tests/test_handlers.py: `test_dse_run_cache` previously asserted that the cached step (step 2) was absent from `trajectory.csv`. That codified the buggy behavior this PR fixes; expect three rows now, one per agent step.
@coderabbitai resolved done - af2e9c4 Addressed in af2e9c4 — added
Summary
`CloudAIGymEnv.step()` short-circuits when the requested action has already been evaluated, returning the cached reward/observation. Currently this path does not append a row to `trajectory.csv` or to `self.current_trajectory`, so the visible trajectory contains only the executed (non-cached) steps even though the agent observed `agent_steps` interactions.

Downstream consequences:
- Step counts in `analysis.json` diverge from the agent's actual interaction count.
- Any analysis that assumes one row per `step()` call breaks.
- `trajectory.csv` is missing transitions.

This PR writes a `TrajectoryEntry` for cache hits using the current step index, reusing the cached reward and observation. The log line is also expanded to record the originating step the cached result was taken from.

Changes
- src/cloudai/configurator/cloudai_gym.py: append a `TrajectoryEntry` in the cache-hit branch of `step()`; log the source step.
- tests/test_cloudaigym.py: add `test_cached_step_appends_trajectory_row`, which primes `env.trajectory`, invokes `step()`, and asserts a new row appears in both the in-memory trajectory and `trajectory.csv`.

Test plan

- `pytest tests/test_cloudaigym.py` — 27 passed locally.
- `ruff check` on the touched files — clean.
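The "new row appears in `trajectory.csv`" assertion can be expressed as a small post-run check. This is a sketch only: the `step` column name and file layout are assumptions for illustration, not the actual CSV schema written by CloudAI.

```python
# Sketch of a post-run check that trajectory.csv has one row per agent step.
# The "step" column name and CSV layout are illustrative assumptions.
import csv
from pathlib import Path


def assert_one_row_per_step(csv_path: Path, agent_steps: int) -> None:
    with csv_path.open(newline="") as f:
        rows = list(csv.DictReader(f))
    assert len(rows) == agent_steps, f"expected {agent_steps} rows, got {len(rows)}"
    steps = [int(r["step"]) for r in rows]
    assert steps == list(range(agent_steps)), "rows must cover every step index"
```

Under the pre-fix behavior, a cached step would leave a hole in the `step` column and this check would fail; with the fix it passes for any mix of cached and executed steps.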