Skip to content

Append trajectory row on cache hits#888

Open
rutayan-nv wants to merge 2 commits into
NVIDIA:mainfrom
rutayan-nv:rpatro/cache-trajectory-row
Open

Append trajectory row on cache hits#888
rutayan-nv wants to merge 2 commits into
NVIDIA:mainfrom
rutayan-nv:rpatro/cache-trajectory-row

Conversation

@rutayan-nv
Copy link
Copy Markdown
Contributor

Summary

CloudAIGymEnv.step() short-circuits when the requested action has already been evaluated, returning the cached reward/observation. Currently this path does not append a row to trajectory.csv or to self.current_trajectory, so the visible trajectory contains only the executed (non-cached) steps even though the agent observed agent_steps interactions.

Downstream consequences:

  • Step counts in reports / analysis.json diverge from the agent's actual interaction count.
  • The CSV row index is no longer a stable proxy for "agent step", breaking offline analysis tools that assume one row per call to step().
  • Any future offline-RL / behavior-cloning corpus built from trajectory.csv is missing transitions.

This PR writes a TrajectoryEntry for cache hits using the current step index, reusing the cached reward and observation. The log line is also expanded to record the originating step the cached result was taken from.

Changes

  • src/cloudai/configurator/cloudai_gym.py: append a TrajectoryEntry in the cache-hit branch of step(); log the source step.
  • tests/test_cloudaigym.py: add test_cached_step_appends_trajectory_row that primes env.trajectory, invokes step(), and asserts a new row appears in both the in-memory trajectory and trajectory.csv.

Test plan

  • pytest tests/test_cloudaigym.py — 27 passed locally.
  • ruff check on the touched files — clean.
  • CI green.

When `CloudAIGymEnv.step()` returns a previously-evaluated result from the
trajectory cache, the row was not written to `trajectory.csv` or appended to
`self.current_trajectory`. The agent observed `agent_steps` interactions, but
only the executed (non-cached) ones appeared in the visible trajectory,
breaking step counts, downstream reports, and any analysis that assumes one
row per agent step.

Write a `TrajectoryEntry` for cached hits using the current step index. The
cached step's reward and observation are reused, and a log entry indicates the
source step the cached result came from.

Add a unit test that primes the in-memory cache, runs `step()`, and asserts a
new row appears in both `env.trajectory[iteration]` and `trajectory.csv`.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 12, 2026

Review Change Stack
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 8e1a1b3f-5762-4bc7-a3c7-7eeb5c4f932f

📥 Commits

Reviewing files that changed from the base of the PR and between 3b68974 and af2e9c4.

📒 Files selected for processing (2)
  • tests/test_cloudaigym.py
  • tests/test_handlers.py

📝 Walkthrough

Walkthrough

CloudAIGymEnv now records cached trajectory results to disk and in-memory state. On a cache hit the env logs the cached step, constructs a TrajectoryEntry, appends it via write_trajectory(), and returns the cached observation/reward. Tests verify the in-memory trajectory and trajectory.csv include the appended row.

Changes

Trajectory Recording for Cached Steps

Layer / File(s) Summary
Cached trajectory recording implementation
src/cloudai/configurator/cloudai_gym.py
In the cached-result branch of step(), logging now includes the cached step number, and the cached result is recorded via write_trajectory() with a constructed TrajectoryEntry before returning the cached observation and reward.
Cached trajectory recording validation
tests/test_cloudaigym.py, tests/test_handlers.py
Adds test_cached_step_appends_trajectory_row which asserts a cache-hit appends an entry for the current env.test_run.step and that trajectory.csv contains the header and final row; updates test_dse_run_cache expected trajectory rows to include the cached candidate row.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 I hopped through cached steps tonight,
Wrote every stride in rows of light,
A tiny entry, step by name,
So past rewards find lasting fame. ✨

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Append trajectory row on cache hits' directly and clearly summarizes the main change: appending trajectory rows when cached action results are reused.
Description check ✅ Passed The description is well-structured and directly related to the changeset, explaining the problem (missing trajectory rows on cache hits), the solution (appending TrajectoryEntry), and the files modified.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@rutayan-nv
Copy link
Copy Markdown
Contributor Author

@alexmanle FYI

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/test_cloudaigym.py (1)

397-423: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Assert that cache hits truly short-circuit execution.

This test verifies row appending, but it should also assert runner.run() is not invoked on cache hits to lock in the “skip execution” contract.

Suggested test hardening
 def test_cached_step_appends_trajectory_row(
     nemorun: NeMoRunTestDefinition, tmp_path: Path
 ) -> None:
@@
     runner = MagicMock(spec=BaseRunner)
     runner.scenario_root = tmp_path / "scenario"
     runner.system = MagicMock()
+    runner.run = MagicMock()
@@
     env.test_run.step = 5
     obs, reward, done, _info = env.step(cached_action)
+    runner.run.assert_not_called()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_cloudaigym.py` around lines 397 - 423, The test is missing an
assertion that cached actions skip execution; update the test for CloudAIGymEnv
so that after creating the MagicMock runner and populating env.trajectory with
the cached_action, you assert runner.run was not called when invoking
env.step(cached_action) (use runner.run.assert_not_called() or equivalent) to
ensure the cache hit short-circuits execution; keep existing assertions about
reward/obs/trajectory/csv intact and place the runner.run assertion right after
env.step(cached_action) and before any further state checks.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@tests/test_cloudaigym.py`:
- Around line 397-423: The test is missing an assertion that cached actions skip
execution; update the test for CloudAIGymEnv so that after creating the
MagicMock runner and populating env.trajectory with the cached_action, you
assert runner.run was not called when invoking env.step(cached_action) (use
runner.run.assert_not_called() or equivalent) to ensure the cache hit
short-circuits execution; keep existing assertions about
reward/obs/trajectory/csv intact and place the runner.run assertion right after
env.step(cached_action) and before any further state checks.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 301fa07a-ed11-4f7d-b249-bfeff73f44f6

📥 Commits

Reviewing files that changed from the base of the PR and between a44c791 and 3b68974.

📒 Files selected for processing (2)
  • src/cloudai/configurator/cloudai_gym.py
  • tests/test_cloudaigym.py

…e short-circuit

- tests/test_cloudaigym.py: collapse `test_cached_step_appends_trajectory_row`
  signature onto one line to satisfy `ruff format`; add
  `runner.run.assert_not_called()` to lock in the cache short-circuit contract
  (CodeRabbit suggestion).
- tests/test_handlers.py: `test_dse_run_cache` previously asserted that the
  cached step (step 2) was absent from `trajectory.csv`. That codified the
  buggy behavior this PR fixes. Expect three rows now, one per agent step.
@rutayan-nv
Copy link
Copy Markdown
Contributor Author

@coderabbitai resolved

done - af2e9c4

Addressed in af2e9c4 — added runner.run.assert_not_called() immediately after env.step(cached_action) in test_cached_step_appends_trajectory_row, before the remaining reward / observation / trajectory / csv assertions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant