rsz: update failing test golden on PR 10248#10284
rsz: update failing test golden on PR 10248#10284openroad-ci wants to merge 18 commits intoThe-OpenROAD-Project:masterfrom
Conversation
Adds a deterministic WNS-stagnation gate to RepairSetup::terminateProgress(). Best-so-far WNS is sampled every pass into a 200-pass ring buffer; if the best observed value has not improved by max(1 ps, 0.5% * |initial_wns|) over a full window, the gate returns true. Combined with the existing two-consecutive-termination rule this aborts the phase after ~1200 passes of no WNS movement on designs where tiny TNS twitches would otherwise keep the optimizer running forever. Motivated by automated architectural-exploration flows where the .sdc is held at an ambitious target while RTL evolves. Without this, a WNS gap of hundreds of ps grinds repair_timing for hours producing no useful movement and forcing users to guess SETUP_SLACK_MARGIN values to stop it. The gate fires only when no reasonable user would disagree that further effort is futile (1 ps absolute floor keeps tape-out grind untouched; the TNS fix-rate gate still owns termination near closure). On trip, a single loud INFO log (RSZ-0234/235/236) names the best-effort WNS and notes this is probably an exploration run. No new Tcl flag, no new ORFS env var - hardcoded conservative defaults. New test test/orfs/hopeless/ synthesizes 8 parallel 8-deep arithmetic pipelines on asap7 with a 50 ps clock and asserts both that the flow completes and that the gate log message appears (sh_test + grep). A check_same idempotent test guards the determinism property. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
- RepairSetup.hh: include <cstddef> for size_t and <string> for std::string (include-cleaner warnings from clang-tidy CI bot). - RepairSetup.hh / RepairSetup.cc: name the stagnation-gate warmup threshold wns_stagnation_warmup_iterations_ instead of the bare 1000 literal, matching the style of the other tunables in this class. Not applied: gemini-code-assist's suggestion to change wns_stagnation_abs_tol_ from 1.0e-12f to 1.0e-3f. sta::Slack is measured in seconds (see src/sta/include/sta/Units.hh:73 "Sta internal units are always seconds, ..."), so 1.0e-12 s is 1 ps, which is the intended value. 1.0e-3 s would be 1 ms and would disable the gate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
Wrap long sta::Slack initialization line that clang-format flagged on PR The-OpenROAD-Project#10248 after the gemini-code-assist suggestion was applied via the GitHub UI. Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
The sliding-window check (window_best - window_oldest over 200 passes) mis-classified plateaus as futility: any real design whose WNS drops dramatically early then flat-lines at a topology-bound floor while TNS keeps improving (aes, clone_flat, repair_fanout*) tripped the gate and aborted repair_timing too early, leaving worse max_cap/max_slew slack and different .ok-file output. Replace with best_wns_ever vs initial_wns_: only fire when WNS has effectively never moved from its starting value, which is the real signature of an obviously-futile run. Also bump rel_tol 0.5% -> 5%. Empirically aes improves WNS by ~50% of initial, clone_flat by ~95%, repair_fanout by ~90%; the hopeless.v synthetic only moves WNS by ~2% because buffer insertion chips something off even a grossly-over-clocked design. 5% sits in the gap. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
Tiny TCL unit test exercising the WNS-stagnation gate added in this PR without scraping log strings. It builds a flat block of 1200 register-to-register pairs (DFF_X2 driver -> DFF_X2 sink, no logic in between), holds the cell library to _X1 / _X2 sizes, and sets a 1 ps clock period so that no SizeUpMove / BufferMove / pin-swap / clone / split-load / unbuffer move can improve WNS. The endpoint count is chosen so the inner-loop counter passes the gate's warmup before the legacy phase has visited every endpoint. Without the gate the same .tcl runs to iteration 1200 (one futile pass per endpoint); with the gate it exits at iteration 1002 and the .ok diverges, so the test fails without the fix and passes with it. Replaces the orfs-based test/orfs/hopeless gate-coverage check that maliberty asked to convert into a repair_setup-only unit test. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com> # Conflicts: # src/rsz/test/CMakeLists.txt
Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
|
clang-tidy review says "All clean, LGTM! 👍" |
There was a problem hiding this comment.
Code Review
This pull request introduces a 'WNS-stagnation' gate to the repair_timing flow in RepairSetup. This mechanism detects and aborts 'obviously futile' optimization runs where the Worst Negative Slack (WNS) fails to improve significantly after a warmup period, preventing the tool from grinding indefinitely on infeasible designs. The changes include new tracking methods in RepairSetup, updated logging, and a new regression test suite to verify the gate's behavior. I have no further feedback to provide.
This reverts commit 1dfe516. Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
…ate/OpenROAD into secure-pr-10248-add-unit-test Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
|
clang-tidy review says "All clean, LGTM! 👍" |
Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
|
clang-tidy review says "All clean, LGTM! 👍" |
The window-improvement and threshold values are pure functions of already-stored state (best_wns_, ring buffer, initial_wns_). Compute them locally in terminateProgress() and recompute on demand in wnsStagnationReport() instead of mirroring them on the class. Removes 2 members and the "result-cached-for-display" smell. Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
|
clang-tidy review says "All clean, LGTM! 👍" |
|
@oharboe 1. WNS impact
2. TNS degradation
|
Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
…-unit-test Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com> # Conflicts: # src/rsz/test/CMakeLists.txt
|
clang-tidy review says "All clean, LGTM! 👍" |
|
@oharboe I enabled your WNS stagnation checker for |
Yes, I only have this problem when WNS < 0. |
|
@oharboe
I want to clarify these things. B. If you care the TNS too, then the TNS monitoring logic enhancement is required. So I wonder if you are ok w/ |
|
When it is clearly hopeless to close timing(WNS=0), then I want the quick wins in WNS and TNS and then terminate. Terminating quickly is more important than the exact improvements in WNS and TNS, we just want to fix pathological fanout things that are quick to fix(such as pathological fanout?). |
|
I see. If |
Do I have to configure something or change options? The change I made originally would do this without configuration, which is important for the automated design space exploration and I think a better user experience overall. |
This PR is based on the latest head of the original public PR:
The only additional change on top of that PR is the failing-test rebaseline commit:
This updates only src/rsz/test/repair_hold_multi_output_load.ok for the current timing output. The previously local repair_setup_stagnation test commit was intentionally excluded because PR #10248 already added a similar repair_setup_hopeless unit test.
Verification:
bazel test --cache_test_results=no --test_output=errors //src/rsz/test:repair_hold_multi_output_load-tcl_testResult: PASSED