
rsz: update failing test golden on PR 10248#10284

Open
openroad-ci wants to merge 18 commits into The-OpenROAD-Project:master from The-OpenROAD-Project-staging:secure-pr-10248-add-unit-test

Conversation

@openroad-ci
Collaborator

This PR is based on the latest head of the original public PR:

The only additional change on top of that PR is the failing-test rebaseline commit:

  • 1dfe516 rsz: update repair hold multi-output golden

This updates only src/rsz/test/repair_hold_multi_output_load.ok for the current timing output. The previously local repair_setup_stagnation test commit was intentionally excluded because PR #10248 already added a similar repair_setup_hopeless unit test.

Verification:

bazel test --cache_test_results=no --test_output=errors //src/rsz/test:repair_hold_multi_output_load-tcl_test

Result: PASSED

oharboe and others added 9 commits April 24, 2026 10:31
Adds a deterministic WNS-stagnation gate to RepairSetup::terminateProgress().
Best-so-far WNS is sampled every pass into a 200-pass ring buffer; if the
best observed value has not improved by max(1 ps, 0.5% * |initial_wns|)
over a full window, the gate returns true. Combined with the existing
two-consecutive-termination rule this aborts the phase after ~1200 passes
of no WNS movement on designs where tiny TNS twitches would otherwise keep
the optimizer running forever.
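
The gate described in this commit message can be sketched as a small standalone model (illustrative only; the class name, member names, and constants mirror the description above, not the actual RepairSetup source):

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Minimal model of the sliding-window WNS-stagnation gate.
// Slack values are in seconds, matching STA's internal units.
class StagnationGate
{
 public:
  explicit StagnationGate(float initial_wns) : initial_wns_(initial_wns) {}

  // Record the best-so-far WNS for this pass. Returns true once a full
  // 200-pass window shows no improvement beyond the tolerance.
  bool sample(float best_wns)
  {
    window_[count_ % kWindow] = best_wns;
    ++count_;
    if (count_ < kWarmup + kWindow) {
      return false;  // still warming up
    }
    // The slot we would overwrite next holds the value from 200 passes ago.
    const float oldest = window_[count_ % kWindow];
    // Tolerance: max(1 ps, 0.5% of |initial WNS|); 1e-12 s is 1 ps.
    const float tol = std::max(1.0e-12f, 0.005f * std::abs(initial_wns_));
    // WNS is negative; improvement moves it upward toward zero.
    return (best_wns - oldest) < tol;
  }

 private:
  static constexpr std::size_t kWindow = 200;   // ring-buffer length (passes)
  static constexpr std::size_t kWarmup = 1000;  // passes before gate may fire
  std::array<float, kWindow> window_{};
  std::size_t count_ = 0;
  float initial_wns_;
};
```

With an initial WNS of -1 ns, feeding the same value for 1200 passes trips the gate, while a steady improvement of 1 ps per pass keeps it quiet.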

Motivated by automated architectural-exploration flows where the .sdc is
held at an ambitious target while RTL evolves. Without this, a WNS gap of
hundreds of ps grinds repair_timing for hours producing no useful movement
and forcing users to guess SETUP_SLACK_MARGIN values to stop it. The gate
fires only when no reasonable user would disagree that further effort is
futile (1 ps absolute floor keeps tape-out grind untouched; the TNS
fix-rate gate still owns termination near closure).

On trip, a single loud INFO log (RSZ-0234/235/236) names the best-effort
WNS and notes this is probably an exploration run. No new Tcl flag, no
new ORFS env var - hardcoded conservative defaults.

New test test/orfs/hopeless/ synthesizes 8 parallel 8-deep arithmetic
pipelines on asap7 with a 50 ps clock and asserts both that the flow
completes and that the gate log message appears (sh_test + grep). A
check_same idempotent test guards the determinism property.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
- RepairSetup.hh: include <cstddef> for size_t and <string> for
  std::string (include-cleaner warnings from clang-tidy CI bot).
- RepairSetup.hh / RepairSetup.cc: name the stagnation-gate warmup
  threshold wns_stagnation_warmup_iterations_ instead of the bare 1000
  literal, matching the style of the other tunables in this class.

Not applied: gemini-code-assist's suggestion to change
wns_stagnation_abs_tol_ from 1.0e-12f to 1.0e-3f. sta::Slack is
measured in seconds (see src/sta/include/sta/Units.hh:73 "Sta internal
units are always seconds, ..."), so 1.0e-12 s is 1 ps, which is the
intended value. 1.0e-3 s would be 1 ms and would disable the gate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
Wrap long sta::Slack initialization line that clang-format flagged on
PR The-OpenROAD-Project#10248 after the gemini-code-assist suggestion was applied via the
GitHub UI.

Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
The sliding-window check (window_best - window_oldest over 200 passes)
mis-classified plateaus as futility: any real design whose WNS drops
dramatically early then flat-lines at a topology-bound floor while TNS
keeps improving (aes, clone_flat, repair_fanout*) tripped the gate and
aborted repair_timing too early, leaving worse max_cap/max_slew slack
and different .ok-file output.

Replace with best_wns_ever vs initial_wns_: only fire when WNS has
effectively never moved from its starting value, which is the real
signature of an obviously-futile run.

Also bump rel_tol 0.5% -> 5%. Empirically, aes improves WNS by ~50% of
initial, clone_flat by ~95%, and repair_fanout* by ~90%; the hopeless.v
synthetic only moves WNS by ~2%, because buffer insertion chips
something off even on a grossly over-clocked design. 5% sits in the gap.
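
The revised check described in this commit might look roughly like the sketch below (illustrative; the struct and member names follow the commit-message wording, not the exact diff):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Sketch of the revised check: compare the best WNS ever seen against the
// initial WNS instead of a sliding window, so a design that improves early
// and then plateaus never trips the gate. Slack values are in seconds.
struct WnsStagnationCheck
{
  float initial_wns;
  float best_wns_ever;

  // Track the best (largest, i.e. closest to zero) WNS observed so far.
  void observe(float wns) { best_wns_ever = std::max(best_wns_ever, wns); }

  // Fires only when WNS has effectively never moved from its start:
  // tolerance is max(1 ps, 5% of |initial WNS|) per the rel_tol bump.
  bool stagnant() const
  {
    const float tol = std::max(1.0e-12f, 0.05f * std::abs(initial_wns));
    return (best_wns_ever - initial_wns) < tol;
  }
};
```

A run that has improved WNS by only 1% of a -1 ns initial value reads as stagnant; a 10% improvement does not.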

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
Tiny TCL unit test exercising the WNS-stagnation gate added in this PR
without scraping log strings.  It builds a flat block of 1200
register-to-register pairs (DFF_X2 driver -> DFF_X2 sink, no logic in
between), holds the cell library to _X1 / _X2 sizes, and sets a 1 ps
clock period so that no SizeUpMove / BufferMove / pin-swap / clone /
split-load / unbuffer move can improve WNS.  The endpoint count is
chosen so the inner-loop counter passes the gate's warmup before the
legacy phase has visited every endpoint.

Without the gate the same .tcl runs to iteration 1200 (one futile pass
per endpoint); with the gate it exits at iteration 1002 and the .ok
diverges, so the test fails without the fix and passes with it.

Replaces the orfs-based test/orfs/hopeless gate-coverage check that
maliberty asked to convert into a repair_setup-only unit test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>

# Conflicts:
#	src/rsz/test/CMakeLists.txt
Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
@github-actions
Contributor

clang-tidy review says "All clean, LGTM! 👍"

Contributor

@gemini-code-assist (bot) left a comment


Code Review

This pull request introduces a 'WNS-stagnation' gate to the repair_timing flow in RepairSetup. This mechanism detects and aborts 'obviously futile' optimization runs where the Worst Negative Slack (WNS) fails to improve significantly after a warmup period, preventing the tool from grinding indefinitely on infeasible designs. The changes include new tracking methods in RepairSetup, updated logging, and a new regression test suite to verify the gate's behavior. I have no further feedback to provide.

This reverts commit 1dfe516.

Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
…ate/OpenROAD into secure-pr-10248-add-unit-test

Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
@github-actions
Contributor

clang-tidy review says "All clean, LGTM! 👍"

Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
@github-actions
Contributor

clang-tidy review says "All clean, LGTM! 👍"

The window-improvement and threshold values are pure functions of
already-stored state (best_wns_, ring buffer, initial_wns_). Compute
them locally in terminateProgress() and recompute on demand in
wnsStagnationReport() instead of mirroring them on the class.

Removes 2 members and the "result-cached-for-display" smell.

Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
@github-actions
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@jhkim-pii
Contributor

jhkim-pii commented Apr 29, 2026

@oharboe
There are two issues.

1. WNS impact

  • The current WNS-stagnation threshold of 200 passes affects setup WNS (some designs showed better WNS while others showed worse WNS in the dashboard).
  • Is the 200-pass threshold insufficient? Needed more investigation.
    --> Found the reason:
    • Repair setup is used at multiple stages (floorplan, placement, CTS, ...).
    • A different optimization result in an early stage (e.g., floorplan) leads to different placement and a different finish QoR.

2. TNS degradation

  • The WNS stagnation checker does not consider TNS improvement, so it causes TNS degradation.
  • I think the WNS stagnation checker should not be enabled for the LEGACY and LAST_GASP phases because it hurts TNS.

Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
…-unit-test

Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>

# Conflicts:
#	src/rsz/test/CMakeLists.txt
@github-actions
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@jhkim-pii
Contributor

jhkim-pii commented Apr 29, 2026

@oharboe I enabled your WNS stagnation checker for the WNS phase only. If you do not care about TNS, you can run repair_timing with the WNS-phase configuration. If that does not satisfy your needs, please share your thoughts.

@oharboe
Collaborator

oharboe commented Apr 29, 2026

@oharboe I enabled your WNS stagnation checker for the WNS phase only. If you do not care about TNS, you can run repair_timing with the WNS-phase configuration. If that does not satisfy your needs, please share your thoughts.

Yes, I only have this problem when WNS < 0.

@jhkim-pii
Contributor

@oharboe
It looks like my explanation was insufficient.

  1. The current WNS stagnation checker only monitors WNS. It terminates the optimization if there is less than 0.5% WNS improvement over 200 iterations.
  2. Originally, the WNS stagnation checker was enabled for the LEGACY, LAST_GASP, and WNS phases.
  3. The LEGACY & LAST_GASP phases should improve both WNS & TNS.
  4. But early termination driven by WNS monitoring can hurt TNS in some designs. Some designs still have room to improve TNS further, but the WNS stagnation checker eliminates that TNS-improvement opportunity.
  5. So I applied the WNS stagnation checker only to the WNS phase, which means it is disabled under the LEGACY & LAST_GASP policies.
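
The phase restriction in point 5 amounts to a one-line predicate (a sketch; the enum and function names here are assumptions drawn from the discussion, not the actual repair_setup source):

```cpp
#include <cassert>

// Hypothetical phase names as discussed in this thread.
enum class Phase { LEGACY, WNS, LAST_GASP };

// The stagnation check runs only in the WNS phase, so the LEGACY and
// LAST_GASP phases can keep working on TNS.
bool stagnationGateEnabled(Phase phase)
{
  return phase == Phase::WNS;
}
```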

I want to clarify these things.
A. In your experiment, are you interested in WNS only (i.e., you do not care how bad the TNS is)?
If so, I think you can use the WNS phase instead of LEGACY to ignore TNS.

B. If you care about TNS too, then the TNS-monitoring logic needs enhancement.

So I wonder if you are OK with the WNS phase (repair_timing -setup -phases "WNS"). If so, the current version in this PR would be sufficient.

@oharboe
Collaborator

oharboe commented Apr 30, 2026

When it is clearly hopeless to close timing (WNS = 0), then I want the quick wins in WNS and TNS and then to terminate. Terminating quickly is more important than the exact improvements in WNS and TNS; we just want to fix things that are quick to fix (such as pathological fanout).

@jhkim-pii
Contributor

I see. If repair_timing -setup -phases "WNS" with this PR cannot resolve your issue, please let us know.
We can find another solution (e.g., different TNS-stagnation checking logic, or multi-threading for faster runtime at the expense of a small QoR loss, ...).

@oharboe
Collaborator

oharboe commented May 1, 2026

I see. If repair_timing -setup -phases "WNS" with this PR cannot resolve your issue, please let us know. We can find another solution (e.g., different TNS-stagnation checking logic, or multi-threading for faster runtime at the expense of a small QoR loss, ...).

Do I have to configure something or change options?

The change I made originally would do this without configuration, which is important for automated design-space exploration and, I think, a better user experience overall.

