Skip to content

[CTS] Fix cloned ICG location collision in mapLocationToSink_#9864

Open
openroad-ci wants to merge 4 commits intoThe-OpenROAD-Project:masterfrom
The-OpenROAD-Project-staging:secure-fix-cloned-icg-loc-coll
Open

[CTS] Fix cloned ICG location collision in mapLocationToSink_#9864
openroad-ci wants to merge 4 commits intoThe-OpenROAD-Project:masterfrom
The-OpenROAD-Project-staging:secure-fix-cloned-icg-loc-coll

Conversation

@openroad-ci
Copy link
Collaborator

Summary

  • When cloneClockGaters() places multiple ICG clones at the same sinksBbox center, the single-valued mapLocationToSink_ map in HTreeBuilder silently overwrites earlier entries, disconnecting ICGs from the clock tree.
  • Add resolveLocationCollision() to ensure each clone gets a unique position by tracking all occupied instance positions and shifting colliding clones by site width.
  • Physical overlap is resolved by the subsequent detailed_placement step in the ORFS flow.

Test plan

  • New colocated_sinks regression test that reproduces the bug with two ICGs whose sink bounding boxes share the same center
  • Full CTS regression (52/52 pass)
  • Full OpenROAD regression (all pass, 0 failures)
  • CI

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request addresses a bug where multiple ICG clones placed at the same location would cause an entry in mapLocationToSink_ to be overwritten, effectively disconnecting one of the ICGs. The fix introduces a collision resolution mechanism that tracks occupied positions and shifts new clones to a unique location if a collision is detected. The changes look good and are supported by a new regression test. I have a couple of suggestions: one to improve performance by avoiding redundant work, and another to prevent a potential infinite loop in the collision resolution logic.

Comment on lines +1201 to +1204
int shift = 1;
if (!block_->getRows().empty()) {
shift = block_->getRows().begin()->getSite()->getWidth();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

There's a potential for an infinite loop here. If block_->getRows() is not empty but block_->getRows().begin()->getSite()->getWidth() returns 0, shift will be set to 0. If a location collision occurs, the while loop on line 1205 will never terminate because cloneLoc will not be updated.

To make this more robust, you should ensure shift is always a positive value.

  int shift = 1;
  if (!block_->getRows().empty()) {
    const int site_width = block_->getRows().begin()->getSite()->getWidth();
    if (site_width > 0) {
      shift = site_width;
    }
  }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved by removing the use of getWidth().

Comment on lines +858 to +870
void TritonCTS::cloneClockGaters(odb::dbNet* clkNet)
{
// Seed with all existing instance positions to prevent clones from
// landing on a pre-existing cell and causing mapLocationToSink_
// key collision in HTreeBuilder.
std::set<odb::Point> occupiedPositions;
for (odb::dbInst* inst : block_->getInsts()) {
int x, y;
inst->getLocation(x, y);
occupiedPositions.emplace(x, y);
}
cloneClockGaters(clkNet, occupiedPositions);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This function iterates over all instances in the block to populate occupiedPositions. Since cloneClockGaters is called for each clock net within populateTritonCTS, this can lead to significant performance degradation on designs with many clock nets and instances, as the instance list is traversed repeatedly.

To optimize this, I recommend creating the occupiedPositions set once in populateTritonCTS before the clock net loop and passing it to the overloaded cloneClockGaters function. This would make the population of existing instance locations a one-time operation.

References
  1. When analyzing code within a loop, consider the entire loop structure. Operations at the beginning of the loop can affect the logic within the loop body, potentially making similar calls inside the loop redundant.

@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

1 similar comment
@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@jhkim-pii
Copy link
Contributor

Fixes #9816

Failure Sequence

  1. cloneClockGaters() / findLongEdges() — Two or more ICG clones are placed at the same sinksBbox.center() because their sinks are arranged symmetrically around an identical center point.
  2. HTreeBuilder::preSinkClustering() — Populates mapLocationToSink_[normLoc] = &sink. Since the map is single-valued (std::map<Point<double>, ClockInst*>), the second ICG at the same position silently overwrites the first entry.
  3. HTreeBuilder::createClockSubNets() — Looks up sinks via mapLocationToSink_[loc]. The overwritten ICG is never found, so it is not connected to any CTS buffer driver. Its clock input pin remains disconnected.
  4. Kepler Formal (LEC) — Detects the unconnected clock input pin (e.g., .A2 on AND-based gating cells in mempool_group) as a connectivity mismatch between RTL and netlist.
// Connected clone (correct)
AND2_X1 \clone_11_..._11727_
    (.A1(\..._WORD_ITER[2].CG_Inst.clk_en ),
     .A2(\clknet_leaf_164_..._06692_ ),
     .ZN(\..._WORD_ITER[2].CG_Inst.clk_o ));

// Disconnected clone (BUG)
AND2_X1 \clone_10_..._11727_
    (.A1(\..._WORD_ITER[2].CG_Inst.clk_en ),
     .ZN(\..._WORD_ITER[2].CG_Inst.clk_o ));
// NOTE: .A2 missing — clock input is floating

Root Cause

cloneClockGaters() in TritonCTS.cpp clones ICG (Integrated Clock Gating) cells and places each clone at the center of its sink bounding box (sinksBbox.center()). When two or more ICGs drive sinks arranged symmetrically around the same center point, all clones are placed at the identical location.

Later, HTreeBuilder::preSinkClustering() populates mapLocationToSink_ — a std::map<Point<double>, ClockInst*> — keyed by each sink's position. Since this is a single-valued map, when two ICG clones share the same position, the second insert silently overwrites the first. The overwritten ICG is then lost from the clock tree: its clock input pin is never connected to a CTS buffer driver, leaving it disconnected.

This manifests as missing clock connections on ICG clock input pins in designs with symmetrically placed gated clock sinks (e.g., mempool_group).

Solution

  1. resolveLocationCollision() — a new private method in TritonCTS (named to match HTreeBuilder::resolveLocationCollision() which solves the same problem for CTS buffers). After placing a clone at sinksBbox.center(), this method checks an occupiedPositions set and shifts the clone by one site width until a unique position is found.

  2. occupiedPositions set — seeded in the cloneClockGaters() wrapper with all existing instance positions in the block, then passed by reference through the recursive cloning. This prevents collisions against both other clones and pre-existing placed cells.

  3. No DPL dependency — physical legalization (site alignment, overlap removal) is handled by the detailed_placement call that follows CTS in the ORFS flow (cts.tcl line 49). The fix only ensures unique positions for mapLocationToSink_ correctness.

When cloneClockGaters() places multiple ICG clones at the same
sinksBbox center, the single-valued mapLocationToSink_ map in
HTreeBuilder silently overwrites earlier entries, disconnecting
ICGs from the clock tree.

Add resolveLocationCollision() to ensure each clone gets a unique
position by tracking all occupied instance positions and shifting
colliding clones by site width. Physical overlap is resolved by
the subsequent detailed_placement step in the ORFS flow.

Add colocated_sinks regression test that reproduces the bug with
two ICGs whose sink bounding boxes share the same center.

Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
Avoid re-iterating all block instances per clock net by seeding
the occupiedPositions set once in populateTritonCTS() before the
clock net loop, instead of inside the per-net wrapper.

Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
@openroad-ci openroad-ci force-pushed the secure-fix-cloned-icg-loc-coll branch from 4c1ab98 to 6be2024 Compare March 21, 2026 09:02
@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@jhkim-pii
Copy link
Contributor

@arthurjolo I found cloned ICG cell location collision, which is similar to the previous issue #9313 .
Could you please review this PR?

@jhkim-pii
Copy link
Contributor

@arthurjolo I found this during the debugging.

[ Current: Allow cell overlaps during CTS + DPL ]

  • Cloned ICGs and new clock buffers are placed during clock_tree_synthesis.- But they can have cell overlap during clock_tree_synthesis.- After clock_tree_synthesis, detailed_placement is called to resolve cell overlaps.- ISSUE: The cell displacement after detailed_placement can increase clock skew.

[ Proposed: Just-in-time legalization when a new cell is inserted during CTS ]

  • If incremental legalization is executed whenever a new cell is inserted during clock_tree_synthesis, no detailed_placement command call is required and there will be no clock skew degradation by detailed_placement.
  • Does this make sense?
  • Or, is there a difficulty in performing the incremental legalization right after inserting a new cell during CTS?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants