Skip to content

Optimize Valid, Decoupled, and Irrevocable by using () => gen#5205

Open
jackkoenig wants to merge 2 commits intomainfrom
jackkoenig/faster-valid-and-decoupled
Open

Optimize Valid, Decoupled, and Irrevocable by using () => gen#5205
jackkoenig wants to merge 2 commits intomainfrom
jackkoenig/faster-valid-and-decoupled

Conversation

@jackkoenig
Copy link
Contributor

@jackkoenig jackkoenig commented Feb 11, 2026

This is an API deprecation but it's effect is a performance improvement. See the release notes for the explanation.

The performance problem being fixed here is a really tough one, it can come up in user-defined types and I don't think there's an easy way to fix it in Chisel for all users. Fixing it for Valid, Decoupled, and Irrevocable (but mainly Valid because users sometimes nest that) helps, and it provides an example of what to do, but the fundamental problem remains. I don't really know how to fix it fundamentally other than by:

  1. Require Aligned/Flipped or Field - then autoclonetype wouldn't need to clone
  2. Change Chisel's type system to return different values for types vs. values
  3. Detect in the compiler plugin if calling clonetype on arguments is necessary--this could be really tricky...
  4. Insert something like "CloneIfNecessary" in the plugin for each field, it'd be a bit of runtime overhead for every single Bundle field (Unless we do something like (3) to try to detect when it's not needed).
  5. Encourage use of gen: () => T pattern rather than gen: T (we could do this in compiler plugin as a warning)

Using this little benchmark that I had an AI generate: https://gist.github.com/jackkoenig/12f6d6bfeee04f00404e4563898957ae

Depth Before (ms) After (ms) Speedup
1 0.017 0.018 0.9x
2 0.144 0.022 6.6x
3 0.249 0.026 9.7x
4 0.249 0.029 8.7x
5 0.319 0.032 9.9x
6 0.601 0.036 16.7x
7 1.155 0.041 28.0x
8 2.363 0.043 54.7x
9 4.782 0.054 88.6x
10 9.573 0.061 157x

Deep nesting comparison (depth 8 vs depth 1):

  • Before: 187x slower at depth 8 (exponential O(2^n) scaling)
  • After: 2.7x slower at depth 8 (linear O(n) scaling)

Contributor Checklist

  • Did you add Scaladoc to every public function/method?
  • Did you add at least one test demonstrating the PR?
  • Did you delete any extraneous printlns/debugging code?
  • Did you specify the type of improvement?
  • Did you add appropriate documentation in docs/src?
  • Did you request a desired merge strategy?
  • Did you add text to be included in the Release Notes for this change?

Type of Improvement

  • API deprecation
  • Performance improvement

Desired Merge Strategy

  • Squash

Release Notes

This involves a change to the type of the default constructor to pass a function instead of just a Data object. This fixes an O(n^2) issue where gen needed to be cloned twice. Once because the compiler plugin has to be conservative and clone, but then second because gen is an eager argument, it is constructed before the Output(...) invocation so Output also has to clone. By using () => gen, the compiler plugin will no longer clone the Data. The function will be called in a way that Output can detect if the result is fresh or not so it will only be cloned once.

Reviewer Checklist (only modified by reviewer)

  • Did you add the appropriate labels? (Select the most appropriate one based on the "Type of Improvement")
  • Did you mark the proper milestone (Bug fix: 3.6.x, 5.x, or 6.x depending on impact, API modification or big change: 7.0)?
  • Did you review?
  • Did you check whether all relevant Contributor checkboxes have been checked?
  • Did you do one of the following when ready to merge:
    • Squash: You/ the contributor Enable auto-merge (squash) and clean up the commit message.
    • Merge: Ensure that contributor has cleaned up their commit history, then merge with Create a merge commit.

This involves a change to the type of the default constructor to pass a
function instead of just a Data object. This fixes an O(n^2) issue where
gen needed to be cloned twice. Once because the compiler plugin has to be
conservative and clone, but then second because gen is an eager
argument, it is constructed before the Output(...) invocation so Output
also has to clone. By using () => gen, the compiler plugin will no
longer clone the Data. The function will be called in a way that Output
can detect if the result is fresh or not so it will only be cloned once.
@jackkoenig jackkoenig added the Deprecation Deprecates an API, will be included in release notes label Feb 11, 2026
@jackkoenig
Copy link
Contributor Author

I'm waiting for some numbers from @trmckay to see how much this helps in practice. I don't love this whack-a-mole style of fixing this problem but that more complete solutions are hard...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Deprecation Deprecates an API, will be included in release notes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant