Conversation
f8ca4e2 to
e55ac11
Compare
Added cach layer Remove redundant calls.
e55ac11 to
04f6520
Compare
| blocktime, | ||
| true, | ||
| c.config.GetUint64(optionNameMinimumGasTipCap), | ||
| blocktime-2, |
There was a problem hiding this comment.
Just to note that current blocktime on our CI is 1s. So this way caching is not used there... How we could achive to use it there?
There was a problem hiding this comment.
In deploy.go blockTime is hardcoded by itself. But if we are talking about CI, normally cacheTTL is about 85% of block time, so even if it is 1 second, than we will get 850 ms
There was a problem hiding this comment.
This is very strange, as block time is defined as const blocktime = 15 on line 16, which is actually a nanosecond. As that is passed in poolingInterval, it looks to me that it is a very short time, I suppose that block time should be in minutes, not nanoseconds. Luckily, this is only for the deploy command. Bee has its own block time passed from options. I believe that line 16 in this file should look like this const blocktime = 15 * time.Minute, which is not related to this PR.
Also what happens if blocktime value is 1 (if someone changes the constant) and we have a negative duration? If the constant is changed to the much longer value like 15*time.Minute, literal 2 in here would reduce only two nanoseconds from that large number.
| t.Helper() | ||
|
|
||
| var expSegments [][]byte | ||
| expSegments := make([][]byte, 0, len(exp)) |
There was a problem hiding this comment.
These are fine but feel like they belong in a separate PR.
There was a problem hiding this comment.
Yes, but linter was failing so I decided to fix everything. Next time will create separate PR
pkg/node/chain.go
Outdated
| pollingInterval time.Duration, | ||
| chainEnabled bool, | ||
| minimumGasTipCap uint64, | ||
| blockCacheTTl time.Duration, |
There was a problem hiding this comment.
Typo blockCacheTTl -> blockCacheTTL
| } | ||
| } | ||
|
|
||
| func NewMetrics() Metrics { |
There was a problem hiding this comment.
Seems not used (dead code).
| c.metrics.LoadErrors.Inc() | ||
| return val, err | ||
| } | ||
| c.Set(val, time.Now()) |
There was a problem hiding this comment.
Should we use here now from argument?
There was a problem hiding this comment.
What could happen if loader takes a while?
There was a problem hiding this comment.
I do not think so because there is difference between time when we request value and time, when we set new value if we had to load it.
|
Suggestion: Instead of a fixed TTL ( How it works:
|
| @@ -0,0 +1,107 @@ | |||
| // Copyright 2025 The Swarm Authors. All rights reserved. | |||
| @@ -0,0 +1,189 @@ | |||
| // Copyright 2025 The Swarm Authors. All rights reserved. | |||
| @@ -0,0 +1,10 @@ | |||
| // Copyright 2025 The Swarm Authors. All rights reserved. | |||
| @@ -0,0 +1,84 @@ | |||
| // Copyright 2025 The Swarm Authors. All rights reserved. | |||
| expiresAt time.Time | ||
|
|
||
| group singleflight.Group | ||
| key Key |
There was a problem hiding this comment.
The key here is only used in metrics and in c.group.Do as the key, but that key is always the same as the singleflight group is the same and the key is the same for the same instance of the ExpiringSingleFlightCache type. I owuld suggest to remove the Key type and just to provide the metrics prefix string in the constructor NewExpiringSingleFlightCache.
| }(i) | ||
| } | ||
|
|
||
| time.Sleep(50 * time.Millisecond) |
There was a problem hiding this comment.
Could synctest be used here to avoid sleeping and potentially have a flaky test as synchronization not happen in 50ms?
|
|
||
| c.metrics.Misses.Inc() | ||
|
|
||
| result, err, shared := c.group.Do(string(c.key), func() (any, error) { |
There was a problem hiding this comment.
This group is always calling the same key, so it can be even the static string.
| b.metrics.TotalRPCCalls.Inc() | ||
| b.metrics.BlockNumberCalls.Inc() | ||
|
|
||
| blockNumber, err := b.backend.BlockNumber(ctx) |
There was a problem hiding this comment.
The problem with the golang.org/x/sync/singleflight is that it is not context.Context aware. The context that is passed from the first caller will influence the end result of all other callers of this function. Meaning, if the first caller cancels or times out on the context, all other callers will receive that error even if they did not cancel or the timeout for their call did not happen. This is why I've created the resenje.org/signleflight, that is context aware and will not terminate the execution of the function until all callers terminate their contexts. There are a few places where resenje.org/singleflight is used in bee, if you like, you can look at it and consider using if you see important for this case. Basically GetOrLoad would require to accept context, pass to signleflight.Do and the context from the callback would be used in b.backend.BlockNumber.
| blocktime, | ||
| true, | ||
| c.config.GetUint64(optionNameMinimumGasTipCap), | ||
| blocktime-2, |
There was a problem hiding this comment.
This is very strange, as block time is defined as const blocktime = 15 on line 16, which is actually a nanosecond. As that is passed in poolingInterval, it looks to me that it is a very short time, I suppose that block time should be in minutes, not nanoseconds. Luckily, this is only for the deploy command. Bee has its own block time passed from options. I believe that line 16 in this file should look like this const blocktime = 15 * time.Minute, which is not related to this PR.
Also what happens if blocktime value is 1 (if someone changes the constant) and we have a negative duration? If the constant is changed to the much longer value like 15*time.Minute, literal 2 in here would reduce only two nanoseconds from that large number.
|
|
||
| c.value = value | ||
| c.valid = true | ||
| c.expiresAt = now.Add(c.ttl) |
There was a problem hiding this comment.
Ideally, for the block number caching, the expiration time should be the time when the next block number is expected and that is around the block time interval. This is to have as least as possible time when the cache is reporting the older block number comapred what is the current block number.
In this generalized caching implementation, it is hard to generalize such requirement as it is very specific to how the block number is increased (in some ~ period, that is configurable). So the ideal situation would be to calculate what would be the time of the next block number and to set expiresAt to that time, or a bit later. And to watch if the block number is actually increased.
In the current implementation, the worst case would be that the cache will keep the block number cache up to the block time (15 minutes, for example on some networks) when the actual block number is increased, if the value is set on the cache a moment before the new block number. On average, statistically in a very large sample of caching, that time would be half of the block time (7 minutes).
I am not sure if such precision is needed, and if the delay of the new block number of up to the whole block time is acceptable. This is just my observation.
Checklist
Description
Removed RPC patterns in the transaction flow. Added cache layer for BlockNumber rpc call.
Fixed lint issues in the new cache package and cleaned up minor lint findings in adjacent test/helper code.
Open API Spec Version Changes (if applicable)
Motivation and Context (Optional)
Related Issue (Optional)
#5388
Screenshots (if appropriate):
Highly likely hit ratio is low because of high load errors (EOF), which is fix by PR