Skip to content

flatkv cache#3027

Open
cody-littley wants to merge 94 commits intomainfrom
cjl/flatkv-cache
Open

flatkv cache#3027
cody-littley wants to merge 94 commits intomainfrom
cjl/flatkv-cache

Conversation

@cody-littley
Copy link
Copy Markdown
Contributor

@cody-littley cody-littley commented Mar 5, 2026

Describe your changes and provide context

Add a caching layer to FlatKV, more than doubling performance in cryptosim benchmarks.

Testing performed to validate your change

Unit tests, ran benchmark over several days.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 5, 2026

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

BuildFormatLintBreakingUpdated (UTC)
✅ passed✅ passed✅ passed✅ passedMar 27, 2026, 1:05 PM

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 6, 2026

Codecov Report

❌ Patch coverage is 74.03846% with 108 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.74%. Comparing base (9abe71e) to head (33378ce).
⚠️ Report is 8 commits behind head on main.

Files with missing lines Patch % Lines
sei-db/state_db/sc/flatkv/store_write.go 78.47% 18 Missing and 13 partials ⚠️
sei-db/db_engine/pebbledb/db.go 50.00% 18 Missing and 5 partials ⚠️
sei-db/state_db/sc/flatkv/config.go 65.62% 11 Missing and 11 partials ⚠️
sei-db/state_db/sc/flatkv/store.go 75.75% 8 Missing and 8 partials ⚠️
sei-db/db_engine/dbcache/cache_config.go 45.45% 6 Missing ⚠️
sei-db/db_engine/pebbledb/pebbledb_config.go 60.00% 2 Missing and 2 partials ⚠️
sei-db/common/threading/fixed_pool.go 0.00% 1 Missing and 1 partial ⚠️
sei-db/db_engine/pebbledb/batch.go 60.00% 1 Missing and 1 partial ⚠️
sei-db/state_db/sc/composite/store.go 60.00% 1 Missing and 1 partial ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #3027      +/-   ##
==========================================
+ Coverage   58.62%   58.74%   +0.12%     
==========================================
  Files        2099     2104       +5     
  Lines      173514   175077    +1563     
==========================================
+ Hits       101716   102852    +1136     
- Misses      62750    63139     +389     
- Partials     9048     9086      +38     
Flag Coverage Δ
sei-chain-pr 73.22% <74.03%> (?)
sei-db 70.41% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
sei-db/common/metrics/phase_timer.go 74.46% <100.00%> (+0.55%) ⬆️
sei-db/config/sc_config.go 100.00% <100.00%> (ø)
sei-db/db_engine/dbcache/cache.go 77.77% <100.00%> (+66.66%) ⬆️
sei-db/db_engine/dbcache/cache_impl.go 95.65% <100.00%> (ø)
sei-db/db_engine/pebbledb/pebble_metrics.go 69.09% <100.00%> (ø)
sei-db/db_engine/pebbledb/pebbledb_test_config.go 100.00% <100.00%> (ø)
sei-db/db_engine/types/types.go 100.00% <ø> (ø)
sei-db/state_db/sc/flatkv/flatkv_test_config.go 100.00% <100.00%> (ø)
sei-db/state_db/sc/flatkv/snapshot.go 66.76% <100.00%> (+0.19%) ⬆️
sei-db/state_db/sc/flatkv/store_lifecycle.go 62.50% <100.00%> (+4.75%) ⬆️
... and 9 more

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@cody-littley cody-littley marked this pull request as ready for review March 20, 2026 15:23
}

return store
return store, nil
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of changing the interface to return an error, wouldn't it be easier to just panic inside this constructor? Is there any benefit to return the error if we failed to create the store?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Converted back to a panic().

As a general pattern, I prefer to return errors when possible instead of using panic, except in cases where I'm just doing sanity/invariant testing. But this isn't something I feel strongly enough to argue about, so I don't mind switching this one back. 😜


// FlatKVConfig is the configuration for the FlatKV (EVM) backend
FlatKVConfig flatkv.Config
FlatKVConfig *flatkv.Config
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to make this a pointer?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted back to a direct reference.

No need to make this a pointer, except that I prefer pointer types for any struct that has more than one or two fields. It's unlikely that this is going to be a major performance factor in this case, so I don't mind using a direct reference if that's your preferred pattern.

if err := scStore.CleanupCrashArtifacts(); err != nil {
panic(err)
}
if err := scStore.CleanupCrashArtifacts(); err != nil {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate CleanupCrashArtifacts call here, we should remove the one above?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed duplicate call. This was probably happened when I merged upstream into my branch.

return fmt.Errorf("failed to batch read old values: %w", err)
}

s.phaseTimer.SetPhase("apply_change_sets_prepare")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate code s.phaseTimer.SetPhase("apply_change_sets_prepare")

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

)

// Configuration for the PebbleDB database.
type PebbleDBConfig struct {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These configs are not specific to PebbleDB, instead, they are specific to flatkv, I think we should move them to flatkv instead of put them here, we could change the db engine in the future for flatkv, but these configs are universal to all db engines

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extracted configuration.

// The number of shards in the key-value cache. Must be a power of two and greater than 0.
CacheShardCount uint64
// The size of pebbleDB's internal block cache, in bytes.
BlockCacheSize int
Copy link
Copy Markdown
Contributor

@yzang2019 yzang2019 Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would recommend remove this BlockCacheSize from the config, we can set a tuned default value for now, we can add it in the future if we are 100% sure we need a different specific value for each DB.
In any way, we are not going rely on the PebbleDB internal cache for read optimization, so would recommend maybe just set a default number

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed in this PR and set to previous default, but I'd like to discuss adding it back in a future.

The block cache still serves an important role in database performance. It's used by pebble to cache LSM blocks, and we can't just disable it.

The reason why I added this config was because I thought the default value of 0.5gb was way too large for unit tests, which I generally prefer to use very little resources. Eventually I'd like to run unit tests in parallel, that will be tricky if each FlatKV instance utilizes 2.5gb memory.

Although I haven't done so yet, I'd actually like to enable configuration of ALL pebbleDB settings (although we can give them sane defaults). Right now it's tricky to experiment with pebble configuration because it's all hard coded.

Perhaps good for a quick in-person discussion.

opts types.OpenOptions,
enableMetrics bool,
config *PebbleDBConfig,
comparer *pebble.Comparer,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have to pass the comparer in? Under what scenario will we pass in a custom comparer?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed comparer parameter.

return fmt.Errorf("account db config is invalid: %w", c.AccountDBConfig.Validate())
}
if c.CodeDBConfig.Validate() != nil {
return fmt.Errorf("code db config is invalid: %w", c.CodeDBConfig.Validate())
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: We shouldn't need to call Validate() twice here

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

return fmt.Errorf("metadata db config is invalid: %w", c.MetadataDBConfig.Validate())
}

if c.ReaderThreadsPerCore < 0 {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If ReaderThreadsPerCore == 0, and ReaderConstantThreadCount == 0, it will create a pool with 0 workers, causing deadlock issue?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I've updated the pool constructor to more elegantly handle this case:

	if workers <= 0 {
		workers = 1
	}

if c.DataDir == "" {
return fmt.Errorf("data dir is required")
}
if c.CacheSize > 0 && (c.CacheShardCount&(c.CacheShardCount-1)) != 0 {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we also validate and make sure CacheShardCount > 0?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Validation added.

@yzang2019
Copy link
Copy Markdown
Contributor

One blocker comment: I think we should decouple cache from db_engine as much as possible, which allows FlatKV to switch different db_engine easily without reimplmenting the cache for each engine in the future

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants