feat: parallel digest and compression stage during log import #66

Open: doudou wants to merge 11 commits into master from parallel_digest_and_compression

doudou (Member) commented May 26, 2026

Digest calculation and compression account for a significant share of the CPU time (around 40% when profiling some of our logs). This PR runs both as separate processes so that they execute in parallel.

On top of that, zstd is extremely well optimized, even on the I/O side, which leads to further savings on huge logs.
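
As an illustration of the approach described above, here is a minimal Python sketch of running a digest and a compression as separate processes at the same time. It is not the code from this PR: `run_in_parallel` and `digest_and_compress` are hypothetical names, and the only things taken from the PR are the tools named in its commit messages (`sha256sum` and `zstd`), which are assumed to be on `PATH`.

```python
import subprocess


def run_in_parallel(commands: dict[str, list[str]]) -> dict[str, bytes]:
    """Run every command as its own process and return each one's stdout."""
    # Start all processes before waiting on any of them, so they overlap.
    procs = {
        name: subprocess.Popen(argv, stdout=subprocess.PIPE)
        for name, argv in commands.items()
    }
    outputs = {}
    for name, proc in procs.items():
        stdout, _ = proc.communicate()  # reads stdout, then waits for exit
        if proc.returncode != 0:
            raise subprocess.CalledProcessError(proc.returncode, proc.args)
        outputs[name] = stdout
    return outputs


def digest_and_compress(path: str) -> str:
    """Hypothetical helper: hash and compress `path` at the same time.

    Assumes the `sha256sum` and `zstd` executables are on PATH.
    """
    outputs = run_in_parallel({
        "digest": ["sha256sum", path],
        # -q: quiet, -f: overwrite an existing output, -o: output file
        "compress": ["zstd", "-q", "-f", path, "-o", path + ".zst"],
    })
    # sha256sum prints "<hex digest>  <file name>"
    return outputs["digest"].split()[0].decode()
```

Because both processes are started before either is waited on, the operating system can schedule them on different cores, and the calling process stays free for other work in the meantime.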

doudou added 11 commits May 20, 2026 17:10
…el of normalization

This allows extracting more parallelism and simplifies the overall logic.
"-19" is really a lot slower
Otherwise it gets queued on the main executor. If zstd/sha256sum is a bit slow,
space then gets freed a lot later than it could be (and a lot more disk space
is needed)

With this change it is executed right away as soon as both computation futures
finish
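
The commit's own code is not shown on this page. As a rough Python analogue of the idea, the sketch below runs a follow-up action directly from the completion callback of whichever future finishes last, instead of submitting it as one more task to a shared executor where it would wait behind other queued work. `run_when_all_done` and `demo` are hypothetical names, and the `"cleanup"` step merely stands in for freeing disk space.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor


def run_when_all_done(futures: list[Future], action) -> None:
    """Call `action` once, in the thread that completes the last future."""
    if not futures:
        action()
        return

    remaining = len(futures)
    lock = threading.Lock()

    def on_done(_future: Future) -> None:
        nonlocal remaining
        with lock:
            remaining -= 1
            is_last = remaining == 0
        if is_last:
            action()  # runs right away, without going through the executor queue

    for future in futures:
        future.add_done_callback(on_done)


def demo() -> list[str]:
    """Record the order in which the two computations and the cleanup run."""
    events: list[str] = []
    with ThreadPoolExecutor(max_workers=2) as pool:
        digest = pool.submit(events.append, "digest")
        compress = pool.submit(events.append, "compress")
        run_when_all_done([digest, compress], lambda: events.append("cleanup"))
    return events
```

In `demo()`, `"cleanup"` is always recorded last and exactly once, whichever of the two computations finishes first.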
Does not change performance.
Empty or truncated files
Since we use a `while` block, the variable is not redefined on each iteration,
and the block was always using the last value set
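
The affected code is not shown on this page, so the following is only a Python analogue of the same class of bug: a closure created inside a `while` loop captures the variable rather than its value at that moment, so every closure ends up seeing the last value assigned. The function names are hypothetical.

```python
def collect_buggy(names: list[str]) -> list[str]:
    callbacks = []
    i = 0
    while i < len(names):
        name = names[i]
        # Captures the variable `name`, which is shared by all iterations.
        callbacks.append(lambda: name)
        i += 1
    return [callback() for callback in callbacks]


def collect_fixed(names: list[str]) -> list[str]:
    callbacks = []
    i = 0
    while i < len(names):
        name = names[i]
        # The default argument binds the current value at definition time.
        callbacks.append(lambda name=name: name)
        i += 1
    return [callback() for callback in callbacks]


print(collect_buggy(["a", "b", "c"]))  # ['c', 'c', 'c']
print(collect_fixed(["a", "b", "c"]))  # ['a', 'b', 'c']
```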
@doudou doudou requested review from jhonasiv and wvmcastro May 26, 2026 18:00