Skip to content

⚡ Bolt: [performance improvement] Optimize dynamic SQL generation for D1#243

Open
bashandbone wants to merge 1 commit into
mainfrom
bolt/optimize-d1-sql-generation-12039807803892862716
Open

⚡ Bolt: [performance improvement] Optimize dynamic SQL generation for D1#243
bashandbone wants to merge 1 commit into
mainfrom
bolt/optimize-d1-sql-generation-12039807803892862716

Conversation

@bashandbone
Copy link
Copy Markdown
Contributor

@bashandbone bashandbone commented May 18, 2026

💡 What:
Replaced intermediate Vec allocations and format! calls inside loops with a pre-allocated String::with_capacity buffer and sequential write! (via std::fmt::Write) macro calls in build_upsert_stmt and build_delete_stmt for Cloudflare D1 targets.

🎯 Why:
During dynamic SQL generation in crates/flow/src/targets/d1.rs, creating intermediate Vectors (e.g., update_clauses, where_clauses) and running .join(", ") alongside multiple nested string interpolations (format!) triggered unnecessary heap allocations and expensive memory copying (churn). This overhead scaled linearly with the number of columns updated or queried.

📊 Impact:
Reduces heap allocation frequency during SQL construction in D1 database targets. Improves memory efficiency and slightly speeds up D1 mutation requests by eliminating the Vec allocation overhead, making performance scale smoothly as queries grow.

🔬 Measurement:
Verified functionality by running all related D1 target tests (cargo test -p thread-flow --test d1_target_tests --test d1_minimal_tests --test d1_cache_integration), which passed successfully. Ran benchmarks via cargo bench -p thread-flow d1_profiling to ensure no regressions occurred.


PR created automatically by Jules for task 12039807803892862716 started by @bashandbone

Summary by Sourcery

Optimize dynamic SQL generation for Cloudflare D1 targets and apply minor code cleanups and style adjustments across rule-engine and AST modules.

Enhancements:

  • Preallocate SQL strings and build INSERT/UPSERT and DELETE statements incrementally for D1 exports to reduce heap allocations and improve performance.
  • Relax unnecessary lifetime annotations in rule-engine variable checking helpers to simplify signatures while preserving behavior.
  • Tidy formatting and error-handling expressions in AST and rule-engine modules for improved readability and consistency.

Documentation:

  • Extend internal Bolt performance notes with guidance on efficient dynamic SQL construction for Cloudflare D1.

Tests:

  • Remove a redundant assertion from D1 cache integration tests to keep the test suite minimal and focused.

* Replaced intermediate `Vec` allocations and `format!` calls with a single `String::with_capacity` buffer and `write!` macro usage in `build_upsert_stmt` and `build_delete_stmt` in `crates/flow/src/targets/d1.rs`.
* Added performance learning to `.jules/bolt.md`.

Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 18, 2026 18:12
@google-labs-jules
Copy link
Copy Markdown
Contributor

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

@sourcery-ai
Copy link
Copy Markdown
Contributor

sourcery-ai Bot commented May 18, 2026

Reviewer's Guide

Optimizes dynamic SQL generation for Cloudflare D1 by replacing intermediate Vec and format! usage with preallocated String buffers and write! calls, makes some rule-engine APIs take non-lifetime-tied references, and applies a few minor formatting/clarity cleanups across the codebase and tests.

File-Level Changes

Change Details Files
Optimize D1 upsert SQL construction to avoid intermediate Vec allocations and repeated format! calls.
  • Remove allocation of update_clauses Vec and its use with join(", ")
  • Precompute an approximate capacity for the SQL String using table and column metadata
  • Use write! in loops to append INSERT columns, VALUES placeholders, and ON CONFLICT update assignments directly into the preallocated String
  • Iterate over values.fields and value_fields_schema to emit update clauses with a first_update flag instead of building a separate collection
crates/flow/src/targets/d1.rs
Optimize D1 delete SQL construction similarly by streaming into a preallocated String.
  • Remove where_clauses Vec and join(" AND ") usage for DELETE query generation
  • Precompute String capacity based on table name and key field schema
  • Use write! to build the DELETE FROM ... WHERE ... AND ... statement incrementally while iterating over key parts
  • Track first condition with a boolean flag to correctly place AND separators
crates/flow/src/targets/d1.rs
Relax lifetimes on rule-engine checking helpers to accept plain borrowed references instead of lifetime-parameterized references.
  • Change constraints and transform parameters in check_rule_with_hint and helper functions from &'r T to &T
  • Propagate the same signature simplification to check_vars_in_rewriter, check_vars, check_var_in_constraints, and check_var_in_transform
crates/rule-engine/src/check_var.rs
Minor readability and style cleanups in AST engine and rule engine implementations.
  • Reformat String::from_utf8 error handling in ContentExt impl into a single expression while preserving behavior
  • Expand an assert_eq! in a tree-sitter test into multi-line form for readability
  • Format the Rule::Pattern branch of defined_vars onto multiple lines for clearer chaining
  • Inline the RwLock read().unwrap_or_else(...).clone() chain in Registration::read into a single expression
crates/ast-engine/src/tree_sitter/mod.rs
crates/rule-engine/src/rule/mod.rs
crates/rule-engine/src/rule/referent_rule.rs
Update internal performance notes and simplify a trivial test assertion.
  • Add a new Bolt note documenting the learning and action items around dynamic SQL generation for Cloudflare D1 using String::with_capacity and write!
  • Remove a redundant assert!(true, ...) from a D1 cache integration test, leaving the test body as just the construction call
.jules/bolt.md
crates/flow/tests/d1_cache_integration.rs

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've left some high level feedback:

  • In build_upsert_stmt, the new ON CONFLICT DO UPDATE builder iterates over values.fields (and unconditionally uses value_fields_schema for each index), whereas the insert columns/placeholders are only populated when value is Some; this can cause the update clause to diverge from the set of actually inserted columns—consider keeping the update loop aligned with the same Some(value) filter used when pushing columns/placeholders.
  • The new SQL builders rely on multiple write!(...).unwrap() calls, which will now panic on any formatting error; if you want to preserve the previous behavior of format! (which didn’t panic on formatting), consider either propagating these fmt::Results or explicitly ignoring them (e.g., let _ = write!(...)) to avoid introducing new panic paths.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `build_upsert_stmt`, the new `ON CONFLICT DO UPDATE` builder iterates over `values.fields` (and unconditionally uses `value_fields_schema` for each index), whereas the insert `columns`/`placeholders` are only populated when `value` is `Some`; this can cause the update clause to diverge from the set of actually inserted columns—consider keeping the update loop aligned with the same `Some(value)` filter used when pushing columns/placeholders.
- The new SQL builders rely on multiple `write!(...).unwrap()` calls, which will now panic on any formatting error; if you want to preserve the previous behavior of `format!` (which didn’t panic on formatting), consider either propagating these `fmt::Result`s or explicitly ignoring them (e.g., `let _ = write!(...)`) to avoid introducing new panic paths.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Replaces intermediate Vec/format!/join patterns in D1 dynamic SQL builders (build_upsert_stmt, build_delete_stmt) with pre-sized String::with_capacity + write! to reduce heap churn. Also includes incidental rustfmt-style reflows in a few unrelated files and a learnings note in .jules/bolt.md.

Changes:

  • Rewrote D1 upsert/delete SQL construction to use a single pre-allocated buffer with std::fmt::Write.
  • Minor formatting reflows in referent_rule.rs, rule/mod.rs, check_var.rs, tree_sitter/mod.rs, and removed a trivial assert!(true, ...) in d1_cache_integration.rs.
  • Added a performance-learning entry to .jules/bolt.md.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
crates/flow/src/targets/d1.rs Main perf change: replaces Vec + format!/join with pre-sized String + write! in build_upsert_stmt and build_delete_stmt; adds std::fmt::Write import.
crates/flow/tests/d1_cache_integration.rs Removes a no-op assert!(true, ...) from a construction-only test.
crates/rule-engine/src/rule/referent_rule.rs Cosmetic single-line reflow of read().
crates/rule-engine/src/rule/mod.rs Cosmetic multi-line reflow of defined_vars chain.
crates/rule-engine/src/check_var.rs Drops unnecessary 'r lifetime annotations on references that don't need to share the lifetime.
crates/ast-engine/src/tree_sitter/mod.rs Cosmetic reflow of from_utf8 fallback and a test assert_eq!.
.jules/bolt.md Adds a performance-learning note documenting the SQL-builder optimization.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread .jules/bolt.md
**Learning:** During DAG traversals, creating owned variants of identifiers (like `file.to_path_buf()`) *before* checking `visited` HashSets results in heap allocations (O(E)) for every edge instead of every visited node (O(V)). By moving the `&PathBuf` allocation strictly *after* all HashSet `contains` checks using the borrowed reference (`&Path`), we drastically reduce memory churn.
**Action:** Always check `HashSet::contains` with a borrowed reference *before* creating the owned version required by `HashSet::insert`, especially in performance-critical graph traversal paths.

## 2024-05-18 - [Performance: Dynamic SQL Generation for Cloudflare D1]
Comment on lines +383 to +390
let mut sql = String::with_capacity(
14 + self.table_name.len()
+ 7
+ self
.key_fields_schema
.iter()
.map(|f| f.name.len() + 5)
.sum::<usize>(),
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants