Skip to content

Conversation

@google-labs-jules
Copy link

This PR addresses critical performance issues and memory leaks identified during code review.

1. Tokenizer Optimization & Fix

  • Performance: The tokenizer previously recompiled regexes for every character, leading to $O(N \times M)$ compilations. I introduced once_cell::sync::Lazy to compile regexes once statically.
  • Bug Fix: Fixed an infinite loop in TokenStream::next where it would endlessly return _EOFDummy tokens because sent_eof_dummy flag was never updated.

2. Closure Memory Leak Fix (Flat Closures)

  • Issue: Closures previously captured the entire Environment (scope chain), causing all variables in parent scopes (even unused ones) to remain in memory. This is a common source of leaks in naive interpreter implementations.
  • Solution: Implemented a semantic analysis pass (semantic::captured_variables) that identifies exactly which variables a function uses from its parent scope. Function::new_closure now creates a detached VariableScope containing only these captured variables, linked directly to the global scope. This allows intermediate stack frames to be garbage collected.

3. Closure Initialization Order

  • Regression Fix: The Flat Closure implementation revealed an issue where sibling functions (e.g., get and set inside a useState function) could not capture each other because they were initialized sequentially.
  • Fix: Refactored Environment::setup_scope to execute in two phases:
    1. Hoisting: Declare all variable/function names in the scope.
    2. Initialization: Create closures and assign them.
      This ensures that when a closure is created, all potential capture targets in the same scope are already declared.

4. Safety

  • Addressed unsafe usage in native_wrap.

Verification

  • Added bench_tokenizer test which now runs in milliseconds (vs timeout previously).
  • Verified all existing tests pass, including complex_closure.

PR created automatically by Jules for task 17035486210205838536 started by @aryan-programmer

- Optimized tokenizer by using `once_cell::Lazy` for regexes to avoid recompilation.
- Fixed infinite loop bug in tokenizer at EOF.
- Implemented "Flat Closure" optimization: closures now capture only used variables instead of the entire parent scope stack, preventing memory leaks.
- Added semantic analyzer for variable capturing.
- Refactored `Environment::setup_scope` to two-phase (hoist then init) to support mutual recursion/capture between sibling functions with the new closure implementation.
- Attempted to clean up `unsafe` in `native_wrap` (safe coercion verified).
- Added benchmark and verified tests.
@google-labs-jules
Copy link
Author

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!


For security, I will only act on instructions from the user who triggered this task.

New to Jules? Learn more at jules.google/docs.

- Optimized tokenizer with `Lazy` regex compilation.
- Fixed infinite loop in tokenizer at EOF.
- Implemented Flat Closures to fix memory leaks (semantic analysis + detached scope).
- Refactored `Environment::setup_scope` to two-phase (hoist then init) to fix closure capture order.
- Added tests verifying closure capture optimization.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant