@tnm tnm commented Jan 7, 2026

Summary

Three performance optimizations targeting commonly-used code paths:

1. Dependency analyzer: O(n²) → O(1) incoming connections

  • Add a reverse dependency map (module → modules that depend on it), built in a single O(n) pass
  • Look up incoming connections via dict instead of scanning all nodes for every module
  • ~50x speedup for graphs with 200+ modules
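
A minimal sketch of the reverse-map idea, not the code from this PR. It assumes the graph is a plain dict mapping each module to the modules it imports; `build_reverse_map` and `incoming` are hypothetical names chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def build_reverse_map(graph: Dict[str, Iterable[str]]) -> Dict[str, Set[str]]:
    """Invert the graph once: dependency -> set of modules that import it.

    `graph` is an assumed shape (module -> modules it depends on).
    The single pass visits every node and every edge exactly once.
    """
    reverse: Dict[str, Set[str]] = defaultdict(set)
    for module, deps in graph.items():
        for dep in deps:
            reverse[dep].add(module)
    return reverse


def incoming(reverse: Dict[str, Set[str]], module: str) -> Set[str]:
    """Average O(1) dict lookup; replaces a scan over all nodes per query."""
    # .get avoids inserting an empty entry into the defaultdict.
    return reverse.get(module, set())


graph = {"a": ["b", "c"], "b": ["c"], "c": []}
reverse = build_reverse_map(graph)
print(incoming(reverse, "c"))
```

The trade-off is extra memory for the inverted map, plus the need to rebuild or update it whenever the graph changes.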

2. Line reference fixer: O(n) → O(log n) nearest line lookup

  • Use bisect binary search instead of min() with lambda
  • Cache sorted line lists per file
  • ~100-700x speedup depending on diff size
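
The lookup can be sketched roughly as follows, assuming the valid line numbers for a file are integers. The names `sorted_lines_for` and `nearest_line` and the module-level cache are hypothetical and may not match the PR's actual implementation. Ties go to the lower line, which matches what `min()` with a distance key returns on a sorted list.

```python
import bisect
from typing import Dict, Iterable, List, Optional

# Hypothetical per-file cache: file path -> sorted list of valid line numbers.
_sorted_lines_cache: Dict[str, List[int]] = {}


def sorted_lines_for(path: str, lines: Iterable[int]) -> List[int]:
    """Sort a file's valid line numbers once and reuse the result."""
    if path not in _sorted_lines_cache:
        _sorted_lines_cache[path] = sorted(set(lines))
    return _sorted_lines_cache[path]


def nearest_line(sorted_lines: List[int], target: int) -> Optional[int]:
    """Return the entry closest to `target` using an O(log n) binary search."""
    if not sorted_lines:
        return None
    i = bisect.bisect_left(sorted_lines, target)
    if i == 0:
        return sorted_lines[0]      # target is at or before the first line
    if i == len(sorted_lines):
        return sorted_lines[-1]     # target is after the last line
    before, after = sorted_lines[i - 1], sorted_lines[i]
    # On a tie, prefer the lower line number.
    return before if target - before <= after - target else after


lines = sorted_lines_for("example.py", [30, 10, 20, 10])
print(nearest_line(lines, 14), nearest_line(lines, 16))
```

The empty list, a target below the first entry, a target above the last entry, an exact match, and an equidistant target are the edge cases a binary search like this needs tests for.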

3. Validator pre-compiled regex patterns

  • Move regex compilation to class-level ClassVar
  • Combine separate def/class patterns into single regex
  • ~1.2x speedup (modest because Python's re module already caches recently compiled patterns)
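
As an illustration only, a class-level compiled pattern that merges the `def` and `class` cases might look like this. The `Validator` class body, the `_DEFINITION_RE` name, the `find_definition` method, and the pattern itself are assumptions, not the PR's actual code.

```python
import re
from typing import ClassVar, Optional, Pattern, Tuple


class Validator:
    # Compiled once when the class is defined and shared by all instances.
    # One alternation replaces two separate def/class patterns.
    _DEFINITION_RE: ClassVar[Pattern[str]] = re.compile(
        r"^\s*(?P<kind>def|class)\s+(?P<name>[A-Za-z_]\w*)"
    )

    def find_definition(self, line: str) -> Optional[Tuple[str, str]]:
        """Return (kind, name) if the line starts a def or class, else None."""
        match = self._DEFINITION_RE.match(line)
        if match is None:
            return None
        return match.group("kind"), match.group("name")


validator = Validator()
print(validator.find_definition("    class Parser:"))
```

Beyond the small speedup, keeping the pattern on the class puts it in one place and runs a single match per line instead of two.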

Benchmarks

Synthetic (isolated operations)

| Optimization | Size | Before | After | Speedup |
| --- | --- | --- | --- | --- |
| Dependency graph incoming | 200 nodes | 31.9ms | 0.6ms | 52x |
| Line ref lookup | 2000 lines | 590ms | 0.9ms | 696x |

Real-world (kit codebase)

| Operation | Dataset | Time |
| --- | --- | --- |
| generate_llm_context() | 335 modules | 0.5ms |
| fix_comment() | 2684-line diff | 1.1ms |

Test plan

  • All existing tests pass (46 passed)
  • Added 8 new edge case tests for binary search algorithm
  • Formatting/linting passes

@tnm tnm merged commit 658752e into main Jan 7, 2026
2 checks passed
tnm added a commit that referenced this pull request Jan 7, 2026
Add missing performance improvements from PRs #179 and #180:
- Tier 1: O(n²)→O(1) dependency graph, O(n)→O(log n) line lookup, regex precompile
- Tier 2: Vector search collection reset, context extractor file caching