perf: optimize hot paths with algorithmic improvements #179

tnm · 2026-01-07T06:32:14Z

Summary

Three performance optimizations targeting commonly-used code paths:

1. Dependency analyzer: O(n²) → O(1) incoming connections

- Add a reverse dependency map built in a single O(n) pass
- Look up incoming connections via dict instead of scanning all nodes
- ~50x speedup for graphs with 200+ modules
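
To illustrate the idea behind the reverse map, here is a minimal sketch. It is not the code from this PR: the graph shape (a dict of module name to set of dependencies) and the names `build_reverse_map` and `incoming_scan` are assumptions made for the example.

```python
from collections import defaultdict


def build_reverse_map(graph: dict[str, set[str]]) -> dict[str, set[str]]:
    """Invert module -> dependencies into dependency -> dependents.

    One pass over every node and its edges (hypothetical helper).
    """
    reverse: defaultdict[str, set[str]] = defaultdict(set)
    for module, deps in graph.items():
        for dep in deps:
            reverse[dep].add(module)
    return dict(reverse)


def incoming_scan(graph: dict[str, set[str]], target: str) -> set[str]:
    """The scanning approach: visit every node for each lookup."""
    return {module for module, deps in graph.items() if target in deps}


# Hypothetical three-module graph.
graph = {"app": {"core", "utils"}, "core": {"utils"}, "utils": set()}
reverse = build_reverse_map(graph)

# Average-case O(1) dict lookup instead of a scan over all nodes.
incoming_utils = reverse.get("utils", set())
incoming_app = reverse.get("app", set())
```

Building the map costs one pass up front, so the trade pays off when incoming connections are requested for many nodes of the same graph.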

2. Line reference fixer: O(n) → O(log n) nearest line lookup

- Use bisect binary search instead of min() with lambda
- Cache sorted line lists per file
- ~100-700x speedup depending on diff size
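
The nearest-line lookup can be sketched with the standard `bisect` module as follows. The function names, the cache layout, and the tie-breaking rule (prefer the lower line, which matches what `min()` returns on a sorted list) are assumptions for illustration, not the PR's actual code.

```python
from bisect import bisect_left

# Hypothetical per-file cache of sorted valid line numbers.
_sorted_lines_cache: dict[str, list[int]] = {}


def sorted_lines_for(path: str, valid_lines: set[int]) -> list[int]:
    """Sort a file's valid lines once and reuse the result on later calls."""
    if path not in _sorted_lines_cache:
        _sorted_lines_cache[path] = sorted(valid_lines)
    return _sorted_lines_cache[path]


def nearest_line(sorted_lines: list[int], target: int) -> int:
    """Return the entry closest to target using an O(log n) binary search."""
    if not sorted_lines:
        raise ValueError("no valid lines to choose from")
    i = bisect_left(sorted_lines, target)
    if i == 0:
        return sorted_lines[0]  # target is at or below the smallest entry
    if i == len(sorted_lines):
        return sorted_lines[-1]  # target is above the largest entry
    before, after = sorted_lines[i - 1], sorted_lines[i]
    # On a tie, prefer the lower line.
    return before if target - before <= after - target else after


def nearest_line_linear(lines: list[int], target: int) -> int:
    """The O(n) approach being replaced: min() with a lambda key."""
    return min(lines, key=lambda x: abs(x - target))


lines = sorted_lines_for("example.py", {30, 10, 20})
```

Only the neighbors on either side of the insertion point can be the nearest value, which is why a single `bisect_left` call is enough.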

3. Validator pre-compiled regex patterns

- Move regex compilation to class-level ClassVar
- Combine separate def/class patterns into single regex
- ~1.2x speedup (modest because Python's `re` module already caches compiled patterns)
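
A minimal sketch of the pre-compilation pattern, assuming a hypothetical `Validator` class and a simplified regex; the real patterns in the validator may differ.

```python
import re
from typing import ClassVar


class Validator:
    """Hypothetical validator with a regex compiled once at class creation."""

    # A single pattern covers both `def` and `class` declarations,
    # replacing two separate patterns (simplified for illustration).
    DEF_OR_CLASS: ClassVar[re.Pattern] = re.compile(
        r"^[ \t]*(def|class)[ \t]+(\w+)", re.MULTILINE
    )

    def find_symbols(self, source: str) -> list[tuple[str, str]]:
        """Return (kind, name) pairs for each def/class found in source."""
        return [
            (m.group(1), m.group(2))
            for m in self.DEF_OR_CLASS.finditer(source)
        ]


sample = "class Foo:\n    def bar(self):\n        pass\n\ndef baz():\n    pass\n"
symbols = Validator().find_symbols(sample)
```

Compiling at class level avoids the per-call cache lookup that `re.search(pattern_string, ...)` performs, and merging the two patterns scans the source once instead of twice.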

Benchmarks

Synthetic (isolated operations)

| Optimization | Size | Before | After | Speedup |
|---|---|---|---|---|
| Dependency graph incoming | 200 nodes | 31.9ms | 0.6ms | 52x |
| Line ref lookup | 2000 lines | 590ms | 0.9ms | 696x |

Real-world (kit codebase)

| Operation | Dataset | Time |
|---|---|---|
| `generate_llm_context()` | 335 modules | 0.5ms |
| `fix_comment()` | 2684-line diff | 1.1ms |

Test plan

- All existing tests pass (46 passed)
- Added 8 new edge case tests for binary search algorithm
- Formatting/linting passes


Add missing performance improvements from PRs #179 and #180:
- Tier 1: O(n²)→O(1) dependency graph, O(n)→O(log n) line lookup, regex precompile
- Tier 2: Vector search collection reset, context extractor file caching

tnm merged commit 658752e into main Jan 7, 2026
2 checks passed

tnm mentioned this pull request Jan 7, 2026

feat: live repo indexing + CI integration #37

Open
