@tnm tnm commented Jan 7, 2026

Summary

Two performance improvements from the Tier 3 optimization investigation:

1. Diff Parsing Cache (line_ref_fixer.py)

  • LineRefFixer.fix_comment() now accepts an optional parsed_diff parameter
  • Reviewer passes its already-parsed diff to avoid redundant re-parsing
  • 21.5x speedup (about 95% less time per call): 0.076ms → 0.004ms
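
The pattern behind this change can be sketched as follows. This is a minimal illustration, not the code in line_ref_fixer.py: `parse_diff`, the shape of the parsed dict, and the line-snapping logic are all assumptions made for the example. Only the optional `parsed_diff` parameter comes from the PR description.

```python
import re
from typing import Dict, List, Optional, Tuple

# Matches the new-side range of a unified diff hunk header, e.g. "@@ -10,3 +10,4 @@".
HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@")
# Matches "path:line" references inside a review comment (assumed format).
REF_RE = re.compile(r"([\w./-]+):(\d+)")

ParsedDiff = Dict[str, List[Tuple[int, int]]]


def parse_diff(diff_text: str) -> ParsedDiff:
    """Hypothetical parser: map each file to its new-side (start, length) hunks."""
    parsed: ParsedDiff = {}
    current: Optional[str] = None
    for line in diff_text.splitlines():
        if line.startswith("+++ b/"):
            current = line[len("+++ b/"):]
            parsed[current] = []
        elif current is not None:
            match = HUNK_RE.match(line)
            if match:
                start = int(match.group(1))
                length = int(match.group(2)) if match.group(2) is not None else 1
                parsed[current].append((start, length))
    return parsed


def fix_comment(comment: str, diff_text: str,
                parsed_diff: Optional[ParsedDiff] = None) -> str:
    """Snap path:line references onto the nearest hunk line.

    When the caller already holds a parsed diff, passing it in skips
    the parse step entirely; otherwise the diff is parsed on demand.
    """
    if parsed_diff is None:
        parsed_diff = parse_diff(diff_text)

    def snap(match: "re.Match[str]") -> str:
        path, line = match.group(1), int(match.group(2))
        hunks = parsed_diff.get(path)
        if not hunks:
            return match.group(0)  # unknown file: leave the reference alone
        # Clamp the line into each hunk, then keep the closest candidate.
        candidates = [
            min(max(line, start), start + max(length, 1) - 1)
            for start, length in hunks
        ]
        best = min(candidates, key=lambda c: abs(c - line))
        return f"{path}:{best}"

    return REF_RE.sub(snap, comment)
```

A caller that fixes many comments against one diff would call `parse_diff` once and pass the result to every `fix_comment` call, which is the saving the benchmark above measures.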

2. Parallel File Processing (vector_searcher.py)

  • VectorSearcher.build_index() now processes files in parallel using ThreadPoolExecutor
  • New parameters: parallel=True, max_workers=None
  • Configurable via KIT_INDEXER_MAX_WORKERS environment variable
  • 1.33x speedup - 63ms → 47ms for 50 files (4 workers)
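
A rough sketch of how such a switch might look is below. It is not the code in vector_searcher.py: `fake_chunk`, `resolve_max_workers`, and the standalone `build_index` function are hypothetical, and the precedence shown (explicit argument first, then KIT_INDEXER_MAX_WORKERS, then the executor default) is an assumption.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Optional


def resolve_max_workers(max_workers: Optional[int] = None) -> Optional[int]:
    """Assumed precedence: explicit argument, then env var, then executor default."""
    if max_workers is not None:
        return max_workers
    env_value = os.environ.get("KIT_INDEXER_MAX_WORKERS")
    if env_value:
        return int(env_value)
    return None  # ThreadPoolExecutor picks its own default


def fake_chunk(path: str) -> List[str]:
    """Hypothetical stand-in for per-file chunking work."""
    return [f"{path}#chunk0", f"{path}#chunk1"]


def build_index(paths: Iterable[str],
                chunk_fn: Callable[[str], List[str]],
                parallel: bool = True,
                max_workers: Optional[int] = None) -> List[List[str]]:
    """Chunk every file, optionally fanning the work out across threads."""
    path_list = list(paths)
    if not parallel or len(path_list) < 2:
        return [chunk_fn(p) for p in path_list]
    with ThreadPoolExecutor(max_workers=resolve_max_workers(max_workers)) as pool:
        # Executor.map yields results in input order, so output matches the serial path.
        return list(pool.map(chunk_fn, path_list))
```

Threads help most when per-file work spends its time in I/O or in code that releases the GIL, which may be one reason the measured gain is modest at 1.33x.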

Benchmark Results

| Optimization | Before | After | Speedup |
|---|---|---|---|
| Diff parsing (per call) | 0.076ms | 0.004ms | 21.5x |
| Vector indexing (50 files) | 63ms | 47ms | 1.33x |

Investigation Summary

Other optimizations investigated but not implemented:

  • Pricing API singleton: lru_cache already handles caching well (first call 341ms, subsequent 0.007ms)
  • Batch symbol extraction: Grouping by language showed a 1.77x speedup, but parallel processing covers this better
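
For context on the pricing note above, here is a small sketch of why `functools.lru_cache` already behaves like a singleton for a zero-argument loader. The function name, the call counter, and the returned data are placeholders, not the project's pricing code.

```python
from functools import lru_cache
from typing import Dict

# Counts how often the expensive body actually runs (illustration only).
CALLS = {"count": 0}


@lru_cache(maxsize=1)
def fetch_pricing() -> Dict[str, float]:
    """Hypothetical loader: the body runs once, later calls hit the cache."""
    CALLS["count"] += 1
    # Placeholder data standing in for a slow network or disk fetch.
    return {"example-model": 0.001}
```

Every call after the first returns the same cached object, so callers should treat the result as read-only. `fetch_pricing.cache_clear()` forces a fresh load when needed.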

Test plan

  • All existing tests pass
  • Line ref fixer tests pass with new parameter
  • Real-world benchmarks verified

Two performance improvements:

1. Diff parsing cache (LineRefFixer)
   - Accept pre-parsed diff dict to avoid re-parsing
   - Reviewer now passes cached parsed diff to fix_comment()
   - 21.5x speedup (about 95% less time) per fix_comment call

2. Parallel file processing (VectorSearcher)
   - Add ThreadPoolExecutor for parallel file chunking
   - Configurable via parallel=True and max_workers param
   - Environment variable: KIT_INDEXER_MAX_WORKERS
   - 1.33x speedup on 50-file indexing (4 workers)

Benchmark results:
- Diff parsing: 0.076ms → 0.004ms per call
- Vector indexing: 63ms → 47ms for 50 files
@tnm tnm merged commit cca7db8 into main Jan 7, 2026
2 checks passed