perf: Tier 3 optimizations - diff cache and parallel indexing #181
Summary
Two performance improvements from the Tier 3 optimization investigation:
1. Diff Parsing Cache (line_ref_fixer.py)
   - LineRefFixer.fix_comment() now accepts optional parsed_diff parameter
2. Parallel File Processing (vector_searcher.py)
   - VectorSearcher.build_index() now processes files in parallel using ThreadPoolExecutor
   - parallel=True, max_workers=None
   - KIT_INDEXER_MAX_WORKERS environment variable

Benchmark Results
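For readers skimming the description, the two changes listed in the Summary can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: `parse_diff`, the module-level `fix_comment` and `build_index`, `resolve_max_workers`, and the `process_file` callback are hypothetical stand-ins, and the precedence of an explicit `max_workers` argument over `KIT_INDEXER_MAX_WORKERS` is assumed rather than confirmed.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def parse_diff(diff_text):
    """Hypothetical stand-in parser: collect the file paths a unified diff touches."""
    prefix = "+++ b/"
    return {line[len(prefix):] for line in diff_text.splitlines() if line.startswith(prefix)}


def fix_comment(comment, diff_text, parsed_diff=None):
    """Reuse a caller-supplied parse; fall back to parsing on demand."""
    if parsed_diff is None:
        parsed_diff = parse_diff(diff_text)
    # Placeholder for the real fixing logic: list touched files the comment mentions.
    return sorted(path for path in parsed_diff if path in comment)


def resolve_max_workers(max_workers=None):
    """Assumed precedence: explicit argument, then environment variable, then default."""
    if max_workers is not None:
        return max_workers
    env_value = os.environ.get("KIT_INDEXER_MAX_WORKERS")
    if env_value:
        return int(env_value)
    return None  # None lets ThreadPoolExecutor choose its own default


def build_index(paths, process_file, parallel=True, max_workers=None):
    """Apply process_file to every path, optionally using a thread pool."""
    paths = list(paths)
    if not parallel or len(paths) < 2:
        return [process_file(path) for path in paths]
    with ThreadPoolExecutor(max_workers=resolve_max_workers(max_workers)) as pool:
        # Executor.map yields results in input order, so output matches the sequential path.
        return list(pool.map(process_file, paths))
```

A caller that fixes many comments against the same diff would call `parse_diff` once and pass the result as `parsed_diff=` to each `fix_comment` call, which is the saving the cache is meant to provide. Threads mainly help when per-file work is I/O-bound or releases the GIL.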
Investigation Summary
Other optimizations investigated but not implemented:
- lru_cache already handles caching well (first call 341ms, subsequent 0.007ms)

Test plan