Goal
Enhance the variant generator to inject controlled amounts of irrelevant (distractor) code around the core performance pattern. This tests the LLM's ability to identify and localize the actual bottleneck in a noisy, more realistic codebase, which is a critical real-world skill. The current patterns are relatively clean; adding distractors will make the benchmark harder and more representative of repository-level optimization tasks.
Requirements
- Add a new parameter noise_level (none | low | medium | high) to the variant generation system.
- Support both single-file and multi-file distractor injection.
- Distractors must not affect functional correctness or the ground-truth performance measurement of the target hotspot.
- All variants remain fully reproducible via seeds.
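One way the noise_level parameter and the seed requirement could fit together is sketched below. This is only an illustration: the names NoiseLevel, DISTRACTOR_BUDGET, and plan_distractors, as well as the per-level counts, are assumptions and not part of the existing generator.

```python
import random
from enum import Enum


class NoiseLevel(Enum):
    """The four levels named in the requirements."""
    NONE = "none"
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# Hypothetical budgets: how many distractor snippets each level injects.
DISTRACTOR_BUDGET = {
    NoiseLevel.NONE: 0,
    NoiseLevel.LOW: 2,
    NoiseLevel.MEDIUM: 5,
    NoiseLevel.HIGH: 10,
}


def plan_distractors(noise_level, seed, kinds):
    """Return a reproducible list of distractor kinds for one variant."""
    level = NoiseLevel(noise_level)  # raises ValueError on an unknown level
    rng = random.Random(seed)        # local RNG, so global random state is untouched
    return [rng.choice(kinds) for _ in range(DISTRACTOR_BUDGET[level])]
```

Because the plan depends only on the level, the seed, and the list of kinds, calling plan_distractors twice with the same arguments yields the same plan, which is what keeps variants reproducible.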
Types of Distractors to Include
- Dead / unused functions and variables
- Boilerplate code (argument parsing, config loading, logging, error handling)
- Semantically similar but low-impact code (e.g., another loop on small data)
- Unrelated helper classes or utilities
- Red herring functions that look optimizable but have negligible runtime impact
- Comments, documentation, and unused imports
- For multi-file: spread distractors across headers, utils, and main files
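A minimal single-file sketch of injecting two of the distractor types above (a dead function and a red herring) might look like the following. The templates and the inject_distractors helper are hypothetical, and the sketch assumes the target hotspot is available as a Python source string; multi-file placement is not covered.

```python
import random

# Hypothetical template: a dead function that is defined but never called.
DEAD_FUNCTION = '''def _unused_helper_{i}(values):
    """Never called; present only as noise."""
    return sorted(values)[:3]
'''

# Hypothetical template: a red herring that looks quadratic but only ever
# touches at most 8 items, so its runtime impact is negligible.
RED_HERRING = '''def _dedupe_small_{i}(items):
    head = items[:8]
    return [a for a in head for b in head if a == b]
'''


def inject_distractors(core_source, count, seed):
    """Surround core_source with `count` distractor snippets, chosen by seed."""
    rng = random.Random(seed)
    templates = [DEAD_FUNCTION, RED_HERRING]
    before, after = [], []
    for i in range(count):
        snippet = rng.choice(templates).format(i=i)
        # Randomly place each snippet above or below the hotspot.
        (before if rng.random() < 0.5 else after).append(snippet)
    # The hotspot text itself is passed through unmodified.
    return "\n".join(before + [core_source] + after)
```

The snippets are only defined, never called, so they add no work to the measured path; a real generator would still need to confirm that by re-running the correctness and timing checks on each variant.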