docs(experiment): qualified-name preamble — recorded after revert by dvcdsys · Pull Request #32 · dvcdsys/code-index

dvcdsys · 2026-05-07T14:12:54Z

Summary

Captures the A/B/C testing of a docstring-wrapped preamble with qualified symbol names (UserService.authenticate) across two real codebases — Python class-heavy (brain-project) and Go-heavy (this repo) — plus controlled fixture experiments.
The naive QID benchmark showed +5.6%, but a semantic NL benchmark (queries describe behaviour, never name the class/method) showed essentially zero gain on Python, zero on Go, and one regression where a Mode A top-1 hit dropped below the relevance threshold in Mode B.
The +5.6% turned out to be a literal-string-match artefact of Class.method appearing verbatim in the new preamble. Body content already carries enough lexical signal (self.X, type hints, imports, SQL table names) for the embedder to disambiguate class-scoped methods.
Feature was reverted in the same session; this doc is the record so future iterations don't re-litigate the same hypothesis without the right test.
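
To make the literal-string-match artefact concrete, here is a minimal sketch. It is not the reverted implementation. The preamble layout, the helper names (`build_chunk_text`, `overlap`), and the sample method body are all hypothetical, and a crude token overlap stands in for the real embedder. It only shows why a query that names `Class.method` gets a free match from the preamble, while a behavioural query gains nothing.

```python
import re
from typing import Optional, Set


def build_chunk_text(body: str, qualified_name: Optional[str] = None) -> str:
    """Text handed to the embedder for one symbol chunk.

    Mode A: body only. Mode B: a docstring-wrapped preamble carrying the
    qualified name, then the body. The exact layout is an assumption.
    """
    if qualified_name is None:
        return body
    return f'"""{qualified_name}"""\n' + body


def tokens(text: str) -> Set[str]:
    # Crude lexical tokens; dotted names stay whole ("a.b" is one token).
    return set(re.findall(r"[A-Za-z_][A-Za-z0-9_.]*", text.lower()))


def overlap(query: str, chunk: str) -> int:
    # Stand-in for similarity: the number of shared tokens.
    return len(tokens(query) & tokens(chunk))


# Hypothetical method body, used only for illustration.
BODY = (
    "def authenticate(self, username, password):\n"
    "    row = self.db.fetch_one('SELECT * FROM users WHERE name = ?', username)\n"
    "    return row is not None and verify_hash(password, row.password_hash)\n"
)

mode_a = build_chunk_text(BODY)
mode_b = build_chunk_text(BODY, "UserService.authenticate")

qid_query = "UserService.authenticate"  # names the symbol verbatim
nl_query = "check a login against the stored password hash"  # behaviour only

print("QID query:", overlap(qid_query, mode_a), "->", overlap(qid_query, mode_b))
print("NL query: ", overlap(nl_query, mode_a), "->", overlap(nl_query, mode_b))
```

In this toy setup the QID query gains a match only because the preamble repeats its text, while the NL query's overlap is the same in both modes. A real embedder is not a token counter, so treat this as an illustration of the failure mode in the benchmark design, not a reproduction of the measured numbers.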

What's in the diff

A single new file: doc/qualified-name-preamble-experiment.md (~210 lines). Covers:

TL;DR + decision
Implementation scope summary (~20 files touched, all reverted)
Test methodology — controlled fixtures, QID battery, semantic NL battery (the decisive one)
Per-codebase results with hit-rate, rank-1 hit-rate, avg expected score, top-K shifts
Disambiguation control (EventMemory.search_embeddings vs SemanticMemory.search_embeddings — margin actually shrank in Mode B)
Conclusion + reproducing instructions
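
The metrics named above could be computed roughly as follows. This is a sketch under assumptions: the result layout (query mapped to a best-first list of `(symbol, score)` pairs), the function names, and every score in the toy data are invented for illustration and are not figures from the experiment doc.

```python
from typing import Dict, List, Tuple

Ranked = List[Tuple[str, float]]  # (symbol, score), best first


def hit_rate(results: Dict[str, Ranked], expected: Dict[str, str], k: int) -> float:
    """Fraction of queries whose expected symbol appears in the top k."""
    hits = sum(
        any(sym == expected[q] for sym, _ in ranked[:k])
        for q, ranked in results.items()
    )
    return hits / len(results)


def rank1_hit_rate(results: Dict[str, Ranked], expected: Dict[str, str]) -> float:
    """Fraction of queries whose expected symbol is ranked first."""
    return hit_rate(results, expected, k=1)


def avg_expected_score(results: Dict[str, Ranked], expected: Dict[str, str]) -> float:
    """Mean score of the expected symbol; counts 0.0 when it was not returned."""
    total = sum(dict(ranked).get(expected[q], 0.0) for q, ranked in results.items())
    return total / len(results)


def margin(ranked: Ranked, expected_sym: str, rival_sym: str) -> float:
    """Disambiguation margin: expected score minus the same-named rival's score."""
    scores = dict(ranked)
    return scores[expected_sym] - scores[rival_sym]


# Toy data: made-up scores, only to exercise the functions.
results = {
    "q1": [("EventMemory.search_embeddings", 0.75),
           ("SemanticMemory.search_embeddings", 0.5)],
    "q2": [("Other.fn", 0.5), ("UserService.authenticate", 0.25)],
}
expected = {
    "q1": "EventMemory.search_embeddings",
    "q2": "UserService.authenticate",
}

print(hit_rate(results, expected, k=2))
print(rank1_hit_rate(results, expected))
print(avg_expected_score(results, expected))
print(margin(results["q1"], "EventMemory.search_embeddings",
             "SemanticMemory.search_embeddings"))
```

Running the same functions over the Mode A and Mode B result sets and comparing the outputs gives the per-mode deltas. A margin that shrinks from one mode to the other is the kind of signal the disambiguation control looks for.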

Test plan

Read doc/qualified-name-preamble-experiment.md end-to-end
Verify the conclusion is consistent with the team's calibration of when to ship vs. revert experiments
Decide whether to fold this into a broader "experiments/" subdir in doc/ or leave standalone

🤖 Generated with Claude Code

Captures the A/B/C testing of a docstring-wrapped preamble with qualified symbol names (`UserService.authenticate`) across two real codebases (Python class-heavy + Go-heavy) plus controlled fixtures.

Conclusion: the +5.6% QID benchmark gain was a literal-string-match artefact of the new preamble; semantic NL queries that don't name the class/method showed near-zero gain and one regression. Feature was reverted in the same session; this doc is the record so future iterations don't repeat the same hypothesis without the right test.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

dvcdsys wants to merge 1 commit into main from docs/qualified-name-preamble-experiment
