Observation
ddk-parent/pom.xml sets:
<pmd.cpd.min>100000</pmd.cpd.min>
This is the minimumTokens threshold for PMD's Copy/Paste Detector (CPD). At this value, CPD is effectively disabled — pmd:cpd-check will pass on virtually any duplication.
Empirical evidence
Modified com.avaloq.tools.ddk.test.core/src/.../PostconditionViolation.java to add two identical 8-line methods (~65 tokens each by CPD's count):
    public int compute1(int x) {
        int a = x + 1;
        int b = a * 2;
        int c = b - 3;
        int d = c + 4;
        int e = d * 5;
        int f = e - 6;
        return f + a + b + c + d + e;
    }
    public int compute2(int x) {
        // identical body to compute1
    }
Running pmd:cpd-check at the project's configured threshold and again with a lowered override:
| Run | Threshold | Reactor duplications detected |
|---|---|---|
| Current pmd.cpd.min=100000 | 100 000 tokens | 0 (synthetic 65-token duplication missed) |
| -Dpmd.cpd.min=20 (override) | 20 tokens | 21, including the synthetic one and real ones in the codebase |
The largest pre-existing duplication detected at the lower threshold is 90 lines / 267 tokens — meaningful duplication that the project currently doesn't see.
Suggested fix
The maven-pmd-plugin default for minimumTokens is 100 tokens. Common production values for Java sit between 50 and 100. A reasonable starting point:
<pmd.cpd.min>100</pmd.cpd.min>
Then run pmd:cpd-check against the reactor, triage the surfaced duplications (some may be legitimate — e.g., generated code, parser tables — and warrant suppression rather than fixing), and tune from there.
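A minimal sketch of how the lowered threshold and a suppression for generated code could look in ddk-parent/pom.xml. It assumes that pmd.cpd.min is passed to the maven-pmd-plugin's minimumTokens parameter, which this issue does not confirm, and the **/src-gen/** pattern is a hypothetical placeholder rather than the project's real layout:

```xml
<!-- Fragment of a parent POM, not a complete file. -->
<properties>
  <!-- Proposed starting threshold, in tokens. -->
  <pmd.cpd.min>100</pmd.cpd.min>
</properties>

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-pmd-plugin</artifactId>
      <!-- Version assumed to be managed elsewhere in the parent POM. -->
      <configuration>
        <!-- Assumption: the property is wired to CPD like this. -->
        <minimumTokens>${pmd.cpd.min}</minimumTokens>
        <excludes>
          <!-- Hypothetical pattern for generated sources; adjust to the real layout. -->
          <exclude>**/src-gen/**</exclude>
        </excludes>
      </configuration>
    </plugin>
  </plugins>
</build>
```

As far as I know, excludes is shared by the plugin's PMD and CPD goals, so a pattern added here would likely hide those files from pmd:check as well. Check that before relying on it for CPD-only suppression.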
Why this matters now
- Contributors and reviewers may believe CPD enforcement is active when it's actually a no-op.
- CI redesign work (e.g. SARIF + GitHub Code Scanning experimentation) treats pmd:cpd-check as a gate, but it never fires today.
- Roughly 20 pre-existing duplications in the reactor (21 detected at the 20-token threshold, less the synthetic one) go unreported and will keep accumulating.
Filed by Claude at João's request.