Fix Qualys parser collapsing findings with same QID but different por…#14528

Open
tejas0077 wants to merge 53 commits into DefectDojo:bugfix from tejas0077:fix/qualys-port-deduplication

Conversation

@tejas0077
Contributor

@tejas0077 tejas0077 commented Mar 15, 2026

Description
When importing Qualys scan reports, findings with the same QID but
different ports were being collapsed into a single finding, causing
inaccurate vulnerability counts and loss of port-level granularity.
Root cause: the finding title only used QID and vulnerability name,
so findings with the same QID on different ports (e.g. 80, 5985, 9999)
got deduplicated into one finding.
Fix: port is now added directly to the Endpoint (and LocationData for V3) when present. Each QID+port combination gets its own endpoint on the finding. Finding titles and deduplication are completely unchanged.
Fixes #13682
Test results
Manually traced the parser logic. Port is already extracted from the
XML as temp["port_status"] and is now passed to the Endpoint object when present.
Documentation
No documentation changes needed.
Checklist

Bugfix submitted against the bugfix branch.
Meaningful PR name given.
Proper label added.

@valentijnscholten
Member

This will need some consideration/assurance as this will completely break deduplication with existing findings. Posted a comment on #13682

@tejas0077
Contributor Author

Hi @valentijnscholten, thank you for the feedback!

You raise a valid point. Changing the title format will break
deduplication with existing findings since the hash code is
calculated from the title.

A safer approach would be to keep the title unchanged but instead
use the port in the hash code calculation for Qualys specifically,
by adding port to the HASHCODE_FIELDS_PER_SCANNER setting:

```python
"Qualys Scan": ["title", "severity", "vulnerability_ids", "cwe", "port"]
```

This way:

  • Existing finding titles remain unchanged
  • Deduplication correctly separates findings by port
  • No breaking change to existing data

Would you prefer this approach instead? I can update the PR accordingly.

Contributor

@Maffooch Maffooch left a comment

Qualys is a super popular parser, and breaking deduplication would have far reaching impacts. Port is not a field that could be used for deduplication since port is on the endpoint model rather than the finding model.

Something that could be more palatable is to add the port to the endpoint, and then add multiple endpoints to the finding

@tejas0077
Contributor Author

Thanks @Maffooch, that makes sense! I'll rework the fix to keep a single finding per QID but attach multiple endpoints with the respective ports. That way port-level granularity is preserved without touching titles or breaking deduplication. Will update the PR shortly.

@tejas0077
Contributor Author

Hi @Maffooch, I've reworked the fix as suggested. Instead of modifying the title, the port is now added directly to the Endpoint (and LocationData for V3) when present. This way each QID+port combination gets its own endpoint on the finding, port-level granularity is preserved, and existing titles/deduplication are completely unchanged. Please take a look!

@valentijnscholten
Member

The title is still being modified, I believe this has to be removed.

@tejas0077
Contributor Author

Hi @valentijnscholten, thanks for catching that! I've removed the port from the finding title. The title is now back to the original format QID-XXXX | Vulnerability Name and the port is only added to the Endpoint. Please take a look!

@Maffooch
Contributor

Please add some unit tests for this to prevent regressions, add some updates to the release notes, and then I think this should be good!

@Maffooch Maffooch added this to the 2.57.0 milestone Mar 20, 2026
@tejas0077 tejas0077 force-pushed the fix/qualys-port-deduplication branch from 1671d06 to a4e5698 on March 20, 2026 at 17:38
tejas0077 pushed a commit to tejas0077/django-DefectDojo that referenced this pull request Mar 20, 2026
…n fix

- Add test XML with same QID on ports 80, 443, 8080
- Add test verifying each port gets its own endpoint
- Add 2.57.x release notes mentioning the fix

Addresses review feedback from @Maffooch on PR DefectDojo#14528
@github-actions github-actions bot added the docker, settings_changes, apiv2, docs, unittests, ui, helm labels on Mar 20, 2026
@tejas0077
Contributor Author

Hi @Maffooch, I have addressed both your review requests:

Added unit tests: created a test XML file with the same QID (12345) on three different ports (80, 443, 8080), plus a test that verifies each port gets its own separate endpoint, the finding title remains unchanged as QID-12345 | Test Vulnerability, and all 3 ports are correctly captured.
Added release notes: created docs/content/releases/os_upgrading/2.57.md with a note about the Qualys parser fix, referencing issue #13682.

Please take a look!
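A regression test along those lines can be sketched self-contained. The XML fragment and field names below are simplified assumptions for illustration, not the actual Qualys report schema or the DefectDojo test suite:

```python
# Illustrative sketch: parse a minimal, made-up XML fragment and check that
# one QID yields one finding with one entry per port. The real test would
# run the Qualys parser against a full sample report.
import xml.etree.ElementTree as ET

SCAN_XML = """
<SCAN>
  <DETECTION qid="12345" host="10.0.0.5" port="80"/>
  <DETECTION qid="12345" host="10.0.0.5" port="443"/>
  <DETECTION qid="12345" host="10.0.0.5" port="8080"/>
</SCAN>
"""


def parse(xml_text):
    findings = {}
    for det in ET.fromstring(xml_text).iter("DETECTION"):
        qid = det.get("qid")
        entry = findings.setdefault(
            qid, {"title": f"QID-{qid} | Test Vulnerability", "ports": set()},
        )
        if det.get("port"):  # port is optional in real reports
            entry["ports"].add(int(det.get("port")))
    return findings


findings = parse(SCAN_XML)
# Expect: one finding, title unchanged, all three ports captured
```

The assertions to guard against regression would be exactly these three: finding count, title format, and the full port set.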

@github-actions
Contributor

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@valentijnscholten
Member

@tejas0077 Can you rebase to get rid of the conflicts?

dependabot bot and others added 5 commits March 30, 2026 14:35
Bumps [ruff](https://github.com/astral-sh/ruff) from 0.15.2 to 0.15.4.
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](astral-sh/ruff@0.15.2...0.15.4)

---
updated-dependencies:
- dependency-name: ruff
  dependency-version: 0.15.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…mpose.yml) (DefectDojo#14399)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
…idator action from v2.0.0 to v2.1.0 (.github/workflows/renovate.yaml) (DefectDojo#14407)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
…v1.35.2 (.github/workflows/k8s-tests.yml) (DefectDojo#14417)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
…ithub/workflows/k8s-tests.yml) (DefectDojo#14418)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
renovate bot and others added 26 commits March 30, 2026 14:37
…2.12 to v (docker-compose.yml) (DefectDojo#14480)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
…Dojo#14434)

* minor changes: django.conf.settings over dojo.settings

* missed bit

* auditlog not used anymore
Co-authored-by: valentijnscholten <4426050+valentijnscholten@users.noreply.github.com>
Co-authored-by: valentijnscholten <valentijnscholten@gmail.com>
* test: add IriusRisk parser sample scan files

Authored by T. Walker - DefectDojo

* feat: add IriusRisk parser stub for auto-discovery

Authored by T. Walker - DefectDojo

* test: add IriusRisk parser unit tests (failing, TDD)

Authored by T. Walker - DefectDojo

* feat: implement IriusRisk CSV threat parser

Authored by T. Walker - DefectDojo

* docs: add IriusRisk parser documentation

Authored by T. Walker - DefectDojo

* fix: address gap analysis findings for IriusRisk parser

- Update test CSVs from 12 to 14 columns (add MITRE reference, STRIDE-LM)
- Parse MITRE reference: CWE-NNN extracts to cwe field, other values to references
- Include STRIDE-LM in description when populated
- Add Critical to severity mapping
- Change static_finding to False per connector spec
- Update documentation to reflect all changes
- Add tests for CWE extraction, references, STRIDE-LM, and Critical severity

Authored by T. Walker - DefectDojo

* fix: remove computed unique_id_from_tool from IriusRisk parser

Per PR review feedback, parsers must not compute unique_id_from_tool.
Removed SHA-256 hash generation and related tests. Deduplication now
relies on DefectDojo's default hashcode algorithm. Updated docs
to reflect the change.

Authored by T. Walker - DefectDojo

* docs: remove parser line numbers from IriusRisk documentation

Per PR review feedback, removed line number references from field
mapping tables and prose sections to reduce maintenance burden
when parser code changes.

Authored by T. Walker - DefectDojo

* fix: increase title truncation threshold from 150 to 500 characters

Per PR review feedback, expanded title field to use more of the
available 511 characters. Added test data with 627-char threat
to verify truncation behavior. Updated docs accordingly.

Authored by T. Walker - DefectDojo

* feat: add hashcode deduplication config for IriusRisk parser

Register IriusRisk Threats Scan in HASHCODE_FIELDS_PER_SCANNER and
DEDUPLICATION_ALGORITHM_PER_PARSER so deduplication uses title and
component_name rather than the legacy algorithm. These stable fields
ensure reimports match existing findings even when risk levels or
countermeasure progress change between scans. Update docs to match.

Authored by T. Walker - DefectDojo

* chore: retrigger CI checks

Authored by T. Walker - DefectDojo

---------

Co-authored-by: Cody Maffucci <46459665+Maffooch@users.noreply.github.com>
…7 (.github/workflows/release-x-manual-docker-containers.yml) (DefectDojo#14451)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
…r-compose.yml) (DefectDojo#13582)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
* perf: batch duplicate marking in batch deduplication

Instead of saving each duplicate finding individually, collect all
modified findings during a batch deduplication run and flush them in
a single bulk_update call. Original (existing) findings are still
saved individually to preserve auto_now timestamp updates and
post_save signal behavior, but are deduplicated by id so each is
saved at most once per batch.

Reduces DB writes from O(2N) individual saves to 1 bulk_update +
O(unique originals) saves for a batch of N duplicates.

Performance test shows -23 queries on a second import with duplicates.
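The write-batching pattern this commit describes can be illustrated in plain Python. Everything below is a hypothetical stand-in (a fake DB object instead of Django's QuerySet.bulk_update and model save()), showing only the counting argument: duplicates flush in one write, originals save at most once per batch.

```python
# Sketch of the batching pattern: duplicates collected and flushed once via a
# single bulk write; originals deduplicated by id so each is written at most
# once. "FakeDB" is an illustrative stand-in, not Django.
class FakeDB:
    def __init__(self):
        self.bulk_updates = 0
        self.single_saves = 0

    def bulk_update(self, objs):
        self.bulk_updates += 1  # one write covers all duplicates

    def save(self, obj):
        self.single_saves += 1  # per-original write (keeps signals/timestamps)


def run_batch(db, pairs):
    """pairs: (duplicate_id, original_id) matches found in one dedup batch."""
    modified_duplicates = []
    originals_by_id = {}
    for dup_id, orig_id in pairs:
        modified_duplicates.append(dup_id)
        originals_by_id[orig_id] = orig_id  # dedupe originals by id
    db.bulk_update(modified_duplicates)
    for orig in originals_by_id.values():
        db.save(orig)


db = FakeDB()
run_batch(db, [(1, 100), (2, 100), (3, 101), (4, 101), (5, 102)])
# 1 bulk_update + 3 original saves, versus 10 individual saves before (O(2N))
```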

* perf: restrict SELECT columns for batch deduplication via only()

Add Finding.DEDUPLICATION_FIELDS — the union of all Finding fields
needed across every deduplication algorithm — and apply it as an
only() clause in get_finding_models_for_deduplication.

This avoids loading large text columns (description, mitigation,
impact, references, steps_to_reproduce, severity_justification, etc.)
when loading findings for the batch deduplication task, reducing
data transferred from the database without affecting query count.

build_candidate_scope_queryset is intentionally excluded: it is also
used for reimport matching (which accesses severity, numerical_severity
and other fields outside this set) and applying only() there would
cause deferred-field extra queries.

* perf(dedup): defer large text fields on candidate queryset

- Add Finding.DEDUPLICATION_DEFERRED_FIELDS constant listing large text
  columns (description, mitigation, impact, references, etc.) that are
  never read during deduplication or candidate matching.
- Apply .defer(*Finding.DEDUPLICATION_DEFERRED_FIELDS) in
  build_candidate_scope_queryset to avoid loading those columns for the
  potentially large candidate pool fetched per dedup batch.

Reduces deduplication second-import query count from 213 to 183 (-30).

---------

Co-authored-by: Matt Tesauro <mtesauro@gmail.com>
…#14449)

* perf(fp-history): batch false positive history processing

Replaces the N+1 query pattern in false positive history with a single
product-scoped DB query per batch, and switches per-finding save() calls
to QuerySet.update() to eliminate redundant signal overhead.

Changes:
- Extract _fp_candidates_qs() as the single algorithm-dispatch helper
  shared by both single-finding and batch lookup paths
- Add do_false_positive_history_batch() which fetches all FP candidates
  in one query and marks findings with a single UPDATE
- do_false_positive_history() now delegates to the batch function
- post_process_findings_batch (import/reimport) calls the batch function
  instead of a per-finding loop
- _bulk_update_finding_status_and_severity (bulk edit) groups findings
  by (product, dedup_alg) and calls the batch function once per group;
  retroactive reactivation also batched the same way
- Fix dead-code bug in process_false_positive_history: the condition
  finding.false_p and not finding.false_p was always False because
  form.save(commit=False) mutates the finding in place; fixed by
  capturing old_false_p before the form save
- Replace all per-finding save()/save_no_options() in FP history paths
  with QuerySet.update() (bypasses signals identically to the old calls)
- Move all FP history helpers from dojo/utils.py to
  dojo/finding/deduplication.py alongside the matching dedupe helpers

All update() calls carry a comment explaining the signal-bypass
equivalence with the previous save(skip_validation=True) calls.

Adds 4 unit tests covering: batch single-query behaviour, retroactive
batch FP marking, retroactive reactivation (previously dead code), and
the no-reactivation guard.

* perf(fp-history): add .only() to candidate fetch, fix update() comments

Limit _fetch_fp_candidates_for_batch to only the fields actually read
from candidate objects (id, false_p, active, hash_code,
unique_id_from_tool, title, severity), avoiding loading unused columns.

Correct update() comments to clarify that .only() does not constrain
QuerySet.update() — Django generates UPDATE SQL independently — so the
sync requirement is only for fields *read* from candidate objects.

* test(fp-history): assert exact query count in batch tests

assertNumQueries(7) on both batch tests covers: System_Settings,
4 lazy-load chain (test/engagement/product/test_type from findings[0]),
candidates SELECT with .only(), and the bulk UPDATE — fixed regardless
of batch size or number of retroactively marked findings.

* test(fp-history): assert query count stays flat with N affected findings

New test creates 5 pre-existing findings and asserts the batch still
uses exactly 7 queries regardless — proving the old O(N) per-finding
save loop is gone and a single bulk UPDATE covers all affected rows.
…8.0.1 (.github/workflows/rest-framework-tests.yml) (DefectDojo#14490)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
…to v0.13.1 (.github/workflows/cancel-outdated-workflow-runs.yml) (DefectDojo#14491)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
…0 to v7 (.github/workflows/release-drafter.yml) (DefectDojo#14513)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Bumps [ruff](https://github.com/astral-sh/ruff) from 0.15.5 to 0.15.6.
- [Release notes](https://github.com/astral-sh/ruff/releases)
- [Changelog](https://github.com/astral-sh/ruff/blob/main/CHANGELOG.md)
- [Commits](astral-sh/ruff@0.15.5...0.15.6)

---
updated-dependencies:
- dependency-name: ruff
  dependency-version: 0.15.6
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…43.76.4 (.github/workflows/renovate.yaml) (DefectDojo#14526)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
… v2.5.3 (.github/workflows/release-x-manual-helm-chart.yml) (DefectDojo#14525)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
… v2.6.1 (.github/workflows/release-x-manual-helm-chart.yml) (DefectDojo#14532)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
…ces and publish date. (DefectDojo#14498)

* Support CVSS4 and also import CVSS vectors, references and publish date.

* Fix linter issues
…fectdojo/chart.yaml) (DefectDojo#14509)

* chore(deps): update valkey docker tag from 0.17.1 to v0.18.0 (helm/defectdojo/chart.yaml)

* update Helm documentation

---------

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* deduplication: return modified findings

* fix(lint): remove unnecessary elif after return (RET505)

* update comments
…n fix

- Add test XML with same QID on ports 80, 443, 8080
- Add test verifying each port gets its own endpoint
- Add 2.57.x release notes mentioning the fix

Addresses review feedback from @Maffooch on PR DefectDojo#14528
@tejas0077 tejas0077 force-pushed the fix/qualys-port-deduplication branch from a4e5698 to a998e63 on March 30, 2026 at 18:42
@github-actions
Contributor

Conflicts have been resolved. A maintainer will review the pull request shortly.

@tejas0077
Contributor Author

Hi @valentijnscholten, I've rebased the branch onto the latest bugfix branch. Conflicts are resolved. Please take a look!


Labels

apiv2, docker, docs, helm, parser, settings_changes, ui, unittests


8 participants