Merged
41 changes: 30 additions & 11 deletions tools/sbom-diff-and-risk/docs/policy-schema.md
@@ -1,6 +1,8 @@
# Policy schema

-`sbom-diff-and-risk` supports YAML-only policy schemas in versions `1`, `2`, and `3` for the local, provenance-aware, and optional Scorecard-aware policy flows described here.
+`sbom-diff-and-risk` supports YAML-only policy schemas in versions `1`, `2`,
+and `3` for the local, provenance-aware, and optional Scorecard-aware policy
+flows described here.

The schema is intentionally conservative and fail-closed:

@@ -39,9 +41,14 @@ Version `2` supports every version `1` field plus:
- `require_provenance_for_suspicious_sources: bool`
- `allow_unattested_packages: [package_name, ...]`
- `allow_provenance_publishers: [publisher_kind, ...]`
- `allow_unattested_publishers: [publisher_kind, ...]` as an accepted compatibility alias for `allow_provenance_publishers`
- `allow_unattested_publishers: [publisher_kind, ...]` as an accepted
compatibility alias for `allow_provenance_publishers`

-`allow_provenance_publishers` is the canonical publisher override field. The parser also accepts `allow_unattested_publishers` as an alias when teams want a more explicit override-style name in review. Neither field treats missing attestations as trusted; they only constrain which attested publisher kinds count as verified provenance.
+`allow_provenance_publishers` is the canonical publisher override field. The
+parser also accepts `allow_unattested_publishers` as an alias when teams want a
+more explicit override-style name in review. Neither field treats missing
+attestations as trusted; they only constrain which attested publisher kinds
+count as verified provenance.

## Version 2 supported rule ids

@@ -58,7 +65,9 @@ Version `3` supports every version `1` and `2` field plus:

- `minimum_scorecard_score: float`

-`minimum_scorecard_score` is advisory by itself. It only affects policy outcomes when you also opt into the `scorecard_below_threshold` rule through `block_on`, `warn_on`, or `ignore_rules`.
+`minimum_scorecard_score` is advisory by itself. It only affects policy outcomes
+when you also opt into the `scorecard_below_threshold` rule through `block_on`,
+`warn_on`, or `ignore_rules`.

## Version 3 supported rule ids

@@ -75,17 +84,27 @@ Version `3` supports every version `1` and `2` rule id plus:
- `allow_sources` enforces exact host matches against `source_url` hosts for added and changed components.
- `ignore_rules` suppresses matching rule ids entirely.
- `missing_attestation` means PyPI release metadata was fetched successfully but no attestations were present.
-- `provenance_unavailable` means the run did not have usable provenance evidence for that package, for example because enrichment was disabled, unsupported, or failed.
-- `unverified_provenance` means attestations were present, but the provenance could not be verified against publisher metadata.
+- `provenance_unavailable` means the run did not have usable provenance
+evidence for that package, for example because enrichment was disabled,
+unsupported, or failed.
+- `unverified_provenance` means attestations were present, but the provenance
+could not be verified against publisher metadata.
- `provenance_required` is a policy-only rule emitted when an explicit provenance requirement was not satisfied.
- `require_attestations_for_new_packages` applies only to added PyPI packages.
- `require_provenance_for_suspicious_sources` applies only when the component also triggered `suspicious_source`.
- `allow_unattested_packages` is a narrow package-name override for explicit missing-attestation exceptions only.
-- `allow_unattested_packages` does not waive `provenance_unavailable` or `unverified_provenance`; those remain separate, reviewable policy decisions.
-- `allow_provenance_publishers` and `allow_unattested_publishers` apply only when attestations exist and publisher kinds are available to verify.
-- when enrichment is disabled, deterministic local mode is unchanged unless a provenance-aware policy explicitly turns unavailable evidence into a warning or block.
-- `minimum_scorecard_score` does not create alerts or blocks on its own; it only becomes enforceable when `scorecard_below_threshold` is configured explicitly.
-- Scorecard evidence remains an auxiliary trust signal. A high score is not proof of safety, and missing Scorecard data is not proof of risk.
+- `allow_unattested_packages` does not waive `provenance_unavailable` or
+`unverified_provenance`; those remain separate, reviewable policy decisions.
+- `allow_provenance_publishers` and `allow_unattested_publishers` apply only
+when attestations exist and publisher kinds are available to verify.
+- when enrichment is disabled, deterministic local mode is unchanged unless a
+provenance-aware policy explicitly turns unavailable evidence into a warning
+or block.
+- `minimum_scorecard_score` does not create alerts or blocks on its own; it
+only becomes enforceable when `scorecard_below_threshold` is configured
+explicitly.
+- Scorecard evidence remains an auxiliary trust signal. A high score is not
+proof of safety, and missing Scorecard data is not proof of risk.
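
Reviewer sketch (not part of the diff): the opt-in behaviour of `minimum_scorecard_score` described above can be illustrated roughly as follows. This is a minimal sketch, not the tool's implementation. The function name `scorecard_outcome`, the returned strings, the precedence of `ignore_rules` over `block_on` over `warn_on`, and the choice to return no outcome when Scorecard data is missing are all assumptions made for illustration.

```python
from typing import Optional

RULE_ID = "scorecard_below_threshold"


def scorecard_outcome(policy: dict, observed_score: Optional[float]) -> str:
    """Return "block", "warn", "suppressed", or "none" for one component.

    Hypothetical helper: the precedence order below is an assumption.
    """
    threshold = policy.get("minimum_scorecard_score")
    # Without a threshold or an observed score there is nothing to compare.
    # Assumption: missing Scorecard data produces no outcome here.
    if threshold is None or observed_score is None:
        return "none"
    if observed_score >= threshold:
        return "none"
    # The threshold alone is advisory; the rule must be opted into.
    if RULE_ID in policy.get("ignore_rules", []):
        return "suppressed"
    if RULE_ID in policy.get("block_on", []):
        return "block"
    if RULE_ID in policy.get("warn_on", []):
        return "warn"
    return "none"


advisory_only = {"minimum_scorecard_score": 7.0}
enforced = {"minimum_scorecard_score": 7.0, "block_on": [RULE_ID]}
print(scorecard_outcome(advisory_only, 3.0))
print(scorecard_outcome(enforced, 3.0))
```

The two calls differ only in whether the rule id is listed in `block_on`, which is the distinction the semantics list draws.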

## Version 1 example

85 changes: 64 additions & 21 deletions tools/sbom-diff-and-risk/docs/report-schema.md
@@ -1,8 +1,13 @@
# JSON report schema

-This document describes the stable, reviewer-facing shape of `sbom-diff-and-risk` JSON reports. The JSON format is intended for machine consumption in CI, review tooling, and audit trails. Human-readable review notes remain in the Markdown report.
+This document describes the stable, reviewer-facing shape of
+`sbom-diff-and-risk` JSON reports. The JSON format is intended for machine
+consumption in CI, review tooling, and audit trails. Human-readable review
+notes remain in the Markdown report.

-The schema is conservative and additive where possible. Golden sample reports in `examples/` lock important output shape for the default, policy, provenance, and Scorecard paths.
+The schema is conservative and additive where possible. Golden sample reports
+in `examples/` lock important output shape for the default, policy, provenance,
+and Scorecard paths.

## Top-level structure

@@ -26,27 +31,58 @@ JSON reports currently use this top-level structure:
| `metadata` | Run metadata such as input formats, generation time, strict mode, policy state, and enrichment state. |
| `notes` | Additional report notes. |

-When provenance policy fields are relevant, reports may also include `provenance_policy` and `provenance_policy_impact`. Consumers should treat unrecognized top-level fields as additive report data.
+When provenance policy fields are relevant, reports may also include
+`provenance_policy` and `provenance_policy_impact`. Consumers should treat
+unrecognized top-level fields as additive report data.

## Policy finding explanation fields

-Policy findings in `policy_evaluation.blocking_violations`, `policy_evaluation.warning_violations`, `policy_evaluation.suppressed_violations`, `blocking_findings`, `warning_findings`, `suppressed_findings`, and provenance policy impact sections include stable explainability metadata.
-
-These fields describe why a local policy rule produced a block, warning, or suppression. They are policy-decision metadata only; they are not dependency safety verdicts, CVE results, or proof that a package is safe or unsafe.
-
-| Field | Meaning |
-| --- | --- |
-| `decision_reason` | Stable reason code for the policy decision, such as `risk_finding_matched_policy_rule`, `added_package_count_exceeded_threshold`, or `scorecard_score_below_threshold`. |
-| `policy_rule` | Policy rule id that produced the decision. This mirrors `rule_id` for consumers that group explanation data separately. |
-| `severity_source` | Source of the active severity, such as `block_on`, `warn_on`, `default_block`, or `default_warn`; `null` when a policy finding has no active severity. |
-| `matched_threshold` | Configured threshold or allowlist value involved in the decision, when applicable. |
-| `observed_value` | Observed local value that was compared to the policy rule, when applicable. |
-
-Explanation fields appear only on policy finding objects. Risk findings in `risks` remain the analyzer's local heuristic findings and do not receive policy-decision metadata unless a policy evaluation maps them into policy findings.
+Policy findings in the following report sections include stable explainability
+metadata:
+
+- `policy_evaluation.blocking_violations`
+- `policy_evaluation.warning_violations`
+- `policy_evaluation.suppressed_violations`
+- `blocking_findings`
+- `warning_findings`
+- `suppressed_findings`
+- provenance policy impact sections
+
+These fields describe why a local policy rule produced a block, warning, or
+suppression. They are policy-decision metadata only; they are not dependency
+safety verdicts, CVE results, or proof that a package is safe or unsafe.
+
+- `decision_reason`: Stable reason code for the policy decision, such as
+`risk_finding_matched_policy_rule`,
+`added_package_count_exceeded_threshold`, or
+`scorecard_score_below_threshold`.
+- `policy_rule`: Policy rule id that produced the decision. This mirrors
+`rule_id` for consumers that group explanation data separately.
+- `severity_source`: Source of the active severity, such as `block_on`,
+`warn_on`, `default_block`, or `default_warn`; `null` when a policy finding
+has no active severity.
+- `matched_threshold`: Configured threshold or allowlist value involved in the
+decision, when applicable.
+- `observed_value`: Observed local value that was compared to the policy rule,
+when applicable.
+
+Explanation fields appear only on policy finding objects. Risk findings in
+`risks` remain the analyzer's local heuristic findings and do not receive
+policy-decision metadata unless a policy evaluation maps them into policy
+findings.
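
Reviewer sketch (not part of the diff): a consumer could gather the explanation fields from the `policy_evaluation` sections roughly like this. The field and section names come from the documentation above. The helper name `collect_explanations` and the sample report values are hypothetical, and reading missing fields as `None` is an assumption rather than documented behaviour.

```python
EXPLANATION_FIELDS = (
    "decision_reason",
    "policy_rule",
    "severity_source",
    "matched_threshold",
    "observed_value",
)

POLICY_SECTIONS = (
    "blocking_violations",
    "warning_violations",
    "suppressed_violations",
)


def collect_explanations(report: dict) -> list:
    """Flatten explanation metadata from policy findings into rows."""
    evaluation = report.get("policy_evaluation") or {}
    rows = []
    for section in POLICY_SECTIONS:
        for finding in evaluation.get(section, []):
            row = {"section": section}
            for field in EXPLANATION_FIELDS:
                # Assumption: a field that is absent is read as None.
                row[field] = finding.get(field)
            rows.append(row)
    return rows


# Hypothetical report fragment; the values are invented for illustration.
sample_report = {
    "policy_evaluation": {
        "blocking_violations": [
            {
                "rule_id": "scorecard_below_threshold",
                "decision_reason": "scorecard_score_below_threshold",
                "policy_rule": "scorecard_below_threshold",
                "severity_source": "block_on",
                "matched_threshold": 7.0,
                "observed_value": 3.2,
            }
        ],
        "warning_violations": [],
        "suppressed_violations": [],
    }
}

for row in collect_explanations(sample_report):
    print(row["section"], row["decision_reason"], row["severity_source"])
```

Entries in `risks` are deliberately not walked here, since the text above says they carry no policy-decision metadata of their own.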

## Summary contract

-`summary` is the stable, compact entry point for automation that needs counts without walking the full report. The `--summary-json PATH` CLI option writes only this stable `report.json["summary"]` object. The checked-in [../examples/sample-summary.json](../examples/sample-summary.json) artifact is the summary-only output for the default CycloneDX example and matches the `summary` object in [../examples/sample-report.json](../examples/sample-report.json). For CI consumption examples, see [summary-json-ci-cookbook.md](summary-json-ci-cookbook.md).
+`summary` is the stable, compact entry point for automation that needs counts
+without walking the full report. The `--summary-json PATH` CLI option writes
+only this stable `report.json["summary"]` object.
+
+The checked-in [../examples/sample-summary.json](../examples/sample-summary.json)
+artifact is the summary-only output for the default CycloneDX example and
+matches the `summary` object in
+[../examples/sample-report.json](../examples/sample-report.json). For CI
+consumption examples, see
+[summary-json-ci-cookbook.md](summary-json-ci-cookbook.md).

Base `summary` fields:

@@ -57,9 +93,12 @@ Base `summary` fields:
| `changed` | Number of components present in both inputs with a detected change. |
| `risk_counts` | Map of risk bucket name to count. |

-There is intentionally no `unchanged` field. The current diff model does not track unchanged components, so reporting an unchanged count would imply a model guarantee that does not exist.
+There is intentionally no `unchanged` field. The current diff model does not
+track unchanged components, so reporting an unchanged count would imply a model
+guarantee that does not exist.

-`summary.policy` appears only when a policy is applied. Absence of `summary.policy` means policy was not used, not that policy evaluation failed.
+`summary.policy` appears only when a policy is applied. Absence of
+`summary.policy` means policy was not used, not that policy evaluation failed.

| Field | Meaning |
| --- | --- |
@@ -68,7 +107,9 @@ There is intentionally no `unchanged` field. The current diff model does not tra
| `summary.policy.warning` | Count of warning policy violations. |
| `summary.policy.suppressed` | Count of suppressed policy violations. |

-`summary.enrichment` appears only when PyPI or Scorecard enrichment is used. Absence of `summary.enrichment` means enrichment was not used, not that enrichment failed.
+`summary.enrichment` appears only when PyPI or Scorecard enrichment is used.
+Absence of `summary.enrichment` means enrichment was not used, not that
+enrichment failed.

| Field | Meaning |
| --- | --- |
@@ -81,7 +122,9 @@ There is intentionally no `unchanged` field. The current diff model does not tra
| `summary.enrichment.scorecard.supported_components` | Count of components supported by Scorecard enrichment. |
| `summary.enrichment.scorecard.status_counts` | Sorted map of Scorecard enrichment status names to counts. |

-Provider-specific `pypi` and `scorecard` objects appear only for the providers used in that run. Their `status_counts` maps are sorted by key to keep output stable for tests and downstream consumers.
+Provider-specific `pypi` and `scorecard` objects appear only for the providers
+used in that run. Their `status_counts` maps are sorted by key to keep output
+stable for tests and downstream consumers.
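
Reviewer sketch (not part of the diff): the "absent means not used" rule for `summary.policy` and `summary.enrichment` suggests a consumer along these lines. The helper name `describe_summary`, the output strings, and the status names in the sample are hypothetical. Only field names that are visible in this diff are read.

```python
def describe_summary(summary: dict) -> list:
    """Render a summary object as short status lines for a CI log."""
    lines = ["changed=" + str(summary.get("changed"))]

    policy = summary.get("policy")
    if policy is None:
        # Absence means no policy was applied, not that evaluation failed.
        lines.append("policy: not applied")
    else:
        lines.append(
            "policy: warning={} suppressed={}".format(
                policy.get("warning"), policy.get("suppressed")
            )
        )

    enrichment = summary.get("enrichment")
    if enrichment is None:
        # Absence means enrichment was not used, not that it failed.
        lines.append("enrichment: not used")
    else:
        # Provider objects appear only for the providers used in the run.
        for provider in ("pypi", "scorecard"):
            if provider in enrichment:
                counts = enrichment[provider].get("status_counts", {})
                rendered = ", ".join(
                    "{}={}".format(name, count)
                    for name, count in sorted(counts.items())
                )
                lines.append("{}: {}".format(provider, rendered))
    return lines


# Hypothetical summary; the status names are invented for illustration.
sample_summary = {
    "changed": 2,
    "risk_counts": {},
    "enrichment": {"scorecard": {"status_counts": {"ok": 1, "unavailable": 1}}},
}
for line in describe_summary(sample_summary):
    print(line)
```

The explicit `sorted` call is redundant if the report already emits sorted maps, as the paragraph above states; it only keeps the sketch stable for hand-written input.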

## Stability notes
