
Add bundle-ready UK release manifests #395

Merged
anth-volk merged 6 commits into main from add-bundle-release-manifest
May 5, 2026

Conversation

@anth-volk
Contributor

Fixes #394

Summary

This draft PR makes UK data release manifests compatible with the experimental PolicyEngine bundle/reproducibility contract.

  • Adds bundle-compatible policyengine-uk, policyengine-core, and UK data package provenance metadata to release manifests.
  • Records private Hugging Face artifact release identity, artifact URIs, checksums, sizes, and visibility metadata.
  • Removes legacy top-level created_at from normalized manifests so emitted manifests validate against the shared bundle contract.
  • Adds a PR-only contract validation job that installs policyengine-bundles as test tooling, not as a runtime dependency.
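
As a rough illustration of the created_at normalization above, the following is a minimal sketch, not the code in this PR. The function name `normalize_manifest` and the sample manifest keys other than created_at are hypothetical.

```python
from copy import deepcopy


def normalize_manifest(manifest: dict) -> dict:
    """Return a copy of the manifest without the legacy top-level created_at.

    Hypothetical helper: only the top-level key is dropped, so nested
    timestamps (if any) are left untouched.
    """
    normalized = deepcopy(manifest)
    normalized.pop("created_at", None)
    return normalized


# Hypothetical input shape, for illustration only.
legacy = {
    "created_at": "2026-05-05T00:00:00Z",
    "data_package": {"name": "policyengine-uk-data", "version": "1.2.3"},
}
normalized = normalize_manifest(legacy)
```

Returning a copy rather than mutating in place keeps the legacy manifest available to any caller that still reads created_at.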

Validation

  • ruff check policyengine_uk_data/utils/release_manifest.py policyengine_uk_data/utils/data_upload.py policyengine_uk_data/tests/test_release_manifest.py
  • ruff format --check policyengine_uk_data/utils/release_manifest.py policyengine_uk_data/utils/data_upload.py policyengine_uk_data/tests/test_release_manifest.py
  • ruby -e 'require "yaml"; YAML.load_file(".github/workflows/pull_request.yaml"); puts "ok"'
  • git diff --check
  • PYTHONPATH=/Users/administrator/Documents/PolicyEngine/policyengine-bundles/src:$PYTHONPATH python -m pytest policyengine_uk_data/tests/test_release_manifest.py -q

Notes

uv run pytest policyengine_uk_data/tests/test_release_manifest.py -q is blocked on this local Intel Mac because the current lock resolves torch==2.9.1, which has no compatible macosx_15_0_x86_64 wheel. CI should run this on Linux.

This is intentionally draft because it demonstrates the bundle contract integration before the bundle schema and release orchestration are finalized.

Contributor Author

Dependency note: this PR should not be merged until the companion policyengine-uk runtime metadata work is merged and released first, currently PolicyEngine/policyengine-uk#1659. Without that, UK data can validate the manifest shape but may only record fallback policyengine-uk metadata rather than the full bundle-ready runtime metadata required for certification.

Contributor Author

anth-volk commented May 5, 2026

Follow-up scope note for the bundle plan:

This PR is intended to cover the staged migration layer at Commit 7 (my own internal numbering): make UK data release manifests bundle-ready enough to validate against the current policyengine-bundles data release contract. That means emitted manifests now carry artifact URIs/revisions, SHA256s/sizes, built-with policyengine-uk, built-with policyengine-core, data package git SHA, default dataset mapping, and public/private artifact metadata.
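
For readers unfamiliar with the checksum and size fields mentioned above, here is a minimal sketch of how a SHA256 digest and byte size could be computed for a local artifact. The function name `describe_artifact` and the output keys `sha256` and `size_bytes` are assumptions for illustration and may not match the keys the manifests actually use.

```python
import hashlib
from pathlib import Path


def describe_artifact(path: Path, chunk_size: int = 1 << 20) -> dict:
    """Return the SHA256 hex digest and byte size of a local file.

    Hypothetical helper; the key names are illustrative only.
    """
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        # Read in chunks so large dataset files are not held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return {"sha256": digest.hexdigest(), "size_bytes": size}
```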

Clarified versioning invariant:

  • UK data is intended to use one release coordinate. The policyengine-uk-data package version, Hugging Face release tag, release manifest data_package.version, artifact release metadata version, artifact revision, and artifact URI revision should all be the same value.
  • This PR now tests that emitted manifests preserve that invariant. We should not introduce a separate private artifact release version unless the release process intentionally stops keeping package and dataset releases aligned.
  • The practical remaining release-discipline issue is that bundle generation should reject a UK-data package version until the matching HF tag/manifests/artifacts exist for that same version.
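
The single-release-coordinate invariant can be sketched as a check like the one below. This is an illustration under stated assumptions, not the test added in this PR: apart from data_package.version, the manifest keys (`artifacts`, `release_version`, `revision`, `uri`) and the `hf://<owner>/<repo>@<revision>/<path>` URI layout are hypothetical.

```python
def uri_revision(uri: str) -> str:
    """Extract the revision from an assumed hf://<owner>/<repo>@<revision>/<path> URI."""
    _, _, rest = uri.partition("@")
    revision, _, _ = rest.partition("/")
    return revision


def release_coordinates(manifest: dict, package_version: str, hf_tag: str) -> dict:
    """Collect every value that should equal the one release coordinate."""
    coords = {
        "package_version": package_version,
        "hf_tag": hf_tag,
        "data_package.version": manifest["data_package"]["version"],
    }
    for name, artifact in manifest["artifacts"].items():
        coords[f"artifacts.{name}.release_version"] = artifact["release_version"]
        coords[f"artifacts.{name}.revision"] = artifact["revision"]
        coords[f"artifacts.{name}.uri_revision"] = uri_revision(artifact["uri"])
    return coords


def check_single_release_coordinate(
    manifest: dict, package_version: str, hf_tag: str
) -> None:
    """Raise ValueError if any coordinate differs from the package version."""
    coords = release_coordinates(manifest, package_version, hf_tag)
    mismatched = {key: value for key, value in coords.items() if value != package_version}
    if mismatched:
        raise ValueError(f"Release coordinates diverge from {package_version}: {mismatched}")


# Hypothetical manifest, for illustration only.
example_manifest = {
    "data_package": {"version": "1.2.3"},
    "artifacts": {
        "default": {
            "release_version": "1.2.3",
            "revision": "1.2.3",
            "uri": "hf://example-org/example-repo@1.2.3/data.h5",
        }
    },
}
check_single_release_coordinate(example_manifest, "1.2.3", "1.2.3")
```

Reporting every mismatched field at once, rather than failing on the first, should make a misaligned release easier to diagnose.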

Some issues are deliberately left for later commits because they likely need canonical contract/schema decisions in policyengine-bundles, not just UK-data-local implementation:

  • Commit 10 should finish UK private data atomic release handling. This PR now prevents mutation of finalized HF version tags, but it does not yet add the full release-completion contract: release-complete.json, validation report publication, required private artifact reachability checks, and CI gates that fail if private artifact publication is incomplete.
  • The public/private HF repo split still needs a bundle-level certification rule. This PR records repo identity and visibility in emitted manifests, but it does not redesign how callers choose public vs private repos. The bundle contract should eventually certify exact artifact refs and require private UK data reachability with credentials rather than relying on function defaults.
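
To make the "prevents mutation of finalized HF version tags" behavior concrete, a guard along these lines is one possible shape. It is a sketch only: `FinalizedTagError` and `ensure_tag_is_mutable` are hypothetical names, and the caller is assumed to supply the set of already-finalized tags rather than this code querying Hugging Face.

```python
class FinalizedTagError(RuntimeError):
    """Raised when an upload would modify an already-finalized version tag."""


def ensure_tag_is_mutable(tag: str, finalized_tags: set[str]) -> None:
    """Refuse to proceed if the target tag has already been finalized.

    Hypothetical guard: finalized_tags is provided by the caller.
    """
    if tag in finalized_tags:
        raise FinalizedTagError(
            f"Version tag {tag!r} is finalized; publish a new version instead."
        )


# A new version is allowed while existing finalized tags stay untouched.
ensure_tag_is_mutable("1.2.4", finalized_tags={"1.2.2", "1.2.3"})
```

A guard like this covers only tag immutability; the release-completion checks listed above would still need their own contract.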

The reason to defer these is that they are cross-repo contract questions. Implementing them only here would either bury important semantics in free-form metadata or create a UK-specific shape that policyengine-bundles, .py, and other consumers do not yet know how to enforce.

@anth-volk anth-volk requested a review from vahid-ahmadi May 5, 2026 18:47
@anth-volk anth-volk marked this pull request as ready for review May 5, 2026 22:07
@anth-volk anth-volk merged commit eb0ed9c into main May 5, 2026
4 checks passed
@anth-volk anth-volk deleted the add-bundle-release-manifest branch May 5, 2026 22:08

Development

Successfully merging this pull request may close these issues.

Add bundle-compatible UK data release manifest metadata

1 participant