Land WP1–WP5: pipeline cleanup, ontology discipline, three-remote architecture by mzargham · Pull Request #10 · DynamicalSystemsGroup/ADCS-lifecycle-demo

mzargham · 2026-05-29T03:43:55Z

Summary

Merges the full roadmap (51 commits across five work packages plus CI/notebook fixes) from staging to main. The demo moves from "mock-up that narrates a three-remote story" to "running code that exercises four services with a fully audited multi-ASoT provenance graph."

WP1: Pipeline cleanup + rerun surfacing (closes #3). PipelineState dataclass + per-stage typed records; query_named_graph helper; ExecutionMetadata.executor_uri/location_uri consolidation; validation→verification rename with back-compat alias; interrogate.rerun Typer CLI + Stage 6.5 banner extension; Typer migration of pipeline.runner (interrogate.{explain,reproduce} turned out to be library-only modules with no CLI to migrate); openCAESAR prose cleanup.
WP2: Ontology discipline (closes #2). ROBOT/ELK promoted to canonical make ontology (fail-fast on missing Java); .github/workflows/ontology.yml with setup-java@v4 + cached robot.jar; pytest live / network markers; triple-count budget gate; openCAESAR code/data cleanup + rtm.ttl regen.
WP3: Docker image as tracked evidence (closes #4). rtm:DockerImage class + property set; hash_docker_image(); DockerCompute._emit_image_node(); prov:wasDerivedFrom on Docker evidence; evidence_by_image SPARQL helper; DockerEvidenceShape closure rule.
WP4: Three-remote architecture made real (16 commits). Preflight probes on storage + compute backends (fail-fast on unreachable remotes); rtm:gitRef + rtm:flexoRecord cross-linking image to git + Flexo; rtm:DockerContainer as a first-class materialization entity (image vs container vs host distinction); organizational auspices via prov:Organization + rtm:operatedBy; EARL-wrapped automated outcomes (rtm:ClosureRuleAssertion, rtm:DigestMatchAssertion); compute.reproduce Typer CLI (rebuild image at recorded git ref + digest-compare); TxnLogBackend (CouchDB) as a fourth service with its own URI + auspices; TransactionLogger with secret-redaction allowlist; six typed trust queries in traceability/queries.py; optional FLEXO_BRANCH_PREFIX for multi-run isolation; tools/start-services.sh + docker-compose.yml; new ARCHITECTURE.md.
WP5: Storyboard integration + end-of-roadmap alignment. Audit module surfaces Docker image provenance in the report; notebook Acts 9–10 expanded with multi-ASoT framing + numerical-evidence end-to-end provenance render; RECONCILIATION.md mapping every slide claim to its code receipt.
Post-WP CI + notebook fixes. Reproducible build_time (SOURCE_DATE_EPOCH + git ct); actions/checkout with fetch-depth: 0; diff-gate scoped to rtm.ttl (manifest legitimately diverges by build path); CLI --help substring tests hardened against terminal-width wrapping; WP* mentions stripped from notebook prose.

Closed by this PR

#2 Promote ROBOT/ELK validation to default ontology build path
#3 Surface "which pipeline stages must re-run" from SHACL closure + hash-mismatch results
#4 Track Docker images as first-class evidence content

Remain open (intentional)

#5 Add top-level adcs Typer aggregator CLI (deferred from WP1): waits until WP4 entry points stabilize; tracked in WP1 §10
#6 Rename ValidateShapes step IRI to VerifyShapes: discipline follow-up requiring a Flexo + audit migration; tracked in WP1 §10
#7 Track WP4 local-build-only Docker decision; propose Planetary Utilities (PU) host a registry as future enabler: companion future-work issue on PU Docker registry coordination
#8 Future work: exercise externally-hosted RIME services (cross-link with rime-backend-demo): companion future-work issue; @rororowyourboat tagged in body
#9 Future work: reusable oracles hosted in Starforge (registry + on-demand invocation for model-checking): synthesizing future-work issue

New surfaces a reviewer can read cold

ARCHITECTURE.md — three-remote + fourth-service picture, URI scheme, EARL outcomes, six trust queries, preflight gate, reproducibility loop
RECONCILIATION.md — slide claims ↔ code receipts for the May-22 OpenMBEE deck
.env.example — every FLEXO/ADCS_TXNLOG/ADCS_*_ORG env var with documented defaults
tools/start-services.sh — one-liner CouchDB txnlog store provisioning
Notebook (renders to Pages on merge) — "Many Authoritative Sources of Truth" cell + "Numerical evidence — full provenance end-to-end" cell

Test plan

uv run pytest — 269 passed, 5 skipped, 3 deselected (on staging tip)
CI green on the latest staging push: run 26616426815
make ontology-python produces byte-identical rtm.ttl on rebuild (reproducibility verified locally)
uv run python -m pipeline.runner --auto end-to-end smoke: 1084 union triples in output/rtm.ttl
Reviewer: read ARCHITECTURE.md cold; does the three-remote + fourth-service picture stand on its own?
Reviewer: open the locally re-exported output/notebook.html (or wait for Pages); does the multi-ASoT narrative + numerical-evidence provenance render land?
Reviewer: skim the status comments on closed issues #2/#3/#4; do the acceptance-criteria checklists reconcile?
Reviewer: optional canonical multi-remote smoke — tools/start-services.sh && export ADCS_TXNLOG_ENABLED=1 && export FLEXO_TOKEN=... && uv run python -m pipeline.runner --auto --backend=flexo --compute=docker

🤖 Generated with Claude Code

…yped result records Split run_pipeline's 285-line body into per-stage free functions threaded by a PipelineState object. Each stage returns a typed result record (StructuralResult, SymbolicResult, NumericalResult, EvidenceBindingResult, AttestationStageResult, ClosureRuleResult, AuditStageResult, ReportStageResult) assigned to the matching state field. Downstream stages read prior results via state.<prior>.<field> instead of free locals. The activity_to_stage table maps p-plan step IRI fragments (STEP_NAMES) to pipeline stage numbers. Kept in sync with traceability.plan_execution; covered by a new unit test. Vestigial `stage = LifecycleStage.X` assignments dropped (the variable was set but never read). LifecycleStage and check_gate are preserved for external callers and tests. CLI surface unchanged. Tests: 23 pipeline + named_graphs passing, 83 attestation + audit + traceability + shape_suite + compute + backends passing. Part of WP1 (roadmap §4.1). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
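The state-threading pattern in this commit could look roughly like the sketch below. It is a minimal illustration, not the repository's code: the record fields, the two stage functions, and their return values are assumptions chosen only to show how a stage reads a prior result via `state.<prior>.<field>`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StructuralResult:
    # Hypothetical fields; the real record may carry different data.
    conforms: bool
    triple_count: int


@dataclass(frozen=True)
class SymbolicResult:
    # Hypothetical fields, for illustration only.
    proofs_checked: int
    all_passed: bool


@dataclass
class PipelineState:
    """Threaded through every stage; each stage result lands in its own field."""
    auto: bool = False
    structural: Optional[StructuralResult] = None
    symbolic: Optional[SymbolicResult] = None


def run_structural_stage(state: PipelineState) -> StructuralResult:
    # Placeholder body: a real stage would inspect the RTM graph.
    return StructuralResult(conforms=True, triple_count=156)


def run_symbolic_stage(state: PipelineState) -> SymbolicResult:
    # Downstream stages read prior results from the state, not from free locals.
    if state.structural is None or not state.structural.conforms:
        raise RuntimeError("structural stage must pass before symbolic analysis")
    return SymbolicResult(proofs_checked=4, all_passed=True)


def run_pipeline(state: PipelineState) -> PipelineState:
    state.structural = run_structural_stage(state)
    state.symbolic = run_symbolic_stage(state)
    return state
```

Calling `run_pipeline(PipelineState(auto=True))` returns the same state object with both result fields populated, which is what lets later stages and tests inspect any earlier stage without re-running it.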

…coped queries `pipeline.dataset.query_named_graph(ds, layer, sparql, **bindings)` scopes a SPARQL query to one named graph. Use it when the query is intentionally layer-specific; keep using `ds.query(...)` when the query is meant to walk the union view (via Dataset(default_union=True)). The existing queries in traceability/audit.py, traceability/queries.py, and traceability/attestation.py are intentionally union-scoped. Added a section banner in audit.py documenting this and extended the query_to_dicts docstring in queries.py with the convention so future contributors don't reach for graph_for() when they really want the union. Two new tests cover the helper: layer-scoped count is a strict subset of the union count, and unknown layers raise KeyError. The helper is added as a primitive for WP3/WP4 (Docker image + Flexo remote queries that legitimately want one-layer scope) without forcing any current call site to migrate. Part of WP1 (roadmap §4.2). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…methods

The executor and location URI shapes used to live as inline string construction in evidence.binding._bind_execution_metadata; promote them to methods on ExecutionMetadata so WP3 (rtm:DockerImage as evidence) and WP4 (three-remote architecture) can reuse the same shapes without copy-paste.

IRI shapes preserved byte-for-byte:
executor_uri() -> urn:adcs:executor:<container_id|hostname|unknown> (colons in suffix replaced with dashes)
location_uri() -> urn:adcs:location:<location_kind>:<hostname|unknown>

evidence.binding._bind_execution_metadata now consumes the methods directly. A new TestExecutionMetadataURIs class covers prefer-container-id, fall-back-to-hostname, unknown sentinel, colon replacement, and the location shape.

Part of WP1 (roadmap §4.3).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
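The two IRI shapes listed in this commit can be sketched as follows. The dataclass field names and defaults are assumptions inferred from the shapes in the message; only the resulting URN strings follow the commit text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExecutionMetadata:
    # Field names are assumed from the IRI shapes, not copied from the repo.
    location_kind: str = "local"
    hostname: Optional[str] = None
    container_id: Optional[str] = None

    def executor_uri(self) -> str:
        # Prefer the container id, fall back to the hostname, then the sentinel.
        suffix = self.container_id or self.hostname or "unknown"
        # Colons in the suffix are replaced with dashes.
        return "urn:adcs:executor:" + suffix.replace(":", "-")

    def location_uri(self) -> str:
        return f"urn:adcs:location:{self.location_kind}:{self.hostname or 'unknown'}"
```

For example, a container id of `sha256:abc` would yield `urn:adcs:executor:sha256-abc`, and metadata with neither container id nor hostname falls back to `urn:adcs:executor:unknown`.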

Strict semantic split:
verification = automated, fully-specified check (SHACL conformance, ROBOT/ELK, hash matching, completeness).
validation = human judgement (attestation, adequacy, sufficiency).

Module + test moves (git mv preserves history):
traceability/validation.py -> traceability/verification.py
tests/test_robot_validation.py -> tests/test_robot_verification.py

Symbol renames inside the renamed module:
validate() -> verify()
validate_shacl() -> verify_shacl()
validate_reverification() -> verify_reverification()
ValidationReport -> VerificationReport

Symbol renames in traceability/rtm.py:
validate_structural_completeness -> verify_structural_completeness
validate_evidence_completeness -> verify_evidence_completeness

Back-compat aliases are retained inside the renamed module, to be removed in a follow-up PR after WP3 lands.

Runner / banner string updates: "Validating closure-rule suite..." -> "Verifying...", "Structural validation: PASS" -> "Structural verification: PASS", Stage 0 banner "Validation: ..." -> "Verification: ...".

Plan.ttl rdfs:label updates: "Stage 6.5 — Validate Closure-Rule Suite" -> "Verify..."; "Validation Report" -> "Verification Report". The step IRI fragment <plan/step/ValidateShapes> is PRESERVED to keep already-persisted <adcs:plan-execution> + <adcs:audit> graphs valid; the IRI rename is tracked separately for a future Flexo migration (WP1 §10 Known follow-ups).

Notebook function-call references (Acts 4 + Stage 6.5 narration) updated to new symbols; narrative prose unchanged (WP5 owns that).

scripts/build_ontology.py is INTENTIONALLY UNTOUCHED: WP2 renames _validate_sysml_axioms there in the same commit that lands the openCAESAR cleanup, to avoid a merge conflict on that file.

Where validation legitimately stays: traceability/attestation.py, request_attestation(), upstream pyshacl.validate, OSLC oslc_qm: IRI fragments, vendored ontology imports.

Test counts: 171 passed, 4 skipped (baseline 162 + 9 new from prior WP1 commits). Live-Flexo failures predate WP1 and are out of scope.

Part of WP1 (roadmap §4.4).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Typer is the demo's CLI framework convention (per WP1 §4.6). The next commit migrates pipeline.runner + interrogate.{explain,reproduce} to Typer apps; the rerun.py CLI added later in this PR is Typer-based from the start. Pinned to >=0.12,<1.0 (current resolved: 0.26.2). Brings in Click + Rich + markdown-it-py + shellingham as transitive deps; all stable and well-established. Part of WP1 (roadmap §4.6). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Replaces argparse.ArgumentParser with a Typer app. Every flag name is preserved (--auto, --no-attest, --engineer, --rebuild, --backend, --compute) so existing invocations work unchanged. Choice-validated options (--backend, --compute) use Enum subclasses so Typer matches the prior argparse `choices=` semantics. The `main()` callable is retained as a thin wrapper around `app()` so the `[project.scripts] adcs-pipeline = "pipeline.runner:main"` entry point keeps resolving. interrogate/explain.py, interrogate/reproduce.py, interrogate/visualize.py are library-only modules with no CLI entry points; nothing to migrate there. WP1 §4.6 specified them speculatively; the actual scope is just pipeline.runner. The deferred top-level `adcs` aggregator (issue #5) can revisit when WP4 adds Flexo materialization commands. New tests/test_cli.py uses typer.testing.CliRunner for smoke tests: - pipeline.runner --help renders + lists every flag - --backend / --compute reject values outside the enum - main symbol stays importable for the console script Part of WP1 (roadmap §4.6). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…closes #3)

`interrogate.rerun` walks prov:wasGeneratedBy -> p-plan:correspondsToStep to translate a VerificationReport into the dedup'd ordered set of pipeline stages that must re-run to restore RTM closure. SHACL violations on structural / human-judgement nodes (attestations, etc.) that have no producing activity are reported separately, because no stage rerun can fix them.

Schema enrichment (evidence/binding.py): every per-evidence SymbolicAnalysis / NumericalSimulation activity now carries p-plan:correspondsToStep linking it to the SymbolicAnalysis / NumericalSimulation step in plan.ttl. This makes the evidence -> step traversal self-describing rather than relying on activity-IRI naming conventions, and aligns the per-evidence activities with the existing stage-level activities emitted by emit_stage_activity.

Stage 6.5 banner extension (pipeline/runner.py): when the verification report does not conform, render the rerun plan inline so the engineer sees which stages must re-run without having to invoke the CLI separately.

CLI (Typer-based, WP1 §4.6 discipline):
  uv run python -m interrogate.rerun                          # default md output
  uv run python -m interrogate.rerun --requirement REQ-003
  uv run python -m interrogate.rerun --format json

Exit codes: 0 = clean, 1 = stages or structural violations present, 2 = input file not found.

Tests cover all 7 of issue #3's acceptance criteria:
  AC1: closed RTM -> empty stage set
  AC2: proof hash mismatch -> Stage 2
  AC3: simulation violation -> Stage 3
  AC4: multiple invalidations -> ordered union [2, 3]
  AC5: attestation-level violation -> structural_violations, no stages
  AC6: CLI smoke tests in tests/test_cli.py
  AC7: Stage 6.5 banner extension verified by integration

Test counts: 187 passed, 4 skipped (was 171 after commit 4; +16 across WP1 commits 5-7 covering Typer + rerun).

Part of WP1 (roadmap §4.5).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
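The violation-to-stage translation described in this commit can be sketched without an RDF library by standing in plain dictionaries for the two graph edges. Everything here is hypothetical scaffolding: `plan_rerun`, `RerunPlan`, and the dictionary inputs are illustrative names, and the real implementation walks the provenance graph rather than lookup tables. Exit code 2 (input file not found) belongs to the CLI layer and is not modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RerunPlan:
    stages: List[int] = field(default_factory=list)
    structural_violations: List[str] = field(default_factory=list)

    @property
    def exit_code(self) -> int:
        # 0 = clean, 1 = stages or structural violations present.
        return 1 if (self.stages or self.structural_violations) else 0


def plan_rerun(
    violating_nodes: List[str],
    generated_by: Dict[str, str],    # node -> activity (stands in for prov:wasGeneratedBy)
    corresponds_to: Dict[str, str],  # activity -> step (stands in for p-plan:correspondsToStep)
    step_to_stage: Dict[str, int],   # step fragment -> pipeline stage number
) -> RerunPlan:
    stages = set()
    structural: List[str] = []
    for node in violating_nodes:
        activity = generated_by.get(node)
        step = corresponds_to.get(activity) if activity is not None else None
        stage = step_to_stage.get(step) if step is not None else None
        if stage is None:
            # No producing activity: no stage rerun can fix this violation.
            if node not in structural:
                structural.append(node)
        else:
            stages.add(stage)
    # The set removes duplicates; sorting gives the ordered union.
    return RerunPlan(stages=sorted(stages), structural_violations=structural)
```

With a proof violation mapped to Stage 2, a simulation violation mapped to Stage 3, and an attestation node with no producing activity, the sketch returns stages `[2, 3]` plus one structural violation, mirroring AC4 and AC5.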

Per roadmap cross-cutting section "drop explicit openCAESAR references", remove prose mentions from the WP1-owned files: README.md — Architecture blurb + namespace table row CLAUDE.md — namespace table row notebook.py — 3 narration cells (Act 1 namespace table, epilogue prologue summary, Act 11 stack table) ontology/rtm-edit.ttl — header comment + ontology description + SysMLv2 binding section comment ontology/prefixes.py — module docstring + SysMLv2 section comment + OMG_SYSML inline comment scripts/fetch_imports.py — module docstring The OMG IRI itself (http://www.omg.org/spec/SysML/20240501/) stays — it is the OMG official SysMLv2 OWL rendering, correct on its own terms. The `omg-sysml:` prefix and the OMG_SYSML constant keep their names and values. Only the attribution text changes. Built ontology regenerated (`uv run python -m scripts.build_ontology`) because rtm-edit.ttl comments changed: ontology/rtm.ttl + manifest get fresh edit_source_hash. Triple count unchanged (156 in / 156 out). WP2 owns the code-side cleanup (CSV column `opencaesar_iri` -> `omg_iri`, constant `SYSML_OPENCAESAR_NS` -> `SYSML_OMG_NS`, lookup updates in scripts/build_ontology.py + tests/test_ontology_build.py) and will regenerate rtm.ttl again as part of its commit; that regeneration will produce identical content because the renames don't alter the equivalence-axiom IRIs the script emits. Verification: full-repo grep limited to WP1 prose set returns zero; remaining hits (build_ontology.py constant + lookups, CSV header, rtm.ttl built artifact) are explicitly WP2 territory. Tests: 187 passed, 4 skipped. Part of WP1 (roadmap §4.7). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…scipline README.md: - New "Pipeline architecture" subsection introducing PipelineState + per-stage typed result records + Typer CLI convention. - New "Rerun plan from a verification report" subsection under Interrogation showing the interrogate.rerun CLI (issue #3) with --requirement and --format examples + exit-code contract. - Stage banner: "Validate Closure-Rule Suite" -> "Verify Closure-Rule Suite"; Stage 0 banner sample "Validation:" -> "Verification:" matches what runner now prints. - Top-line paragraph: "validated by a SHACL closure-rule suite" -> "verified by a SHACL closure-rule suite". - Key Directories: traceability/ updated; pipeline/ mentions PipelineState + query_named_graph; interrogate/ adds rerun. - Ontology Authoring section + Toolchain table are NOT touched here — WP2 owns those (ROBOT-as-default rewrite). Single coordination point per the cross-WP plan. CLAUDE.md: - New "Pipeline state + structured stage results" subsection (canonical description of the PipelineState pattern). - New "CLI surface" section: every CLI is Typer; flag names preserved; CliRunner-based tests; deferred top-level `adcs` aggregator linked to issue #5. - New "Verification vs validation (term discipline)" section: defines the split, names pyshacl as the explicit upstream-API exception, notes the preserved ValidateShapes IRI fragment. - Toolchain: pyshacl rephrased to mention the verify wrapper; typer added as a runtime dep. - Key directories: traceability/ + pipeline/ + interrogate/ updated. Tests: 187 passed, 4 skipped. Part of WP1 (roadmap §5). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

WP1 of the roadmap at /Users/z/.claude/plans/i-want-to-continue-atomic-lobster.md. Internal cleanups (PipelineState refactor, query_named_graph helper, ExecutionMetadata URI methods), validation -> verification rename discipline, Typer migration of pipeline.runner, new interrogate.rerun CLI mapping closure violations to pipeline stages (closes #3), and the WP1 share of the openCAESAR prose cleanup. 9 commits, 30 files, +1437 / -266 lines. Test suite 162 -> 187 passing (no new failures). Output triple count 948 -> 955 (+7 new p-plan:correspondsToStep schema enrichment on per-evidence activities). Staged for integration with WP2 (ROBOT default + pytest markers + triple budget + openCAESAR code/data cleanup) before promotion to main. Follow-up issues filed: - #5 deferred top-level `adcs` Typer aggregator (WP4-dependent) - #6 ValidateShapes step IRI fragment rename (WP4-dependent) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Add `[tool.pytest.ini_options]` markers `live` and `network` plus default `addopts = "-m 'not live and not network'"` so the canonical `uv run pytest` invocation filters infrastructure-dependent tests without requiring per-test env-var introspection. CI opts in explicitly with `-m live` (or `-m network` once any are written). tests/test_flexo_live.py rewritten: - pytestmark switches from env-var-driven `skipif` to `@pytest.mark.live`. - `_flexo_reachable()` removed — connectivity probing belongs in fixtures, not at module import. When `-m live` is requested but credentials are missing, the `token` fixture now fails LOUDLY (pytest.fail) instead of skipping. Skip-on-opt-in would hide infra breakage; the marker is the opt-in signal. - Docstring updated to show the new invocation pattern. Tests: 187 passed, 4 skipped, 3 deselected (live tests filtered out by default). Previously: 162 passed, 2 failed, 1 errored on live — those failures were infrastructure noise predating WP1, now correctly gated behind the marker. Part of WP2 (subplan §4.B). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

rtm: is an integration ontology: it should contribute only convenience handles, hashing properties, and SHACL targets, never new epistemic vocabulary. The gate keeps that promise honest: `scripts/build_ontology.py` now fails the build if the assembled artifact exceeds TRIPLE_BUDGET. The budget is the current size (156 triples) plus 200 headroom for WP3's rtm:DockerImage + property set and other small adds.

Budget bump is a deliberate, single-place act: edit TRIPLE_BUDGET in scripts/build_ontology.py with an updated rationale comment. WP3 will bump it when rtm:DockerImage lands; no silent drift.

Build banner now prints `Parsimony: <actual>/<budget> triples (<headroom> headroom)` alongside the existing artifact summary. The manifest gains a `triple_budget` block (`value`, `rationale`, `headroom`) so consumers reading the manifest see the gate without sources.

New `tests/test_ontology_size.py` (3 tests) imports TRIPLE_BUDGET as the single source of truth and verifies:
- the committed `rtm.ttl` parses under budget
- the manifest pins the budget + rationale
- the manifest's recorded triple count matches the parsed artifact (catches a stale manifest committed without re-running the build)

Tests: 190 passed (was 187), 4 skipped, 3 deselected.

Part of WP2 (subplan §4.C).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
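A minimal sketch of the parsimony gate, assuming the budget of 356 that follows from the figures in this commit (156 current triples + 200 headroom). The function and exception names are hypothetical; only the banner format is taken from the message.

```python
TRIPLE_BUDGET = 356  # 156 current triples + 200 headroom, per the commit message


class TripleBudgetExceeded(RuntimeError):
    """Raised when the assembled ontology is larger than the budget allows."""


def check_triple_budget(actual: int, budget: int = TRIPLE_BUDGET) -> str:
    """Fail the build when over budget; otherwise return the banner line."""
    if actual > budget:
        raise TripleBudgetExceeded(
            f"ontology has {actual} triples, budget is {budget}; "
            "bump TRIPLE_BUDGET deliberately, with an updated rationale"
        )
    headroom = budget - actual
    return f"Parsimony: {actual}/{budget} triples ({headroom} headroom)"
```

Keeping the budget as one module-level constant is what lets a test import it as the single source of truth instead of repeating the number.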

… + regen)

Code/data half of the cross-cutting openCAESAR drop. The WP1 share already handled the prose; this commit takes the remaining identifiers + data + the regenerated artifact.

Renames:
ontology/sysml_term_map.csv: column header `opencaesar_iri` -> `omg_iri`
scripts/build_ontology.py: constant `SYSML_OPENCAESAR_NS` -> `SYSML_OMG_NS`; function `_validate_sysml_axioms` -> `_verify_sysml_axioms`; row lookups `row['opencaesar_iri']` -> `row['omg_iri']`
tests/test_ontology_build.py: matching `row['opencaesar_iri']` -> `row['omg_iri']`

The function rename is the WP1 verification discipline applied to a file WP1 explicitly scope-excluded so WP2 could own it in the same commit as the openCAESAR cleanup (avoiding a merge conflict on build_ontology.py).

The IRIs the script emits are unchanged: the OMG namespace value `http://www.omg.org/spec/SysML/20240501/` stays because it's the official OMG SysMLv2 OWL rendering, correct on its own terms. Only the attribution text and the local label change.

Regenerated `ontology/rtm.ttl` + `ontology/assembly_manifest.json` (`uv run python -m scripts.build_ontology`). Triple count 156/356; edit-source hash refreshed; CSV row count unchanged at 9.

Repo-wide grep gate: grep -rni "caesar|opencaesar|open-caesar" --include={py,md,ttl,json,csv,toml,yaml,yml,sh} returns zero hits across the whole repo (WP1 prose + WP2 code/data both clean).

Tests: 190 passed, 4 skipped, 3 deselected (was 190 after commit 2; no test count change here, the same tests, all green).

Part of WP2 (subplan §4.D); cross-coordinates with WP1 §4.4 (roadmap "Drop explicit openCAESAR references" section).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

`make ontology` now requires Java + obo-robot on PATH and runs the full chain: preflight -> ontology-robot (merge + ELK reason + OBO report) -> ontology-python with ADCS_ROBOT_VERIFIED=1. Fails fast with a helpful error pointing at the no-Java alternative (`make ontology-python`) when the toolchain is missing. No `ROBOT_OPTIONAL` escape hatch: the no-Java path is the explicit `ontology-python` Makefile target — invoking it is an intentional opt-out, not a flag on the default. Honours the roadmap's "stop being a mock-up; the integration story should not silently degrade" rule. scripts/build_ontology.py reads ADCS_ROBOT_VERIFIED from the env to decide what to write into the manifest's `robot_used` + `notes` fields. Stage 0 banner branches on `robot_used` to print either "ROBOT merge + ELK reasoning + OBO report PASS" or "Python assembly only (no-Java path; run `make ontology` for ROBOT/ELK verification)". New `.github/workflows/ontology.yml`: - actions/checkout@v4 + actions/setup-java@v4 (Temurin 17) - Cached ROBOT jar (v1.9.5) downloaded once per cache key - 3-line bash wrapper installs as `obo-robot` on PATH - astral-sh/setup-uv@v6 + `uv sync` - `make ontology` runs the canonical chain - Confirms `rtm.ttl` + `assembly_manifest.json` are committed in-sync with the rebuild (catches forgotten regen commits) - `uv run pytest -v` (live + network markers skip by default) Triggers on push to main + staging and on PRs targeting either. Smoke-tested locally: - `make ontology-python` writes `robot_used: false` + correct notes - `make ontology` on a no-Java machine fails fast with the documented error message Tests: 190 passed, 4 skipped, 3 deselected (unchanged from §4.C — the rename + Makefile changes don't alter test behaviour). Closes issue #2. Part of WP2 (subplan §4.A). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
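The env-var handoff between the Makefile and the build script could be sketched like this. The function names and the `notes` wording are hypothetical, and treating only the exact value "1" as verified is an assumption; the banner strings follow the commit message.

```python
import os
from typing import Dict, Mapping, Optional, Union


def robot_manifest_fields(
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, Union[bool, str]]:
    """Derive the manifest's robot_used + notes fields from the environment."""
    env = os.environ if env is None else env
    # Assumption: only the exact value "1" counts as ROBOT-verified.
    robot_used = env.get("ADCS_ROBOT_VERIFIED") == "1"
    notes = (
        "ROBOT merge + ELK reasoning + OBO report ran before assembly"
        if robot_used
        else "Python assembly only; ROBOT/ELK verification was skipped"
    )
    return {"robot_used": robot_used, "notes": notes}


def stage0_banner(manifest: Mapping[str, object]) -> str:
    """Branch the Stage 0 banner on the manifest's robot_used field."""
    if manifest.get("robot_used"):
        return "Verification: ROBOT merge + ELK reasoning + OBO report PASS"
    return (
        "Verification: Python assembly only "
        "(no-Java path; run `make ontology` for ROBOT/ELK verification)"
    )
```

Recording the flag in the manifest, rather than re-probing for Java at run time, means the runner can report how the ontology was built without needing the Java toolchain itself.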

…riple budget README.md: - Toolchain table: OBO ROBOT row promoted from `optional` to `required (default)` with the no-Java alternative spelled out. - New "Tests" subsection under Quick Start documenting the marker convention and the default-skip rule (`addopts` in pyproject.toml). - Ontology Authoring rewritten: `make ontology` as canonical with the fail-fast preflight; `make ontology-python` as the explicit no-Java target; `make ontology-robot` as just the ROBOT step. - Triple-count budget mention added so contributors know about the parsimony gate before they discover it via a failing build. - Stage 0 banner sample updated: rendered example now shows the ROBOT-default "Verification: ROBOT merge + ELK reasoning + OBO report PASS" line. - "uv run pytest -v" comment in Quick Start: "166 tests" -> "default: skips live + network markers" (count fluctuates per WP). CLAUDE.md: - Toolchain section: OBO ROBOT row promoted to required-for-default; CI Java + cached robot.jar called out. - New paragraph on the runner: it does NOT need Java/obo-robot; only rebuilding the ontology does. - Tests section: documents the marker convention + the fail-loud behaviour of test_flexo_live.py under `-m live` (no silent skips). - Ontology rebuild section: three-target chain with the no-Java escape, fail-fast preflight, robot_used manifest field, and the TRIPLE_BUDGET parsimony gate. Review gates passed: - `grep -rni "caesar|opencaesar|open-caesar"` — zero hits - `grep -rn "ROBOT_OPTIONAL"` — zero hits (escape hatch dropped) - `validate_sysml_axioms` hit only in a rename-rationale docstring - 190 passed, 4 skipped, 3 deselected Part of WP2 (subplan §5). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…enCAESAR cleanup) into staging WP2 of the roadmap at /Users/z/.claude/plans/i-want-to-continue-atomic-lobster.md. - §4.A ROBOT/ELK promoted to default `make ontology` with fail-fast preflight (no ROBOT_OPTIONAL escape hatch). `.github/workflows/ontology.yml` installs Java 17 + cached robot.jar and runs `make ontology` + tests on every push to main/staging and on PRs. Closes #2. - §4.B Pytest `live` + `network` markers registered in pyproject.toml, default `addopts = "-m 'not live and not network'"`. test_flexo_live.py rewritten to fail-loudly under `-m live` when credentials are missing (no silent skips on opt-in). - §4.C Triple-count budget (TRIPLE_BUDGET=356) gate in scripts/build_ontology.py + new tests/test_ontology_size.py. Manifest records `triple_budget` block with rationale. - §4.D openCAESAR code/data cleanup (WP2 share): CSV column `opencaesar_iri` -> `omg_iri`, constant `SYSML_OPENCAESAR_NS` -> `SYSML_OMG_NS`, function `_validate_sysml_axioms` -> `_verify_sysml_axioms` (WP1 verification discipline applied to a file WP1 scope-excluded for coordination). rtm.ttl + manifest regenerated. - §5 README + CLAUDE.md sweep aligning Toolchain, Ontology Authoring, Tests, and Stage 0 banner sample with the new defaults. 5 commits, +257 / -52 lines. Test counts: 190 passed, 4 skipped, 3 deselected. Repo-wide grep gates clean (zero openCAESAR, zero ROBOT_OPTIONAL). Staged for integration with WP3+ before promotion to main. JAXA workshop window (2026-06-12) is comfortably met. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

WP3 §4.1 + §4.8. Promotes the Docker image from an inline label on a prov:SoftwareAgent (where it lives today via _bind_execution_metadata) to a tracked entity that downstream evidence can derive from. New class in `ontology/rtm-edit.ttl`: rtm:DockerImage rdfs:subClassOf prov:Entity New datatype properties (domain rtm:DockerImage, range xsd:string): rtm:imageLabel — repo/tag rtm:baseImageDigest — FROM-image digest resolved at build rtm:dockerfileHash — SHA-256 of Dockerfile bytes rtm:buildContextHash — SHA-256 over build-context file manifest rtm:contentHash already exists for rtm:Evidence — the image's own content hash (runtime digest) reuses it without redeclaration. prov:wasDerivedFrom reuses PROV — no new property. TRIPLE_BUDGET bumped 356 -> 380 with a rationale-comment update documenting the WP2 (356) and WP3 (380) values and the cause of the bump. Actual current count: 176/380 (204 headroom). ontology/rtm.ttl + ontology/assembly_manifest.json regenerated via `uv run python -m scripts.build_ontology`. The class is now declared but not yet referenced anywhere — that arrives in commit 3 (DockerCompute.emit_image_node) + commit 4 (prov:wasDerivedFrom on evidence) + commit 6 (SHACL shape). Tests: 9 ontology + 9 ontology-build tests pass (190 total once full suite runs). Part of WP3 (subplan §4.1 + §4.8); first of 7 commits closing issue #4 AC1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…dressing (issue #4 AC2) WP3 §4.2. Pins Docker build inputs with two SHA-256 hashes: dockerfile_hash — SHA-256 of the Dockerfile bytes build_context_hash — SHA-256 of a sorted POSIX-normalized manifest of <relative-path>\t<file-sha256> lines These pin what `docker build` sees on disk. They are independent of the runtime image digest the daemon assigns AFTER build — that's captured separately as the image's rtm:contentHash. The pair plus the resolved base-image digest is what makes a Docker image reproducibly identifiable. DOCKER_BUILD_CONTEXT_DEFAULT_IGNORES excludes the obvious junk (.git, __pycache__, *.pyc, .venv, node_modules, .docker-ipc, output, .DS_Store, .pytest_cache, .ruff_cache) so the hash doesn't churn on local dev artifacts. The internal _ignored() helper matches each glob against the leaf name, every intermediate path component, AND the full relative path so single-component patterns like `.git` exclude entire subtrees correctly. Manifest separator is normalized to '/' so the same context hashes identically on macOS / Linux / WSL. os.walk's dirnames mutation prunes ignored subtrees so we don't recurse uselessly. The manifest format is intentionally simple. If the demo adopts SLSA / in-toto envelopes later, that becomes the canonical envelope and this hash stays as a fast self-check. tests/test_docker_image_evidence.py (new file): 8 tests covering determinism, Dockerfile-change detection, context-change detection, new-file detection, default ignore patterns (.git + __pycache__ + *.pyc + .venv + node_modules + output + .DS_Store), custom ignore patterns, missing-Dockerfile FileNotFoundError, and a smoke test against the repo's actual compute/Dockerfile + project root. Tests: 8/8 new pass; previous 190 unchanged. Part of WP3 (subplan §4.2); second of 7 commits, closes issue #4 AC2. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
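The build-input hashing described in this commit can be approximated with the standard library alone. This is a sketch under stated assumptions, not the repository's `hash_docker_image`: the signature, the return type, and the helper names are guesses, and the ignore list is copied from the message.

```python
import fnmatch
import hashlib
import os
from pathlib import Path, PurePosixPath
from typing import Sequence, Tuple

DEFAULT_IGNORES = (
    ".git", "__pycache__", "*.pyc", ".venv", "node_modules",
    ".docker-ipc", "output", ".DS_Store", ".pytest_cache", ".ruff_cache",
)


def _sha256_file(path: Path) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def _ignored(rel: PurePosixPath, patterns: Sequence[str]) -> bool:
    # Match each glob against every path component (leaf included)
    # and against the full relative path.
    candidates = list(rel.parts) + [rel.as_posix()]
    return any(fnmatch.fnmatch(c, p) for p in patterns for c in candidates)


def hash_docker_image(
    dockerfile: Path, context: Path, ignores: Sequence[str] = DEFAULT_IGNORES
) -> Tuple[str, str]:
    """Return (dockerfile_hash, build_context_hash) as hex SHA-256 strings."""
    if not dockerfile.is_file():
        raise FileNotFoundError(dockerfile)
    dockerfile_hash = _sha256_file(dockerfile)

    lines = []
    for root, dirnames, filenames in os.walk(context):
        rel_root = Path(root).relative_to(context)
        # Prune ignored subtrees in place so os.walk never descends into them.
        dirnames[:] = [
            d for d in dirnames
            if not _ignored(PurePosixPath((rel_root / d).as_posix()), ignores)
        ]
        for name in filenames:
            rel = PurePosixPath((rel_root / name).as_posix())
            if not _ignored(rel, ignores):
                lines.append(f"{rel.as_posix()}\t{_sha256_file(Path(root) / name)}")

    # Sorted, '/'-separated manifest lines hash identically across platforms.
    manifest = "\n".join(sorted(lines)).encode("utf-8")
    return dockerfile_hash, hashlib.sha256(manifest).hexdigest()
```

Because the manifest lists `<relative-path>\t<file-sha256>` pairs in sorted order, renaming, adding, or editing any non-ignored file changes the context hash, while edits under ignored directories such as `.git` leave it untouched.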

…per run (issue #4 AC3) WP3 §4.3. Promotes the Docker image from an inline label on the prov:SoftwareAgent (where ExecutionMetadata wrote it) to a first- class rtm:DockerImage entity in the evidence graph. DockerCompute new methods: _parse_from_image() — pull the first FROM line from compute/Dockerfile via regex. Returns "" on parse failure. _resolve_base_image_digest() — `docker image inspect <FROM-tag>` with graceful empty-string fallback when the base isn't pulled locally. Cached per instance. emit_image_node(graph) — idempotent per WP3 run; on first call computes hashes via hash_docker_image, resolves the base digest, writes 8 triples (rdf:type DockerImage + Entity, contentHash, imageLabel, baseImageDigest, dockerfileHash, buildContextHash, prov:generatedAtTime) and caches the IRI. State added to __init__: _image_node_iri, _image_built_at, _base_image_digest. _image_built_at is captured at the end of _build_image() so the prov:generatedAtTime stamp reflects the actual build time, not the emission time. IRI shape: urn:adcs:docker-image:<digest-with-colons-replaced-by-dashes>. Mirrors ExecutionMetadata.executor_uri() (WP1 §4.3) so IRI shapes across the demo's URN space stay coherent. Resolution decisions baked in (WP3 subplan §9 open questions): Q1 baseImageDigest: try to resolve via `docker image inspect`, graceful empty-string fallback if the base isn't pulled (chosen: pipeline does NOT fail on missing base). Q3 Image IRI source: content-addressed on the runtime digest (not the deterministic build-input hash). The build-input hashes are recorded as properties for separate query. tests/test_compute.py additions: - _docker_subprocess_factory extended with base_image_digest= parameter; heuristic distinguishes project-image vs base-image inspect calls by checking for "adcs-compute" prefix. - TestDockerImageEmit class (4 tests): all-properties present, idempotent-within-one-run, base-image-missing graceful degrade, colon-escape in IRI suffix. 
- test_dockerfile_from_line_parseable smoke test against the real compute/Dockerfile (sanity: regex parser returns a python tag). The image node is now emitable but not yet REFERENCED from evidence nodes — that's commit 4 (prov:wasDerivedFrom wiring) + commit 6 (SHACL closure rule enforcing the link). Tests: 22 passed, 1 skipped (live Docker daemon required). Part of WP3 (subplan §4.3); third of 7 commits closing issue #4 AC3. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
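As an illustration of the FROM-line parsing and the IRI shape described in this commit, here is a minimal sketch. The function names and the exact regex are assumptions made for the example; only the `urn:adcs:docker-image:` prefix, the colon-to-dash replacement, and the empty-string fallback come from the commit message.

```python
import re

# Assumed regex: first FROM line, skipping an optional --platform flag.
_FROM_RE = re.compile(
    r"^\s*FROM\s+(?:--platform=\S+\s+)?(\S+)", re.IGNORECASE | re.MULTILINE
)


def parse_from_image(dockerfile_text: str) -> str:
    """Return the image reference of the first FROM line, or "" on parse failure."""
    match = _FROM_RE.search(dockerfile_text)
    return match.group(1) if match else ""


def image_iri(runtime_digest: str) -> str:
    """Content-addressed IRI: colons in the digest are replaced by dashes."""
    return "urn:adcs:docker-image:" + runtime_digest.replace(":", "-")
```

Returning an empty string instead of raising mirrors the "graceful fallback" decision recorded under Q1: a missing or unparseable base image should not fail the pipeline.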

…rivedFrom (issue #4 AC4) WP3 §4.4. With the rtm:DockerImage entity emitted (commit 3), wire the link: every evidence node produced under --compute=docker now carries `prov:wasDerivedFrom <image-iri>` in addition to the existing `prov:wasGeneratedBy <activity>`. The two edges together let a SPARQL traversal answer both "which image produced this proof?" (wasDerivedFrom) and "which stage produced this proof?" (wasGeneratedBy, the WP1 schema enrichment). evidence/binding.py: bind_proof_evidence + bind_simulation_evidence gain an optional `image_iri: URIRef | None = None` kwarg. When present, add (ev_uri, PROV.wasDerivedFrom, image_iri) after the activity triples. Local-compute callers pass None (no edge added) — keeps the local path byte-identical to pre-WP3. pipeline/runner.py (Stage 4): Compute image_iri ONCE per stage by calling state.compute_backend.emit_image_node(ev_graph) when state.compute_name == "docker"; otherwise None. Thread it through the four bind_proof_evidence calls (REQ-001..004) and the three bind_simulation_evidence calls. emit_image_node is idempotent so a single call captures the per-run image identity for all evidence. Banner prints the emitted IRI for visibility under --compute=docker. The link is now in place; the SHACL closure rule that REQUIRES it for Docker-executed evidence arrives in commit 6. Tests: 202 passed (+12 since pre-WP3), 5 skipped, 3 deselected. The 12 new tests are 8 hash_docker_image + 4 emit_image_node from commits 2 & 3; the new bind_* kwarg is exercised via the pipeline end-to-end path (test_pipeline.py runs run_pipeline which now threads image_iri=None for the default --compute=local). Part of WP3 (subplan §4.4); fourth of 7 commits closing issue #4 AC4. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

WP3 §4.5. The reverse-lookup the WP3 schema enables: given an image digest, find every evidence node that was produced by a container started from that image. New SPARQL constant + helper in traceability/queries.py: EVIDENCE_BY_IMAGE — joins rtm:DockerImage on rtm:contentHash via prov:wasDerivedFrom, with initBinding for the target digest. evidence_by_image(g, d) — returns list of dicts with ev / type / evContentHash / modelHash keys. Empty list on miss. Walks the union view (the queries module's documented convention); pass a Dataset to query across <adcs:evidence> + any other layer that ends up holding evidence-image links. tests/test_docker_image_evidence.py: 4 new tests, synthesized dataset has two images (A + B) with two/one derived evidence nodes plus one unlinked (local-compute-style) node: - returns linked evidence (image A -> 2 rows) - isolates by digest (image B -> 1 row, no leak) - miss returns empty list - unlinked evidence stays invisible to every image query Tests: 12/12 in tests/test_docker_image_evidence.py (8 hash + 4 helper). Part of WP3 (subplan §4.5); fifth of 7 commits closing issue #4 AC5. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

WP3 §4.6. Gives the WP3 schema teeth at Stage 6.5: every rtm:Evidence whose generating activity ran under --compute=docker (signalled by prov:atLocation matching urn:adcs:location:docker:*) MUST link to a rtm:DockerImage via prov:wasDerivedFrom. Local-compute evidence is exempt — the SPARQL target filter excludes urn:adcs:location:local:* activities, so the nominal pipeline run continues to pass closure. The shape follows the existing rtm:BackwardTraceabilityShape pattern (sh:targetClass + sh:sparql with $this projection) rather than the sh:target + SPARQLTarget pattern from the subplan draft — both work under pyshacl but staying consistent with the established style keeps the shape suite uniform. Three new tests in tests/test_shape_suite.py: - test_docker_evidence_without_image_link_fails: synthesize a Docker-located activity + evidence WITHOUT wasDerivedFrom on a copy of the nominal dataset; closure fails with a DockerImage violation. - test_docker_evidence_with_image_link_passes: same shape but WITH a valid rtm:DockerImage + wasDerivedFrom edge; closure does NOT add a DockerImage complaint. - test_local_evidence_not_required_to_link_to_image: explicit conditional-correctness check — the nominal --compute=local fixture has only local-located activities, the shape's target filter must be vacuous on it. Tests: 13/13 in tests/test_shape_suite.py (10 prior + 3 new). Part of WP3 (subplan §4.6); sixth of 7 commits closing issue #4 AC6. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

README.md (under "Compute Backends (Phase L)"): new "Image as tracked evidence (WP3)" subsection — what WP3 adds, the six properties on rtm:DockerImage, a working evidence_by_image SPARQL example, and an explicit pointer to WP5 for the deferred notebook Act 9/10 rewrite + audit-module image surfacing. CLAUDE.md ("Named-graph layout"): one-line update on <adcs:evidence> acknowledging it now holds rtm:DockerImage too under --compute=docker. The deeper README "Compute Backends" rewrite (image-as-evidence narrative + audit summary integration) is WP5 territory; this commit ships the minimal docs delta so contributors reading the repo today can find the new entity + the SPARQL helper. Tests: 209 passed (+19 across WP3 commits 2-6), 5 skipped, 3 deselected. No regressions. Part of WP3 (subplan §4.9, §7); seventh of 7 commits. Partial coverage of issue #4 AC9 — the full README "Compute backends" section rewrite + audit-module + notebook narrative defer to WP5. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Lands the backend half of issue #4 (7 of 9 acceptance criteria):

- rtm:DockerImage class + property set (commit c31cbce)
- hash_docker_image() build-input hasher (a6a0680)
- DockerCompute.emit_image_node() emits one node per run (343420e)
- prov:wasDerivedFrom on Docker-produced evidence (425a263)
- evidence_by_image() SPARQL helper (86974f5)
- DockerEvidenceShape SHACL closure rule (3cbcf80)
- README + CLAUDE.md notes (60a3721)

The two narrative items (audit summary + notebook Act 9) are deferred to WP5. Issue #4 stays open with a status comment listing the split.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

WP4 c1. Preflight reachability check on the persistence backend so failure is fast and clear at startup rather than discovered at Stage 7.

- BackendUnavailable exception (new in pipeline/backends/base.py)
- StoreBackend.probe() Protocol method
- LocalBackend.probe() writes + deletes .probe sentinel in output dir
- FlexoBackend.probe() HEADs /orgs/<org>; respects FLEXO_PROBE_TIMEOUT (default 10s, distinct from the slow-call FLEXO_TIMEOUT)
- FusekiBackend.probe() HEADs /data

7 new unit tests in test_backends.py cover success + failure paths for each backend (mocked httpx). All 18 backend tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

WP4 c2. Builds on the StoreBackend probe (c1) to make startup the single fail-fast moment for backend reachability — no more discovering Flexo / Docker unavailability at Stage 7 or Stage 2. - ComputeUnavailable exception (compute/base.py); DockerNotAvailable now subclasses it for backwards compat - ComputeBackend.probe() Protocol method - LocalCompute.probe() is a no-op (always available) - DockerCompute.probe() wraps _check_daemon() - PipelineState gains store_backend field - run_pipeline() constructs both backends up-front and runs _run_preflight() before Stage 0; banner prints describe() + PASS/FAIL for each; sys.exit(2) on any failure - Stage 7 reads state.store_backend instead of re-instantiating Tests: TestComputeProbe in test_compute.py + PipelineState fixture fix in test_pipeline.py. Full suite: 219 passed (no regressions). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
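The probe-then-gate pattern from c1 and c2 might look something like the sketch below. It is a simplified illustration: a single `BackendUnavailable` exception stands in for both the storage and compute exceptions, and `Probeable`, `describe()` output, and `run_preflight` are assumed shapes rather than the repository's actual signatures.

```python
from pathlib import Path
from typing import Protocol


class BackendUnavailable(RuntimeError):
    """Raised when a backend fails its startup reachability check."""


class Probeable(Protocol):
    """Assumed minimal interface shared by storage and compute backends."""

    def describe(self) -> str: ...

    def probe(self) -> None: ...


class LocalBackend:
    def __init__(self, output_dir):
        self.output_dir = Path(output_dir)

    def describe(self) -> str:
        return f"local:{self.output_dir}"

    def probe(self) -> None:
        # Write and delete a sentinel file to prove the output dir is writable.
        sentinel = self.output_dir / ".probe"
        try:
            sentinel.write_text("probe")
            sentinel.unlink()
        except OSError as exc:
            raise BackendUnavailable(f"cannot write to {self.output_dir}: {exc}") from exc


def run_preflight(backends) -> bool:
    """Probe every backend, print PASS/FAIL per backend, return overall success."""
    all_ok = True
    for backend in backends:
        try:
            backend.probe()
            print(f"PASS  {backend.describe()}")
        except BackendUnavailable as exc:
            print(f"FAIL  {backend.describe()}: {exc}")
            all_ok = False
    return all_ok
```

A caller following the commit's semantics would exit with status 2 when `run_preflight(...)` returns `False`, before any pipeline stage runs.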

WP4 c3. Adds the "code remote" half of the three-remote provenance chain: every rtm:DockerImage now carries a literal git+URI pointing at the Dockerfile in the source tree at the commit it was built from. - compute/git_ref.py — current_git_ref(repo_root, file_path); shells out to git rev-parse + git config; produces git+https://.../@<sha>#<path> with graceful fallbacks (git+file://, git+local://uncommitted) - docker_compute.emit_image_node() appends rtm:gitRef triple - Tests: - TestGitRef: shape + fallback + ssh→https normalization - TestImageNodeEmitsGitRef: stubbed _image_metadata + verify the triple lands on the image IRI (no Docker daemon required) The rtm:gitRef property is declared formally in c8 alongside the rest of the WP4 ontology additions; this commit uses the IRI directly. Closes part of issue #4 (preparation for the reproduce CLI in c9). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

WP4 c4. Adds the "storage remote" half of the three-remote provenance chain: when persisting to Flexo (or Fuseki), every rtm:DockerImage gains a rtm:flexoRecord pointer to where in the storage backend its record lives.

- StoreBackend.record_uri(layer) Protocol method
- LocalBackend.record_uri() returns None (no remote)
- FlexoBackend.record_uri(layer) -> urn:adcs:flexo:<org>/<repo>/<branch>
- FusekiBackend.record_uri(layer) -> urn:adcs:fuseki:<encoded-url>/<layer>
- Runner Stage 4 attaches rtm:flexoRecord to the image after emit_image_node, only when the store backend exposes a non-None record_uri. LocalBackend runs unchanged (no triple added).

Tests added in test_backends.py for all three shapes; 21 backend tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

WP4 c5. Distinguishes the image (static artifact), container (transient materialization), and host (location) as three first-class entities with standard PROV edges between them. - ExecutionMetadata.container_uri() -> urn:adcs:docker-container:<id> (None for local runs or missing container_id) - _bind_execution_metadata accepts image_iri; when container_uri is non-None, emits: <container> a rtm:DockerContainer, prov:Entity ; rtm:containerId "<id>" ; prov:wasDerivedFrom <image> ; prov:startedAtTime / endedAtTime "..." <activity> prov:used <container> - bind_proof_evidence / bind_simulation_evidence thread image_iri through to the metadata helper. - No change to existing PROV edges; new edges are purely additive. Tests: TestContainerEntity in test_compute.py (4 cases — local skip, docker emission, image link, missing-id sentinel). All 17 targeted tests pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

WP4 c6. Adds the "under whose authority?" axis to the provenance chain without pulling in FOAF or W3C Org Ontology (those stay deferred per CLAUDE.md future-work #2). Two org IRIs per run: - operating org: who runs the container/authors the work (default urn:adcs:org:local-operator) - hosting org: who operates the substrate (host + Docker daemon) (default: same as operating) Both env-configurable via ADCS_{OPERATING,HOSTING}_ORG_IRI; defaults play "single-operator local" so existing runs don't change. New edges in evidence/binding.py: <container> prov:wasAttributedTo <operating-org> <host> rtm:operatedBy <hosting-org> <executor> prov:actedOnBehalfOf <operating-org> Both prov:Organization typings + rdfs:labels emitted to <adcs:context> at startup via compute/organizations.py::emit_org_nodes. PipelineState gains operating_org_iri + hosting_org_iri fields. bind_proof_evidence / bind_simulation_evidence gain corresponding kwargs threaded through to _bind_execution_metadata. The rtm:operatedBy predicate is declared formally in c8 alongside the rest of the ontology additions; this commit uses it directly. Full pytest: 231 passed (up from 219 baseline; no regressions). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
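The defaulting rule for the two organization IRIs can be sketched in a few lines. The function name `resolve_org_iris` is hypothetical; the environment variable names and the default IRI are taken from the commit message.

```python
import os

DEFAULT_OPERATING_ORG = "urn:adcs:org:local-operator"


def resolve_org_iris(env=None):
    """Return (operating_org_iri, hosting_org_iri).

    The hosting org falls back to the operating org, so an unconfigured run
    behaves as a single-operator local setup.
    """
    env = os.environ if env is None else env
    operating = env.get("ADCS_OPERATING_ORG_IRI") or DEFAULT_OPERATING_ORG
    hosting = env.get("ADCS_HOSTING_ORG_IRI") or operating
    return operating, hosting
```

Passing a plain dict as `env` keeps the function easy to exercise without touching the real process environment.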

WP4 c7. The SHACL closure-rule check is an automated, fully-specified outcome — wraps as an earl:Assertion so the technical-trust witness is queryable RDF, beside the existing human-attestation witness (rtm:Attestation, which also subclasses earl:Assertion). - new traceability/closure_assertion.py::emit_closure_assertion() - Stage 6.5 in pipeline/runner.py calls it after verify() - assertion typed rtm:ClosureRuleAssertion + earl:Assertion + prov:Activity - carries earl:outcome (passed/failed), earl:mode (automatic), earl:test, earl:subject, prov:wasAssociatedWith, prov:atTime, rtm:violationCount - one assertion per run (Q9: per-run granularity, not per-shape) - compute.reproduce-side rtm:DigestMatchAssertion lands in c9 with the CLI Discipline: earl:mode is always earl:automatic — verification, not validation. Human attestation continues to use earl:manual / earl:semiAuto. Test: test_audit::test_closure_assertion_emitted_into_audit_graph. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…ovenance WP4 c8. Formally declares every WP4 term used in commits c3-c7, adds the four new SHACL closure rules, regenerates rtm.ttl, and bumps the triple-count budget to accommodate the additions.

New classes:
- rtm:DockerContainer (subClassOf prov:Entity): transient materialization
- rtm:DigestMatchAssertion (subClassOf earl:Assertion + prov:Activity)
- rtm:ClosureRuleAssertion (subClassOf earl:Assertion + prov:Activity)

New properties:
- rtm:containerId (on DockerContainer)
- rtm:gitRef (on DockerImage; xsd:anyURI)
- rtm:flexoRecord (on DockerImage; ObjectProperty)
- rtm:operatedBy (on prov:Location; subPropertyOf prov:wasAttributedTo)
- rtm:violationCount (on ClosureRuleAssertion)
- rtm:transactionId (on prov:Activity), for service-invocation wire logs
- rtm:documentRef (on rtm:Evidence; xsd:anyURI), for txnlog evidence

New SHACL shapes (rtm_shapes.ttl):
- DockerImageProvenanceShape: every DockerImage MUST have rtm:gitRef
- DockerContainerShape: every DockerContainer MUST have wasDerivedFrom exactly one DockerImage + rtm:containerId
- OrganizationAuspicesShape: DockerContainer SHOULD declare prov:wasAttributedTo a prov:Organization (Warning, not Violation)
- TransactionLogShape: Evidence with rtm:documentRef MUST also have rtm:contentHash + prov:wasGeneratedBy

Side effects:
- Bumped triple budget 380 → 450 (218 used; 232 headroom for WP5)
- emit_image_node now types rtm:gitRef as xsd:anyURI literal so it satisfies the new shape's datatype constraint
- test_docker_evidence_with_image_link_passes fixture updated to emit rtm:gitRef on its synthetic image

Full pytest: 232 passed (no regressions).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

WP4 c9. New compute/reproduce.py Typer app that closes the reproducibility loop: given an rtm:DockerImage record, clone the recorded git ref, rebuild from compute/Dockerfile, and compare the resulting runtime digest to the recorded rtm:contentHash. CLI: uv run python -m compute.reproduce \ --image-digest sha256:... \ --from-trig output/rtm.trig Output: - prints PASS/FAIL with detail - emits rtm:DigestMatchAssertion (earl:Assertion + prov:Activity) into <adcs:audit> with earl:outcome + earl:mode=earl:automatic + earl:subject=<image-iri> + prov:wasAssociatedWith=<reproduce-cli-agent> - exit 0 on match, 1 on mismatch, 2 on prerequisite failure Pure logic split out as testable units (parse_git_ref, load_image_record, emit_digest_match_assertion) so 11 unit tests cover the orchestration without needing Docker. The actual clone+build subprocess loop is exercised opt-in via -m live. Honors the verification/validation discipline: earl:mode is always earl:automatic for these (automated check). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

WP4 c10. Adds the fourth service in the three-remote story: a CouchDB-backed transaction-log store running in its own container with its own URI (urn:adcs:service:transaction-log-store) and its own hosting auspices. - pipeline/backends/txnlog.py — minimal CouchDB client: - probe() HEADs the db, creates on 404, surfaces auth failures - put_document() PUTs JSON; 409 conflict treated as idempotent success - get_document() — readback path for the trust-query renderer - Env config: ADCS_TXNLOG_{URL,DB,USER,PASSWORD} - 8 unit tests in test_txnlog.py via httpx.MockTransport Bonus (same commit, single-line surface area): security hardening in compute/reproduce.py per automated security review. Git refs come from RDF stores that may be partly trust-boundary'd, so: - reject base/sha components starting with '-' (flag smuggling) - require base to start with https:// / ssh:// / git@ - add '--' end-of-options sentinel to git clone + git checkout Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
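The three hardening rules for git refs read from an RDF store could be expressed along these lines. This is a sketch under assumptions: `validate_clone_inputs` and `clone_commands` are hypothetical names, and the exact argv the repository passes to git is not shown in the commit message.

```python
ALLOWED_PREFIXES = ("https://", "ssh://", "git@")


def validate_clone_inputs(base: str, sha: str) -> None:
    """Reject values that could be interpreted as git options or unexpected transports."""
    for name, value in (("base", base), ("sha", sha)):
        if value.startswith("-"):
            raise ValueError(f"{name} must not start with '-': {value!r}")
    if not base.startswith(ALLOWED_PREFIXES):
        raise ValueError(f"base must start with one of {ALLOWED_PREFIXES}: {base!r}")


def clone_commands(base: str, sha: str, dest: str):
    """Build (but do not run) the argv lists for clone + checkout."""
    validate_clone_inputs(base, sha)
    return [
        # '--' ends option parsing, so base and dest are always positional.
        ["git", "clone", "--", base, dest],
        # A trailing '--' marks sha as a revision rather than a pathspec.
        ["git", "-C", dest, "checkout", sha, "--"],
    ]
```

Validating before building the argv, and passing lists rather than a shell string, means a hostile value such as `--upload-pack=...` is rejected instead of being handed to git as a flag.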

WP4 c11. Context manager that wraps a service invocation, captures request/response, redacts secrets, PUTs the JSON to the txnlog store, and emits RDF in <adcs:audit>: <activity> a prov:Activity ; rtm:transactionId "<id>" ; prov:wasAssociatedWith <caller> ; prov:used <service> ; prov:startedAtTime/endedAtTime "<iso>" <evidence> a rtm:Evidence ; rtm:contentHash "sha256:<hash>" ; rtm:documentRef <store-url> ; prov:atLocation <txnlog-service-iri> ; prov:wasGeneratedBy <activity> Redaction allowlists: Headers: Authorization, Cookie, Set-Cookie, X-Api-Key, X-Auth-Token, Proxy-Authorization Body keys: password, passwd, token, secret, api_key, apikey, access_token, refresh_token When store=None (e.g. --backend=local without txnlog), the activity is still recorded but the evidence node is skipped — the TransactionLogShape requires rtm:documentRef+rtm:contentHash, so emitting the evidence without them would fail closure. Robustness: a store.put_document failure does NOT propagate; the wrapped service call's outcome is preserved. Exceptions inside the context block are recorded in the document AND re-raised. 7 unit tests in test_transaction_log.py via FakeStore stand-in. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
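A redaction pass over the header and body-key lists above might be sketched as follows. The helper names, the `[REDACTED]` placeholder, the case-insensitive matching, and the recursion into nested structures are all assumptions for illustration; only the two name lists come from the commit message.

```python
REDACTED = "[REDACTED]"  # hypothetical placeholder value

SECRET_HEADERS = {
    "authorization", "cookie", "set-cookie",
    "x-api-key", "x-auth-token", "proxy-authorization",
}
SECRET_BODY_KEYS = {
    "password", "passwd", "token", "secret",
    "api_key", "apikey", "access_token", "refresh_token",
}


def redact_headers(headers: dict) -> dict:
    """Replace values of sensitive headers; header names compared case-insensitively."""
    return {
        name: (REDACTED if name.lower() in SECRET_HEADERS else value)
        for name, value in headers.items()
    }


def redact_body(value):
    """Recursively replace values stored under sensitive keys in JSON-like data."""
    if isinstance(value, dict):
        return {
            key: (REDACTED if str(key).lower() in SECRET_BODY_KEYS else redact_body(item))
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact_body(item) for item in value]
    return value
```

Both helpers return new objects, so the original request and response stay untouched for the wrapped service call.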

…12 part 1) Plumbs the optional TxnLogBackend through the runtime so future PRs can wrap individual service calls without further state surgery. - PipelineState gains txnlog_store: TxnLogBackend | None - ADCS_TXNLOG_ENABLED=1 env gate constructs a TxnLogBackend and runs it through _run_preflight alongside compute + storage probes - Preflight banner prints the txnlog describe() line when enabled - Existing runs (no env var set) are unchanged — txnlog_store is None, preflight skips that probe Per-call wrapping (FlexoBackend HTTP / DockerCompute subprocess / reproduce CLI subprocess) is deferred to a focused follow-up PR; those changes require more surgery in the backend bodies and have a small self-referential gotcha (FlexoBackend.persist would log into <adcs:audit> while persisting it). The plumbing landed here unblocks that work without bundling its risk into WP4. 95 targeted tests pass; no regressions. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

WP4 c13. Operationalizes the "how can I trust this evidence?" technical-trust panel as queryable SPARQL helpers. Six functions, six typed @dataclass(frozen=True) records:

- technical_provenance(ds, evidence_iri) -> TechnicalProvenance
- reproducibility_witnesses(ds, image_iri) -> list[DigestWitness]
- closure_witnesses(ds, graph_iri) -> list[ClosureWitness]
- auspices_chain(ds, evidence_iri) -> AuspicesChain
- service_invocations_for(ds, ...) -> list[ServiceInvocationRow]
- trust_summary(ds, evidence_iri) -> TrustSummary

Plus render_trust_summary(): compact text rendering for interrogate.explain Trust panel.

All queries are read-only, use OPTIONAL for graceful partial matches (local-compute runs have no container/image but still produce a useful technical row), and return typed records callers can pass without re-querying.

Tests: 8 cases on a nominal local+local pipeline run; cover the empty + populated paths for each query.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

WP4 c14. Adds an env-driven branch-name prefix to FlexoBackend so each pipeline run can land in its own scoped branch space (e.g. cert/2026-06-12-001/evidence) without forcing the pattern on the default single-canonical-state run. - _branch_id(graph_iri, prefix="") prepends the prefix - FLEXO_BRANCH_PREFIX env (default "") is read in __init__ - branch_prefix kwarg overrides env - record_uri() honors the prefix so rtm:flexoRecord points at the correctly-scoped branch IRI Unchanged: empty default means existing runs land in adcs-demo/lifecycle/<layer> exactly as today. Test: test_flexo_backend_branch_prefix_applies. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

WP4 c15. Provisioning scripts for the local txnlog store (CouchDB), so a no-experience operator can bring the canonical multi-remote stack up with one command. - tools/start-services.sh — idempotent docker run + db ensure; waits for CouchDB readiness; prints the env-var block to export - tools/stop-services.sh — symmetric teardown; --purge wipes data - docker-compose.yml — same shape for users who prefer compose Both paths use the same container name (couchdb-adcs) and default credentials (adcs/adcs); env-var overrides documented in the scripts and in .env.example (lands in c16). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

WP4 c16. The architecture demo gets a self-contained ARCHITECTURE.md that a new collaborator can read cold; README + CLAUDE.md get the three-remote + fourth-service entry points; .env.example documents every env var with defaults. - ARCHITECTURE.md (new): three-remote diagram + URI scheme table + full provenance-chain example + EARL outcome section + trust query list + reproducibility loop + preflight gate semantics - README.md: new Setup section (.env + tools/start-services.sh + preflight fail-fast), new Canonical multi-remote run subsection under Quick Start, new Reproducibility verification subsection, ARCHITECTURE.md pointer at the top - CLAUDE.md: new "Three-remote architecture (WP4)" section under named-graph layout — concise URI scheme summary + EARL outcomes + preflight + trust queries - .env.example (new): every FLEXO_* / ADCS_TXNLOG_* / ADCS_*_ORG_* variable with documented defaults Full pytest after sweep: 267 passed. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

16 commits making the four-service architecture (git + Flexo + local Docker + CouchDB txnlog) operational rather than narrated, with preflight gating, organizational auspices, EARL-wrapped automated outcomes, six trust queries, and the reproducibility CLI.

c1. feat(backends): probe() on StoreBackend + 3 implementations
c2. feat(compute): probe() on ComputeBackend + preflight gate
c3. feat(compute): capture git ref + emit rtm:gitRef on rtm:DockerImage
c4. feat(backends): record_uri() + emit rtm:flexoRecord
c5. feat(evidence): emit rtm:DockerContainer entity + prov:used edge
c6. feat(provenance): organizational auspices via prov:Organization
c7. feat(audit): emit rtm:ClosureRuleAssertion from Stage 6.5
c8. feat(ontology): WP4 classes + properties + shapes
c9. feat(compute): reproduce CLI + rtm:DigestMatchAssertion
c10. feat(backends): TxnLogBackend (CouchDB) + reproduce hardening
c11. feat(traceability): TransactionLogger + wire-logs as rtm:Evidence
c12. feat(runner): wire txnlog store into PipelineState + preflight
c13. feat(traceability): six trust queries + render_trust_summary
c14. feat(backends): optional FLEXO_BRANCH_PREFIX
c15. chore(tools): start-services.sh + stop-services.sh + docker-compose.yml
c16. docs: ARCHITECTURE.md + README + CLAUDE.md + .env.example

Companion issues filed: #7 (PU registry), #8 (RIME services), #9 (Starforge oracles). Issue #4 remains open for the WP5 narrative items (audit module image surfacing + notebook Act 9 update).

Full pytest: 267 passed, 5 skipped, 3 deselected.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Closes one of two residual issue #4 ACs deferred from WP3. The audit report's markdown now includes a "Docker image provenance" table for any run that emitted at least one rtm:DockerImage, listing image IRI, digest, git ref, and count of evidence nodes derived from it. - DockerProvenanceRow dataclass (frozen) - docker_provenance(ds) -> list[DockerProvenanceRow] SPARQL helper - AuditReport gains docker_provenance: list[DockerProvenanceRow] - audit() populates it via the SPARQL query - _render_markdown adds the new section between coverage matrix and orphans, omitted entirely when the list is empty Local-compute runs see no change (empty list = section omitted). Docker-compute runs see the image surfaced beside the audit direction summary, where an auditor reading the report can trace "what produced what" without leaving the report. Tests: 2 new in test_audit.py — empty path + populated path with a synthesized image. 18 audit tests pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Closes the second of two residual issue #4 ACs deferred from WP3. A new cell after the existing Act 9 narration synthesizes the rtm:DockerImage node (mimicking what DockerCompute._emit_image_node does in a live --compute=docker run) with WP3 properties + WP4 extensions (rtm:gitRef, rtm:flexoRecord), then runs the WP3 evidence_by_image SPARQL helper against the augmented dataset to show the inverse query the executor-label model couldn't answer. The cell: - emits a synthetic rtm:DockerImage with content hashes + git ref + flexo record cross-link - wires synthesized v2 evidence to derive from it - runs evidence_by_image() + interpolates the count into the markdown - explains the reproducibility loop (compute.reproduce + EARL outcome) The original Act 9 cell (executor-agent label) stays intact above — the new cell extends rather than replaces, so the narrative reads: "here's the executor label (today's model); here's the image as a node (WP3 + WP4); here's the inverse query that becomes possible." Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Disposable one-page reconciliation showing every claim in openbee_dsg_opener.pptx Slide 5 (Demo #1) backed by a concrete code receipt. Reviewer can trace "Flexo Deployment + oracles + evidence reproducibility with git hashes and docker" to specific modules, commits, and tests. Notes that WP4 exceeded the slide's promise on the "with git hashes and docker" claim — container-as-entity, organizational auspices, wire-level audit trail, six trust queries are all additive surface beyond what was advertised. Cross-links the three companion issues (#7 PU registry, #8 RIME services, #9 Starforge oracles) as the "what's next" deliverable for Planetary Utilities' team. Pages auto-publishes the marimo notebook export at dynamicalsystemsgroup.github.io/ADCS-lifecycle-demo — the WP5 c2 notebook update will flow on next push. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…ging Three commits closing the residual narrative items from WP3 (issue #4 ACs deferred to WP5) and the slide-reconciliation pass. c1. feat(audit): surface Docker image identity in audit summary - DockerProvenanceRow + docker_provenance() + AuditReport extension - _render_markdown adds "Docker image provenance" table for --compute=docker runs; section omitted for local runs - 2 new tests covering empty + populated paths c2. docs(notebook): Act 9 narrative now shows rtm:DockerImage as a node - New cell after the executor-label cell synthesizes rtm:DockerImage with WP3 + WP4 properties (contentHash, gitRef, flexoRecord) - Wires synthesized v2 evidence to derive from it - Runs evidence_by_image() SPARQL helper in-cell + interpolates the count into the markdown - Explains compute.reproduce + EARL DigestMatchAssertion c3. docs: RECONCILIATION.md — slide claims ↔ code receipts - Maps every "Flexo Deployment, oracles & evidence reproducibility with git hashes and docker" phrase to its code receipt - Cross-links companion issues #7/#8/#9 (PU / RIME / Starforge) - Notes WP4 *exceeded* the slide's promise on the docker axis End-of-roadmap alignment review (all in this merge): - Discipline sweep: validate-vs-verify clean (only legitimate uses + the one back-compat alias from WP1 §10) - openCAESAR sweep: zero hits - ROBOT_OPTIONAL sweep: zero hits - ValidateShapes IRI fragment: preserved per #6 known follow-up - Issues #2 + #3 retroactively closed with status comments - Issue #4 ready to close (this merge lands the residual 2 ACs) - Issues #5, #6, #7, #8, #9 correctly remain open (deferred + future-work) - End-to-end smoke: pipeline runs cleanly; 1084 union triples - Full pytest: 269 passed Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

CI's "Confirm rtm.ttl is committed in-sync with rebuild" gate was failing because the build_time was wall-clock-now, so every rebuild produced a different sha256 → diff against committed copy → fail. _reproducible_build_time() resolves the timestamp in this order: 1. SOURCE_DATE_EPOCH env var (Reproducible Builds standard). 2. Most-recent git-commit time of the build inputs (rtm-edit.ttl, sysml_term_map.csv, build_ontology.py). Stable across CI + local. 3. datetime.now() — unreproducible fallback for bootstrap. After this commit lands, the next regen of ontology/rtm.ttl + assembly_manifest.json (separate commit) will pin the timestamp to THIS commit's ct; future CI rebuilds will compute the same value and produce byte-identical artifacts. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
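The three-step resolution order can be sketched as below. This is an approximation under assumptions, not the code in scripts/build_ontology.py: the function signature, the use of `git log -1 --format=%ct`, and the UTC normalization are choices made for the example.

```python
import os
import subprocess
from datetime import datetime, timezone


def reproducible_build_time(inputs, repo_root="."):
    """Resolve a build timestamp: SOURCE_DATE_EPOCH, then git commit time, then now."""
    # 1. Reproducible Builds convention: seconds since the Unix epoch.
    epoch = os.environ.get("SOURCE_DATE_EPOCH")
    if epoch:
        return datetime.fromtimestamp(int(epoch), tz=timezone.utc)

    # 2. Committer time of the most recent commit touching any build input.
    try:
        out = subprocess.run(
            ["git", "log", "-1", "--format=%ct", "--", *inputs],
            cwd=repo_root, capture_output=True, text=True, check=True,
        ).stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        out = ""
    if out:
        return datetime.fromtimestamp(int(out), tz=timezone.utc)

    # 3. Unreproducible fallback, e.g. outside a git checkout.
    return datetime.now(timezone.utc)
```

Step 2 returns an empty string when the relevant history is unavailable, in which case the sketch silently drops to the wall-clock fallback.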

…d_time Pins the artifact's build_time to the prior commit's ct (where build_ontology.py + the ontology inputs were last touched), via the new _reproducible_build_time() in scripts/build_ontology.py. CI rebuilds will now produce byte-identical artifacts. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…* refs Four changes addressing reviewer feedback:

1. Removed all WP1..WP5 mentions from notebook prose. The narrative describes capabilities, not implementation phases.
2. New cell, "Many Authoritative Sources of Truth, one stitched provenance graph", acknowledges the diverse ASoTs (SysMLv2 model, symbolic + numerical oracles, Docker image, engineer attestation, closure-rule check, audit module), shows what each holds and what grounds its trust, and frames the demo's contribution as the integration that stitches them via standard PROV/EARL/GSN edges without overloading anyone's vocabulary.
3. Renamed the docker-as-evidence cell to "The runtime ASoTs as first-class nodes" and expanded it to show the container entity (rtm:DockerContainer, prov:used edge from analysis activity) so the materialization story is visible inline. Reports both proof-artifact AND simulation-result counts from the evidence_by_image inverse query.
4. New "Numerical evidence — the full provenance, end-to-end" cell uses trust_summary + render_trust_summary against EV-SIM-REQ-001-v2 to render the complete chain (oracle → activity → container → image → git ref → host → org → closure assertion) in one block, so readers see the multi-ASoT stitch concretely rather than as abstract narration.

The Pages workflow regenerates output/notebook.html on next push.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

actions/checkout@v4 defaults to fetch-depth=1, which makes `git log -- <files>` return empty for any file HEAD didn't touch directly. _reproducible_build_time() then falls back to datetime.now() and produces non-reproducible artifacts — the CI diff-check fails on the very build the timestamp fix was supposed to enable. fetch-depth: 0 fetches full history so the commit that last touched the inputs resolves consistently across CI + local. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…-specific CI runs 'make ontology' (with ROBOT) which writes robot_used: true and a ROBOT-specific notes string into assembly_manifest.json. Local contributors without Java run 'make ontology-python' which writes robot_used: false. The manifest's build-path provenance fields are intentionally asymmetric. The load-bearing artifact is rtm.ttl, which IS reproducible thanks to _reproducible_build_time(); that's what the gate checks now. Manifest stays committed but isn't diff-gated. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…wrapping Three failing CI tests: - test_pipeline_runner_help_lists_known_flags - test_rerun_help_lists_known_flags - test_reproduce_cli_help All check for flag substrings ("--auto" etc) in `result.stdout`. Rich's Typer help renderer wraps the flag column to terminal width; CI runners are narrower than dev workstations, which can split "--auto" across a wrap boundary so it's no longer a contiguous substring of the rendered output. Fix: strip ANSI escape codes + collapse whitespace before the substring match. _flatten_help() in test_cli.py; same regex inline in test_reproduce.py (kept local to avoid an import cycle). 18 targeted tests pass locally. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
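The normalization step described here (strip ANSI escapes, then collapse whitespace) can be sketched as follows. The regex and the name `flatten_help` are illustrative assumptions and may differ from `_flatten_help()` in test_cli.py.

```python
import re

# CSI escape sequences: ESC [ parameter bytes, intermediate bytes, one final byte.
_ANSI_RE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def flatten_help(text: str) -> str:
    """Remove ANSI styling and collapse every whitespace run to a single space."""
    return " ".join(_ANSI_RE.sub("", text).split())
```

A test would then assert on `flatten_help(result.stdout)` instead of the raw output, so styling codes and line wrapping at narrow terminal widths no longer affect the substring match.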

mzargham and others added 30 commits May 27, 2026 11:18

mzargham and others added 21 commits May 28, 2026 18:15

mzargham merged commit 58a1dba into main May 29, 2026
2 checks passed

mzargham merged 51 commits into main from staging


mzargham commented May 29, 2026 (edited)


Summary

Closed by this PR

Remain open (intentional)

New surfaces a reviewer can read cold

Test plan
