…yped result records
Split run_pipeline's 285-line body into per-stage free functions
threaded by a PipelineState object. Each stage returns a typed result
record (StructuralResult, SymbolicResult, NumericalResult,
EvidenceBindingResult, AttestationStageResult, ClosureRuleResult,
AuditStageResult, ReportStageResult) assigned to the matching state
field. Downstream stages read prior results via
state.<prior>.<field> instead of free locals.
The activity_to_stage table maps p-plan step IRI fragments
(STEP_NAMES) to pipeline stage numbers. Kept in sync with
traceability.plan_execution; covered by a new unit test.
Vestigial `stage = LifecycleStage.X` assignments dropped (the variable
was set but never read). LifecycleStage and check_gate are preserved
for external callers and tests. CLI surface unchanged.
Tests: 23 pipeline + named_graphs passing, 83 attestation + audit +
traceability + shape_suite + compute + backends passing.
Part of WP1 (roadmap §4.1).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
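A minimal sketch of the state-threading pattern described in this commit, assuming dataclass-based records. Only two of the eight stages are shown. The field names (`requirement_count`, `proof_hashes`) and the stage function names are hypothetical and are not taken from the repository.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class StructuralResult:
    # Hypothetical field; the real record's fields are not shown in the commit.
    requirement_count: int


@dataclass(frozen=True)
class SymbolicResult:
    # Hypothetical field, for illustration only.
    proof_hashes: dict[str, str] = field(default_factory=dict)


@dataclass
class PipelineState:
    # One slot per stage; None until that stage has run.
    structural: StructuralResult | None = None
    symbolic: SymbolicResult | None = None


def stage_structural(state: PipelineState) -> StructuralResult:
    return StructuralResult(requirement_count=4)


def stage_symbolic(state: PipelineState) -> SymbolicResult:
    # Downstream stages read prior results via state.<prior>.<field>.
    assert state.structural is not None, "structural stage must run first"
    count = state.structural.requirement_count
    hashes = {f"REQ-{i:03d}": "<hash>" for i in range(1, count + 1)}
    return SymbolicResult(proof_hashes=hashes)


def run_pipeline() -> PipelineState:
    state = PipelineState()
    state.structural = stage_structural(state)
    state.symbolic = stage_symbolic(state)
    return state
```

Because each stage only receives `state`, a stage can be unit-tested by constructing a `PipelineState` with just the prior results it reads.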
…coped queries
`pipeline.dataset.query_named_graph(ds, layer, sparql, **bindings)`
scopes a SPARQL query to one named graph. Use it when the query is
intentionally layer-specific; keep using `ds.query(...)` when the
query is meant to walk the union view (via
Dataset(default_union=True)).
The existing queries in traceability/audit.py,
traceability/queries.py, and traceability/attestation.py are
intentionally union-scoped. Added a section banner in audit.py
documenting this and extended the query_to_dicts docstring in
queries.py with the convention so future contributors don't reach
for graph_for() when they really want the union.
Two new tests cover the helper: the layer-scoped count is strictly
smaller than the union count, and unknown layers raise KeyError.
The helper is added as a primitive for WP3/WP4 (Docker image + Flexo
remote queries that legitimately want one-layer scope) without
forcing any current call site to migrate.
Part of WP1 (roadmap §4.2).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…methods
The executor and location URI shapes used to live as inline string
construction in evidence.binding._bind_execution_metadata; promote
them to methods on ExecutionMetadata so WP3 (rtm:DockerImage as
evidence) and WP4 (three-remote architecture) can reuse the same
shapes without copy-paste.
IRI shapes preserved byte-for-byte:
executor_uri() -> urn:adcs:executor:<container_id|hostname|unknown>
(colons in suffix replaced with dashes)
location_uri() -> urn:adcs:location:<location_kind>:<hostname|unknown>
evidence.binding._bind_execution_metadata now consumes the methods
directly. A new TestExecutionMetadataURIs class covers prefer-
container-id, fall-back-to-hostname, unknown sentinel, colon
replacement, and the location shape.
Part of WP1 (roadmap §4.3).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
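The two IRI shapes above can be sketched as follows. This is an illustration under assumptions, not the repository's class: the field names are inferred from the placeholders in the shapes, and the sketch returns plain strings where the real methods may return RDF terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutionMetadata:
    # Field names are assumptions inferred from the IRI shapes in the commit.
    location_kind: str = "local"
    hostname: str = ""
    container_id: str = ""

    def executor_uri(self) -> str:
        # Prefer the container id, fall back to the hostname, then the sentinel.
        suffix = self.container_id or self.hostname or "unknown"
        # Colons in the suffix are replaced with dashes.
        return "urn:adcs:executor:" + suffix.replace(":", "-")

    def location_uri(self) -> str:
        host = self.hostname or "unknown"
        return f"urn:adcs:location:{self.location_kind}:{host}"
```

For example, `ExecutionMetadata(container_id="sha256:abc").executor_uri()` yields `urn:adcs:executor:sha256-abc` in this sketch.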
Strict semantic split:
verification = automated, fully-specified check (SHACL conformance,
ROBOT/ELK, hash matching, completeness).
validation = human judgement (attestation, adequacy, sufficiency).
Module + test moves (git mv preserves history):
traceability/validation.py -> traceability/verification.py
tests/test_robot_validation.py -> tests/test_robot_verification.py
Symbol renames inside the renamed module:
validate() -> verify()
validate_shacl() -> verify_shacl()
validate_reverification() -> verify_reverification()
ValidationReport -> VerificationReport
Symbol renames in traceability/rtm.py:
validate_structural_completeness -> verify_structural_completeness
validate_evidence_completeness -> verify_evidence_completeness
Back-compat aliases retained inside the renamed module, to be removed
in a follow-up PR after WP3 lands.
Runner / banner string updates:
"Validating closure-rule suite..." -> "Verifying..."
"Structural validation: PASS" -> "Structural verification: PASS"
Stage 0 banner "Validation: ..." -> "Verification: ..."
Plan.ttl rdfs:label updates:
"Stage 6.5 — Validate Closure-Rule Suite" -> "Verify..."
"Validation Report" -> "Verification Report"
The step IRI fragment <plan/step/ValidateShapes> is PRESERVED to keep
already-persisted <adcs:plan-execution> + <adcs:audit> graphs valid;
IRI rename tracked separately for a future Flexo migration (WP1 §10
Known follow-ups).
Notebook function-call references (Acts 4 + Stage 6.5 narration)
updated to new symbols; narrative prose unchanged (WP5 owns that).
scripts/build_ontology.py is INTENTIONALLY UNTOUCHED: WP2 renames
_validate_sysml_axioms there in the same commit that lands the
openCAESAR cleanup, to avoid a merge conflict on that file.
Where validation legitimately stays: traceability/attestation.py,
request_attestation(), upstream pyshacl.validate, OSLC oslc_qm: IRI
fragments, vendored ontology imports.
Test counts: 171 passed, 4 skipped (baseline 162 + 9 new from prior
WP1 commits). Live-Flexo failures predate WP1 and are out of scope.
Part of WP1 (roadmap §4.4).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
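A sketch of what retaining back-compat aliases inside the renamed module could look like. The bodies here are stand-ins: the real `verify()` runs the automated checks listed above, and its signature and the report's fields are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class VerificationReport:
    # Assumed fields; the real report's shape is not shown in the commit.
    conforms: bool
    violations: list = field(default_factory=list)


def verify(record: dict) -> VerificationReport:
    # Stand-in for the automated check: flag missing keys.
    missing = [key for key in ("id", "evidence") if key not in record]
    return VerificationReport(conforms=not missing, violations=missing)


# Back-compat aliases: old names keep resolving until the follow-up PR
# removes them.
validate = verify
ValidationReport = VerificationReport
```

Plain name aliases keep old imports working without duplicating any logic, so removing them later is a two-line deletion.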
Typer is the demo's CLI framework convention (per WP1 §4.6). The next
commit migrates pipeline.runner + interrogate.{explain,reproduce} to
Typer apps; the rerun.py CLI added later in this PR is Typer-based
from the start.
Pinned to >=0.12,<1.0 (current resolved: 0.26.2). Brings in Click +
Rich + markdown-it-py + shellingham as transitive deps; all stable
and well-established.
Part of WP1 (roadmap §4.6).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces argparse.ArgumentParser with a Typer app. Every flag name is
preserved (--auto, --no-attest, --engineer, --rebuild, --backend,
--compute) so existing invocations work unchanged. Choice-validated
options (--backend, --compute) use Enum subclasses so Typer matches
the prior argparse `choices=` semantics.
The `main()` callable is retained as a thin wrapper around `app()` so
the `[project.scripts] adcs-pipeline = "pipeline.runner:main"` entry
point keeps resolving.
interrogate/explain.py, interrogate/reproduce.py, and
interrogate/visualize.py are library-only modules with no CLI entry
points; nothing to migrate there. WP1 §4.6 specified them
speculatively; the actual scope is just pipeline.runner. The deferred
top-level `adcs` aggregator (issue #5) can revisit when WP4 adds Flexo
materialization commands.
New tests/test_cli.py uses typer.testing.CliRunner for smoke tests:
- pipeline.runner --help renders + lists every flag
- --backend / --compute reject values outside the enum
- main symbol stays importable for the console script
Part of WP1 (roadmap §4.6).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…closes #3)
`interrogate.rerun` walks prov:wasGeneratedBy ->
p-plan:correspondsToStep to translate a VerificationReport into the
dedup'd ordered set of pipeline stages that must re-run to restore RTM
closure. SHACL violations on structural / human-judgement nodes
(attestations, etc.) that have no producing activity are reported
separately; no stage rerun can fix them.
Schema enrichment (evidence/binding.py): every per-evidence
SymbolicAnalysis / NumericalSimulation activity now carries
p-plan:correspondsToStep linking it to the SymbolicAnalysis /
NumericalSimulation step in plan.ttl. This makes the evidence -> step
traversal self-describing rather than relying on activity-IRI naming
conventions, and aligns the per-evidence activities with the existing
stage-level activities emitted by emit_stage_activity.
Stage 6.5 banner extension (pipeline/runner.py): when the verification
report does not conform, render the rerun plan inline so the engineer
sees which stages must re-run without having to invoke the CLI
separately.
CLI (Typer-based, WP1 §4.6 discipline):
uv run python -m interrogate.rerun # default md output
uv run python -m interrogate.rerun --requirement REQ-003
uv run python -m interrogate.rerun --format json
Exit codes: 0 = clean, 1 = stages or structural violations present,
2 = input file not found.
Tests cover all 7 of issue #3's acceptance criteria:
AC1: closed RTM -> empty stage set
AC2: proof hash mismatch -> Stage 2
AC3: simulation violation -> Stage 3
AC4: multiple invalidations -> ordered union [2, 3]
AC5: attestation-level violation -> structural_violations, no stages
AC6: CLI smoke tests in tests/test_cli.py
AC7: Stage 6.5 banner extension verified by integration
Test counts: 187 passed, 4 skipped (was 171 after commit 4; +16 across
WP1 commits 5-7 covering Typer + rerun).
Part of WP1 (roadmap §4.5).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
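The violation-to-stage mapping can be sketched without any RDF machinery. In this illustration the graph walk is replaced by a plain dict from node id to producing step name. `RerunPlan`, `plan_rerun`, and the node ids are hypothetical, and only the two steps named in the acceptance criteria are mapped. Exit code 2 (input file not found) belongs to the CLI layer and is not modelled.

```python
from dataclasses import dataclass, field

# Assumed subset of the step-name -> stage-number table (AC2 and AC3).
ACTIVITY_TO_STAGE = {"SymbolicAnalysis": 2, "NumericalSimulation": 3}


@dataclass
class RerunPlan:
    stages: list = field(default_factory=list)
    structural_violations: list = field(default_factory=list)

    @property
    def exit_code(self) -> int:
        # 0 = clean, 1 = stages or structural violations present.
        return 1 if (self.stages or self.structural_violations) else 0


def plan_rerun(violating_nodes, step_of) -> RerunPlan:
    """Map violating nodes to the deduplicated, ordered stages to re-run.

    step_of maps a node id to the name of the step that produced it;
    nodes with no producing step are reported as structural violations.
    """
    stages = set()
    structural = []
    for node in violating_nodes:
        step = step_of.get(node)
        if step in ACTIVITY_TO_STAGE:
            stages.add(ACTIVITY_TO_STAGE[step])
        else:
            structural.append(node)
    return RerunPlan(stages=sorted(stages), structural_violations=structural)
```

Collecting stage numbers in a set and sorting at the end gives the ordered union from AC4 regardless of the order in which violations are reported.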
Per roadmap cross-cutting section "drop explicit openCAESAR
references", remove prose mentions from the WP1-owned files:
README.md — Architecture blurb + namespace table row
CLAUDE.md — namespace table row
notebook.py — 3 narration cells (Act 1 namespace table,
epilogue prologue summary, Act 11 stack table)
ontology/rtm-edit.ttl — header comment + ontology description +
SysMLv2 binding section comment
ontology/prefixes.py — module docstring + SysMLv2 section comment +
OMG_SYSML inline comment
scripts/fetch_imports.py — module docstring
The OMG IRI itself (http://www.omg.org/spec/SysML/20240501/) stays —
it is the OMG official SysMLv2 OWL rendering, correct on its own
terms. The `omg-sysml:` prefix and the OMG_SYSML constant keep their
names and values. Only the attribution text changes.
Built ontology regenerated (`uv run python -m scripts.build_ontology`)
because rtm-edit.ttl comments changed: ontology/rtm.ttl + manifest
get fresh edit_source_hash. Triple count unchanged (156 in / 156 out).
WP2 owns the code-side cleanup (CSV column `opencaesar_iri` ->
`omg_iri`, constant `SYSML_OPENCAESAR_NS` -> `SYSML_OMG_NS`, lookup
updates in scripts/build_ontology.py + tests/test_ontology_build.py)
and will regenerate rtm.ttl again as part of its commit; that
regeneration will produce identical content because the renames
don't alter the equivalence-axiom IRIs the script emits.
Verification: full-repo grep limited to WP1 prose set returns zero hits;
remaining hits (build_ontology.py constant + lookups, CSV header,
rtm.ttl built artifact) are explicitly WP2 territory.
Tests: 187 passed, 4 skipped.
Part of WP1 (roadmap §4.7).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…scipline README.md: - New "Pipeline architecture" subsection introducing PipelineState + per-stage typed result records + Typer CLI convention. - New "Rerun plan from a verification report" subsection under Interrogation showing the interrogate.rerun CLI (issue #3) with --requirement and --format examples + exit-code contract. - Stage banner: "Validate Closure-Rule Suite" -> "Verify Closure-Rule Suite"; Stage 0 banner sample "Validation:" -> "Verification:" matches what runner now prints. - Top-line paragraph: "validated by a SHACL closure-rule suite" -> "verified by a SHACL closure-rule suite". - Key Directories: traceability/ updated; pipeline/ mentions PipelineState + query_named_graph; interrogate/ adds rerun. - Ontology Authoring section + Toolchain table are NOT touched here — WP2 owns those (ROBOT-as-default rewrite). Single coordination point per the cross-WP plan. CLAUDE.md: - New "Pipeline state + structured stage results" subsection (canonical description of the PipelineState pattern). - New "CLI surface" section: every CLI is Typer; flag names preserved; CliRunner-based tests; deferred top-level `adcs` aggregator linked to issue #5. - New "Verification vs validation (term discipline)" section: defines the split, names pyshacl as the explicit upstream-API exception, notes the preserved ValidateShapes IRI fragment. - Toolchain: pyshacl rephrased to mention the verify wrapper; typer added as a runtime dep. - Key directories: traceability/ + pipeline/ + interrogate/ updated. Tests: 187 passed, 4 skipped. Part of WP1 (roadmap §5). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
WP1 of the roadmap at /Users/z/.claude/plans/i-want-to-continue-atomic-lobster.md. Internal cleanups (PipelineState refactor, query_named_graph helper, ExecutionMetadata URI methods), validation -> verification rename discipline, Typer migration of pipeline.runner, new interrogate.rerun CLI mapping closure violations to pipeline stages (closes #3), and the WP1 share of the openCAESAR prose cleanup. 9 commits, 30 files, +1437 / -266 lines. Test suite 162 -> 187 passing (no new failures). Output triple count 948 -> 955 (+7 new p-plan:correspondsToStep schema enrichment on per-evidence activities). Staged for integration with WP2 (ROBOT default + pytest markers + triple budget + openCAESAR code/data cleanup) before promotion to main. Follow-up issues filed: - #5 deferred top-level `adcs` Typer aggregator (WP4-dependent) - #6 ValidateShapes step IRI fragment rename (WP4-dependent) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add `[tool.pytest.ini_options]` markers `live` and `network` plus
default `addopts = "-m 'not live and not network'"` so the canonical
`uv run pytest` invocation filters infrastructure-dependent tests
without requiring per-test env-var introspection. CI opts in
explicitly with `-m live` (or `-m network` once any are written).
tests/test_flexo_live.py rewritten:
- pytestmark switches from env-var-driven `skipif` to
`@pytest.mark.live`.
- `_flexo_reachable()` removed: connectivity probing belongs in
fixtures, not at module import. When `-m live` is requested but
credentials are missing, the `token` fixture now fails LOUDLY
(pytest.fail) instead of skipping. Skip-on-opt-in would hide infra
breakage; the marker is the opt-in signal.
- Docstring updated to show the new invocation pattern.
Tests: 187 passed, 4 skipped, 3 deselected (live tests filtered out by
default). Previously: 162 passed, 2 failed, 1 errored on live; those
failures were infrastructure noise predating WP1, now correctly gated
behind the marker.
Part of WP2 (subplan §4.B).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
rtm: is an integration ontology — it should contribute only
convenience handles, hashing properties, and SHACL targets, never
new epistemic vocabulary. The gate keeps that promise honest:
`scripts/build_ontology.py` now fails the build if the assembled
artifact exceeds TRIPLE_BUDGET. The budget is 356: the current size of
156 triples plus 200 headroom for WP3's rtm:DockerImage + property set
and other small adds.
Budget bump is a deliberate, single-place act: edit TRIPLE_BUDGET
in scripts/build_ontology.py with an updated rationale comment.
WP3 will bump it when rtm:DockerImage lands; no silent drift.
Build banner now prints `Parsimony: <actual>/<budget> triples
(<headroom> headroom)` alongside the existing artifact summary. The
manifest gains a `triple_budget` block (`value`, `rationale`,
`headroom`) so consumers reading the manifest see the gate without
having to read the sources.
New `tests/test_ontology_size.py` (3 tests) imports TRIPLE_BUDGET as
the single source of truth and verifies:
- the committed `rtm.ttl` parses under budget
- the manifest pins the budget + rationale
- the manifest's recorded triple count matches the parsed artifact
(catches a stale manifest committed without re-running the build)
Tests: 190 passed (was 187), 4 skipped, 3 deselected.
Part of WP2 (subplan §4.C).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
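A minimal sketch of the parsimony gate described above, assuming the triple count has already been computed. The function and exception names are hypothetical. The banner format and the manifest keys (`value`, `rationale`, `headroom`) follow this commit message.

```python
TRIPLE_BUDGET = 356  # 156 current triples + 200 headroom, per the commit


class BudgetExceeded(RuntimeError):
    """Hypothetical exception name; the real build may exit differently."""


def parsimony_banner(actual: int, budget: int = TRIPLE_BUDGET) -> str:
    # Fail the build when the assembled artifact exceeds the budget.
    if actual > budget:
        raise BudgetExceeded(f"{actual} triples exceeds TRIPLE_BUDGET={budget}")
    return f"Parsimony: {actual}/{budget} triples ({budget - actual} headroom)"


def triple_budget_block(actual: int, rationale: str,
                        budget: int = TRIPLE_BUDGET) -> dict:
    # Shape of the manifest's `triple_budget` block.
    return {"value": budget, "rationale": rationale, "headroom": budget - actual}
```

Keeping the budget in one module-level constant is what lets the size test import it as the single source of truth.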
… + regen)
Code/data half of the cross-cutting openCAESAR drop. The WP1 share
already handled the prose; this commit takes the remaining
identifiers + data + the regenerated artifact.
Renames:
ontology/sysml_term_map.csv:
column header `opencaesar_iri` -> `omg_iri`
scripts/build_ontology.py:
constant `SYSML_OPENCAESAR_NS` -> `SYSML_OMG_NS`
function `_validate_sysml_axioms` -> `_verify_sysml_axioms`
row lookups `row['opencaesar_iri']` -> `row['omg_iri']`
tests/test_ontology_build.py:
matching `row['opencaesar_iri']` -> `row['omg_iri']`
The function rename is the WP1 verification discipline applied to a
file WP1 explicitly scope-excluded so WP2 could own it in the same
commit as the openCAESAR cleanup (avoiding a merge conflict on
build_ontology.py). The IRIs the script emits are unchanged — the
OMG namespace value `http://www.omg.org/spec/SysML/20240501/` stays
because it's the official OMG SysMLv2 OWL rendering, correct on its
own terms. Only the attribution text and the local label change.
Regenerated `ontology/rtm.ttl` + `ontology/assembly_manifest.json`
(`uv run python -m scripts.build_ontology`). Triple count 156/356;
edit-source hash refreshed; CSV row count unchanged at 9.
Repo-wide grep gate:
grep -rniE "caesar|opencaesar|open-caesar" --include=*.{py,md,ttl,json,csv,toml,yaml,yml,sh}
returns zero hits across the whole repo (WP1 prose + WP2 code/data
both clean).
Tests: 190 passed, 4 skipped, 3 deselected (was 190 after commit 2;
no test count change here: the same tests, all green).
Part of WP2 (subplan §4.D); cross-coordinates with WP1 §4.4
(roadmap "Drop explicit openCAESAR references" section).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`make ontology` now requires Java + obo-robot on PATH and runs the
full chain: preflight -> ontology-robot (merge + ELK reason + OBO
report) -> ontology-python with ADCS_ROBOT_VERIFIED=1. Fails fast
with a helpful error pointing at the no-Java alternative
(`make ontology-python`) when the toolchain is missing.
No `ROBOT_OPTIONAL` escape hatch: the no-Java path is the explicit
`ontology-python` Makefile target — invoking it is an intentional
opt-out, not a flag on the default. Honours the roadmap's "stop being
a mock-up; the integration story should not silently degrade" rule.
scripts/build_ontology.py reads ADCS_ROBOT_VERIFIED from the env to
decide what to write into the manifest's `robot_used` + `notes`
fields. Stage 0 banner branches on `robot_used` to print either
"ROBOT merge + ELK reasoning + OBO report PASS" or "Python assembly
only (no-Java path; run `make ontology` for ROBOT/ELK verification)".
New `.github/workflows/ontology.yml`:
- actions/checkout@v4 + actions/setup-java@v4 (Temurin 17)
- Cached ROBOT jar (v1.9.5) downloaded once per cache key
- 3-line bash wrapper installs as `obo-robot` on PATH
- astral-sh/setup-uv@v6 + `uv sync`
- `make ontology` runs the canonical chain
- Confirms `rtm.ttl` + `assembly_manifest.json` are committed
in-sync with the rebuild (catches forgotten regen commits)
- `uv run pytest -v` (tests marked live or network are deselected by default)
Triggers on push to main + staging and on PRs targeting either.
Smoke-tested locally:
- `make ontology-python` writes `robot_used: false` + correct notes
- `make ontology` on a no-Java machine fails fast with the
documented error message
Tests: 190 passed, 4 skipped, 3 deselected (unchanged from §4.C —
the rename + Makefile changes don't alter test behaviour).
Closes issue #2.
Part of WP2 (subplan §4.A).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…riple budget README.md: - Toolchain table: OBO ROBOT row promoted from `optional` to `required (default)` with the no-Java alternative spelled out. - New "Tests" subsection under Quick Start documenting the marker convention and the default-skip rule (`addopts` in pyproject.toml). - Ontology Authoring rewritten: `make ontology` as canonical with the fail-fast preflight; `make ontology-python` as the explicit no-Java target; `make ontology-robot` as just the ROBOT step. - Triple-count budget mention added so contributors know about the parsimony gate before they discover it via a failing build. - Stage 0 banner sample updated: rendered example now shows the ROBOT-default "Verification: ROBOT merge + ELK reasoning + OBO report PASS" line. - "uv run pytest -v" comment in Quick Start: "166 tests" -> "default: skips live + network markers" (count fluctuates per WP). CLAUDE.md: - Toolchain section: OBO ROBOT row promoted to required-for-default; CI Java + cached robot.jar called out. - New paragraph on the runner: it does NOT need Java/obo-robot; only rebuilding the ontology does. - Tests section: documents the marker convention + the fail-loud behaviour of test_flexo_live.py under `-m live` (no silent skips). - Ontology rebuild section: three-target chain with the no-Java escape, fail-fast preflight, robot_used manifest field, and the TRIPLE_BUDGET parsimony gate. Review gates passed: - `grep -rni "caesar|opencaesar|open-caesar"` — zero hits - `grep -rn "ROBOT_OPTIONAL"` — zero hits (escape hatch dropped) - `validate_sysml_axioms` hit only in a rename-rationale docstring - 190 passed, 4 skipped, 3 deselected Part of WP2 (subplan §5). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…enCAESAR cleanup) into staging WP2 of the roadmap at /Users/z/.claude/plans/i-want-to-continue-atomic-lobster.md. - §4.A ROBOT/ELK promoted to default `make ontology` with fail-fast preflight (no ROBOT_OPTIONAL escape hatch). `.github/workflows/ontology.yml` installs Java 17 + cached robot.jar and runs `make ontology` + tests on every push to main/staging and on PRs. Closes #2. - §4.B Pytest `live` + `network` markers registered in pyproject.toml, default `addopts = "-m 'not live and not network'"`. test_flexo_live.py rewritten to fail-loudly under `-m live` when credentials are missing (no silent skips on opt-in). - §4.C Triple-count budget (TRIPLE_BUDGET=356) gate in scripts/build_ontology.py + new tests/test_ontology_size.py. Manifest records `triple_budget` block with rationale. - §4.D openCAESAR code/data cleanup (WP2 share): CSV column `opencaesar_iri` -> `omg_iri`, constant `SYSML_OPENCAESAR_NS` -> `SYSML_OMG_NS`, function `_validate_sysml_axioms` -> `_verify_sysml_axioms` (WP1 verification discipline applied to a file WP1 scope-excluded for coordination). rtm.ttl + manifest regenerated. - §5 README + CLAUDE.md sweep aligning Toolchain, Ontology Authoring, Tests, and Stage 0 banner sample with the new defaults. 5 commits, +257 / -52 lines. Test counts: 190 passed, 4 skipped, 3 deselected. Repo-wide grep gates clean (zero openCAESAR, zero ROBOT_OPTIONAL). Staged for integration with WP3+ before promotion to main. JAXA workshop window (2026-06-12) is comfortably met. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
WP3 §4.1 + §4.8. Promotes the Docker image from an inline label on a
prov:SoftwareAgent (where it lives today via _bind_execution_metadata)
to a tracked entity that downstream evidence can derive from.
New class in `ontology/rtm-edit.ttl`:
rtm:DockerImage rdfs:subClassOf prov:Entity
New datatype properties (domain rtm:DockerImage, range xsd:string):
rtm:imageLabel - repo/tag
rtm:baseImageDigest - FROM-image digest resolved at build
rtm:dockerfileHash - SHA-256 of Dockerfile bytes
rtm:buildContextHash - SHA-256 over build-context file manifest
rtm:contentHash already exists for rtm:Evidence; the image's own
content hash (runtime digest) reuses it without redeclaration.
prov:wasDerivedFrom reuses PROV, so no new property is needed.
TRIPLE_BUDGET bumped 356 -> 380 with a rationale-comment update
documenting the WP2 (356) and WP3 (380) values and the cause of the
bump. Actual current count: 176/380 (204 headroom).
ontology/rtm.ttl + ontology/assembly_manifest.json regenerated via
`uv run python -m scripts.build_ontology`.
The class is now declared but not yet referenced anywhere; that
arrives in commit 3 (DockerCompute.emit_image_node) + commit 4
(prov:wasDerivedFrom on evidence) + commit 6 (SHACL shape).
Tests: 9 ontology + 9 ontology-build tests pass (190 total once full
suite runs).
Part of WP3 (subplan §4.1 + §4.8); first of 7 commits closing issue #4
AC1.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…dressing (issue #4 AC2) WP3 §4.2. Pins Docker build inputs with two SHA-256 hashes: dockerfile_hash — SHA-256 of the Dockerfile bytes build_context_hash — SHA-256 of a sorted POSIX-normalized manifest of <relative-path>\t<file-sha256> lines These pin what `docker build` sees on disk. They are independent of the runtime image digest the daemon assigns AFTER build — that's captured separately as the image's rtm:contentHash. The pair plus the resolved base-image digest is what makes a Docker image reproducibly identifiable. DOCKER_BUILD_CONTEXT_DEFAULT_IGNORES excludes the obvious junk (.git, __pycache__, *.pyc, .venv, node_modules, .docker-ipc, output, .DS_Store, .pytest_cache, .ruff_cache) so the hash doesn't churn on local dev artifacts. The internal _ignored() helper matches each glob against the leaf name, every intermediate path component, AND the full relative path so single-component patterns like `.git` exclude entire subtrees correctly. Manifest separator is normalized to '/' so the same context hashes identically on macOS / Linux / WSL. os.walk's dirnames mutation prunes ignored subtrees so we don't recurse uselessly. The manifest format is intentionally simple. If the demo adopts SLSA / in-toto envelopes later, that becomes the canonical envelope and this hash stays as a fast self-check. tests/test_docker_image_evidence.py (new file): 8 tests covering determinism, Dockerfile-change detection, context-change detection, new-file detection, default ignore patterns (.git + __pycache__ + *.pyc + .venv + node_modules + output + .DS_Store), custom ignore patterns, missing-Dockerfile FileNotFoundError, and a smoke test against the repo's actual compute/Dockerfile + project root. Tests: 8/8 new pass; previous 190 unchanged. Part of WP3 (subplan §4.2); second of 7 commits, closes issue #4 AC2. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
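The build-input hashing described in this commit can be sketched with the standard library alone. This is an illustration under assumptions, not the repository's `hash_docker_image`: the signature, the tuple return value, the shortened ignore list, and the exact manifest serialization (newline-joined, no trailing newline) are guesses.

```python
import fnmatch
import hashlib
import os
from pathlib import Path

# Subset of the default ignore list named in the commit message.
DEFAULT_IGNORES = (".git", "__pycache__", "*.pyc", ".venv", "node_modules")


def _ignored(rel_posix: str, patterns) -> bool:
    # Match each glob against the full relative path and every path
    # component (the last component is the leaf name).
    candidates = (rel_posix, *rel_posix.split("/"))
    return any(fnmatch.fnmatch(c, pat) for pat in patterns for c in candidates)


def _sha256_file(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()


def hash_docker_image(dockerfile: Path, context: Path, ignores=DEFAULT_IGNORES):
    """Return (dockerfile_hash, build_context_hash) as hex digests."""
    if not dockerfile.is_file():
        raise FileNotFoundError(dockerfile)
    lines = []
    for root, dirnames, filenames in os.walk(context):
        rel_root = Path(root).relative_to(context)
        # Prune ignored subtrees in place so os.walk never descends into them.
        dirnames[:] = [
            d for d in dirnames if not _ignored((rel_root / d).as_posix(), ignores)
        ]
        for name in filenames:
            rel = (rel_root / name).as_posix()  # '/' separators on every OS
            if not _ignored(rel, ignores):
                lines.append(f"{rel}\t{_sha256_file(Path(root) / name)}")
    manifest = "\n".join(sorted(lines)).encode("utf-8")
    return _sha256_file(dockerfile), hashlib.sha256(manifest).hexdigest()
```

Sorting the manifest lines makes the context hash independent of directory traversal order, which is what keeps it stable across filesystems.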
…per run (issue #4 AC3) WP3 §4.3. Promotes the Docker image from an inline label on the prov:SoftwareAgent (where ExecutionMetadata wrote it) to a first- class rtm:DockerImage entity in the evidence graph. DockerCompute new methods: _parse_from_image() — pull the first FROM line from compute/Dockerfile via regex. Returns "" on parse failure. _resolve_base_image_digest() — `docker image inspect <FROM-tag>` with graceful empty-string fallback when the base isn't pulled locally. Cached per instance. emit_image_node(graph) — idempotent per WP3 run; on first call computes hashes via hash_docker_image, resolves the base digest, writes 8 triples (rdf:type DockerImage + Entity, contentHash, imageLabel, baseImageDigest, dockerfileHash, buildContextHash, prov:generatedAtTime) and caches the IRI. State added to __init__: _image_node_iri, _image_built_at, _base_image_digest. _image_built_at is captured at the end of _build_image() so the prov:generatedAtTime stamp reflects the actual build time, not the emission time. IRI shape: urn:adcs:docker-image:<digest-with-colons-replaced-by-dashes>. Mirrors ExecutionMetadata.executor_uri() (WP1 §4.3) so IRI shapes across the demo's URN space stay coherent. Resolution decisions baked in (WP3 subplan §9 open questions): Q1 baseImageDigest: try to resolve via `docker image inspect`, graceful empty-string fallback if the base isn't pulled (chosen: pipeline does NOT fail on missing base). Q3 Image IRI source: content-addressed on the runtime digest (not the deterministic build-input hash). The build-input hashes are recorded as properties for separate query. tests/test_compute.py additions: - _docker_subprocess_factory extended with base_image_digest= parameter; heuristic distinguishes project-image vs base-image inspect calls by checking for "adcs-compute" prefix. - TestDockerImageEmit class (4 tests): all-properties present, idempotent-within-one-run, base-image-missing graceful degrade, colon-escape in IRI suffix. 
- test_dockerfile_from_line_parseable smoke test against the real compute/Dockerfile (sanity: regex parser returns a python tag). The image node is now emitable but not yet REFERENCED from evidence nodes — that's commit 4 (prov:wasDerivedFrom wiring) + commit 6 (SHACL closure rule enforcing the link). Tests: 22 passed, 1 skipped (live Docker daemon required). Part of WP3 (subplan §4.3); third of 7 commits closing issue #4 AC3. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
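Two of the pure pieces from the WP3 §4.3 commit, the FROM-line parse and the image IRI shape, can be sketched as below. The regex is an assumption (the commit says only that a regex is used), and these are free functions taking text rather than the `DockerCompute` methods themselves.

```python
import re

# First FROM line; tolerates an optional --platform flag and a trailing
# "AS <stage>". The exact pattern used in the repository is not shown.
_FROM_RE = re.compile(
    r"^\s*FROM\s+(?:--platform=\S+\s+)?(\S+)", re.IGNORECASE | re.MULTILINE
)


def parse_from_image(dockerfile_text: str) -> str:
    # Returns "" on parse failure, matching the behaviour in the commit.
    match = _FROM_RE.search(dockerfile_text)
    return match.group(1) if match else ""


def image_iri(digest: str) -> str:
    # urn:adcs:docker-image:<digest-with-colons-replaced-by-dashes>
    return "urn:adcs:docker-image:" + digest.replace(":", "-")
```

Replacing colons in the digest mirrors the executor URI convention, so `sha256:...` digests stay valid as a single URN suffix.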
…rivedFrom (issue #4 AC4) WP3 §4.4. With the rtm:DockerImage entity emitted (commit 3), wire the link: every evidence node produced under --compute=docker now carries `prov:wasDerivedFrom <image-iri>` in addition to the existing `prov:wasGeneratedBy <activity>`. The two edges together let a SPARQL traversal answer both "which image produced this proof?" (wasDerivedFrom) and "which stage produced this proof?" (wasGeneratedBy, the WP1 schema enrichment). evidence/binding.py: bind_proof_evidence + bind_simulation_evidence gain an optional `image_iri: URIRef | None = None` kwarg. When present, add (ev_uri, PROV.wasDerivedFrom, image_iri) after the activity triples. Local-compute callers pass None (no edge added) — keeps the local path byte-identical to pre-WP3. pipeline/runner.py (Stage 4): Compute image_iri ONCE per stage by calling state.compute_backend.emit_image_node(ev_graph) when state.compute_name == "docker"; otherwise None. Thread it through the four bind_proof_evidence calls (REQ-001..004) and the three bind_simulation_evidence calls. emit_image_node is idempotent so a single call captures the per-run image identity for all evidence. Banner prints the emitted IRI for visibility under --compute=docker. The link is now in place; the SHACL closure rule that REQUIRES it for Docker-executed evidence arrives in commit 6. Tests: 202 passed (+12 since pre-WP3), 5 skipped, 3 deselected. The 12 new tests are 8 hash_docker_image + 4 emit_image_node from commits 2 & 3; the new bind_* kwarg is exercised via the pipeline end-to-end path (test_pipeline.py runs run_pipeline which now threads image_iri=None for the default --compute=local). Part of WP3 (subplan §4.4); fourth of 7 commits closing issue #4 AC4. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
WP3 §4.5. The reverse-lookup the WP3 schema enables: given an image
digest, find every evidence node that was produced by a container
started from that image.
New SPARQL constant + helper in traceability/queries.py:
EVIDENCE_BY_IMAGE — joins rtm:DockerImage on rtm:contentHash
via prov:wasDerivedFrom, with initBindings
for the target digest.
evidence_by_image(g, d) — returns list of dicts with ev / type /
evContentHash / modelHash keys. Empty
list on miss.
Walks the union view (the queries module's documented convention);
pass a Dataset to query across <adcs:evidence> + any other layer
that ends up holding evidence-image links.
tests/test_docker_image_evidence.py: 4 new tests, synthesized
dataset has two images (A + B) with two/one derived evidence nodes
plus one unlinked (local-compute-style) node:
- returns linked evidence (image A -> 2 rows)
- isolates by digest (image B -> 1 row, no leak)
- miss returns empty list
- unlinked evidence stays invisible to every image query
Tests: 12/12 in tests/test_docker_image_evidence.py (8 hash + 4
helper).
Part of WP3 (subplan §4.5); fifth of 7 commits closing issue #4 AC5.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
WP3 §4.6. Gives the WP3 schema teeth at Stage 6.5: every rtm:Evidence
whose generating activity ran under --compute=docker (signalled by
prov:atLocation matching urn:adcs:location:docker:*) MUST link to a
rtm:DockerImage via prov:wasDerivedFrom. Local-compute evidence is
exempt — the SPARQL target filter excludes urn:adcs:location:local:*
activities, so the nominal pipeline run continues to pass closure.
The shape follows the existing rtm:BackwardTraceabilityShape pattern
(sh:targetClass + sh:sparql with $this projection) rather than the
sh:target + SPARQLTarget pattern from the subplan draft — both work
under pyshacl but staying consistent with the established style
keeps the shape suite uniform.
Three new tests in tests/test_shape_suite.py:
- test_docker_evidence_without_image_link_fails: synthesize a
Docker-located activity + evidence WITHOUT wasDerivedFrom on a
copy of the nominal dataset; closure fails with a DockerImage
violation.
- test_docker_evidence_with_image_link_passes: same shape but WITH
a valid rtm:DockerImage + wasDerivedFrom edge; closure does NOT
add a DockerImage complaint.
- test_local_evidence_not_required_to_link_to_image: explicit
conditional-correctness check — the nominal --compute=local
fixture has only local-located activities, the shape's target
filter must be vacuous on it.
Tests: 13/13 in tests/test_shape_suite.py (10 prior + 3 new).
Part of WP3 (subplan §4.6); sixth of 7 commits closing issue #4 AC6.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
README.md (under "Compute Backends (Phase L)"): new "Image as
tracked evidence (WP3)" subsection — what WP3 adds, the six
properties on rtm:DockerImage, a working evidence_by_image SPARQL
example, and an explicit pointer to WP5 for the deferred notebook
Act 9/10 rewrite + audit-module image surfacing.
CLAUDE.md ("Named-graph layout"): one-line update on <adcs:evidence>
acknowledging it now holds rtm:DockerImage too under --compute=docker.
The deeper README "Compute Backends" rewrite (image-as-evidence
narrative + audit summary integration) is WP5 territory; this commit
ships the minimal docs delta so contributors reading the repo today
can find the new entity + the SPARQL helper.
Tests: 209 passed (+19 across WP3 commits 2-6), 5 skipped, 3
deselected. No regressions.
Part of WP3 (subplan §4.9, §7); seventh of 7 commits. Partial
coverage of issue #4 AC9 — the full README "Compute backends"
section rewrite + audit-module + notebook narrative defer to WP5.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lands the backend half of issue #4 (7 of 9 acceptance criteria):
- rtm:DockerImage class + property set (commit c31cbce)
- hash_docker_image() build-input hasher (a6a0680)
- DockerCompute.emit_image_node() emits one node per run (343420e)
- prov:wasDerivedFrom on Docker-produced evidence (425a263)
- evidence_by_image() SPARQL helper (86974f5)
- DockerEvidenceShape SHACL closure rule (3cbcf80)
- README + CLAUDE.md notes (60a3721)
The two narrative items (audit summary + notebook Act 9) are deferred to WP5. Issue #4 stays open with a status comment listing the split.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
WP4 c1. Preflight reachability check on the persistence backend so failure is fast and clear at startup rather than discovered at Stage 7.
- BackendUnavailable exception (new in pipeline/backends/base.py)
- StoreBackend.probe() Protocol method
- LocalBackend.probe() writes + deletes .probe sentinel in output dir
- FlexoBackend.probe() HEADs /orgs/<org>; respects FLEXO_PROBE_TIMEOUT (default 10s, distinct from the slow-call FLEXO_TIMEOUT)
- FuskeiBackend.probe() HEADs /data
7 new unit tests in test_backends.py cover success + failure paths for each backend (mocked httpx). All 18 backend tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
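A minimal sketch of the probe contract described above, covering only the local case. The constructor argument `output_dir` and the sentinel's contents are assumptions for illustration; only the class names, the `probe()` method, and the `.probe` sentinel file come from the commit message.

```python
from pathlib import Path
from typing import Protocol


class BackendUnavailable(RuntimeError):
    """Raised when a persistence backend fails its preflight probe."""


class StoreBackend(Protocol):
    def probe(self) -> None:
        """Return normally if reachable; raise BackendUnavailable otherwise."""
        ...


class LocalBackend:
    def __init__(self, output_dir: Path) -> None:  # hypothetical signature
        self.output_dir = Path(output_dir)

    def probe(self) -> None:
        # Write, then delete, a sentinel file to prove the directory is writable.
        sentinel = self.output_dir / ".probe"
        try:
            sentinel.write_text("probe")
            sentinel.unlink()
        except OSError as exc:
            raise BackendUnavailable(
                f"cannot write to {self.output_dir}: {exc}"
            ) from exc
```

Raising a dedicated exception rather than returning a boolean lets a caller report the underlying cause in its startup banner.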
WP4 c2. Builds on the StoreBackend probe (c1) to make startup the single fail-fast moment for backend reachability: no more discovering Flexo / Docker unavailability at Stage 7 or Stage 2.
- ComputeUnavailable exception (compute/base.py); DockerNotAvailable now subclasses it for backwards compat
- ComputeBackend.probe() Protocol method
- LocalCompute.probe() is a no-op (always available)
- DockerCompute.probe() wraps _check_daemon()
- PipelineState gains store_backend field
- run_pipeline() constructs both backends up-front and runs _run_preflight() before Stage 0; banner prints describe() + PASS/FAIL for each; sys.exit(2) on any failure
- Stage 7 reads state.store_backend instead of re-instantiating
Tests: TestComputeProbe in test_compute.py + PipelineState fixture fix in test_pipeline.py. Full suite: 219 passed (no regressions).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
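The gate semantics above (probe everything, print one PASS/FAIL line each, exit 2 on any failure) can be sketched roughly as follows. The function name `run_preflight`, the `Probeable` protocol, and the banner format are illustrative assumptions, not the runner's actual code.

```python
import sys
from typing import Iterable, Protocol


class Probeable(Protocol):
    def describe(self) -> str: ...
    def probe(self) -> None: ...


def run_preflight(backends: Iterable[Probeable]) -> None:
    """Probe every backend, print a banner line for each, exit 2 on failure."""
    failed = False
    for backend in backends:
        try:
            backend.probe()
        except Exception as exc:  # assumption: any probe error counts as FAIL
            print(f"FAIL  {backend.describe()}: {exc}")
            failed = True
        else:
            print(f"PASS  {backend.describe()}")
    # Exit only after all probes ran, so the banner lists every failure at once.
    if failed:
        sys.exit(2)
```

Running all probes before exiting means one startup attempt surfaces every unreachable service instead of one per retry.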
WP4 c3. Adds the "code remote" half of the three-remote provenance
chain: every rtm:DockerImage now carries a literal git+URI pointing
at the Dockerfile in the source tree at the commit it was built from.
- compute/git_ref.py — current_git_ref(repo_root, file_path); shells
out to git rev-parse + git config; produces git+https://.../@<sha>#<path>
with graceful fallbacks (git+file://, git+local://uncommitted)
- docker_compute.emit_image_node() appends rtm:gitRef triple
- Tests:
- TestGitRef: shape + fallback + ssh→https normalization
- TestImageNodeEmitsGitRef: stubbed _image_metadata + verify the
triple lands on the image IRI (no Docker daemon required)
The rtm:gitRef property is declared formally in c8 alongside the rest
of the WP4 ontology additions; this commit uses the IRI directly.
Closes part of issue #4 (preparation for the reproduce CLI in c9).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
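A rough sketch of the ssh→https normalization and the literal's shape. The helper names `normalize_remote` and `format_git_ref` are hypothetical, and the exact layout of the `git+...@<sha>#<path>` string is an assumption read off the pattern in the commit message; the git+file:// and git+local://uncommitted fallbacks are not modelled here.

```python
import re

# scp-like remote syntax, e.g. git@github.com:org/repo.git
_SCP_LIKE = re.compile(r"^git@(?P<host>[^:]+):(?P<path>.+)$")


def normalize_remote(url: str) -> str:
    """Rewrite ssh-style remotes to https and drop a trailing .git."""
    match = _SCP_LIKE.match(url)
    if match:
        url = f"https://{match['host']}/{match['path']}"
    elif url.startswith("ssh://git@"):
        url = "https://" + url[len("ssh://git@"):]
    return url.removesuffix(".git")


def format_git_ref(remote: str, sha: str, file_path: str) -> str:
    """Build the git+URI literal: remote, commit, and path within the tree."""
    return f"git+{normalize_remote(remote)}@{sha}#{file_path}"
```

Normalizing to https keeps the recorded reference resolvable by a reader who has no SSH key for the remote.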
WP4 c4. Adds the "storage remote" half of the three-remote provenance chain: when persisting to Flexo (or Fuseki), every rtm:DockerImage gains a rtm:flexoRecord pointer to where in the storage backend its record lives.
- StoreBackend.record_uri(layer) Protocol method
- LocalBackend.record_uri() returns None (no remote)
- FlexoBackend.record_uri(layer) -> urn:adcs:flexo:<org>/<repo>/<branch>
- FuskeiBackend.record_uri(layer) -> urn:adcs:fuseki:<encoded-url>/<layer>
- Runner Stage 4 attaches rtm:flexoRecord to the image after emit_image_node, only when the store backend exposes a non-None record_uri. LocalBackend runs unchanged (no triple added).
Tests added in test_backends.py for all three shapes; 21 backend tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
WP4 c5. Distinguishes the image (static artifact), container
(transient materialization), and host (location) as three first-class
entities with standard PROV edges between them.
- ExecutionMetadata.container_uri() -> urn:adcs:docker-container:<id>
(None for local runs or missing container_id)
- _bind_execution_metadata accepts image_iri; when container_uri is
non-None, emits:
<container> a rtm:DockerContainer, prov:Entity ;
rtm:containerId "<id>" ;
prov:wasDerivedFrom <image> ;
prov:startedAtTime / endedAtTime "..."
<activity> prov:used <container>
- bind_proof_evidence / bind_simulation_evidence thread image_iri
through to the metadata helper.
- No change to existing PROV edges; new edges are purely additive.
Tests: TestContainerEntity in test_compute.py (4 cases — local skip,
docker emission, image link, missing-id sentinel). All 17 targeted
tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
WP4 c6. Adds the "under whose authority?" axis to the provenance chain without pulling in FOAF or W3C Org Ontology (those stay deferred per CLAUDE.md future-work #2).
Two org IRIs per run:
- operating org: who runs the container/authors the work (default urn:adcs:org:local-operator)
- hosting org: who operates the substrate (host + Docker daemon) (default: same as operating)
Both env-configurable via ADCS_{OPERATING,HOSTING}_ORG_IRI; defaults play "single-operator local" so existing runs don't change.
New edges in evidence/binding.py:
<container> prov:wasAttributedTo <operating-org>
<host> rtm:operatedBy <hosting-org>
<executor> prov:actedOnBehalfOf <operating-org>
Both prov:Organization typings + rdfs:labels emitted to <adcs:context> at startup via compute/organizations.py::emit_org_nodes.
PipelineState gains operating_org_iri + hosting_org_iri fields. bind_proof_evidence / bind_simulation_evidence gain corresponding kwargs threaded through to _bind_execution_metadata.
The rtm:operatedBy predicate is declared formally in c8 alongside the rest of the ontology additions; this commit uses it directly.
Full pytest: 231 passed (up from 219 baseline; no regressions).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
WP4 c7. The SHACL closure-rule check is an automated, fully-specified outcome, so it is now wrapped as an earl:Assertion. That makes the technical-trust witness queryable RDF, beside the existing human-attestation witness (rtm:Attestation, which also subclasses earl:Assertion).
- new traceability/closure_assertion.py::emit_closure_assertion()
- Stage 6.5 in pipeline/runner.py calls it after verify()
- assertion typed rtm:ClosureRuleAssertion + earl:Assertion + prov:Activity
- carries earl:outcome (passed/failed), earl:mode (automatic), earl:test, earl:subject, prov:wasAssociatedWith, prov:atTime, rtm:violationCount
- one assertion per run (Q9: per-run granularity, not per-shape)
- compute.reproduce-side rtm:DigestMatchAssertion lands in c9 with the CLI
Discipline: earl:mode is always earl:automatic (verification, not validation). Human attestation continues to use earl:manual / earl:semiAuto.
Test: test_audit::test_closure_assertion_emitted_into_audit_graph.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ovenance
WP4 c8. Formally declares every WP4 term used in commits c3-c7, adds the four new SHACL closure rules, regenerates rtm.ttl, and bumps the triple-count budget to accommodate the additions.
New classes:
- rtm:DockerContainer (subClassOf prov:Entity): transient materialization
- rtm:DigestMatchAssertion (subClassOf earl:Assertion + prov:Activity)
- rtm:ClosureRuleAssertion (subClassOf earl:Assertion + prov:Activity)
New properties:
- rtm:containerId (on DockerContainer)
- rtm:gitRef (on DockerImage; xsd:anyURI)
- rtm:flexoRecord (on DockerImage; ObjectProperty)
- rtm:operatedBy (on prov:Location; subPropertyOf prov:wasAttributedTo)
- rtm:violationCount (on ClosureRuleAssertion)
- rtm:transactionId (on prov:Activity), for service-invocation wire logs
- rtm:documentRef (on rtm:Evidence; xsd:anyURI), for txnlog evidence
New SHACL shapes (rtm_shapes.ttl):
- DockerImageProvenanceShape: every DockerImage MUST have rtm:gitRef
- DockerContainerShape: every DockerContainer MUST have wasDerivedFrom exactly one DockerImage + rtm:containerId
- OrganizationAuspicesShape: DockerContainer SHOULD declare prov:wasAttributedTo a prov:Organization (Warning, not Violation)
- TransactionLogShape: Evidence with rtm:documentRef MUST also have rtm:contentHash + prov:wasGeneratedBy
Side effects:
- Bumped triple budget 380 → 450 (218 used; 232 headroom for WP5)
- emit_image_node now types rtm:gitRef as xsd:anyURI literal so it satisfies the new shape's datatype constraint
- test_docker_evidence_with_image_link_passes fixture updated to emit rtm:gitRef on its synthetic image
Full pytest: 232 passed (no regressions).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
WP4 c9. New compute/reproduce.py Typer app that closes the
reproducibility loop: given an rtm:DockerImage record, clone the
recorded git ref, rebuild from compute/Dockerfile, and compare the
resulting runtime digest to the recorded rtm:contentHash.
CLI:
uv run python -m compute.reproduce \
--image-digest sha256:... \
--from-trig output/rtm.trig
Output:
- prints PASS/FAIL with detail
- emits rtm:DigestMatchAssertion (earl:Assertion + prov:Activity)
into <adcs:audit> with earl:outcome + earl:mode=earl:automatic +
earl:subject=<image-iri> + prov:wasAssociatedWith=<reproduce-cli-agent>
- exit 0 on match, 1 on mismatch, 2 on prerequisite failure
Pure logic split out as testable units (parse_git_ref,
load_image_record, emit_digest_match_assertion) so 11 unit tests
cover the orchestration without needing Docker. The actual
clone+build subprocess loop is exercised opt-in via -m live.
Honors the verification/validation discipline: earl:mode is always
earl:automatic for these (automated check).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
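As an illustration of the kind of pure unit the commit splits out, here is a hedged sketch of a git-ref parser. The name parse_git_ref appears in the commit message, but its real signature and return type are not shown there; the `GitRef` tuple and the `git+<base>@<sha>#<path>` layout below are assumptions.

```python
from typing import NamedTuple


class GitRef(NamedTuple):  # hypothetical record type
    base: str   # clone URL, e.g. https://host/org/repo
    sha: str    # commit the image was built from
    path: str   # file inside the tree, e.g. compute/Dockerfile


def parse_git_ref(ref: str) -> GitRef:
    """Split an assumed 'git+<base>@<sha>#<path>' literal into its parts."""
    if not ref.startswith("git+"):
        raise ValueError(f"not a git+ reference: {ref!r}")
    body, hash_sep, path = ref[len("git+"):].partition("#")
    # rpartition: an '@' may also occur earlier in the URL (e.g. user@host).
    base, at_sep, sha = body.rpartition("@")
    if not (hash_sep and at_sep and base and sha and path):
        raise ValueError(f"malformed git reference: {ref!r}")
    return GitRef(base=base, sha=sha, path=path)
```

Keeping the parser free of subprocess calls is what lets it be tested without Docker or a network.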
WP4 c10. Adds the fourth service in the three-remote story: a
CouchDB-backed transaction-log store running in its own container
with its own URI (urn:adcs:service:transaction-log-store) and its
own hosting auspices.
- pipeline/backends/txnlog.py — minimal CouchDB client:
- probe() HEADs the db, creates on 404, surfaces auth failures
- put_document() PUTs JSON; 409 conflict treated as idempotent success
- get_document() — readback path for the trust-query renderer
- Env config: ADCS_TXNLOG_{URL,DB,USER,PASSWORD}
- 8 unit tests in test_txnlog.py via httpx.MockTransport
Bonus (same commit, single-line surface area): security hardening
in compute/reproduce.py per automated security review. Git refs
come from RDF stores that may sit partly outside the trust boundary, so:
- reject base/sha components starting with '-' (flag smuggling)
- require base to start with https:// / ssh:// / git@
- add '--' end-of-options sentinel to git clone + git checkout
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
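The three hardening rules can be sketched as below. The function names `validate_git_ref` and `build_clone_cmd` are hypothetical and this is not the code in compute/reproduce.py; it only shows the checks the commit lists, applied to the clone command.

```python
ALLOWED_BASE_PREFIXES = ("https://", "ssh://", "git@")


def validate_git_ref(base: str, sha: str) -> None:
    """Reject values that git could misread as options or odd transports."""
    for name, value in (("base", base), ("sha", sha)):
        if value.startswith("-"):
            # A leading '-' would be parsed by git as a flag (flag smuggling).
            raise ValueError(f"{name} must not start with '-': {value!r}")
    if not base.startswith(ALLOWED_BASE_PREFIXES):
        raise ValueError(f"unsupported remote scheme: {base!r}")


def build_clone_cmd(base: str, sha: str, dest: str) -> list[str]:
    """Return an argv list for git clone with an end-of-options sentinel."""
    validate_git_ref(base, sha)
    # '--' marks the end of options, so base and dest are always positional.
    return ["git", "clone", "--", base, dest]
```

Passing an argv list to subprocess (rather than a shell string) is assumed here; the prefix check and the `--` sentinel are independent layers, so either one failing alone still leaves a guard in place.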
WP4 c11. Context manager that wraps a service invocation, captures
request/response, redacts secrets, PUTs the JSON to the txnlog store,
and emits RDF in <adcs:audit>:
<activity> a prov:Activity ;
rtm:transactionId "<id>" ;
prov:wasAssociatedWith <caller> ;
prov:used <service> ;
prov:startedAtTime/endedAtTime "<iso>"
<evidence> a rtm:Evidence ;
rtm:contentHash "sha256:<hash>" ;
rtm:documentRef <store-url> ;
prov:atLocation <txnlog-service-iri> ;
prov:wasGeneratedBy <activity>
Redaction allowlists:
Headers: Authorization, Cookie, Set-Cookie, X-Api-Key, X-Auth-Token,
Proxy-Authorization
Body keys: password, passwd, token, secret, api_key, apikey,
access_token, refresh_token
When store=None (e.g. --backend=local without txnlog), the activity
is still recorded but the evidence node is skipped — the
TransactionLogShape requires rtm:documentRef+rtm:contentHash, so
emitting the evidence without them would fail closure.
Robustness: a store.put_document failure does NOT propagate; the
wrapped service call's outcome is preserved. Exceptions inside the
context block are recorded in the document AND re-raised.
7 unit tests in test_transaction_log.py via FakeStore stand-in.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
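A minimal sketch of how the two redaction lists might be applied before the document is stored. The header and body-key names are taken from the lists above; the case-insensitive matching, the recursion into nested structures, and the `[REDACTED]` placeholder are assumptions.

```python
REDACTED = "[REDACTED]"  # placeholder text is an assumption

SENSITIVE_HEADERS = frozenset({
    "authorization", "cookie", "set-cookie",
    "x-api-key", "x-auth-token", "proxy-authorization",
})
SENSITIVE_BODY_KEYS = frozenset({
    "password", "passwd", "token", "secret",
    "api_key", "apikey", "access_token", "refresh_token",
})


def redact_headers(headers: dict) -> dict:
    """Mask the values of listed headers; header names compare case-insensitively."""
    return {
        name: REDACTED if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }


def redact_body(value):
    """Recursively mask listed keys in a JSON-like structure."""
    if isinstance(value, dict):
        return {
            key: REDACTED
            if isinstance(key, str) and key.lower() in SENSITIVE_BODY_KEYS
            else redact_body(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact_body(item) for item in value]
    return value
```

Both helpers return new objects, so the request actually sent to the service is never mutated by logging it.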
…12 part 1)
Plumbs the optional TxnLogBackend through the runtime so future PRs can wrap individual service calls without further state surgery.
- PipelineState gains txnlog_store: TxnLogBackend | None
- ADCS_TXNLOG_ENABLED=1 env gate constructs a TxnLogBackend and runs it through _run_preflight alongside compute + storage probes
- Preflight banner prints the txnlog describe() line when enabled
- Existing runs (no env var set) are unchanged: txnlog_store is None, preflight skips that probe
Per-call wrapping (FlexoBackend HTTP / DockerCompute subprocess / reproduce CLI subprocess) is deferred to a focused follow-up PR; those changes require more surgery in the backend bodies and have a small self-referential gotcha (FlexoBackend.persist would log into <adcs:audit> while persisting it). The plumbing landed here unblocks that work without bundling its risk into WP4.
95 targeted tests pass; no regressions.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
WP4 c13. Operationalizes the "how can I trust this evidence?" technical-trust panel as queryable SPARQL helpers. Six functions, six typed @dataclass(frozen=True) records:
- technical_provenance(ds, evidence_iri) -> TechnicalProvenance
- reproducibility_witnesses(ds, image_iri) -> list[DigestWitness]
- closure_witnesses(ds, graph_iri) -> list[ClosureWitness]
- auspices_chain(ds, evidence_iri) -> AuspicesChain
- service_invocations_for(ds, ...) -> list[ServiceInvocationRow]
- trust_summary(ds, evidence_iri) -> TrustSummary
Plus render_trust_summary(): compact text rendering for the interrogate.explain Trust panel.
All queries are read-only, use OPTIONAL for graceful partial matches (local-compute runs have no container/image but still produce a useful technical row), and return typed records callers can pass without re-querying.
Tests: 8 cases on a nominal local+local pipeline run; cover the empty + populated paths for each query.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
WP4 c14. Adds an env-driven branch-name prefix to FlexoBackend so each pipeline run can land in its own scoped branch space (e.g. cert/2026-06-12-001/evidence) without forcing the pattern on the default single-canonical-state run.
- _branch_id(graph_iri, prefix="") prepends the prefix
- FLEXO_BRANCH_PREFIX env (default "") is read in __init__
- branch_prefix kwarg overrides env
- record_uri() honors the prefix so rtm:flexoRecord points at the correctly-scoped branch IRI
Unchanged: empty default means existing runs land in adcs-demo/lifecycle/<layer> exactly as today.
Test: test_flexo_backend_branch_prefix_applies.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
WP4 c15. Provisioning scripts for the local txnlog store (CouchDB), so a no-experience operator can bring the canonical multi-remote stack up with one command.
- tools/start-services.sh: idempotent docker run + db ensure; waits for CouchDB readiness; prints the env-var block to export
- tools/stop-services.sh: symmetric teardown; --purge wipes data
- docker-compose.yml: same shape for users who prefer compose
Both paths use the same container name (couchdb-adcs) and default credentials (adcs/adcs); env-var overrides documented in the scripts and in .env.example (lands in c16).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
WP4 c16. The architecture demo gets a self-contained ARCHITECTURE.md that a new collaborator can read cold; README + CLAUDE.md get the three-remote + fourth-service entry points; .env.example documents every env var with defaults.
- ARCHITECTURE.md (new): three-remote diagram + URI scheme table + full provenance-chain example + EARL outcome section + trust query list + reproducibility loop + preflight gate semantics
- README.md: new Setup section (.env + tools/start-services.sh + preflight fail-fast), new Canonical multi-remote run subsection under Quick Start, new Reproducibility verification subsection, ARCHITECTURE.md pointer at the top
- CLAUDE.md: new "Three-remote architecture (WP4)" section under named-graph layout, with a concise URI scheme summary + EARL outcomes + preflight + trust queries
- .env.example (new): every FLEXO_* / ADCS_TXNLOG_* / ADCS_*_ORG_* variable with documented defaults
Full pytest after sweep: 267 passed.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
16 commits making the four-service architecture (git + Flexo + local Docker + CouchDB txnlog) operational rather than narrated, with preflight gating, organizational auspices, EARL-wrapped automated outcomes, six trust queries, and the reproducibility CLI.
c1. feat(backends): probe() on StoreBackend + 3 implementations
c2. feat(compute): probe() on ComputeBackend + preflight gate
c3. feat(compute): capture git ref + emit rtm:gitRef on rtm:DockerImage
c4. feat(backends): record_uri() + emit rtm:flexoRecord
c5. feat(evidence): emit rtm:DockerContainer entity + prov:used edge
c6. feat(provenance): organizational auspices via prov:Organization
c7. feat(audit): emit rtm:ClosureRuleAssertion from Stage 6.5
c8. feat(ontology): WP4 classes + properties + shapes
c9. feat(compute): reproduce CLI + rtm:DigestMatchAssertion
c10. feat(backends): TxnLogBackend (CouchDB) + reproduce hardening
c11. feat(traceability): TransactionLogger + wire-logs as rtm:Evidence
c12. feat(runner): wire txnlog store into PipelineState + preflight
c13. feat(traceability): six trust queries + render_trust_summary
c14. feat(backends): optional FLEXO_BRANCH_PREFIX
c15. chore(tools): start-services.sh + stop-services.sh + docker-compose.yml
c16. docs: ARCHITECTURE.md + README + CLAUDE.md + .env.example
Companion issues filed: #7 (PU registry), #8 (RIME services), #9 (Starforge oracles). Issue #4 remains open for the WP5 narrative items (audit module image surfacing + notebook Act 9 update).
Full pytest: 267 passed, 5 skipped, 3 deselected.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Closes one of two residual issue #4 ACs deferred from WP3. The audit report's markdown now includes a "Docker image provenance" table for any run that emitted at least one rtm:DockerImage, listing image IRI, digest, git ref, and count of evidence nodes derived from it.
- DockerProvenanceRow dataclass (frozen)
- docker_provenance(ds) -> list[DockerProvenanceRow] SPARQL helper
- AuditReport gains docker_provenance: list[DockerProvenanceRow]
- audit() populates it via the SPARQL query
- _render_markdown adds the new section between coverage matrix and orphans, omitted entirely when the list is empty
Local-compute runs see no change (empty list = section omitted). Docker-compute runs see the image surfaced beside the audit direction summary, where an auditor reading the report can trace "what produced what" without leaving the report.
Tests: 2 new in test_audit.py (empty path + populated path with a synthesized image). 18 audit tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Closes the second of two residual issue #4 ACs deferred from WP3. A new cell after the existing Act 9 narration synthesizes the rtm:DockerImage node (mimicking what DockerCompute._emit_image_node does in a live --compute=docker run) with WP3 properties + WP4 extensions (rtm:gitRef, rtm:flexoRecord), then runs the WP3 evidence_by_image SPARQL helper against the augmented dataset to show the inverse query the executor-label model couldn't answer.
The cell:
- emits a synthetic rtm:DockerImage with content hashes + git ref + flexo record cross-link
- wires synthesized v2 evidence to derive from it
- runs evidence_by_image() + interpolates the count into the markdown
- explains the reproducibility loop (compute.reproduce + EARL outcome)
The original Act 9 cell (executor-agent label) stays intact above; the new cell extends rather than replaces, so the narrative reads: "here's the executor label (today's model); here's the image as a node (WP3 + WP4); here's the inverse query that becomes possible."
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Disposable one-page reconciliation showing every claim in openbee_dsg_opener.pptx Slide 5 (Demo #1) backed by a concrete code receipt. Reviewer can trace "Flexo Deployment + oracles + evidence reproducibility with git hashes and docker" to specific modules, commits, and tests. Notes that WP4 exceeded the slide's promise on the "with git hashes and docker" claim — container-as-entity, organizational auspices, wire-level audit trail, six trust queries are all additive surface beyond what was advertised. Cross-links the three companion issues (#7 PU registry, #8 RIME services, #9 Starforge oracles) as the "what's next" deliverable for Planetary Utilities' team. Pages auto-publishes the marimo notebook export at dynamicalsystemsgroup.github.io/ADCS-lifecycle-demo — the WP5 c2 notebook update will flow on next push. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ging
Three commits closing the residual narrative items from WP3 (issue #4 ACs deferred to WP5) and the slide-reconciliation pass.
c1. feat(audit): surface Docker image identity in audit summary
  - DockerProvenanceRow + docker_provenance() + AuditReport extension
  - _render_markdown adds "Docker image provenance" table for --compute=docker runs; section omitted for local runs
  - 2 new tests covering empty + populated paths
c2. docs(notebook): Act 9 narrative now shows rtm:DockerImage as a node
  - New cell after the executor-label cell synthesizes rtm:DockerImage with WP3 + WP4 properties (contentHash, gitRef, flexoRecord)
  - Wires synthesized v2 evidence to derive from it
  - Runs evidence_by_image() SPARQL helper in-cell + interpolates the count into the markdown
  - Explains compute.reproduce + EARL DigestMatchAssertion
c3. docs: RECONCILIATION.md, slide claims ↔ code receipts
  - Maps every "Flexo Deployment, oracles & evidence reproducibility with git hashes and docker" phrase to its code receipt
  - Cross-links companion issues #7/#8/#9 (PU / RIME / Starforge)
  - Notes WP4 *exceeded* the slide's promise on the docker axis
End-of-roadmap alignment review (all in this merge):
- Discipline sweep: validate-vs-verify clean (only legitimate uses + the one back-compat alias from WP1 §10)
- openCAESAR sweep: zero hits
- ROBOT_OPTIONAL sweep: zero hits
- ValidateShapes IRI fragment: preserved per #6 known follow-up
- Issues #2 + #3 retroactively closed with status comments
- Issue #4 ready to close (this merge lands the residual 2 ACs)
- Issues #5, #6, #7, #8, #9 correctly remain open (deferred + future-work)
- End-to-end smoke: pipeline runs cleanly; 1084 union triples
- Full pytest: 269 passed
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
CI's "Confirm rtm.ttl is committed in-sync with rebuild" gate was failing because the build_time was wall-clock-now, so every rebuild produced a different sha256 → diff against committed copy → fail.
_reproducible_build_time() resolves the timestamp in this order:
1. SOURCE_DATE_EPOCH env var (Reproducible Builds standard).
2. Most-recent git-commit time of the build inputs (rtm-edit.ttl, sysml_term_map.csv, build_ontology.py). Stable across CI + local.
3. datetime.now(), an unreproducible fallback for bootstrap.
After this commit lands, the next regen of ontology/rtm.ttl + assembly_manifest.json (separate commit) will pin the timestamp to THIS commit's ct; future CI rebuilds will compute the same value and produce byte-identical artifacts.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
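The three-step resolution order can be sketched as follows. This is an illustration under assumptions, not the code in scripts/build_ontology.py: the function name lacks the leading underscore on purpose, and the `inputs` / `repo_root` parameters and the UTC normalization are hypothetical.

```python
import os
import subprocess
from datetime import datetime, timezone


def reproducible_build_time(inputs, repo_root="."):
    """Resolve a build timestamp: env var, then git commit time, then now()."""
    # 1. SOURCE_DATE_EPOCH (seconds since the Unix epoch) wins if set.
    epoch = os.environ.get("SOURCE_DATE_EPOCH")
    if epoch:
        return datetime.fromtimestamp(int(epoch), tz=timezone.utc)
    # 2. Committer time (%ct) of the latest commit touching any build input.
    try:
        out = subprocess.run(
            ["git", "log", "-1", "--format=%ct", "--", *inputs],
            cwd=repo_root, capture_output=True, text=True, check=True,
        ).stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        out = ""  # git missing, not a repository, or bad working directory
    if out:
        return datetime.fromtimestamp(int(out), tz=timezone.utc)
    # 3. Unreproducible fallback, e.g. before the inputs are first committed.
    return datetime.now(timezone.utc)
```

Step 2 returns an empty string when the history visible to git does not contain a commit touching the inputs, in which case the function silently drops to step 3.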
…d_time Pins the artifact's build_time to the prior commit's ct (where build_ontology.py + the ontology inputs were last touched), via the new _reproducible_build_time() in scripts/build_ontology.py. CI rebuilds will now produce byte-identical artifacts. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…* refs
Four changes addressing reviewer feedback:
1. Removed all WP1..WP5 mentions from notebook prose. The narrative describes capabilities, not implementation phases.
2. New cell, "Many Authoritative Sources of Truth, one stitched provenance graph", acknowledges the diverse ASoTs (SysMLv2 model, symbolic + numerical oracles, Docker image, engineer attestation, closure-rule check, audit module), shows what each holds and what grounds its trust, and frames the demo's contribution as the integration that stitches them via standard PROV/EARL/GSN edges without overloading anyone's vocabulary.
3. Renamed the docker-as-evidence cell to "The runtime ASoTs as first-class nodes" and expanded it to show the container entity (rtm:DockerContainer, prov:used edge from analysis activity) so the materialization story is visible inline. Reports both proof-artifact AND simulation-result counts from the evidence_by_image inverse query.
4. New "Numerical evidence: the full provenance, end-to-end" cell uses trust_summary + render_trust_summary against EV-SIM-REQ-001-v2 to render the complete chain (oracle → activity → container → image → git ref → host → org → closure assertion) in one block, so readers see the multi-ASoT stitch concretely rather than as abstract narration.
The Pages workflow regenerates output/notebook.html on next push.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
actions/checkout@v4 defaults to fetch-depth=1, which makes `git log -- <files>` return empty for any file HEAD didn't touch directly. _reproducible_build_time() then falls back to datetime.now() and produces non-reproducible artifacts — the CI diff-check fails on the very build the timestamp fix was supposed to enable. fetch-depth: 0 fetches full history so the commit that last touched the inputs resolves consistently across CI + local. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…-specific CI runs 'make ontology' (with ROBOT) which writes robot_used: true and a ROBOT-specific notes string into assembly_manifest.json. Local contributors without Java run 'make ontology-python' which writes robot_used: false. The manifest's build-path provenance fields are intentionally asymmetric. The load-bearing artifact is rtm.ttl, which IS reproducible thanks to _reproducible_build_time(); that's what the gate checks now. Manifest stays committed but isn't diff-gated. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…wrapping
Three failing CI tests:
- test_pipeline_runner_help_lists_known_flags
- test_rerun_help_lists_known_flags
- test_reproduce_cli_help
All check for flag substrings ("--auto" etc) in `result.stdout`.
Rich's Typer help renderer wraps the flag column to terminal width;
CI runners are narrower than dev workstations, which can split
"--auto" across a wrap boundary so it's no longer a contiguous
substring of the rendered output.
Fix: strip ANSI escape codes + collapse whitespace before the
substring match. _flatten_help() in test_cli.py; same regex inline
in test_reproduce.py (kept local to avoid an import cycle).
18 targeted tests pass locally.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
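A sketch of the flattening step described in the fix. The name `flatten_help` (without the underscore) and the exact regular expressions are assumptions; the real helper in test_cli.py may differ.

```python
import re

# CSI escape sequences such as "\x1b[1;36m" (colour/style codes).
_ANSI_CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")
_WHITESPACE = re.compile(r"\s+")


def flatten_help(text: str) -> str:
    """Strip ANSI styling, then collapse whitespace runs to single spaces."""
    without_ansi = _ANSI_CSI.sub("", text)
    return _WHITESPACE.sub(" ", without_ansi).strip()


# Typical use in a CLI test (result is a hypothetical CliRunner result):
#     assert "--auto" in flatten_help(result.stdout)
```

Stripping the escape codes first matters: a style code emitted between the dashes and the flag name would otherwise break the substring match even on a wide terminal.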
Summary
Merges the full roadmap (51 commits across five work packages plus CI/notebook fixes) from staging to main. The demo moves from "mock-up that narrates a three-remote story" to "running code that exercises four services with a fully audited multi-ASoT provenance graph."
- query_named_graph helper; ExecutionMetadata.executor_uri / location_uri consolidation; validation→verification rename with back-compat alias; interrogate.rerun Typer CLI + Stage 6.5 banner extension; Typer migration of pipeline.runner + interrogate.{explain,reproduce}; openCAESAR prose cleanup.
- make ontology (fail-fast on missing Java); .github/workflows/ontology.yml with setup-java@v4 + cached robot.jar; pytest live / network markers; triple-count budget gate; openCAESAR code/data cleanup + rtm.ttl regen.
- rtm:DockerImage class + property set; hash_docker_image(); DockerCompute._emit_image_node(); prov:wasDerivedFrom on Docker evidence; evidence_by_image SPARQL helper; DockerEvidenceShape closure rule.
- rtm:gitRef + rtm:flexoRecord cross-linking image to git + Flexo; rtm:DockerContainer as a first-class materialization entity (image vs container vs host distinction); organizational auspices via prov:Organization + rtm:operatedBy; EARL-wrapped automated outcomes (rtm:ClosureRuleAssertion, rtm:DigestMatchAssertion); compute.reproduce Typer CLI (rebuild image at recorded git ref + digest-compare); TxnLogBackend (CouchDB) as a fourth service with its own URI + auspices; TransactionLogger with secret-redaction allowlist; six typed trust queries in traceability/queries.py; optional FLEXO_BRANCH_PREFIX for multi-run isolation; tools/start-services.sh + docker-compose.yml; new ARCHITECTURE.md.
- RECONCILIATION.md mapping every slide claim to its code receipt.
- build_time (SOURCE_DATE_EPOCH + git ct); actions/checkout with fetch-depth: 0; diff-gate scoped to rtm.ttl (manifest legitimately diverges by build path); CLI --help substring tests hardened against terminal-width wrapping; WP* mentions stripped from notebook prose.
Closed by this PR
Remain open (intentional)
- adcs Typer aggregator CLI (deferred from WP1) #5: Top-level adcs Typer aggregator; deferred until WP4 entry points stabilize; tracked in WP1 §10
- ValidateShapes step IRI to VerifyShapes (with Flexo + audit migration) #6: Rename ValidateShapes step IRI to VerifyShapes; discipline follow-up requiring a Flexo + audit migration; tracked in WP1 §10
- rime-backend-demo; @rororowyourboat tagged in body)
New surfaces a reviewer can read cold
- ARCHITECTURE.md: three-remote + fourth-service picture, URI scheme, EARL outcomes, six trust queries, preflight gate, reproducibility loop
- RECONCILIATION.md: slide claims ↔ code receipts for the May-22 OpenMBEE deck
- .env.example: every FLEXO/ADCS_TXNLOG/ADCS_*_ORG env var with documented defaults
- tools/start-services.sh: one-liner CouchDB txnlog store provisioning
Test plan
- uv run pytest: 269 passed, 5 skipped, 3 deselected (on staging tip)
- staging push: run 26616426815
- make ontology-python produces byte-identical rtm.ttl on rebuild (reproducibility verified locally)
- uv run python -m pipeline.runner --auto end-to-end smoke: 1084 union triples in output/rtm.ttl
- ARCHITECTURE.md cold; does the three-remote + fourth-service picture stand on its own?
- output/notebook.html (or wait for Pages); does the multi-ASoT narrative + numerical-evidence provenance render land?
- tools/start-services.sh && export ADCS_TXNLOG_ENABLED=1 && export FLEXO_TOKEN=... && uv run python -m pipeline.runner --auto --backend=flexo --compute=docker
🤖 Generated with Claude Code