Skip to content

Conversation

@lsm5
Copy link
Member

@lsm5 lsm5 commented Dec 10, 2025

Depends on #512 . Only see HEAD commit here.

@github-actions github-actions bot added the image Related to "image" package label Dec 10, 2025
podmanbot pushed a commit to podmanbot/buildah that referenced this pull request Dec 10, 2025
@podmanbot
Copy link

✅ A new PR has been created in buildah to vendor these changes: containers/buildah#6597

Copy link
Contributor

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is valuable.

On the process side, I’m not quite sure how to go about building this — I’d like to avoid exposing and API that plain doesn’t work for some transports. I think I’d prefer:

  • Building this ~one complete feature at a time, e.g. if we add PutBlobOptions.Digests, then actually implement it in all transports immediately. We should never be failing to enforce ForceDigest… and reporting success. (In many cases, a Situation{CannotChangeDigestReason: "a sha256-only transport not updated yet", digest: sha256Value} in a transport would be fine, that’s what the tracker is for, as long as the enforcement of the rules is all there immediately.)
  • Not providing the public API until all pieces are in place. If we need something for testing, a temporary BrokenSetForceDigestAlgorithm /* Unstable API, may be changed or removed any time */?

Without a deep review or revisiting earlier PRs, I think missing pieces here are

  • multi-platform images: should not be too difficult, but is fairly orthogonal
  • TryReusingBlob: this + BlobInfoCache will be significant, it has ~the same prerequisite question “what layer reuse is valid cross-digest” as c/storage — and that’s a major reason why I’d prefer to build this a feature at a time, so that we can get plain PutBlob or configs done without blocking on this
  • The PutBlobPartial code path (plausibly should be just a failure because pull --force-digest= makes little sense, and would be difficult to implement, I’m not sure)
  • DiffID computation (maybe keep sha256 and fail on schema1??)
  • ZstdChunked compression using sha512 digests for all its data

}
if stream.info.Digest != "" && uploadedInfo.Digest != stream.info.Digest {
return types.BlobInfo{}, fmt.Errorf("Internal error writing blob %s, blob with digest %s saved with digest %s", srcInfo.Digest, stream.info.Digest, uploadedInfo.Digest)
expectedAlgo, err := ic.c.options.digestOptions.Choose(digests.Situation{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is redundant / conflicting with PutBlob… making its own choice. If we had the generic code choosing a specific algorithm and expecting precisely that one to be used, then it should do that before PutBlob and provide that to transports as a parameter.

I’m not sure where the decision should be, at this point [and it’s a private interface, so we can change our mind] — I guess the transport might have some information (e.g. about remote registry capabilities) that the generic code doesn’t — but, for now, we don’t have all transports updated yet, either way…


AFAICS this only exists as a sanity check (both config and layer copies later enforce equality if cannotModifyManifest…), so I think an “if algorithms match the whole values must match” check here would be sufficient.

}

// SetForceDigestAlgorithm forces the use of a specific digest algorithm for this copy operation.
func (o *Options) SetForceDigestAlgorithm(algo digest.Algorithm) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Yes, I’m generally in favor of adding more options via methods rather than fields. And we can always add something vaguely like SetForceDigestOptions later if we decided to make that fully public.)

I’m wondering about naming this ForceDestinationDigest… — we might eventually want a “pulls must consume sha512 or stronger digests and refuse to work on sha256”, and that’s a fairly unrelated feature.

if err != nil {
return fmt.Errorf("failed to set force-digest algorithm: %w", err)
}
o.digestOptions = digestOpts
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fail if this is already set? It might protect against oversights. OTOH it can also prevent some code patterns — if that turned out to be troublesome, we can always remove that enforcement later.

// only to allow gradually building the feature set.
// After c/image/copy consistently implements it, provide a public digest options API of some kind.
optionsCopy := *options
// Only set default if not already configured
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(We can avoid the whole optionsCopy if digestOptions != nil.)

if destInfo.Digest != srcInfo.Digest {
return fmt.Errorf("Internal error: copying uncompressed config blob %s changed digest to %s", srcInfo.Digest, destInfo.Digest)
// Allow digest algorithm changes when forcing a specific digest algorithm
forcingDifferentAlgo := ic.c.options.digestOptions.MustUseSet() != "" &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • See the similar check copyBlobFromStream
  • “MustUse” Is not the only reason a copy might choose a different algorithm (e.g. we might be pushing to a sha256-only-transport or registry), hard-coding a check only for MustUse here is overly specific
  • Compare the end of copyLayers, where the equality check is immediately paired with manifest edits [and see elsewhere about manifest edits], so that we always set up edits if we know they are needed, and we minimize the risk of the two getting out of sync. I’m not sure whether the check+update should live in copyConfig or copyUpdatedConfigAndManifest (guessing the latter), but they should be together.


// updateManifestConfigDigest uses typed manifest structures instead of generic JSON manipulation.
// This leverages the existing manifest parsing and serialization infrastructure.
func (ic *imageCopier) updateManifestConfigDigest(manifestBlob []byte, src types.Image, newConfigDigest digest.Digest) ([]byte, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to go through the manifest abstraction in c/image/internal/image and not be hard-coded in c/image/copy.

(It can be a method on image.Sourced/genericManifest, like CanChangeLayerCompression, without making it a public API, at least for now.)

}

ic.c.Printf("Writing manifest to image destination\n")
manifestDigest, err := manifest.Digest(man)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This is another part that will need updating, for multi-arch images.)

@mtrmac
Copy link
Contributor

mtrmac commented Dec 11, 2025

FWIW containers/buildah#6597 hits internal error: use of uninitialized digests.Options — entirely my fault, it’s already broken in #512, but, still a blocker to merging this as well.

@mtrmac
Copy link
Contributor

mtrmac commented Dec 11, 2025

FWIW containers/buildah#6597 hits internal error: use of uninitialized digests.Options

I think #512 has a fix, but I’m not going to stay around today to wait for containers/buildah#6552 to confirm.

@lsm5
Copy link
Member Author

lsm5 commented Dec 12, 2025

I think I’d prefer:
* Not providing the public API until all pieces are in place. If we need something for testing, a temporary BrokenSetForceDigestAlgorithm /* Unstable API, may be changed or removed any time */?

I can go ahead with this. Let me know if you'd much rather the other preference though.

@mtrmac
Copy link
Contributor

mtrmac commented Dec 12, 2025

I didn’t mean the two bullet points as alternatives, but as two parts of the same plan.


My main concern here is having some way of tracking what doesn’t work yet; all of that will need to either be implemented or at least be updated to cleanly fail. So I was thinking “one feature at a time, in all transports”. Doing it that way also means we immediately run into difficulties if some transports’ structures make a design difficult.

There may be other ways to track things or to structure the implementation. And maybe that whole discussion is premature, and we need a prototype PR, or a few, sort of like this, testing the full workflow to just run into things and see what are the things that will need to be done.

The goal is to allow building the digest-choice-dependent machinery
from the bottom up without committing to an API while we don't understand
the problem space; for now, we don't expose any way for users to
actually make a choice.

This code is never called at this point, so this should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
for now, purely to let us internally work on implementing digest
choice / conversions; there is no way for a caller to set it.

Should not change behavior, the field has no users.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
This is:
- UNUSABLE (no external API)
- INCOMPLETE (manifest digests, TryReusingBlob, ...)
- BROKEN (various code assumes that digest returned from upload matches input)
- INCONSISTENT (PutBlobOptions.Digests is only implemented in one transport,
  and silently ignored elsewhere)

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
@lsm5 lsm5 force-pushed the skopeo-copy-force-digest branch from 8a75ccc to f15717e Compare December 26, 2025 21:11
podmanbot pushed a commit to podmanbot/buildah that referenced this pull request Dec 26, 2025
lsm5 added 4 commits December 26, 2025 16:16
Add UpdateConfigDigest() method to the manifest interface to support
updating config descriptor digests when forcing different digest algorithms.

This foundational change allows callers to update config digests uniformly
across different manifest formats (OCI, Docker schema2) while properly
failing for unsupported formats (Docker schema1, which lacks a separate
config descriptor).

Implementations:
- OCI and Docker schema2: Update config digest directly
- Docker schema1: Return error (no separate config descriptor)

This is needed to support forcing digest algorithms in image copy operations,
where the config blob may be re-hashed with a different algorithm.

Signed-off-by: Lokesh Mandvekar <lsm5@redhat.com>
Add public API to force a specific digest algorithm when copying images.
This allows users to override the default sha256 digest algorithm.

The API is marked as "Broken" (unstable) because:
- Not all transports support it yet
- Multi-arch images are not yet handled
- Some edge cases still need work

See containers#552 for status.

Key behaviors:
- Validates algorithm is available on the system
- Prevents reconfiguration (fails if already set)
- Respects existing digestOptions if already configured

This is the user-facing entry point; actual implementation follows
in subsequent commits.

Signed-off-by: Lokesh Mandvekar <lsm5@redhat.com>
Implement the core logic to force digest algorithms when copying images:

1. Config digest handling:
   - Modify copyConfig() to return new digest when algorithm changes
   - Add updateManifestConfigDigest() helper to update config digest in manifest
   - Use manifest abstraction layer for uniform digest updates

2. Manifest digest selection:
   - Use digestOptions.Choose() to select manifest digest algorithm
   - Apply selected algorithm via manifest.DigestWithAlgorithm()

3. Digest mismatch handling:
   - Allow digest changes only when algorithms differ
   - Detect corruption when same algorithm produces different digest

This builds on the manifest abstraction layer and API from previous commits
to enable the actual digest forcing functionality.

Note: Guards for unsupported scenarios (multi-arch, schema1, etc.) are added
in the next commit to keep this focused on core logic.

Signed-off-by: Lokesh Mandvekar <lsm5@redhat.com>
Add guards and limitations for cases where digest forcing is not yet
supported or incompatible with certain features:

1. Multi-arch images (multiple.go):
   - Not yet implemented; needs per-instance config digest updates
   - Fail cleanly with clear error message
   - See containers#552 (comment)

2. Docker schema1 manifests (single.go):
   - Lack separate config descriptor, only support sha256
   - Fail early when forcing non-sha256 algorithms

3. zstd:chunked compression (single.go):
   - Uses sha256 for internal TOC and chunk digests
   - Fall back to plain zstd when forcing non-sha256
   - Fail if requireCompressionFormatMatch is set

4. PutBlobPartial optimization (single.go):
   - Incompatible with forcing different digest (needs full layer)
   - Disable partial pull when algorithms don't match

5. Blob reuse optimization (single.go):
   - Skip blob reuse when digest algorithms don't match
   - Future: could support cross-digest reuse via BlobInfoCache

These guards ensure clean failures and fallbacks rather than silent
corruption or confusing errors.

Signed-off-by: Lokesh Mandvekar <lsm5@redhat.com>
@lsm5 lsm5 force-pushed the skopeo-copy-force-digest branch from f15717e to eca922f Compare December 26, 2025 21:29
podmanbot pushed a commit to podmanbot/buildah that referenced this pull request Dec 26, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

image Related to "image" package

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants