@mswintermeyer
Contributor

Before this PR

A race condition can currently make files impossible to decrypt if a file is re-uploaded to S3.

If a service implementing the S3 API takes a long time to process a file creation request (e.g. while Ceph is resharding) and the file is re-uploaded during that time, there is no guarantee which write wins. The file can end up coming from one upload and the key material from the other; since they don't match, the file can't be decrypted.

After this PR

Fix the race for S3 by always using a unique suffix/identifier for the file key (so keys never overwrite each other) and attaching metadata with that identifier to the primary file.

There is now a risk that if a file is re-uploaded, two key material files exist for it, and one of them is useless and never gets deleted. That is a reasonable tradeoff compared to having files that cannot be decrypted.
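
A minimal sketch of the idea, not the code in this PR: it uses in-memory maps as a stand-in for an S3-like store. The class `KeySuffixSketch`, its methods, and the `path + "." + suffix` naming of key material are all hypothetical; only the metadata key name is taken from the diff. The point it illustrates is that the suffix travels with the object in the same write, so whichever object write lands last still points at its own key material.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical in-memory model of an S3-like store; not the PR's implementation.
class KeySuffixSketch {
    static final String S3_METADATA_KEY = "crypto-file-key-suffix";

    // An object body together with its user metadata, written in one request.
    static final class StoredObject {
        final String body;
        final Map<String, String> metadata;

        StoredObject(String body, Map<String, String> metadata) {
            this.body = body;
            this.metadata = metadata;
        }
    }

    final Map<String, StoredObject> objects = new HashMap<>();
    final Map<String, String> keyMaterial = new HashMap<>();

    // Store key material under a name that is unique per upload and return the suffix.
    String writeKeyMaterial(String path, String key) {
        String suffix = UUID.randomUUID().toString();
        keyMaterial.put(path + "." + suffix, key); // assumed naming scheme
        return suffix;
    }

    // Store the encrypted file; the suffix is attached as metadata in the same write.
    void writeObject(String path, String ciphertext, String suffix) {
        objects.put(path, new StoredObject(
                ciphertext, Collections.singletonMap(S3_METADATA_KEY, suffix)));
    }

    // Resolve the key material that belongs to whichever object write won.
    String keyFor(String path) {
        String suffix = objects.get(path).metadata.get(S3_METADATA_KEY);
        return keyMaterial.get(path + "." + suffix);
    }
}
```

If two uploads of the same path interleave in this model, `keyFor` returns the key of the upload whose object write landed last, and the other upload's key material stays behind as the orphan described above.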

==COMMIT_MSG==
S3 files contain metadata to uniquely associate them with keys, to protect against a race condition when a file is uploaded twice
==COMMIT_MSG==

Possible downsides?

@changelog-app

changelog-app bot commented Nov 25, 2025

Generate changelog in changelog/@unreleased

Type (Select exactly one)

  • Feature (Adding new functionality)
  • Improvement (Improving existing functionality)
  • Fix (Fixing an issue with existing functionality)
  • Break (Creating a new major version by breaking public APIs)
  • Deprecation (Removing functionality in a non-breaking way)
  • Migration (Automatically moving data/functionality to a new system)

Description

S3 files contain metadata to uniquely associate them with keys, to protect against a race condition when a file is uploaded twice

Check the box to generate changelog(s)

  • Generate changelog entry

@changelog-app

changelog-app bot commented Nov 25, 2025

Successfully generated changelog entry!


📋Changelog Preview

✨ Features

  • S3 files contain metadata to uniquely associate them with keys, to protect against a race condition when a file is uploaded twice (#901)

private static final String DEFAULT_CIPHER_ALGORITHM = AesCtrCipher.ALGORITHM;
private static final String S3_METADATA_KEY = "crypto-file-key-suffix";
private static final String S3_METADATA_KEY_HEADER =
        Constants.FS_S3A_CREATE_HEADER + ".x-amz-meta-" + S3_METADATA_KEY;
