S3 files contain metadata to uniquely associate them with keys #901
base: develop
private static final String DEFAULT_CIPHER_ALGORITHM = AesCtrCipher.ALGORITHM;
private static final String S3_METADATA_KEY = "crypto-file-key-suffix";
private static final String S3_METADATA_KEY_HEADER =
        Constants.FS_S3A_CREATE_HEADER + ".x-amz-meta-" + S3_METADATA_KEY;
Before this PR
There is currently a possible race condition that can make a file impossible to decrypt if it is re-uploaded to S3.
If a service implementing the S3 API takes a long time to process a file creation request (e.g. while Ceph is resharding) and the file is re-uploaded during that time, there is no guarantee as to which write wins. The file could then come from one upload and the key material from the other; since they don't match, the file can't be decrypted.
After this PR
Fix the race for S3 by always using a unique suffix/identifier for the file key (so key material files never overwrite each other) and attaching metadata with that identifier to the primary file.
There's now a risk that if a file is re-uploaded, two key material files could exist for it, and one will be useless and never get deleted. But that's a reasonable tradeoff compared to having files that cannot be decrypted.
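To make the idea concrete, here is a minimal in-memory sketch of the approach described above. It is not the code from this PR: the class `KeySuffixSketch`, its methods, and the `.keymaterial.<suffix>` naming are hypothetical, and plain maps stand in for an S3 bucket. Only the metadata name `crypto-file-key-suffix` comes from the diff. The sketch assumes the data object and its user metadata are stored by the same put, so whichever upload wins carries its own suffix.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

/** Hypothetical in-memory stand-in for a bucket: object bodies plus per-object user metadata. */
public class KeySuffixSketch {
    // Metadata name taken from the diff; everything else here is illustrative.
    static final String S3_METADATA_KEY = "crypto-file-key-suffix";

    final Map<String, String> bodies = new HashMap<>();
    final Map<String, Map<String, String>> metadata = new HashMap<>();

    /** Stores key material under a fresh unique suffix and returns that suffix. */
    String putKeyMaterial(String path, String keyMaterial) {
        String suffix = UUID.randomUUID().toString();
        bodies.put(path + ".keymaterial." + suffix, keyMaterial);
        return suffix;
    }

    /** Stores the data object and its suffix metadata together, as one put would. */
    void putFile(String path, String ciphertext, String suffix) {
        bodies.put(path, ciphertext);
        metadata.put(path, Map.of(S3_METADATA_KEY, suffix));
    }

    /** Resolves key material through the suffix recorded on the data object. */
    Optional<String> keyMaterialFor(String path) {
        Map<String, String> meta = metadata.get(path);
        if (meta == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(bodies.get(path + ".keymaterial." + meta.get(S3_METADATA_KEY)));
    }

    public static void main(String[] args) {
        KeySuffixSketch store = new KeySuffixSketch();

        // Two uploads of the same path race; each writes its own key material first.
        String suffixA = store.putKeyMaterial("data/file", "key-A");
        String suffixB = store.putKeyMaterial("data/file", "key-B");

        // The data writes land in an arbitrary order; here upload A happens to win.
        store.putFile("data/file", "ciphertext-B", suffixB);
        store.putFile("data/file", "ciphertext-A", suffixA);

        // The surviving object points at the key material from the same upload.
        System.out.println(store.keyMaterialFor("data/file").orElseThrow()); // prints "key-A"

        // The losing upload's key material is orphaned: 2 key files + 1 data object.
        System.out.println(store.bodies.size()); // prints "3"
    }
}
```

The second printed value illustrates the tradeoff noted above: the key material from the losing upload stays in the store with nothing referencing it.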
==COMMIT_MSG==
S3 files contain metadata to uniquely associate them with keys, to protect against a race condition when a file is uploaded twice
==COMMIT_MSG==
Possible downsides?