CLDSRV-898: handle checksums in CompleteMultipartUpload#6170
CLDSRV-898: handle checksums in CompleteMultipartUpload#6170leif-scality wants to merge 10 commits into
Conversation
|
LGTM |
❌ 1 Tests Failed:
View the top 1 failed test(s) by shortest run time
To view more test analytics, go to the Test Analytics Dashboard |
2493909 to
538c16c
Compare
Hello leif-scality,My role is to assist you with the merge of this Available options
Available commands
Status report is not available. |
Incorrect fix versionThe
Considering where you are trying to merge, I ignored possible hotfix versions and I expected to find:
Please check the |
27b4a43 to
a90bd94
Compare
|
LGTM |
| const parts = jsonList.Part || []; | ||
| for (let i = 0; i < parts.length; i++) { | ||
| const part = parts[i]; |
There was a problem hiding this comment.
| const parts = jsonList.Part || []; | |
| for (let i = 0; i < parts.length; i++) { | |
| const part = parts[i]; | |
| for (const part of (jsonList.Part || [])) { |
| if (tag !== expectedTag) { | ||
| const algoLabel = tag.replace(/^Checksum/, '').toLowerCase(); | ||
| return errorInstances.BadDigest.customizeDescription( | ||
| `The ${algoLabel} you specified for part ${partNumber} ` + 'did not match what we received.', |
There was a problem hiding this comment.
| `The ${algoLabel} you specified for part ${partNumber} ` + 'did not match what we received.', | |
| `The ${algoLabel} you specified for part ${partNumber} did not match what we received.`, |
There was a problem hiding this comment.
There's already a test file multipartUpload.js, wouldn't it be better to add the tests there? Or if a refactor into multiple files is beneficial, maybe port some related tests from there to this new file to avoid scattering the complete MPU tests into multiple files.
There was a problem hiding this comment.
moved the tests to multipartUpload.js. This also triggered the new prettier linter in the file, I added it as a separate commit
|
|
||
| // XML element name AWS uses for each algorithm in CompleteMultipartUpload's | ||
| // per-part body. | ||
| const TAG_BY_ALGO = { |
There was a problem hiding this comment.
Shouldn't this be in constants too?
There was a problem hiding this comment.
The algorithms object contains the TAG of each algo already, this object is just for testing that the tag was not changed in the algorithms object
There was a problem hiding this comment.
Could you construct it from algorithms? If algorithm is supposed to be a constant and we are concerned that it may be accidentally changed, what about using Object.freeze on it?
There was a problem hiding this comment.
The goal of the test if to raise an error if we change the xmlTag for some reason in algorithms. If we construct it from algorithms then we will never detect the change. algorithms is already Object.freeze
| assert.strictEqual( | ||
| err.description, | ||
| 'One or more of the specified parts could not be ' + | ||
| 'found. The part may not have been uploaded, or ' + |
There was a problem hiding this comment.
Is the double space intentional? Also not very fond of matching against such a long error message that may vary, but that's okay I think.
There was a problem hiding this comment.
AWS returns two spaces. For all the final API/XML errors I return exactly what AWS returns
<Error><Code>InvalidPart</Code>
<Message>One or more of the specified parts could not be found. The part may not have been uploaded, or the specified entity
tag may not match the part's entity tag.</Message>
<UploadId>CTFNKyL2hI6n6irH0zbVWDzdPZ4n2ueJceRh1juCeuL2X5HOjrvCmXQMEqaoAatEWTEa3pWWxC7t9lOStMzjo0nJb4pv8Ct6oT
Hv2n8mggVXRQ8RxiXyVyt3.3zpY98HVsZd.ozihhJ1HdUjLCkJtwJMQdNBd4fSdG9drS80vdg-</UploadId>
<PartNumber>1</PartNumber>
<ETag>0ebf9257a12e808d107b2ed1a826c122</ETag>
<RequestId>DAQAPVMCMSY4PDPV</RequestId>
<HostId>XS/2re3ieUQRTKdANLtZv14qyB2h3LjVHnmVrvjP0cj1PazPO16KkArQMtBLBy8S4mmLzQkuXZc=</HostId></Error>
There was a problem hiding this comment.
okay fair enough (my little voice tells me that AWS didn't care that much about the beauty of their error messages 🙂 )
There was a problem hiding this comment.
I believe no sane application would match an error by their exact error message to the dot, but I'm good with sticking to AWS to avoid ourselves asking the question. Personally I wouldn't care though 😛
There was a problem hiding this comment.
Also I believe the error messages are not enforced by the API contract, hence may change at any time on AWS side, which would make our effort useless in the end. Just my two cents and my last comment on this 😉
There was a problem hiding this comment.
Yeah the message can be improved, but I just copy and paste it to stay consistent. Also having the same error message may help the client if they google it, they will get more information than with a custom one
| // `x-amz-checksum-type` and `x-amz-checksum-algorithm` are configuration | ||
| // headers (MPU completeness mode / SDK algorithm hint), not value | ||
| // headers. They must not count toward the "value header" tally. | ||
| const valueHeaders = Object.keys(headers).filter( | ||
| h => h.startsWith('x-amz-checksum-') && h !== 'x-amz-checksum-type' && h !== 'x-amz-checksum-algorithm', | ||
| ); |
There was a problem hiding this comment.
If the list of supported headers is known and a short list of supported algorithms, it may be cleaner to directly extract each of the possible ones, or otherwise filter like [list-of-supported-headers].includes(h).
It would change the behavior (probably in a good way) if the client sends one or more unsupported checksums along with multiple valid ones, where we probably want to return AlgoNotSupported rather than MultipleChecksumTypes in priority in this case, but that's more a nitpick, either way should work.
There was a problem hiding this comment.
AWS checks MultipleChecksumTypes before AlgoNotSupported, the current behavior is the same as AWS.
AWS also ignores only x-amz-checksum-type and x-amz-checksum-algorithm, they don't count them to the checksum count, and they also never trigger AlgoNotSupported. if we send x-amz-checksum-BAD on the other hand we get an AlgoNotSupported. If we send x-amz-checksum-BAD + x-amz-checksum-ZZZ we get MultipleChecksumTypes.
So the order is
- check no MultipleChecksumTypes
- check no AlgoNotSupported
- check actual checksum value, BadDigest if mismatch
I added a commit to ignore x-amz-checksum-type and x-amz-checksum-algorithm in the other functions, I didn't know this behavior existed.
There was a problem hiding this comment.
Sounds good to me.
IMHO we don't necessarily have to mimic every AWS behavior when it comes to error handling in particular, as it takes time to reverse engineer those aspects with no clear benefit, and AWS implementation is not perfect and their choices can be debated too. Sure it became the de facto standard and we need to stick to its way of doing things to avoid breaking existing applications, but not necessarily in all its nitty gritty details and quirks.
There was a problem hiding this comment.
I needed to test a lot to prevent being more strict than AWS and introducing breaking changes. So I have a list of errors already so we might as well use it
a90bd94 to
e468700
Compare
|
LGTM |
46b6d78 to
6e735e4
Compare
|
|
LGTM — solid implementation with thorough test coverage. |
6e735e4 to
1c92b9b
Compare
|
1c92b9b to
3df48c2
Compare
|
LGTM |
3df48c2 to
c190267
Compare
|
c190267 to
7b3785c
Compare
|
LGTM |
There was a problem hiding this comment.
This diff is too big, it could be in another PR by itself
There was a problem hiding this comment.
this was caused by the linter
| const checksumHeaders = Object.keys(headers).filter( | ||
| header => | ||
| header.startsWith('x-amz-checksum-') && | ||
| header !== 'x-amz-checksum-type' && | ||
| header !== 'x-amz-checksum-algorithm', | ||
| ); |
There was a problem hiding this comment.
This pattern reoccurs multiple times in this file, you can make a function for that
There was a problem hiding this comment.
created helper
| }); | ||
|
|
||
| it( | ||
| 'should return CRC64NVME/FULL_OBJECT on CompleteMPU response ' + 'when CreateMPU sent no checksum headers', |
There was a problem hiding this comment.
| 'should return CRC64NVME/FULL_OBJECT on CompleteMPU response ' + 'when CreateMPU sent no checksum headers', | |
| 'should return CRC64NVME/FULL_OBJECT on CompleteMPU response when CreateMPU sent no checksum headers', |
7b3785c to
a90bd94
Compare
| let caught; | ||
| try { | ||
| await promise; | ||
| } catch (err) { | ||
| caught = err; | ||
| } | ||
| assert(caught, 'expected CompleteMPU to reject'); | ||
| assert.strictEqual( | ||
| caught.name, | ||
| 'InvalidPart', | ||
| `expected InvalidPart, got ${caught.name}: ${caught.message}`, | ||
| ); |
There was a problem hiding this comment.
You can use assert.rejects
await assert.rejects(promise, err => {
assert.strictEqual(
err.name,
'InvalidPart',
`expected InvalidPart, got ${err.name}: ${err.message}`,
);
});a90bd94 to
7b3785c
Compare
|
I forced push from the wrong branch, reverted the changes |
| const log = { debug: sandbox.stub() }; | ||
| const result = await validateMethodChecksumNoChunking(request, body, log); | ||
| assert.deepStrictEqual(result, ArsenalErrors.BadDigest); | ||
| assert(log.debug.calledOnce); |
There was a problem hiding this comment.
debug logs should probably not be tested
There was a problem hiding this comment.
I will remove the log tests
|
LGTM — well-structured feature with thorough test coverage across all algorithm/type combinations, error paths, and edge cases. Two minor nits posted inline: |
| }; | ||
| const log = { debug: sandbox.stub() }; | ||
| const result = await validateMethodChecksumNoChunking(request, body, log); | ||
| assert.strictEqual(result, null); |
There was a problem hiding this comment.
In some tests above you use assert.ifError(result) and here assert.strictEqual(result, null), you should stick to only 1 pattern and be consistent everywhere
There was a problem hiding this comment.
used ifError everywhere
| apiMethod: 'completeMultipartUpload', | ||
| headers: { 'content-md5': '1B2M2Y8AsgTpgAmY7PhCfg==' }, | ||
| }; | ||
| const log = { debug: sandbox.stub() }; |
There was a problem hiding this comment.
It would be better to have a full mocked logger like it's done in some other test files, to avoid breakin if we change from debug to another level
There was a problem hiding this comment.
I will remove the log tests
| const algorithm = storedMetadata.checksumAlgorithm; | ||
| const type = storedMetadata.checksumType; |
There was a problem hiding this comment.
Deconstruction could be slightly more readable
| const algorithm = storedMetadata.checksumAlgorithm; | |
| const type = storedMetadata.checksumType; | |
| const { checksumAlgorithm: algorithm, checksumType: type } = storedMetadata; |
| } | ||
| const { keysToDelete, extraPartLocations } = filteredPartsObj; | ||
| const { aggregateETag, dataLocations, calculatedSize } = partsInfo; | ||
| function continueProcessParts() { |
There was a problem hiding this comment.
Shouldn't this function be another step in the waterfall?
There was a problem hiding this comment.
integrated it with the waterfall
| error: resolveErr, | ||
| }); | ||
| return next(resolveErr, destBucket); | ||
| } |
There was a problem hiding this comment.
Is there a benefit in catching those errors since we already have a catch-all below
There was a problem hiding this comment.
fixed by the previous comment
| // CompleteMPU is not in `checksumedMethods` (x-amz-checksum-* is the | ||
| // final-object checksum, not a body digest) but it still must validate | ||
| // Content-MD5 against the XML body when present. | ||
| describe('completeMultipartUpload (md5-only path)', () => { |
There was a problem hiding this comment.
I think a lot of the unit tests could be turned into a declarative matrix for readability.
There was a problem hiding this comment.
This is an edge case, Other tests use for loops to test all checksums algorithms already
|
|
||
| // Resolves with { xml, headers } so callers can inspect both the | ||
| // response body and the response headers. | ||
| function completeMpuP(uploadId, partChecksumXml = '') { |
There was a problem hiding this comment.
It seems to me the test code is riddled with this type of local helpers, maybe we should consider a common layer of helpers for MPU construction?
There was a problem hiding this comment.
moved them to mpuUtils.js
| it( | ||
| 'should return CRC64NVME/FULL_OBJECT on CompleteMPU response ' + 'when CreateMPU sent no checksum headers', |
There was a problem hiding this comment.
Extra newline and concatenation doesn't look necessary.
7b3785c to
fb9139f
Compare
…ing in part metadata
fb9139f to
ee04392
Compare
|
LGTM |
tests/unit/api/multipartUpload.js