feat: Initial support for HTTP range requests#481
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #481 +/- ##
==========================================
- Coverage 87.16% 87.11% -0.05%
==========================================
Files 83 84 +1
Lines 13129 13373 +244
==========================================
+ Hits 11444 11650 +206
- Misses 1685 1723 +38
☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
This comment was marked as spam.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 3 potential issues.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 02796e5. Configure here.
| .unwrap_or(0); | ||
| return Err(Error::RangeNotSatisfiable { total }); | ||
| } | ||
|
|
There was a problem hiding this comment.
S3 partial GET wrong size
Medium Severity
On ranged object GETs, request_object sets metadata.size from the response Content-Length, which reflects the partial body length, not the full object size. If Content-Range is missing on a 206, the fallback builds ContentRange::full from that value, so the API can emit a 200 with a truncated total and wrong Content-Length semantics.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit 02796e5. Configure here.
There was a problem hiding this comment.
False positive. Two reasons:
-
metadata.sizeis explicitly excluded from response headers —to_headers()destructures it assize: _(metadata.rs:347), and there's a test (size_not_included_in_headers) confirming this. For ranged responses,Content-Lengthis set fromcontent_range.len()(objects.rs:115-117), which comes from the parsedContent-Rangeheader, not frommetadata.size. -
The "fallback" scenario doesn't exist — if we get a 206 response without a valid
Content-Rangeheader, the code returns an error (ok_or_else(|| Error::Generic { ... })?at line 329-332), not a fallback toContentRange::full.
| .and_then(ContentRange::parse) | ||
| .unwrap_or_else(|| ContentRange::full(metadata.size.unwrap_or(0) as u64)) | ||
| } else { | ||
| ContentRange::full(metadata.size.unwrap_or(0) as u64) |
There was a problem hiding this comment.
Missing size yields zero length
High Severity
When metadata.size is unset, GCS and S3 build ContentRange::full(0), and the GET handler always sets Content-Length from content_range.len(). Clients can receive Content-Length: 0 while the response body still streams the full object.
Additional Locations (2)
Reviewed by Cursor Bugbot for commit 02796e5. Configure here.
There was a problem hiding this comment.
False positive. When metadata.size is None, ContentRange::full(0) creates { start: 0, end: 0, total: 0 }. is_full() returns true because total == 0 (range.rs:155). In the endpoint handler (objects.rs:113), !content_range.is_full() is false, so the Content-Length header is never set from the content_range for this case. Axum/hyper determines Content-Length from the actual body stream for full (200 OK) responses.
| .unwrap_or_else(|| ContentRange::full(metadata.size.unwrap_or(0) as u64)) | ||
| } else { | ||
| ContentRange::full(metadata.size.unwrap_or(0) as u64) | ||
| }; |
There was a problem hiding this comment.
206 fallback length mismatch
Medium Severity
If GCS returns 206 Partial Content but omits or malforms Content-Range, the code falls back to ContentRange::full using the full object size from metadata while the stream only contains the range bytes. The HTTP layer then sets Content-Length to the full size, not the partial body length.
Reviewed by Cursor Bugbot for commit 02796e5. Configure here.
There was a problem hiding this comment.
This was a valid finding for commit 02796e5, but it's already been fixed in a subsequent commit (7be7cc9). The unwrap_or_else fallback was replaced with ok_or_else(|| Error::Generic { ... })?, so now a 206 response without a valid Content-Range header returns an error instead of falling back to ContentRange::full.
This comment was marked as spam.
This comment was marked as spam.
Thread `Option<ByteRange>` through all `get_object` paths and return `ContentRange` in every response so the HTTP handler can choose between 200, 206, and 416 with correct headers. Backends that store objects remotely (GCS, S3) forward the Range header and parse the response. Local-filesystem uses seek-based partial reads. In-memory and Bigtable ignore the range and always return the full payload with 200, which is RFC-compliant.
- Parse Range header unit case-insensitively per RFC 9110 (e.g. `Bytes=0-4` now works instead of being ignored) - Multi-range requests fall back to full 200 response instead of returning 400, matching RFC semantics for unsupported features
- Only set explicit Content-Length when content_range.total > 0, avoiding a Content-Length: 0 regression when metadata.size is unknown. - Use response Content-Length (not metadata.size) in 206 fallback paths for GCS and S3 so Content-Length matches the body if Content-Range is missing from a 206 response.
Reduce test boilerplate by consolidating 20 unit tests into 7 with identical coverage. Remove redundant integration test already covered by unit tests. Use HeaderValue::from_static, delegate to Display impl, and pass ByteRange directly instead of an intermediate HeaderMap.
Tests added to main after this branch diverged still used the old get_object signature (1 arg, 2-tuple return). Update them to pass None for the range parameter and destructure the 3-tuple.
85de81c to
d12a378
Compare
…r S3 partial size bytes=-0 (last 0 bytes) is semantically meaningless and should be rejected with RangeError::Invalid rather than parsed as Last(0). For S3 206 responses, Content-Length reflects the partial body length, not the full object size. Read metadata.size from the Content-Range total field instead so the metadata accurately represents the full object. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


Adds standard HTTP range requests support for single-object GET operations.
Spec vs implementation:
Range: bytes=0-1023. The spec allows for Multipart ranges too, i.e. fetching multiple ranges in a single request, which we don't support. We can add it in the future if needed.Implementation details:
get_objectpaths now takeOption<ByteRange>and returnContentRangealongside the payload. This seems to be the way to implement this with the least code duplication. Otherwise we would have to introduce aget_object_range, implement it for every backend and wrappers of the backends, etc.objectstore-servicereserializes theRangeheader and forwards it to the backend. On the response path, we proxy back thecontent-rangeprovided by the backend. TODO:Close FS-363