Skip to content

feat: Initial support for HTTP range requests#481

Draft
lcian wants to merge 16 commits into
mainfrom
lcian/feat/range-requests
Draft

feat: Initial support for HTTP range requests#481
lcian wants to merge 16 commits into
mainfrom
lcian/feat/range-requests

Conversation

@lcian
Copy link
Copy Markdown
Member

@lcian lcian commented May 26, 2026

Adds standard HTTP range requests support for single-object GET operations.

Spec vs implementation:

  • This implementation only supports byte ranges. The spec doesn't specifically talk about bytes, so in theory it admits ranges of other "types", which just don't make sense for us.
  • This implementation only supports Single part ranges, meaning stuff like Range: bytes=0-1023. The spec allows for Multipart ranges too, i.e. fetching multiple ranges in a single request, which we don't support. We can add it in the future if needed.
  • We should probably implement Conditional range requests in a follow-up PR, as they're pretty much essential to guarantee that clients read consistent data. However, I don't think Symbolicator uses those.

Implementation details:

  • All get_object paths now take Option<ByteRange> and return ContentRange alongside the payload. This seems to be the way to implement this with the least code duplication. Otherwise we would have to introduce a get_object_range, implement it for every backend and wrappers of the backends, etc.
  • Bigtable: we know that the objects there are small, so we just return the full payload. This behavior is compliant with the spec.
  • GCS/S3: objectstore-service reserializes the Range header and forwards it to the backend. On the response path, we proxy back the content-range provided by the backend. TODO:
  • Not supported for batch operations.
  • Not implemented in clients for now because Symbolicator (which would be the only user atm) doesn't use our client. We can easily implement it when needed.

Close FS-363

@codecov
Copy link
Copy Markdown

codecov Bot commented May 26, 2026

Codecov Report

❌ Patch coverage is 86.88047% with 45 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.11%. Comparing base (324009d) to head (7d629c8).

Files with missing lines Patch % Lines
objectstore-service/src/backend/s3_compatible.rs 17.24% 24 Missing ⚠️
objectstore-service/src/backend/gcs.rs 65.71% 12 Missing ⚠️
objectstore-types/src/range.rs 98.14% 3 Missing ⚠️
objectstore-service/src/backend/common.rs 0.00% 2 Missing ⚠️
objectstore-service/src/backend/testing.rs 50.00% 2 Missing ⚠️
objectstore-server/src/endpoints/common.rs 0.00% 1 Missing ⚠️
objectstore-server/src/endpoints/objects.rs 97.61% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #481      +/-   ##
==========================================
- Coverage   87.16%   87.11%   -0.05%     
==========================================
  Files          83       84       +1     
  Lines       13129    13373     +244     
==========================================
+ Hits        11444    11650     +206     
- Misses       1685     1723      +38     
Components Coverage Δ
Rust Backend 92.14% <86.88%> (-0.19%) ⬇️
Rust Client 79.70% <ø> (ø)
Python Client 88.36% <ø> (ø)

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@lcian

This comment was marked as spam.

@lcian

This comment was marked as spam.

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 3 potential issues.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 02796e5. Configure here.

.unwrap_or(0);
return Err(Error::RangeNotSatisfiable { total });
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

S3 partial GET wrong size

Medium Severity

On ranged object GETs, request_object sets metadata.size from the response Content-Length, which reflects the partial body length, not the full object size. If Content-Range is missing on a 206, the fallback builds ContentRange::full from that value, so the API can emit a 200 with a truncated total and wrong Content-Length semantics.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 02796e5. Configure here.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

False positive. Two reasons:

  1. metadata.size is explicitly excluded from response headers — to_headers() destructures it as size: _ (metadata.rs:347), and there's a test (size_not_included_in_headers) confirming this. For ranged responses, Content-Length is set from content_range.len() (objects.rs:115-117), which comes from the parsed Content-Range header, not from metadata.size.

  2. The "fallback" scenario doesn't exist — if we get a 206 response without a valid Content-Range header, the code returns an error (ok_or_else(|| Error::Generic { ... })? at line 329-332), not a fallback to ContentRange::full.

.and_then(ContentRange::parse)
.unwrap_or_else(|| ContentRange::full(metadata.size.unwrap_or(0) as u64))
} else {
ContentRange::full(metadata.size.unwrap_or(0) as u64)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing size yields zero length

High Severity

When metadata.size is unset, GCS and S3 build ContentRange::full(0), and the GET handler always sets Content-Length from content_range.len(). Clients can receive Content-Length: 0 while the response body still streams the full object.

Additional Locations (2)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 02796e5. Configure here.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

False positive. When metadata.size is None, ContentRange::full(0) creates { start: 0, end: 0, total: 0 }. is_full() returns true because total == 0 (range.rs:155). In the endpoint handler (objects.rs:113), !content_range.is_full() is false, so the Content-Length header is never set from the content_range for this case. Axum/hyper determines Content-Length from the actual body stream for full (200 OK) responses.

.unwrap_or_else(|| ContentRange::full(metadata.size.unwrap_or(0) as u64))
} else {
ContentRange::full(metadata.size.unwrap_or(0) as u64)
};
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

206 fallback length mismatch

Medium Severity

If GCS returns 206 Partial Content but omits or malforms Content-Range, the code falls back to ContentRange::full using the full object size from metadata while the stream only contains the range bytes. The HTTP layer then sets Content-Length to the full size, not the partial body length.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 02796e5. Configure here.

Copy link
Copy Markdown
Member Author

@lcian lcian May 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a valid finding for commit 02796e5, but it's already been fixed in a subsequent commit (7be7cc9). The unwrap_or_else fallback was replaced with ok_or_else(|| Error::Generic { ... })?, so now a 206 response without a valid Content-Range header returns an error instead of falling back to ContentRange::full.

@linear-code

This comment was marked as spam.

@lcian lcian changed the title feat: Support HTTP Range requests for single-object GETs feat: Range requests May 27, 2026
@lcian lcian changed the title feat: Range requests feat(server+service): Support HTTP range requests May 27, 2026
lcian added 9 commits May 29, 2026 10:38
Thread `Option<ByteRange>` through all `get_object` paths and return
`ContentRange` in every response so the HTTP handler can choose between
200, 206, and 416 with correct headers.

Backends that store objects remotely (GCS, S3) forward the Range header
and parse the response. Local-filesystem uses seek-based partial reads.
In-memory and Bigtable ignore the range and always return the full
payload with 200, which is RFC-compliant.
- Parse Range header unit case-insensitively per RFC 9110
  (e.g. `Bytes=0-4` now works instead of being ignored)
- Multi-range requests fall back to full 200 response instead
  of returning 400, matching RFC semantics for unsupported features
- Only set explicit Content-Length when content_range.total > 0, avoiding
  a Content-Length: 0 regression when metadata.size is unknown.
- Use response Content-Length (not metadata.size) in 206 fallback paths
  for GCS and S3 so Content-Length matches the body if Content-Range is
  missing from a 206 response.
Reduce test boilerplate by consolidating 20 unit tests into 7 with
identical coverage. Remove redundant integration test already covered
by unit tests. Use HeaderValue::from_static, delegate to Display impl,
and pass ByteRange directly instead of an intermediate HeaderMap.
Tests added to main after this branch diverged still used the old
get_object signature (1 arg, 2-tuple return). Update them to pass
None for the range parameter and destructure the 3-tuple.
@lcian lcian force-pushed the lcian/feat/range-requests branch from 85de81c to d12a378 Compare May 29, 2026 08:41
lcian and others added 4 commits May 29, 2026 13:37
…r S3 partial size

bytes=-0 (last 0 bytes) is semantically meaningless and should be
rejected with RangeError::Invalid rather than parsed as Last(0).

For S3 206 responses, Content-Length reflects the partial body length,
not the full object size. Read metadata.size from the Content-Range
total field instead so the metadata accurately represents the full
object.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@lcian lcian changed the title feat(server+service): Support HTTP range requests feat: Support HTTP range requests May 29, 2026
@lcian lcian changed the title feat: Support HTTP range requests feat: Basic support for HTTP range requests May 29, 2026
@lcian lcian changed the title feat: Basic support for HTTP range requests feat: Initial support for HTTP range requests May 29, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant