Streaming support for Python #3396

Open
GumpacG wants to merge 5 commits into apache:master from GumpacG:python-http-streaming

@GumpacG
Contributor

@GumpacG GumpacG commented Apr 21, 2026

Adds streaming GraphBinary deserialization to the Python driver. Results are now deserialized directly from the HTTP response stream and pushed to the ResultSet individually, rather than buffering the entire response before processing.

Implements https://lists.apache.org/thread/qyxb845gy7fbhg87pmtcqs5zf0q33zm8

Changes:

  • Added AiohttpSyncStream wrapper and get_stream() to transport, replacing read()
  • Rewrote Connection._receive() to stream GB objects one-at-a-time via GraphBinaryReader.to_object()
  • Updated ResultSet.one()/all() for individual item queue entries; added _EXHAUSTED sentinel to handle None as a valid Gremlin result
  • Added content-type check before GB deserialization for non-GB error responses
  • Removed dead code: buffered read(), data_received_aggregate, deserialize_message/read_payload, stream_chunk, max_content_length
  • Removed GraphSON parameterized test fixtures
  • Updated AbstractBaseTransport: read() → get_stream()
  • 44 new unit tests covering streaming, bulking, error handling, content-type checks, and incremental delivery timing
  • Updated CHANGELOG and upgrade docs
  • Removed AbstractBaseTransport (gremlin_python.driver.transport)
  • Removed AbstractBaseProtocol and GremlinServerHTTPProtocol (gremlin_python.driver.protocol)
  • Removed protocol_factory parameter from Client, DriverRemoteConnection, and Connection
  • Removed transport_factory parameter from Client, DriverRemoteConnection, and Connection
  • GremlinServerError moved from gremlin_python.driver.protocol to gremlin_python.driver.connection
  • Connection constructor signature changed — protocol and transport_factory positional args removed; serializer, auth, and interceptor options are now keyword arguments
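
The sentinel idea from the list above can be sketched roughly as follows. This is a minimal illustration, not the driver's code: the queue, the `drain` helper, and the sample values are assumptions made for the example, and only the name `_EXHAUSTED` comes from this PR. The point is that a unique object marks the end of the stream, so `None` can travel through the queue as an ordinary result.

```python
import queue

# Hypothetical stand-in for the driver's sentinel: a unique object that can
# never be confused with a real result such as None.
_EXHAUSTED = object()


def drain(stream):
    """Collect items from the queue until the sentinel is seen (illustrative helper)."""
    results = []
    while True:
        item = stream.get()
        if item is _EXHAUSTED:  # identity check, so None is not treated as "done"
            break
        results.append(item)
    return results


# Simulated producer: three results, one of which is None, then the sentinel.
q = queue.Queue()
for value in ("marko", None, 29):
    q.put_nowait(value)
q.put_nowait(_EXHAUSTED)

collected = drain(q)
```

Comparing with `is` rather than `==` keeps the check independent of how result objects implement equality.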

Breaking changes:

  • ResultSet iteration now yields individual items instead of lists. Code using results += result must change to
    results.append(result).
  • read() replaced with get_stream()
  • Connection constructor signature changed: protocol and transport_factory positional args removed; serializer, auth, and interceptor options are now keyword arguments
  • Custom transport implementations are no longer supported — the driver uses AiohttpHTTPTransport directly
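
The first breaking change above can be illustrated with a small sketch. It uses plain lists as simulated iteration output rather than a real ResultSet, so the variable names and sample values are assumptions for the example only.

```python
# Simulated payloads only; no server or gremlin_python import is involved.
old_style_batches = [[1, 2], [3]]  # before: each iteration step yielded a list
new_style_items = [1, 2, 3]        # after: each iteration step yields one item

# Before: extend the accumulator with each yielded list.
old_results = []
for result in old_style_batches:
    old_results += result

# After: append each yielded item.
new_results = []
for result in new_style_items:
    new_results.append(result)
```

Leaving `results += result` in place would not always fail loudly. With an integer item it raises `TypeError`, but with a string item Python extends the list one character at a time, so that pattern is worth searching for when upgrading.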

@codecov-commenter

codecov-commenter commented Apr 21, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.30%. Comparing base (cfd6889) to head (9ad11d4).
⚠️ Report is 1036 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff              @@
##             master    #3396      +/-   ##
============================================
- Coverage     77.87%   76.30%   -1.57%     
+ Complexity    13578    13376     -202     
============================================
  Files          1015     1011       -4     
  Lines         59308    60147     +839     
  Branches       6835     7046     +211     
============================================
- Hits          46184    45898     -286     
- Misses        10817    11540     +723     
- Partials       2307     2709     +402     

@GumpacG GumpacG force-pushed the python-http-streaming branch from 6dbad56 to 64337e4 Compare April 26, 2026 02:57
@GumpacG GumpacG marked this pull request as ready for review April 26, 2026 02:58
@GumpacG GumpacG marked this pull request as draft April 29, 2026 18:09
Comment thread docs/src/upgrade/release-4.x.x.asciidoc Outdated

==== Python HTTP Streaming Response Support

The Python driver now streams GraphBinary results directly from the HTTP response body, matching the Go driver's
Contributor

Nit: I don't think it makes sense to compare to the Go driver here. Technically Go got HTTP streaming first in the TP4 betas, but these upgrade docs should focus on the perspective of TP3.8.x users upgrading to TP4. We will arguably want to rewrite all of these sections once all drivers are done, either to group all streaming updates together or to better organize sections per GLV.

For the purposes of this PR, I think we should just ensure we capture all the right information from that TP3 upgrade perspective, and then a larger restructuring can follow if/when it makes sense.

@GumpacG GumpacG marked this pull request as ready for review May 1, 2026 19:17
@GumpacG GumpacG force-pushed the python-http-streaming branch from 05cdc1d to ddaa9ef Compare May 4, 2026 21:09
@kenhuuu
Contributor

kenhuuu commented May 4, 2026

Should the transport.py file be removed as well?

@GumpacG GumpacG force-pushed the python-http-streaming branch from ddaa9ef to 5a4cb62 Compare May 4, 2026 21:23
if obj == Marker.end_of_stream():
    break
if bulked:
    bulk = reader.to_object(stream)
Contributor

I don't think we want to de-bulk here. All the other GLVs and the old Python code preserve the bulk count as a Traverser object and let the traversal iteration layer expand lazily. The current code would put all de-bulked objects into the queue, which kind of defeats the purpose of server-side bulking. This should probably be:

if bulked:
    bulk = reader.to_object(stream)
    self._result_set.stream.put_nowait(Traverser(obj, bulk))
else:
    self._result_set.stream.put_nowait(obj)
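
To make the trade-off concrete, here is a rough, self-contained sketch of keeping the bulk count on the queued entry and expanding it only at iteration time. The `Traverser` dataclass and the `expand` helper below are hypothetical stand-ins written for this example, not the gremlin_python classes, and the sample values are made up.

```python
import queue
from dataclasses import dataclass
from typing import Any, Iterable, Iterator


@dataclass
class Traverser:
    """Hypothetical stand-in: one object plus how many times it occurs."""
    object: Any
    bulk: int = 1


def expand(items: Iterable[Any]) -> Iterator[Any]:
    """Lazily repeat each bulked object; pass unbulked items through unchanged."""
    for item in items:
        if isinstance(item, Traverser):
            for _ in range(item.bulk):
                yield item.object
        else:
            yield item


# The queue holds one entry per distinct result, regardless of bulk size.
stream = queue.Queue()
stream.put_nowait(Traverser("v[1]", 3))
stream.put_nowait("v[2]")
queued = stream.qsize()

pending = []
while not stream.empty():
    pending.append(stream.get_nowait())

expanded = list(expand(pending))
```

With this shape the queue size tracks the number of distinct results sent by the server, and the repeated objects exist only while a consumer is iterating.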

Contributor Author

That makes sense. Updated

@GumpacG
Contributor Author

GumpacG commented May 4, 2026

> Should the transport.py file be removed as well?

@kenhuuu it was removed. Git just interprets it as a rename rather than a deletion for some reason.
Screenshot 2026-05-04 at 2 26 09 PM

exc_is_null = stream.read(1)[0] == 0x01
status_exception = '' if exc_is_null else reader.to_object(stream, DataType.string, False)

if status_code not in (0, 200, 204):
Contributor

Is there a case we get a status code of 0?

Contributor Author

Good catch! No, there isn't. Removed.
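
For readers following the thread, the check with 0 dropped might look roughly like the sketch below. This is an assumption about the resulting shape, not the merged code: `ServerError`, `OK_STATUS_CODES`, and `check_status` are hypothetical names used only for illustration.

```python
class ServerError(Exception):
    """Hypothetical stand-in for the driver's server error type."""

    def __init__(self, status_code, message):
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code


# Only these codes are treated as success in this sketch; 0 is not among them.
OK_STATUS_CODES = (200, 204)


def check_status(status_code, status_message=""):
    """Raise if the status code read from the response is not a success code."""
    if status_code not in OK_STATUS_CODES:
        raise ServerError(status_code, status_message)


check_status(200)  # passes silently
check_status(204)  # passes silently

try:
    check_status(500, "boom")
    raised = False
except ServerError as exc:
    raised = exc.status_code == 500
```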

@xiazcy
Contributor

xiazcy commented May 5, 2026

VOTE +1

@kenhuuu
Contributor

kenhuuu commented May 5, 2026

VOTE +1

Comment thread docs/src/upgrade/release-4.x.x.asciidoc Outdated
Comment thread docs/src/upgrade/release-4.x.x.asciidoc Outdated
Comment thread docs/src/upgrade/release-4.x.x.asciidoc Outdated
Comment thread gremlin-python/src/main/python/gremlin_python/driver/serializer.py Outdated
Comment thread gremlin-python/src/main/python/gremlin_python/driver/connection.py Outdated
GumpacG added 3 commits May 5, 2026 15:49
Assisted-by: Devin: Claude Opus 4.7
Assisted-by: Devin: Claude Opus 4.7
Assisted-by: Devin: Claude Opus 4.7
@Cole-Greer
Contributor

VOTE +1

5 participants