Skip to content

Cookie middleware does not work for do_handshake #755

@WeCodingNow

Description

@WeCodingNow

Describe the bug, including details regarding any error messages, version, and platform.

Problem

Versions:
arrow-go: v18.5.2
python package adbc-driver-flightsql: 1.11.0
java dependency flight-sql-jdbc-driver: 18.3.0

I have an Arrow Flight SQL server that implements a do_handshake method in the following way:

  • do basic auth (via OAuth2 ROPC flow in an external OAuth2 provider), return some token in the response and attach an Authorization: Bearer <token> header to the response
  • also create a session, generate a session_id and attach it to the response as a session_id cookie, via a set-cookie header.

Context to why I am returning two different identification things: It is not possible for me to use the session_id as the token or include it in the token, because the token is generated by an IAM and it does not know anything about my server's sessions; and it is used only for authenticating users. The session_id is used to group together resources allocated for actually executing queries of the user in a scope of one session.

What I expect is that after do_handshake returns a token with headers like { "authorization": "Bearer <token>", "set-cookie": "session_id=<session_id>" }, all following requests from the authenticated client to the server would include both the authorization header and the cookie header with the session_id cookie.

The problem is that it seems that go's adbc client implementation (used by the python's flightsql driver) does not respect a set-cookie header returned by the server in response to a do_handshake call. Requests that follow the do_handshake authentication from the client to the server do not include the cookies set by response to a do_handshake call, but they do include the correct authorization header.

I am testing this functionality by using a python script that uses adbc_driver_flightsql package (ver. 1.11.0) to establish a connection and execute some queries on the server. The script is something like this:

Details
#!/bin/env python3

from adbc_driver_flightsql import DatabaseOptions
import adbc_driver_flightsql.dbapi
import readline
import argparse
import os
import atexit
import pandas as pd

parser = argparse.ArgumentParser(
    prog="flight_client", description="Simple flight SQL client"
)

parser.add_argument("uri")
parser.add_argument("-u", "--username")
parser.add_argument("-p", "--password")
parser.add_argument("-t", "--token")
parser.add_argument("-tr", "--token-raw")

args = parser.parse_args()

db_kwargs = {
    # enable cookies for session management
    DatabaseOptions.WITH_COOKIE_MIDDLEWARE.value: "true",  # <----- use the cookies middleware
}

if args.username is not None and args.password is not None:
    db_kwargs["username"] = args.username
    db_kwargs["password"] = args.password

if args.token is not None:
    db_kwargs["adbc.flight.sql.authorization_header"] = f"Bearer {args.token}"

if args.token_raw is not None:
    db_kwargs["adbc.flight.sql.authorization_header"] = args.token_raw

conn = adbc_driver_flightsql.dbapi.connect(
    f"grpc://{args.uri}",
    db_kwargs=db_kwargs,
)

# do stuff

The set-cookie header is handled correctly if it is returned by other methods, e.g. GetFlightInfoSqlInfo.

The JDBC arrow flight sql driver handles the set-cookie header correctly in all cases, including when it is returned by the call to a do_handshake endpoint.

Analysis (WARNING: LLM USAGE)

I have briefly analyzed the code of the arrow-adbc golang package with an LLM (GLM 5) and it pointed out a possible race condition in arrow-go when handling streaming responses, such as one used in do_handshake.

Full analysis:

Details

Cookie Handling in Arrow Flight Client: Handshake vs GetFlightInfo

Root Cause

The set-cookie header is handled correctly for GetFlightInfo but not for Handshake due to a timing mismatch in the streaming middleware's HeadersReceived callback:

  • Unary RPCs (GetFlightInfo): Headers are captured synchronously via grpc.Header() option during the call
  • Streaming RPCs (Handshake): HeadersReceived is only called in finishFn, which executes after the stream completes (when Recv() returns io.EOF)

The authentication code reads headers before calling Recv(), but the middleware only calls HeadersReceived after Recv() completes - creating a race where cookies aren't in the jar when needed.


Sequence Diagrams

GetFlightInfo (Unary RPC) - ✅ Works Correctly

sequenceDiagram
    participant App as Application
    participant CM as Cookie Middleware
    participant Auth as Auth Interceptor
    participant gRPC as gRPC Channel

    App->>CM: GetFlightInfo(ctx, req)
    
    Note over CM: StartCall(ctx) - Add cookies to outgoing context
    CM->>Auth: Invoke with modified ctx
    Note over Auth: Add auth token to context
    Auth->>gRPC: Invoke(ctx, method, req, reply)
    
    gRPC-->>Auth: Response + Headers
    Auth-->>CM: Response + Headers
    
    Note over CM: grpc.Header(&hdrmd) captured synchronously
    CM->>CM: HeadersReceived(ctx, md)
    CM->>CM: Store set-cookie in jar
    
    CM-->>App: Return response
Loading

Handshake (Streaming RPC) - ❌ Race Condition

sequenceDiagram
    participant App as Application
    participant CM as Cookie Middleware
    participant Auth as Auth Interceptor
    participant gRPC as gRPC Channel
    participant Stream as clientStream Wrapper

    App->>CM: Handshake(ctx)
    Note over CM: StartCall(ctx) - Add cookies to outgoing context
    CM->>Auth: Stream(ctx, desc, method)
    Note over Auth: Skips auth for /Handshake method
    Auth->>gRPC: Streamer(ctx, desc, method)
    
    gRPC-->>Auth: ClientStream
    Auth-->>CM: ClientStream
    Note over CM: Wrap stream in clientStream with deferred finishFn
    CM-->>App: Return wrapped stream
    
    Note over App: Authentication flow begins
    App->>Stream: stream.Header()
    Stream->>gRPC: cs.Header()
    gRPC-->>Stream: headers (contains set-cookie!)
    Stream-->>App: headers
    
    rect rgb(255, 200, 200)
        Note over App,CM: PROBLEM: HeadersReceived NOT called yet!
        Note over App: App processes headers directly
        Note over App: Cookie jar NOT updated
    end
    
    App->>Stream: stream.Recv()
    Stream->>gRPC: RecvMsg()
    gRPC-->>Stream: io.EOF
    Stream->>Stream: finishFn(nil)
    
    Note over Stream: NOW HeadersReceived is called
    Stream->>CM: HeadersReceived(ctx, md)
    Note over CM: Too late! App already processed response
Loading

Key Code Locations

Component File Line
Unary interceptor header capture client.go 100-107
Streaming interceptor deferred header processing client.go 154-157
finishFn called on Recv() EOF client.go 212
Auth reads headers before Recv() client_auth.go 310+
Auth interceptor bypasses Handshake client_auth.go 80

Summary Table

Aspect Unary RPC (GetFlightInfo) Streaming RPC (Handshake)
gRPC method Invoke() NewStream()
Header capture Via grpc.Header() option Via stream.Header() call
When captured During invoke, before return In finishFn, after stream ends
HeadersReceived timing Synchronous with response Deferred until stream completion
Cookie jar updated Before caller sees response After caller has processed response

Potential Solutions

  1. Modify clientStream.Header() to trigger HeadersReceived immediately when headers are first retrieved
  2. Eager header capture in the streaming interceptor after stream creation
  3. Application-level workaround in authentication code to explicitly process cookies during Handshake

Component(s)

Other

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type: bugSomething isn't working

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions