-
Notifications
You must be signed in to change notification settings - Fork 237
Description
We use v1.SubscriberClient#streamingPull method to create a bidirectional stream which we use to consume and ack Pub/Sub messages. After 15 minutes client throws the DEADLINE_EXCEEDED (code 4) error. We have put a lot of effort to investigate why and where this error is thrown.
There is a default 15 minute timeout for streamingPull RPC call which is defined in @google-cloud/pubsub library here. It is possible to override this timeout by providing a different value when calling streamingPull method like so:
const subscriber = new v1.SubscriberClient()
const stream = subscriber.streamingPull({ timeout: 60 * 60 * 1000 /* 1 hour */ })However, this only postpones the throwing of the DEADLINE_EXCEEDED error.
We have tracked down where this error is thrown, and it happens in @grpc/grpc-js library in ResolvingCall class in runDeadline method. This timer which is set on creation of the ResolvingCall class should be cleared when the response is received in the outputStatus method. The outputStatus method is called from onReceiveStatus handler, or from onReceiveMessage handler if certain conditions are true.
However, onReceiveStatus handler is NEVER called, which can be seen if you turn on debug logs:
GRPC_NODE_VERBOSITY=DEBUG
GRPC_NODE_TRACE=resolving_call # only focus on resolving_call class
Then you can clearly see that resolving_call is instantiated and that timer is created.
D 2024-02-08T14:46:36.689Z | v1.9.14 54511 | resolving_call | [0] Created
D 2024-02-08T14:46:36.689Z | v1.9.14 54511 | resolving_call | [0] Deadline: 2024-02-08T15:01:36.688Z
D 2024-02-08T14:46:36.689Z | v1.9.14 54511 | resolving_call | [0] Deadline will be reached in 899999ms
When message is received you also see:
D 2024-02-08T14:50:18.370Z | v1.9.14 54731 | resolving_call | [0] Received message
D 2024-02-08T14:50:18.371Z | v1.9.14 54731 | resolving_call | [0] Finished filtering received message
But you never see Received status log, which means that outputStatus method never gets called to reset the DEADLINE_EXCEEDED timer.
I believe that this is a bug, and keep in mind that this only happens with v1.SubscriberClient, but works correctly with latest Subscription class. When using that class, grpc debug output looks different, and you see 'Received status' logs, which means that timer gets restarted every time you receive the message.
D 2024-02-08T14:58:18.115Z | v1.9.14 55006 | resolving_call | [20] Received metadata
D 2024-02-08T14:58:18.115Z | v1.9.14 55006 | resolving_call | [20] Received message
D 2024-02-08T14:58:18.115Z | v1.9.14 55006 | resolving_call | [20] Finished filtering received message
D 2024-02-08T14:58:18.119Z | v1.9.14 55006 | resolving_call | [20] Received status
D 2024-02-08T14:58:18.119Z | v1.9.14 55006 | resolving_call | [20] ended with status: code=0 details=""
Environment details
- OS: MacOS 14.2.1
- Node.js version: 20.11.0
- npm version: 10.2.4
@google-cloud/pubsubversion: 4.3.0
Steps to reproduce
Check the repository with the code and instructions to reproduce the issue: