Verify GRPC channel connectivity state READY to IDLE transition #332

@CMCDragonkai

Description

Specification

In #224 and #310 we introduced the ability for our GRPCClient to react to state changes in the underlying client's channel. In particular, we call this.destroy() when the underlying channel is no longer live.

There is one state transition we were not able to test for: when the underlying channel state goes from READY to IDLE.
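
For context, the reaction in #224 and #310 is driven by a channel state watcher. A minimal sketch of such a watcher using the public grpc-js channel API is below; watchForIdle and onIdle are hypothetical names for illustration, not the actual GRPCClient code.

  import { Channel, connectivityState } from '@grpc/grpc-js';

  // Watch the channel until it moves from READY to IDLE, then fire onIdle.
  function watchForIdle(channel: Channel, onIdle: () => void): void {
    const current = channel.getConnectivityState(false);
    // Wait for any state change away from `current`, with a generous deadline
    const deadline = new Date(Date.now() + 60 * 60 * 1000);
    channel.watchConnectivityState(current, deadline, (error) => {
      if (error != null) return; // deadline elapsed without a state change
      const next = channel.getConnectivityState(false);
      if (current === connectivityState.READY && next === connectivityState.IDLE) {
        onIdle(); // this is where GRPCClient would call this.destroy()
      } else if (next !== connectivityState.SHUTDOWN) {
        watchForIdle(channel, onIdle); // keep watching across other transitions
      }
    });
  }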

According to #310 (comment), this can occur in one of 2 ways:

  1. when the server closes the connection (already have a test for this)
  2. when the underlying GRPC TTL times out (no test for this yet)

So the issue is that we want to trigger the GRPC TTL and test that this transition actually occurs.
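
A hedged sketch of what such a test could look like against a bare grpc-js server is below; at the default idle timeout (discussed next) the final wait is impractical, which is exactly the problem.

  import * as grpc from '@grpc/grpc-js';

  // Jest-style sketch, not an actual test in our repo: drive a client channel
  // to READY and (in principle) wait for it to drop back to IDLE.
  test('channel goes READY to IDLE after the idle timeout', async () => {
    const server = new grpc.Server();
    const port = await new Promise<number>((resolve, reject) => {
      server.bindAsync(
        '127.0.0.1:0',
        grpc.ServerCredentials.createInsecure(),
        (err, boundPort) => (err != null ? reject(err) : resolve(boundPort)),
      );
    });
    server.start();
    const client = new grpc.Client(
      `127.0.0.1:${port}`,
      grpc.credentials.createInsecure(),
    );
    // Force a connection attempt so the channel reaches READY
    await new Promise<void>((resolve, reject) => {
      client.waitForReady(Date.now() + 5000, (err) =>
        err != null ? reject(err) : resolve(),
      );
    });
    const channel = client.getChannel();
    expect(channel.getConnectivityState(false)).toBe(grpc.connectivityState.READY);
    // Missing piece: with the default idle timeout there is no practical way to
    // wait out READY -> IDLE here unless the timeout can be shortened.
    client.close();
    server.forceShutdown();
  });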

Does this GRPC TTL actually exist? According to the C++ docs, it is controlled by this parameter: https://grpc.github.io/grpc/core/group__grpc__arg__keys.html#ga51ab062269cd81298f5adb6fd9a45e99, and the default value is 30 minutes. That would explain why @tegefaulkes couldn't observe the transition after running it for only 6 minutes.

Furthermore, the transition table at https://github.com/grpc/grpc/blob/master/doc/connectivity-semantics-and-api.md indicates that READY becomes IDLE after the idle timeout passes.

In the grpc-js source code, there is no mention of an idle timeout parameter among the channel options, so it's possible that this hasn't been exposed in grpc-js, or perhaps not implemented at all:

  /**
   * An interface that contains options used when initializing a Channel instance.
   */
  export interface ChannelOptions {
    'grpc.ssl_target_name_override'?: string;
    'grpc.primary_user_agent'?: string;
    'grpc.secondary_user_agent'?: string;
    'grpc.default_authority'?: string;
    'grpc.keepalive_time_ms'?: number;
    'grpc.keepalive_timeout_ms'?: number;
    'grpc.keepalive_permit_without_calls'?: number;
    'grpc.service_config'?: string;
    'grpc.max_concurrent_streams'?: number;
    'grpc.initial_reconnect_backoff_ms'?: number;
    'grpc.max_reconnect_backoff_ms'?: number;
    'grpc.use_local_subchannel_pool'?: number;
    'grpc.max_send_message_length'?: number;
    'grpc.max_receive_message_length'?: number;
    'grpc.enable_http_proxy'?: number;
    'grpc.http_connect_target'?: string;
    'grpc.http_connect_creds'?: string;
    'grpc.default_compression_algorithm'?: CompressionAlgorithms;
    'grpc.enable_channelz'?: number;
    'grpc-node.max_session_memory'?: number;
    // eslint-disable-next-line @typescript-eslint/no-explicit-any
    [key: string]: any;
  }
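
For reference, channel options are passed as the third argument when constructing a grpc-js client. Below is a hedged sketch of what opting into a short idle timeout might look like if it were supported; 'grpc.client_idle_timeout_ms' is our reading of the C core parameter linked above, and since it is absent from ChannelOptions, grpc-js would most likely accept it via the catch-all index signature but ignore it.

  import * as grpc from '@grpc/grpc-js';

  const client = new grpc.Client(
    'localhost:50051',
    grpc.credentials.createInsecure(),
    {
      'grpc.keepalive_time_ms': 20000,
      // Hypothetical: request a 1 minute idle timeout; currently a no-op in grpc-js
      'grpc.client_idle_timeout_ms': 60000,
    },
  );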

A question about this has been posted: https://groups.google.com/g/grpc-io/c/yq4pmBGaXOQ

@joshuakarp please also create an issue on the grpc-js GitHub requesting it as a feature addition.

Additional context

  • Upstream issue for the feature request: Set a channel's idle timeout grpc/grpc-node#2046

  • Propagate networking connection error handling into NodeConnection error handling #224 - original issue about exposing channel state to our GRPC abstractions; the state diagram from that discussion is reproduced below.

    Details
    @startuml
    ' Inactive connection
    state "Inactive connection" as NC1 {
      NC1 : running: false
      NC1 : destroyed: false
      NC1 : status: starting
      state "Client" as Client1 {
        Client1 : IDLE
      }
    }
    
    NC1 --> NC2 : attempt connection
    
    ' Connecting to target
    state "Connecting to target" as NC2 {
      NC2 : running: false
      NC2 : destroyed: false
      NC2 : status: starting
      state "Client" as Client2 {
        Client2 : CONNECTING
      }
    }
     
    NC2 --> NC3 : successfully connected to target
    
    ' Active connection
    state "Active connection" as NC3 {
      NC3 : running: true
      NC3 : destroyed: false
      NC3 : status: starting
      state "Client" as Client3 {
        Client3 : READY
      }
    }
    
    NC3 --> NC3 : gRPC call performed
    'NC3 --> NC8 : gRPC TTL expires
    NC3 --> NC9 : server closes connection
    NC3 --> NC5 : error during gRPC call
    
    ' Idle connection
    'state "Idle connection" as NC8 {
    '  NC8 : running: true
    '  NC8 : destroyed: false
    '  NC8 : status: starting
    '  state "Client" as Client8 {
    '    Client8 : IDLE
    '  }
    '}
    
    'NC8 --> NC3 : gRPC call invoked
    
    ' Dropped connection
    state "Dropped connection" as NC9 {
      NC9 : running: true
      NC9 : destroyed: false
      NC9 : status: destroying
      state "Client" as Client9 {
        Client9 : IDLE
      }
    }
    
    NC2 --> NC5 : timed out | failed to validate certificate
    NC9 --> NC7 : channel watcher destroys\n client and shuts down
    
    ' Failed certificate validation
    state "Connection error" as NC5 {
      NC5 : running: true
      NC5 : destroyed: false
      NC5 : status: starting
      state "Client" as Client5 {
        Client5 : TRANSIENT_FAILURE
      }
    }
    
    NC3 --> NC6 : destroy connection | connection TTL expired
    NC5 --> NC7 : destroy client
    
    ' Destroy NodeConnection
    state "Destroying connection" as NC6 {
      NC6 : running: true
      NC6 : destroyed: false
      NC6 : status: destroying
      state "Client" as Client6 {
        Client6 : READY
      }
    }
    
    NC6 --> NC7 : destroy client
    
    ' Destroy client
    state "Destroyed connection" as NC7 {
      NC7 : running: false
      NC7 : destroyed: true
      NC7 : status: destroying
      state "Client" as Client7 {
        Client7 : SHUTDOWN
      }
    }
    
    ' Destroy client
    'state "Destroyed connection" as NC10 {
    '  NC10 : running: false
    '  NC10 : destroyed: true
    '  NC10 : status: destroying
    '  state "Client" as Client10 {
    '    Client10 : IDLE
    '  }
    '}
    @enduml
    

Tasks

  1. ...
  2. ...
  3. ...

Metadata

Labels

  • development (Standard development)
  • epic (Big issue with multiple subissues)
  • r&d:polykey:core activity 1 (Secret Vault Sharing and Secret History Management)
