Currently, the request submits the task to the grpc-engine directly. When the runnable threads in the engine are exhausted, the request is blocked until the task is submitted. This rarely happens, because:
- the gRPC call is IO-bound
- GOMAXPROCS is counted per worker, so the total is much higher than the actual number of processors
- Go's scheduler is fantastic

To eliminate this potential risk, we can add a background thread that acts as the consumer in a producer-consumer pattern. The request can then communicate with the thread via ngx_thread_cond_signal / ngx_thread_cond_wait.
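
A minimal sketch of that producer-consumer hand-off, assuming a fixed-size in-memory queue. It uses plain pthread calls as stand-ins, since nginx's ngx_thread_cond_signal / ngx_thread_cond_wait are thin wrappers around pthread_cond_signal / pthread_cond_wait. All names here (task_queue_t, queue_push, queue_pop, consumer_main, run_demo, QUEUE_CAP) are hypothetical and not taken from the module, and the int task is only a placeholder for whatever is really handed to the grpc-engine.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAP 64 /* hypothetical capacity, chosen only for this sketch */

typedef struct {
    int             tasks[QUEUE_CAP]; /* placeholder for real task pointers */
    size_t          head, tail, count;
    bool            closed;
    pthread_mutex_t mtx;
    pthread_cond_t  not_empty;
} task_queue_t;

static void queue_init(task_queue_t *q) {
    q->head = q->tail = q->count = 0;
    q->closed = false;
    pthread_mutex_init(&q->mtx, NULL);
    pthread_cond_init(&q->not_empty, NULL);
}

/* Producer side (request path): enqueue and return without waiting on the
 * engine. Returns false when the queue is full or closed, so the caller
 * can retry or reject the request. */
static bool queue_push(task_queue_t *q, int task) {
    bool ok = false;
    pthread_mutex_lock(&q->mtx);
    if (!q->closed && q->count < QUEUE_CAP) {
        q->tasks[q->tail] = task;
        q->tail = (q->tail + 1) % QUEUE_CAP;
        q->count++;
        ok = true;
        pthread_cond_signal(&q->not_empty); /* wake the consumer thread */
    }
    pthread_mutex_unlock(&q->mtx);
    return ok;
}

/* Mark the queue closed and wake the consumer so it can drain and exit. */
static void queue_close(task_queue_t *q) {
    pthread_mutex_lock(&q->mtx);
    q->closed = true;
    pthread_cond_broadcast(&q->not_empty);
    pthread_mutex_unlock(&q->mtx);
}

/* Consumer side: sleep until a task arrives. Returns false once the queue
 * is closed and fully drained. */
static bool queue_pop(task_queue_t *q, int *task) {
    pthread_mutex_lock(&q->mtx);
    while (q->count == 0 && !q->closed) {
        pthread_cond_wait(&q->not_empty, &q->mtx);
    }
    if (q->count == 0) {
        pthread_mutex_unlock(&q->mtx);
        return false;
    }
    assert(q->count <= QUEUE_CAP);
    *task = q->tasks[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    pthread_mutex_unlock(&q->mtx);
    return true;
}

typedef struct {
    task_queue_t *q;
    long          submitted;
} consumer_ctx_t;

/* Background thread: the only place that would call into the engine. */
static void *consumer_main(void *arg) {
    consumer_ctx_t *ctx = arg;
    int task;
    while (queue_pop(ctx->q, &task)) {
        /* A real module would submit the task to the grpc-engine here;
         * if that call blocks, only this thread is stalled. */
        ctx->submitted++;
    }
    return NULL;
}

/* Demo driver: enqueue n tasks and return how many the consumer handled. */
static long run_demo(int n) {
    task_queue_t q;
    consumer_ctx_t ctx = { &q, 0 };
    pthread_t tid;

    queue_init(&q);
    if (pthread_create(&tid, NULL, consumer_main, &ctx) != 0) {
        return -1;
    }
    for (int i = 0; i < n; i++) {
        while (!queue_push(&q, i)) {
            sched_yield(); /* queue full: the demo simply retries */
        }
    }
    queue_close(&q);
    pthread_join(tid, NULL);
    pthread_mutex_destroy(&q.mtx);
    pthread_cond_destroy(&q.not_empty);
    return ctx.submitted;
}
```

In this sketch the request path only holds the mutex long enough to enqueue, so a stalled engine delays the background thread rather than the request. What to do when the queue is full (retry, reject, or grow the queue) is left open here and would need to be decided for the real module.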