The gRPC `RateLimitService` has only one method, `ShouldRateLimit`, and `hits_addend` is assumed to be 1 when no value has been explicitly set, so there is no way to check quota without also incrementing it.
Why would a person want to get quota without incrementing it?
One example is if you're dealing with expensive requests but you don't know how expensive they are until after they've run. Maybe you add up the CPU seconds spent serving a request and increment it at the end of the request. You'd still need to check (but not increment) at the beginning of requests whether the CPU-second quota has been exhausted by previous requests during the current time unit.
A hacky workaround today might be to make sure that expensive requests are measured in large numbers (like hundreds of thousands) and check quota by incrementing a very small number (like 1) that might effectively act as if you haven't incremented it at all.
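To make the workaround concrete, here is a rough sketch. It uses a hypothetical in-memory stand-in, not the real service: `IncrementOnlyLimiter`, `serve`, and the 10 CPU-second quota are all made up for illustration, and the stand-in simply assumes every call increments the counter.

```python
# Hypothetical in-memory stand-in for a limiter whose only operation
# increments the counter, mirroring the single ShouldRateLimit method.
class IncrementOnlyLimiter:
    def __init__(self, limit):
        self.limit = limit
        self.used = 0

    def should_rate_limit(self, hits_addend=1):
        """Always add hits_addend, then report whether the limit is exceeded."""
        self.used += hits_addend
        return self.used > self.limit


# Assumed quota: 10 CPU-seconds per time unit, tracked in microseconds so
# that a probe of 1 is negligible next to the cost of a real request.
LIMIT_US = 10 * 1_000_000
limiter = IncrementOnlyLimiter(LIMIT_US)


def serve(cpu_seconds):
    # Probe at the start of the request: this costs 1 microsecond of quota.
    if limiter.should_rate_limit(hits_addend=1):
        return "rejected"
    # ... the expensive work would run here ...
    # Record the real cost once it is known, at the end of the request.
    limiter.should_rate_limit(hits_addend=int(cpu_seconds * 1_000_000))
    return "served"


results = [serve(4.0), serve(4.0), serve(4.0), serve(4.0)]
print(results, limiter.used)
```

The probes are cheap but not free: each one still consumes a unit of quota, which is why this is only an approximation of a true read-only check.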
How might this be implemented?
Seems like there are a few possible approaches to getting quota without changing it:
- Do something clever and subtle like saying that a `hits_addend` equal to uint32_max means zero
- Add some kind of a `check_only` flag to the `RateLimitRequest` message
- Make a separate `IsRateLimited` method on `RateLimitService`
- Make a separate non-standard gRPC service specific to this rate limiter implementation so the `RateLimitService` in Envoy proxy doesn't need to change
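
A minimal sketch of how the first three options could behave, using a hypothetical in-memory model rather than the real proto or service. The `Request` and `Limiter` classes, the `sentinel_means_zero` switch, and the choice that a check reports over-limit only when usage already exceeds the limit are all assumptions for illustration; the sketch also assumes an unset `hits_addend` arrives as 0.

```python
from dataclasses import dataclass

UINT32_MAX = 2**32 - 1


@dataclass
class Request:
    hits_addend: int = 0      # assumed: 0 means "unset" and is treated as 1
    check_only: bool = False  # hypothetical flag (second option)


class Limiter:
    def __init__(self, limit):
        self.limit = limit
        self.used = 0

    def should_rate_limit(self, req, sentinel_means_zero=False):
        addend = req.hits_addend or 1
        # Second option: an explicit flag turns the call into a pure read.
        # First option: the uint32_max sentinel is reinterpreted as zero.
        if req.check_only or (sentinel_means_zero and addend == UINT32_MAX):
            addend = 0
        self.used += addend
        return self.used > self.limit

    def is_rate_limited(self):
        # Third option: a separate read-only method with the same answer.
        return self.should_rate_limit(Request(check_only=True))


lim = Limiter(limit=5)
before = lim.is_rate_limited()                     # read-only, nothing used yet
lim.should_rate_limit(Request(hits_addend=6))      # real cost recorded afterwards
after = lim.should_rate_limit(Request(check_only=True))
sentinel = lim.should_rate_limit(
    Request(hits_addend=UINT32_MAX), sentinel_means_zero=True
)
print(before, after, sentinel, lim.used)
```

All three variants leave the counter untouched on a check; they differ only in how the caller expresses the intent, which is mostly a question of how much the shared API has to change.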