Seems like we are missing resiliency while emitting webhooks. As discussed with @sairanjit, missed webhooks can leave the platform and the agent in inconsistent states. We need to check the impact on state recovery after failures (e.g. are any hard fetches made to sync with the latest state when a webhook fails to update it?).
Right now we have no way to retry webhooks. Points to look into:
- Error handling and queuing of failed webhooks, so they can be retried later.
- Stateful vs. stateless?
- How long do we hold onto failed webhooks?
- Can the retry logic be abused by a third-party consumer, e.g. by purposefully failing webhooks? [This might include understanding trusted and untrusted entities and their direct interaction with the agent via webhooks/websockets]
Similarly, need to look into websockets.
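
To make the webhook retry points above concrete, here is a rough sketch of a bounded in-memory retry queue. It is only an illustration, not how the agent works today: `WebhookRetryQueue`, `PendingWebhook`, the `send` callback and all limits (`max_attempts`, `max_age`, `max_pending`, the backoff values) are hypothetical names and numbers picked for the example. The caps on attempts, age and queue size are one possible answer to "how long do we hold onto them" and to a consumer that fails webhooks on purpose.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PendingWebhook:
    """A webhook delivery waiting to be (re)tried. Hypothetical structure."""
    url: str
    payload: dict
    created_at: float
    attempts: int = 0
    next_attempt_at: float = 0.0


class WebhookRetryQueue:
    """In-memory retry queue with exponential backoff and hard limits.

    `send` is a caller-supplied function that returns True on successful
    delivery; it stands in for the real HTTP call (assumption).
    """

    def __init__(
        self,
        send: Callable[[str, dict], bool],
        max_attempts: int = 5,      # give up after this many tries
        base_delay: float = 1.0,    # seconds before the first retry
        max_delay: float = 60.0,    # cap on the backoff delay
        max_age: float = 3600.0,    # drop entries older than this
        max_pending: int = 1000,    # bound memory a failing consumer can use
    ) -> None:
        self.send = send
        self.max_attempts = max_attempts
        self.base_delay = base_delay
        self.max_delay = max_delay
        self.max_age = max_age
        self.max_pending = max_pending
        self.pending: List[PendingWebhook] = []
        self.dropped: List[PendingWebhook] = []

    def enqueue(self, url: str, payload: dict, now: float) -> bool:
        """Queue a delivery; returns False when the queue is full."""
        if len(self.pending) >= self.max_pending:
            return False
        self.pending.append(
            PendingWebhook(url, payload, created_at=now, next_attempt_at=now)
        )
        return True

    def process_due(self, now: float) -> None:
        """Try every entry that is due at `now`; reschedule or drop failures."""
        still_pending: List[PendingWebhook] = []
        for item in self.pending:
            if now - item.created_at > self.max_age:
                self.dropped.append(item)  # held too long, give up
                continue
            if item.next_attempt_at > now:
                still_pending.append(item)  # not due yet
                continue
            item.attempts += 1
            try:
                delivered = self.send(item.url, item.payload)
            except Exception:
                delivered = False  # treat any send error as a failed attempt
            if delivered:
                continue
            if item.attempts >= self.max_attempts:
                self.dropped.append(item)
            else:
                # Exponential backoff: base, 2*base, 4*base, ... up to max_delay.
                delay = min(
                    self.base_delay * 2 ** (item.attempts - 1), self.max_delay
                )
                item.next_attempt_at = now + delay
                still_pending.append(item)
        self.pending = still_pending
```

Time is passed in as `now` so the queue can be driven by a timer or a test without sleeping. Entries that end up in `dropped` are the ones the platform would have to recover through a hard fetch of the latest state. Whether the queue should survive an agent restart (the stateful vs. stateless question) is left open here; this sketch keeps everything in memory.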