Skip to content

fix(rivetkit): retry serverless pool configuration#4887

Draft
NathanFlurry wants to merge 1 commit intomainfrom
mock-agentic-loop/serverless-configure-retry
Draft

fix(rivetkit): retry serverless pool configuration#4887
NathanFlurry wants to merge 1 commit intomainfrom
mock-agentic-loop/serverless-configure-retry

Conversation

@NathanFlurry
Copy link
Copy Markdown
Member

Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@railway-app
Copy link
Copy Markdown

railway-app Bot commented May 3, 2026

🚅 Deployed to the rivet-pr-4887 environment in rivet-frontend

Service Status Web Updated (UTC)
website 😴 Sleeping (View Logs) Web May 3, 2026 at 12:21 am
frontend-inspector ❌ Build Failed (View Logs) Web May 3, 2026 at 12:10 am
frontend-cloud ❌ Build Failed (View Logs) Web May 3, 2026 at 12:10 am
ladle ❌ Build Failed (View Logs) Web May 3, 2026 at 12:10 am
mcp-hub ✅ Success (View Logs) Web May 3, 2026 at 12:09 am
kitchen-sink ❌ Build Failed (View Logs) Web May 3, 2026 at 12:08 am

Copy link
Copy Markdown
Member Author

NathanFlurry commented May 3, 2026

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 3, 2026

Code Review

Overview

This PR fixes serverless pool configuration by:

  1. Adding retry logic (30 attempts, 1s interval) to configureServerlessPool
  2. Moving pool configuration to after handleServerlessRequest resolves
  3. Awaiting the pool configuration promise before returning the HTTP response (key behavioral change)
  4. Properly propagating errors instead of swallowing them

Issues

First HTTP response is blocked for up to 30 seconds

On the first cold start, if pool configuration is slow or requires all 30 retries, the first HTTP response is withheld for up to 30 seconds. The response is gated behind await this.#configureServerlessPoolPromise before return new Response(...). The original code fire-and-forgot pool configuration so it never affected response latency. If blocking is intentional (e.g. to ensure the worker is registered before the first response), that reasoning should be documented.

Programmer errors are retried unnecessarily

configureServerlessPoolOnce throws immediately for missing config fields (namespace, endpoint, configurePool). These are misconfiguration errors that will never succeed on retry — they waste up to 30 seconds before surfacing. They should fail fast without entering the retry loop.

Log field format deviates from project convention

The new code uses logger().warn({ msg: "...", errorMessage: stringifyError(error) }), but the existing pattern elsewhere in registry/index.ts passes the error as a structured field: logger().warn({ err }, "message"). Prefer the { err: error } form to match the rest of the codebase.


Suggestions

Consider exponential backoff instead of fixed interval

30 retries x 1s = fixed 30s total. For transient network failures, exponential backoff with jitter reduces load on the engine during restarts and is a better fit than a fixed polling interval.

Document why response is blocked on pool configuration

The behavioral shift from fire-and-forget to awaited is not explained. A brief comment at the await site explaining the invariant (e.g. "ensure this instance is registered in the pool before serving traffic") would prevent future regressions.

After final failure, handler throws on every subsequent request

Once all 30 attempts are exhausted and the promise rejects, #configureServerlessPoolPromise holds a permanently rejected promise. Every subsequent call to handler() will await it and throw. This is probably correct (matches the error message "restart this process"), but worth an explicit note.


Summary

The retry logic itself is a good addition — the original code silently swallowed pool configuration errors. The main concern is the first-request latency impact from awaiting pool configuration before returning the response, combined with programmer errors being retried 30 times. Addressing the fast-fail path for config errors and documenting the blocking intent would resolve the key issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant