Skip to content

Conversation

@NathanFlurry
Copy link
Member

No description provided.

@railway-app
Copy link

railway-app bot commented Jan 14, 2026

🚅 Deployed to the rivet-pr-3881 environment in rivet-frontend

Service Status Web Updated (UTC)
frontend-inspector 😴 Sleeping (View Logs) Web Jan 14, 2026 at 8:31 am
website 😴 Sleeping (View Logs) Web Jan 14, 2026 at 8:30 am
frontend-cloud 😴 Sleeping (View Logs) Web Jan 14, 2026 at 8:30 am

@railway-app railway-app bot temporarily deployed to rivet-frontend / rivet-pr-3881 January 14, 2026 02:41 Destroyed
Copy link
Member Author

NathanFlurry commented Jan 14, 2026


How to use the Graphite Merge Queue

Add the label merge-queue to this PR to add it to the merge queue.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@claude
Copy link

claude bot commented Jan 14, 2026

PR Review: Improved Serverless SSE Error Handling

Overview

This PR consolidates the error handling for serverless SSE events by:

  1. Merging ServerlessInvalidBase64 and ServerlessInvalidPayload into a single ServerlessInvalidSsePayload error
  2. Adding raw_payload to the error for better debugging
  3. Extracting SSE parsing logic into dedicated helper functions
  4. Handling the ping event type

Positive Changes ✅

1. Better Error Reporting

  • Adding raw_payload (truncated to 512 chars) significantly improves debuggability when invalid SSE messages are received
  • Using safe_slice correctly prevents panics on invalid UTF-8 boundaries

2. Code Organization

  • Extracting parse_to_serverless_server_event() and get_raw_event_data() improves readability and eliminates duplication
  • The new functions have clear single responsibilities

3. Handling Edge Cases

  • Now properly handles ping events (line 647-648)
  • Warns about unknown event types instead of crashing (line 658-660)

Issues & Recommendations 🔧

1. CRITICAL: Breaking Change - Test Must Be Updated

The test at engine/packages/engine/tests/actors_scheduling_errors.rs:964-974 still references the old error variants:

rivet_types::actor::RunnerPoolError::ServerlessInvalidPayload { message } => { ... }
rivet_types::actor::RunnerPoolError::ServerlessInvalidBase64 => { ... }

This needs to be updated to:

rivet_types::actor::RunnerPoolError::ServerlessInvalidSsePayload { message, raw_payload } => {
    tracing::info!(?message, ?raw_payload, "got ServerlessInvalidSsePayload as expected");
}

Action Required: Update the test before merging, or verify tests are passing in CI.


2. Documentation Out of Date

docs/engine/ACTOR_ERRORS.md:38-39 still documents the old error types:

- `ServerlessInvalidBase64` - Invalid base64 in SSE message
- `ServerlessInvalidPayload` - Invalid protocol payload

Should be updated to:

- `ServerlessInvalidSsePayload` - Invalid SSE payload (base64 or protocol deserialization error)

3. SDK Regeneration Required

The grep results show that the old error types are embedded in generated SDKs:

  • engine/sdks/go/api-full/types.go
  • engine/sdks/typescript/api-full/src/
  • engine/sdks/rust/api-full/

Action Required: Regenerate SDKs to include the new ServerlessInvalidSsePayload type with the raw_payload field. This is a breaking API change that affects SDK consumers.


4. Minor: Incomplete Logging Context

At line 659, when logging unknown event types:

tracing::warn!(event, "received unknown serverless sse message event kind");

Consider adding more context:

tracing::warn!(event, ?msg, "received unknown serverless sse message event kind");

This would help debugging by showing the full message, not just the event type string.


5. Code Style: Match Arms Consistency

The refactored code at lines 327-336 has inconsistent indentation. The nested if runner_id.is_none() check should perhaps be pulled out for clarity, or kept as-is for consistency with the draining path.

Current:

match payload {
    protocol::mk2::ToServerlessServer::ToServerlessServerInit(init) => {
        if runner_id.is_none() {
            runner_id = Some(...);
            // ...
        }
    }
}

This is fine, but note that in finish_non_critical_draining (line 480-489), the comment explicitly states why the if runner_id.is_none() is checked. Consider adding a similar comment here for consistency:

// Only initialize the runner once
if runner_id.is_none() {

Performance Considerations ⚡

Positive:

  • No performance regressions
  • safe_slice is efficient (single pass through char boundaries)
  • Raw payload is only captured on error path, not hot path

Security Considerations 🔒

Positive:

  • Truncating raw_payload to 512 chars prevents memory exhaustion from malicious payloads
  • Using safe_slice prevents potential panics from invalid UTF-8

Note: The raw payload could contain sensitive information. Consider whether this data should be logged or included in error responses. Since it is only included on error and truncated, the current approach seems reasonable.


Test Coverage 📊

Action Required:

  1. Update existing test in actors_scheduling_errors.rs
  2. Verify test passes after the update
  3. Consider adding a test case for the ping event handling

Summary

This is a solid refactoring that improves error handling and debuggability. However, there are blocking issues that must be addressed:

  1. ✅ Update the test to use new error variant
  2. ✅ Update documentation
  3. ✅ Regenerate SDKs (breaking API change)

Once these are addressed, this PR will be ready to merge.


Checklist for Author

  • Update test in actors_scheduling_errors.rs:964-974
  • Update docs/engine/ACTOR_ERRORS.md:38-39
  • Regenerate SDKs to include ServerlessInvalidSsePayload
  • Run cargo test to verify all tests pass
  • Consider adding comment at line 328: // Only initialize the runner once
  • Consider enhancing logging at line 659 to include ?msg

@pkg-pr-new
Copy link

pkg-pr-new bot commented Jan 14, 2026

More templates

@rivetkit/cloudflare-workers

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/cloudflare-workers@3881

@rivetkit/db

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/db@3881

@rivetkit/framework-base

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/framework-base@3881

@rivetkit/next-js

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/next-js@3881

@rivetkit/react

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/react@3881

rivetkit

pnpm add https://pkg.pr.new/rivet-dev/rivet/rivetkit@3881

@rivetkit/sql-loader

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/sql-loader@3881

@rivetkit/virtual-websocket

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/virtual-websocket@3881

@rivetkit/engine-runner

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner@3881

@rivetkit/engine-runner-protocol

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner-protocol@3881

commit: 41511a3

@NathanFlurry NathanFlurry force-pushed the 01-13-chore_engine_provide_detailed_error_for_invalid_serverless_sse_event branch from 8e7fba5 to 41511a3 Compare January 14, 2026 08:21
@railway-app railway-app bot temporarily deployed to rivet-frontend / rivet-pr-3881 January 14, 2026 08:21 Destroyed
@graphite-app
Copy link
Contributor

graphite-app bot commented Jan 14, 2026

Merge activity

  • Jan 14, 8:52 PM UTC: NathanFlurry added this pull request to the Graphite merge queue.
  • Jan 14, 8:52 PM UTC: CI is running for this pull request on a draft pull request (#3898) due to your merge queue CI optimization settings.
  • Jan 14, 8:53 PM UTC: Merged by the Graphite merge queue via draft PR: #3898.

@graphite-app graphite-app bot closed this Jan 14, 2026
@graphite-app graphite-app bot deleted the 01-13-chore_engine_provide_detailed_error_for_invalid_serverless_sse_event branch January 14, 2026 20:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants