need a river worker solution for cloud run #144

@btc

River workers in a Go binary on Cloud Run. How do they do periodic jobs with min instances at 0?

Short answer: they don't, and trying to make River's periodic jobs work under scale-to-zero is fighting the tool.
River's PeriodicJobs are scheduled in-process by a goroutine inside the running Client. If no instance is alive, nothing fires the schedule, and when an instance later wakes up River doesn't backfill missed runs by default (there's a RunOnStart flag, but that fires on every cold start, not "once for the slot you missed at 03:00"). So min-instances=0 plus River periodic jobs equals jobs that quietly don't run.
The pattern people actually use on Cloud Run:
Stop using River's periodic feature for scheduling. Use Cloud Scheduler as the clock. Point it at either an HTTP endpoint on your Cloud Run service or, cleaner, a Pub/Sub topic the service subscribes to. The handler's only job is to call riverClient.Insert(ctx, MyJobArgs{}, nil) and return. That insert is just a Postgres write, so it's fast and the request can finish immediately.
Then there's the second problem, which is the one that bites people: who actually runs the River job after it's inserted? With min-instances=0 and no traffic, no worker is polling Postgres. Two real options:
One, set CPU always-allocated and min-instances=1 on a small worker service. This is what most people end up doing because it costs a few dollars a month and everything Just Works. River polls, picks up the job, runs it. You're not really "scale to zero" anymore but you're also not paying for much.
Two, use Cloud Run Jobs (not Services) triggered directly by Cloud Scheduler, and skip River's worker model entirely for periodic work. The Job container starts, does the thing, exits. You only use River for the queue-driven async work that happens in response to user requests, where a request is already keeping an instance warm.
The trap to avoid: enqueueing a River job from an HTTP handler on a scale-to-zero service and assuming the background River worker goroutine in the same process will pick it up before Cloud Run freezes the instance. Without CPU-always-allocated, CPU is throttled to near zero once the request returns, and your worker goroutine stops making progress. The job sits in Postgres until something else wakes the instance. People hit this and think River is broken; it isn't, Cloud Run's execution model just doesn't match River's assumption that the process is continuously alive.
So: Cloud Scheduler for the clock, River for the queue, and either min-instances=1 with always-on CPU for the worker, or a separate Cloud Run Job for the periodic work itself.
