Skip to content

Add workflow spec-level retry policies#279

Merged
jamescmartinez merged 3 commits into
mainfrom
backoff-retry
Feb 11, 2026
Merged

Add workflow spec-level retry policies#279
jamescmartinez merged 3 commits into
mainfrom
backoff-retry

Conversation

@jamescmartinez
Copy link
Copy Markdown
Contributor

  • Clean up backoff/retry impl to be consistent across backends
  • Change backoff policy to use duration strings for intervals
  • Add user-configurable retries

Copilot AI review requested due to automatic review settings February 11, 2026 21:40
@jamescmartinez jamescmartinez changed the title backoff retry Add user-configurable retries Feb 11, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented Feb 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@jamescmartinez jamescmartinez changed the title Add user-configurable retries Expose workflow spec-level retry policies Feb 11, 2026
@jamescmartinez jamescmartinez changed the title Expose workflow spec-level retry policies Add workflow spec-level retry policies Feb 11, 2026
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR standardizes exponential backoff/retry behavior across OpenWorkflow backends by introducing a shared backoff utility, switching interval configuration to duration strings, and adding per-workflow configurable retry policies (including maximum attempts).

Changes:

  • Added core/backoff.ts with BackoffPolicy + computeBackoffDelayMs() and migrated worker poll backoff to use it.
  • Introduced workflow-level RetryPolicy (duration-string intervals + maximumAttempts) and centralized failure-state computation via computeFailedWorkflowRunUpdate().
  • Updated Postgres/SQLite backends and tests/docs to support passing and respecting a retry policy.

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
packages/openworkflow/workflow.ts Adds RetryPolicy, default policy, and shared failure-update computation.
packages/openworkflow/workflow.test.ts Adds tests for computeFailedWorkflowRunUpdate.
packages/openworkflow/core/backoff.ts New shared backoff computation using duration strings.
packages/openworkflow/core/backoff.test.ts Tests for backoff computation behavior.
packages/openworkflow/worker.ts Migrates poll backoff to shared utility; adds resolveRetryPolicy().
packages/openworkflow/worker.test.ts Adds tests for retry policy resolution and spec override behavior.
packages/openworkflow/backend.ts Extends FailWorkflowRunParams to include retryPolicy.
packages/openworkflow/sqlite/backend.ts Uses shared workflow failure-update computation instead of inline backoff logic.
packages/openworkflow/postgres/backend.ts Uses shared workflow failure-update computation instead of SQL backoff calculation.
packages/openworkflow/execution.ts Threads retryPolicy through execution to backend failure handling.
packages/openworkflow/index.ts Exposes RetryPolicy type from the public entrypoint.
packages/openworkflow/core/retry.ts Removes old default retry policy constant.
packages/docs/docs/workflows.mdx Documents per-workflow retryPolicy usage.
packages/docs/docs/retries.mdx Updates retry documentation to include maximumAttempts and policy fields.
packages/docs/docs/advanced-patterns.mdx Updates guidance to point to workflow-level retryPolicy.
packages/docs/docs/roadmap.mdx Marks configurable retry policies as completed.
ARCHITECTURE.md Documents retry policy behavior and how it affects failure/finalization.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +14 to +16
* Compute capped exponential backoff for a 1-based attempt number.
* @param policy - Backoff policy
* @param attempt - Attempt number (attempt 1 uses initial interval)
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The JSDoc says attempt is 1-based, but the implementation explicitly treats attempt = 0 as attempt 1 (and tests rely on that). Update the documentation to match the actual accepted inputs (e.g., "attempt >= 0" or "attempt 0 is treated as 1") to avoid misuse by callers.

Suggested change
* Compute capped exponential backoff for a 1-based attempt number.
* @param policy - Backoff policy
* @param attempt - Attempt number (attempt 1 uses initial interval)
* Compute capped exponential backoff for an attempt number (1-based; attempt 0 is treated as 1).
* @param policy - Backoff policy
* @param attempt - Attempt number (attempt >= 0; attempt 0 and 1 use the initial interval)

Copilot uses AI. Check for mistakes.
Comment on lines +320 to +321
if (!partial) return DEFAULT_WORKFLOW_RETRY_POLICY;
return { ...DEFAULT_WORKFLOW_RETRY_POLICY, ...partial };
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolveRetryPolicy() returns the DEFAULT_WORKFLOW_RETRY_POLICY object by reference when no partial is provided. Because the returned RetryPolicy is mutable, external callers can accidentally mutate the shared default and affect all future runs. Prefer always returning a new object (e.g., spread/structuredClone) and/or freezing the default to prevent accidental mutation.

Suggested change
if (!partial) return DEFAULT_WORKFLOW_RETRY_POLICY;
return { ...DEFAULT_WORKFLOW_RETRY_POLICY, ...partial };
const base: RetryPolicy = { ...DEFAULT_WORKFLOW_RETRY_POLICY };
if (!partial) return base;
return { ...base, ...partial };

Copilot uses AI. Check for mistakes.
partial?: Partial<RetryPolicy>,
): RetryPolicy {
if (!partial) return DEFAULT_WORKFLOW_RETRY_POLICY;
return { ...DEFAULT_WORKFLOW_RETRY_POLICY, ...partial };
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolveRetryPolicy() currently spreads partial over the defaults, so a caller can unintentionally override defaults with undefined fields (e.g., { maximumAttempts: undefined as any }), leading to surprising behavior like attempts >= maximumAttempts becoming false due to NaN. Consider normalizing the partial by ignoring undefined values and validating the resulting policy (non-negative durations, finite numbers) before returning it.

Suggested change
return { ...DEFAULT_WORKFLOW_RETRY_POLICY, ...partial };
// Normalize the partial policy by ignoring undefined values so they do not
// override defaults.
const normalized: Partial<RetryPolicy> = {};
for (const [key, value] of Object.entries(partial)) {
if (value !== undefined) {
(normalized as any)[key] = value;
}
}
const policy: RetryPolicy = {
...DEFAULT_WORKFLOW_RETRY_POLICY,
...normalized,
};
// Validate numeric fields to catch invalid configurations early.
for (const [key, value] of Object.entries(policy)) {
if (typeof value === "number") {
if (!Number.isFinite(value) || value < 0) {
throw new Error(
`Invalid retry policy: field "${key}" must be a non-negative, finite number`,
);
}
}
}
return policy;

Copilot uses AI. Check for mistakes.
workflowRunId: string;
workerId: string;
error: SerializedError;
retryPolicy: RetryPolicy;
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding retryPolicy as a required field on FailWorkflowRunParams is a breaking change for third-party Backend implementations (the Backend interface is part of the public surface via OpenWorkflowOptions). If backward compatibility is desired, consider making retryPolicy optional and defaulting it internally (or bumping the package version/changelog accordingly).

Suggested change
retryPolicy: RetryPolicy;
retryPolicy?: RetryPolicy;

Copilot uses AI. Check for mistakes.
Comment on lines +1382 to +1385
test("returns default policy when partial is undefined", () => {
const result = resolveRetryPolicy();
expect(result).toEqual(DEFAULT_WORKFLOW_RETRY_POLICY);
});
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test case claims to cover "partial is undefined" but calls resolveRetryPolicy() without actually passing undefined. To ensure the intended behavior is covered, call resolveRetryPolicy(undefined) in this test (or remove it if it's redundant with the previous case).

Copilot uses AI. Check for mistakes.
defineWorkflow,
defineWorkflowSpec,
isWorkflow,
RetryPolicy,
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RetryPolicy is a type-only export (export type RetryPolicy = ...). Importing it as a value (import { RetryPolicy } ...) can cause a runtime ESM error ("module does not provide an export") in Vitest/Vite transpilation. Change the import to a type-only import (e.g., import { ..., type RetryPolicy } or a separate import type).

Suggested change
RetryPolicy,
type RetryPolicy,

Copilot uses AI. Check for mistakes.
Comment thread packages/openworkflow/core/backoff.ts
@jamescmartinez jamescmartinez merged commit d828d64 into main Feb 11, 2026
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants