Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions openapi/openapiv2.json
Original file line number Diff line number Diff line change
Expand Up @@ -17957,6 +17957,10 @@
"type": "integer",
"format": "int32",
"title": "Starting at 1, how many attempts there have been to complete this task"
},
"scheduleToStartTimeout": {
"type": "string",
"description": "Limits time a workflow task can stay in a task queue before a worker picks it up.\nOnly set when the server enforces a schedule-to-start timeout (e.g. sticky or speculative WT on normal queue).\n"
}
}
},
Expand Down
9 changes: 9 additions & 0 deletions openapi/openapiv3.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -16718,6 +16718,15 @@ components:
type: integer
description: Starting at 1, how many attempts there have been to complete this task
format: int32
scheduleToStartTimeout:
pattern: ^-?(?:0|[1-9][0-9]{0,11})(?:\.[0-9]{1,9})?s$
type: string
description: |-
Limits time a workflow task can stay in a task queue before a worker picks it up.
Only set when the server enforces a schedule-to-start timeout (e.g. sticky or speculative WT on normal queue).

(-- api-linter: core::0140::prepositions=disabled
aip.dev/not-precedent: "to" is used to indicate interval. --)
WorkflowTaskStartedEventAttributes:
type: object
properties:
Expand Down
7 changes: 7 additions & 0 deletions temporal/api/history/v1/message.proto
Original file line number Diff line number Diff line change
Expand Up @@ -261,6 +261,13 @@ message WorkflowTaskScheduledEventAttributes {
google.protobuf.Duration start_to_close_timeout = 2;
// Starting at 1, how many attempts there have been to complete this task
int32 attempt = 3;
// Limits time a workflow task can stay in a task queue before a worker picks it up.
// Only set when the server enforces a schedule-to-start timeout (e.g. sticky or speculative WT on normal queue).
//
// (-- api-linter: core::0140::prepositions=disabled
// aip.dev/not-precedent: "to" is used to indicate interval. --)
google.protobuf.Duration schedule_to_start_timeout = 4;
Copy link
Member

@cretz cretz Feb 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify whether this has a specific recent need, or was this just something found on the backlog at temporalio/temporal#778? History fields are pretty scrutinized, so would want to confirm there is a pressing reason for this.

Also, the PR description is missing the accompanying server PR showing how this will be implemented.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change came from the backlog.
There isn’t a new production incident driving it.
The original issue was observability: the UI could show a user-configured schedule-to-start timeout (e.g. 10s) while the server used a different (e.g. hardcoded) value, so users couldn’t see what was actually in effect from history. The goal is to make history reflect the schedule-to-start timeout when the server enforces one, so there’s no pressing reason beyond that observability/consistency improvement. If you’d rather not add history fields for that alone, I’m happy to close or repurpose the PR.

Also thanks for pointing out the missing server PR I thought I added it in the description.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would like to understand details of how this value is present and/or updated or not on successive task failures or timeouts. I wouldn't expect speculative task failure is ever recorded, or is it? And as for sticky, I suppose it can only occur once because rescheduled task does not have a schedule to start?

Any reason not to add this field to WorkflowTaskTimedOutEventAttributes instead?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what I think is happening so far the server only sets schedule_to_start_timeout when it actually enforces a schedule-to-start timeout i.e. for sticky workflow tasks and for speculative workflow tasks on the normal queue. For a normal task on the normal queue the field stays unset, so it isn’t “updated” across successive timeouts in the usual retry sense; it’s only present on the scheduled events where that timeout applies.

For speculative tasks, the schedule-to-start timeout is recorded: when the timer fires we create the WORKFLOW_TASK_SCHEDULED event at that moment (in AddWorkflowTaskScheduleToStartTimeoutEvent) and then add the WORKFLOW_TASK_TIMED_OUT event, so the scheduled event and its timeout are written together.

For sticky, schedule-to-start only applies to the sticky task; after it times out we reschedule on the normal queue, and that new scheduled event doesn’t have a schedule-to-start timeout, so the field wouldn’t be set there.

I’m happy to put the duration on WorkflowTaskTimedOutEventAttributes instead it would make the timeout event self-contained (you’d see both the timeout type and the duration that fired), avoid ambiguity about when the field is set on the scheduled event, and better match “this is what timed out.” I can switch the change to add the field there (and only populate it when timeout_type is SCHEDULE_TO_START) if that’s the direction you prefer; otherwise I can keep it on the scheduled event and add a short comment documenting this lifecycle.


}

message WorkflowTaskStartedEventAttributes {
Expand Down