Updated proto with API changes #710
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
```diff
@@ -261,6 +261,13 @@ message WorkflowTaskScheduledEventAttributes {
     google.protobuf.Duration start_to_close_timeout = 2;
     // Starting at 1, how many attempts there have been to complete this task
     int32 attempt = 3;
+
+    // Limits time a workflow task can stay in a task queue before a worker picks it up.
+    // Only set when the server enforces a schedule-to-start timeout (e.g. sticky or speculative WT on normal queue).
+    //
+    // (-- api-linter: core::0140::prepositions=disabled
+    //     aip.dev/not-precedent: "to" is used to indicate interval. --)
+    google.protobuf.Duration schedule_to_start_timeout = 4;
```
Member

Would like to understand the details of how this value is present and/or updated (or not) on successive task failures or timeouts. I wouldn't expect a speculative task failure is ever recorded, or is it? And as for sticky, I suppose it can only occur once, because the rescheduled task does not have a schedule-to-start? Any reason not to add this field to `WorkflowTaskTimedOutEventAttributes` instead?
Author

This is what I think is happening so far: the server only sets `schedule_to_start_timeout` when it actually enforces a schedule-to-start timeout, i.e. for sticky workflow tasks and for speculative workflow tasks on the normal queue. For a normal task on the normal queue the field stays unset, so it isn't "updated" across successive timeouts in the usual retry sense; it's only present on the scheduled events where that timeout applies.

For speculative tasks, the schedule-to-start timeout is recorded: when the timer fires, we create the WORKFLOW_TASK_SCHEDULED event at that moment (in `AddWorkflowTaskScheduleToStartTimeoutEvent`) and then add the WORKFLOW_TASK_TIMED_OUT event, so the scheduled event and its timeout are written together. For sticky, schedule-to-start only applies to the sticky task; after it times out we reschedule on the normal queue, and that new scheduled event doesn't have a schedule-to-start timeout, so the field wouldn't be set there.

I'm happy to put the duration on `WorkflowTaskTimedOutEventAttributes` instead: it would make the timeout event self-contained (you'd see both the timeout type and the duration that fired), avoid ambiguity about when the field is set on the scheduled event, and better match "this is what timed out" (see the sketch after the diff). I can switch the change to add the field there (and only populate it when `timeout_type` is SCHEDULE_TO_START) if that's the direction you prefer; otherwise I can keep it on the scheduled event and add a short comment documenting this lifecycle.
```diff
 }

 message WorkflowTaskStartedEventAttributes {
```
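For concreteness, here is a sketch of the alternative shape discussed in the thread above. It is illustrative only, not part of this PR's diff: the existing `WorkflowTaskTimedOutEventAttributes` fields are reproduced from memory, and the new field number is an assumption.

```proto
message WorkflowTaskTimedOutEventAttributes {
    // The id of the `WORKFLOW_TASK_SCHEDULED` event this task corresponds to.
    int64 scheduled_event_id = 1;
    // The id of the `WORKFLOW_TASK_STARTED` event this task corresponds to, if it was started.
    int64 started_event_id = 2;
    temporal.api.enums.v1.TimeoutType timeout_type = 3;
    // Illustrative new field: the schedule-to-start timeout that fired.
    // Would only be populated when timeout_type is TIMEOUT_TYPE_SCHEDULE_TO_START.
    google.protobuf.Duration schedule_to_start_timeout = 4;
}
```

This would keep the timeout event self-contained: a reader of history sees both which timeout type fired and the duration that was enforced, without cross-referencing the scheduled event.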
Member
Can you clarify whether this has a specific recent need, or was this just something found on the backlog at temporalio/temporal#778? History fields are pretty scrutinized, so would want to confirm there is a pressing reason for this.
Also, the PR description is missing the accompanying server PR showing how this will be implemented.
Author
This change came from the backlog; there isn't a new production incident driving it. The original issue was observability: the UI could show a user-configured schedule-to-start timeout (e.g. 10s) while the server used a different (e.g. hardcoded) value, so users couldn't see what was actually in effect from history. The goal is to make history reflect the schedule-to-start timeout when the server enforces one, so there's no pressing reason beyond that observability/consistency improvement. If you'd rather not add history fields for that alone, I'm happy to close or repurpose the PR.

Also, thanks for pointing out the missing server PR; I thought I had added it in the description.