add time-skipping configuration to UpdateWorkflowExecutionOptions#9944
add time-skipping configuration to UpdateWorkflowExecutionOptions#9944feiyang3cat wants to merge 4 commits intotemporalio:mainfrom
UpdateWorkflowExecutionOptions#9944Conversation
1b8b155 to
311214e
Compare
UpdateWorkflowExecutionOptions
| nil, // priority | ||
| nil, // timeSkippingConfig | ||
| ) | ||
| return api.UpdateWorkflowWithoutWorkflowTask, err |
There was a problem hiding this comment.
use existing execution on wfID conflict doesn't involve ts config changes
9fd5698 to
4128a4a
Compare
| if err := priorities.Validate(opts.GetPriority()); err != nil { | ||
| return nil, err | ||
| } | ||
| if err := wh.validateTimeSkippingConfig(opts.GetTimeSkippingConfig(), namespace.Name(request.GetNamespace())); err != nil { |
There was a problem hiding this comment.
If frontend.TimeSkippingEnabled is turned off during a workflow execution, validateTimeSkippingConfig will not allow disabling time skipping for a workflow. validateTimeSkippingConfig will return an error in that case. User will not be able to disable time skipping explicitly if frontend.TimeSkippingEnabled is false. Is that the intended behaviour?
There was a problem hiding this comment.
yes, that is my understanding of how dynamic config feature flag works -> we don't allow users turn on a feature and have a bunch of workflows running with this feature and then turn off the feature and still want to try to manipulate the feature. this kind of flexibility will make the system complicated.
There was a problem hiding this comment.
I may understand this incorrectly. cc @yycptt for a confirmation
There was a problem hiding this comment.
Yeah that makes sense. The one I pointed out is not a normal case scenario.
There was a problem hiding this comment.
I guess we don't support removing a feature flag after users enable it? I don't know I feel that will be weird.
| opts.Priority = cloned | ||
| } | ||
| } | ||
| if timeSkippingInfo := ms.GetExecutionInfo().GetTimeSkippingInfo(); timeSkippingInfo != nil { |
There was a problem hiding this comment.
Is the GetTimeSkippingInfo returned from this line stored to a history event? If so, you should clone this pointer before returning. Just like the versioning info above.
I remember an issue where a pointer to the same struct was present in a history event and mutable state. In that case, it is possible that persistence layer could serialize history events and at the same time another thread can mutate the struct inside mutable state. That caused a data race error in a test cell and it was very hard to debug.
There was a problem hiding this comment.
it is not from history event, it is from mutableState (execution table)
There was a problem hiding this comment.
I think there is a lock for each mutableState change to guarantee the transaction. let me confirm it (@feiyang3cat todo)
There was a problem hiding this comment.
You are right that there is workflow lock protecting this in a single operation. Probably I misremembered this.
But still wouldn't this cause an issue in MergeAndApply() function? mergeWorkflowExecutionOptions() gets an object with reference to mutable state's TimeSkippingInfo. So mutable state itself is mutated inside mergeWorkflowExecutionOptions(). Which is not what is expected there. Then MergeAndApply() checks the returned value from mergeWorkflowExecutionOptions() and the value in mutableState to decide if something changed. But that check will not find any changes compared to mutableState. Because we now changed the value in mutableState itself.
There was a problem hiding this comment.
what does this mean -> " So mutable state itself is mutated inside mergeWorkflowExecutionOptions()."
There was a problem hiding this comment.
are you referring to something related to getOptionsFromMutableState uses proto.Clone
There was a problem hiding this comment.
yeah the issue when getOptionsFromMutableState doesn't use proto.clone. I see that you fixed that anyways. Thanks!
|
|
||
| // ==== Time Skipping Config | ||
| // nil means "no change" — only update if the caller provided an explicit value. | ||
| if _, ok := updateFields["timeSkippingConfig"]; ok { |
There was a problem hiding this comment.
Can updateFields contain something like timeSkippingConfig.bound if the client just wants to change that? I see that things like priority.fairnessWeight are checked in the above lines.
1857153 to
0dda3c6
Compare
0dda3c6 to
d72dcb3
Compare
| // Update time skipping config. | ||
| if tsc := attributes.GetTimeSkippingConfig(); tsc != nil { | ||
| ms.executionInfo.TimeSkippingInfo = &persistencespb.TimeSkippingInfo{ | ||
| Config: tsc, |
There was a problem hiding this comment.
I see that you have a TODO to add runtime details to TimeSkippingInfo. So wouldn't replacing ms.executionInfo.TimeSkippingInfo with a new struct clear all the runtime info?
What changed?
Frontend validation — validateTimeSkippingConfig is now invoked in:
Persistence
Events
Why?
Support update feature of time-skipping.
How did you test it?