[Fix-17817][Master]Add workflow timeout event and handle#18063
[Fix-17817][Master]Add workflow timeout event and handle#18063njnu-seafish wants to merge 9 commits intoapache:devfrom
Conversation
|
There was a problem hiding this comment.
Pull request overview
This PR addresses issue #17817 by introducing a workflow-level timeout lifecycle event/handler so that configured workflow timeout alerts are actually emitted by the master engine.
Changes:
- Add a new
WorkflowTimeoutLifecycleEvent+WorkflowTimeoutLifecycleEventHandler, and wire publishing fromWorkflowStartLifecycleEventHandler. - Add
TIMEOUTtoWorkflowLifecycleEventTypeto support routing the new event. - Add service/DAO plumbing for sending workflow-timeout alerts and refactor alert persistence helper naming; tighten task-timeout validation.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| dolphinscheduler-service/src/main/java/org/apache/dolphinscheduler/service/alert/WorkflowAlertManager.java | Adds sendWorkflowTimeoutAlert entrypoint to reach DAO alert creation. |
| dolphinscheduler-master/src/main/java/org/apache/dolphinscheduler/server/master/engine/workflow/lifecycle/handler/WorkflowTimeoutLifecycleEventHandler.java | New handler to send workflow timeout alerts when the timeout event fires. |
| dolphinscheduler-master/src/main/java/org/apache/dolphinscheduler/server/master/engine/workflow/lifecycle/handler/WorkflowStartLifecycleEventHandler.java | Publishes workflow timeout event at workflow start (if timeout configured). |
| dolphinscheduler-master/src/main/java/org/apache/dolphinscheduler/server/master/engine/workflow/lifecycle/event/WorkflowTimeoutLifecycleEvent.java | New delay event representing workflow timeout. |
| dolphinscheduler-master/src/main/java/org/apache/dolphinscheduler/server/master/engine/workflow/lifecycle/WorkflowLifecycleEventType.java | Adds TIMEOUT event type. |
| dolphinscheduler-master/src/main/java/org/apache/dolphinscheduler/server/master/engine/task/lifecycle/event/TaskTimeoutLifecycleEvent.java | Tightens validation to require timeout minutes > 0. |
| dolphinscheduler-dao/src/main/java/org/apache/dolphinscheduler/dao/AlertDao.java | Adds validation for workflow timeout alert inputs and consolidates timeout-alert saving helper. |
| dolphinscheduler-common/src/main/java/org/apache/dolphinscheduler/common/enums/AlertType.java | Updates comment to reflect actual enum values. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
| long delayTime = System.currentTimeMillis() - workflowInstance.getStartTime().getTime() | ||
| + TimeUnit.MINUTES.toMillis(timeout); |
|
|
||
| public static WorkflowTimeoutLifecycleEvent of(IWorkflowExecutionRunnable workflowExecutionRunnable) { | ||
| final WorkflowInstance workflowInstance = workflowExecutionRunnable.getWorkflowInstance(); | ||
| checkState(workflowInstance != null, "The workflow instance must be initialized before retrying."); |
| private void workflowTimeoutMonitor(final IWorkflowExecutionRunnable workflowExecutionRunnable) { | ||
| final WorkflowInstance workflowInstance = workflowExecutionRunnable.getWorkflowInstance(); | ||
| if (workflowInstance.getTimeout() <= 0) { | ||
| log.debug("The workflow {} timeout {} is invalided, so the timeout monitor will not be started.", |
| final int timeout = workflowInstance.getTimeout(); | ||
| checkState(timeout > 0, "The workflow timeout: %s must > 0 minutes", timeout); | ||
|
|
||
| long delayTime = System.currentTimeMillis() - workflowInstance.getStartTime().getTime() |
There was a problem hiding this comment.
When the workflow instance is rerun the startTime will not change. You should use restartTime .
There was a problem hiding this comment.
When the workflow instance is rerun the startTime will not change. You should use
restartTime.
good
| private void workflowTimeoutMonitor(final IWorkflowExecutionRunnable workflowExecutionRunnable) { | ||
| final WorkflowInstance workflowInstance = workflowExecutionRunnable.getWorkflowInstance(); | ||
| if (workflowInstance.getTimeout() <= 0) { | ||
| log.debug("The workflow {} timeout {} is invalided, so the timeout monitor will not be started.", |
There was a problem hiding this comment.
The log is incorrect.
fix it
| * Pause the workflow instance | ||
| */ | ||
| PAUSE, | ||
|
|
There was a problem hiding this comment.
Please don't change the unrelated code.
There was a problem hiding this comment.
Please don't change the unrelated code.
ok
added |


Was this PR generated or assisted by AI?
NO
Purpose of the pull request
close #17817
This bug involves a critical feature and has been stagnant for several months. Since I actually resolved the task timeout and workflow timeout issue quite some time ago, I would like to finalize the process and get it closed.

Brief change log
Add workflow timeout event and handle
Verify this pull request
This pull request is code cleanup without any test coverage.
(or)
This pull request is already covered by existing tests, such as (please describe tests).
(or)
This change added tests and can be verified as follows:
(or)
Pull Request Notice
Pull Request Notice
If your pull request contains incompatible change, you should also add it to
docs/docs/en/guide/upgrade/incompatible.md