Expected Behavior
When BeforeNodeCallEvent.cancel_node = True is set:
The node should be treated as successfully completed (or a new SKIPPED status) for dependency resolution purposes
Downstream nodes (step_c) should execute normally — the cancelled node should not block the graph
The graph should continue to completion or the next interrupt point
execution_order should either omit the skipped node or include it with a distinguishable status
No exception should propagate — cancel_node is an intentional control flow decision, not an error
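The expected behavior above can be illustrated with a toy model. This is only a sketch: `BeforeNodeCall`, `ToyGraph`, and `skip_step_b` are hypothetical stand-ins written for this example, not Strands classes, and the graph is assumed to be a simple linear chain.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Union


@dataclass
class BeforeNodeCall:
    """Hypothetical stand-in for BeforeNodeCallEvent (not the Strands class)."""
    node_id: str
    invocation_state: dict
    cancel_node: Union[bool, str] = False


@dataclass
class ToyGraph:
    """Toy linear graph; real dependency resolution is more involved."""
    order: List[str]
    hook: Callable[[BeforeNodeCall], None]
    executed: List[str] = field(default_factory=list)
    cancelled: Dict[str, str] = field(default_factory=dict)

    def run(self, invocation_state: dict) -> List[str]:
        for node_id in self.order:
            event = BeforeNodeCall(node_id, invocation_state)
            self.hook(event)
            if event.cancel_node:
                # Record the skip and move on; nothing is raised.
                reason = (
                    event.cancel_node
                    if isinstance(event.cancel_node, str)
                    else "node cancelled by user"
                )
                self.cancelled[node_id] = reason
                continue
            self.executed.append(node_id)
        return self.executed


def skip_step_b(event: BeforeNodeCall) -> None:
    """Hook that cancels step_b when the runtime state asks for it."""
    extracted = event.invocation_state.get("extracted", {})
    if event.node_id == "step_b" and extracted.get("skip_step_b"):
        event.cancel_node = True
```

With `skip_step_b` installed, running the toy graph over `step_a`, `step_b`, `step_c` executes `step_a` and `step_c`, and records `step_b` separately as cancelled so it stays distinguishable.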
Actual Behavior
Setting cancel_node = True raises RuntimeError that terminates the entire graph:
    # graph.py, _execute_node(), line ~896
    if before_event.cancel_node:
        cancel_message = (
            before_event.cancel_node
            if isinstance(before_event.cancel_node, str)
            else "node cancelled by user"
        )
        yield MultiAgentNodeCancelEvent(node.node_id, cancel_message)
        raise RuntimeError(cancel_message)  # ← kills the graph
The graph catches this as an unrecoverable failure
record.status becomes FAILED
All downstream nodes are abandoned
The workflow cannot be resumed — the next user message starts a brand new workflow, losing all accumulated state
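The failure mode can be reproduced in miniature without Strands. The sketch below is an assumption-laden simplification: `execute_node` and `run_graph` are hypothetical names, the generator is synchronous rather than async, and the string statuses only mimic the real ones. It shows that the cancel event is still delivered, but the exception raised right after it aborts the loop before `step_c` is reached.

```python
from typing import Iterator, List, Set, Tuple

Event = Tuple[str, str]


def execute_node(node_id: str, cancel: bool) -> Iterator[Event]:
    """Mimics the current pattern: yield a cancel event, then raise."""
    if cancel:
        yield ("cancel", node_id)
        raise RuntimeError("node cancelled by user")
    yield ("complete", node_id)


def run_graph(order: List[str], cancelled: Set[str]) -> Tuple[str, List[Event]]:
    """Consumes node events in order; any exception fails the whole run."""
    events: List[Event] = []
    status = "COMPLETED"
    try:
        for node_id in order:
            for event in execute_node(node_id, node_id in cancelled):
                events.append(event)
    except RuntimeError:
        status = "FAILED"  # downstream nodes are never visited
    return status, events
```

Calling `run_graph(["step_a", "step_b", "step_c"], {"step_b"})` ends with a failed status and no event for `step_c`, which mirrors the behavior reported here.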
Additional Context
The cancel_node feature was introduced to support the BeforeNodeCallEvent hook, but its current implementation treats cancellation as a fatal error rather than a control flow mechanism.
This behavior is consistent across versions 1.32.0 through 1.38.0.
The related feature request #1346 ("[FEATURE] Pass invocation_state to edge condition call") would provide an alternative path for conditional routing, but cancel_node should still work as a valid skip mechanism since it's exposed as a public API on the event object.
Our production workaround wraps skippable nodes in a no-op AgentBase implementation that checks the condition at call time and returns an empty AgentResult. This avoids cancel_node entirely but adds complexity and prevents proper skip tracking in execution_order.
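The shape of that workaround is roughly as follows. This is a sketch only: `SkippableNode` and `EmptyResult` are hypothetical names, the wrapper is a plain callable rather than a real AgentBase implementation, and the actual production code is not shown in this issue.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class EmptyResult:
    """Hypothetical stand-in for an empty AgentResult."""
    text: str = ""
    skipped: bool = True


@dataclass
class SkippableNode:
    """Wraps a node and decides at call time whether to run it."""
    inner: Callable[[str, Dict[str, Any]], Any]
    should_skip: Callable[[Dict[str, Any]], bool]

    def __call__(
        self, task: str, invocation_state: Optional[Dict[str, Any]] = None
    ) -> Any:
        state = invocation_state or {}
        if self.should_skip(state):
            # No-op path: the graph sees a normal completion, not a cancel.
            return EmptyResult()
        return self.inner(task, state)
```

Because the skip looks like a normal completion to the graph, nothing in execution_order marks the node as skipped, which is the tracking gap mentioned above.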
Possible Solution
Replace the RuntimeError in _execute_node() with graceful completion. In graph.py line ~896:
    if before_event.cancel_node:
        cancel_message = (
            before_event.cancel_node
            if isinstance(before_event.cancel_node, str)
            else "node cancelled by user"
        )
        logger.debug("reason=<%s> | skipping node execution", cancel_message)
        yield MultiAgentNodeCancelEvent(node.node_id, cancel_message)
        # Mark as completed so downstream nodes can proceed
        node.execution_status = Status.COMPLETED
        # Yield a minimal result so the graph can continue
        yield MultiAgentNodeCompleteEvent(
            node_id=node.node_id,
            result=AgentResult(
                stop_reason="end_turn",
                message={"role": "assistant", "content": [{"text": cancel_message}]},
                metrics=EventLoopMetrics(),
                state={},
            ),
        )
        return  # Exit cleanly instead of raising
This ensures:
The cancelled node is treated as completed for dependency resolution
Downstream nodes execute normally
execution_order includes the node (consumers can check MultiAgentNodeCancelEvent to distinguish skipped from executed)
No RuntimeError propagation — the graph continues
An alternative would be adding a Status.SKIPPED enum value that the graph treats identically to COMPLETED for edge traversal but is distinguishable in execution_order for observability.
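A minimal sketch of the Status.SKIPPED alternative, assuming a simplified status enum and a dependency map keyed by node id. The `Status` enum here is a stand-in defined for this example (its members other than SKIPPED are assumptions about the real one), and `is_ready` and `execution_order_report` are hypothetical helpers, not existing graph functions.

```python
from enum import Enum
from typing import Dict, List, Sequence, Tuple


class Status(Enum):
    """Stand-in enum; SKIPPED is the proposed (hypothetical) addition."""
    PENDING = "pending"
    EXECUTING = "executing"
    COMPLETED = "completed"
    FAILED = "failed"
    SKIPPED = "skipped"


# Statuses that satisfy a dependency for edge traversal.
SATISFIED = frozenset({Status.COMPLETED, Status.SKIPPED})


def is_ready(
    node_id: str,
    dependencies: Dict[str, Sequence[str]],
    statuses: Dict[str, Status],
) -> bool:
    """A node is ready once every dependency is COMPLETED or SKIPPED."""
    return all(
        statuses.get(dep, Status.PENDING) in SATISFIED
        for dep in dependencies.get(node_id, ())
    )


def execution_order_report(
    order: List[str], statuses: Dict[str, Status]
) -> List[Tuple[str, str]]:
    """Pairs each node with its status so skipped nodes stay visible."""
    return [(node_id, statuses[node_id].value) for node_id in order]
```

Traversal treats the two statuses the same, while the report keeps them apart, so observability does not depend on correlating separate cancel events.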
Checks
Strands Version
1.32.0
Python Version
3.13
Operating System
15.6.1
Installation Method
pip
Steps to Reproduce
Graph: step_a → step_b → step_c
step_a is an INPUT agent that calls interrupt() to pause for user input
step_b has a BeforeNodeCallEvent hook that sets cancel_node = True based on runtime state
step_c is a normal agent that should execute after step_b is skipped
Persistence: FileSessionManager or S3SessionManager
Turn 1: graph("task"). step_a interrupts and the graph pauses. Works fine.
Turn 2: graph(responses, invocation_state={"extracted": {"skip_step_b": True}}). step_a completes, the graph reaches step_b, and the hook sets cancel_node = True.
Result: RuntimeError("node cancelled by user") is raised at graph.py:~896, propagates through _execute_node → _stream_node_to_queue (line ~790) → _execute_nodes_parallel (line ~752) → raise event, and kills the entire graph. step_c never executes. The graph status becomes FAILED instead of continuing.
The issue only manifests on resume (Turn 2). On a fresh start without interrupts, cancel_node also raises but the graph hasn't persisted state yet so there's nothing to corrupt. On resume, the crash leaves the workflow in a FAILED state with no recovery path.
Related Issues
No response