Skip to content

Nexus CHASM async completion (1/2)#9951

Open
stephanos wants to merge 8 commits intomainfrom
stephanos/nexus-chasm-async-compl-1
Open

Nexus CHASM async completion (1/2)#9951
stephanos wants to merge 8 commits intomainfrom
stephanos/nexus-chasm-async-compl-1

Conversation

@stephanos
Copy link
Copy Markdown
Contributor

@stephanos stephanos commented Apr 14, 2026

What changed?

Support Nexus async completion in CHASM.

PS: completion-before-start will be handled in a follow-up PR.

How did you test it?

  • built
  • run locally and tested manually
  • covered by existing tests
  • added new unit test(s)
  • added new functional test(s)

Potential risks

Is behind feature flag.

@stephanos stephanos force-pushed the stephanos/nexus-chasm-async-compl-1 branch 7 times, most recently from 8f06aca to de6a5b0 Compare April 15, 2026 04:20
@stephanos stephanos changed the title Nexus CHASM async completion Nexus CHASM async completion (1/2) Apr 15, 2026
@stephanos stephanos force-pushed the stephanos/nexus-chasm-async-compl-1 branch 2 times, most recently from af87649 to 34b5d4f Compare April 15, 2026 17:05
@stephanos stephanos changed the base branch from main to stephanos/nexus-completion-refactor April 15, 2026 17:14
@stephanos stephanos force-pushed the stephanos/nexus-chasm-async-compl-1 branch 6 times, most recently from 3a01f82 to 8254f69 Compare April 15, 2026 18:27
@stephanos stephanos changed the base branch from stephanos/nexus-completion-refactor to main April 15, 2026 18:28
@stephanos stephanos force-pushed the stephanos/nexus-chasm-async-compl-1 branch 10 times, most recently from ef6a516 to 597e8d2 Compare April 15, 2026 18:59
"time"

"github.com/gorilla/mux"
"github.com/nexus-rpc/sdk-go/nexus"
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved here from components; adjusted some of the names to fit into new package better.

@stephanos stephanos force-pushed the stephanos/nexus-chasm-async-compl-1 branch from 597e8d2 to 35c28a9 Compare April 15, 2026 19:17
// When the request has nil headers, it should fall back to the local client.
var gotPath string
ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
assert.Equal(t, commonnexus.PathCompletionCallbackNoIdentifier, r.URL.Path)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed linter error

result: result,
retryPolicy: h.config.RetryPolicy(),
})
return err
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed linter error by moving out

@stephanos stephanos force-pushed the stephanos/nexus-chasm-async-compl-1 branch 15 times, most recently from 1c5d51f to b6ee1e8 Compare April 15, 2026 23:58
// user-agent header contains Nexus SDK client info in the form <sdk-name>/v<sdk-version>
headerUserAgent = "user-agent"
clientNameVersionDelim = "/v"
)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ these already exist in temporal/service/frontend/nexus_handler.go

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One step closer to eliminating the tech debt we created with the duplication of interceptors here.

}

func (h *NexusHTTPHandler) parseTlsAndAuthInfo(r *http.Request, nc *nexusContext) (*http.Request, error) {
func (h *NexusOperationHTTPHandler) parseTLSAndAuthInfo(r *http.Request, nc *nexusContext) (*http.Request, error) {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

address linter error

// The update may legitimately delete the addressed component, in which case
// there is no new ref to return even though the transition succeeded.
return nil, nil
}
Copy link
Copy Markdown
Contributor Author

@stephanos stephanos Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to add this stopgap to make tests pass.

cc @yycptt if you have thoughts on this issue

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I think we just need to document chasm.UpdateComponent to mention that the returned ref could be nil if the component is not the root and it was deleted in the transaction.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah make sense! Thanks!

Copy link
Copy Markdown
Member

@bergundy bergundy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks great.

JFYI (not blocking this PR), you're also missing the fallback to the "current" run. That's only relevant for workflow operations though. I think we'll need some special handling for the token for workflow. I think the archetype should be on the token and we can change the behavior based on that.

Comment thread service/frontend/fx.go
fx.Invoke(EndpointRegistryLifetimeHooks),
fx.Provide(schedulerpb.NewSchedulerServiceLayeredClient),
nexusfrontend.Module,
chasmnexus.Module,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this supposed to be FrontendModule?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see https://github.com/temporalio/temporal/pull/9951/changes#r3089760669

Yes, on the feature branch we have FrontendModule and I'll change it there; didn't want to pull that change in here.

// user-agent header contains Nexus SDK client info in the form <sdk-name>/v<sdk-version>
headerUserAgent = "user-agent"
clientNameVersionDelim = "/v"
)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One step closer to eliminating the tech debt we created with the duplication of interceptors here.

// CompleteOperation implements nexus.CompletionHandler.
// nolint:revive // (cyclomatic complexity) This function is long but the complexity is justified.
func (h *completionHandler) CompleteOperation(ctx context.Context, r *nexusrpc.CompletionRequest) (retErr error) {
func (h *NexusCompletionHandler) CompleteOperation(ctx context.Context, r *nexusrpc.CompletionRequest) (retErr error) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this struct need to be exported? I thought we only reference it from within the frontend package. (Same for the constructor, etc...)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw that NexusOperationHTTPHandler and OpenAPIHTTPHandler are exported, too, and matched that here for consistency. It can be unexported. I'll make the change.


completion, err := h.CallbackTokenGenerator.DecodeCompletion(token)
if err != nil {
if err != nil || completion == nil {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would the completion be nil and the error not be nil? That seems unexpected behavior to me.

}
namespaceID = ref.GetNamespaceId()
businessID = ref.GetBusinessId()
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

System isn't relevant here, it follows a separate path.


namespaceID := completion.GetNamespaceId()
businessID := completion.GetWorkflowId()
if namespaceID == "" && len(completion.GetComponentRef()) > 0 {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would put validation in DecodeCompletion to ensure that either the HSM fields are all present of component ref is not empty. We should never have both variations populated.

Comment on lines +55 to +56
namespaceID = ref.GetNamespaceId()
businessID = ref.GetBusinessId()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's verify that these aren't empty here.

Comment thread chasm/lib/nexusoperation/operation.go Outdated
return TransitionTimedOut.Apply(o, ctx, EventTimedOut{})
}

func (o *Operation) HandleNexusCompletion(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe put this next to saveInvocationResult because functionally there is a lot of overlap? And I would add a comment mentioning why these two separate methods exist. Ideally we'd have only one but I'd rather not spend time refactoring that right now. We can live with a bit of duplication here.

// The update may legitimately delete the addressed component, in which case
// there is no new ref to return even though the transition succeeded.
return nil, nil
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I think we just need to document chasm.UpdateComponent to mention that the returned ref could be nil if the component is not the root and it was deleted in the transaction.

Comment thread tests/nexus_workflow_test.go Outdated
callbackToken, err = gen.Tokenize(workflowNotFoundToken)
s.NoError(err)
completion.Header = nexus.Header{commonnexus.CallbackTokenHeader: callbackToken}
if !chasmEnabled {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can have an equivalent token mutation test for the CHASM implementation though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants