Skip to content

Conversation

@tedchirvasiu
Copy link

@tedchirvasiu tedchirvasiu commented Dec 20, 2025

  • This PR proposes an implementation for FiringMode.Serial as discussed in Introduce FiringMode.Serial #527
  • This is an alternate implementation of PR Implemented FiringMode.Serial with standard locks #639. This implementation runs triggers on a separate task in order not to block any callers to Fire/FireAsync.
  • This implementation still employs a standard short-lived lock to ensure thread safety.
  • This implementation no longer needs the DropUnprocessedEventsOnErrorInSerialMode from the previous PR.
  • All calls to Fire/FireAsync return immediately after the trigger is enqueued.
  • If you need to wait for the trigger to be ran or catch its specific exceptions, you can use await FireAndWaitAsync(trigger). Warning, if you call and await FireAndWaitAsync inside a state transition event, you risk deadlocking (because the current transition now waits for a future transition to be completed).
  • If you need to await the triggers processing task or catch exceptions, you can use await GetSerialEventsWorkerTask().
  • Important: If an unexpected exception is thrown durring the processing of triggers, the faulting trigger is dequeued and the worker task halts (possibly locking tasks waiting via await FireAndWaitAsync(trigger)).
  • If you need to resume execution, you can call the new parameterless Fire method (works for the FiringMode.Queued mode too).
  • If you need to cancel all the unexecuted triggers still pending, you can call CancelPendingTriggers().

Resolves #527

@tedchirvasiu
Copy link
Author

tedchirvasiu commented Dec 20, 2025

@gao-artur What do you think about this implementation? This should fix the issue raised by you on the previous PR in regards to locking the initial caller to Fire.

However exception handling becomes a bit iffy.

@gao-artur
Copy link

Too many gotcha's. IMO, in the current form, this is just not usable. The user will have to write a significant amount of code to make this mode work as expected, which defeats the original proposal's idea of simplifying concurrent usage. Additionally, you have added a new FireAndWaitAsync method that makes sense only in serial mode, but is available in all modes, which is a poor API design.
I'm a little bit biased in reviewing your PR's because, obviously, I'd like it to fit my needs exactly, which may not be general enough to be included in the library. Anyway, neither of these PR's meets my needs.

@tedchirvasiu
Copy link
Author

Too many gotcha's. IMO, in the current form, this is just not usable. The user will have to write a significant amount of code to make this mode work as expected, which defeats the original proposal's idea of simplifying concurrent usage. Additionally, you have added a new FireAndWaitAsync method that makes sense only in serial mode, but is available in all modes, which is a poor API design. I'm a little bit biased in reviewing your PR's because, obviously, I'd like it to fit my needs exactly, which may not be general enough to be included in the library. Anyway, neither of these PR's meets my needs.

Would be curios to see your implementation because I am really unsure what you are going for. Like what does concretely "as expected" mean to you?

In the original issue you proposed a solution with a SemaphoreSlim which deadlocks on recursive Fire calls.

@gao-artur
Copy link

gao-artur commented Dec 20, 2025

Yes, SemaphoreSlim was a naive implementation proposal, and yes, it can lead to a deadlock.
My wrapper implementation looks exactly as I described in this comment.
Here is my stripped implementation of the wrapping class. StartAsync is one of the public methods exposed to the consumers.

public class MyStateMachine: IAsyncDisposable
{
    private readonly StateMachine<State, Trigger> _machine = new StateMachine<State, Trigger>(State.Stopped, FiringMode.Queued);
    private readonly BlockingPriorityQueue<FiringTask, int> _firingQueue = new();
    private readonly CancellationTokenSource _queueWorkerCanceler = new();
    private readonly Task _queueWorkerTask;

    public MyStateMachine()
    {
        _queueWorkerTask = Task.Run(async () => await QueueWorker(_queueWorkerCanceler.Token));
    }

    public async Task StartAsync(CancellationToken token)
    {
        await WaitForExecutionAsync(
            () => _machine.FireAsync(_startTrigger, token),
            _startTrigger.Trigger);
    }

    public async ValueTask DisposeAsync()
    {
        _queueWorkerCanceler.Cancel();
        await _queueWorkerTask;
    }

    private async Task WaitForExecutionAsync(Func<Task> task, Trigger trigger)
    {
        var item = new FiringTask
        {
            Task = task,
            Trigger = trigger,
            TaskCompletionSource = new TaskCompletionSource(TaskCreationOptions.RunContinuationsAsynchronously)
        };

        await _firingQueue.Enqueue(item, trigger.Priority);
        await item.TaskCompletionSource.Task;
    }

    private async Task QueueWorker(CancellationToken token)
    {
        try
        {
            while (!token.IsCancellationRequested)
            {
                var item = await _firingQueue.Dequeue(token);

                try
                {
                    await item.Task();
                    item.TaskCompletionSource.SetResult();
                }
                catch (OperationCanceledException e)
                {
                    item.TaskCompletionSource.SetCanceled(e.CancellationToken);
                }
                catch (Exception e)
                {
                    item.TaskCompletionSource.SetException(e);
                }
            }
        }
        catch (OperationCanceledException)
        {
            _logger.LogInformation("Queue worker cancelled");
        }
        catch (Exception e)
        {
            _logger.LogError(e, "Queue worker failed");
        }
    }
}

The goal of the Serial mode proposal was to move all this inside the SM (maybe with a different implementation but with similar behavior) and keep the StartAsync as simple as this

    public async Task StartAsync(CancellationToken token)
    {
        await _machine.FireAsync(_startTrigger, token);
    }

Priority queue support is a different story.

@tedchirvasiu
Copy link
Author

tedchirvasiu commented Dec 20, 2025

What I'm seeing is something that has a baked in _startTrigger trigger. I assume this is is part of a very specific SM you built and you mean something generic would actually expose parameters allowing you to pass in the trigger + args.

I understand that is probably a snippet you stripped from a specific implementation so I won't pick on the details, but I think what I'm seeing would still deadlock on recursive calls on WaitForExecutionAsync (because the current Firing task would be stuck on waiting the next firing task to complete, but that can't complete because the QueueWorker is waiting for the current task to finish).

Other than that, I don't see how this is much different from this PR (aside from having the queue locking details moved to a probably custom class called BlockingPriorityQueue).

I very intentionally made it so that the default Fire method always returns early and implemented the separate FireAndWaitAsync to do what your WaitForExecutionAsync does in order to not let users deadlock by default.

@tedchirvasiu
Copy link
Author

tedchirvasiu commented Dec 20, 2025

Another reason why I chose to make Fire/FireAsync return right after enqueueing rather than after the trigger has been processed (which is what FireAndWaitAsync does) was because I find it to be consistent with how FiringMode.Queued works.

In the sense that FiringMode.Queued does not have a standard behavior for returning. If you're the first caller to FireAsync, then it returns after the entire queue has been processed, if you're a subsequent caller (so we are talking about a recursive Fire call) then Fire returns right after enqueueing, not after processing.

@gao-artur
Copy link

I understand that is probably a snippet you stripped from a specific implementation

That's correct.

would still deadlock on recursive calls

Recursive calls are handled by the _eventQueue, not by the _firingQueue, so deadlock is not possible.

Other than that, I don't see how this is much different from this PR

Honestly, I didn't review your implementation, just looked at a list of gotchas in the description.

I very intentionally made it so that the default Fire method always returns early

Yes, I understood that. But this decision introduces an additional gotcha that complicates the usage of the mode. I also understand that it was done to be consistent with the Queued mode implementation, but I think that early return and offloading the job by one task to another is a fundamental mistake of this library. Anyway, I'm not a maintainer of this library; I'm just a user who expresses his opinion. I have a very limited usage scenario of this library, so if this PR is not merged or merged as-is, I'll just continue using my wrapper and won't switch to a new mode.

@tedchirvasiu
Copy link
Author

Recursive calls are handled by the _eventQueue, not by the _firingQueue, so deadlock is not possible.

Through which mechanism do they end up on the _eventQueue? If all calls to Fire pass through QueueWorker, then I don't see how it won't deadlock because await item.Task() has to finish on the first task in order for the next item to be dequeued, fired and added to _eventQueue.

Are you mixing calls to WaitForExecutionAsync with direct calls to the State Machine's FireAsync method, like calling WaitForExecutionAsync from the outside of the SM and only direct calls to FireAsync from within the SM? If so, I don't see how you would integrate that into the existing Stateless API without creating new methods specific for the Serial mode.

but I think that early return and offloading the job by one task to another is a fundamental mistake of this library.

I agree that that part is confusing, but to be honest I don't see how they could've done it any other way. Because at the end of the day it comes to tradeoffs, and you fundamentally can't make FireAsync await for the actual execution of the queued event, unless you are willing to accept the tradeoff that users will always deadlock in recursive calls unless they do fire and forget.
And I don't think this is an implementation specific problem, it's a logical limitation (again, because the queue waits on an event which waits on a future event which can't be processed because it waits for the current event to be processed).

Honestly, I didn't review your implementation, just looked at a list of gotchas in the description

I mean, honestly the only 2 gotcha I see are:

  1. The fact that you don't have a stable central point to handle exceptions from the outside when using FireAsync (but could be solved with an OnError event)
  2. the fact that the queue halts when an unexpected exception is thrown. But this again was made in order to stay consistent with the Queued mode (in an exception is thrown the loop breaks but the _eventQueue is not cleared and you have no way to clear it, so you must fire a new event in order to resume - and the parameter-less Fire method and the CancelPendingTriggers give you some ways to handle that situation on Queued mode too).

It's true that FireAndWaitAsync only works as expected on Immediate and Serial modes, and on Queued mode it's a lie because it either returns too late or too early. But introducing the TaskCompletionSource pattern to Queued mode too would've required adding some overhead and I wasn't sure if the maintainers would like that (as I got the vibe that they don't want to affect the performance in existing modes, otherwise we could've just directly added the locks to the Queued mode).

I'm just a user who expresses his opinion.

Same here. I'm curious about your opinion because you made some valid points so far. However I think making a perfect design is pretty much impossible because trade-offs need to be made and ultimately it's going to be up to the maintainers to decide which path would make the most sense for this particular library (if they're going to decide on going forward with Serial mode at all, that is).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Introduce FiringMode.Serial

2 participants