ams: Add initial AMS implementation #6343
Conversation
Can you provide examples of what this would be used for?
@plbossart AMS is basically notifications on steroids. It allows broadcasting notifications from one producer to multiple consumers, as well as targeting single consumers. Producers and consumers can be registered at runtime. It should also allow unified notifications to be sent to the host through a host gateway (not yet implemented). AFAIK for ACE platforms it is desired to send all module events through asynchronous messages.
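(For context only, a minimal sketch of what a runtime-registered producer/consumer interface along those lines could look like; every name and signature below is hypothetical and not the actual API added by this PR.)

```c
/*
 * Hypothetical sketch of an AMS-style producer/consumer API.  The names
 * (ams_register_producer, ams_register_consumer, ams_send) and signatures
 * are illustrative assumptions, not the actual interface in this PR.
 */
#include <stddef.h>
#include <stdint.h>

typedef void (*ams_msg_cb)(uint32_t msg_id, const void *payload,
			   size_t size, void *ctx);

/* A producer declares which message ID it will emit. */
int ams_register_producer(uint32_t msg_id);

/* Consumers (possibly on different cores) subscribe to a message ID;
 * they can be registered and unregistered at runtime. */
int ams_register_consumer(uint32_t msg_id, ams_msg_cb cb, void *ctx);

/* One send either broadcasts to all registered consumers of msg_id or,
 * with an explicit module/instance target (not shown), reaches exactly one. */
int ams_send(uint32_t msg_id, const void *payload, size_t size);
```

The point of difference from the existing notifier, as described above, would be the runtime registration on both ends and the (planned) unified path to the host gateway.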
Which of these can the current notifier not do?
lgirdwood
left a comment
@kfrydryx I assume AMS will use the low-level Zephyr APIs for core-to-core or task-to-task messaging? i.e. we can send a message to any object with an instance ID on any core/thread, and even to the host SW?
I also assume this PR is the high-level AMS functionality for routing, synchronization, ID lookup and translation, etc.
@nashif are you able to comment on how AMS would work with the Zephyr low-level IPC, IDC, and task/thread message APIs?
@lyakh AMS is a required part of ACE feature parity. It needs to be compatible with loadable modules and whatever is happening behind the host gateway. @lgirdwood Async messages between different cores are sent through IDC/IPC. I don't think we have to care about what's behind the SOF abstraction; as long as we can send a message and add a case to handle the AMS IDC message, we should be OK. The IPC part is not yet implemented, though it should be pretty similar.
OK, as long as we are using the Zephyr interface for IDC/IPC, then I'm good.
IIUC, AMS can also message the host. Yeah, we can't have two methods of messaging between components or to the host upstream. We need to migrate both the notifier users and the IPC event users to AMS prior to v2.5, i.e. we merge those two existing APIs into AMS.
Sorry, I stopped reviewing at some point. This file isn't compiled by CI because its Kconfig option defaults to =n. And it looks like you didn't compile this latest version, or you left at least some compiler warnings unfixed. Please compile and test every version before submission. Also, are you sure all the functions that acquire your coherent object are only called from thread context? Please also check the output of the sparse CI check - even if it's green now - and fix any warnings in your code complaining about a wrong address space. But that can only be done once you get CI to compile your code.
kv2019i
left a comment
A few more comments on top of what @lyakh already raised. It would be good to have an in-tree user for this interface sooner rather than later. We should definitely build this by default in some configuration that is covered by CI, but unless we can also do functional testing (with an in-tree user), I fear this can get broken very easily by other changes upstream.
src/include/sof/lib/ams.h
Outdated
So, slots aren't per-core, they are per message, right? Every new message takes a new slot with various descriptors, e.g. masks, describing to which cores those messages have to be delivered. Then why is the slot number equal to the core number?
You have basically two scenarios for an inter-core am_send. You can send to a specific module, which reserves one slot - for the core that module runs on. Or you can send to "any", which again reserves one slot per core that has consumer modules on it. Since only one slot is reserved for that route even if multiple consumers reside on a core, only CORE_COUNT slots are required for a message to be sent. This of course assumes that all callbacks are handled before the next message is sent. In the unlikely event of multiple multicore messages being sent at the same time (meaning slots are not freed), an error is returned and is expected to be handled by the producer. There are expected flows described in the internal Intel docs. If a receiver is expected to do heavy processing, it should queue incoming messages for deferred handling so as not to block the AMS.
There are expected flows described in the internal Intel docs.
src/lib/ams.c is not located in an Intel-specific directory. Should it be?
This is a purely software feature, not hardware dependent - should it be? Every platform can use it; if it were platform-specific, then the code would be duplicated per platform. AFAIK it is agreed that SOF is moving towards async messages instead of notifications.
ah, so, let me check if I understood this correctly:
- messages aren't queued, so there cannot be more than one message being processed at any time
- messages can only be directed either to 1 specific instance of a module on a specific core, or to "any" which means - all registered consumers / modules on all cores
If that's correct, then
- you have to protect sending of messages well - two cores might decide to send messages at the same time
- in principle you only need an array of booleans, indicating whether a message is pending for the respective core or not. That array can either be a bitmask, in which case you'd have to protect it, or a real
bool msg_pending[CONFIG_CORE_COUNT];
Is that right?
Why do you want to make those assumptions? Async messages are asynchronous and don't care about the receiver's processing. We have to satisfy the transport-layer conditions: the receiver exists, and there is a free slot to pass the message. There's no need to consider what type of message it is, or whether the same message is being sent to a different core, or whether the previous one finished processing, or whether the processing is being handled properly on the receiving end. There's only the need to satisfy the transport conditions: do we have a receiver? Where is the receiver? Do we have a free slot to send this message to this core? If yes, no problem. If not, then return an error to the producer.
Having multiple messages in slots at the same time may happen and is not necessarily a bad thing, as again - the sending transport part does not care about the message type, only routing.
Let's look at an example.
Two producers send the following messages: A to cores 1 and 2 from core 0, and B to core 1 from core 0.
- Message A is stored in slot 0, routed to core 1
- Message A is stored in slot 1, routed to core 2
Something takes too long (the assumption is that no heavy processing is done in the context of an Asynchronous Message) and Message A is not processed on any of the cores yet.
- Message B is stored in slot 2, routed to core 1
No issues there, everything works as expected. The sending function does not care what the message is, only whether there is enough room for it to be sent.
If Messages A and B were not processed and a third Message C is sent, then there is a problem. ams_send returns an error that should be handled by the producer module, i.e. retry until success.
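(A rough sketch of that producer-side retry flow, reusing the hypothetical ams_send() signature from the earlier sketch and assuming it returns 0 on success and -EBUSY when no slot is free; the real return convention is not shown in this thread.)

```c
/*
 * Illustrative producer-side retry loop for when all message slots are
 * busy.  Assumes a hypothetical ams_send() that returns 0 on success and
 * -EBUSY when no free slot exists; the real API may differ.
 */
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

int ams_send(uint32_t msg_id, const void *payload, size_t size); /* assumed */

static int send_with_retry(uint32_t msg_id, const void *payload,
			   size_t size, int max_attempts)
{
	int ret = -EBUSY;

	while (max_attempts-- > 0) {
		ret = ams_send(msg_id, payload, size);
		if (ret != -EBUSY)
			break; /* success, or an error that retrying won't fix */
		/* all slots occupied: back off (yield, wait a tick, ...) */
	}

	return ret;
}
```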
@kfrydryx I don't want to make those assumptions. I want to have a clearly documented protocol (thanks, @marc-hb). If there can be multiple messages in flight - good. But then slots are per message, and I see no reason why the number of slots should be equal to the number of cores. Furthermore, in your example above I don't understand why message A should occupy two slots - why doesn't it occupy just one slot with the bitmask equal to 6 == BIT(1) | BIT(2)? Documentation, please.
I'm sorry, I've fooled (and made a fool of) myself - that explanation was just wrong. Slots behave differently than I was trying to explain; it's been a long time since the logic was implemented.
Slots are strictly per message: they have a reference counter in ams_shared_context->slot_uses[CONFIG_CORE_COUNT] and mark which cores have already handled them in ams_shared_context->slot_done[CONFIG_CORE_COUNT]. Sorry for the confusion.
Now, slots are not strictly tied to cores, as ams_push_slot searches for the first free slot to accommodate the message being sent, but this size makes it so all cores can send a message without conflicts if they all decide to send at the same time.
And yes, the documentation is lacking; it is great that the part @kv2019i mentioned (thanks!) was made public. When/if we agree on the final shape of the async messages, I'd be more than happy to prepare an in-depth explanation of what's happening inside. If you have further questions about why we even need such a mechanism with all of the Zephyr goodies available, we'd need to find and contact the person responsible for that decision.
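(A condensed model of how that bookkeeping reads; only the slot_uses / slot_done / ams_push_slot names come from the comment above, while the types, the bitmask encoding of slot_done and the slot count are assumptions made for illustration.)

```c
/*
 * Condensed model of the slot bookkeeping described above.  Only the names
 * slot_uses, slot_done and ams_push_slot come from this discussion; the
 * types, sizes and "done" encoding are assumptions made for the example.
 */
#include <stdint.h>

#ifndef CONFIG_CORE_COUNT
#define CONFIG_CORE_COUNT 4	/* placeholder for the Kconfig-provided value */
#endif

struct ams_shared_context_sketch {
	/* per-slot reference count: how many cores still have to handle it */
	uint32_t slot_uses[CONFIG_CORE_COUNT];
	/* per-slot bitmask of cores that already processed the message */
	uint32_t slot_done[CONFIG_CORE_COUNT];
};

/* Return the first free slot for an outgoing message, or -1 if none. */
static int ams_push_slot_sketch(struct ams_shared_context_sketch *ctx)
{
	for (int i = 0; i < CONFIG_CORE_COUNT; i++) {
		if (ctx->slot_uses[i] == 0)
			return i;
	}

	return -1;	/* all slots busy: the send path reports an error */
}
```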
Thanks @kfrydryx @lyakh. I think this is the trickiest aspect of the implementation. Maybe have a few examples of how slots are used in typical 1:1 and 1:M cases. So, to @lyakh's original question, the number of slots is not related to the core count, but a slot count matching the number of cores has proven to be a good value in practice. Maybe make this a Kconfig option (as the slots do take resources) and have the default set to the core count?
this size makes it so all cores can send a message without conflicts if they all decide to send at the same time.
That means there's one slot per sender. So make the sending core use the slot with the index equal to the core ID and throw away all the free-slot searching.
Btw, @kfrydryx I didn't see this linked anywhere yet, but we do have this: https://thesofproject.github.io/latest/architectures/firmware/sof-zephyr/mpp_layer/async_messaging.html. It should probably be added as a link in the pull request description; I think it would help to understand the context. FYI @plbossart @lyakh @dbaluta who have asked about introduction details. (And no, I didn't know we had such documentation already up at sof-docs either.)
Thanks, @kv2019i, that helps. @kfrydryx how about just using semaphores and not messing around with IDC? We are running on top of a rather advanced multi-threading, multi-CPU-capable OS, so why not use it? Can we register consumers not with callbacks but with a semaphore, so that when producers signal consumers they just signal those semaphores, thus making it truly asynchronous? BTW, Zephyr even has sockets, maybe just use those? They even have socketpairs. Maybe we can really use an existing, tested and verified API instead of adding another one. They also have kernel/mailbox.c, and they have zbus for message passing... and possibly more IPC APIs.
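(To make that suggestion concrete, a rough illustration of the semaphore-based variant using Zephyr's k_sem API; the pending_msg_id storage and the function names are invented for the example, and a real design would still need a proper queue for payloads plus thread creation, which is omitted here.)

```c
/*
 * Rough illustration of the semaphore-based alternative: the consumer
 * blocks on a k_sem instead of registering a callback, and the producer
 * gives the semaphore after publishing the message somewhere the consumer
 * can read it.  Names and the single-word "payload" are invented.
 */
#include <zephyr/kernel.h>

K_SEM_DEFINE(ams_msg_sem, 0, 1);
static uint32_t pending_msg_id;	/* placeholder for a real message queue */

/* Producer side: publish, then signal. */
void producer_notify(uint32_t msg_id)
{
	pending_msg_id = msg_id;
	k_sem_give(&ams_msg_sem);
}

/* Consumer side: runs in its own thread, truly asynchronous to producers.
 * (Thread creation via K_THREAD_DEFINE is omitted from this sketch.) */
void consumer_thread(void *p1, void *p2, void *p3)
{
	while (1) {
		k_sem_take(&ams_msg_sem, K_FOREVER);
		/* handle pending_msg_id here */
	}
}
```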
@plbossart This will be coming later in the form of the Host Async Message Gateway. That's basically a module responsible for forwarding Asynchronous Messages to the host and the other way round.
kv2019i
left a comment
A few comments inline. Took a brief look at the low-level tools (k_sem, k_event) and it's true these only cover a small part of the puzzle. The new zbus (https://docs.zephyrproject.org/latest/services/zbus/index.html) is a closer match, but the public interface to consumers and providers is very different from AMS. Both are still worth a look. If we don't go that path, it would seem improving the docs on the slot semantics would be the key thing left (some bits already clarified).
src/lib/ams.c
Outdated
I think this is probably the iffiest bit. In general we would like to move to native Zephyr APIs for scheduling. For audio tasks the SOF scheduler does make sense (although there is some debate), since a generic scheduler is not aware of the audio pipeline, and for scheduling decisions influenced by the audio graph topology an audio-specific layer is needed on top of the RTOS layer.
But here, for generic routing of notifications, this is really a task the Zephyr scheduler can natively do. I don't think we need to route via schedule_task_init_ll() here.
FYI @marcinszkudlinski @aborisovich @lyakh
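(For illustration of that alternative: a minimal sketch of dispatching the notification routing through Zephyr's workqueue API instead of schedule_task_init_ll(); the handler and work-item names are invented for the example.)

```c
/*
 * Minimal sketch of routing AMS processing through Zephyr's workqueue API
 * instead of the SOF LL scheduler; the names here are invented.
 */
#include <zephyr/kernel.h>

static void ams_process_work_handler(struct k_work *work)
{
	/* walk the registered consumers and deliver pending messages here */
}

K_WORK_DEFINE(ams_process_work, ams_process_work_handler);

/* Called by the producer path after a message has been queued. */
static void ams_schedule_processing(void)
{
	k_work_submit(&ams_process_work);	/* runs on the system workqueue */
}
```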
src/include/sof/lib/ams.h
Outdated
Let's make sure our direction keeps the external API for compatibility (new APIs could be allowed) and integrates with backends in Zephyr (to save memory and reduce support effort).
src/include/sof/lib/ams.h
Outdated
It looks like you define a 0-element uint32_t array here?
This is undefined behavior:
ISO 9899:2011 6.7.6.2:
If the expression is a constant expression, it shall have a value greater than zero.
I know this is the C11 standard, but it was simply not mentioned in C99; the behavior is the same. And we are soon going to move to a newer C standard once #7027 is merged.
Great article from Google's security team about making C arrays safer: https://lwn.net/Articles/921799/
ISO 9899:2011 6.7.6.2:
Does this array declarator section apply to flexible array members too? They're not regular arrays.
Also: what in C11 made pad[0] OK again? That's a surprise to me.
AFAIK pad[0] is a pre-C99 backwards-compatibility hack, and pad[0] is incompatible with safety features like -fstrict-flex-arrays (https://people.kernel.org/kees/bounded-flexible-arrays-in-c). Simply replace it with pad[]. If all the compilers we use accept the C99-standard pad[], then everything is OK - and better.
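(To spell out the difference being discussed: pad[0] is the old GNU-style zero-length-array hack, while pad[] is the C99 flexible array member and plays well with -fstrict-flex-arrays. A minimal illustration, with struct names invented for the example:)

```c
/*
 * Illustration of the zero-length-array hack vs. the C99 flexible array
 * member; the struct names are invented for this example.
 */
#include <stdint.h>
#include <stdlib.h>

struct msg_old {
	uint32_t size;
	uint32_t pad[0];	/* GNU extension, UB per ISO C, breaks -fstrict-flex-arrays */
};

struct msg_new {
	uint32_t size;
	uint32_t pad[];		/* C99 flexible array member: the portable form */
};

/* The allocation pattern is the same either way: reserve the trailing
 * storage explicitly on top of the fixed part of the struct. */
struct msg_new *msg_alloc(uint32_t n_words)
{
	struct msg_new *m = malloc(sizeof(*m) + n_words * sizeof(uint32_t));

	if (m)
		m->size = n_words;
	return m;
}
```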
I'm not really aware of how that's handled by different standards/compiler versions. I'll go with your suggestion there @marc-hb.
Rebased due to merge conflicts. PR with usage is in draft: #7158
@plbossart, @lyakh please unblock the PR or list the technical issues that still need to be addressed before merge. As agreed, the improved AMS design can be implemented incrementally over time based on the proposed solution.
plbossart
left a comment
This PR still feels rushed and could use a number of review cycles to clearly explain some of its details.
I am not going to reject it, but I'm not going to approve it either. This is not the way a publicly-visible PR should be handled.
I would never accept a PR like this if it was at the kernel level.
src/idc/idc.c
Outdated
usually headers are not conditionally included?
ack, this was necessary in one of the previous iterations, but should not have any impact currently.
src/idc/idc.c
Outdated
Shouldn't this be a one-time log? This could swamp the logs with repeated 'AMS not enabled' messages.
Not really - this should not happen when AMS is used correctly. If only the AMS interface functions are used, then without CONFIG_AMS this will never be called. Nevertheless, such a case should be reported when looking at the IDC layer separately.
src/idc/idc.c
Outdated
use a #define and explain what this is about.
sure, is a masking macro ok?
src/include/sof/lib/ams.h
Outdated
2023 for consistency with other code in this PR?
ack
src/include/sof/lib/ams.h
Outdated
not really a wildcard but rather a reserved value.
ack
src/lib/ams.c
Outdated
There should be a static inline function for this; having compiler-dependent stuff in the middle of a long file is not good practice.
I've checked the output files and it seems like clz compiles to NSAU anyway, so this distinction might not actually be needed.
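(If the compiler-dependent bit were kept, the static inline wrapper suggested above might look roughly like this, assuming the operation in question is a count-leading-zeros as the clz/NSAU mention suggests; the helper name is invented.)

```c
/*
 * Sketch of keeping the compiler-dependent bit behind a static inline
 * helper; assumes the operation is count-leading-zeros (helper name
 * invented for the example).
 */
#include <stdint.h>

static inline uint32_t ams_clz32(uint32_t value)
{
	/* __builtin_clz() is undefined for 0, so handle that explicitly.
	 * On Xtensa this typically compiles down to the NSAU instruction. */
	if (!value)
		return 32;

	return (uint32_t)__builtin_clz(value);
}
```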
I can't believe there is no known size here. If this AMS thing has been tried and tested, surely there's a better answer than "this is what it is and will be modified later".
I really don't see how you can have an M:N mapping, since there is ONE producer_register. It's 1:N, or you are missing a lot of explanation in this PR.
Then if it's a value that was experimented with before, add a comment to that effect. Otherwise this seems completely arbitrary and, as you write, clearly not the best way to make progress with open-source reviews.
The values look exactly the same as for MTL, which raises the question of why those values need to be platform-specific.
Let's have them defined once. If there's a need to distinguish between platforms/scenarios, we may add a config option for those values.
@plbossart this PR has been in review since September 2022, and @kfrydryx has been answering questions for months now. If something is still unclear, I am open to having a call to clarify the details. There was also an architecture documentation chapter added to sof-docs.
That is why I asked for a list of the technical review comments that, according to you, have not yet been addressed. I want a clear line of sight on what we are missing to move forward.
@mwasko I've dismissed my request for changes. I still don't think this is a good addition to the SOF code base, both because of the implementation itself and because of the chosen approach, like the use of IDC. Since I think it should be rewritten on top of proper RTOS APIs, I'd rather not spend more time reviewing this version.
No time to review this initial draft further - I'll make time when the solution is based on Zephyr native APIs.
Add Asynchronous Messaging Service. This can be used to communicate between modules. Asynchronous Messages are one-way messages from one producer to one or multiple registered consumers. Messages between modules on different cores are sent through IDC. All inter-core communication must be proxied by the main core.
Signed-off-by: Krzysztof Frydryk <krzysztofx.frydryk@intel.com>

Set config to enable AMS for the ACE1x platform and add MTL-specific values.
Signed-off-by: Krzysztof Frydryk <krzysztofx.frydryk@intel.com>

Set config to enable AMS for the CAVS25 platform.
Signed-off-by: Krzysztof Frydryk <krzysztofx.frydryk@intel.com>
Add Asynchronous Messaging Service. This is used to communicate between modules in ACE.
Might still need some tinkering: the single-core scenario and the routing logic were verified, though multi-core behavior was not tested in depth.
Signed-off-by: Krzysztof Frydryk <krzysztofx.frydryk@intel.com>