-
Notifications
You must be signed in to change notification settings - Fork 349
userspace: Preparation for userspace loadable modules support #10103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
src/include/sof/audio/component.h
Outdated
| */ | ||
| dev = rzalloc(SOF_MEM_ZONE_RUNTIME_SHARED, 0, SOF_MEM_CAPS_RAM, bytes); | ||
| dev = drv_heap_rzalloc(drv->drv_heap, SOF_MEM_ZONE_RUNTIME_SHARED, 0, SOF_MEM_CAPS_RAM, | ||
| bytes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the size is expected to be passed to the firmware from topology thesofproject/linux#5460
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
module API already has a memory allocator API, we should repurpose it for user as ALL modules will be user.
zephyr/lib/userspace_helper.c
Outdated
| return NULL; | ||
| } | ||
|
|
||
| sys_heap_init(drv_heap, mem, DRV_HEAP_SIZE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wow... heap on heap... I guess, why not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, this is good - because we can split and fine tune to each modules exact needs. i.e. minimize any wastage.
lgirdwood
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets rework this on top of the memory simplification and reuse the module API for module memory allocations (as it can already track).
We also need to consider that all modules will be USER in the long term.
src/include/sof/audio/component.h
Outdated
| */ | ||
| dev = rzalloc(SOF_MEM_ZONE_RUNTIME_SHARED, 0, SOF_MEM_CAPS_RAM, bytes); | ||
| dev = drv_heap_rzalloc(drv->drv_heap, SOF_MEM_ZONE_RUNTIME_SHARED, 0, SOF_MEM_CAPS_RAM, | ||
| bytes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
module API already has a memory allocator API, we should repurpose it for user as ALL modules will be user.
d556d32 to
0ac161c
Compare
lgirdwood
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@softwarecki I think we need to look at RTIO as a method to simplify buffers and IPC and also avoid need for shared userspace mapping and for a worker thread.
I think the 1st step should be creation of a global user heap (size set by Kconfig) which can then be sub divided into smaller per memory domain user heaps (which I think your PR already has started).
zephyr/include/rtos/alloc.h
Outdated
| #define SOF_MEM_FLAG_USER BIT(6) | ||
| #define SOF_MEM_FLAG_USER BIT(6) | ||
| /** \brief Indicates that if we should return shared user memory address. */ | ||
| #define SOF_MEM_FLAG_USER_SHARED BIT(7) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dont think we need this shared user mapping with RTIO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you suggesting using RTIO to exchange audio data between the SOF and the userspace module? I think shared memory would be simpler and more efficient for this purpose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@softwarecki I think we do plan to use shared memory with RTIO - you attach pointers into those shared buffers to RTIO messages. AFAIU the plan is to use RTIO as a kind of a poll (2) call whereby user-space threads would be sleeping waiting for RTIO from one of the two sourced - audio data or IPC. This way you can have a single loop in the thread, where you wait for an event and once it arrives you can check what the source of the event was.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@softwarecki ...but, I'm not sure we need memory that is shared between the kernel and all user-space modules. Modules shouldn't access each-other's memory, right? So each module should have its own heap, which will also be accessible from the kernel. When we need to access kernel private data from the user-space, I'd expect that to usually go via system calls
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Modules in different domains should not have access to each others memory except where we have a shared buffer where access can be granted via k_object_access_grant()
See:
https://docs.zephyrproject.org/apidoc/latest/group__usermode__apis.html#ga94087bedf96fe2a2bea437d3d585ca22
Example RTIO producer/consumer:
zephyrproject-rtos/zephyr@dea4cca
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shared memory between userspace modules is necessary to enable data exchange between them. For example, when connecting two userspace dp modules together.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@softwarecki I don't think binding two DP modules is supported ATM, currently an LL module is always required on each side of DP modules, are there plans to change that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@softwarecki I don't think binding two DP modules is supported ATM, currently an LL module is always required on each side of DP modules, are there plans to change that?
absolutely it is, PR is on the way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have to share only the minimum across 3rd part modules and they should not be able to see each others "shared memory". i.e. to load an untrusted 3P DP module X.
- Infra allocates the userspace sub heap for X (from the global user heap). This includes X's stack too. Both max heap size and stack size come from topology.
- Infra then binds DP and allocates (from infra heap) the shared IPC buffers and audio buffers (as kobjects) for IO between infra and module X.
- Infra then only shares the IPC/audio buffers as kobjects in 2) using
k_object_access_grantto X.
If we have 2 untrusted 3P DP modules X, Y then the same flow still applies, with addition that infra shares the audio buffers with X,Y using k_object_access_grant. This way X and Y cant see each others IP (or anything else)C, but can only correspond via shared audio buffer.
d49a525 to
b2ae091
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR introduces preparation for userspace loadable modules support by implementing memory isolation mechanisms. It adds a shared memory heap for communication between kernel and userspace modules, and a driver heap for module-specific allocations.
Key changes:
- Introduces MMU shared memory heap that splits the system heap into shareable and non-shareable sections
- Adds driver heap support for non-privileged modules to ensure memory isolation
- Updates memory allocation patterns throughout the codebase to use appropriate heap types
Reviewed Changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| zephyr/lib/userspace_helper.c | New implementation of driver heap management functions for userspace modules |
| zephyr/lib/alloc.c | Splits system heap and adds shared heap support with CONFIG_USERSPACE |
| zephyr/include/rtos/userspace_helper.h | Header defining driver heap API for Zephyr userspace support |
| zephyr/include/rtos/alloc.h | Adds SOF_MEM_FLAG_USER_SHARED flag and shared heap accessor functions |
| zephyr/Kconfig | Configuration options for shared heap and userspace module heap sizes |
| zephyr/CMakeLists.txt | Adds userspace_helper.c compilation and -mno-global-merge flag |
| src/library_manager/lib_manager.c | Removes static qualifier from lib_manager_get_instance_bss_address |
| src/ipc/ipc4/helper.c | Refactors ring buffer creation logic and fixes variable shadowing |
| src/include/sof/lib_manager.h | Adds declaration for lib_manager_get_instance_bss_address |
| src/include/sof/audio/ring_buffer.h | Updates ring_buffer_create to accept component device parameter |
| src/include/sof/audio/module_adapter/module/generic.h | Separates generic_module_is_ready_to_process function |
| src/include/sof/audio/component.h | Adds drv_heap field to comp_driver and updates comp_alloc |
| src/audio/module_adapter/module_adapter.c | Updates buffer allocations to use appropriate memory flags |
| src/audio/buffers/ring_buffer.c | Updates ring buffer creation to use device-specific memory allocation |
| posix/include/rtos/userspace_helper.h | POSIX stub implementation of userspace helper functions |
| posix/include/rtos/alloc.h | POSIX version of SOF_MEM_FLAG_USER_SHARED flag definition |
Comments suppressed due to low confidence (3)
zephyr/lib/alloc.c:298
- [nitpick] The variable name
_unused_ram_start_markerdoes not clearly convey its purpose as a virtual heap start marker. Consider renaming to something more descriptive like_virtual_heap_start_marker.
uintptr_t virtual_heap_start = POINTER_TO_UINT(
src/ipc/ipc4/helper.c:607
- [nitpick] The variable name
srcis ambiguous and could be confused with the existingsourcecomponent variable. Consider using a more descriptive name likeaudio_srcorbuffer_source.
struct sof_source *src = audio_buffer_get_source(&buffer->audio_buffer);
src/ipc/ipc4/helper.c:608
- [nitpick] The variable name
snkis an abbreviation and could be confused with the existingsinkcomponent variable. Consider using a more descriptive name likeaudio_sinkorbuffer_sink.
struct sof_sink *snk = audio_buffer_get_sink(&buffer->audio_buffer);
b2ae091 to
9998f6e
Compare
e94ce9f to
8ceeded
Compare
The driver heap is used before the module is created. For example, the |
The total heap memory can be divided into kernel and user.
|
lgirdwood
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some opens that we need to align, the main thing is that we should not have a shared heap that is shared between all untrusted modules for audio context and buffers (unless I misunderstand this PR).
We now have the APIs in Zephyr whereby we can only share the minimum needed between infra and untrusted modules (and this may not have been the case when this work started).
My main opens are:
-
We only share only needed and no more.
-
The driver heap naming is confusing. What is this actually used for ?
-
We should not hard code per module heap/stack size, yes we can say if topology does not set these values then we can have a default in Kconfig.
zephyr/include/rtos/alloc.h
Outdated
| #define SOF_MEM_FLAG_USER BIT(6) | ||
| #define SOF_MEM_FLAG_USER BIT(6) | ||
| /** \brief Indicates that if we should return shared user memory address. */ | ||
| #define SOF_MEM_FLAG_USER_SHARED BIT(7) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have to share only the minimum across 3rd part modules and they should not be able to see each others "shared memory". i.e. to load an untrusted 3P DP module X.
- Infra allocates the userspace sub heap for X (from the global user heap). This includes X's stack too. Both max heap size and stack size come from topology.
- Infra then binds DP and allocates (from infra heap) the shared IPC buffers and audio buffers (as kobjects) for IO between infra and module X.
- Infra then only shares the IPC/audio buffers as kobjects in 2) using
k_object_access_grantto X.
If we have 2 untrusted 3P DP modules X, Y then the same flow still applies, with addition that infra shares the audio buffers with X,Y using k_object_access_grant. This way X and Y cant see each others IP (or anything else)C, but can only correspond via shared audio buffer.
| SOF_MEM_FLAG_USER | SOF_MEM_FLAG_COHERENT : SOF_MEM_FLAG_USER; | ||
|
|
||
| mod = rzalloc(flags, sizeof(*mod)); | ||
| mod = drv_heap_rzalloc(drv->drv_heap, flags, sizeof(*mod)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the driver heap the private per module userspace heap or the shared user heap or something else ?
zephyr/lib/userspace_helper.c
Outdated
| return NULL; | ||
| } | ||
|
|
||
| sys_heap_init(drv_heap, mem, DRV_HEAP_SIZE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, this is good - because we can split and fine tune to each modules exact needs. i.e. minimize any wastage.
|
|
||
| module_adapter_check_data(mod, dev, sink); | ||
|
|
||
| memory_flags = dev->drv->drv_heap ? SOF_MEM_FLAG_USER_SHARED : SOF_MEM_FLAG_USER; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if source->domain != sink->domain we should be able to allocate memory as a kobject using k_object_alloc_size and share via k_object_access_grant. This would guarantee that audio data would not be shared outside of pipeline to any other untrusted module. i.e. we share the minimum.
|
@lgirdwood This PR is focused solely on implementing the requirements defined in the HSD - specifically, separating code and data between userspace modules. It does not cover audio signal protection between modules, which is part of a later stage. Although the work has been split into multiple phases (HSD, Stage 1–3b), I feel there’s pressure to prematurely introduce changes that belong to the final stage. Regarding RTIO: its intended use is for driver communication, and it doesn’t fit well in this context. The idea of having a component mimic a driver just to use RTIO - with the DP thread communicating through it - seems forced. Moreover, RTIO queues cannot currently be created dynamically, which limits flexibility. There are better mechanisms available for this purpose. On the topic of kobject usage in Zephyr: it’s not enough to allocate memory using The driver heap is a private memory region for the module. Other modules don’t have access to it. The name reflects its association with the driver, not a specific module instance. We’ve assumed that isolating instances of the same module isn’t necessary, since they already share As for the IPC-related heap/stack size: it’s currently unused. If it becomes relevant, I’ll adapt the code accordingly. However, I’d appreciate not being asked to implement ideas that weren’t discussed or agreed upon. The IPC structures were extended unilaterally, which broke compatibility with Windows - something we should avoid. |
|
@softwarecki AFAIU the advantage of RTIOs is their minimalist use of syscalls - only one is required per signaling event. Indeed their static allocation isn't very convenient, but (1) zephyrproject-rtos/zephyr#92852 improves that a bit, adding a pool of RTIO contexts, and (2) - again, if my understanding is correct - if the pool doesn't fit our needs, we still can resort to allocating them dynamically the |
Sorry, should be no pressure, but the aim is to make sure the security architecture is sound. Its not that obvious given we probably need better names for "driver_heap" and "shared_heap".
Your right, that dynamic kobjects are not in last Zephyr release but they are now upstream for next Zephyr release. RTIO is ideal for producer/consumer IO across cores/threads/domains, it's simple and integrates well with Zephyr kernel and our use cases. The alternative would be to reinvent the wheel for a lot of functionality that RTIO provides.
So this is a balance between security and performance, yes kobjects have a small cost but we get the security vs any memory region that is shared for everyone. Can you confirm if the "shared_heap" is used for audio buffers and anything else plus who has access to it ?
So IIUC, the driver_heap is a module class/type heap, i.e. all instances of module X will use "driver_heap" X ? if so, can we rename to module_class or module_type heap ?
This is new, @jsarha will work on integrating this topology stack/heap size after this is merged. IPCs should be extended in a manner that does not break backwards compatibility, If anyone wants to reserve a IPC enum, mask, type etc they need to upstream it here to avoid conflicts (they can use "RESERVED" if non public). So I'm good to approve if you can:
static int user_get_memory_region(dev)
{
#if CONFIG_SHARED_HEAP
memory_flags = dev->drv->drv_heap ? SOF_MEM_FLAG_USER_SHARED : SOF_MEM_FLAG_USER;
#else
memory_flags = SOF_MEM_FLAG_USER;
#endif
return memory_flags;
}This way we can integrate into a userspace infra heap with a Kconfig. |
Regarding RTIO — its logic is actually the opposite of what we need. In RTIO, a userspace thread requests an operation from the kernel, and the kernel sends back a completion. Here, the kernel will initiate a request to a userspace thread, expecting it to perform an operation and return the result. While this can be worked around, is it really an elegant solution? How does it improve upon existing mechanisms in Zephyr like message queues, pipes, FIFOs, or mailboxes? Either way, this discussion seems off-topic and not directly related to the scope of this PR. |
yes, it's simple to work around that. |
In case of CONFIG_USERSPACE enabled, all memory is accessible form kernel space. For accessing it from user space we need to grant access to its explicitly. However there are some infrastructure components that could be shared between kernel and user spaces. This patch splits system heap into two sections. One section which will contatin shareable data will be located in separate memory partition. Each non-privilidged module will obtain it in its private memory domain. Signed-off-by: Jaroslaw Stelter <Jaroslaw.Stelter@intel.com> Signed-off-by: Adrian Warecki <adrian.warecki@intel.com>
8ceeded to
64820d3
Compare
Non-privileged modules should be isolated each other. It means that module related data should be located in memory available for kernel and the owner module only. To manage access all such data should be allocated from single, separate region. This patch introduces private module heap. It is allocated from regular heap, assigned to module thread and then used to allocate all infrastructure data associated with the module. Other non-privileged modules will not have access to this region. Currently it is arbitrary defined to use 5 pages, what could change. Signed-off-by: Jaroslaw Stelter <Jaroslaw.Stelter@intel.com> Signed-off-by: Adrian Warecki <adrian.warecki@intel.com>
Allocates audio data buffer from shared memory heap. The security reqirements assume that the non-privileged modules data and code should be protected from other non-privileged modules access. The audio data could be accessible. This change is effective only when CONFIG_USERSPACE=y. Signed-off-by: Jaroslaw Stelter <Jaroslaw.Stelter@intel.com> Signed-off-by: Adrian Warecki <adrian.warecki@intel.com>
…rspace Allocates audio data buffer from shared memory heap. The security reqirements assume that the non-privileged modules data and code should be protected from other non-privileged modules access. The audio data could be accessible. This change is effective only when CONFIG_USERSPACE=y. Signed-off-by: Adrian Warecki <adrian.warecki@intel.com>
Extract generic code from the module_is_ready_to_process function which can be used by other adapters, in particular the userspace module adapter. Signed-off-by: Adrian Warecki <adrian.warecki@intel.com>
Remove static keyword from lib_manager_get_instance_bss_address function and add function declaration. Signed-off-by: Adrian Warecki <adrian.warecki@intel.com>
When CONFIG_USERSPACE enabled, the merging of global symbols ends with SOF failures. The merging must be disabled when CONFIG_USERSPACE is set. The patch fix also disabling -fno-inline-functions option which works differently for clang, completely disabling function inlining. Signed-off-by: Jaroslaw Stelter <Jaroslaw.Stelter@intel.com>
64820d3 to
d3adf65
Compare
|
@lgirdwood review pls |
Adding @teburd but I don't think it matters who the RTIO event producer is and who the RTIO consumer is wrt user/supervisor mode. Our main use of RTIO is blocking a thread(s) on asynchronous audio and messaging events from other domains and cores. I dont think queues, fifos, pipes etc will give us the 2:1 mappings and cross domain/core IO flows without reinventing a lot of code already in RTIO core. i.e. RTIO is infra that saves us re-inventing infra around the simpler queues, pipes etc. |
The goal as @lyakh noted for SOFs use cases is to minimize syscalls, much like io_uring enables for Linux. You can batch any number of requests and wait on completion of any one of them. The alternative in Zephyr is using k_poll to wait on many things, but this requires multiple syscalls and branching. One syscall to wait on some number of things which subsequently get branched over to figure out which one in the set triggered. Followed by, at minimum 1 syscall per item that triggered, to get the actual information. RTIO has the potential to do this in a single syscall. There's a benchmark in the tree to show the cost of syscalls with user mode but to the best of my knowledge it's not exactly free of charge. So doing this as little as possible is likely an easy way to save some cycles. In all cases the user mode thread needs to wait on something regardless. Whether its viewed as a read request to the audio source waiting on completion or waiting on a write request from an audio source I don't think really changes much myself. The user mode thread needs to wait on something regardless. Who owns the buffer should be answered though. There are repercussions of who owns it. If the kernel owns it, the user mode thread will need to signal back to the kernel (syscall) that the buffer is available to use again at some point presumably. If the user mode thread owns it, its more akin to a move operation, the buffer is moved to the kernel to read into and moved back to the user mode thread when the read completes. No additional signaling is needed. Dynamic allocation is absolutely possible, it would only require a bit of C code to initialize the simple data structures involved. A couple queues really. With user mode dynamic allocation involves creating dynamic kobjects placed into a linked list. Every syscall then involves walking this list to find and check the validity of the kobject. Static kobjects are in a perfect hash built at compile time. Some more cycle savings perhaps. It's maybe good to think of this in Linux analogies... An iodev is a file descriptor you can read and write to, /dev/iodev/ipc /dev/iodev/audio/some_audio_node then become things you read/write data to from you user mode thread. Notably the buffer you read/write with is shared not copied between contexts. Only the operation descriptor (read/write/etc) is copied from user mode threads. It's an option, and one I'm of course biased to say is a good option, but if the SOF consensus ends up being its not a good one then by all means! If that's the case please do let me know why so I can perhaps improve things for future users. |
Introduce MMU shared memory heap commit splits system heap into two sections. One section which will contatin shareable data will be located in separate memory partition. Each non-privilidged module will obtain it in its private memory domain. In case of CONFIG_USERSPACE enabled, all memory is accessible form kernel space. For accessing it from user space we need to grant access to its explicitly. However there are some infrastructure components that could be shared between kernel and user spaces.
Introduce driver heap for non-privileged modules. Non-privileged modules should be isolated each other. It means that module related data should be located in memory available for kernel and the owner module only. To manage access all such data should be allocated from single, separate region.
Separate generic_module_is_ready_to_process function in module_adapter.
Remove static keyword from lib_manager_get_instance_bss_address function and add function declaraton.
Allocates audio data buffer from shared memory heap.