-
Notifications
You must be signed in to change notification settings - Fork 349
llext: add support for PTL DRAM detached sections with MMU enabled #9893
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
lgirdwood
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one question if map would ever fail.
src/library_manager/llext_manager.c
Outdated
| return sys_mm_drv_unmap_region(aligned_vma, ALIGN_UP(pre_pad_size + size, PAGE_SZ)); | ||
| } | ||
|
|
||
| static void llext_manager_detached_map(void __sparse_cache *vma, size_t size, uint32_t flags) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this ever fail ? do we need to return an int ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
arch_mem_map returns void so it cant fail in a way we could handle, inside there is only one check for size but it will lead to panic anyway
if (size == 0) {
LOG_ERR("Cannot map physical memory at 0x%08X: invalid "
"zero size", (uint32_t)phys);
k_panic();
}
| /* Use cached virtual address */ | ||
| uintptr_t va = POINTER_TO_UINT(sys_cache_cached_ptr_get(aligned_vma)); | ||
|
|
||
| arch_mem_map(aligned_vma, va, ALIGN_UP(pre_pad_size + size, PAGE_SZ), flags); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In principle you don't have to page-align the size, arch_mem_map() does it already, but doesn't hurt too much
| void __sparse_cache *vma, | ||
| size_t size) | ||
| { | ||
| unsigned int i; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's at least place this whole function under #if CONFIG_MMU. And I'm also considering making this simpler and faster by reducing the generality and making it more restricted to our use-case: only handle 2 detached sections - executable .cold and data .coldrodata. I'd add two fields to struct lib_manager_module to identify those two sections, e.g. store their indices. Like in llext_manager_load_data_from_storage() check shdr.sh_flags to see if it's code or data and assign respective index. Then here just do
llext_get_section_info(ldr, ext, mctx->cold_idx, &shdr, &s_region, &s_offset);
llext_get_region_info(ldr, ext, s_region, NULL, ®ion_addr, NULL);
llext_manager_detached_unmap(...);
and similar for mctx->coldrodata_idx. Yes, you'd need to pass mctx to this function too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was my plan at first, but then I realized we can simply check sections the same way we do at the beginning and make it much more generic: whatever we find, we will also find later to unmap.
- added ifdef for MMU
d7f87aa to
b0e601e
Compare
|
Rebased (all required PRs have been merged). |
432d36e to
d4fdfda
Compare
|
no CI PTL results... let's re-test |
|
SOFCI TEST |
| if (s_region != region) | ||
| continue; | ||
|
|
||
| /* unmap detached sections (will be outside requested VMA area) */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you have to update this PR, better replace "unmap" with "restore default flags"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets do this incrementally - CI passing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spoke too soon - I reloaded and can see a CI timeout on some tests on PTL. Will rerun to be sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lgirdwood those IPC timeouts on PTL are also present in "main," seem to be caused by xruns. So, unrelated to this PR, but have to be fixed
UPDATE: sorry, I might have been wrong, upon closer examination these failures do look different
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was unstable with many timeouts in different tests each run, with reverted 'ipc: move most functions to run from DRAM' its now stable in linux CI, lets firstly merge base PTL DRAM enablement and later stabilize IPC
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@abonislawski ok, so you're referring to jenkins failures, not to your local or "Intel Internal" tests. One of the failing jenkins runs was https://sof-ci.01.org/sofpr/PR9893/build11697/devicetest/index.html. In it we see the firmware running into an error:
[00:00:02.029,105] <err> ipc: ipc_comp_free: ipc_comp_free(): comp id: 0x3 state is 5 cannot be freed
[00:00:02.029,115] <err> ipc: ipc_pipeline_free: ipc_pipeline_free(): module free () failed
[00:00:02.029,128] <err> ipc: ipc_cmd: ipc4: FW_GEN_MSG failed with err 12
So that should be debuggable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, only linux CI, not a problem for another one in both configurations. This specific fail with pass in most runs and I really dont want to debug all of this, we need DRAM for modules on platforms with DRAM functionality
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I think reverting the DRAM IPC commit only hides the problem. I'd propose a test: update to the Zephyr version as in the PR (use the first commit as is), disable LLEXT and enable cold storage on PTL and run that through the CI. I expect that to pass. Then enable modules but remove all __cold attributes from them - that should pass too. So I suspect only when you enable __cold code in LLEXT modules, then your flag updating code breaks it. So maybe we still have some page overlaps? You're updating flags for sections, that shouldn't be touched?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unfortunately this scenario with __cold removed in src/audio also failed
|
SOFCI TEST |
|
No results for PTL - retest |
|
SOFCI TEST |
1 similar comment
|
SOFCI TEST |
d4fdfda to
2bbefab
Compare
lyakh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This revert wasn't here when I was approving this PR IIRC. Let's check why it's needed now
This patch will set permissions for detached sections if MMU is enabled Signed-off-by: Adrian Bonislawski <adrian.bonislawski@intel.com>
Enable CONFIG_COLD_STORE_EXECUTE_DRAM for ACE 3.0 Signed-off-by: Adrian Bonislawski <adrian.bonislawski@intel.com>
This reverts commit 0e53393. Signed-off-by: Adrian Bonislawski <adrian.bonislawski@intel.com>
2bbefab to
4c44e99
Compare
|
let's keep this test result - it was a rare all green PTL test https://sof-ci.01.org/sofpr/PR9893/build11747/devicetest/index.html |
|
SOFCI TEST |
|
Test results are good. |
Set MMU permissions for detached sections
+
Enable CONFIG_COLD_STORE_EXECUTE_DRAM on PTL
Required patches: