arch/sim: avoid host-call being interrupted before getting errno #16742
The sim uses coroutines on a single host thread to switch contexts. If a context switch is allowed in the middle of one operation (a host API call followed by reading errno), the switch may cause another host API call to run in between, so the errno read afterwards is not the expected value.
Signed-off-by: guanyi3 <guanyi3@xiaomi.com>
i feel this approach is very hard to maintain. it's simpler (and probably more efficient) to save/restore errno in the "interrupts" instead. what do you think?
Interrupts must be disabled, since many host APIs don't work well if an interrupt happens inside them.
Do you mean restoring errno before enabling interrupts? That doesn't work: the host main thread manages all NuttX threads, and errno is lost if another NuttX thread is switched in and calls a host API that overwrites errno before the old thread has had a chance to retrieve it.
i meant to save/restore in interrupt handlers.
Do you mean saving/restoring the host errno in the interrupt handler? That can't fix the problem I describe here, since the problem isn't the signal handler changing errno. The real problem is that the sim uses one host thread with setjmp/longjmp to simulate multiple NuttX threads, and the race may happen like this:
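The interleaving being described can be pictured with a host-only sketch. This is not the sim's code: it assumes a Linux/glibc host, uses `swapcontext` instead of setjmp/longjmp so the stack switch is well defined, and all names (`task_a`, `task_b`, `demo_errno_race`) are hypothetical. Two simulated tasks share one host thread and therefore one host errno.

```c
#include <errno.h>
#include <fcntl.h>
#include <ucontext.h>
#include <unistd.h>

/* Hypothetical names; both "tasks" run on the same host thread. */
static ucontext_t g_main_ctx;
static ucontext_t g_task_a_ctx;
static ucontext_t g_task_b_ctx;
static char g_stack_a[64 * 1024];
static char g_stack_b[64 * 1024];
static int g_errno_seen_by_a;

static void task_a(void)
{
  /* Host call fails and sets the host errno to ENOENT. */
  int fd = open("/nonexistent/path", O_RDONLY);
  (void)fd;

  /* Simulated preemption between the host call and the errno read. */
  swapcontext(&g_task_a_ctx, &g_task_b_ctx);

  /* Task A resumes and reads errno, which task B has overwritten. */
  g_errno_seen_by_a = errno;
}

static void task_b(void)
{
  /* Another failing host call overwrites errno with EBADF. */
  close(-1);

  /* Switch back to task A. */
  swapcontext(&g_task_b_ctx, &g_task_a_ctx);
}

/* Returns the errno value task A observes after being preempted. */
int demo_errno_race(void)
{
  getcontext(&g_task_a_ctx);
  g_task_a_ctx.uc_stack.ss_sp = g_stack_a;
  g_task_a_ctx.uc_stack.ss_size = sizeof(g_stack_a);
  g_task_a_ctx.uc_link = &g_main_ctx;  /* resume caller when A ends */
  makecontext(&g_task_a_ctx, task_a, 0);

  getcontext(&g_task_b_ctx);
  g_task_b_ctx.uc_stack.ss_sp = g_stack_b;
  g_task_b_ctx.uc_stack.ss_size = sizeof(g_stack_b);
  g_task_b_ctx.uc_link = &g_main_ctx;
  makecontext(&g_task_b_ctx, task_b, 0);

  swapcontext(&g_main_ctx, &g_task_a_ctx);
  return g_errno_seen_by_a;
}
```

In this sketch task A's `open` fails with ENOENT, but by the time A reads errno it holds task B's EBADF.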
Thanks for the feedback. You're right that saving/restoring errno in interrupt handlers is a common Unix practice
yes.
i don't understand why it can't fix the problem. do you mean we can't save/restore host errno while we can save/restore the rest of context including hw registers?
save errno here.
and restore here.
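The "save here, restore here" idea can be sketched as treating the host errno as one more piece of per-task context. This is only an illustration under assumptions: it targets a Linux/glibc host, uses `swapcontext` rather than the sim's setjmp/longjmp, and `struct sim_task`, `switch_task` and `demo_errno_preserved` are hypothetical names, not NuttX symbols.

```c
#include <errno.h>
#include <fcntl.h>
#include <ucontext.h>
#include <unistd.h>

/* Hypothetical per-task context: registers plus the host errno. */
struct sim_task
{
  ucontext_t ctx;
  int saved_errno;
  char stack[64 * 1024];
};

static struct sim_task g_sched;  /* the caller's context */
static struct sim_task g_a;
static struct sim_task g_b;
static int g_errno_seen_by_a;

static void switch_task(struct sim_task *from, struct sim_task *to)
{
  from->saved_errno = errno;          /* save errno here */
  swapcontext(&from->ctx, &to->ctx);
  errno = from->saved_errno;          /* restore when 'from' resumes */
}

static void task_a(void)
{
  int fd = open("/nonexistent/path", O_RDONLY);  /* errno = ENOENT */
  (void)fd;

  switch_task(&g_a, &g_b);            /* preempted before reading errno */
  g_errno_seen_by_a = errno;
}

static void task_b(void)
{
  close(-1);                          /* errno = EBADF */
  switch_task(&g_b, &g_a);
}

static void task_init(struct sim_task *t, void (*entry)(void))
{
  getcontext(&t->ctx);
  t->ctx.uc_stack.ss_sp = t->stack;
  t->ctx.uc_stack.ss_size = sizeof(t->stack);
  t->ctx.uc_link = &g_sched.ctx;      /* return to caller when done */
  makecontext(&t->ctx, entry, 0);
  t->saved_errno = 0;
}

/* Returns the errno value task A observes after being preempted. */
int demo_errno_preserved(void)
{
  task_init(&g_a, task_a);
  task_init(&g_b, task_b);
  switch_task(&g_sched, &g_a);
  return g_errno_seen_by_a;
}
```

Here task A still sees ENOENT after task B's failing `close`, because each task's errno travels with its context.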
if we are using host calls in a way which can't tolerate signals, it's a bug.
OK, this approach may be possible. @GUIDINGLI @jasonbu could you look at this and provide a new fix?
OK,
A retry loop for EINTR doesn't work very well for many 3rd-party libraries, since 3rd-party code is not written for or tested in an environment where signals arrive randomly and frequently.
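For reference, the kind of EINTR retry loop being discussed usually looks like the sketch below. `read_retry` is a hypothetical helper name, not something taken from NuttX or this PR; it only shows what a caller has to do by hand, which is the part 3rd-party code typically omits.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: repeat read() while it is interrupted by a
 * signal before transferring any data (returns -1 with errno EINTR).
 */
ssize_t read_retry(int fd, void *buf, size_t len)
{
  ssize_t n;

  do
    {
      n = read(fd, buf, len);
    }
  while (n < 0 && errno == EINTR);

  return n;
}
```

Every interruptible host call needs this treatment at every call site, which is why it is hard to rely on in code the project does not control.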
Hi @yamt, thanks for your idea. Saving errno when the sim switches context also handles this problem. Please review #16666; the patch is appended in this PR. Tested in my local environment with a complex sim project.
using #16795 to make it more clear
Summary
The NuttX sim uses coroutines on a single host thread to switch contexts. If a context switch is allowed in the middle of one operation (a host API call followed by reading errno), the switch may cause another host API call to run in between, so the errno read afterwards is not the expected value.
This commit disables interrupts around host API calls, from before the call until errno has been retrieved, so that the errno value is correct.
Impact
Resolves previously observed cases where errno returned inconsistent values after context switch.
Some host-APIs are now uninterruptible during execution, which may slightly increase latency in scheduling.
Testing
CI-test, sim/nsh ostest