Skip to content

Conversation

@jasonbu
Copy link
Contributor

@jasonbu jasonbu commented Jul 29, 2025

Summary

As discussion in #16742, manually up_irq_save is not a good idea, we should save errno in "interrupt" to ensure the errno use as usual will more easy to mantain.

This patch add errno in sim context, and save it before call coroutine setjump, restore before longjump.

When there is lot of sim feature provide by host, the errno may change by coroutine/switch context/...
The code to reproduce issue:
paste it into apps/examples/hello/hello_main.c

#include <nuttx/config.h>
#include <stdio.h>
#include <unistd.h>
#include <spawn.h>
#include <fcntl.h>
#include <sys/wait.h>
#include <sys/stat.h>

void *thread_gen_errno(void *arg)
{
  bool *gen_done = (bool *)arg;
  const char *dir_path = "/data1/test_is_dir";
  const char *chdir_path = "/data1";
  int ret = 0;

  ret = mkdir(dir_path, 0777);
  printf("mkdir: %s done, %d\n", chdir_path, ret);

  printf("child errno thread start\n");
  while(true) {
      usleep(1);

      if (*gen_done) {
        /* Regular exit */
        ret = 0;
        break;
      }

extern void host_errno_set(int errcode);
extern int host_errno_get(void);
      host_errno_set(EISDIR);
      ret = host_errno_get();
      if (ret != EISDIR) {
        printf("host_errno_set/get fail: set EISDIR, get %d\n", ret);
        *gen_done = true;
        break;
      }
    }

  rmdir(dir_path);

  printf("child errno thread end\n");
  return (void *)(uintptr_t)ret;
}

int main(int argc, FAR char *argv[])
{
  const int maxtimes = 1000*10000;
  const char *dir_path = "/data/test_rm_dir";
  int times = 0;
  bool gen_done = false;
  int ret;
  pid_t tid;
  pthread_attr_t attr;

  pthread_attr_init(&attr);
  attr.priority = CONFIG_EXAMPLES_HELLO_PRIORITY + 1;

  ret = pthread_create(&tid, &attr, thread_gen_errno, &gen_done);
  pthread_attr_destroy(&attr);

  if (ret != 0) {
      printf("Failed to spawn child thread: %d\n", ret);
      return -1;
    }

  for (int i = 0; i < maxtimes; i++) {
      times ++;
      ret = rmdir(dir_path);

      if (ret == 0) {
          printf("rmdir: %s succ", dir_path);
      } else {
        int err = errno;

        if (err == ENOENT) {
          /* printf("dir not exist\n"); The expected err */
        } else {
          printf("rmdir fail: %s (errno: %d)\n", strerror(err), err);
          break;
        }
      }

      if (gen_done) {
        /* Child thread exit abnormally */
        break;
      }

      if (times % 100000 == 0) {
        printf("exec done %d times\n", times);
      }
    }

  gen_done = true;
  pthread_join(tid, (void *)&ret);

  if (times == maxtimes && ret == 0) {
    printf("rmdir test success, total exec %d times\n", times);
  } else {
    printf("break after %d times, ret: %d\n", times, (int)ret);
  }

  return 0;
}

Impact

Now we don't have to take care the errno changed when "interrupt" happened. the context switch will matain the errno correctly.
Code to reproduce will no longer report break.

Testing

CI test, ubuntu 24.04 sim/nsh, local complex sim project.
After revert ac5b38c and add a host_errno_get & host_errno_set.
These demo will able to re-produce it in sim/nsh.

rmdir fail: Unknown error 21 (errno: 21)
child errno thread end
break after 3579966 times, ret: 0

and succ after whole pr.

rmdir test success, total exec 10000000 times

@github-actions github-actions bot added Arch: simulator Issues related to the SIMulator Size: M The size of the change in this PR is medium labels Jul 29, 2025
@xiaoxiang781216 xiaoxiang781216 requested a review from yamt July 29, 2025 18:42
@jasonbu jasonbu force-pushed the sim_save_errno branch 2 times, most recently from b38680a to 7c32224 Compare July 30, 2025 14:43
@xiaoxiang781216
Copy link
Contributor

@yamt could you review whether this patchset fix your concern?

@cederom
Copy link
Contributor

cederom commented Jul 30, 2025

Thank you @jasonbu :-) Do we have some sort of nuke test application for this case where corner cases can be generated and verified? It may be good to see how things are handled on various Sim:OS (i.e. Linux, BSD, macOS, etc)?

Is the code in PR description enough / sufficient to cover all scenarios or just a starting point? Can we put that as starting point of nuttx-apps/testing/simtest and create sim:simtest target so its easy to build/verify for everyone? :-)

@jasonbu
Copy link
Contributor Author

jasonbu commented Jul 31, 2025

Thank you @jasonbu :-) Do we have some sort of nuke test application for this case where corner cases can be generated and verified? It may be good to see how things are handled on various Sim:OS (i.e. Linux, BSD, macOS, etc)?

Is the code in PR description enough / sufficient to cover all scenarios or just a starting point? Can we put that as starting point of nuttx-apps/testing/simtest and create sim:simtest target so its easy to build/verify for everyone? :-)

There is some dependency if reproduce with this code,

  • sim and /data as hostfs
  • have other host business working with polling designed feature.

so in my environment, sim/citest cannot reproduce, but in internal project can reproduce with less than 10000 times loop.

if you have time to ensure it reproduced and add to sim-testing will appreciate

@jasonbu
Copy link
Contributor Author

jasonbu commented Jan 12, 2026

Thank you @jasonbu :-) Do we have some sort of nuke test application for this case where corner cases can be generated and verified? It may be good to see how things are handled on various Sim:OS (i.e. Linux, BSD, macOS, etc)?

Is the code in PR description enough / sufficient to cover all scenarios or just a starting point? Can we put that as starting point of nuttx-apps/testing/simtest and create sim:simtest target so its easy to build/verify for everyone? :-)

hi, @cederom ,
rebase to the lastest code, still working, and updated the sample.
it still corner case, and not eary to take it into standard test/example.

The 1 million times test, will not able to reproduce the problem.
using 10 million times, will >50% re-produce. also try use 2nd mount point and 2nd host relative API to generate a host-API caused errno set.
The possiblily will even lower, so finally use host_errno_set and host_errno_get to make it more eary to re-produce the problem.

In the real-complex scene. True issue will be, if we don't use host_uninterruptible_errno, will cause errno randomly covered by other host-API. And if we depends errno to do host behavior(should be regular desisng). errno occupation will cause a mess.

CC @yamt

@jerpelea jerpelea changed the title save errno in sim context switch to avoid coroutine make errno mess between host api calls. arch/sim: save errno in sim context switch to avoid coroutine make errno mess between host api calls. Jan 12, 2026
jerpelea
jerpelea previously approved these changes Jan 12, 2026
@cederom
Copy link
Contributor

cederom commented Jan 12, 2026

Thank you @jasonbu :-)

Copy link
Contributor

@cederom cederom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @jasonbu for the fix! Let's just make sure CI build errors are fixed before merge :-)

Copy link
Contributor

@cederom cederom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: Linker error fix :-)

…rno"

This reverts commit ac5b38c.

Keep host_errno_convert as a common interface in sim_internal.h
Keep the up_irq_save & up_irq_restore as common interface

Signed-off-by: buxiasen <buxiasen@xiaomi.com>
@jasonbu jasonbu dismissed stale reviews from jerpelea and xiaoxiang781216 via 8fac768 January 12, 2026 10:19
@jasonbu
Copy link
Contributor Author

jasonbu commented Jan 12, 2026

TODO: Linker error fix :-)

Ths for review, and help find where CI report error, the host_uninterruptible_errno usage increased during these days. fixed, and please check.

@jasonbu jasonbu requested a review from cederom January 12, 2026 11:02
We will remove the API in next commit, so remove the errno manual save
API and save errno more commonly.
After whole pull request, should no longer need to care about errno.

Signed-off-by: buxiasen <buxiasen@xiaomi.com>
Avoid the errno changed after interrupt in sim and make return value
mistake when according to errno.
Move host_errno_convert from macro to sim_errno.c.

Signed-off-by: buxiasen <buxiasen@xiaomi.com>
Avoid the errno changed after interrupt in sim and make return value
mistake when according to errno.

Signed-off-by: buxiasen <buxiasen@xiaomi.com>
Or will cause the fork break as call will auto push 1 item.

Signed-off-by: buxiasen <buxiasen@xiaomi.com>
@xiaoxiang781216 xiaoxiang781216 merged commit a938926 into apache:master Jan 13, 2026
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Arch: simulator Issues related to the SIMulator Size: M The size of the change in this PR is medium

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants