@hujun260

Summary

This PR fixes a critical recursion issue in the notifier system during early system
startup. When a DEBUGASSERT fires, _assert() calls panic_notifier_call_chain(), which
takes its lock through sched_lock(). sched_lock() itself contains a DEBUGASSERT that can
fail this early in boot, so each failure triggers another and the result is a cascading
chain of recursive DEBUGASSERT failures.

The fix adds explicit notifier head pointer checks before executing notifier chains, so
an empty or uninitialized chain is skipped without taking the lock. This prevents
unnecessary lock operations during system startup and on error handling paths.

Problem Description

During early system startup, or when a DEBUGASSERT fails in the scheduler:

  1. The ARM64 fatal handler (arm64_fatal_handler) triggers a DEBUGASSERT
  2. _assert() is called
  3. panic_notifier_call_chain() attempts lock acquisition, which calls sched_lock()
  4. sched_lock() contains DEBUGASSERT(rtcb && rtcb->lockcount < MAX_LOCK_COUNT)
  5. Since this_task() may return NULL during startup, that DEBUGASSERT fails and
     _assert() is entered again
  6. The cycle repeats, causing cascading failures instead of a clean panic/reboot
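
The cycle above can be reproduced in a toy model. This is only a sketch: every toy_* name, g_rtcb, TOY_MAX_LOCK_COUNT and the depth cap are hypothetical stand-ins invented for illustration, not NuttX code. The cap exists only so the demo terminates; the real failure has no such limit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the running task's TCB. */

struct toy_tcb
{
  int lockcount;
};

#define TOY_MAX_LOCK_COUNT 255   /* Assumed limit, for the demo only */
#define TOY_MAX_DEPTH      4     /* Cap so the demo recursion terminates */

static struct toy_tcb *g_rtcb;   /* NULL models this_task() during startup */
static int g_assert_depth;       /* Number of failed assertions seen */

static void toy_assert(bool cond);

/* Models sched_lock(): asserts on the TCB before touching it. */

static void toy_sched_lock(void)
{
  toy_assert(g_rtcb != NULL && g_rtcb->lockcount < TOY_MAX_LOCK_COUNT);
  if (g_rtcb != NULL)
    {
      g_rtcb->lockcount++;
    }
}

/* Models panic_notifier_call_chain(): always locks, even with no notifiers. */

static void toy_panic_notifier_call_chain(void)
{
  toy_sched_lock();
}

/* Models _assert(): a failed assertion runs the panic notifier chain. */

static void toy_assert(bool cond)
{
  if (cond)
    {
      return;
    }

  if (++g_assert_depth >= TOY_MAX_DEPTH)
    {
      return;   /* Demo-only cutoff */
    }

  toy_panic_notifier_call_chain();
}

/* Models the fatal handler; returns how many assertions failed in total. */

int toy_fatal_handler(void)
{
  g_assert_depth = 0;
  toy_assert(false);
  return g_assert_depth;
}
```

With g_rtcb left NULL, toy_fatal_handler() keeps re-entering toy_assert() until the cap is reached. Once g_rtcb points at a valid TCB, only the original assertion fails and the chain runs once.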

Changes Made

Atomic Notifier Chain (ATOMIC_NOTIFIER_CALL macro)

Before:

do {
  FAR struct atomic_notifier_head *nh = (nhead);
  irqstate_t flags;
  flags = rspin_lock_irqsave_nopreempt(&nh->lock);
  notifier_call_chain(nh->head, (val), (v), -1, NULL);
  rspin_unlock_irqrestore_nopreempt(&nh->lock, flags);
} while (0)
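
A minimal sketch of the head pointer check described in the Summary, not the actual patch. The struct layouts, mock_lock()/mock_unlock(), the three-argument notifier_call_chain() and the macro name ATOMIC_NOTIFIER_CALL_GUARDED are all hypothetical stand-ins so that the example compiles outside NuttX; the real macro uses rspin_lock_irqsave_nopreempt() and the five-argument call shown above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the NuttX notifier types. */

struct notifier_block
{
  int (*notifier_call)(struct notifier_block *nb, unsigned long val, void *v);
  struct notifier_block *next;
};

struct atomic_notifier_head
{
  int lock;                      /* Placeholder for the real spinlock */
  struct notifier_block *head;   /* NULL when empty or uninitialized */
};

static int g_lock_calls;         /* Counts lock acquisitions (demo only) */
static int g_calls;              /* Counts notifier callbacks (demo only) */

/* Mock lock: stands in for rspin_lock_irqsave_nopreempt(). */

static int mock_lock(int *lock)
{
  g_lock_calls++;
  *lock = 1;
  return 0;
}

/* Mock unlock: stands in for rspin_unlock_irqrestore_nopreempt(). */

static void mock_unlock(int *lock, int flags)
{
  (void)flags;
  assert(*lock == 1);
  *lock = 0;
}

/* Simplified chain walk: calls every registered notifier in order. */

static void notifier_call_chain(struct notifier_block *nb,
                                unsigned long val, void *v)
{
  for (; nb != NULL; nb = nb->next)
    {
      nb->notifier_call(nb, val, v);
    }
}

/* Example callback that only counts invocations. */

static int count_cb(struct notifier_block *nb, unsigned long val, void *v)
{
  (void)nb;
  (void)val;
  (void)v;
  g_calls++;
  return 0;
}

/* Guarded variant: the lock is taken only if the chain is non-empty. */

#define ATOMIC_NOTIFIER_CALL_GUARDED(nhead, val, v) \
  do \
    { \
      struct atomic_notifier_head *nh = (nhead); \
      if (nh->head != NULL) \
        { \
          int flags = mock_lock(&nh->lock); \
          notifier_call_chain(nh->head, (val), (v)); \
          mock_unlock(&nh->lock, flags); \
        } \
    } \
  while (0)
```

Calling the guarded macro on a head whose head pointer is NULL never reaches mock_lock(), which is the property that keeps the panic path out of sched_lock(). Note that in this sketch the head pointer is read before the lock is held.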

@github-actions bot added the labels "Area: OS Components" and "Size: S" on Jan 22, 2026
jerpelea previously approved these changes Jan 22, 2026
Add head pointer checks in notifier_call_chain macros to prevent recursion into
sched_lock() when assertions are triggered during early system startup. This avoids
cascading DEBUGASSERT failures when the notifier head is empty or uninitialized.
backtrace:
  DEBUGASSERT(rtcb && rtcb->lockcount < MAX_LOCK_COUNT);
  sched_lock()
  panic_notifier_call_chain
  _assert()
  arm64_fatal_handler

Signed-off-by: hujun5 <hujun5@xiaomi.com>
@hujun260 dismissed stale reviews from xiaoxiang781216 and jerpelea via cf95a58 on January 22, 2026 12:18
@hujun260 force-pushed the apache_assert_fix_DEBUGASSERT branch from ad076fb to cf95a58 on January 22, 2026 12:18
@acassis requested review from cederom and linguini1 on January 23, 2026 13:55