Skip to content

Feat - refactor SD card, improve bluetooth throughput, remove wifi feature, fix issues#5897

Open
TuEmb wants to merge 21 commits intoBasedHardware:mainfrom
TuEmb:TuEmb/sd_card_improvement
Open

Feat - refactor SD card, improve bluetooth throughput, remove wifi feature, fix issues#5897
TuEmb wants to merge 21 commits intoBasedHardware:mainfrom
TuEmb:TuEmb/sd_card_improvement

Conversation

@TuEmb
Copy link
Contributor

@TuEmb TuEmb commented Mar 22, 2026

Fixed #4429 #4351 #4213 #4101 #4052 #4077 #3976 #5634

This pull request removes WiFi support from the OMI firmware and introduces several improvements to audio storage and Bluetooth configuration. The WiFi-related code, configuration options, and dependencies have been removed from both the build system and source files. Additionally, the SD card storage subsystem is enhanced with new request types and structures to better support file management. Bluetooth and logging configurations are streamlined and improved, and the default filesystem is switched to LittleFS for better reliability.

Major removals and refactors:

  • WiFi support removed:
    • All WiFi-related configuration options and code have been deleted from omi.conf, Kconfig, CMakeLists.txt, and source files, including conditional compilation, initialization, and shutdown logic.

Audio storage and SD card improvements:

  • SD card subsystem extended:
    • New request types and structures are added to sd_card.h for advanced file management, including file stats, file listing, deletion, and rotation support.

Bluetooth and logging configuration:

  • Bluetooth performance and connection improvements:
    • Increased buffer counts and enabled automatic MTU updates for higher throughput and stability.
    • Preferred connection interval tightened for faster communication.

Filesystem and memory usage:

  • Switch to LittleFS and memory optimization:
    • LittleFS is now the default filesystem for SD NAND, replacing FATFS for improved power-loss safety and reliability. Memory pool sizes have been adjusted for efficiency.

These changes collectively focus the firmware on offline and Bluetooth functionality, improve reliability and maintainability, and lay groundwork for more robust audio file management.

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 22, 2026

Greptile Summary

This PR is a substantial firmware refactor that removes the WiFi feature, migrates the SD card filesystem from FATFS to LittleFS (for power-loss safety and wear-leveling), upgrades the BLE audio sync protocol to support multi-file operations, and improves BLE throughput through tighter connection intervals, semaphore-throttled audio notifications, and MTU retry logic.

Key changes:

  • SD card: Full rewrite of sd_card.c — LittleFS over raw disk_access, custom aligned I/O callbacks, priority + regular message queues, boot-time lfs_fs_gc pre-warm to prevent multi-second stalls on first write, file rotation every 30 min, time-sync rename of TMP files.
  • BLE sync protocol: New multi-file commands (CMD_LIST_FILES 0x10, CMD_READ_FILE 0x11, CMD_DELETE_FILE 0x12), auto-delete after successful transfer, sync speed telemetry.
  • BLE throughput: CONFIG_BT_L2CAP_TX_BUF_COUNT doubled to 20, CONFIG_BT_ATT_TX_COUNT set to 20, bt_gatt_notify_cb with completion callback to avoid buffer exhaustion, semaphore-gated audio TX reserving 2 slots for battery/diagnostics, connection interval tightened from 30 ms to 15 ms max.
  • Codec: Replaced 10 ms polling sleep with a semaphore, inner while loop drains all buffered data per wakeup.
  • RTC: Persistence and SD notification deferred to system workqueue to keep BLE GATT callback stack shallow.
  • WiFi: All WiFi-related code (wifi.c, OMI_ENABLE_WIFI Kconfig, WiFi GATT characteristic, networking stack configs) removed.

Issues found:

  • sync_file_count / sync_file_list / sync_file_sizes in storage.c are accessed from the BLE GATT callback thread without synchronisation while the storage thread writes them (race condition).
  • _le_param_updated in transport.c calls update_conn_params which sleeps up to 900 ms in retries — this can block the BLE host thread and create a tight parameter-negotiation loop if the central consistently settles above the threshold.
  • sd_notify_time_synced is called twice per time-sync event (once directly in transport.c:244, once via rtc_persist_work_handler in rtc.c:32).
  • CONFIG_UART_CONSOLE=y is duplicated in prj.conf (lines 191 and 201).

Confidence Score: 4/5

  • Mostly safe to merge; the race condition in storage.c should be addressed before the sync protocol is exercised under load.
  • The PR represents a very well-considered refactor with good separation of concerns, detailed comments, and solid defensive coding throughout sd_card.c and transport.c. The WiFi removal and LittleFS migration are straightforward. The one structural issue — unsynchronised access to sync_file_count/list/sizes between the BLE callback thread and the storage thread — is a real P1 concurrency bug that can manifest as stale-data transfers or incorrect error responses during multi-file sync. The double sd_notify_time_synced is benign but should be cleaned up. The _le_param_updated sleep issue is worth fixing before production but unlikely to cause hard failures on most phones.
  • omi/firmware/omi/src/lib/core/storage.c (race condition on sync_file_count/sync_file_list), omi/firmware/omi/src/lib/core/transport.c (sleep in _le_param_updated, duplicate sd_notify call)

Important Files Changed

Filename Overview
omi/firmware/omi/src/sd_card.c Complete rewrite: FATFS → LittleFS over raw disk_access with custom I/O callbacks, priority + regular message queues, persistent read handle, boot-time lfs_fs_gc pre-warm, multi-file API, file rotation, time-sync rename, and graceful shutdown. Very large and carefully architected change; no critical bugs found in the SD layer itself.
omi/firmware/omi/src/lib/core/storage.c Completely restructured BLE sync protocol (multi-file commands, file-list response, auto-delete, sync speed logging). Contains a P1 race condition: sync_file_count/sync_file_list/sync_file_sizes are written on the storage thread and read unsynchronised from the BLE GATT callback thread.
omi/firmware/omi/src/lib/core/transport.c BLE throughput improvements: semaphore-throttled audio TX, bt_gatt_notify_cb with completion callback, MTU recheck loop, 2M PHY retry, tighter connection interval (15ms), BT NUS shell support. Two issues: blocking sleep in _le_param_updated callback and redundant sd_notify_time_synced call.
omi/firmware/omi/src/lib/core/codec.c Replaced 10ms polling sleep with a semaphore (codec_data_sem); inner while loop now drains all available ring-buffer data per wakeup, reducing latency and CPU overhead.
omi/firmware/omi/src/rtc.c RTC persistence and SD time-sync notification deferred to system work queue to keep BLE GATT callback stack shallow. The sd_notify_time_synced call here duplicates the one that remains in transport.c's time_sync_write_handler.
omi/firmware/omi/prj.conf New project config file (omi.conf content promoted to prj.conf). Contains a duplicate CONFIG_UART_CONSOLE=y (lines 191 and 201) that is harmless but confusing.
omi/firmware/omi/src/main.c Removed WiFi init call, added LED blinking red warning when RTC is not synced (offline storage mode), removed per-second UTC timestamp log.
omi/firmware/omi/omi.conf WiFi config removed, FAT filesystem replaced with LittleFS, BLE buffers doubled (L2CAP TX buf 10→20, ATT TX count 20), connection interval tightened (30ms→15ms max), HEAP reduced from 175 KB to 40 KB now that networking stack is gone.

Sequence Diagram

sequenceDiagram
    participant Phone as Phone App
    participant BLE as BLE Stack (transport.c)
    participant Store as Storage Thread (storage.c)
    participant SD as SD Worker (sd_card.c / LittleFS)

    Phone->>BLE: Write CMD_LIST_FILES (0x10)
    BLE->>Store: list_files_requested = 1
    Store->>SD: REQ_GET_FILE_LIST (prio queue)
    SD-->>Store: filenames + sizes
    Store->>Phone: Notify [count][ts1][sz1]...[tsN][szN]

    Phone->>BLE: Write CMD_READ_FILE (0x11) [index][offset]
    BLE->>Store: setup_file_transfer(index, offset)
    Store->>Store: remaining_length = file_size - offset
    loop BLE send loop (write_to_gatt)
        Store->>SD: REQ_READ_DATA(filename, offset, chunk) (prio queue)
        SD-->>Store: audio bytes
        Store->>Phone: Notify [ts:4][data:chunk]
    end
    Store->>SD: Auto-delete file (if not active recording)
    Store->>Phone: Notify [100] (done)

    Phone->>BLE: Write CMD_DELETE_FILE (0x12) [index]
    BLE->>Store: delete_file_index = index
    Store->>SD: REQ_DELETE_FILE (worker loop)
    SD-->>Store: result
    Store->>Phone: Notify [result]

    Phone->>BLE: Time-sync write [epoch:4]
    BLE->>BLE: rtc_set_utc_time()
    BLE->>SD: sd_notify_time_synced() → REQ_TIME_SYNCED (prio queue)
    SD->>SD: Rename TMP_*.txt → epoch.txt
Loading

Last reviewed commit: "disable log for rele..."

Comment on lines +45 to +47
static char sync_file_list[MAX_AUDIO_FILES][MAX_FILENAME_LEN];
static uint32_t sync_file_sizes[MAX_AUDIO_FILES];
static int sync_file_count = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Race condition on shared file-list state

sync_file_count, sync_file_list, and sync_file_sizes are static module-level variables that are written by the storage thread (in refresh_file_list_cache, line 246–254) and read concurrently from the BLE GATT write callback (parse_storage_command, lines 375–400) — which runs on the BLE host thread.

Without a mutex or atomic access, a classic TOCTOU race exists:

  1. BLE callback reads sync_file_count = 5.
  2. Storage thread resets sync_file_count = 0 and starts repopulating sync_file_list.
  3. BLE callback uses index 4 against a partially-populated sync_file_list, reading stale or zero-initialized data.

The consequence is that the storage layer could set up a transfer (setup_file_transfer) with a garbage filename, leading to an immediate read error or a corrupted transfer session.

A lightweight fix is to protect reads/writes with a K_MUTEX_DEFINE or to copy the required slice of the list under lock before using it in parse_storage_command.

Comment on lines +654 to +657
if (interval > 24) {
LOG_WRN("Connection interval still high (%u units). Re-requesting preferred params.", interval);
update_conn_params(conn);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Blocking sleep inside BLE connection-parameters callback

_le_param_updated is a BLE stack callback. When interval > 24, it calls update_conn_params, which contains a retry loop that sleeps up to CONN_PARAM_RETRY_COUNT × CONN_PARAM_RETRY_DELAY_MS = 3 × 300 ms = 900 ms on the BLE host thread.

Two problems:

  1. Latency spike: Blocking the BLE host thread for up to 900 ms can delay or drop incoming/outgoing GATT operations, causing the phone to see ATT timeouts.
  2. Negotiation loop: If the central device keeps settling on an interval slightly above 24 units (e.g., a phone that always negotiates 30 ms), every _le_param_updated callback will trigger another round of update_conn_params → re-request → new _le_param_updated, forming a tight re-negotiation cycle for the lifetime of the connection.

The retry logic is appropriate in _transport_connected, but from _le_param_updated a single non-blocking call (no retry loop, no sleep) is safer:

Suggested change
if (interval > 24) {
LOG_WRN("Connection interval still high (%u units). Re-requesting preferred params.", interval);
update_conn_params(conn);
}
if (interval > 24) {
LOG_WRN("Connection interval still high (%u units). Re-requesting preferred params.", interval);
bt_conn_le_param_update(conn, &(struct bt_le_conn_param){
.interval_min = 6, .interval_max = 12, .latency = 0, .timeout = 400
});
}

LOG_INF("Time synchronized successfully");

/* Notify SD card module so it can rename temp files to real timestamps */
sd_notify_time_synced(epoch_s);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Double call to sd_notify_time_synced per time-sync event

sd_notify_time_synced(epoch_s) is called here directly from time_sync_write_handler (line 244). However, rtc_set_utc_time (called at line 235) now schedules rtc_persist_work, whose handler rtc_persist_work_handler in rtc.c also calls sd_notify_time_synced (line 32 of rtc.c).

This means for every BLE time-sync event:

  1. sd_notify_time_synced fires immediately (this line, BLE callback thread).
  2. sd_notify_time_synced fires again shortly after (system work queue from rtc.c).

The SD worker receives two REQ_TIME_SYNCED requests. The second rename attempt will encounter current_file_needs_rename == false and silently return, so there is no corruption — but it does queue an extra priority message, needlessly occupying the priority queue and adding log noise.

Since rtc.c now owns the SD notification as part of RTC persistence, the call here in transport.c should be removed:

Suggested change
sd_notify_time_synced(epoch_s);
LOG_INF("Time synchronized successfully");

## Enable the lines below to enable logs via UART/USB
CONFIG_LOG=y
CONFIG_LOG_PRINTK=y
CONFIG_UART_CONSOLE=y
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Duplicate CONFIG_UART_CONSOLE=y

CONFIG_UART_CONSOLE=y appears twice in this file — at line 191 (under the "Console" section) and again at line 201 (under the "Logs" section). The second occurrence is harmless but creates confusion about where the canonical setting lives. The one in the Logs block appears to be a copy-paste artefact from the comment block above it.

Suggested change
CONFIG_UART_CONSOLE=y
CONFIG_LOG_PRINTK=y
CONFIG_LOG_PROCESS_THREAD_PRIORITY=5

Comment on lines +569 to +570
// Delay a bit before PHY request to avoid early HCI race on some phones.
k_sleep(K_MSEC(300));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Inconsistent indentation

The comment and k_sleep call at lines 569–570 use 8-space indentation while all surrounding code in _transport_connected uses 4-space indentation, suggesting this block was copied from inside a nested scope and never re-indented.

Suggested change
// Delay a bit before PHY request to avoid early HCI race on some phones.
k_sleep(K_MSEC(300));
// Delay a bit before PHY request to avoid early HCI race on some phones.
k_sleep(K_MSEC(300));

@TuEmb
Copy link
Contributor Author

TuEmb commented Mar 22, 2026

Power consumption profile @4.2V:

Offline mode: 8.65 mA
image

Live mode: 7.68 mA
image

Sync mode: 17.7 mA
image

Deepsleep: 34 uA
image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

offline timestamp issues fix

1 participant