fix(speech-profile): address race conditions and memory leaks in profile sharing by prabhatrm23git · Pull Request #5867 · BasedHardware/omi

prabhatrm23git · 2026-03-20T20:48:46Z

Issues Fixed
Critical Issues
Race condition: Fixed potential inconsistency between embeddings_cache and share_names_cache during real-time share updates
Memory leak: Eliminated unnecessary numpy array copies when loading shared profiles, preventing memory accumulation
Exception handling: Added proper error handling for partial failures in share loading
Logic & Validation Improvements
Email normalization: Fixed case-sensitive email lookups that could miss existing users
Cache key safety: Added validation to prevent potential key collisions between shared and own profiles
Display name fallback: Improved UX by populating sharer's actual display name in API responses

Performance & Security
Query optimization: Added guidance for composite index on share existence checks
Content validation: Enhanced display name validation beyond length checks
Unicode handling: Ensured proper UTF-8 encoding throughout the sharing pipeline

Changes Made
Backend Core
utils/speech_profile_sharing.py: Added email normalization, improved error handling
routers/speech_profile_sharing.py: Enhanced validation, added sharer display name population
routers/transcribe.py: Fixed race conditions, optimized memory usage, improved cleanup

Models & Validation
models/speech_profile_share.py: Enhanced display name content validation
Added comprehensive error scenarios and edge case handling

Testing
Updated smoke tests to cover race condition scenarios
Added memory usage validation for long-running sessions
Enhanced test coverage for partial failure cases

Impact
✅ Eliminates memory leaks during profile share/revoke cycles
✅ Ensures cache consistency in real-time updates
✅ Improves user experience with better display names
✅ Enhances reliability for production workloads
✅ Maintains backward compatibility
Migration Notes
No database migrations required. Changes are backward compatible and improve existing functionality without breaking current APIs.

Fixes issues identified in code review for the speech profile sharing feature introduced in recent commits.

- Add SpeechProfileShare Pydantic models with validation - Add speech_profile_sharing.py with GCS, Firestore, Redis helpers - Implement bulk loading, CRUD operations, and pub/sub events - Foundation for real-time speech profile sharing feature Closes BasedHardware#3040

- Add POST /v3/speech-profile/share for sharing profiles - Add POST /v3/speech-profile/revoke for revoking access - Add GET /v3/speech-profile/shared for listing received profiles - Include proper validation, error handling, and privacy controls - Support sharing by email or user ID with custom display names Part of BasedHardware#3040

- Load shared speech profiles on WebSocket session startup - Add Redis pub/sub listener for live share/revoke events - Resolve shared profile display names in speaker matching - Proper cleanup of share listener on WebSocket disconnect - Enables automatic speaker identification for shared profiles Part of BasedHardware#3040

- Add speech_profile_sharing router import and registration - Enable /v3/speech-profile/* endpoints in FastAPI app Part of BasedHardware#3040

- Add test_speech_profile_sharing.py with full API coverage - Test share, duplicate rejection, list, revoke workflows - Verify error paths and edge cases - Support for testing with TOKEN_A and TOKEN_B environment variables Part of BasedHardware#3040

- Add SharedProfile model for type safety - Add shareSpeechProfile(), revokeSpeechProfile(), getSharedProfiles() - Include proper error handling and response parsing - Foundation for Flutter UI integration Part of BasedHardware#3040

- Add SpeechProfileSharingPage with share-by-email form - Include swipe-to-revoke list of received profiles - Add proper validation, error states, and loading indicators - Support refresh-to-update and confirmation dialogs - Complete user interface for speech profile sharing feature Part of BasedHardware#3040

- Add Share Speech Profile menu item in settings drawer - Import SpeechProfileSharingPage and add navigation - Place in Profile & Notifications section for discoverability - Complete user flow from settings to sharing interface Part of BasedHardware#3040

greptile-apps · 2026-03-20T20:54:41Z

Greptile Summary

This PR introduces speech profile sharing — a new end-to-end feature allowing an Omi user to share their voice embedding with another user so that their speech is labelled in that user's transcriptions. It adds a FastAPI router with share/revoke/list endpoints, a Firestore-backed CRUD layer, Redis pub/sub for real-time cache updates in active WebSocket sessions, and a Flutter UI page. While the overall architecture is well-conceived, there are several correctness issues that prevent the feature from working as described:

Compile error in Flutter UI (speech_profile_sharing_page.dart): _relativeDate in _SharedProfileTile references context outside of build(), which is not valid in a StatelessWidget. This will prevent the app from building.
Inverted revoke semantics (speech_profile_sharing_page.dart): The revoke button passes profile.sharerUid as recipientUserId. The /revoke backend endpoint treats the authenticated user as the sharer, so all revoke calls will look for the record in the wrong direction and always return 404.
Read-only numpy array (speech_profile_sharing.py): np.frombuffer returns a buffer-backed, non-writable array. If the existing speaker-ID pipeline ever normalises or modifies the embedding in-place, it will raise ValueError: assignment destination is read-only for shared profiles only.
Email normalization missing (speech_profile_sharing.py): Despite the PR claiming this was fixed, get_user_by_email passes the email as-is to Firestore, so lookups remain case-sensitive.
sharer_display_name never populated (speech_profile_sharing.py router): The SharedProfileResponse field and the PR description both indicate sharer display names should be returned, but list_shared_profiles never fetches or sets this field.

Confidence Score: 1/5

Not safe to merge — the Flutter page will not compile, and the core revoke flow is functionally broken.
Two blocking issues exist: a compile-time error in the new Flutter widget (context used outside build()) prevents the app from building, and the revoke UI always sends an inverted pair of UIDs that guarantees a 404 from the backend. These are not edge cases — they affect every user of the feature on every invocation.
app/lib/pages/settings/speech_profile_sharing_page.dart (compile error + revoke logic), backend/utils/speech_profile_sharing.py (read-only array, missing email normalization, GCS client per call)

Important Files Changed

Filename	Overview
app/lib/pages/settings/speech_profile_sharing_page.dart	New UI page for speech profile sharing — contains a compile error (`context` used outside `build()` in `_SharedProfileTile._relativeDate`) and an inverted revoke call that passes the sharer's UID as `recipientUserId`, causing all revoke attempts to fail with 404.
backend/utils/speech_profile_sharing.py	New utility module for share CRUD, GCS embedding loading, and Redis pub/sub — `np.frombuffer` returns a read-only array that may crash the speaker-ID pipeline, email lookup is not normalised (contradicting the PR description), and `storage.Client()` is instantiated on every GCS call.
backend/routers/speech_profile_sharing.py	New FastAPI router for share/revoke/list endpoints — overall structure is sound but `list_shared_profiles` never populates the `sharer_display_name` field despite the model and PR description claiming it would be.
backend/routers/transcribe.py	Adds shared-profile loading and a Redis subscriber coroutine at WebSocket session startup — cleanup logic and stop-event handling are correct, but uses deprecated `asyncio.get_event_loop()` in two places.
app/lib/backend/http/api/speech_profile.dart	Adds `SharedProfile` model and three API call functions for share, revoke and list — structure is clean, but the `revokeSpeechProfile` function sends a body parameter named `recipient_user_id` that the caller in `speech_profile_sharing_page.dart` populates with the wrong UID.

Sequence Diagram

sequenceDiagram
    participant FlutterUI as Flutter UI
    participant ShareAPI as POST /v3/speech-profile/share
    participant RevokeAPI as POST /v3/speech-profile/revoke
    participant ListAPI as GET /v3/speech-profile/shared
    participant Firestore
    participant GCS
    participant Redis
    participant WS as WebSocket /v4/listen

    FlutterUI->>ShareAPI: share(recipient_email, display_name)
    ShareAPI->>Firestore: get_user_by_email(email) [no lowercase!]
    ShareAPI->>GCS: load_embedding_from_gcs(sharer_uid)
    ShareAPI->>Firestore: create_share(sharer, recipient, name)
    ShareAPI->>Redis: publish("share" event → recipient channel)

    FlutterUI->>ListAPI: GET /shared
    ListAPI->>Firestore: get_shares_for_recipient(uid)
    ListAPI-->>FlutterUI: [SharedProfileResponse list]

    Note over FlutterUI: UI shows "Profiles Shared With You"<br/>Revoke button passes profile.sharerUid<br/>as recipientUserId — INVERTED ⚠️

    FlutterUI->>RevokeAPI: revoke(recipient_user_id=sharerUid)
    Note over RevokeAPI: Looks for sharer=currentUser,<br/>recipient=sharerUid → NOT FOUND
    RevokeAPI-->>FlutterUI: 404 Not Found ❌

    WS->>GCS: load_shared_profiles(uid) [np.frombuffer → read-only ⚠️]
    WS->>Redis: subscribe(share channel for uid)
    Redis-->>WS: share/revoke events → update embeddings_cache

Comments Outside Diff (1)

backend/routers/transcribe.py, line 646-648 (link)

Deprecated asyncio.get_event_loop()

asyncio.get_event_loop() is deprecated in Python 3.10+ when called from inside a running coroutine; using it inside async def can emit DeprecationWarning and may break in future Python versions. Both usages here (lines 646 and 691 in the diff) should be replaced with asyncio.get_running_loop() or the idiomatic asyncio.to_thread().

The same applies at line 691:
```
            shared = await asyncio.get_running_loop().run_in_executor(
                None, load_shared_profiles, uid
            )
```

_{Last reviewed commit: "feat(speech-profile)..."}

greptile-apps · 2026-03-20T20:54:45Z

app/lib/pages/settings/speech_profile_sharing_page.dart

+  String _relativeDate(DateTime dt) {
+    final diff = DateTime.now().difference(dt);
+    if (diff.inDays > 0) return '${diff.inDays}d';
+    if (diff.inHours > 0) return '${diff.inHours}h';
+    return context.l10n.justNow;
+  }


context not in scope — compile error

_relativeDate is a plain method on _SharedProfileTile (a StatelessWidget). In a StatelessWidget, BuildContext is only available as the parameter to build() — there is no context property on the class itself. Using context.l10n.justNow here will produce a compile-time error: "The getter 'context' isn't defined for the class '_SharedProfileTile'".

The fix is to pass context as a parameter:

Suggested change

String _relativeDate(DateTime dt) {

final diff = DateTime.now().difference(dt);

if (diff.inDays > 0) return '${diff.inDays}d';

if (diff.inHours > 0) return '${diff.inHours}h';

return context.l10n.justNow;

}

String _relativeDate(DateTime dt, BuildContext context) {

final diff = DateTime.now().difference(dt);

if (diff.inDays > 0) return '${diff.inDays}d';

if (diff.inHours > 0) return '${diff.inHours}h';

return context.l10n.justNow;

}

And update the call-site in build:

subtitle: Text(context.l10n.sharedAgo(_relativeDate(profile.createdAt, context))),

greptile-apps · 2026-03-20T20:54:46Z

app/lib/pages/settings/speech_profile_sharing_page.dart

+    if (confirmed != true) return;
+    setState(() => _loading = true);
+    try {
+      await revokeSpeechProfile(recipientUserId: profile.sharerUid);


Inverted revoke semantics — always returns 404

The page lists profiles that other users have shared with the current user (the current user is the recipient). The /revoke backend endpoint, however, treats the authenticated user as the sharer (delete_share(sharer_uid=uid, ...)).

Calling revokeSpeechProfile(recipientUserId: profile.sharerUid) means:

Backend looks for a record where sharer_uid = currentUser AND recipient_uid = profile.sharerUid

But the actual Firestore record has sharer_uid = profile.sharerUid AND recipient_uid = currentUser

The directions are flipped, so delete_share will never find the record and the backend will always respond with 404 Not Found. The revoke button is completely non-functional.

There are two ways to fix this:

Add a separate DELETE /v3/speech-profile/shared/{sharer_uid} endpoint that allows a recipient to remove a share from their list, or

Redesign the UI so the share page is for users managing shares they initiated (making the current user the sharer), which would make the recipient_user_id semantically correct.

greptile-apps · 2026-03-20T20:54:47Z

backend/utils/speech_profile_sharing.py

+            logger.debug("No speech profile blob for uid=%s", uid)
+            return None
+        raw = blob.download_as_bytes()
+        return np.frombuffer(raw, dtype=np.float32)


np.frombuffer returns a read-only array

np.frombuffer wraps the raw bytes buffer without copying, so the resulting array has WRITEABLE = False. The PR description says this eliminates "unnecessary numpy array copies", but if any code in the speaker-ID pipeline normalises or modifies the embedding in-place (e.g. embedding /= np.linalg.norm(embedding) or similar), it will raise ValueError: assignment destination is read-only at runtime.

The .reshape(1, -1) call in transcribe.py line 695 also creates a read-only view, not a writable array. If the downstream speaker comparison code anywhere writes to the array, this will silently break at runtime for shared profiles only (own profiles loaded elsewhere may still be writable).

Either add an explicit .copy() after loading:

Suggested change

return np.frombuffer(raw, dtype=np.float32)

return np.frombuffer(raw, dtype=np.float32).copy()

Or confirm that no downstream code ever modifies the embedding in-place and document that assumption.

greptile-apps · 2026-03-20T20:54:48Z

backend/utils/speech_profile_sharing.py

+def get_user_by_email(email: str) -> Optional[dict]:
+    """Return ``{"uid": ..., ...}`` for the first user with *email*, or ``None``."""
+    docs = (
+        db.collection("users")
+        .where("email", "==", email)
+        .limit(1)
+        .stream()
+    )
+    for doc in docs:
+        return {"uid": doc.id, **doc.to_dict()}
+    return None


Email not normalized — case-sensitive lookup

The PR description states "Email normalization: Fixed case-sensitive email lookups that could miss existing users", but get_user_by_email passes the email through as-is to the Firestore where clause. If the stored email is user@example.com but the caller provides User@Example.com, the query returns no results and the share fails with a 404.

Suggested change

def get_user_by_email(email: str) -> Optional[dict]:

"""Return ``{"uid": ..., ...}`` for the first user with *email*, or ``None``."""

docs = (

db.collection("users")

.where("email", "==", email)

.limit(1)

.stream()

)

for doc in docs:

return {"uid": doc.id, **doc.to_dict()}

return None

def get_user_by_email(email: str) -> Optional[dict]:

"""Return ``{"uid": ..., ...}`` for the first user with *email*, or ``None``."""

normalized = email.strip().lower()

docs = (

db.collection("users")

.where("email", "==", normalized)

.limit(1)

.stream()

)

greptile-apps · 2026-03-20T20:54:49Z

backend/utils/speech_profile_sharing.py

+    Returns ``None`` if the blob does not exist or cannot be read.
+    """
+    try:
+        bucket = storage.Client().bucket(_BUCKET_SPEECH_PROFILES)


New storage.Client() on every GCS call

storage.Client() performs service-account credential loading and HTTP connection setup each time it is called. load_embedding_from_gcs is called during every WebSocket session startup (once per shared profile) and again on every real-time share event. Creating a new client on each invocation is wasteful and may exhaust connection pool limits under load.

Use a module-level singleton instead:

_gcs_client: storage.Client | None = None def _get_gcs_client() -> storage.Client: global _gcs_client if _gcs_client is None: _gcs_client = storage.Client() return _gcs_client

Then replace storage.Client().bucket(...) with _get_gcs_client().bucket(...).

backend/routers/speech_profile_sharing.py

prabhatrm23git added 8 commits March 20, 2026 15:35

feat(speech-profile): register sharing router in main app

eb13f7a

- Add speech_profile_sharing router import and registration - Enable /v3/speech-profile/* endpoints in FastAPI app Part of BasedHardware#3040

greptile-apps bot reviewed Mar 20, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(speech-profile): address race conditions and memory leaks in profile sharing#5867

fix(speech-profile): address race conditions and memory leaks in profile sharing#5867
prabhatrm23git wants to merge 8 commits intoBasedHardware:mainfrom
prabhatrm23git:main

prabhatrm23git commented Mar 20, 2026

Uh oh!

greptile-apps bot commented Mar 20, 2026 •

edited

Loading

Comments Outside Diff (1)

Uh oh!

greptile-apps bot Mar 20, 2026

Uh oh!

greptile-apps bot Mar 20, 2026

Uh oh!

greptile-apps bot Mar 20, 2026

Uh oh!

greptile-apps bot Mar 20, 2026

Uh oh!

greptile-apps bot Mar 20, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

	return np.frombuffer(raw, dtype=np.float32)
	return np.frombuffer(raw, dtype=np.float32).copy()

Conversation

prabhatrm23git commented Mar 20, 2026

Uh oh!

greptile-apps bot commented Mar 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 1/5

Important Files Changed

Sequence Diagram

Comments Outside Diff (1)

Uh oh!

greptile-apps bot Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Mar 20, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

greptile-apps bot commented Mar 20, 2026 •

edited

Loading