feat: Add speaker name mapping to transcript segments in developer API and webhook #3962

smian1 · 2025-12-29T09:33:49Z

Summary

Adds speaker_name field to transcript segments in developer API responses
Adds speaker_name field to conversation_created webhook payload
Speaker names are resolved from person_id mappings stored in the database

Changes

backend/routers/developer.py: Added _add_speaker_names_to_segments() helper and integrated into GET /v1/dev/user/conversations endpoints
backend/utils/webhooks.py: Added _add_speaker_names_to_payload() helper and integrated into conversation_created_webhook()

Result

Each transcript segment now includes a speaker_name field:

"User" if the segment is from the user
The person's actual name if person_id is assigned
"Speaker {id}" as fallback

🤖 Generated with Claude Code

…d webhook handlers

gemini-code-assist

Code Review

This pull request adds the speaker_name to transcript segments in developer API responses and conversation_created webhooks. The implementation correctly fetches person data to map person_id to a name.

My main feedback is regarding code duplication. The logic for adding speaker names is implemented in two separate places (backend/routers/developer.py and backend/utils/webhooks.py). This is a maintainability concern. I've left a specific comment with a suggestion to refactor one of the implementations for better efficiency and highlighted the need to consolidate this logic into a shared utility function in a follow-up PR to avoid duplication, which aligns with our guidelines for handling large-scale refactoring in separate PRs.

gemini-code-assist · 2025-12-29T09:35:42Z

backend/utils/webhooks.py

+    person_ids = [seg.get('person_id') for seg in segments if seg.get('person_id')]
+    people_map = {}
+    if person_ids:
+        people_data = users_db.get_people_by_ids(uid, list(set(person_ids)))
+        people_map = {p['id']: p['name'] for p in people_data}


The logic for collecting person_ids can be made more memory-efficient by building a set directly, rather than creating an intermediate list and then converting it to a set.

More importantly, this function _add_speaker_names_to_payload is nearly identical to _add_speaker_names_to_segments in backend/routers/developer.py. Code duplication increases maintenance overhead and can lead to inconsistencies. I strongly recommend extracting this logic into a single, shared utility function in a follow-up PR to improve maintainability. This approach aligns with our guideline that large-scale refactoring should be handled in a separate, dedicated pull request rather than being bundled with feature changes.

Suggested change

person_ids = [seg.get('person_id') for seg in segments if seg.get('person_id')]

people_map = {}

if person_ids:

people_data = users_db.get_people_by_ids(uid, list(set(person_ids)))

people_map = {p['id']: p['name'] for p in people_data}

person_ids = {seg['person_id'] for seg in segments if seg.get('person_id')}

people_map = {}

if person_ids:

people_data = users_db.get_people_by_ids(uid, list(person_ids))

people_map = {p['id']: p['name'] for p in people_data}

References

Large-scale refactoring, such as converting synchronous I/O calls to asynchronous across multiple files, should be handled in a separate, dedicated pull request rather than being bundled with feature changes to maintain consistency with existing patterns.

Copilot

Pull request overview

This PR adds speaker name mapping to transcript segments in both the developer API and webhook system. The feature resolves speaker names from person_id mappings stored in the database, providing more meaningful speaker identification in transcript segments.

Key Changes:

Added speaker_name field to transcript segments with intelligent fallback logic: "User" for user segments, actual person names when person_id is mapped, or "Speaker {id}" as default
Implemented speaker name resolution using batched database queries to efficiently fetch person names from person_id mappings
Integrated speaker name enrichment into both webhook payloads and developer API responses when transcripts are included

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File	Description
`backend/utils/webhooks.py`	Added `_add_speaker_names_to_payload()` helper function and integrated it into the `conversation_created_webhook()` to enrich webhook payloads with speaker names
`backend/routers/developer.py`	Added `_add_speaker_names_to_segments()` helper function and integrated it into both the list and single conversation GET endpoints when `include_transcript=True`

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2025-12-29T09:37:06Z

backend/routers/developer.py

+def _add_speaker_names_to_segments(uid, conversations: list):
+    """Add speaker_name to transcript segments based on person_id mappings."""
+    all_person_ids = set()
+    for conv in conversations:
+        for seg in conv.get('transcript_segments', []):
+            if seg.get('person_id'):
+                all_person_ids.add(seg['person_id'])
+
+    people_map = {}
+    if all_person_ids:
+        people_data = users_db.get_people_by_ids(uid, list(all_person_ids))
+        people_map = {p['id']: p['name'] for p in people_data}
+
+    for conv in conversations:
+        for seg in conv.get('transcript_segments', []):
+            if seg.get('is_user'):
+                seg['speaker_name'] = 'User'
+            elif seg.get('person_id') and seg['person_id'] in people_map:
+                seg['speaker_name'] = people_map[seg['person_id']]
+            else:
+                seg['speaker_name'] = f"Speaker {seg.get('speaker_id', 0)}"
+
+


This function is nearly identical to _add_speaker_names_to_payload() in backend/utils/webhooks.py (lines 35-53). The logic for mapping speaker names is duplicated in both places.

Consider extracting this common logic into a shared utility function (e.g., in backend/utils/ or backend/database/) that both files can import and use. This would:

Reduce code duplication

Make the logic easier to maintain and test

Ensure consistent behavior across both the webhook and API endpoints

Suggested change

def _add_speaker_names_to_segments(uid, conversations: list):

"""Add speaker_name to transcript segments based on person_id mappings."""

all_person_ids = set()

for conv in conversations:

for seg in conv.get('transcript_segments', []):

if seg.get('person_id'):

all_person_ids.add(seg['person_id'])

people_map = {}

if all_person_ids:

people_data = users_db.get_people_by_ids(uid, list(all_person_ids))

people_map = {p['id']: p['name'] for p in people_data}

for conv in conversations:

for seg in conv.get('transcript_segments', []):

if seg.get('is_user'):

seg['speaker_name'] = 'User'

elif seg.get('person_id') and seg['person_id'] in people_map:

seg['speaker_name'] = people_map[seg['person_id']]

else:

seg['speaker_name'] = f"Speaker {seg.get('speaker_id', 0)}"

def _annotate_speaker_names(uid, transcript_segments: list):

"""

Add speaker_name to transcript segments based on person_id mappings.

This helper operates on a flat list of segments so it can be reused

wherever transcript segments are available.

"""

all_person_ids = {seg['person_id'] for seg in transcript_segments if seg.get('person_id')}

people_map = {}

if all_person_ids:

people_data = users_db.get_people_by_ids(uid, list(all_person_ids))

people_map = {p['id']: p['name'] for p in people_data}

for seg in transcript_segments:

if seg.get('is_user'):

seg['speaker_name'] = 'User'

elif seg.get('person_id') and seg['person_id'] in people_map:

seg['speaker_name'] = people_map[seg['person_id']]

else:

seg['speaker_name'] = f"Speaker {seg.get('speaker_id', 0)}"

def _add_speaker_names_to_segments(uid, conversations: list):

"""Add speaker_name to transcript segments in a list of conversations."""

# Flatten all transcript segments from conversations so we can annotate

# them in a single pass.

all_segments = []

for conv in conversations:

segments = conv.get('transcript_segments', [])

if segments:

all_segments.extend(segments)

if all_segments:

_annotate_speaker_names(uid, all_segments)

Copilot · 2025-12-29T09:37:07Z

backend/utils/webhooks.py

        webhook_url += f'?uid={uid}'
        try:
            payload = memory.as_dict_cleaned_dates()
+            _add_speaker_names_to_payload(uid, payload)


The new speaker name mapping functionality in the webhook lacks test coverage. Since the repository has established integration tests (e.g., backend/tests/integration/test_notifications_integration.py), consider adding tests to verify:

Speaker name resolution from person_id mappings

Correct fallback to "User" for user segments

Correct fallback to "Speaker {id}" format for unknown speakers

Handling of conversations with no transcript segments

Behavior when get_people_by_ids() returns empty results

This would help ensure the feature works correctly and prevent regressions.

Copilot · 2025-12-29T09:37:07Z

backend/routers/developer.py

        for conv in unlocked_conversations:
            conv.pop('transcript_segments', None)
+    else:
+        _add_speaker_names_to_segments(uid, unlocked_conversations)


The new speaker name mapping functionality in the developer API lacks test coverage. Since the repository has established integration tests (e.g., backend/tests/integration/test_notifications_integration.py), consider adding tests to verify:

Speaker name resolution from person_id mappings works correctly in both single and multi-conversation scenarios

Correct fallback to "User" for user segments

Correct fallback to "Speaker {id}" format for unknown speakers

Behavior when include_transcript=True vs include_transcript=False

Handling of conversations with no transcript segments

Behavior when get_people_by_ids() returns empty or partial results

This would help ensure the feature works correctly and prevent regressions.

Copilot · 2025-12-29T09:37:07Z

backend/utils/webhooks.py

+def _add_speaker_names_to_payload(uid, payload: dict):
+    """Add speaker_name to transcript segments in webhook payload."""
+    segments = payload.get('transcript_segments', [])
+    if not segments:
+        return
+
+    person_ids = [seg.get('person_id') for seg in segments if seg.get('person_id')]
+    people_map = {}
+    if person_ids:
+        people_data = users_db.get_people_by_ids(uid, list(set(person_ids)))
+        people_map = {p['id']: p['name'] for p in people_data}
+
+    for seg in segments:
+        if seg.get('is_user'):
+            seg['speaker_name'] = 'User'
+        elif seg.get('person_id') and seg['person_id'] in people_map:
+            seg['speaker_name'] = people_map[seg['person_id']]
+        else:
+            seg['speaker_name'] = f"Speaker {seg.get('speaker_id', 0)}"


This function is nearly identical to _add_speaker_names_to_segments() in backend/routers/developer.py (lines 683-703). The logic for mapping speaker names is duplicated in both places with only minor differences in the conversations iteration structure.

Consider extracting this common logic into a shared utility function (e.g., in backend/utils/ or backend/database/) that both files can import and use. This would:

Reduce code duplication

Make the logic easier to maintain and test

Ensure consistent behavior across both the webhook and API endpoints

feat: Add speaker name mapping to transcript segments in developer an…

2c76630

…d webhook handlers

Copilot AI review requested due to automatic review settings December 29, 2025 09:33

Copilot started reviewing on behalf of smian1 December 29, 2025 09:34 View session

smian1 requested a review from mdmohsin7 December 29, 2025 09:34

gemini-code-assist bot reviewed Dec 29, 2025

View reviewed changes

Copilot AI reviewed Dec 29, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: Add speaker name mapping to transcript segments in developer API and webhook #3962

feat: Add speaker name mapping to transcript segments in developer API and webhook #3962

smian1 commented Dec 29, 2025

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

gemini-code-assist bot Dec 29, 2025

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Dec 29, 2025

Uh oh!

Copilot AI Dec 29, 2025

Uh oh!

Copilot AI Dec 29, 2025

Uh oh!

Copilot AI Dec 29, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

feat: Add speaker name mapping to transcript segments in developer API and webhook #3962

Are you sure you want to change the base?

feat: Add speaker name mapping to transcript segments in developer API and webhook #3962

Conversation

smian1 commented Dec 29, 2025

Summary

Changes

Result

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Dec 29, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant