Skip to content

Conversation

@smian1
Copy link
Collaborator

@smian1 smian1 commented Dec 29, 2025

Summary

  • Adds speaker_name field to transcript segments in developer API responses
  • Adds speaker_name field to conversation_created webhook payload
  • Speaker names are resolved from person_id mappings stored in the database

Changes

  • backend/routers/developer.py: Added _add_speaker_names_to_segments() helper and integrated into GET /v1/dev/user/conversations endpoints
  • backend/utils/webhooks.py: Added _add_speaker_names_to_payload() helper and integrated into conversation_created_webhook()

Result

Each transcript segment now includes a speaker_name field:

  • "User" if the segment is from the user
  • The person's actual name if person_id is assigned
  • "Speaker {id}" as fallback

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings December 29, 2025 09:33
@smian1 smian1 requested a review from mdmohsin7 December 29, 2025 09:34
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds the speaker_name to transcript segments in developer API responses and conversation_created webhooks. The implementation correctly fetches person data to map person_id to a name.

My main feedback is regarding code duplication. The logic for adding speaker names is implemented in two separate places (backend/routers/developer.py and backend/utils/webhooks.py). This is a maintainability concern. I've left a specific comment with a suggestion to refactor one of the implementations for better efficiency and highlighted the need to consolidate this logic into a shared utility function in a follow-up PR to avoid duplication, which aligns with our guidelines for handling large-scale refactoring in separate PRs.

Comment on lines +41 to +45
person_ids = [seg.get('person_id') for seg in segments if seg.get('person_id')]
people_map = {}
if person_ids:
people_data = users_db.get_people_by_ids(uid, list(set(person_ids)))
people_map = {p['id']: p['name'] for p in people_data}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The logic for collecting person_ids can be made more memory-efficient by building a set directly, rather than creating an intermediate list and then converting it to a set.

More importantly, this function _add_speaker_names_to_payload is nearly identical to _add_speaker_names_to_segments in backend/routers/developer.py. Code duplication increases maintenance overhead and can lead to inconsistencies. I strongly recommend extracting this logic into a single, shared utility function in a follow-up PR to improve maintainability. This approach aligns with our guideline that large-scale refactoring should be handled in a separate, dedicated pull request rather than being bundled with feature changes.

Suggested change
person_ids = [seg.get('person_id') for seg in segments if seg.get('person_id')]
people_map = {}
if person_ids:
people_data = users_db.get_people_by_ids(uid, list(set(person_ids)))
people_map = {p['id']: p['name'] for p in people_data}
person_ids = {seg['person_id'] for seg in segments if seg.get('person_id')}
people_map = {}
if person_ids:
people_data = users_db.get_people_by_ids(uid, list(person_ids))
people_map = {p['id']: p['name'] for p in people_data}
References
  1. Large-scale refactoring, such as converting synchronous I/O calls to asynchronous across multiple files, should be handled in a separate, dedicated pull request rather than being bundled with feature changes to maintain consistency with existing patterns.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds speaker name mapping to transcript segments in both the developer API and webhook system. The feature resolves speaker names from person_id mappings stored in the database, providing more meaningful speaker identification in transcript segments.

Key Changes:

  • Added speaker_name field to transcript segments with intelligent fallback logic: "User" for user segments, actual person names when person_id is mapped, or "Speaker {id}" as default
  • Implemented speaker name resolution using batched database queries to efficiently fetch person names from person_id mappings
  • Integrated speaker name enrichment into both webhook payloads and developer API responses when transcripts are included

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
backend/utils/webhooks.py Added _add_speaker_names_to_payload() helper function and integrated it into the conversation_created_webhook() to enrich webhook payloads with speaker names
backend/routers/developer.py Added _add_speaker_names_to_segments() helper function and integrated it into both the list and single conversation GET endpoints when include_transcript=True

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +683 to +705
def _add_speaker_names_to_segments(uid, conversations: list):
"""Add speaker_name to transcript segments based on person_id mappings."""
all_person_ids = set()
for conv in conversations:
for seg in conv.get('transcript_segments', []):
if seg.get('person_id'):
all_person_ids.add(seg['person_id'])

people_map = {}
if all_person_ids:
people_data = users_db.get_people_by_ids(uid, list(all_person_ids))
people_map = {p['id']: p['name'] for p in people_data}

for conv in conversations:
for seg in conv.get('transcript_segments', []):
if seg.get('is_user'):
seg['speaker_name'] = 'User'
elif seg.get('person_id') and seg['person_id'] in people_map:
seg['speaker_name'] = people_map[seg['person_id']]
else:
seg['speaker_name'] = f"Speaker {seg.get('speaker_id', 0)}"


Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is nearly identical to _add_speaker_names_to_payload() in backend/utils/webhooks.py (lines 35-53). The logic for mapping speaker names is duplicated in both places.

Consider extracting this common logic into a shared utility function (e.g., in backend/utils/ or backend/database/) that both files can import and use. This would:

  • Reduce code duplication
  • Make the logic easier to maintain and test
  • Ensure consistent behavior across both the webhook and API endpoints
Suggested change
def _add_speaker_names_to_segments(uid, conversations: list):
"""Add speaker_name to transcript segments based on person_id mappings."""
all_person_ids = set()
for conv in conversations:
for seg in conv.get('transcript_segments', []):
if seg.get('person_id'):
all_person_ids.add(seg['person_id'])
people_map = {}
if all_person_ids:
people_data = users_db.get_people_by_ids(uid, list(all_person_ids))
people_map = {p['id']: p['name'] for p in people_data}
for conv in conversations:
for seg in conv.get('transcript_segments', []):
if seg.get('is_user'):
seg['speaker_name'] = 'User'
elif seg.get('person_id') and seg['person_id'] in people_map:
seg['speaker_name'] = people_map[seg['person_id']]
else:
seg['speaker_name'] = f"Speaker {seg.get('speaker_id', 0)}"
def _annotate_speaker_names(uid, transcript_segments: list):
"""
Add speaker_name to transcript segments based on person_id mappings.
This helper operates on a flat list of segments so it can be reused
wherever transcript segments are available.
"""
all_person_ids = {seg['person_id'] for seg in transcript_segments if seg.get('person_id')}
people_map = {}
if all_person_ids:
people_data = users_db.get_people_by_ids(uid, list(all_person_ids))
people_map = {p['id']: p['name'] for p in people_data}
for seg in transcript_segments:
if seg.get('is_user'):
seg['speaker_name'] = 'User'
elif seg.get('person_id') and seg['person_id'] in people_map:
seg['speaker_name'] = people_map[seg['person_id']]
else:
seg['speaker_name'] = f"Speaker {seg.get('speaker_id', 0)}"
def _add_speaker_names_to_segments(uid, conversations: list):
"""Add speaker_name to transcript segments in a list of conversations."""
# Flatten all transcript segments from conversations so we can annotate
# them in a single pass.
all_segments = []
for conv in conversations:
segments = conv.get('transcript_segments', [])
if segments:
all_segments.extend(segments)
if all_segments:
_annotate_speaker_names(uid, all_segments)

Copilot uses AI. Check for mistakes.
webhook_url += f'?uid={uid}'
try:
payload = memory.as_dict_cleaned_dates()
_add_speaker_names_to_payload(uid, payload)
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new speaker name mapping functionality in the webhook lacks test coverage. Since the repository has established integration tests (e.g., backend/tests/integration/test_notifications_integration.py), consider adding tests to verify:

  • Speaker name resolution from person_id mappings
  • Correct fallback to "User" for user segments
  • Correct fallback to "Speaker {id}" format for unknown speakers
  • Handling of conversations with no transcript segments
  • Behavior when get_people_by_ids() returns empty results

This would help ensure the feature works correctly and prevent regressions.

Copilot uses AI. Check for mistakes.
for conv in unlocked_conversations:
conv.pop('transcript_segments', None)
else:
_add_speaker_names_to_segments(uid, unlocked_conversations)
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new speaker name mapping functionality in the developer API lacks test coverage. Since the repository has established integration tests (e.g., backend/tests/integration/test_notifications_integration.py), consider adding tests to verify:

  • Speaker name resolution from person_id mappings works correctly in both single and multi-conversation scenarios
  • Correct fallback to "User" for user segments
  • Correct fallback to "Speaker {id}" format for unknown speakers
  • Behavior when include_transcript=True vs include_transcript=False
  • Handling of conversations with no transcript segments
  • Behavior when get_people_by_ids() returns empty or partial results

This would help ensure the feature works correctly and prevent regressions.

Copilot uses AI. Check for mistakes.
Comment on lines +35 to +53
def _add_speaker_names_to_payload(uid, payload: dict):
"""Add speaker_name to transcript segments in webhook payload."""
segments = payload.get('transcript_segments', [])
if not segments:
return

person_ids = [seg.get('person_id') for seg in segments if seg.get('person_id')]
people_map = {}
if person_ids:
people_data = users_db.get_people_by_ids(uid, list(set(person_ids)))
people_map = {p['id']: p['name'] for p in people_data}

for seg in segments:
if seg.get('is_user'):
seg['speaker_name'] = 'User'
elif seg.get('person_id') and seg['person_id'] in people_map:
seg['speaker_name'] = people_map[seg['person_id']]
else:
seg['speaker_name'] = f"Speaker {seg.get('speaker_id', 0)}"
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is nearly identical to _add_speaker_names_to_segments() in backend/routers/developer.py (lines 683-703). The logic for mapping speaker names is duplicated in both places with only minor differences in the conversations iteration structure.

Consider extracting this common logic into a shared utility function (e.g., in backend/utils/ or backend/database/) that both files can import and use. This would:

  • Reduce code duplication
  • Make the logic easier to maintain and test
  • Ensure consistent behavior across both the webhook and API endpoints

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant