fix(core): fix speech start time overridden by VAD SOS events #5670

Open
chenghao-mou wants to merge 4 commits into main from chenghao/fix/eot-speaking-time

chenghao-mou commented May 7, 2026

Previously, the speech start time was overridden by VAD start-of-speech (SOS) events, so turn start timestamps in Insights were aligned to the start of the last speaking span instead of the start of the turn.

This closes AGT-2840.

Before:
[Screenshot: CleanShot 2026-05-07 at 11 46 42@2x]

After:
[Screenshot: CleanShot 2026-05-07 at 11 48 14@2x]

@chenghao-mou chenghao-mou requested a review from a team May 7, 2026 10:54

devin-ai-integration bot left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 2 additional findings.

devin-ai-integration bot left a comment

Devin Review found 1 new potential issue.

View 4 additional findings in Devin Review.

with trace.use_span(self._ensure_user_turn_span()):
    self._hooks.on_end_of_speech(ev)

self._vad_speech_started = False

🟡 `clear_user_turn` doesn't reset `_turn_speech_started`, causing stale `_speech_start_time` in subsequent turns

Because this PR makes `_turn_speech_started` turn-scoped (it is no longer reset on VAD END_OF_SPEECH at the old line 998), `clear_user_turn()` at audio_recognition.py:650-660 must now explicitly reset `_turn_speech_started` (and `_speech_start_time`).

Previously, even though `clear_user_turn` didn't reset `_vad_speech_started`, the next VAD END_OF_SPEECH event would reset it, allowing the subsequent START_OF_SPEECH to set `_speech_start_time` correctly.

Now, if a user cancels a turn mid-speech (e.g., push-to-talk `cancel_turn`), `_turn_speech_started` remains `True` indefinitely, because VAD EOS no longer resets it. The next turn's first VAD START_OF_SPEECH therefore skips updating `_speech_start_time`, leaving it stale from the previous turn. This produces incorrect `started_speaking_at`, `stopped_speaking_at`, and `end_of_turn_delay` metrics in the `_EndOfTurnInfo` passed to `on_end_of_turn`.

Prompt for agents
The removal of `self._vad_speech_started = False` from the VAD END_OF_SPEECH handler (old line 998) makes the flag turn-scoped. This is correct for the intended fix (preventing subsequent speech bursts within a turn from overwriting _speech_start_time). However, `clear_user_turn()` at line 650 now needs to also reset `_turn_speech_started` and `_speech_start_time` to ensure a clean slate for the next turn. Without this, cancelling a turn (e.g., push-to-talk cancel_turn) while the user is speaking leaves `_turn_speech_started = True`, so the next turn's VAD START_OF_SPEECH won't update `_speech_start_time`. Add `self._turn_speech_started = False` and `self._speech_start_time = None` to the `clear_user_turn` method body.
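The behavior described above can be illustrated with a minimal, self-contained sketch. It is not the code in audio_recognition.py: the class `TurnTimingSketch`, its handler names, and the `reset_timing` switch are hypothetical and exist only to contrast the two behaviors. Only the attribute names `_turn_speech_started` and `_speech_start_time` and the method name `clear_user_turn` come from the review comment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TurnTimingSketch:
    """Hypothetical stand-in for the turn-timing state discussed in this review."""

    _turn_speech_started: bool = False
    _speech_start_time: Optional[float] = None

    def on_vad_start_of_speech(self, now: float) -> None:
        # Only the first speech burst of a turn records the start time;
        # later bursts in the same turn must not overwrite it.
        if not self._turn_speech_started:
            self._turn_speech_started = True
            self._speech_start_time = now

    def on_vad_end_of_speech(self) -> None:
        # The flag is turn-scoped, so it is deliberately NOT reset here.
        pass

    def clear_user_turn(self, reset_timing: bool = True) -> None:
        # reset_timing=False mimics the gap described in the review comment.
        if reset_timing:
            self._turn_speech_started = False
            self._speech_start_time = None


# Two speech bursts in one turn: the first start time is kept.
state = TurnTimingSketch()
state.on_vad_start_of_speech(1.0)
state.on_vad_end_of_speech()
state.on_vad_start_of_speech(2.5)
first_start_kept = state._speech_start_time

# Turn cleared with the reset: the next turn records a fresh start time.
state.clear_user_turn()
state.on_vad_start_of_speech(7.0)
fresh_start = state._speech_start_time

# Turn cleared without the reset: the next turn keeps the stale value.
stale = TurnTimingSketch()
stale.on_vad_start_of_speech(1.0)
stale.clear_user_turn(reset_timing=False)
stale.on_vad_start_of_speech(7.0)
stale_start = stale._speech_start_time
```

In this sketch, resetting both fields in `clear_user_turn` is what lets the first START_OF_SPEECH of the next turn set the start time again; whether the real method needs further cleanup is outside what this review states.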