Audio Session Defaults #1019

Open: pblazej wants to merge 11 commits into main from blaze/audio-session-defaults

pblazej commented May 27, 2026

Reworks the iOS audio session defaults and category selection. No public API additions; observable behavior changes are listed below with rationale.

  • .playAndRecord options trimmed: .mixWithOthers and .allowAirPlay dropped (removes the -66637 interruption-recovery race and an option that's redundant per Apple QA1803). .mixWithOthers kept on the listener .playback config.
  • Category selection rewritten as selectConfiguration(state:) with a sticky upgrade. It picks .playAndRecord when any of these holds: recording is active, recording has happened in this session (sticky hasEverRecorded), the participant has mic-publish permission (canPublishMicrophone), or the app set isRecordingAlwaysPreparedMode. Otherwise it picks .playback. Mute toggles no longer cause category churn.
  • .ambient reset on the empty edge (post-Room.disconnect()) so the iOS volume rocker returns to the media register — fixes the long-standing "in-call volume sticks" symptom.
  • Permission plumbing + docs: LocalParticipant.set(permissions:) forwards the mic-publish predicate to the audio session via a new shared canPublish(source:) helper extracted from the existing checkPermissions(toPublish:); Docs/audio.md gains an "Audio session category selection" section.

pblazej and others added 11 commits May 26, 2026 10:48
When both playout and recording stop, reset the category to .ambient
before deactivating. Without this, the iOS volume rocker stays on the
ringer/call register because the last-active category was .playAndRecord,
causing in-call volume to appear to "stick" when the user adjusts the
rocker after the call ends.

.ambient mixes with other audio so any resumed media (e.g. Music) isn't
re-interrupted if the session momentarily reactivates afterwards.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
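
The ordering described above (reset the category first, deactivate second, and only when nothing is playing or recording) can be sketched with stand-in types. Everything here is hypothetical: `Category`, `SessionControlling`, `RecordingSession` and `deactivateIfIdle` are illustration names, not SDK API. On device the two calls would presumably map to `AVAudioSession.setCategory(_:mode:options:)` and `AVAudioSession.setActive(_:options:)`.

```swift
// Stand-in types for illustration only; none of these names come from the SDK.
enum Category { case ambient, playback, playAndRecord }

protocol SessionControlling {
    mutating func setCategory(_ category: Category)
    mutating func setActive(_ active: Bool)
}

/// Test double that records the order of calls it receives.
struct RecordingSession: SessionControlling {
    var calls: [String] = []
    mutating func setCategory(_ category: Category) { calls.append("setCategory(\(category))") }
    mutating func setActive(_ active: Bool) { calls.append("setActive(\(active))") }
}

/// Hypothetical hook, assumed to run after every playout/recording change.
func deactivateIfIdle<S: SessionControlling>(isPlayoutEnabled: Bool,
                                             isRecordingEnabled: Bool,
                                             session: inout S) {
    // Only the empty edge (nothing playing, nothing recording) triggers the reset.
    guard !isPlayoutEnabled, !isRecordingEnabled else { return }
    session.setCategory(.ambient) // leave the session on a mixable, media-register category
    session.setActive(false)      // then deactivate
}
```

Calling `deactivateIfIdle` while playout is still enabled leaves the session untouched, which is why a mute alone never reaches the `.ambient` reset.
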
Two independent motivations:

  - .mixWithOthers is a known cause of echo in real-time communication
    scenarios where other apps share the audio device — once the option
    is set, other apps' audio mixes into the capture path and degrades
    the acoustic echo cancellation reference.

  - Our WebRTC ADM has a retry loop in the engine restart path
    working around error -66637 on interruption-end recovery when
    this option is active. Removing the option from .playAndRecord
    eliminates the workaround's triggering condition. Related
    customer symptom: #1011.

.allowAirPlay is redundant under both .voiceChat and .videoChat per
Apple QA1803 — .voiceChat disallows AirPlay outright, .videoChat
auto-allows the mirrored variant.

.mixWithOthers is intentionally kept on the .playback config used for
listener-only sessions, where mixing with media playback is correct.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
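
A rough model of the option change, using a stand-in `OptionSet` rather than `AVAudioSession.CategoryOptions` so it runs anywhere. The `.other` member is a placeholder for whatever additional options the SDK sets; the PR page does not list them, so it is an assumption, as are the raw values and the `before` set.

```swift
// Stand-in for AVAudioSession.CategoryOptions; raw values are arbitrary here.
struct CategoryOptions: OptionSet {
    let rawValue: Int
    static let mixWithOthers = CategoryOptions(rawValue: 1 << 0)
    static let allowAirPlay  = CategoryOptions(rawValue: 1 << 1)
    static let other         = CategoryOptions(rawValue: 1 << 2) // placeholder, not a real option
}

// Assumed previous publisher options: the two dropped ones plus the placeholder.
let before: CategoryOptions = [.mixWithOthers, .allowAirPlay, .other]

// Publisher (.playAndRecord) config after this commit: both options removed.
let playAndRecordOptions = before.subtracting([.mixWithOthers, .allowAirPlay])

// Listener (.playback) config: .mixWithOthers is intentionally kept.
let playbackOptions: CategoryOptions = [.mixWithOthers]
```
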
The audio session category was re-evaluated on every SessionRequirement
change. Muting via setMicrophone(enabled: false) would downgrade from
.playAndRecord back to .playback, then unmuting would upgrade again —
causing N category switches per session, each one flipping the iOS
volume rocker between the media and ringer registers.

Selection now flows through `selectConfiguration(state:)` and lands on
.playAndRecord when any of the following holds, staying there until
both playout and recording stop:

  - recording is enabled (first setMicrophone(enabled: true) or external
    acquire(requirement: .recording))
  - recording was previously enabled in the same session (sticky bit
    `hasEverRecorded`, cleared only on the empty edge)
  - the participant wants to publish: the app declared publishing intent
    via isRecordingAlwaysPreparedMode AND the local participant has
    permission to publish a microphone (`canPublishMicrophone`, driven
    by ParticipantPermissions)

Pure audience participants (no permission, never recorded, no current
recording requirement) keep .playback for the entire session. The mic
publish permission defaults to true here (optimistic) and is corrected
once permissions arrive from the server.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
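
A minimal sketch of the sticky behaviour, assuming a value-type state and a free function in place of the SDK's `selectConfiguration(state:)`. `SessionState`, `selectCategory(for:)` and the `wantsToPublish` field are illustrative; how the publish-intent signal is derived is left out on purpose.

```swift
enum Category { case playback, playAndRecord }

// Hypothetical model of the inputs; field names follow the commit message where possible.
struct SessionState {
    var isPlayoutEnabled = false
    var isRecordingEnabled = false
    var hasEverRecorded = false // sticky bit, cleared only on the empty edge
    var wantsToPublish = false  // publish-intent signal, derived elsewhere

    /// Empty edge: neither playout nor recording is active.
    var isEmpty: Bool { !isPlayoutEnabled && !isRecordingEnabled }

    mutating func setRecording(_ enabled: Bool) {
        isRecordingEnabled = enabled
        if enabled { hasEverRecorded = true }
        if isEmpty { hasEverRecorded = false }
    }

    mutating func setPlayout(_ enabled: Bool) {
        isPlayoutEnabled = enabled
        if isEmpty { hasEverRecorded = false }
    }
}

/// Picks .playAndRecord if recording is on, was ever on, or publishing is intended.
func selectCategory(for state: SessionState) -> Category {
    let needsRecord = state.isRecordingEnabled || state.hasEverRecorded || state.wantsToPublish
    return needsRecord ? .playAndRecord : .playback
}
```

With this shape, a mute (`setRecording(false)`) while playout continues leaves `hasEverRecorded` set, so the category does not drop back to `.playback` until both playout and recording have stopped.
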
When the local participant's permissions change (server-issued at
JoinResponse time, or via a mid-session permission update), forward
the derived "can publish microphone" predicate to the audio session
so it can pick `.playback` for audience-only participants without
waiting for a publish attempt to fail.

Extracts a `canPublish(source:)` helper from the existing
`checkPermissions(toPublish:)` so the two callers share one predicate
definition. The helper combines the top-level `canPublish` grant with
the per-source restriction in `canPublishSources` (empty = no
restriction).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
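
The predicate described above could look roughly like this. `TrackSource`, `Permissions` and the free function `canPublish(source:permissions:)` are stand-ins written for this sketch; the real helper is `canPublish(source:)` and its exact shape is not shown on this page.

```swift
// Hypothetical source enum; the SDK's real cases may differ.
enum TrackSource { case camera, microphone, screenShareVideo, screenShareAudio }

// Stand-in for the permission fields named in the commit message.
struct Permissions {
    var canPublish: Bool
    var canPublishSources: Set<TrackSource> = [] // empty = no per-source restriction
}

/// True when the top-level grant is present and the source is not excluded
/// by a non-empty allow-list.
func canPublish(source: TrackSource, permissions: Permissions) -> Bool {
    guard permissions.canPublish else { return false }
    return permissions.canPublishSources.isEmpty
        || permissions.canPublishSources.contains(source)
}
```

Under these assumptions the audio session would receive `canPublish(source: .microphone, permissions:)` each time the permissions change.
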
Comment out the `&& AudioManager.shared.isRecordingAlwaysPreparedMode`
conjunct in selectConfiguration so the pre-emptive .playAndRecord
upgrade is purely server (permission) driven. Any participant with
mic-publish permission will get .playAndRecord up-front instead of
waiting on the app-side intent signal.

Trade-off: closer to the "always playAndRecord for permitted users"
pattern. Participants who could publish but never do still get the
mic permission prompt and ringer volume register on connect — keeping
the intent signal commented makes restoration trivial once we
validate the desired behaviour.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Reflects the behavior shipped in this branch:
  - .ambient reset on the empty edge before deactivating
  - Listener vs publisher category selection
  - Sticky .playAndRecord once recording engages
  - Audience-only participants (canPublish=false) stay on .playback

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Aids diagnosing why a particular AVAudioSession category was picked
without attaching a debugger. Logs the three input signals (current
recording state, sticky bit, mic-publish permission) plus the speaker
preference and the resulting category.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

wantsToPublish now fires when either signal indicates publishing
intent: server-issued mic permission OR the app-declared
isRecordingAlwaysPreparedMode flag. Either alone is sufficient to
pre-empt to .playAndRecord.

In practice the alwaysPrepared path already triggers engine recording
via the ADM (which sets isRecordingEnabled, then the sticky bit), so
the additional disjunct rarely fires standalone — but it makes the
publisher-hint role of the flag explicit in the predicate rather than
buried in a comment.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
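
For comparison, the publish-intent predicate took three shapes across these commits: permission AND app flag, permission only, and finally permission OR app flag. The enum below is purely illustrative (`PublishIntentRule` does not exist in the SDK); it only makes the difference between the variants checkable.

```swift
/// The three shapes of the predicate, in commit order. Illustrative only.
enum PublishIntentRule {
    case permissionAndAppFlag // first version: both signals required
    case permissionOnly       // interim version: app flag commented out
    case permissionOrAppFlag  // current version: either signal suffices

    func wantsToPublish(canPublishMicrophone: Bool,
                        isRecordingAlwaysPreparedMode: Bool) -> Bool {
        switch self {
        case .permissionAndAppFlag:
            return canPublishMicrophone && isRecordingAlwaysPreparedMode
        case .permissionOnly:
            return canPublishMicrophone
        case .permissionOrAppFlag:
            return canPublishMicrophone || isRecordingAlwaysPreparedMode
        }
    }
}
```

The variants disagree only when exactly one signal is set: with permission but no app flag, the first version stays on `.playback` while the other two upgrade; with the app flag but no permission, only the current version upgrades.
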
Updates the bullets to match the predicate after the
isRecordingAlwaysPreparedMode hint was OR'd with canPublishMicrophone.
Audience-only is now defined by absence of both signals, and the
Publisher bullet enumerates all three entry points.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

/// server-issued mic permission (`canPublishMicrophone`) or the
/// app-declared `isRecordingAlwaysPreparedMode`.
private func selectConfiguration(state: State) -> AudioSessionConfiguration {
    let wantsToPublish = state.canPublishMicrophone || AudioManager.shared.isRecordingAlwaysPreparedMode

This is a heuristic, of course; open to suggestions.

pblazej commented May 27, 2026

📱 TF build 2.15.0.b20260527
