
feat: animated Clawd avatar for camera bubble with face tracking#1672

Open
namearth5005 wants to merge 12 commits into CapSoftware:main from namearth5005:feat/clawd-avatar

Conversation


@namearth5005 namearth5005 commented Mar 20, 2026

Summary

Adds an animated Clawd (Claude Code mascot) avatar that replaces the webcam feed in Cap's camera bubble, driven by real-time face tracking via Apple Vision framework.

What it does

When "Avatar mode (Clawd)" is enabled in Settings → Experimental:

  • Your webcam feed is replaced with a procedurally-rendered animated Clawd character
  • Face tracking via Apple Vision detects your head movement, mouth, and eye blinks
  • Spring physics smooth all motion for organic, Disney-quality animation feel
  • The avatar appears in both the editor preview AND exported videos (same pipeline)

How it works

Real webcam → Apple Vision (76 landmarks) → FacePose → Spring smoother → AvatarRenderer
→ 512x512 RGBA texture → CameraLayer::prepare() (existing) → Camera bubble (existing shader)

The avatar is drawn entirely procedurally in a WGSL shader using SDF (signed distance field) rectangles — no external assets, textures, or model files. The character is resolution-independent and anti-aliased.
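The building block for such a shader is the classic 2D rounded-rectangle SDF plus a one-pixel smoothing band for anti-aliasing. A CPU-side Rust sketch of that math (the actual WGSL in the PR is not shown here, so this is an illustration of the technique, not the shipped code):

```rust
/// 2D signed distance to a rounded rectangle centered at the origin.
/// Negative inside, positive outside; `half_size` is the half extent,
/// `radius` the corner radius.
pub fn sd_rounded_box(p: [f32; 2], half_size: [f32; 2], radius: f32) -> f32 {
    // Fold into the first quadrant and shrink by the corner radius.
    let qx = p[0].abs() - half_size[0] + radius;
    let qy = p[1].abs() - half_size[1] + radius;
    // Distance outside the shrunken box plus (negative) distance inside it.
    let outside = (qx.max(0.0).powi(2) + qy.max(0.0).powi(2)).sqrt();
    let inside = qx.max(qy).min(0.0);
    outside + inside - radius
}

/// Anti-aliased coverage from a signed distance, analogous to the
/// one-pixel smoothstep a shader would apply: 1.0 well inside, 0.0 well
/// outside, 0.5 exactly on the edge. `px` is the pixel footprint.
pub fn aa_coverage(dist: f32, px: f32) -> f32 {
    (0.5 - dist / px).clamp(0.0, 1.0)
}
```

Because the distance is computed per-pixel from analytic geometry, the character stays sharp at any resolution, which is what makes the approach resolution-independent.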

Animation features

  • Head tracking — tilts and rotates following your real head
  • Mouth sync — opens when you speak (driven by lip landmark distances)
  • Eye blinks — mirrors your real blinks + natural auto-blinks every 3-6 seconds
  • Idle breathing — subtle scale pulse when still
  • Talk bounce — body bounces slightly when mouth is open (speech energy)
  • Eye sparkle — white highlights shift with head angle
  • Drop shadow — soft shadow beneath the character
  • Idle mode — graceful fallback when no face detected (gentle sway, auto-blinks)

Spring physics (two profiles)

  • Head (tension: 300, mass: 1.5, friction: 25) — moderate response with overshoot for organic head momentum
  • Mouth/Eyes (tension: 500, mass: 0.8, friction: 20) — snappy response for responsive lip sync
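A spring profile like this is typically integrated with a semi-implicit Euler step each frame. A minimal sketch under those parameters (function and field names are illustrative, not the PR's actual API):

```rust
/// One semi-implicit Euler step of a spring-mass-damper:
/// F = tension * (target - pos) - friction * vel, a = F / mass.
/// Velocity is integrated before position for better stability.
pub fn spring_step(
    pos: f32, vel: f32, target: f32,
    tension: f32, friction: f32, mass: f32,
    dt: f32,
) -> (f32, f32) {
    let accel = (tension * (target - pos) - friction * vel) / mass;
    let vel = vel + accel * dt;
    let pos = pos + vel * dt;
    (pos, vel)
}

/// Run the "head" profile (tension 300, mass 1.5, friction 25) from rest
/// toward a target for `steps` frames of length `dt` seconds.
pub fn settle(target: f32, steps: u32, dt: f32) -> f32 {
    let (mut pos, mut vel) = (0.0f32, 0.0f32);
    for _ in 0..steps {
        let (p, v) = spring_step(pos, vel, target, 300.0, 25.0, 1.5, dt);
        pos = p;
        vel = v;
    }
    pos
}
```

With these numbers the head profile is underdamped (damping ratio ≈ 0.59), which produces the slight overshoot described above; the mouth/eye profile's lower mass and higher tension make it converge faster with less overshoot.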

New files

| File | Purpose |
| --- | --- |
| crates/face-tracking/ | New crate — Apple Vision face landmark extraction |
| crates/rendering/src/avatar.rs | AvatarRenderer — offscreen wgpu rendering of Clawd |
| crates/rendering/src/avatar_smoothing.rs | Spring-smoothed face pose |
| crates/rendering/src/shaders/avatar-clawd.wgsl | Procedural Clawd character shader |
| crates/project/src/configuration.rs | AvatarBackground enum |

Zero impact on existing features

  • Camera capture pipeline unchanged (still records real webcam)
  • Camera bubble positioning/rounding/shadow unchanged (existing shader handles it)
  • Timeline, zoom, export pipeline unchanged
  • Avatar swaps at the DecodedFrame level — CameraLayer doesn't know the difference

Platform support

  • macOS: Full face tracking via Apple Vision (macOS 12+, any webcam)
  • Windows/Linux: Avatar renders but with idle animation only (no face tracking yet — stub returns defaults)

Test plan

  • Enable Settings → Experimental → "Avatar mode (Clawd)"
  • Open editor — camera bubble shows animated Clawd
  • Move head — Clawd tilts/rotates smoothly
  • Speak — Clawd's mouth opens/closes in sync
  • Blink — Clawd blinks (plus random auto-blinks)
  • Stay still — idle breathing animation
  • Export video — Clawd appears in exported file
  • Disable avatar mode — real camera feed returns
  • Existing recordings unaffected

Greptile Summary

This PR adds a complete animated "Clawd" avatar system to Cap's camera bubble, driven by real-time Apple Vision face tracking. The feature is well-architected — a new cap-face-tracking crate cleanly isolates the platform-specific Vision code behind a cross-platform stub, spring physics smooth all face pose data, and a WGSL SDF shader renders the character procedurally with no external assets. The swap is done at the DecodedFrame level so the existing camera bubble rendering path requires no changes.

However, there are several correctness issues that need attention before this ships:

  • Uniform buffer size mismatch: The Rust AvatarUniforms struct is 56 bytes but WGSL's struct alignment rules (SizeOf = RoundUp(AlignOf, last_offset+last_size)) require 64 bytes. The _padding field must be [f32; 4] (not [f32; 2]), and _padding: vec2<f32> in the shader must change to vec4<f32>. With wgpu validation active this will produce a binding size error.
  • Per-frame blocking GPU readback: device.poll(wgpu::PollType::Wait) stalls the CPU on every rendered frame, eliminating all CPU/GPU parallelism in the pipeline.
  • AvatarBackground setting is dead code: Dark | Light | Gradient are defined and exposed in the UI, but bg_color in AvatarUniforms is always hardcoded to the dark value — the setting has no effect.
  • GPU resources not freed on disable: Toggling avatar mode off does not drop AvatarRenderer, FaceTracker, or FacePoseSmoother, leaving GPU buffers and textures allocated indefinitely.
  • Copy-pasted avatar block: The ~45-line face-tracking + render + camera-prepare block is duplicated identically in both prepare and prepare_with_encoder; a shared helper method is needed.
  • Hardcoded 33 ms delta time: Both smoother.update(&raw_pose, 33.0) and avatar.render(..., 1.0/30.0) assume 30 fps; actual elapsed time should be derived from the recording timeline.

Confidence Score: 2/5

  • Not safe to merge — the uniform buffer size mismatch will cause wgpu validation errors at draw time, the AvatarBackground setting is dead code, and per-frame blocking GPU readback poses a real performance regression.
  • Multiple P1 bugs: (1) the WGSL struct requires 64-byte alignment but the Rust struct is 56 bytes, causing validation failures with wgpu's validation layer active; (2) AvatarBackground is exposed in the UI but never applied; (3) blocking GPU readback on every frame; (4) GPU resources leak when avatar mode is toggled off. The overall architecture is solid and the feature works in the happy path, but these issues need to be addressed before shipping.
  • crates/rendering/src/avatar.rs and crates/rendering/src/lib.rs need the most attention — uniform buffer padding, blocking readback, dead AvatarBackground field, resource cleanup, and code duplication all live there.

Important Files Changed

| Filename | Overview |
| --- | --- |
| crates/rendering/src/avatar.rs | New AvatarRenderer using wgpu; has a uniform buffer size mismatch (56 vs 64 bytes required by WGSL), a per-frame blocking GPU readback, and the AvatarBackground setting is hardcoded and never applied. |
| crates/rendering/src/lib.rs | Avatar rendering integration into RendererLayers; avatar code block is copy-pasted between prepare and prepare_with_encoder, GPU resources aren't freed on disable, and delta time is hardcoded to 33 ms in both paths. |
| crates/face-tracking/src/macos.rs | Apple Vision face landmark extraction; correctly handles pixel buffer lock/unlock, but uses an unsafe transmute/cast to read results from the landmarks request through a DetectFaceRectanglesRequest pointer, which is technically UB in Rust. |
| crates/rendering/src/shaders/avatar-clawd.wgsl | Procedural SDF-based Clawd character shader; the _padding field should be vec4 instead of vec2 to match WGSL struct size rules (64 bytes), but the SDF logic and animation uniforms are otherwise sound. |
| crates/rendering/src/avatar_smoothing.rs | Spring-mass-damper smoother for face pose; correctly applies separate head and expression spring configs, though it uses 2D springs for 1D values (Y component always zero) and relies on a hardcoded dt_ms from the call site. |
| crates/project/src/configuration.rs | Adds AvatarBackground enum (Dark/Light/Gradient) with serde/specta derives; the enum is correctly defined but never consumed by the rendering path — currently dead code. |
| apps/desktop/src-tauri/src/general_settings.rs | Adds avatar_mode (bool) and avatar_background (AvatarBackground) fields to GeneralSettingsStore with correct serde defaults; no issues. |
| apps/desktop/src/routes/(window-chrome)/settings/experimental.tsx | Adds "Avatar mode (Clawd)" toggle to the Experimental settings UI under a new "Camera Features" section; straightforward and consistent with existing toggle patterns. |

Sequence Diagram

sequenceDiagram
    participant Webcam as Real Webcam
    participant CamFrame as DecodedFrame (camera)
    participant FaceTracker as FaceTracker (Apple Vision)
    participant Smoother as FacePoseSmoother (spring physics)
    participant AvatarRend as AvatarRenderer (wgpu)
    participant GPU as GPU (WGSL shader)
    participant CamLayer as CameraLayer::prepare()
    participant Output as Video Output

    Webcam->>CamFrame: RGBA pixels (still captured)
    CamFrame->>FaceTracker: track(rgba_data, w, h)
    Note over FaceTracker: RGBA→BGRA copy<br/>CVPixelBuffer<br/>VNDetectFaceRectanglesRequest<br/>VNDetectFaceLandmarksRequest
    FaceTracker-->>Smoother: raw FacePose (pitch/yaw/roll/mouth/eyes)
    Smoother-->>AvatarRend: smoothed FacePose + dt=33ms (hardcoded)
    AvatarRend->>GPU: write AvatarUniforms (56 bytes ⚠️ should be 64)
    GPU-->>AvatarRend: render pass → 512×512 RGBA texture
    Note over AvatarRend: device.poll(Wait) ⚠️ blocking
    AvatarRend-->>CamLayer: output_rgba() → new DecodedFrame(512×512)
    CamLayer->>Output: composited camera bubble (existing shader)
    Note over Output: Same pipeline for editor preview AND export

Comments Outside Diff (5)

  1. crates/rendering/src/avatar.rs, line 450-451 (link)

    P1 Uniform buffer size mismatch (56 vs 64 bytes)

    The Rust AvatarUniforms struct is 56 bytes, but the corresponding WGSL struct requires 64 bytes due to WGSL's struct alignment rules (SizeOf(S) = RoundUp(AlignOf(S), last_offset + last_size) = RoundUp(16, 48+8) = 64).

    When wgpu validates the binding at draw time, it checks that the bound buffer is at least as large as the shader's declared struct size (64 bytes). With only 56 bytes in the buffer, this triggers a validation error:

    Buffer binding size 56 is smaller than minimum binding size 64

    _padding should be extended from [f32; 2] to [f32; 4] to make the Rust struct 64 bytes, and the WGSL _padding: vec2<f32> must likewise become vec4<f32> to keep both sides in sync.
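This kind of mismatch can be caught at compile time with a size assertion on the repr(C) struct. The field layout below is assumed (only bg_color and _padding are named in the review; the eight leading scalars stand in for the animation parameters):

```rust
/// Hypothetical 64-byte layout matching the proposed fix: eight scalar
/// f32 fields, a vec4 background color, and padding widened to [f32; 4].
#[repr(C)]
pub struct AvatarUniformsFixed {
    pub scalars: [f32; 8],  // e.g. time, pitch, yaw, roll, mouth/eye values (names assumed)
    pub bg_color: [f32; 4],
    pub _padding: [f32; 4], // with [f32; 2] the struct would be only 56 bytes
}

// Fails to compile if the Rust side drifts from the WGSL side's 64 bytes.
const _: () = assert!(std::mem::size_of::<AvatarUniformsFixed>() == 64);

pub const fn uniforms_size() -> usize {
    std::mem::size_of::<AvatarUniformsFixed>()
}
```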

  2. crates/rendering/src/avatar.rs, line 748-759 (link)

    P1 Blocking GPU readback stalls per-frame rendering

    device.poll(wgpu::PollType::Wait) blocks the calling thread until ALL previously submitted GPU work completes. This is called on every single frame render and turns an inherently asynchronous GPU pipeline into a fully synchronous one.

    For a 512×512 RGBA texture this may be acceptable in testing, but in the recording/rendering pipeline (which renders every video frame), this stall occurs on every frame: it submits GPU work, waits for completion, then maps the buffer — eliminating any CPU/GPU parallelism.

    Consider using double-buffering (submit to buffer N, read from buffer N-1 which finished last frame) or keeping the avatar data GPU-side and compositing it directly without a CPU readback. If CPU readback is truly required, at minimum pipeline the map callback asynchronously rather than block with PollType::Wait.
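The double-buffering suggestion boils down to simple index bookkeeping: submit the copy into one staging buffer while mapping the one submitted last frame. A pure-Rust sketch of that rotation (the actual wgpu buffer creation and map_async calls are omitted; names are illustrative):

```rust
/// Ping-pong index bookkeeping for double-buffered GPU readback.
/// Each frame the renderer copies into `write_index()` and maps
/// `read_index()` — the buffer whose GPU work was submitted last frame —
/// so the CPU never blocks on work issued this frame.
pub struct ReadbackRing {
    frame: u64,
}

impl ReadbackRing {
    pub fn new() -> Self {
        Self { frame: 0 }
    }

    /// Staging buffer to copy into this frame.
    pub fn write_index(&self) -> usize {
        (self.frame % 2) as usize
    }

    /// Staging buffer that is safe to map this frame (written last frame).
    /// Returns None on the very first frame, before anything was submitted.
    pub fn read_index(&self) -> Option<usize> {
        if self.frame == 0 {
            None
        } else {
            Some(((self.frame + 1) % 2) as usize)
        }
    }

    pub fn advance(&mut self) {
        self.frame += 1;
    }
}
```

The cost is one frame of latency on the avatar texture, which is imperceptible for a 512×512 overlay and removes the per-frame stall entirely.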

  3. crates/rendering/src/lib.rs, line 2932-2997 (link)

    P2 Duplicated avatar rendering block in prepare and prepare_with_encoder

    The entire face-tracking + avatar-render + camera-prepare block (~45 lines) is copy-pasted identically in both prepare and prepare_with_encoder. Any future changes (e.g. applying AvatarBackground, fixing the hardcoded dt) must be made in two places.

    Extract the shared logic into a helper that produces the avatar_frame and avatar_xy, then call it from both paths:

    fn build_avatar_frame(&mut self, device: &wgpu::Device, queue: &wgpu::Queue, camera_frame: Option<&decoder::DecodedFrame>, dt: f64) -> Option<(XY<u32>, decoder::DecodedFrame)> {
        if !self.avatar_enabled { return None; }
        if let (Some(face_tracker), Some(smoother)) = (self.face_tracker.as_mut(), self.face_pose_smoother.as_mut()) {
            if let Some(frame) = camera_frame {
                let raw_pose = face_tracker.track(frame.data(), frame.width(), frame.height());
                self.avatar_face_pose = smoother.update(&raw_pose, (dt * 1000.0) as f32);
            }
        }
        let avatar = self.avatar.as_mut()?;
        avatar.render(device, queue, &self.avatar_face_pose, dt);
        let size = crate::avatar::AvatarRenderer::size();
        Some((XY::new(size, size), decoder::DecodedFrame::new(avatar.output_rgba().to_vec(), size, size)))
    }

    This also fixes the hardcoded 33.0 ms / 1.0/30.0 dt values by passing in real elapsed time.

  4. crates/rendering/src/avatar.rs, line 688 (link)

    P1 AvatarBackground setting is wired up but never applied

    general_settings.rs adds both avatar_mode: bool and avatar_background: AvatarBackground, and configuration.rs defines AvatarBackground::Dark | Light | Gradient. However, the bg_color passed to AvatarUniforms is always hardcoded to [0.15, 0.15, 0.18, 1.0] (dark) regardless of the user's selection.

    The AvatarBackground value is never read anywhere in the rendering path, making the setting dead code. Either apply it here (mapping Dark/Light/Gradient to the appropriate RGBA values) or remove the setting until it's implemented.
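Applying the setting is a small mapping from the enum to the uniform values. A sketch (only the dark value [0.15, 0.15, 0.18, 1.0] comes from the code under review; the Light color and the gradient-flag approach are assumptions):

```rust
/// Mirrors the enum defined in crates/project/src/configuration.rs.
pub enum AvatarBackground {
    Dark,
    Light,
    Gradient,
}

/// Map the setting to (bg_color, gradient_flag). The flag is a hypothetical
/// extra uniform the shader could branch on to blend a gradient; Dark/Light
/// use it as 0.0 and keep a flat fill.
pub fn bg_uniform(bg: &AvatarBackground) -> ([f32; 4], f32) {
    match bg {
        AvatarBackground::Dark => ([0.15, 0.15, 0.18, 1.0], 0.0),
        AvatarBackground::Light => ([0.94, 0.94, 0.96, 1.0], 0.0), // assumed value
        AvatarBackground::Gradient => ([0.15, 0.15, 0.18, 1.0], 1.0),
    }
}
```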

  5. crates/rendering/src/lib.rs, line 2941 (link)

    P2 Hardcoded 33 ms delta time for smoother ignores actual frame rate

    self.avatar_face_pose = smoother.update(&raw_pose, 33.0);

    This assumes 30 fps regardless of the actual rendering frame rate. If the pipeline runs at 60 fps (16.6 ms) the spring will under-advance; at lower frame rates it will over-advance, causing stiff or springy animation artifacts.

    The actual elapsed time should be derived from segment_frames.recording_time or passed down through the call chain, and both occurrences (in prepare and prepare_with_encoder) need the same fix.
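Deriving dt from consecutive timestamps is a few lines; clamping guards against the first frame, seeks, and long stalls blowing up the spring integration. A sketch (function name and clamp bounds are assumptions, not the PR's API):

```rust
/// Delta time in milliseconds between consecutive recording timestamps
/// (in seconds, e.g. segment_frames.recording_time). Falls back to ~33 ms
/// on the first frame and clamps to [1, 100] ms so a seek or stall cannot
/// destabilize the spring integration.
pub fn frame_dt_ms(prev_time: Option<f64>, now: f64) -> f32 {
    let dt_ms = match prev_time {
        Some(prev) => (now - prev) * 1000.0,
        None => 33.0,
    };
    dt_ms.clamp(1.0, 100.0) as f32
}
```

The caller would store the previous frame's timestamp and pass both this value and its seconds equivalent down to smoother.update and avatar.render, replacing the hardcoded constants in both code paths.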


Last reviewed commit: "chore: update Cargo...."

Greptile also left 2 inline comments on this PR.

Comment on lines +2889 to +2896
&constants.queue,
uniforms.camera,
Some((avatar_xy, &avatar_frame, segment_frames.recording_time)),
encoder,
);
self.camera_only.prepare_with_encoder(
&constants.device,
&constants.queue,

P1 GPU resources not freed when avatar mode is disabled

set_avatar_enabled(false) only sets self.avatar_enabled = false, but leaves self.avatar, self.face_tracker, and self.face_pose_smoother as Some(...). The AvatarRenderer holds a wgpu::RenderPipeline, two wgpu::Textures, two wgpu::Buffers, and a wgpu::BindGroup — significant GPU memory that will remain allocated until the entire RendererLayers is dropped.

Users who toggle avatar mode off (Settings → Experimental) would expect these resources to be released:

Suggested change:

    pub fn set_avatar_enabled(&mut self, device: &wgpu::Device, enabled: bool) {
        self.avatar_enabled = enabled;
        if enabled && self.avatar.is_none() {
            self.avatar = Some(crate::avatar::AvatarRenderer::new(device));
            self.face_tracker = Some(cap_face_tracking::FaceTracker::new());
            self.face_pose_smoother = Some(avatar_smoothing::FacePoseSmoother::new());
        } else if !enabled {
            self.avatar = None;
            self.face_tracker = None;
            self.face_pose_smoother = None;
        }
    }

Comment on lines +235 to +240
}
}
}
}

unsafe extern "C-unwind" {

P2 Unsound transmute to retrieve results from landmarks_request

self.landmarks_request is typed as arc::R<vn::Request> but was constructed as a VNDetectFaceLandmarksRequest (or a DetectFaceRectanglesRequest in the fallback case). The code reinterprets its raw pointer as &vn::DetectFaceRectanglesRequest to call .results():

let face_req: &vn::DetectFaceRectanglesRequest =
    &*(raw as *const vn::DetectFaceRectanglesRequest);
face_req.results()

In the fallback path (create_landmarks_request returns a DetectFaceRectanglesRequest transmuted to vn::Request), this double-cast works. But in the normal path where self.landmarks_request is genuinely a VNDetectFaceLandmarksRequest, calling methods through a DetectFaceRectanglesRequest pointer is technically UB in Rust, even if the Objective-C runtime happens to handle it correctly due to the class hierarchy.

Consider casting to VNDetectFaceLandmarksRequest directly (which is the correct type), or using the Objective-C runtime's objc_msgSend / cidre's typed APIs for results() on the actual class.


…era avatar

Loads avatar.riv via createRive if available, drives state machine
inputs from face tracking events. Falls back to Canvas2D Clawd when
no .riv file is present.