Commit b941913
authored
fix: run text encoders on MPS GPU instead of CPU for Apple Silicon (Comfy-Org#12809)
On Apple Silicon, `vram_state` is set to `VRAMState.SHARED` because
CPU and GPU share unified memory. However, `text_encoder_device()`
only checked for `HIGH_VRAM` and `NORMAL_VRAM`, causing all text
encoders to fall back to CPU on MPS devices.
Adding `VRAMState.SHARED` to the condition allows non-quantized text
encoders (e.g. bf16 Gemma 3 12B) to run on the MPS GPU, providing
significant speedup for text encoding and prompt generation.
Note: quantized models (fp4/fp8) that use float8_e4m3fn internally
will still fall back to CPU via the `supports_cast()` check in
`CLIP.__init__()`, since MPS does not support fp8 dtypes.1 parent cad24ce commit b941913
1 file changed
+1
-1
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1003 | 1003 | | |
1004 | 1004 | | |
1005 | 1005 | | |
1006 | | - | |
| 1006 | + | |
1007 | 1007 | | |
1008 | 1008 | | |
1009 | 1009 | | |
| |||
0 commit comments