Commit b941913

fix: run text encoders on MPS GPU instead of CPU for Apple Silicon (Comfy-Org#12809)
On Apple Silicon, `vram_state` is set to `VRAMState.SHARED` because CPU and GPU share unified memory. However, `text_encoder_device()` only checked for `HIGH_VRAM` and `NORMAL_VRAM`, causing all text encoders to fall back to CPU on MPS devices.

Adding `VRAMState.SHARED` to the condition allows non-quantized text encoders (e.g. bf16 Gemma 3 12B) to run on the MPS GPU, providing significant speedup for text encoding and prompt generation.

Note: quantized models (fp4/fp8) that use `float8_e4m3fn` internally will still fall back to CPU via the `supports_cast()` check in `CLIP.__init__()`, since MPS does not support fp8 dtypes.
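The fp8 fallback described in the note can be pictured with a small stand-alone sketch. It does not reproduce ComfyUI's `supports_cast()` or `CLIP.__init__()`. The table `UNSUPPORTED_DTYPES` and the helpers `supports_cast_sketch` and `pick_load_device` are hypothetical names introduced here, and devices and dtypes are modeled as plain strings so the example runs without PyTorch.

```python
# Hypothetical table: dtypes a backend is assumed unable to hold.
# Only the case named in the commit message (fp8 on MPS) is listed.
UNSUPPORTED_DTYPES = {
    "mps": {"float8_e4m3fn"},
}


def supports_cast_sketch(device: str, dtype: str) -> bool:
    """Return True if `device` is assumed able to hold tensors of `dtype`."""
    return dtype not in UNSUPPORTED_DTYPES.get(device, set())


def pick_load_device(preferred: str, dtype: str) -> str:
    """Keep the preferred device when the dtype is supported, else use CPU."""
    if supports_cast_sketch(preferred, dtype):
        return preferred
    return "cpu"


if __name__ == "__main__":
    print(pick_load_device("mps", "bfloat16"))       # non-quantized encoder
    print(pick_load_device("mps", "float8_e4m3fn"))  # fp8-quantized encoder
```

Under these assumptions a bf16 encoder stays on the MPS device while an fp8-quantized one is sent to the CPU, which matches the behavior the note describes.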
1 parent cad24ce commit b941913

1 file changed, +1 -1 lines changed

comfy/model_management.py

Lines changed: 1 addition & 1 deletion
@@ -1003,7 +1003,7 @@ def text_encoder_offload_device():
 def text_encoder_device():
     if args.gpu_only:
         return get_torch_device()
-    elif vram_state in (VRAMState.HIGH_VRAM, VRAMState.NORMAL_VRAM) or comfy.memory_management.aimdo_enabled:
+    elif vram_state in (VRAMState.HIGH_VRAM, VRAMState.NORMAL_VRAM, VRAMState.SHARED) or comfy.memory_management.aimdo_enabled:
         if should_use_fp16(prioritize_performance=False):
             return get_torch_device()
         else:
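
To make the effect of the one-line change concrete, here is a minimal, self-contained sketch of the branch with and without `VRAMState.SHARED` in the condition. It is a simplified model, not the code in `comfy/model_management.py`. The enum lists only the three states named in the diff, devices are plain strings, and `gpu_only`, `fp16_ok`, and `aimdo_enabled` are passed as parameters instead of being read from global state. The CPU return in the branches the diff does not show is an assumption based on the commit message.

```python
from enum import Enum


class VRAMState(Enum):
    # Simplified stand-in containing only the states named in the diff.
    HIGH_VRAM = "high"
    NORMAL_VRAM = "normal"
    SHARED = "shared"


def text_encoder_device_sketch(
    vram_state: VRAMState,
    include_shared: bool,
    gpu_only: bool = False,
    fp16_ok: bool = True,
    aimdo_enabled: bool = False,
) -> str:
    """Model the branch before (include_shared=False) and after (True) the fix."""
    allowed = {VRAMState.HIGH_VRAM, VRAMState.NORMAL_VRAM}
    if include_shared:
        allowed.add(VRAMState.SHARED)

    if gpu_only:
        return "gpu"
    elif vram_state in allowed or aimdo_enabled:
        if fp16_ok:
            return "gpu"
        else:
            return "cpu"  # assumed fallback; this branch is cut off in the diff
    else:
        return "cpu"  # assumed fallback, per the commit message


if __name__ == "__main__":
    before = text_encoder_device_sketch(VRAMState.SHARED, include_shared=False)
    after = text_encoder_device_sketch(VRAMState.SHARED, include_shared=True)
    print(f"SHARED before the fix: {before}, after the fix: {after}")
```

In this model a `SHARED` state selects the CPU before the change and the GPU after it, while `HIGH_VRAM` and `NORMAL_VRAM` behave the same either way.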
