@juyterman1000
Contributor

When using ZeRO-3 with zero_quantized_weights=True and bf16 enabled, the dequantized weights were incorrectly cast to fp16 instead of keeping their original bf16 dtype. This caused a RuntimeError during training.

The fix adds original_dtype tracking to AllGatherCoalescedHandle, mirroring the existing pattern in AllGatherHandle, to ensure weights are converted back to their original dtype after dequantization.
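The dtype-tracking pattern described above can be sketched in a dependency-free toy model. Everything here is a hypothetical stand-in for illustration and not DeepSpeed code: `ToyTensor`, `quantize_int8`, `dequantize_fp16`, and `ToyGatherHandle` are invented names, dtypes are plain string tags, and the dequantizer is assumed to always emit fp16.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ToyTensor:
    """Stand-in for a tensor: a list of values plus a dtype tag."""
    values: List[float]
    dtype: str

    def to(self, dtype: str) -> "ToyTensor":
        # Only the tag changes in this toy; a real cast also changes precision.
        return ToyTensor(list(self.values), dtype)


def quantize_int8(t: ToyTensor) -> Tuple[List[int], float]:
    """Symmetric per-tensor int8 quantization (illustrative only)."""
    max_abs = max((abs(v) for v in t.values), default=0.0)
    scale = max_abs / 127.0 or 1.0  # fall back to 1.0 for an all-zero tensor
    q = [max(-127, min(127, round(v / scale))) for v in t.values]
    return q, scale


def dequantize_fp16(q: List[int], scale: float) -> ToyTensor:
    """Models a dequantizer whose output dtype is fixed to fp16 (assumption)."""
    return ToyTensor([x * scale for x in q], "float16")


class ToyGatherHandle:
    """Hypothetical handle that remembers the dtype the weights started with."""

    def __init__(self, q: List[int], scale: float, original_dtype: str):
        self.q = q
        self.scale = scale
        self.original_dtype = original_dtype  # recorded before quantization

    def wait(self) -> ToyTensor:
        out = dequantize_fp16(self.q, self.scale)
        # Cast back so downstream code sees the dtype it expects.
        return out.to(self.original_dtype)


weights = ToyTensor([0.5, -1.25, 2.0], "bfloat16")
q, scale = quantize_int8(weights)
handle = ToyGatherHandle(q, scale, weights.dtype)
restored = handle.wait()
print(restored.dtype)
```

Without the final cast in `wait()`, the caller would receive a tensor tagged fp16 even though the model runs in bf16, which is the kind of mismatch the description reports.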

@juyterman1000 juyterman1000 force-pushed the fix/bf16-zero3-quantized-weights branch from 8f82004 to 1c95ade Compare January 18, 2026 05:06
@PKUWZP
Collaborator

PKUWZP commented Jan 18, 2026

@juyterman1000 Thanks for working on this. Could you check two things? 1. Verify that AllGatherHandle (the non-coalesced version) doesn't have the same bug in its quantization path (lines 697-700 of partition_parameters.py). 2. (Optional) Add a regression test for bf16 + zero_quantized_weights. I think item 1 would be an interesting verification, and if the bug is there too, we can handle it in a follow-up PR.
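For the regression test suggested in item 2, the configuration that exercises this code path might look like the sketch below. The combination of ZeRO stage 3, bf16, and zero_quantized_weights comes from the PR description; the micro-batch size is an arbitrary placeholder, and how the dict is wired into an actual test is left open.

```python
import json

# Minimal DeepSpeed-style config for the bf16 + quantized-weights case.
# The batch size is a placeholder, not a value taken from the PR.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,
        "zero_quantized_weights": True,
    },
}

# Serialize to JSON, the format DeepSpeed config files use.
config_json = json.dumps(ds_config, indent=2)
print(config_json)
```

A regression test built on this config would typically run at least one training step and assert that the gathered parameters still have dtype bf16.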

@PKUWZP PKUWZP self-requested a review January 18, 2026 06:07
@juyterman1000 juyterman1000 force-pushed the fix/bf16-zero3-quantized-weights branch 3 times, most recently from e25d8f1 to 72da75f Compare January 18, 2026 19:04
@juyterman1000
Contributor Author

Thanks @PKUWZP for the suggestions! I've updated the PR with both items.

@juyterman1000 juyterman1000 force-pushed the fix/bf16-zero3-quantized-weights branch 4 times, most recently from 377b76b to 83bd5d1 Compare January 20, 2026 04:08
Signed-off-by: juyterman1000 <fastrunner10090@gmail.com>
@juyterman1000 juyterman1000 force-pushed the fix/bf16-zero3-quantized-weights branch from 83bd5d1 to 3940f49 Compare January 20, 2026 04:13
@juyterman1000
Contributor Author

Hi @tohtana,
please have a look when you have a moment. I’m happy to make any necessary adjustments if you have concerns.
