elif self.config.use_2d_fsdp_sharding:
  self.wi_kernel_axes = ("embed_no_exp", "mlp", None)
  self.wo_kernel_axes = ("embed_no_exp", "mlp", None)
  self.wi_kernel_axes = ("embed_no_exp_moe", "mlp", None)
I think we don't need to update for use_2d_fsdp_sharding? embed_no_exp_moe was not defined in 2dfsdp yml.
"embed_no_exp" shares the same logical rule contents as "embed_no_exp_moe"; see
maxtext/src/maxtext/configs/base.yml
Lines 489 to 500 in 935ffb4
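To illustrate the point about shared rule contents, here is a minimal sketch in plain Python. The rule table and the mesh axis names ("fsdp", "sequence", "tensor") are assumptions made up for illustration and are not the actual contents of base.yml; `resolve` is a hypothetical helper, not a MaxText function. It only shows that two logical names with identical rules resolve to the same sharding, and that a name with no rule fails loudly.

```python
# Hypothetical logical-axis rules. The mesh axis names are illustrative
# assumptions, not the real entries in base.yml.
LOGICAL_AXIS_RULES = [
    ("embed_no_exp", ("fsdp", "sequence")),
    ("embed_no_exp_moe", ("fsdp", "sequence")),
    ("mlp", ("tensor",)),
]


def resolve(kernel_axes, rules):
    """Map each logical axis name to its mesh axes; None stays unsharded."""
    table = dict(rules)
    resolved = []
    for name in kernel_axes:
        if name is None:
            resolved.append(None)
        elif name in table:
            resolved.append(table[name])
        else:
            # A logical name missing from the rules is reported explicitly.
            raise KeyError(f"logical axis {name!r} has no rule")
    return tuple(resolved)


old_axes = resolve(("embed_no_exp", "mlp", None), LOGICAL_AXIS_RULES)
new_axes = resolve(("embed_no_exp_moe", "mlp", None), LOGICAL_AXIS_RULES)
```

Under these assumed rules, `old_axes == new_axes`, so renaming the kernel axis would not change the resulting sharding. If a config omitted the new name, this sketch would raise a `KeyError`, which is the situation the question above is concerned with.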
suexu1025
left a comment
Need some tests covering all cases before this change.
I added tests in the PR description. Could you elaborate on which tests you would like to see?
I added a 2d-fsdp test in the PR description.
Description
Split the following logical names:
- embed_no_exp into: embed_no_exp, embed_moe
- activation_embed into: activation_embed, activation_embed_moe
- activation_norm_length into: activation_norm_length, activation_norm_length_moe
- activation_length_no_exp into: activation_length_no_exp, activation_length_no_exp_moe
- activation_batch into: activation_batch, activation_batch_moe
- activation_batch_no_exp into: activation_batch_no_exp, activation_batch_no_exp_moe
Tests
model_name=deepseek3-test on v5p-8 VM.
vLLM test
Vllm test: https://diff.googleplex.com/#key=q4eBkLazREaW
Script: https://paste.googleplex.com/6329507825975296
2d-fsdp test:
2d-fsdp: https://diff.googleplex.com/#key=sZNNYqZFXNEu
Checklist
Before submitting this PR, please make sure (put X in square brackets):
gemini-review label.