This issue serves as a tracker for PRs related to Intel customer support. Its purpose is to give an understanding of what each PR does and how important it is relative to the other customer-support-related PRs. It also helps us keep track of merged PRs and of the progress of PRs still under review.
Under review
Already merged
o MoE
o Ulysses
o AutoTP
o Accelerator Graph
o ZeRO
o Others
non_reentrant_checkpoint fix requires_grad of input must be true for activation checkpoint layer in pipeline train. #4224
deepspeed.comm instead of torch.distributed #5225
torch.nan_to_num replace numpy wrapper one #5877