[Feature] Offload optimizer states to CPU to reduce memory #1524

Open
tina-wen wants to merge 1 commit into InternLM:main from tina-wen:swap_optimizer

@tina-wen tina-wen commented Mar 3, 2026

Description

This PR adds CPU offloading for optimizer states to reduce NPU memory usage. Optimizer states are kept in host memory and copied to the device only for the duration of optimizer.step(), using host-to-device (h2d) and device-to-host (d2h) transfers.

Changes

  • Offload optimizer states to CPU memory
  • Transfer to device only during optimizer.step()
  • Resolve conflicts with DCP.save and RL offload_optimizer
  • Trade some performance for lower device memory usage
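
For readers unfamiliar with the pattern, here is a minimal sketch of keeping optimizer states on the host and moving them to the device only around the step. It is not the implementation in this PR: `HostOffloadedOptimizer` and `run_demo` are hypothetical names, the demo assumes a plain `torch.optim.AdamW` on CUDA (falling back to CPU) as a stand-in for NPU, and it uses simple blocking copies. Pinned memory, copy/compute overlap, and the DCP.save and RL offload_optimizer handling are deliberately left out.

```python
import torch


class HostOffloadedOptimizer:
    """Hypothetical wrapper for illustration; not the class added by this PR."""

    def __init__(self, optimizer: torch.optim.Optimizer):
        self.optimizer = optimizer

    def _move_states(self, to_host: bool) -> None:
        # optimizer.state maps each parameter to its per-parameter state dict.
        for param, state in self.optimizer.state.items():
            target = torch.device("cpu") if to_host else param.device
            for name, value in state.items():
                # Leave the scalar step counter where the optimizer created it.
                if torch.is_tensor(value) and name != "step":
                    state[name] = value.to(target)

    def step(self) -> None:
        self._move_states(to_host=False)  # h2d: states join the parameters
        self.optimizer.step()
        self._move_states(to_host=True)   # d2h: states return to host memory

    def zero_grad(self, set_to_none: bool = True) -> None:
        self.optimizer.zero_grad(set_to_none=set_to_none)


def run_demo():
    # Assumption: CUDA (or CPU) stands in for the NPU device here.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    torch.manual_seed(0)
    model = torch.nn.Linear(8, 4).to(device)
    opt = HostOffloadedOptimizer(torch.optim.AdamW(model.parameters(), lr=1e-2))
    before = model.weight.detach().clone()
    for _ in range(2):
        opt.zero_grad()
        loss = model(torch.randn(16, 8, device=device)).pow(2).mean()
        loss.backward()
        opt.step()
    return model, opt, before
```

Between steps, the moment estimates held in `opt.optimizer.state` live on the CPU, so the device only holds them while `step()` runs. The cost is two extra copies per step, which is the performance side of the trade-off listed above.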

Testing

Verified with:

  • Memory reduction tests
  • DCP checkpoint compatibility
  • RL optimization workflows

Collaborator

@HAOCHENYE HAOCHENYE left a comment

  1. Please

@tina-wen tina-wen changed the title from "[Feature] Offload optimizer states to CPU to reduce NPU memory with minimal performance impact" to "[Feature] Offload optimizer states to CPU to reduce memory" on Mar 11, 2026