
Conversation

@HAOCHENYE (Collaborator)

No description provided.

@HAOCHENYE force-pushed the yehc/training_with_hf branch 3 times, most recently from 0a929ca to 84bbe79 on January 28, 2026 08:44
The previous `clean_param_name` only matched "._checkpoint_wrapped_module",
which includes the leading **.**. However, for a layer wrapped by the
checkpoint wrapper at the top level, the name starts with
"_checkpoint_wrapped_module" without a dot, so it could not be cleaned
because the expected prefix is missing.
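The fix described above can be sketched as follows. This is a hypothetical illustration, not the PR's actual code: it splits the parameter name on dots and drops every `_checkpoint_wrapped_module` segment, which handles both the "._checkpoint_wrapped_module" case and a name that starts with the bare segment.

```python
# Hypothetical sketch of the described fix (the real implementation
# in the PR may use a different mechanism, e.g. a regex).
_CKPT_PREFIX = "_checkpoint_wrapped_module"

def clean_param_name(name: str) -> str:
    # Splitting on "." and filtering handles the segment anywhere in
    # the name, including at the very start where no dot precedes it.
    return ".".join(p for p in name.split(".") if p != _CKPT_PREFIX)

# Mid-name occurrence (the case the old regex already handled):
print(clean_param_name("model._checkpoint_wrapped_module.layers.0.weight"))
# Leading occurrence (the case the old regex missed):
print(clean_param_name("_checkpoint_wrapped_module.layers.0.weight"))
```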


ghstack-source-id: 9c8da53
Pull-Request: InternLM#1452
…educe code duplication

ghstack-source-id: cf0d79c
Pull-Request: InternLM#1453
ghstack-source-id: 383761a
Pull-Request: InternLM#1457
`torch.autograd.grad` raises an error if any tensor in `inputs` does not
require a gradient, e.g., the frozen `lm_head`. This commit fixes it with
a simple control flow.
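A minimal sketch of such a guard (assumed names; not the PR's actual code): filter out tensors with `requires_grad=False` before calling `torch.autograd.grad`, then map the returned gradients back to their original positions, leaving `None` for frozen tensors.

```python
import torch

def safe_grad(loss, inputs):
    """Like torch.autograd.grad(loss, inputs), but tolerates frozen tensors.

    torch.autograd.grad raises if any entry in `inputs` has
    requires_grad=False, so only trainable tensors are passed through;
    frozen ones get None in the result.
    """
    trainable = [t for t in inputs if t.requires_grad]
    grads = torch.autograd.grad(loss, trainable) if trainable else ()
    it = iter(grads)
    return [next(it) if t.requires_grad else None for t in inputs]

w = torch.randn(3, requires_grad=True)
frozen = torch.randn(3)  # e.g. a frozen lm_head weight
loss = (w * 2).sum() + frozen.sum()
grads = safe_grad(loss, [w, frozen])  # grads[1] is None instead of an error
```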


ghstack-source-id: ec3804f
Pull-Request: InternLM#1458
@HAOCHENYE force-pushed the yehc/training_with_hf branch from 84bbe79 to df0fa00 on January 28, 2026 09:29