fix: DPO implementation bug in forward_dpo (issue #1449) #1855

Open
huyyxyshare wants to merge 1 commit into FunAudioLLM:main from huyyxyshare:fix/dpo-bug-1449
Conversation

@huyyxyshare
Summary

Fix a serious bug in the DPO implementation as reported in issue #1449.

- Changed the mask condition from `== IGNORE_ID` to `!= IGNORE_ID`, so the mask selects valid tokens
- Fixed the gather index to use `~mask` for padding positions
- Together, these changes ensure the DPO loss is computed on valid tokens only, not on `IGNORE_ID` positions
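
For reviewers, here is a minimal sketch of the masking pattern the summary describes. It is not the actual `forward_dpo` code from this repository: the function name `sequence_logps`, the value `IGNORE_ID = -1`, and the tensor shapes are assumptions made for illustration. It shows a mask built with `!= IGNORE_ID`, with `~mask` used to replace padded labels by a safe index before `torch.gather`, so that padded positions contribute nothing to the summed log-probability.

```python
import torch
import torch.nn.functional as F

IGNORE_ID = -1  # assumed padding label; the real constant is defined in the repo


def sequence_logps(logits: torch.Tensor, labels: torch.Tensor,
                   ignore_id: int = IGNORE_ID) -> torch.Tensor:
    """Sum per-token log-probs over valid (non-padding) positions.

    logits: (batch, time, vocab) float tensor
    labels: (batch, time) int64 tensor, padded with ignore_id
    Hypothetical helper for illustration only.
    """
    # True on valid tokens, False on padding.
    mask = labels != ignore_id
    # Padded positions hold ignore_id, which is not a valid gather index,
    # so overwrite them (selected via ~mask) with a harmless index 0.
    safe_labels = labels.masked_fill(~mask, 0)
    logps = F.log_softmax(logits, dim=-1)
    token_logps = torch.gather(logps, 2, safe_labels.unsqueeze(-1)).squeeze(-1)
    # Zero out padded positions before summing over time.
    return (token_logps * mask).sum(dim=-1)
```

With the inverted condition (`== IGNORE_ID`), the same multiplication would keep only the padded positions and drop every real token, which matches the failure the summary describes.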