Reproduction Steps for TransMLA-llama3-8b-8k #42

@haok1402

https://huggingface.co/fxmeng/TransMLA-llama3-8b-8k/

Hi, thanks for the amazing work. I see that a TransMLA model has been released on Hugging Face. Could you clarify how to reproduce the conversion from Meta-Llama3-8B to TransMLA-llama3-8b-8k? What training data was used? In particular, the experimental setup in the paper seems to focus primarily on SmolLM 1.7B and Llama 2 7B, with little mention of Llama 3 8B.
