@TRM-coding
Finish operator implementation

ma-hang and others added 30 commits September 15, 2025 11:49
…ise framework on CPU, NVIDIA, Cambricon, Metax, Moore, and Kunlun
* issue/450: change indexToReducedOffset() to indexToOffset in elementwise framework on CPU, NVIDIA, Cambricon, Metax, Moore, and Kunlun

* issue/450: remove indexToReducedOffset() in all platforms

* issue/450: add the testcases that pinpoint the issue in infiniop-test
…unlun to use the refactored interface and return unimplemented error for NEOX-style algorithm
…_rope_and_rope_v2

Issue/428: Merge `rope_v2` into `rope`
Signed-off-by: Ceng <441651826@qq.com>
* issue/436: support kunlun rope U32

* issue/436: support 9g7b 4b models

---------

Co-authored-by: zhangyue <zhangyue@qiyuanlab.com>
…icon

issue/434 - added bf16 support for Cambricon MLU
issue/466: implement the NEOX algorithm for rope on the Kunlun platform
* issue/459 - Support more data type combinations

* issue/459 - added test cases for 9G7B and 9G70B

* issue/459 - modified rms kernel to support larger tensors
issue/469: disable NVIDIA-dequantize on Iluvatar GPU via ENABLE_NVIDIA_API macro