Checklist
Describe the bug
With both the openmmlab/lmdeploy:latest-cu12.8 and openmmlab/lmdeploy:latest-cu12 images, any input triggers the error below. The environment is 8 × V100 32GB.
[TM][ERROR] CUDA runtime error: an illegal memory access was encountered /opt/lmdeploy/src/turbomind/core/stream.h:27
Reproduction
Pull the latest image: docker pull openmmlab/lmdeploy:latest-cu12.8
Load the glm-4.7-flash-awq model.
Run an inference task. The error occurs and causes the framework to restart.
Environment
lmdeploy:latest-cu12.8
Ubuntu 22.04
Error traceback
[TM][ERROR] CUDA runtime error: an illegal memory access was encountered /opt/lmdeploy/src/turbomind/core/stream.h:27
[TM][ERROR] CUDA runtime error: an illegal memory access was encountered /opt/lmdeploy/src/turbomind/core/allocator.cc:49
[TM][ERROR] CUDA runtime error: an illegal memory access was encountered /opt/lmdeploy/src/turbomind/core/allocator.cc:49
[TM][ERROR] void turbomind::LlamaLinear::Impl::Forward(turbomind::core::Tensor&, const turbomind::core::Tensor&, const turbomind::LlamaDenseWeight&, const turbomind::core::Buffer_<int>&, const turbomind::core::Buffer_<int>&): 1
[TM][ERROR] CUDA runtime error: an illegal memory access was encountered /opt/lmdeploy/src/turbomind/core/allocator.cc:55
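For triage, the CUDA errors in the traceback above can be grouped by the source location that reported them. The following is a minimal sketch, assuming the log format shown in this report; `CUDA_ERROR` and `summarize_cuda_errors` are hypothetical helper names, not part of lmdeploy.

```python
import re
from collections import Counter

# Assumed line format, taken from the traceback in this report:
# [TM][ERROR] CUDA runtime error: <message> <source path>:<line number>
CUDA_ERROR = re.compile(
    r"^\[TM\]\[ERROR\] CUDA runtime error: "
    r"(?P<message>.+) (?P<path>/\S+):(?P<line>\d+)$"
)


def summarize_cuda_errors(log_text):
    """Count CUDA runtime errors per (source file, line number) site."""
    counts = Counter()
    for raw in log_text.splitlines():
        match = CUDA_ERROR.match(raw.strip())
        if match:  # lines that are not CUDA runtime errors are skipped
            counts[(match["path"], int(match["line"]))] += 1
    return counts


# The traceback from this report, copied verbatim.
LOG = """\
[TM][ERROR] CUDA runtime error: an illegal memory access was encountered /opt/lmdeploy/src/turbomind/core/stream.h:27
[TM][ERROR] CUDA runtime error: an illegal memory access was encountered /opt/lmdeploy/src/turbomind/core/allocator.cc:49
[TM][ERROR] CUDA runtime error: an illegal memory access was encountered /opt/lmdeploy/src/turbomind/core/allocator.cc:49
[TM][ERROR] void turbomind::LlamaLinear::Impl::Forward(turbomind::core::Tensor&, const turbomind::core::Tensor&, const turbomind::LlamaDenseWeight&, const turbomind::core::Buffer_<int>&, const turbomind::core::Buffer_<int>&): 1
[TM][ERROR] CUDA runtime error: an illegal memory access was encountered /opt/lmdeploy/src/turbomind/core/allocator.cc:55
"""

if __name__ == "__main__":
    for (path, line), count in sorted(summarize_cuda_errors(LOG).items()):
        print(f"{path}:{line} reported {count} time(s)")
```

Because CUDA errors are reported asynchronously, the listed sites are where the failure was detected, not necessarily where the faulting kernel was launched. The `LlamaLinear::Impl::Forward` line is the only entry that names a specific operation.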