
[DIPU] Setting DIPU_PYTHON_DEVICE_AS_CUDA=false on Huawei fails with "module 'torch' has no attribute 'xpu'" #804


Description

@douzizi

Background: running llama2 inference with DIPU on a Huawei Ascend 910B. `import torch_dipu` prints the following message:

dipu device will show as cuda device. if it's not expected behavior, please set env DIPU_PYTHON_DEVICE_AS_CUDA=false

If `export DIPU_PYTHON_DEVICE_AS_CUDA=false` is set, inference fails with:

AttributeError: module 'torch' has no attribute 'xpu'

Reverting to `export DIPU_PYTHON_DEVICE_AS_CUDA=true` makes the error disappear. Full error output:

[W OperatorEntry.cpp:153] Warning: Warning only once for all operators,  other operators may also be overrided.
  Overriding a previously registered kernel for the same operator and the same dispatch key
  operator: aten::cat(Tensor[] tensors, int dim=0) -> Tensor
    registered at /pytorch/build/aten/src/ATen/RegisterSchema.cpp:6
  dispatch key: XLA
  previous kernel: registered at /pytorch/aten/src/ATen/LegacyBatchingRegistrations.cpp:1079
       new kernel: registered at /deeplink_main/deeplink.framework/dipu/third_party/DIOPI/impl/ascend_npu/torch_npu/csrc/DIOPIAdapter.cpp:3364 (function operator())
Thu May  9 09:03:37 2024 dipu | git hash:aea639af-dirty
No NVTX under your environment, ignore related API under this condition.
/root/miniconda3/envs/pt/lib/python3.8/site-packages/torch_npu/dynamo/__init__.py:18: UserWarning: Register eager implementation for the 'npu' backend of dynamo, as torch_npu was not compiled with torchair.
  warnings.warn(
Loading checkpoint shards:   0%|                                                                                                                                                                                                   | 0/3 [00:00<?, ?it/s]/root/miniconda3/envs/pt/lib/python3.8/site-packages/torch/_utils.py:831: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  return self.fget.__get__(instance, owner)()
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:07<00:00,  2.39s/it]
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
/deeplink_main/deeplink.framework/dipu/third_party/DIOPI/build/_deps/op_plugin-src/op_plugin/utils/op_api_common.h:GetOpApiLibHandler:103 [PTA]:"dlopen libcust_opapi.so failed, error:(null)."Traceback (most recent call last):
  File "jiutian-dipu.py", line 26, in <module>
    generate_ids  = model.generate(**generate_input)
  File "/root/miniconda3/envs/pt/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/root/miniconda3/envs/pt/lib/python3.8/site-packages/transformers/generation/utils.py", line 1575, in generate
    result = self._sample(
  File "/root/miniconda3/envs/pt/lib/python3.8/site-packages/transformers/generation/utils.py", line 2697, in _sample
    outputs = self(
  File "/root/miniconda3/envs/pt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/envs/pt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/miniconda3/envs/pt/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 1196, in forward
    outputs = self.model(
  File "/root/miniconda3/envs/pt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/envs/pt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/miniconda3/envs/pt/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 1016, in forward
    layer_outputs = decoder_layer(
  File "/root/miniconda3/envs/pt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/envs/pt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/miniconda3/envs/pt/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 739, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/root/miniconda3/envs/pt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/envs/pt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/miniconda3/envs/pt/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 360, in forward
    cos, sin = self.rotary_emb(value_states, position_ids)
  File "/root/miniconda3/envs/pt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/envs/pt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/miniconda3/envs/pt/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/root/miniconda3/envs/pt/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 140, in forward
    with torch.autocast(device_type=device_type, enabled=False):
  File "/root/miniconda3/envs/pt/lib/python3.8/site-packages/torch/amp/autocast_mode.py", line 208, in __init__
    self.fast_dtype = torch.xpu.get_autocast_xpu_dtype()  # type: ignore[attr-defined]
  File "/root/miniconda3/envs/pt/lib/python3.8/site-packages/torch/__init__.py", line 1833, in __getattr__
    raise AttributeError(f"module '{__name__}' has no attribute '{name}'")
AttributeError: module 'torch' has no attribute 'xpu'
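
For reference, the last step of the failure can be isolated outside DIPU. `LlamaRotaryEmbedding.forward` in transformers wraps its computation in `torch.autocast(device_type=..., enabled=False)` using the device type reported by the input tensor; once the tensors no longer report themselves as `cuda`, the autocast constructor takes the xpu branch and calls `torch.xpu.get_autocast_xpu_dtype()`. A minimal sketch, assuming a torch 2.1.x build without the `xpu` module (it reproduces only the final AttributeError, not the DIPU setup):

```python
import torch

# transformers' rotary embedding does essentially this, passing the device type
# reported by value_states; with DIPU_PYTHON_DEVICE_AS_CUDA=false that type is
# no longer "cuda", so autocast resolves its fast dtype via torch.xpu, which is
# absent in this torch build.
with torch.autocast(device_type="xpu", enabled=False):
    pass
# AttributeError: module 'torch' has no attribute 'xpu'
```

Keeping `DIPU_PYTHON_DEVICE_AS_CUDA=true` sidesteps this because the tensors keep reporting `cuda`, so autocast stays on the CUDA path.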
