[Bug] huawei - ascend RuntimeError: Failed to load the backend extension: torch_npu #7

@Al2O3Al2O3

Environment

  • Image: harbor.baai.ac.cn/flagtree/flagtree-ascend-910c-py311-torch2.6.0-cann8.5.0-ubuntu22.04-aarch64:202603
  • Chip: ascend
  • GPU count: 8
  • Scheduling method: SSH
  • Cluster: online
  • Component versions: environment bundled with the image

Command executed

#!/bin/bash
set -e  # exit immediately on error
set -x  # print each command as it is executed

cd /home
git clone https://github.com/flagos-ai/KernelGenBench.git
cd /home/KernelGenBench

pip install -r requirements.txt
pip install -e .

# Single operator test
python scripts/generate_kernel_and_verify.py \
    --op-name aten::add \
    --single-test \
    --server-type openai \
    --model-name your-model-name \
    --max-rounds 3

# Full benchmark (all 210 operators)
python scripts/generate_kernel_and_verify.py \
    --server-type openai \
    --model-name your-model-name \
    --max-rounds 3

# Non-NVIDIA chips (ATen only)
python scripts/generate_kernel_and_verify.py \
    --dataset KernelGenBench-aten \
    --server-type openai \
    --model-name your-model-name \
    --max-rounds 3

Error message

SSH execution failed on hosts: 10.0.0.31

Failed host log

[bm-ctyun-wq-910b-64g-0-31] 10.0.0.31

202603: Pulling from flagtree/flagtree-ascend-910c-py311-torch2.6.0-cann8.5.0-ubuntu22.04-aarch64
Digest: sha256:e52a06419134ca8c1a1588754f28112adb02667641fe55091e52d74ab718a229
Status: Image is up to date for harbor.baai.ac.cn/flagtree/flagtree-ascend-910c-py311-torch2.6.0-cann8.5.0-ubuntu22.04-aarch64:202603
harbor.baai.ac.cn/flagtree/flagtree-ascend-910c-py311-torch2.6.0-cann8.5.0-ubuntu22.04-aarch64:202603
Cloning into 'FlagTest'...
+ cd /home
+ git clone https://github.com/flagos-ai/KernelGenBench.git
Cloning into 'KernelGenBench'...
+ cd /home/KernelGenBench
+ pip install -r requirements.txt
Looking in indexes: https://mirrors.huaweicloud.com/repository/pypi/simple, https://download.pytorch.org/whl/cpu/, https://mirrors.huaweicloud.com/ascend/repos/pypi
Collecting anthropic>=0.71.0 (from -r requirements.txt (line 2))
  Downloading https://mirrors.huaweicloud.com/repository/pypi/packages/b8/12/d9ab42790494d7c428391a46cd28492395566a6a8ccb138d681978594455/anthropic-0.104.1-py3-none-any.whl (832 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 833.0/833.0 kB 51.1 MB/s  0:00:00
Collecting openai>=2.24.0 (from -r requirements.txt (line 3))
  Downloading https://mirrors.huaweicloud.com/repository/pypi/packages/0a/bf/ccff9be562e24207716d04ef9dc931c76aff0c89a7265da43e2104d7fe06/openai-2.38.0-py3-none-any.whl (1.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 23.9 MB/s  0:00:00
Collecting PyYAML>=6.0.3 (from -r requirements.txt (line 4))
  Downloading https://mirrors.huaweicloud.com/repository/pypi/packages/0c/62/d2eb46264d4b157dae1275b573017abec435397aa59cbcdab6fc978a8af4/pyyaml-6.0.3-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl (775 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 775.6/775.6 kB 22.0 MB/s  0:00:00
Requirement already satisfied: rich>=14.3.3 in /usr/local/python3.11.13/lib/python3.11/site-packages (from -r requirements.txt (line 5)) (14.3.3)
Collecting tqdm>=4.67.3 (from -r requirements.txt (line 6))
  Downloading https://mirrors.huaweicloud.com/repository/pypi/packages/16/e1/3079a9ff9b8e11b846c6ac5c8b5bfb7ff225eee721825310c91b3b50304f/tqdm-4.67.3-py3-none-any.whl (78 kB)
Collecting fastapi>=0.134.0 (from -r requirements.txt (line 9))
  Downloading https://mirrors.huaweicloud.com/repository/pypi/packages/e0/82/45359b62a067409bd929ae8a56b8ed13e5a8c8a61194b3c236920999ab83/fastapi-0.136.3-py3-none-any.whl (117 kB)
Collecting uvicorn>=0.41.0 (from -r requirements.txt (line 10))
  Downloading https://mirrors.huaweicloud.com/repository/pypi/packages/01/be/72532be3da7acc5fdfbccdb95215cd04f995a0886532a5b423f929cda4cc/uvicorn-0.48.0-py3-none-any.whl (71 kB)
Requirement already satisfied: scipy in /usr/local/python3.11.13/lib/python3.11/site-packages (from -r requirements.txt (line 11)) (1.13.1)
Collecting ijson>=3.5.0 (from -r requirements.txt (line 12))
  Downloading https://mirrors.huaweicloud.com/repository/pypi/packages/cd/32/e05ff8b72a44fe9d192f41c5dcbc35cfa87efc280cdbfe539ffaf4a75

...(73270 characters omitted)...

_npu/__init__.py", line 39, in <module>
    import torch_npu.npu
  File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch_npu/npu/__init__.py", line 127, in <module>
    from torch_npu.utils import _should_print_warning
  File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch_npu/utils/__init__.py", line 1, in <module>
    from torch_npu import _C
ImportError: /usr/local/python3.11.13/lib/python3.11/site-packages/torch_npu/lib/libtorch_npu.so: undefined symbol: _ZNK3c1010TensorImpl20is_contiguous_customENS_12MemoryFormatE

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/__init__.py", line 2833, in _import_device_backends
    entrypoint = backend_extension.load()
                 ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/python3.11.13/lib/python3.11/importlib/metadata/__init__.py", line 202, in load
    module = import_module(match.group('module'))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/python3.11.13/lib/python3.11/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 940, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch_npu/__init__.py", line 41, in <module>
    from torch_npu.utils._error_code import ErrCode, pta_error
  File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch_npu/utils/__init__.py", line 1, in <module>
    from torch_npu import _C
ImportError: /usr/local/python3.11.13/lib/python3.11/site-packages/torch_npu/lib/libtorch_npu.so: undefined symbol: _ZNK3c1010TensorImpl20is_contiguous_customENS_12MemoryFormatE

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/KernelGenBench/scripts/generate_kernel_and_verify.py", line 11, in <module>
    from kernelgenbench.dataset import TorchOpsLoader, APIInfo
  File "/home/KernelGenBench/src/kernelgenbench/__init__.py", line 2, in <module>
    import torch
  File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/__init__.py", line 2878, in <module>
    _import_device_backends()
  File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/__init__.py", line 2837, in _import_device_backends
    raise RuntimeError(
RuntimeError: Failed to load the backend extension: torch_npu. You can disable extension auto-loading with TORCH_DEVICE_BACKEND_AUTOLOAD=0.
qa-fef8efc52265
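
The undefined symbol in the log demangles to `c10::TensorImpl::is_contiguous_custom(c10::MemoryFormat) const`. A missing libtorch symbol at load time usually means `libtorch_npu.so` was built against a different `torch` build than the one now installed, for example if `pip install -r requirements.txt` replaced the image's torch 2.6.0. That is only a hypothesis here, because the relevant part of the log is omitted. A minimal sketch for checking it without importing `torch` is shown below. The helper names are hypothetical, and the rule that `torch` and `torch_npu` should share the same major.minor version is an assumption based on how torch_npu releases are numbered.

```python
import re
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent.

    Reads package metadata only, so it works even when `import torch` fails.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def major_minor(version):
    """Extract (major, minor) from a version string such as '2.6.0+cpu'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def versions_compatible(torch_version, torch_npu_version):
    """Assumption: torch_npu X.Y.* is built for torch X.Y.*."""
    return major_minor(torch_version) == major_minor(torch_npu_version)


if __name__ == "__main__":
    torch_v = installed_version("torch")
    npu_v = installed_version("torch_npu")
    print(f"torch:     {torch_v}")
    print(f"torch_npu: {npu_v}")
    if torch_v is None or npu_v is None:
        print("one of the packages is not installed")
    elif versions_compatible(torch_v, npu_v):
        print("major.minor versions match")
    else:
        print("major.minor versions differ; torch was likely replaced after the image was built")
```

Running this inside the container right after `pip install -e .` would show whether the torch version still matches the 2.6.0 named in the image tag. As the error message itself notes, setting `TORCH_DEVICE_BACKEND_AUTOLOAD=0` disables extension auto-loading, which may help confirm that `import torch` works on its own.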

Task information

  • Task ID: fef8efc522654beaab348970c8e68c0f
  • Task name: huawei
  • Test type: kernelgenbench
  • Created: 2026-05-27 15:12:59
  • Completed: 2026-05-27 15:59:02

Labels

  • P0 (Priority 0 - urgent)
  • bug (Something isn't working)
  • flagos2.1-rc0