SSH execution failed on hosts: 10.0.0.31
202603: Pulling from flagtree/flagtree-ascend-910c-py311-torch2.6.0-cann8.5.0-ubuntu22.04-aarch64
Digest: sha256:e52a06419134ca8c1a1588754f28112adb02667641fe55091e52d74ab718a229
Status: Image is up to date for harbor.baai.ac.cn/flagtree/flagtree-ascend-910c-py311-torch2.6.0-cann8.5.0-ubuntu22.04-aarch64:202603
harbor.baai.ac.cn/flagtree/flagtree-ascend-910c-py311-torch2.6.0-cann8.5.0-ubuntu22.04-aarch64:202603
Cloning into 'FlagTest'...
+ cd /home
+ git clone https://github.com/flagos-ai/KernelGenBench.git
Cloning into 'KernelGenBench'...
+ cd /home/KernelGenBench
+ pip install -r requirements.txt
Looking in indexes: https://mirrors.huaweicloud.com/repository/pypi/simple, https://download.pytorch.org/whl/cpu/, https://mirrors.huaweicloud.com/ascend/repos/pypi
Collecting anthropic>=0.71.0 (from -r requirements.txt (line 2))
Downloading https://mirrors.huaweicloud.com/repository/pypi/packages/b8/12/d9ab42790494d7c428391a46cd28492395566a6a8ccb138d681978594455/anthropic-0.104.1-py3-none-any.whl (832 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 833.0/833.0 kB 51.1 MB/s 0:00:00
Collecting openai>=2.24.0 (from -r requirements.txt (line 3))
Downloading https://mirrors.huaweicloud.com/repository/pypi/packages/0a/bf/ccff9be562e24207716d04ef9dc931c76aff0c89a7265da43e2104d7fe06/openai-2.38.0-py3-none-any.whl (1.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 23.9 MB/s 0:00:00
Collecting PyYAML>=6.0.3 (from -r requirements.txt (line 4))
Downloading https://mirrors.huaweicloud.com/repository/pypi/packages/0c/62/d2eb46264d4b157dae1275b573017abec435397aa59cbcdab6fc978a8af4/pyyaml-6.0.3-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl (775 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 775.6/775.6 kB 22.0 MB/s 0:00:00
Requirement already satisfied: rich>=14.3.3 in /usr/local/python3.11.13/lib/python3.11/site-packages (from -r requirements.txt (line 5)) (14.3.3)
Collecting tqdm>=4.67.3 (from -r requirements.txt (line 6))
Downloading https://mirrors.huaweicloud.com/repository/pypi/packages/16/e1/3079a9ff9b8e11b846c6ac5c8b5bfb7ff225eee721825310c91b3b50304f/tqdm-4.67.3-py3-none-any.whl (78 kB)
Collecting fastapi>=0.134.0 (from -r requirements.txt (line 9))
Downloading https://mirrors.huaweicloud.com/repository/pypi/packages/e0/82/45359b62a067409bd929ae8a56b8ed13e5a8c8a61194b3c236920999ab83/fastapi-0.136.3-py3-none-any.whl (117 kB)
Collecting uvicorn>=0.41.0 (from -r requirements.txt (line 10))
Downloading https://mirrors.huaweicloud.com/repository/pypi/packages/01/be/72532be3da7acc5fdfbccdb95215cd04f995a0886532a5b423f929cda4cc/uvicorn-0.48.0-py3-none-any.whl (71 kB)
Requirement already satisfied: scipy in /usr/local/python3.11.13/lib/python3.11/site-packages (from -r requirements.txt (line 11)) (1.13.1)
Collecting ijson>=3.5.0 (from -r requirements.txt (line 12))
Downloading https://mirrors.huaweicloud.com/repository/pypi/packages/cd/32/e05ff8b72a44fe9d192f41c5dcbc35cfa87efc280cdbfe539ffaf4a75
...(73270 characters omitted from the middle)...
_npu/__init__.py", line 39, in <module>
import torch_npu.npu
File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch_npu/npu/__init__.py", line 127, in <module>
from torch_npu.utils import _should_print_warning
File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch_npu/utils/__init__.py", line 1, in <module>
from torch_npu import _C
ImportError: /usr/local/python3.11.13/lib/python3.11/site-packages/torch_npu/lib/libtorch_npu.so: undefined symbol: _ZNK3c1010TensorImpl20is_contiguous_customENS_12MemoryFormatE
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/__init__.py", line 2833, in _import_device_backends
entrypoint = backend_extension.load()
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/python3.11.13/lib/python3.11/importlib/metadata/__init__.py", line 202, in load
module = import_module(match.group('module'))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/python3.11.13/lib/python3.11/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<frozen importlib._bootstrap>", line 1204, in _gcd_import
File "<frozen importlib._bootstrap>", line 1176, in _find_and_load
File "<frozen importlib._bootstrap>", line 1147, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 940, in exec_module
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch_npu/__init__.py", line 41, in <module>
from torch_npu.utils._error_code import ErrCode, pta_error
File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch_npu/utils/__init__.py", line 1, in <module>
from torch_npu import _C
ImportError: /usr/local/python3.11.13/lib/python3.11/site-packages/torch_npu/lib/libtorch_npu.so: undefined symbol: _ZNK3c1010TensorImpl20is_contiguous_customENS_12MemoryFormatE
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/KernelGenBench/scripts/generate_kernel_and_verify.py", line 11, in <module>
from kernelgenbench.dataset import TorchOpsLoader, APIInfo
File "/home/KernelGenBench/src/kernelgenbench/__init__.py", line 2, in <module>
import torch
File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/__init__.py", line 2878, in <module>
_import_device_backends()
File "/usr/local/python3.11.13/lib/python3.11/site-packages/torch/__init__.py", line 2837, in _import_device_backends
raise RuntimeError(
RuntimeError: Failed to load the backend extension: torch_npu. You can disable extension auto-loading with TORCH_DEVICE_BACKEND_AUTOLOAD=0.
qa-fef8efc52265
Environment information
harbor.baai.ac.cn/flagtree/flagtree-ascend-910c-py311-torch2.6.0-cann8.5.0-ubuntu22.04-aarch64:202603
Executed command
Error message
Failed host log
[bm-ctyun-wq-910b-64g-0-31] 10.0.0.31
Task information
fef8efc522654beaab348970c8e68c0f
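
Diagnostic note: the undefined symbol in the traceback above demangles to `c10::TensorImpl::is_contiguous_custom(c10::MemoryFormat) const`. An unresolved libtorch symbol in `libtorch_npu.so` usually indicates that the installed `torch` no longer matches the build that `torch_npu` was compiled against. One possible cause, not confirmed by the visible part of the log, is that `pip install -r requirements.txt` replaced the image's torch 2.6.0. The sketch below is a minimal check under that assumption: it reads the installed versions from package metadata without importing `torch` (which fails here) and compares major.minor. The helper names are hypothetical, and the rule that `torch_npu` shares major.minor with its matching `torch` is an assumption about the release naming, not something stated in the log.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '2.6.0+cpu'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def versions_match(torch_version: str, npu_version: str) -> bool:
    """Assumption: a compatible torch_npu shares major.minor with torch."""
    return major_minor(torch_version) == major_minor(npu_version)


if __name__ == "__main__":
    torch_ver = installed_version("torch")
    npu_ver = installed_version("torch_npu")
    print(f"torch: {torch_ver}, torch_npu: {npu_ver}")
    if torch_ver is None or npu_ver is None:
        print("one of the packages is not installed; nothing to compare")
    elif versions_match(torch_ver, npu_ver):
        print("major.minor agree; the mismatch may lie elsewhere (e.g. a different build)")
    else:
        print("major.minor differ; torch was likely changed after the image was built")
```

Running this inside the container right after the `pip install` step would show whether the torch version still corresponds to the 2.6.0 in the image tag. Setting `TORCH_DEVICE_BACKEND_AUTOLOAD=0`, as the RuntimeError suggests, only skips the automatic import of `torch_npu`; it does not resolve the symbol mismatch itself.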