[LOWOHA] Guard RMS_NORM AVX-512 dispatch with CPU capability check #24

Open
bghimireamd wants to merge 1 commit into amd:main from bghimireamd:users/bghimire/fix/rms-norm-avx512-dispatch-guard

@bghimireamd

Summary

  • Add get_avx512f_status() check in normalization_kernel_wrapper() before dispatching to rms_norm_avx512()
  • Fall through to the existing normalization_reference_wrapper() on non-AVX-512 platforms (Zen 3 and earlier)
  • Add #include "common/zendnnl_global.hpp" for zendnnl_platform_info() access

Fixes #23

Root cause

normalization_kernel_wrapper() unconditionally calls rms_norm_avx512() for RMS_NORM and FUSED_ADD_RMS_NORM, causing SIGILL on CPUs without AVX-512 (e.g., AMD EPYC 7313, Zen 3). Other operators like matmul correctly guard ISA-specific paths (see lowoha_matmul.cpp:351 checking get_f16_status()).
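
The guard pattern described in the root cause can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual ZenDNN code: `norm_type`, `kernel_path`, `platform_info_stub`, and `select_norm_kernel()` are hypothetical stand-ins, and the real signatures of `get_avx512f_status()`, `rms_norm_avx512()`, and `normalization_reference_wrapper()` are not shown in this PR description.

```cpp
// Hypothetical stand-ins for illustration only; the real ZenDNN types differ.
enum class norm_type { RMS_NORM, FUSED_ADD_RMS_NORM, LAYER_NORM };
enum class kernel_path { avx512, reference };

// Stand-in for the platform query object; only the method name
// get_avx512f_status() is taken from the PR text.
struct platform_info_stub {
    bool avx512f;
    bool get_avx512f_status() const { return avx512f; }
};

// Pick the AVX-512 kernel only when the norm type has one AND the CPU
// reports AVX-512F; every other case falls through to the reference path.
kernel_path select_norm_kernel(norm_type type, const platform_info_stub &info) {
    const bool is_rms_variant = type == norm_type::RMS_NORM ||
                                type == norm_type::FUSED_ADD_RMS_NORM;
    if (is_rms_variant && info.get_avx512f_status()) {
        return kernel_path::avx512;     // would call rms_norm_avx512()
    }
    return kernel_path::reference;      // would call the reference wrapper
}
```

Keeping the capability check in the dispatcher, rather than inside the kernel, means the AVX-512 function is never entered on a CPU that cannot execute its instructions, which is what avoids the SIGILL.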

Test plan

  • Verified on AMD EPYC 7313 (Zen 3, AVX2-only): run_lowoha_rms_norm_fp32_example() no longer crashes
  • RMS_NORM falls through to reference kernel and executes successfully
  • Verify that the AVX-512 path is still taken on Zen 4+ hardware (no regression)

🤖 Generated with Claude Code

normalization_kernel_wrapper() unconditionally dispatches RMS_NORM and
FUSED_ADD_RMS_NORM to rms_norm_avx512(), causing SIGILL on non-AVX-512
CPUs (Zen 3 and earlier). Add get_avx512f_status() check before the
AVX-512 path and fall through to the existing reference kernel when
AVX-512 is unavailable.

Also add early data type validation: normalization kernels only support
f32 and bf16. Unsupported types (f16, s8, etc.) now return failure with
a clear error message instead of silently misinterpreting the data.

Fixes amd#23

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
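
The early data type validation described in the commit message can be sketched as below. This is an illustration under assumptions rather than the code in this PR: the `data_type` and `status` enums, `data_type_name()`, `validate_norm_data_type()`, and the error message wording are all hypothetical; only the rule "f32 and bf16 are supported, everything else fails" comes from the commit message.

```cpp
#include <string>

// Hypothetical enums for illustration; the real ZenDNN definitions differ.
enum class data_type { f32, bf16, f16, s8 };
enum class status { success, failure };

// Human-readable name used in the error message.
const char *data_type_name(data_type dt) {
    switch (dt) {
    case data_type::f32:  return "f32";
    case data_type::bf16: return "bf16";
    case data_type::f16:  return "f16";
    case data_type::s8:   return "s8";
    }
    return "unknown";
}

// Reject unsupported types up front so a kernel never reinterprets
// a buffer as the wrong element type.
status validate_norm_data_type(data_type dt, std::string &error) {
    if (dt == data_type::f32 || dt == data_type::bf16) {
        error.clear();
        return status::success;
    }
    error = std::string("normalization: unsupported data type ") +
            data_type_name(dt) + " (only f32 and bf16 are supported)";
    return status::failure;
}
```

Running this check before kernel selection keeps the failure mode explicit: the caller receives a failure status and a message instead of numerically wrong output.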
@bghimireamd force-pushed the users/bghimire/fix/rms-norm-avx512-dispatch-guard branch from 1da81d3 to 0f2d9c6 on April 20, 2026 00:14

Development

Successfully merging this pull request may close these issues.

RMS_NORM dispatcher unconditionally calls AVX-512 kernel, SIGILL on Zen3
