[LOWOHA] Guard RMS_NORM AVX-512 dispatch with CPU capability check #24
Open
bghimireamd wants to merge 1 commit into amd:main
normalization_kernel_wrapper() unconditionally dispatches RMS_NORM and FUSED_ADD_RMS_NORM to rms_norm_avx512(), causing SIGILL on non-AVX-512 CPUs (Zen 3 and earlier). Add a get_avx512f_status() check before the AVX-512 path and fall through to the existing reference kernel when AVX-512 is unavailable.

Also add early data type validation: normalization kernels only support f32 and bf16. Unsupported types (f16, s8, etc.) now return failure with a clear error message instead of silently misinterpreting the data.

Fixes amd#23

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
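For readers without the diff open, the early data type validation described in the commit message could look roughly like the sketch below. It is not the code from this PR: the `DataType` enum, `ValidationResult`, `dtype_name()`, and `validate_norm_dtype()` are hypothetical names chosen for illustration, and the error text is made up. Only the rule itself (f32 and bf16 accepted, everything else rejected with a message) comes from the commit message.

```cpp
#include <cassert>
#include <string>

// Hypothetical data type tag; the real library has its own enum.
enum class DataType { f32, bf16, f16, s8 };

// Hypothetical result type: success flag plus a human-readable message.
struct ValidationResult {
    bool ok;
    std::string message;
};

// Map a tag to a printable name for the error message.
inline const char* dtype_name(DataType dt) {
    switch (dt) {
        case DataType::f32:  return "f32";
        case DataType::bf16: return "bf16";
        case DataType::f16:  return "f16";
        case DataType::s8:   return "s8";
    }
    return "unknown";
}

// Reject unsupported types up front, before any kernel touches the buffers.
// Assumption taken from the commit message: only f32 and bf16 are supported.
inline ValidationResult validate_norm_dtype(DataType dt) {
    if (dt == DataType::f32 || dt == DataType::bf16) {
        return {true, ""};
    }
    return {false,
            std::string("normalization: unsupported data type '") +
                dtype_name(dt) + "' (expected f32 or bf16)"};
}
```

Validating before dispatch means a caller passing f16 or s8 gets an explicit failure instead of a kernel reading the buffer with the wrong element width.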
1da81d3 to 0f2d9c6
Summary
- Add get_avx512f_status() check in normalization_kernel_wrapper() before dispatching to rms_norm_avx512()
- Fall through to normalization_reference_wrapper() on non-AVX-512 platforms (Zen 3 and earlier)
- Add #include "common/zendnnl_global.hpp" for zendnnl_platform_info() access

Fixes #23
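The guard pattern summarized above can be sketched in isolation as follows. This is a minimal illustration, not the PR's implementation: `cpu_has_avx512f()` is a hypothetical stand-in for get_avx512f_status() built on the GCC/Clang builtin `__builtin_cpu_supports`, and `RmsNormKernel`, `select_rms_norm_kernel()`, and `rms_norm_reference_sketch()` are invented names. The scalar kernel assumes the usual RMS norm definition, y[i] = x[i] / sqrt(mean(x^2) + eps) * gamma[i].

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical stand-in for get_avx512f_status(): runtime CPU feature query.
// Falls back to "not available" on compilers or targets without the builtin.
inline bool cpu_has_avx512f() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("avx512f") != 0;
#else
    return false;
#endif
}

enum class RmsNormKernel { Avx512, Reference };

// Pure selection logic, kept separate from the CPU query so it can be tested
// on any machine.
inline RmsNormKernel select_rms_norm_kernel(bool has_avx512f) {
    return has_avx512f ? RmsNormKernel::Avx512 : RmsNormKernel::Reference;
}

// Portable scalar fallback over one row of n elements.
inline void rms_norm_reference_sketch(const float* x, const float* gamma,
                                      float* y, std::size_t n, float eps) {
    assert(x != nullptr && gamma != nullptr && y != nullptr && n > 0);
    double sum_sq = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sum_sq += static_cast<double>(x[i]) * x[i];
    }
    const float mean_sq = static_cast<float>(sum_sq / static_cast<double>(n));
    const float inv_rms = 1.0f / std::sqrt(mean_sq + eps);
    for (std::size_t i = 0; i < n; ++i) {
        y[i] = x[i] * inv_rms * gamma[i];
    }
}
```

A wrapper would call `select_rms_norm_kernel(cpu_has_avx512f())` once per invocation (or cache the result) and only enter the AVX-512 routine when it returns `RmsNormKernel::Avx512`, so no AVX-512 instruction is ever executed on a CPU that lacks the extension.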
Root cause
normalization_kernel_wrapper() unconditionally calls rms_norm_avx512() for RMS_NORM and FUSED_ADD_RMS_NORM, causing SIGILL on CPUs without AVX-512 (e.g., AMD EPYC 7313, Zen 3). Other operators like matmul correctly guard ISA-specific paths (see lowoha_matmul.cpp:351 checking get_f16_status()).

Test plan
- run_lowoha_rms_norm_fp32_example() no longer crashes

🤖 Generated with Claude Code