Improve newer-GPU compatibility and optional kernel fallbacks #14
Open
Rajesh-Nimmalapudi wants to merge 1 commit into IntelliGen-AI:main from
Summary
This PR improves IntelliFold compatibility on newer NVIDIA GPUs while keeping
the default inference path unchanged for existing users.
The changes in this PR:
- Relax the `torch==2.6.0` pin so users can install a PyTorch build that supports newer GPUs
- Move `deepspeed` from a mandatory dependency to an optional extra, so a clear error is raised only when a DeepSpeed path is requested but the package is unavailable
- Broaden the CUDA architecture list used for the `fast_layernorm` extension build
- Document that DS4Sci and `fast_layernorm` are optional kernel paths

Motivation
The current packaging and kernel-build defaults make installation on newer
NVIDIA GPUs harder than necessary:
- the hard `torch==2.6.0` pin rules out PyTorch builds for newer devices
- `deepspeed` is treated as mandatory even though baseline inference does not require it
- `fast_layernorm` hardcodes an older CUDA architecture set

This PR keeps the baseline path conservative while removing those avoidable
compatibility blockers.
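Concretely, removing the first two blockers amounts to a dependency change along these lines (a sketch only; the `intellifold[deepspeed]` install spelling and the surrounding `setup()` fields are assumptions, not taken from the diff):

```python
# Sketch of the setup.py dependency change (values illustrative):
# the hard torch pin becomes a floor, and deepspeed moves to an extra.
install_requires = [
    "torch>=2.6.0",  # was: "torch==2.6.0"
]

extras_require = {
    # Installed via `pip install intellifold[deepspeed]` (package name assumed)
    "deepspeed": ["deepspeed"],
}
```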
What changed
Packaging
- `setup.py`: relax `torch==2.6.0` to `torch>=2.6.0`, and move `deepspeed` into `extras_require["deepspeed"]`
- `environment.yaml`: relax `torch==2.6.0` to `torch>=2.6.0`, and make `deepspeed` optional

Runtime behavior
- `intellifold/openfold/model/primitives.py`
- `intellifold/openfold/utils/kernel/deepspeed_compat.py`
- `intellifold/openfold/inference_config.py`
- `intellifold/openfold/v2_inference_config.py`
- `intellifold/openfold/v2_flash_inference_config.py`

These guard the optional `deepspeed` import and plumb the optional-kernel selection into model execution.
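A minimal sketch of the import-guard pattern these files rely on (identifier names such as `DEEPSPEED_INSTALLED` and `require_deepspeed` are illustrative, not necessarily those used in `deepspeed_compat.py`):

```python
# deepspeed_compat.py-style guard (sketch): import DeepSpeed lazily and
# expose a flag so the default inference path never hard-fails without it.
try:
    import deepspeed  # optional dependency
    DEEPSPEED_INSTALLED = True
except ImportError:
    deepspeed = None
    DEEPSPEED_INSTALLED = False

def require_deepspeed():
    """Fail with an actionable message only when a DeepSpeed path is used."""
    if not DEEPSPEED_INSTALLED:
        raise ImportError(
            "DeepSpeed is required for the DS4Sci kernel path; "
            "install the optional extra to enable it."
        )
    return deepspeed
```

With this shape, importing the module is always safe; only code that actually selects a DeepSpeed-backed kernel pays the cost of the missing dependency.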
Kernel build behavior
- `intellifold/openfold/utils/layer_norm/torch_ext_compile.py`: broaden the default `TORCH_CUDA_ARCH_LIST="7.0;8.0"` to an `sm_70/sm_80/sm_86/sm_90` list, and honor a user-provided `TORCH_CUDA_ARCH_LIST` when it is set

Documentation
- `README.md`, `docs/installation.md`, and `docs/kernels.md` updated to reflect these changes

Validation
I validated the branch locally with:
- the `fast_layernorm` JIT build and runtime on the same GPU

Scope
This PR does not make DS4Sci the default path. DS4Sci remains optional, and
this change is focused on compatibility and safer optional-kernel behavior.
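For reference, the arch-list resolution described under "Kernel build behavior" can be sketched as follows. Only the `sm_70/sm_80/sm_86/sm_90` set and the env-var override come from this PR; the function name and exact default string are illustrative:

```python
import os

# Broadened default replacing the old hardcoded "7.0;8.0"
# (covers sm_70 / sm_80 / sm_86 / sm_90).
DEFAULT_ARCH_LIST = "7.0;8.0;8.6;9.0"

def resolve_arch_list():
    # A user-provided TORCH_CUDA_ARCH_LIST always wins, so builds can
    # target devices outside the default set.
    return os.environ.get("TORCH_CUDA_ARCH_LIST", DEFAULT_ARCH_LIST)
```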