Improve newer-GPU compatibility and optional kernel fallbacks#14

Open
Rajesh-Nimmalapudi wants to merge 1 commit into IntelliGen-AI:main from Rajesh-Nimmalapudi:fix/newer-gpu-compatibility

Conversation

@Rajesh-Nimmalapudi

Summary

This PR improves IntelliFold compatibility on newer NVIDIA GPUs while keeping
the default inference path unchanged for existing users.

The changes in this PR:

  • relax the hard torch==2.6.0 pin so users can install a PyTorch build that
    supports newer GPUs
  • move deepspeed from a mandatory dependency to an optional extra
  • keep baseline inference available when DeepSpeed is installed but cannot be
    imported
  • fall back to the standard attention path when DS4Sci is explicitly requested
    but unavailable
  • remove hardcoded older CUDA arch flags from the fast_layernorm extension
    build
  • document a clearer installation path for newer GPU architectures and clarify
    that DS4Sci and fast_layernorm are optional kernel paths

Motivation

The current packaging defaults and kernel-build defaults make installation on
newer NVIDIA GPUs harder than necessary:

  • strict torch pinning prevents users from selecting a newer torch/CUDA build
    for newer devices
  • deepspeed is treated as mandatory even though baseline inference does not
    require it
  • fast_layernorm hardcodes an older CUDA architecture set

This PR keeps the baseline path conservative while removing those avoidable
compatibility blockers.

What changed

Packaging

  • setup.py
    • changed torch==2.6.0 to torch>=2.6.0
    • moved deepspeed into extras_require["deepspeed"]
  • environment.yaml
    • changed torch==2.6.0 to torch>=2.6.0
    • removed mandatory deepspeed
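
A minimal sketch of the packaging change (illustrative; the actual setup.py may declare more dependencies):

```python
# Illustrative sketch of the dependency change described above.
# Before: torch was pinned exactly and deepspeed was a mandatory dependency.
install_requires = [
    "torch>=2.6.0",  # relaxed from torch==2.6.0 so newer CUDA builds install
]

# deepspeed moves into an optional extra, selected at install time
# via `pip install <package>[deepspeed]`.
extras_require = {
    "deepspeed": ["deepspeed"],
}
```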

Runtime behavior

  • intellifold/openfold/model/primitives.py
    • treat DeepSpeed as optional at import time
  • intellifold/openfold/utils/kernel/deepspeed_compat.py
    • centralize DS4Sci availability checks
  • intellifold/openfold/inference_config.py
  • intellifold/openfold/v2_inference_config.py
  • intellifold/openfold/v2_flash_inference_config.py
    • disable DS4Sci cleanly when it is requested but unavailable
    • keep the standard path active instead of carrying a broken DS4Sci request
      into model execution
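
The optional-import and fallback pattern described above can be sketched as follows. The helper names are illustrative, not necessarily the ones used in deepspeed_compat.py; the DS4Sci import path matches the one DeepSpeed ships:

```python
# Sketch: treat DeepSpeed as optional at import time and centralize the
# DS4Sci availability check (illustrative names; real module is
# intellifold/openfold/utils/kernel/deepspeed_compat.py).
try:
    from deepspeed.ops.deepspeed4science import DS4Sci_EvoformerAttention  # noqa: F401
    _DS4SCI_AVAILABLE = True
except ImportError:
    DS4Sci_EvoformerAttention = None
    _DS4SCI_AVAILABLE = False


def ds4sci_available() -> bool:
    """Single availability check shared by the inference configs."""
    return _DS4SCI_AVAILABLE


def resolve_use_ds4sci(requested: bool) -> bool:
    """Disable DS4Sci cleanly when it is requested but unavailable,
    so the standard attention path stays active instead of carrying a
    broken DS4Sci request into model execution."""
    if requested and not ds4sci_available():
        return False
    return requested
```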

Kernel build behavior

  • intellifold/openfold/utils/layer_norm/torch_ext_compile.py
    • removed hardcoded TORCH_CUDA_ARCH_LIST="7.0;8.0"
    • removed the fixed sm_70/sm_80/sm_86/sm_90 list
    • respect TORCH_CUDA_ARCH_LIST when it is provided
    • otherwise defer to PyTorch's default extension-build behavior
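
The arch-flag selection can be sketched as a small helper (illustrative; the actual logic lives in torch_ext_compile.py):

```python
import os

def cuda_arch_list():
    """Return the user-provided arch list, or None to defer to PyTorch.

    The build previously forced TORCH_CUDA_ARCH_LIST="7.0;8.0" (an
    sm_70/sm_80/sm_86/sm_90 set), which excluded newer architectures.
    Now an explicitly provided TORCH_CUDA_ARCH_LIST is respected; when
    none is set, torch.utils.cpp_extension chooses flags for the
    detected GPU.
    """
    return os.environ.get("TORCH_CUDA_ARCH_LIST") or None
```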

Documentation

  • added a short newer-GPU installation note to README.md
  • added an explicit newer-GPU install path to docs/installation.md
  • clarified kernel build expectations in docs/kernels.md
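
The documented newer-GPU install path presumably resembles the following; the package spec and CUDA wheel index are illustrative examples, not the exact commands from the docs:

```shell
# Illustrative newer-GPU install path (exact commands may differ).
# 1. Pick a torch build matching your CUDA stack (cu124 shown as an example):
pip install "torch>=2.6.0" --index-url https://download.pytorch.org/whl/cu124
# 2. Install the package; deepspeed is now an optional extra:
pip install .                # baseline inference path only
pip install ".[deepspeed]"   # opt in to the DS4Sci kernel path
```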

Validation

I validated the branch locally with:

  • baseline runtime on a newer NVIDIA GPU using a newer torch/CUDA stack
  • fast_layernorm JIT build and runtime on the same GPU
  • DS4Sci-request fallback behavior from a clean installed package
  • package build and install from this branch into a clean target directory

Scope

This PR does not make DS4Sci the default path. DS4Sci remains optional, and
this change is focused on compatibility and safer optional-kernel behavior.
