How to adapt the Shared NVSwitch Virtualization Model of Fabric Manager (FM) to activate NVLink in multi-GPU VMs #133

# Why is this needed?
In a virtualized environment on DGX/HGX A100/H100 systems, NVIDIA provides the Shared NVSwitch Virtualization Model to enable NVLink connections for multi-GPU VMs. This model requires that all GPUs assigned to a VM belong to the same partition.

# What is the Shared NVSwitch Virtualization Model?

![Image](https://github.com/user-attachments/assets/276525b3-2935-44b6-9b55-62e4e7606cd2)

In this model, only the GPUs are passed through to the guest VMs.
1. NVSwitch memory fabrics are managed by a dedicated trusted VM called the Service VM.
2. NVSwitch memory fabrics are shared by the guest VMs, but the fabrics are not visible to the guests.
3. The model requires the tightest integration with the hypervisor.
4. Two-GPU and four-GPU VMs get complete bandwidth.
5. No direct communication between the guest VM and the Service VM is needed.

Reference: [Shared NVSwitch Virtualization Model](https://docs.nvidia.com/datacenter/tesla/fabric-manager-user-guide/index.html#shared-nvswitch-virtualization-model)

# Proposal
The GPUs assigned to the VM must belong to the same partition.

# How to assign GPUs that belong to the same partition
- Implement `GetDevicePluginOptions` so that it reports
  `GetPreferredAllocationAvailable` as enabled, which lets kubelet call
  `GetPreferredAllocation` before allocating GPUs.
- `GetPreferredAllocation` recommends H100/H800 GPUs based on GPU
  partitioning.
- `Allocate` verifies that the requested GPUs belong to the same
  partition during allocation.
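
The partition logic behind the last two steps could look roughly like the sketch below. It is not the plugin's actual code: the `Partition` type, the helper names, and the device IDs are hypothetical, and it deliberately avoids importing the kubelet device plugin API so that it stays self-contained. It also assumes that a request must match one partition's GPU set exactly; a looser "subset of a partition" check would be a different design. In a real plugin, `preferredAllocation` would be called from `GetPreferredAllocation` and `validateAllocation` from `Allocate`.

```go
package main

import (
	"errors"
	"fmt"
)

// Partition is a hypothetical stand-in for one Fabric Manager partition.
type Partition struct {
	ID   int
	GPUs []string // device IDs as advertised to kubelet (assumed format)
}

// toSet converts a slice of device IDs into a lookup set.
func toSet(ids []string) map[string]bool {
	s := make(map[string]bool, len(ids))
	for _, id := range ids {
		s[id] = true
	}
	return s
}

// containsAll reports whether every ID in ids is present in set.
func containsAll(set map[string]bool, ids []string) bool {
	for _, id := range ids {
		if !set[id] {
			return false
		}
	}
	return true
}

// preferredAllocation returns the GPUs of the first partition that has
// exactly `size` GPUs, is fully available, and includes every must-include
// device. It returns nil when no partition fits.
func preferredAllocation(parts []Partition, available, mustInclude []string, size int) []string {
	avail := toSet(available)
	for _, p := range parts {
		if len(p.GPUs) != size || !containsAll(avail, p.GPUs) {
			continue
		}
		if containsAll(toSet(p.GPUs), mustInclude) {
			return append([]string(nil), p.GPUs...)
		}
	}
	return nil
}

// validateAllocation checks that the requested GPUs are exactly the GPU set
// of one partition (an assumption of this sketch, see the lead-in).
func validateAllocation(parts []Partition, requested []string) error {
	req := toSet(requested)
	for _, p := range parts {
		if len(p.GPUs) == len(requested) && containsAll(req, p.GPUs) {
			return nil
		}
	}
	return errors.New("requested GPUs do not form a single partition")
}

func main() {
	// Hypothetical two-partition layout, for illustration only.
	parts := []Partition{
		{ID: 1, GPUs: []string{"gpu0", "gpu1"}},
		{ID: 2, GPUs: []string{"gpu2", "gpu3"}},
	}
	fmt.Println(preferredAllocation(parts, []string{"gpu1", "gpu2", "gpu3"}, nil, 2))
	fmt.Println(validateAllocation(parts, []string{"gpu1", "gpu2"}))
}
```

Returning `nil` from `preferredAllocation` leaves the choice to the caller when no partition fits; the validation step is what ultimately rejects a request that would span partitions.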

The diagram below illustrates the partition tree for H100/H800. If `partition 4` has already been allocated, `partition 3` will be prioritized for the next allocation.

![Image](https://github.com/user-attachments/assets/810b219d-58de-4ede-8b8a-6362c186c70a)
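
One way to get the prioritization described above is a best-fit rule: among the free partitions of the requested size, prefer one whose smallest enclosing partition is already partly in use, so that larger partitions stay intact for bigger VMs. The sketch below is an assumption about how such a rule could be written, not the plugin's implementation. The partition IDs and GPU numbers in `main` are also assumptions, modeled on a layout where `partition 3` and `partition 4` are the two halves of a four-GPU partition; check them against the diagram and the Fabric Manager partition table for the actual system.

```go
package main

import "fmt"

// Partition is a hypothetical description of one Fabric Manager partition.
type Partition struct {
	ID   int
	GPUs []int // physical GPU numbers in the partition (assumed numbering)
}

// overlaps reports whether any GPU of p is already in use.
func overlaps(p Partition, used map[int]bool) bool {
	for _, g := range p.GPUs {
		if used[g] {
			return true
		}
	}
	return false
}

// encloses reports whether every GPU of inner is also in outer.
func encloses(outer, inner Partition) bool {
	set := make(map[int]bool, len(outer.GPUs))
	for _, g := range outer.GPUs {
		set[g] = true
	}
	for _, g := range inner.GPUs {
		if !set[g] {
			return false
		}
	}
	return true
}

// pickPartition returns the ID of a free partition with `size` GPUs.
// It prefers a partition whose smallest enclosing partition is already
// partly in use, and otherwise falls back to the first free candidate.
// The boolean is false when no free partition of that size exists.
func pickPartition(parts []Partition, used map[int]bool, size int) (int, bool) {
	fallback, found := -1, false
	for _, c := range parts {
		if len(c.GPUs) != size || overlaps(c, used) {
			continue
		}
		// Find the smallest strictly larger partition that contains c.
		var parent *Partition
		for i := range parts {
			p := &parts[i]
			if len(p.GPUs) > size && encloses(*p, c) &&
				(parent == nil || len(p.GPUs) < len(parent.GPUs)) {
				parent = p
			}
		}
		if parent != nil && overlaps(*parent, used) {
			return c.ID, true
		}
		if !found {
			fallback, found = c.ID, true
		}
	}
	return fallback, found
}

func main() {
	// Assumed layout: one 8-GPU, two 4-GPU, and four 2-GPU partitions.
	parts := []Partition{
		{ID: 0, GPUs: []int{1, 2, 3, 4, 5, 6, 7, 8}},
		{ID: 1, GPUs: []int{1, 2, 3, 4}},
		{ID: 2, GPUs: []int{5, 6, 7, 8}},
		{ID: 3, GPUs: []int{1, 2}},
		{ID: 4, GPUs: []int{3, 4}},
		{ID: 5, GPUs: []int{5, 6}},
		{ID: 6, GPUs: []int{7, 8}},
	}
	used := map[int]bool{3: true, 4: true} // partition 4 is allocated
	id, ok := pickPartition(parts, used, 2)
	fmt.Println(id, ok)
}
```

Under the assumed layout, choosing `partition 3` after `partition 4` keeps GPUs 5 to 8 together, so a later four-GPU VM can still be placed.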



