NVLink support for H100 NVL
When I run the
nvidia-smi topo -m
command on the 2x H100 NVL pod, I can see the PIX topology between GPU0 and GPU1. Can I use an NVLink connection to interconnect the H100 NVL GPUs? How does PIX (PCIe bridge) performance differ from NVLink?
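For reference, PIX is one of the link types in the legend of nvidia-smi topo -m. A rough sketch of the matrix on a 2-GPU pod (illustrative, not copied from a real machine):

nvidia-smi topo -m
        GPU0    GPU1
GPU0     X      PIX
GPU1    PIX      X

PIX means the two GPUs reach each other across at most a single PCIe bridge. An NVLink-connected pair would show NV1, NV2, and so on instead, where the number counts the bonded NVLink links between the pair.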
Here's a result from an AI:
Certainly! Let’s explore the differences between NVLink and PCIe for interconnecting H100 NVL GPUs.
NVLink:
NVLink represents a significant leap in GPU interconnect bandwidth.
On SXM modules (e.g., H100 SXM5) the links are routed through the baseboard itself; on PCIe cards such as the H100 NVL, a pair of cards is joined by NVLink bridge connectors.
Bandwidth: an H100 SXM5 reaches up to 900 GB/s of NVLink bandwidth per GPU, while a bridged H100 NVL pair gets up to 600 GB/s between the two cards.
Applications Benefiting from NVLink:
Large-scale deep learning and AI model training.
High-performance computing simulations.
Data-intensive scientific research.
PCIe:
PCIe (Peripheral Component Interconnect Express) is the traditional backbone for GPU interconnectivity in servers.
Strengths:
Flexibility: PCIe is versatile and compatible with a diverse range of server architectures.
Broad Compatibility: It caters to various AI applications, especially where inter-GPU communication load is moderate.
Bandwidth: PCIe offers lower bandwidth than NVLink (a Gen5 x16 slot tops out around 64 GB/s per direction, roughly 128 GB/s bidirectional), but it remains a cost-effective solution for scenarios that don't require NVLink-class bandwidth.
Ideal Use Cases for PCIe:
Inference applications and lightweight AI workloads.
Small to medium-scale machine learning model training.
General-purpose computing requiring GPU acceleration.
Performance Comparison:
NVLink shines in environments where maximizing GPU-to-GPU bandwidth is paramount, offering superior performance for HPC and extensive AI model training.
PCIe appeals to applications with moderate bandwidth requirements, providing a flexible and economical solution without necessitating high-speed interconnectivity.
In summary, choose wisely based on your specific AI application needs. If you require maximum inter-GPU bandwidth, NVLink is the way to go. For more moderate requirements, PCIe offers flexibility and cost-effectiveness. 🚀
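If you'd rather measure the difference on an actual pod than trust spec sheets, the p2pBandwidthLatencyTest sample from NVIDIA's cuda-samples repo reports GPU-to-GPU copy bandwidth with peer-to-peer access disabled and enabled (the path below assumes a recent checkout and may vary between repo versions):

git clone https://github.com/NVIDIA/cuda-samples.git
cd cuda-samples/Samples/5_Domain_Specific/p2pBandwidthLatencyTest
make
./p2pBandwidthLatencyTest

On an NVLink-bridged pair the P2P=Enabled bandwidth matrix should land well above PCIe speeds; over a PIX link both matrices stay in PCIe range.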
I'm not sure how you'd use the NVLink connection to interconnect them, but I think it's already set up
@flash-singh
@nerdylive thank you! When I use the H100 SXM5 GPU, the
nvidia-smi topo -m
command shows that the GPUs are interconnected with an NV# topology, indicating NVLink usage. This differs from the H100 NVL case, so it would be helpful to confirm whether the H100 NVL pod uses NVLink or not!
Yeah, I'm not sure about the NVLink setup, but H100 NVL is a GPU type that should be optimized for NVLink
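Besides the topology matrix, you can query the links directly: nvidia-smi nvlink --status lists each active NVLink per GPU with its per-link speed. Illustrative output (link counts and speeds vary by GPU and generation):

nvidia-smi nvlink --status
GPU 0: NVIDIA H100 80GB HBM3 (UUID: GPU-...)
         Link 0: 26.562 GB/s
         Link 1: 26.562 GB/s
         ...

A GPU with no active NVLink connection shows no active links here, so it's a quick way to tell the two cases apart.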
also, you can ask about this through the website support
@nerdylive I got it, thank you!
H100 NVL does use NVLink, but it's paired between 2 GPUs, and our software isn't optimized to give you 2 that are paired unless you ask for all 8x. It's possible that when you ask for 2 GPUs, you get ones that are not paired. We plan to optimize this soon so they're always paired with NVLink
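Until that lands, a quick way to see which case your pod got (assuming the two GPUs show up as GPU0 and GPU1):

nvidia-smi topo -m
# NV# in the GPU0/GPU1 cell: you landed on an NVLink-bridged pair
# PIX / PXB / PHB / SYS: the two cards only reach each other over PCIe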
ahh
Wow they are actually faster
NVL is 2 GPUs
oh like 2xSXM?
yep, it's a marketing gimmick, they're comparing a single GPU to NVL which is 2 GPUs
ahh icic
Thank you so much, that's why the 2 H100 NVLs were slower than 2 SXMs. Hope it gets optimized very soon!
@flash-singh then, theoretically, can I get 2 NVLinked H100 NVLs by trying several times?
yes
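A sketch of a check you could run after each provisioning attempt, in shell (assumes exactly two visible GPUs and that the third field of the GPU0 row is the GPU0-to-GPU1 cell, which can shift if extra columns such as NICs appear in your topo output):

# read the GPU0 -> GPU1 entry of the topology matrix
link=$(nvidia-smi topo -m | awk '$1 == "GPU0" {print $3}')
case "$link" in
  NV*) echo "NVLink-bridged pair ($link), keep this pod" ;;
  *)   echo "PCIe-only pair ($link), terminate and try again" ;;
esac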