Water cooling for PCIe A100 data center GPUs

According to the company, liquid cooling cuts server power consumption by around 30%.

Energy efficiency may not yet be a decisive factor for home PCs, but it is a major challenge for data centers. To improve it, NVIDIA and Equinix are working on liquid-cooled NVIDIA A100 PCIe GPUs. NVIDIA describes them as the first in a line of mainstream server GPUs designed to meet customer demand for green, efficient data centers.

Credit: NVIDIA

Equinix is a provider that manages more than 240 data centers and aims to be the first in its industry to become carbon neutral. To improve the PUE (power usage effectiveness) of its data centers, Equinix opened a dedicated facility in January to continue making advances in energy efficiency. Part of that work deals with liquid cooling, in particular with the A100 80GB PCIe Liquid-Cooled GPUs. These GPUs are currently sampling and should be available this summer.

A PUE of 1.15 instead of 1.6

The impact of liquid cooling on energy efficiency is far from negligible. In separate tests, Equinix and NVIDIA found that a liquid-cooled data center can run the same workloads as an air-cooled facility while consuming about 30% less power. NVIDIA estimates that a liquid-cooled data center can achieve a PUE of 1.15, well below the 1.6 of its air-cooled cousin.
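To see how the two PUE figures relate to the roughly 30% saving, recall that PUE is the ratio of total facility power to the power drawn by the IT equipment alone. The short Python sketch below is only a consistency check, not NVIDIA's or Equinix's methodology: it assumes the IT load is identical in both cases, and the 1,000 kW load and the function names are hypothetical values chosen for illustration.

```python
def facility_power(it_power_kw: float, pue: float) -> float:
    """Total facility power for a given IT load, using PUE = total / IT."""
    if pue < 1.0:
        # A PUE below 1.0 would mean the facility draws less than its IT load.
        raise ValueError("PUE cannot be below 1.0")
    return it_power_kw * pue


def relative_saving(pue_before: float, pue_after: float) -> float:
    """Fraction of total power saved at constant IT load."""
    return 1.0 - pue_after / pue_before


# Hypothetical IT load; the PUE values are the ones quoted in the article.
it_load_kw = 1000.0
air_cooled_kw = facility_power(it_load_kw, 1.6)
liquid_cooled_kw = facility_power(it_load_kw, 1.15)
saving = relative_saving(1.6, 1.15)

print(f"Air-cooled facility:    {air_cooled_kw:.0f} kW")
print(f"Liquid-cooled facility: {liquid_cooled_kw:.0f} kW")
print(f"Saving at equal IT load: {saving:.1%}")
```

Under that assumption, going from a PUE of 1.6 to 1.15 lowers total power by about 28%, which is in line with the "about 30%" reported from the separate tests.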

Liquid-cooled data centers are also much more compact: the water-cooled A100 GPUs use only one PCIe slot, while the air-cooled A100 GPUs take up two.

Figure 1: NVIDIA: Water cooling for PCIe A100 data center GPUs
Credit: NVIDIA

Specifically, liquid cooling circulates small amounts of liquid in closed systems focused on key hot spots, rather than relying on systems that require millions of gallons of water per year to cool the air inside data centers.

The technology developed by NVIDIA should not be limited to data centers, either. According to the company, cars and other systems could also benefit from it to cool high-performance systems embedded in confined spaces.

At least a dozen system builders plan to add these GPUs to their offerings later this year. These include ASUS, ASRock Rack, Foxconn Industrial Internet, GIGABYTE, H3C, Inspur, Inventec, Nettrix, QCT, Supermicro, Wiwynn and xFusion.

H100 GPUs next year

As NVIDIA's press release points out, the reduction in PUE is a global trend driven by increasingly strict regulations, and liquid cooling will be one of the ways for suppliers to meet the new standards. Note that the EMEA region (Europe, Middle East and Africa) currently leads in green computing, with an average PUE well below that of America and Asia.

Figure 2: NVIDIA: Water cooling for PCIe A100 data center GPUs
Credit: NVIDIA

NVIDIA plans to follow the A100 PCIe card with a version based on the H100 Tensor Core GPU next year. The company also plans to support liquid cooling in its high-performance data center GPUs and NVIDIA HGX platforms in the near future.

Source: Nvidia
