Boosting efficiency, liquid-cooled GPUs debut at Computex


In the global effort to stop climate change, Zac Smith is part of a growing movement to build data centers that offer both high performance and energy efficiency.

He is responsible for edge infrastructure at Equinix, a global service provider that operates more than 240 data centers and is committed to becoming the first in its industry to be climate neutral.

“We have 10,000 customers who rely on us to help them on this journey. They demand more data and more intelligence, often with AI, and they want it in a sustainable way,” said Smith, a Juilliard grad who got into tech in the early 2000s building websites for fellow musicians in New York.

Making progress in efficiency

In April, Equinix issued $4.9 billion in green bonds. These are investment-grade instruments that Equinix will apply to reduce environmental impact by optimizing power usage effectiveness (PUE), an industry metric of how much of the energy a data center uses goes directly to computing tasks.

Data center operators are trying to push that ratio ever closer to the ideal of 1.0 PUE. Equinix facilities today have an average PUE of 1.48, with its best new data centers hitting less than 1.2.
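As a rough illustration of how to read those figures, the sketch below uses the standard definition of PUE: total facility energy divided by the energy delivered to IT equipment. The function names and the 1,000 kWh sample load are assumptions made for illustration; only the PUE values 1.48 and 1.2 come from the article, and 1.2 is an upper bound for the best new sites.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Return PUE: total facility energy divided by IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


def overhead_kwh(it_equipment_kwh: float, pue_value: float) -> float:
    """Energy spent on cooling, power delivery and other non-IT overhead."""
    return it_equipment_kwh * (pue_value - 1.0)


# Hypothetical IT load, chosen only to make the numbers easy to read.
it_load_kwh = 1000.0

for label, value in (("fleet average", 1.48), ("best new sites (upper bound)", 1.2)):
    extra = overhead_kwh(it_load_kwh, value)
    it_share = 1.0 / value
    print(f"{label}: PUE {value} -> {extra:.0f} kWh overhead per "
          f"{it_load_kwh:.0f} kWh of IT load ({it_share:.0%} of energy reaches IT)")
```

Read this way, a PUE of 1.48 means that for every 1,000 kWh consumed by servers, roughly another 480 kWh goes to everything else in the building.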

Equinix is making steady progress in the energy efficiency of its data centers, as measured by PUE (box).

In another step forward, Equinix opened a dedicated facility in January to continue advancements in energy efficiency. Part of this work focuses on liquid cooling.

Born in the mainframe era, liquid cooling is maturing in the age of AI. It is now widely used in the world’s fastest supercomputers in a modern form called direct-to-chip cooling.

Liquid cooling is the next step in accelerated computing for NVIDIA GPUs, which already deliver up to 20x better energy efficiency than CPUs on AI inference and high-performance computing jobs.

Efficiency through acceleration

If you switched all CPU-only servers running AI and HPC worldwide to GPU-accelerated systems, you could save 11 trillion watt-hours of energy per year. It’s like saving the energy consumed by more than 1.5 million homes each year.
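The comparison above can be sanity-checked with a few lines of arithmetic. This is a minimal sketch using only the two figures quoted in the article; the per-home number it derives is implied by those figures and is not a statistic stated by the source. Because the article says "more than 1.5 million homes," the result is best read as an upper bound per home.

```python
# Figures quoted in the article.
SAVED_WH_PER_YEAR = 11e12   # 11 trillion watt-hours
HOMES = 1.5e6               # "more than 1.5 million homes"

# Unit conversions: 1 TWh = 1e12 Wh, 1 kWh = 1e3 Wh.
saved_twh = SAVED_WH_PER_YEAR / 1e12
saved_kwh = SAVED_WH_PER_YEAR / 1e3

# Implied annual consumption per home (upper bound, since the
# article says "more than" 1.5 million homes).
kwh_per_home = saved_kwh / HOMES

print(f"{saved_twh:.0f} TWh per year, about {kwh_per_home:,.0f} kWh per home per year")
```

The two figures imply an average household consumption of a little over 7,300 kWh per year.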

Today, NVIDIA adds to its sustainability efforts with the release of our first data center PCIe GPU using direct-to-chip cooling.

Equinix is qualifying the A100 80GB PCIe Liquid-Cooled GPU for use in its data centers as part of an overall approach to sustainable cooling and heat capture. The GPUs are sampling now and will be generally available this summer.

Saving water and electricity

“This is the first liquid-cooled GPU introduced to our lab, and it’s exciting for us as our customers are hungry for sustainable ways to harness AI,” Smith said.

Data center operators are aiming to eliminate chillers that evaporate millions of gallons of water per year to cool the air inside data centers. Liquid cooling promises systems that recycle small amounts of fluids in closed systems focused on key hotspots.

“We will turn a waste into an asset,” he said.

Same performance, less power

In separate tests, Equinix and NVIDIA found that a data center using liquid cooling could run the same workloads as an air-cooled facility while using about 30% less power. NVIDIA estimates the liquid-cooled data center could achieve 1.15 PUE, well below 1.6 for its air-cooled cousin.
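The two PUE estimates and the roughly 30% figure can be cross-checked with a back-of-the-envelope calculation. The sketch below assumes both facilities draw the same IT power, so that total facility power is simply IT power multiplied by PUE. That is a simplifying assumption for illustration, not the methodology of the tests described above, and the function names are hypothetical.

```python
def facility_power_kw(it_power_kw: float, pue: float) -> float:
    """Total facility power for a given IT load and PUE."""
    return it_power_kw * pue


def savings_fraction(pue_before: float, pue_after: float) -> float:
    """Fractional drop in facility power when PUE improves at a fixed IT load."""
    return 1.0 - pue_after / pue_before


# PUE estimates quoted in the article.
AIR_COOLED_PUE = 1.6
LIQUID_COOLED_PUE = 1.15

# Hypothetical IT load; the savings fraction does not depend on it.
it_power_kw = 1000.0

air_kw = facility_power_kw(it_power_kw, AIR_COOLED_PUE)
liquid_kw = facility_power_kw(it_power_kw, LIQUID_COOLED_PUE)
saved = savings_fraction(AIR_COOLED_PUE, LIQUID_COOLED_PUE)

print(f"air-cooled: {air_kw:.0f} kW, liquid-cooled: {liquid_kw:.0f} kW, "
      f"saving {saved:.1%}")
```

Under that assumption the PUE improvement alone accounts for a saving of about 28%, which is in line with the "about 30% less power" reported.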

Liquid-cooled data centers can also pack twice as much computing into the same space. That is because the liquid-cooled A100 GPUs use just one PCIe slot, while air-cooled A100 GPUs fill two.

NVIDIA sees power savings, density gains with liquid cooling.

At least a dozen system makers plan to integrate these GPUs into their offerings later this year. They include ASUS, ASRock Rack, Foxconn Industrial Internet, GIGABYTE, H3C, Inspur, Inventec, Nettrix, QCT, Supermicro, Wiwynn and xFusion.

A global trend

Regulations setting energy efficiency standards are pending in Asia, Europe and the United States. That is motivating banks and other large data center operators to evaluate liquid cooling, too.

And the technology is not limited to data centers. Cars and other systems need it to cool high-performance systems built into confined spaces.

The road to sustainability

“It’s the start of a journey,” Smith said of the debut of liquid-cooled mainstream accelerators.

Indeed, we plan to follow the A100 PCIe card with a version next year using the H100 Tensor Core GPU, based on the NVIDIA Hopper architecture. We plan to support liquid cooling in our high-performance data center GPUs and NVIDIA HGX platforms for the foreseeable future.

For rapid adoption, today’s liquid-cooled GPUs deliver the same performance for less power. In the future, we expect these cards to offer the ability to get more performance for the same power, which users say they want.

“Measuring horsepower alone is irrelevant, the performance you get for the carbon impact you have is what we need to be aiming for,” Smith said.

Learn more about our new A100 PCIe liquid-cooled GPUs.

