Grace Hopper is great for recommender systems


Recommender systems, the economic engines of the Internet, are getting a new turbocharger: the NVIDIA Grace Hopper Superchip.

Every day, recommenders deliver trillions of search results, ads, products, music and news to billions of people. They are among the most important AI models of our time because they are incredibly effective at finding the pearls users are looking for amid the din of the Internet.

These machine learning pipelines run on terabytes of data. The more data recommenders consume, the more accurate their results and the more ROI they offer.

To process this tsunami of data, companies are already embracing accelerated computing to personalize services for their customers. Grace Hopper will take these advances to the next level.

GPUs drive 16% more engagement

Pinterest, the image-sharing social media company, was able to move to 100x larger recommendation models by adopting NVIDIA GPUs. This increased engagement by 16% for its more than 400 million users.

“Normally, we’d be happy with a 2% increase, and 16% is just the start,” a company software engineer said in a recent blog post. “We see additional gains – it opens many doors for opportunity.”

Recommenders consume tens of terabytes of embeddings, tables of data that provide context for making accurate predictions.
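To make the idea concrete, here is a minimal, purely illustrative sketch of what an embedding-table lookup does. All names and sizes are hypothetical; production systems store billions of rows across GPU memory rather than a Python dict.

```python
# Minimal sketch of an embedding-table lookup (illustrative names only).
import random

EMBED_DIM = 4      # real systems typically use 64-512 dimensions
NUM_ITEMS = 10     # real tables hold millions to billions of IDs

# One row of floats per ID: the learned "context" for a user, item, or ad.
embedding_table = {
    item_id: [random.random() for _ in range(EMBED_DIM)]
    for item_id in range(NUM_ITEMS)
}

def lookup(item_ids):
    """Gather the embedding rows for a batch of IDs."""
    return [embedding_table[i] for i in item_ids]

batch = lookup([3, 7, 1])
print(len(batch), len(batch[0]))  # 3 rows, each EMBED_DIM wide
```

Scaled up to terabytes, it is this gather step that puts so much pressure on memory capacity and interconnect bandwidth.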

The next generation of the NVIDIA AI platform promises even greater gains for companies running extra-large recommendation models on massive datasets.

Because data is the fuel of AI, Grace Hopper is designed to pump more data through recommender systems than any other processor on the planet.

NVLink Accelerates Grace Hopper

Grace Hopper achieves this because it is a superchip: two chips in one unit, sharing a super-fast chip-to-chip interconnect. It combines an Arm-based NVIDIA Grace CPU and a Hopper GPU that communicate via NVIDIA NVLink-C2C.

NVLink also connects many superchips into a supersystem, a computing cluster built to run terabyte-class recommender systems.

NVLink-C2C transports data at 900 gigabytes per second, 7x the bandwidth of PCIe Gen 5, the interconnect most upcoming systems will use.

This means Grace Hopper can feed recommenders 7x more of the embeddings, the context-rich data tables, they need to personalize results for users.
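A back-of-envelope check of the 7x claim, assuming the commonly cited ~128 GB/s figure for a PCIe Gen 5 x16 link (an assumption on my part, not a number from this article):

```python
# Bandwidth comparison, using the article's 900 GB/s NVLink-C2C figure
# and an assumed ~128 GB/s for a PCIe Gen 5 x16 link.
NVLINK_C2C_GBPS = 900
PCIE_GEN5_GBPS = 128

ratio = NVLINK_C2C_GBPS / PCIE_GEN5_GBPS
print(f"NVLink-C2C is ~{ratio:.1f}x PCIe Gen 5")  # ~7.0x

# Time to stream a hypothetical 10 TB set of embedding tables at each rate:
tables_gb = 10_000
print(f"NVLink-C2C: {tables_gb / NVLINK_C2C_GBPS:.0f} s")  # ~11 s
print(f"PCIe Gen 5: {tables_gb / PCIE_GEN5_GBPS:.0f} s")   # ~78 s
```

At terabyte scale, that bandwidth gap translates directly into how much embedding context a model can pull in per training or inference step.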

More memory, more efficiency

The Grace processor uses LPDDR5X, a type of memory that achieves the optimal balance of bandwidth, power efficiency, capacity, and cost for recommender systems and other demanding workloads. It provides 50% more bandwidth while using one-eighth the power per gigabyte of traditional DDR5 memory subsystems.

Any Hopper GPU in a cluster can access Grace's memory over NVLink, a Grace Hopper feature that provides the largest GPU memory pools to date.

Additionally, NVLink-C2C requires only 1.3 picojoules per bit transferred, giving it more than 5 times the power efficiency of PCIe Gen 5.
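To put 1.3 picojoules per bit in perspective, here is a quick arithmetic sketch of the energy needed to move one terabyte. The PCIe Gen 5 figure is derived from the article's "5x more power efficient" claim, not an independent spec:

```python
# Energy to move 1 TB at 1.3 pJ/bit (per the article), versus a PCIe Gen 5
# link at 5x that energy per bit (implied by the 5x efficiency claim).
PJ_PER_BIT_NVLINK = 1.3
PJ_PER_BIT_PCIE = PJ_PER_BIT_NVLINK * 5

BITS_PER_TB = 8 * 10**12  # 1 TB = 8e12 bits

joules_nvlink = PJ_PER_BIT_NVLINK * 1e-12 * BITS_PER_TB
joules_pcie = PJ_PER_BIT_PCIE * 1e-12 * BITS_PER_TB

print(f"NVLink-C2C: {joules_nvlink:.1f} J per TB")  # ~10.4 J
print(f"PCIe Gen 5: {joules_pcie:.1f} J per TB")    # ~52.0 J
```

Roughly 10 joules per terabyte moved is why interconnect efficiency matters when recommenders shuttle tens of terabytes of embeddings around a cluster.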

The overall result is that recommenders get up to 4x more performance and greater efficiency using Grace Hopper than using Hopper with traditional processors (see chart below).

Grace Hopper speeds up recommendations

All the software you need

The Grace Hopper Superchip runs the full stack of NVIDIA AI software used in some of the largest recommender systems in the world today.

NVIDIA Merlin is the rocket fuel of recommenders, a collection of models, methods, and libraries for building AI systems that can deliver better predictions and increase clicks.

NVIDIA Merlin HugeCTR, a recommendation framework, helps users quickly process large datasets on distributed GPU clusters with the help of the NVIDIA Collective Communications Library.
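The core idea behind distributed embedding tables in frameworks like HugeCTR is model parallelism: each GPU owns a shard of the full table, and lookups are routed to the owning shard. The sketch below shows that pattern in plain Python; every name here is illustrative, and this is not the HugeCTR API.

```python
# Conceptual sketch of model-parallel embedding sharding (not the HugeCTR
# API). Each "GPU" holds only its slice of the full embedding table.
NUM_GPUS = 4

def shard_of(item_id: int) -> int:
    """Map each ID to the GPU that owns its embedding row."""
    return item_id % NUM_GPUS

# Build toy shards: one fake 4-float row per ID, spread across 4 "GPUs".
shards = {gpu: {} for gpu in range(NUM_GPUS)}
for item_id in range(100):
    shards[shard_of(item_id)][item_id] = [float(item_id)] * 4

def all_to_all_lookup(batch):
    """Route each ID to its owning shard and gather the rows back - the
    communication pattern that NCCL accelerates on real GPU clusters."""
    return [shards[shard_of(i)][i] for i in batch]

rows = all_to_all_lookup([5, 42, 99])
print(len(rows))  # 3
```

Sharding this way lets the aggregate table exceed any single GPU's memory, at the cost of the all-to-all exchange that fast interconnects like NVLink are built to handle.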

Learn more about Grace Hopper and NVLink in this tech blog. Watch this GTC session to learn more about building recommender systems.

You can also hear NVIDIA founder and CEO Jensen Huang give his take on recommenders here, or watch the full GTC keynote below.

