Hopper sweeps AI inference tests in MLPerf debut


In their debut on the industry-standard MLPerf AI benchmarks, NVIDIA H100 Tensor Core GPUs set world records for inference across all workloads, delivering up to 4.5x better performance than GPUs of the previous generation.

The results demonstrate that Hopper is the premium choice for users who demand top performance on advanced AI models.

Additionally, NVIDIA A100 Tensor Core GPUs and the NVIDIA Jetson AGX Orin module for AI-powered robotics continued to deliver overall leadership in inference performance across all MLPerf tests: image and speech recognition, natural language processing and recommender systems.

The H100, aka Hopper, raised the bar in per-accelerator performance on all six neural networks in the round. It demonstrated leadership in both throughput and speed in the separate server and offline scenarios.

NVIDIA H100 GPUs set new high-water marks on all workloads in the data center category.

The NVIDIA Hopper architecture delivered up to 4.5x more performance than NVIDIA Ampere architecture GPUs, which continue to provide overall leadership in MLPerf results.

Thanks in part to its Transformer Engine, Hopper excelled on the popular BERT model for natural language processing. BERT is among the largest and most performance-hungry of the MLPerf AI models.

These inference benchmarks mark the first public demonstration of the H100 GPUs, which will be available later this year. H100 GPUs will participate in future MLPerf rounds for training.

A100 GPUs show leadership

NVIDIA A100 GPUs, available today from leading cloud service providers and system manufacturers, continued to show overall leadership in mainstream performance on AI inference in the latest tests.

A100 GPUs won more tests than any other submission across the data center and edge computing categories and scenarios. In June, the A100 also secured overall leadership in the MLPerf training benchmarks, demonstrating its capabilities across the entire AI workflow.

Since their debut in July 2020 on MLPerf, A100 GPUs have seen a 6x increase in performance, thanks to continuous improvements in NVIDIA AI software.

NVIDIA AI is the only platform to run all MLPerf inference workloads and scenarios in both data center and edge computing.

Users need versatile performance

The ability of NVIDIA GPUs to deliver cutting-edge performance across all major AI models makes users the real winners. Their real-world applications typically use many different types of neural networks.

For example, an AI application might need to understand a user’s voice request, classify an image, make a recommendation, and then provide a response as a spoken message in a human-sounding voice. Each stage requires a different type of AI model.
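To make that chain of stages concrete, here is a minimal sketch in Python. It is purely illustrative: each function is a hypothetical stand-in for a different kind of neural network and returns canned data rather than calling any real model or NVIDIA API.

```python
from typing import Callable, Dict, List, Tuple

Request = Dict[str, object]

# Each function below is a hypothetical placeholder for a different model type.
def recognize_speech(req: Request) -> Request:
    # Stand-in for a speech recognition model: audio in, text out.
    return {**req, "text": "find shoes like these"}

def classify_image(req: Request) -> Request:
    # Stand-in for an image classification model.
    return {**req, "label": "running shoe"}

def recommend(req: Request) -> Request:
    # Stand-in for a recommender system that uses the image label.
    return {**req, "item": f"top-rated {req['label']}"}

def synthesize_speech(req: Request) -> Request:
    # Stand-in for a text-to-speech model producing the spoken reply.
    return {**req, "reply_audio": f"<audio: I suggest a {req['item']}>"}

PIPELINE: List[Tuple[str, Callable[[Request], Request]]] = [
    ("speech recognition", recognize_speech),
    ("image classification", classify_image),
    ("recommendation", recommend),
    ("speech synthesis", synthesize_speech),
]

def handle_request(req: Request) -> Tuple[Request, List[str]]:
    """Run every stage in order and record which model types were used."""
    trace: List[str] = []
    for kind, stage in PIPELINE:
        req = stage(req)
        trace.append(kind)
    return req, trace

if __name__ == "__main__":
    result, trace = handle_request({"audio": b"", "image": b""})
    print(trace)
    print(result["reply_audio"])
```

A single request passes through four different model types here, which is why performance on one benchmark alone says little about how the whole application will behave.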

MLPerf benchmarks cover these and other popular AI workloads and scenarios: computer vision, natural language processing, recommender systems, speech recognition, and more. The tests ensure that users get performance that is reliable and flexible to deploy.

Users rely on MLPerf results to make informed purchasing decisions because the tests are transparent and objective. The benchmarks have the backing of a broad group that includes Amazon, Arm, Baidu, Google, Harvard, Intel, Meta, Microsoft, Stanford and the University of Toronto.

Orin leads at the edge

In edge computing, NVIDIA Orin ran every MLPerf benchmark, winning more tests than any other low-power system-on-chip. It also showed up to a 50% gain in energy efficiency compared with its debut on MLPerf in April.

In the previous round, Orin ran up to 5x faster than the previous-generation Jetson AGX Xavier module, while delivering an average of 2x better energy efficiency.

Orin delivered up to 50% gains in energy efficiency for AI inference at the edge.

Orin integrates an NVIDIA Ampere architecture GPU and a cluster of powerful Arm CPU cores into a single chip. It is available today in the NVIDIA Jetson AGX Orin developer kit and production modules for robotics and autonomous systems, and it supports the entire NVIDIA AI software stack, including platforms for autonomous vehicles (NVIDIA Hyperion), medical devices (Clara Holoscan) and robotics (Isaac).

Extensive NVIDIA AI ecosystem

MLPerf results show that NVIDIA AI is backed by the industry’s largest machine learning ecosystem.

Over 70 submissions in this round were executed on the NVIDIA platform. For example, Microsoft Azure submitted results running NVIDIA AI on its cloud services.

Additionally, 19 NVIDIA-Certified Systems appeared in this round from 10 system manufacturers, including ASUS, Dell Technologies, Fujitsu, GIGABYTE, Hewlett Packard Enterprise, Lenovo, and Supermicro.

Their work shows that users can achieve excellent performance with NVIDIA AI both in the cloud and on servers running in their own data centers.

NVIDIA partners participate in MLPerf because they know it is a valuable tool for customers evaluating AI platforms and vendors. Results in the latest round demonstrate that the performance they deliver to users today will grow with the NVIDIA platform.

All of the software used for these tests is available in the MLPerf repository, so anyone can get these world-class results. Optimizations are continually folded into containers available on NGC, NVIDIA’s catalog for GPU-accelerated software. This is where you will also find NVIDIA TensorRT, used by every submission this round to optimize AI inference.

Read our tech blog for a deeper dive into the technology that powers NVIDIA’s MLPerf performance.

