NVIDIA Blackwell crushes MLPerf records for AI

NVIDIA set records in the recent MLPerf Inference v5.0 benchmarks with its Blackwell platform, marking the company's first submission using the GB200 NVL72, a rack-scale system built for AI reasoning workloads. Results like these matter because modern AI increasingly runs in purpose-built facilities, known as AI factories, rather than in traditional data centers.

Rather than simply storing and serving data, an AI factory turns raw data into insight in real time. It has to deliver accurate answers quickly, at low cost, and to many users at once, and making that happen is anything but simple. Larger models with billions of parameters need more compute for every output token they generate.

That extra compute cuts the total number of tokens an AI factory can produce and raises the cost per token. Keeping throughput high while holding costs down takes innovation across every layer of the stack: chips, networking, and software. This round, MLCommons added Llama 3.1 405B to the MLPerf suite, one of the largest and most demanding models available today.
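
The economics come down to a simple ratio. The sketch below, using entirely hypothetical cost and throughput numbers rather than anything from the benchmark, shows how cost per token moves when a heavier model cuts the tokens per second a fixed set of hardware can sustain.

```python
# Illustrative only: the dollar and throughput figures are made-up placeholders,
# not MLPerf results. Cost per token depends on two things: what the AI factory
# costs to run per second and how many tokens per second it can serve.

def cost_per_million_tokens(cluster_cost_per_hour: float,
                            tokens_per_second: float) -> float:
    """Dollar cost to generate one million output tokens."""
    cost_per_second = cluster_cost_per_hour / 3600.0
    return cost_per_second / tokens_per_second * 1_000_000

# A heavier reasoning model that halves sustainable throughput on the same
# hardware doubles the cost per token -- the squeeze described above.
print(cost_per_million_tokens(cluster_cost_per_hour=98.0, tokens_per_second=3_000))
print(cost_per_million_tokens(cluster_cost_per_hour=98.0, tokens_per_second=1_500))
```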

The new Llama 2 70B Interactive benchmark imposes much tighter latency requirements than the original Llama 2 70B test, better reflecting what production systems must hit to deliver a good user experience. Alongside Blackwell, NVIDIA's Hopper platform posted strong results on every benchmark, with throughput up significantly over last year thanks to full-stack improvements.

The GB200 NVL72 system links 72 NVIDIA Blackwell GPUs over NVLink so they behave like a single, massive GPU. On Llama 3.1 405B it delivered 30 times higher throughput than the H200 NVL8, the combined effect of roughly triple the per-GPU performance and an interconnect domain nine times larger than before (72 GPUs instead of 8).
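
The 30x figure is roughly the product of those two factors. A quick back-of-the-envelope check, treating the per-GPU gain as an approximation rather than a published number:

```python
# Back-of-the-envelope check of the reported speedup. The per-GPU factor is an
# approximation of "roughly triple", chosen so the product lands near 30x.
gpus_gb200_nvl72 = 72                    # Blackwell GPUs in one NVLink domain
gpus_h200_nvl8 = 8                       # H200 GPUs in the comparison system

domain_scale = gpus_gb200_nvl72 / gpus_h200_nvl8   # 9x larger interconnect domain
per_gpu_gain = 3.3                                 # roughly triple per-GPU throughput

print(f"~{domain_scale * per_gpu_gain:.0f}x overall on Llama 3.1 405B")  # ~30x
```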

Many organizations run MLPerf internally, but only NVIDIA and its partners have submitted and published results on the demanding Llama 3.1 405B benchmark. In production, inference is judged on two key latency metrics: time to first token (TTFT), how quickly the user sees the start of a response, and time per output token (TPOT), how quickly each subsequent token arrives.
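
A minimal sketch of how these two metrics are typically measured against a streaming endpoint; stream_tokens is a hypothetical stand-in for whatever client yields tokens as the model generates them, not a real API.

```python
import time
from typing import Iterable

def measure_latency(stream: Iterable[str]) -> tuple[float, float]:
    """Return (time to first token, average time per output token) in seconds."""
    start = time.perf_counter()
    first_token_at = None
    count = 0
    for _token in stream:                          # tokens arrive as they are generated
        if first_token_at is None:
            first_token_at = time.perf_counter()   # TTFT ends when output starts
        count += 1
    end = time.perf_counter()

    ttft = first_token_at - start
    tpot = (end - first_token_at) / max(count - 1, 1)   # pace of the remaining tokens
    return ttft, tpot

# Usage with a hypothetical streaming client (not a real API):
# ttft, tpot = measure_latency(stream_tokens("Summarize the MLPerf v5.0 results"))
```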

The Llama 2 70B Interactive benchmark tightens both limits, requiring roughly 5x faster token generation (TPOT) and a 4.4x faster first token (TTFT) than the original test, which translates directly into more responsive interactions. On this harder variant, NVIDIA's DGX B200 system with eight Blackwell GPUs delivered three times the performance of eight H200 GPUs, setting a new bar.
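
For concreteness, the sketch below plugs in one set of latency limits that reproduces those ratios; the absolute millisecond values are an assumption, not figures from the article.

```python
# Assumed latency limits in milliseconds. The absolute values are an assumption;
# they are chosen only because they reproduce the 4.4x and 5x ratios cited above.
limits = {
    "Llama 2 70B":             {"ttft_ms": 2000, "tpot_ms": 200},
    "Llama 2 70B Interactive": {"ttft_ms": 450,  "tpot_ms": 40},
}

base = limits["Llama 2 70B"]
strict = limits["Llama 2 70B Interactive"]
print(f"TTFT tightened {base['ttft_ms'] / strict['ttft_ms']:.1f}x")  # 4.4x
print(f"TPOT tightened {base['tpot_ms'] / strict['tpot_ms']:.1f}x")  # 5.0x
```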

The combination of Blackwell hardware and an optimized software stack delivers new levels of inference performance, which AI factories see as smarter responses, more capacity, and faster token generation. Meanwhile, the Hopper architecture, introduced in 2022, still powers many of today's AI systems and continues to be used for model training as well.

Continuous software optimization keeps raising the throughput of Hopper-based systems, and their value with it. On the year-old Llama 2 70B benchmark, H100 throughput has improved 1.5x; the H200, built on the same architecture but with larger, faster memory, extends that gain to 1.6x.

Hopper ran every benchmark in the suite, including the newly added Llama 3.1 405B, Llama 2 70B Interactive, and graph neural network (GNN) tests. That versatility lets Hopper handle a broad mix of workloads as models and usage scenarios grow more demanding. In this round, 15 partner companies submitted strong results using NVIDIA platforms.

These partners include ASUS, Cisco, CoreWeave, Dell Technologies, Fujitsu, Giga Computing, Google Cloud, Hewlett Packard Enterprise, Lambda, Lenovo, Oracle Cloud Infrastructure, Quanta Cloud Technology, Supermicro, Sustainable Metal Cloud, and VMware. Participation on this scale reflects how widely NVIDIA platforms are available, spanning cloud service providers and server makers worldwide.

MLCommons continually updates the MLPerf Inference benchmarks to keep pace with the latest AI developments, giving the industry rigorously peer-reviewed performance data that helps IT decision-makers choose the right AI infrastructure for their needs.
 
