NVIDIA Unveils Innovations at Hot Chips to Enhance Data Center Performance and Energy Efficiency


By Car Brand Experts


A deep-technology conference for processor and system architects from industry and academia has become a key forum for the trillion-dollar data center computing market.

Next week at Hot Chips 2024, leading engineers from NVIDIA will showcase the newest advancements driving the NVIDIA Blackwell platform, along with studies on liquid cooling for data centers and AI-driven tools for chip design.

During their presentations, they will discuss:

  • How the NVIDIA Blackwell integrates various chips, systems, and NVIDIA CUDA software to fuel the next generation of AI across diverse industries and regions.
  • The NVIDIA GB200 NVL72 — an innovative multi-node, liquid-cooled, rack-scale system that connects 72 Blackwell GPUs and 36 Grace CPUs — setting new benchmarks for AI system architecture.
  • The use of NVLink interconnect technology, which enables all-to-all GPU communication for record throughput and low-latency inference in generative AI.
  • The NVIDIA Quasar Quantization System, which pushes the boundaries of physics to enhance AI computational power.
  • Ongoing projects by NVIDIA researchers to develop AI models that facilitate the creation of processors for AI applications.

An NVIDIA Blackwell session scheduled for Monday, August 26, will delve into the novel architectural features and showcase examples of generative AI models operating on Blackwell silicon.

This session is preceded by three tutorials on Sunday, August 25, covering how hybrid liquid-cooling solutions can help data centers transition to more energy-efficient infrastructure, and how AI models, including large language model (LLM)-driven agents, can assist engineers in designing next-generation processors.

Altogether, these presentations highlight the innovative approaches NVIDIA engineers are employing across data center computing and design, aiming for unmatched performance, efficiency, and optimization.

Prepare for Blackwell

The NVIDIA Blackwell platform is a full-stack computing effort, incorporating multiple NVIDIA technologies: the Blackwell GPU, Grace CPU, BlueField data processing unit, ConnectX network interface card, NVLink Switch, Spectrum Ethernet switch, and Quantum InfiniBand switch.


Ajay Tirumala and Raymond Wong, directors of architecture at NVIDIA, will unveil the platform and explain how these technologies cohesively work to establish a new benchmark for AI and accelerated computing performance while promoting energy efficiency.

The multi-node NVIDIA GB200 NVL72 solution exemplifies this approach. To handle LLM inference, which demands low-latency and high-throughput token generation, the GB200 NVL72 operates as an integrated system, achieving up to 30 times faster inference for LLM workloads and enabling real-time processing of trillion-parameter models.

Tirumala and Wong will also explore how the NVIDIA Quasar Quantization System, which integrates algorithmic advancements, NVIDIA software libraries, and tools with Blackwell’s second-generation Transformer Engine, maintains high accuracy in low-precision models, featuring examples with LLMs and generative AI.
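The general idea behind low-precision quantization, which the Quasar system builds on, can be sketched in a few lines: map a tensor onto a small integer range, then rescale back and measure how much accuracy is lost. The symmetric scale-round-clip scheme below is an illustrative textbook example, not NVIDIA's implementation, and the bit widths are assumptions for demonstration.

```python
import numpy as np

def quantize_dequantize(x, num_bits=8):
    """Symmetric per-tensor fake quantization: scale, round, clip, rescale."""
    qmax = 2 ** (num_bits - 1) - 1          # e.g. 127 for 8 bits
    scale = np.max(np.abs(x)) / qmax        # map largest magnitude onto qmax
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale                        # dequantized approximation

rng = np.random.default_rng(0)
weights = rng.standard_normal(1024).astype(np.float32)

for bits in (8, 4):
    approx = quantize_dequantize(weights, bits)
    err = np.abs(weights - approx).mean()
    print(f"{bits}-bit mean abs error: {err:.4f}")
```

The trade-off the sketch exposes is exactly what a production quantization system must manage: fewer bits mean less memory traffic and faster math, but a coarser grid and larger rounding error, which is why software support for choosing where and how to quantize matters.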

Cooling Data Centers

The familiar hum of air-cooled data centers may soon become a relic as researchers develop more efficient, sustainable hybrid cooling solutions that combine air and liquid cooling.

Liquid cooling techniques offer a more effective way to dissipate heat from systems than air cooling, ensuring computing systems remain cool during extensive workloads. Additionally, liquid cooling equipment occupies less space and uses less power compared to air-cooling systems, enabling data centers to install more server racks — thus increasing computational capacity.
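Liquid's advantage comes down to basic thermodynamics: water carries far more heat per unit volume than air, so far less coolant flow is needed for the same load. A back-of-envelope sketch, using standard textbook properties and illustrative rack numbers not taken from the talk:

```python
def flow_needed_m3_per_s(power_w, density, specific_heat, delta_t):
    """Volumetric coolant flow needed to carry away power_w watts
    given a coolant temperature rise of delta_t kelvin:
    V = P / (rho * c_p * dT)."""
    return power_w / (density * specific_heat * delta_t)

POWER = 100_000.0   # 100 kW rack load, illustrative
DT = 10.0           # 10 K coolant temperature rise

# Approximate properties: air ~1.2 kg/m^3, c_p ~1005 J/(kg K);
# water ~1000 kg/m^3, c_p ~4186 J/(kg K).
air = flow_needed_m3_per_s(POWER, density=1.2, specific_heat=1005.0, delta_t=DT)
water = flow_needed_m3_per_s(POWER, density=1000.0, specific_heat=4186.0, delta_t=DT)

print(f"air:   {air:.2f} m^3/s")
print(f"water: {water * 1000:.2f} L/s")
print(f"air needs ~{air / water:.0f}x the volumetric flow of water")
```

The roughly three-orders-of-magnitude gap in required flow is why liquid-cooling hardware can be so much more compact and less power-hungry than banks of fans and air handlers.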

Ali Heydari, director of data center cooling and infrastructure at NVIDIA, will present various designs for hybrid-cooled data centers.

Some designs involve retrofitting existing air-cooled data centers with liquid-cooling units, offering a convenient way to integrate liquid-cooling into current racks. Other designs necessitate the installation of pipework for direct-to-chip liquid cooling through cooling distribution units or even fully submerging servers in immersion cooling systems. While these methods may require a larger initial investment, they result in significant energy savings and reduced operational costs over time.

Heydari will also discuss his team’s contributions to COOLERCHIPS, a U.S. Department of Energy initiative aimed at advancing data center cooling technologies. As part of this effort, the team is employing the NVIDIA Omniverse platform to develop physics-informed digital twins to model energy consumption and cooling efficiency, ultimately enhancing their data center designs.
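At its simplest, a digital twin for cooling is a physics model stepped forward in time to predict temperatures under a given load. The lumped-capacitance toy below illustrates the principle only; it is not the Omniverse workflow, and every parameter is an invented placeholder:

```python
def simulate_chip_temp(power_w, ambient_c, r_thermal, c_thermal,
                       dt=0.1, steps=3000):
    """Toy lumped thermal model stepped with explicit Euler:
    dT/dt = (P - (T - T_amb) / R) / C."""
    temp = ambient_c
    for _ in range(steps):
        heat_out = (temp - ambient_c) / r_thermal  # heat flow to coolant
        temp += (power_w - heat_out) * dt / c_thermal
    return temp

# Same 700 W chip, two cooling paths: liquid gives a lower
# chip-to-coolant thermal resistance (all values illustrative).
air = simulate_chip_temp(power_w=700, ambient_c=35, r_thermal=0.08, c_thermal=500)
liquid = simulate_chip_temp(power_w=700, ambient_c=35, r_thermal=0.02, c_thermal=500)
print(f"air-cooled steady state:    {air:.1f} C")
print(f"liquid-cooled steady state: {liquid:.1f} C")
```

A real physics-informed twin replaces this single thermal resistance with a full 3D model of racks, pipework, and airflow, but the payoff is the same: design questions get answered in simulation before hardware is built.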

AI Agents Contribute to Processor Design

Designing semiconductors is a substantial challenge, especially at a microscopic scale. Engineers tasked with creating cutting-edge processors strive to maximize computing power on small silicon wafers, pushing the limits of physical capabilities.

AI models enhance the design process, improving quality and productivity, augmenting manual methods, and automating time-consuming tasks. These models include prediction and optimization tools that help engineers rapidly analyze and refine designs, as well as LLMs that can answer questions, generate code, debug design issues, and much more.

Mark Ren, director of design automation research at NVIDIA, will provide an overview of these models and how they are utilized in a tutorial. In another session, he will delve into agent-based AI systems for chip design.

AI agents driven by LLMs can be instructed to autonomously execute tasks, opening up wide-ranging applications across various industries. In the realm of microprocessor design, NVIDIA researchers are working on agent-based systems capable of reasoning and taking action with specialized circuit design tools, interacting with skilled designers, and learning from both human and agent experiences.
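A tool-using LLM agent of this kind typically follows a simple loop: the model selects a tool for the request, a harness executes it, and the result is returned or fed back for the next step. In the minimal sketch below, the tool, the `call_llm` stub, and the timing-report format are all hypothetical stand-ins for illustration, not NVIDIA's actual toolchain:

```python
# Hypothetical stand-ins: the tool and the LLM "policy" below are
# deterministic stubs for illustration, not a real chip-design toolchain.

def analyze_timing_report(report: str) -> str:
    """Return the worst (most negative) slack path from a toy timing report
    formatted as 'path_name slack_ns' per line."""
    paths = [line.split() for line in report.strip().splitlines()]
    worst = min(paths, key=lambda p: float(p[1]))
    return f"worst path {worst[0]} with slack {worst[1]} ns"

TOOLS = {"analyze_timing_report": analyze_timing_report}

def call_llm(request: str) -> str:
    """Stub for the LLM tool-selection step: map a request to a tool name."""
    return "analyze_timing_report" if "timing" in request else "unknown"

def run_agent(request: str, context: str) -> str:
    tool_name = call_llm(request)        # 1. model picks a tool
    result = TOOLS[tool_name](context)   # 2. harness executes it
    return f"[{tool_name}] {result}"     # 3. result goes back to the model/user

report = "clk_to_q -0.42\nsetup_a 0.15\nsetup_b -0.07"
print(run_agent("summarize the timing report", report))
```

In a production system the stubbed policy becomes a real LLM call, the tool set grows to cover circuit design utilities, and the loop iterates, with the agent reasoning over each tool result before deciding its next action.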

NVIDIA experts are not just developing this technology; they are also actively using it. Ren will present real-world examples demonstrating how engineers can leverage AI agents for timing report analysis, cell cluster optimization, and code generation. Notably, the work on cell cluster optimization was awarded best paper at the inaugural IEEE International Workshop on LLM-Aided Design.

Join Hot Chips, taking place August 25-27 at Stanford University and online.
