CIO Insider

Nvidia Unveils New Configuration to Accelerate AI Applications

CIO Insider Team | Wednesday, 9 August, 2023

Nvidia recently unveiled a new configuration of its cutting-edge artificial intelligence chips, designed to accelerate generative AI applications.

According to Ian Buck, vice president of hyperscale and HPC at Nvidia, the new Grace Hopper Superchip increases the amount of high-bandwidth memory, enabling the architecture to support larger AI models. The configuration is optimized for AI inference, the workload that powers generative AI applications like ChatGPT.

With its Grace Hopper Superchip, Nvidia connects one of its H100 graphics processing units (GPUs) with a central processor of its own design.

By some estimates, Nvidia currently holds more than 80% of the market for AI chips. The company specializes in graphics processing units, or GPUs, which have become the preferred chips for the large AI models behind generative AI applications such as Google's Bard and OpenAI's ChatGPT. However, Nvidia's chips are in short supply as tech giants, cloud service providers, and startups compete for GPU capacity to develop their own AI models.

According to Buck, the company's new GH200 is built for inference because its larger memory capacity lets bigger AI models fit on a single device.

Nvidia's latest device, the GH200, uses the same GPU as the H100, the company's current highest-end AI chip. However, the GH200 pairs that GPU with 141 GB of cutting-edge memory and a 72-core Arm CPU.

Working with an AI model is generally split into two phases: training and inference. First, a model is trained on large amounts of data, a process that can take months and sometimes requires thousands of GPUs, such as Nvidia's H100 and A100 chips. The trained model is then used in software to make predictions or generate content, a process called inference. Like training, inference is computationally expensive, and it requires significant processing power every time the software runs, for example when it generates text or an image. But unlike training, which is only needed when the model is updated, inference takes place almost constantly.

The announcement follows AMD's recent launch of its own AI-focused chip, the MI300X, which supports 192 GB of memory and is being marketed for its AI inference capabilities. AMD is Nvidia's main rival in GPUs. Companies such as Google and Amazon are also designing their own custom AI chips for inference.

The GH200 has 141 GB of memory, compared with 80 GB on Nvidia's H100. For even larger models, Nvidia has also announced a system that combines two GH200 chips in a single computer.
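
To illustrate why memory capacity matters for inference, the short sketch below estimates how much memory a model's weights alone would occupy and checks that figure against the 80 GB and 141 GB capacities cited above. It is a back-of-the-envelope calculation, not an Nvidia sizing method: the model sizes and the 16-bit precision are hypothetical assumptions, and the function names are invented for this example. Real deployments also need room for activations, caches, and runtime overhead, so actual requirements are higher.

```python
# Rough, weights-only memory estimate (assumption: ignores activations,
# caches, and runtime overhead, all of which add to the real footprint).
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}

H100_GB = 80    # memory capacity cited in the article
GH200_GB = 141  # memory capacity cited in the article


def weights_gb(num_params: int, dtype: str = "fp16") -> float:
    """Memory needed for the weights, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def fits(num_params: int, device_gb: float, dtype: str = "fp16") -> bool:
    """True if the weights alone fit within the device's memory."""
    return weights_gb(num_params, dtype) <= device_gb


# Hypothetical model sizes, in billions of parameters.
for billions in (13, 30, 70):
    n = billions * 10**9
    print(
        f"{billions}B params at fp16: {weights_gb(n):.0f} GB of weights, "
        f"fits in 80 GB: {fits(n, H100_GB)}, fits in 141 GB: {fits(n, GH200_GB)}"
    )
```

Under these assumptions, a hypothetical 70-billion-parameter model stored at 16-bit precision needs about 140 GB for its weights, which exceeds 80 GB but sits just under 141 GB, leaving almost no headroom for anything else.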
