Amazon Web Services (AWS) and Nvidia expanded their strategic collaboration to deliver the most advanced infrastructure, software, and services to power customers’ generative artificial intelligence (AI) innovations.
The two tech companies will bring together the best of their technologies — from Nvidia’s newest multi-node systems featuring next-generation GPUs (graphics processing units), CPUs (central processing units) and AI software, to the AWS Nitro System’s advanced virtualization and security, Elastic Fabric Adapter (EFA) interconnect, and UltraCluster scalability. These technologies are ideal for training foundation models and building generative AI applications.
Combining AI and Cloud Computing
At the AWS re:Invent 2023 event, Adam Selipsky, the CEO of AWS, and Jensen Huang, CEO of Nvidia, emphasized the crucial role of generative AI in shaping the future of cloud transformation. They shed light on the significance of expanding collaboration between their companies in advancing the capabilities of generative AI within the cloud landscape.
This collaboration builds on a 13-year partnership between AWS and Nvidia. “Today, we offer the widest range of NVIDIA GPU solutions for workloads including graphics, gaming, high performance computing, machine learning, and now, generative AI,” AWS CEO Adam Selipsky said. “We continue to innovate with NVIDIA to make AWS the best place to run GPUs, combining next-gen NVIDIA Grace Hopper Superchips with AWS’s powerful EFA networking, EC2 UltraClusters’ hyper-scale clustering, and Nitro’s advanced virtualization capabilities.”
As the generative AI era takes center stage, AWS and Nvidia are positioned to lead the way in providing advanced solutions across the computing stack. “Generative AI is transforming cloud workloads and putting accelerated computing at the foundation of diverse content generation,” said Jensen Huang, founder and CEO of Nvidia. “Driven by a common mission to deliver cost-effective state-of-the-art generative AI to every customer, NVIDIA and AWS are collaborating across the entire computing stack, spanning AI infrastructure, acceleration libraries, foundation models, to generative AI services.”
Details of the AWS and Nvidia Collaboration
Here’s a breakdown of the key highlights from the Nvidia-AWS extended partnership:
AWS – the first cloud provider to offer the GH200 Grace Hopper Superchip
AWS is set to become the first cloud provider to offer Nvidia GH200 Grace Hopper Superchips with multi-node NVLink technology, giving customers on-demand access to supercomputer-class performance for the most complex generative AI workloads.
“Grace Hopper, which is GH200, connects two revolutionary processors together in a really unique way,” Huang said. He explained that the GH200 connects Nvidia’s Grace Arm CPU with its H200 GPU using a chip-to-chip interconnect called NVLink, at an astonishing one terabyte per second.
AWS and Nvidia connect 32 Grace Hopper Superchips in each rack using a new NVLink switch. Each 32-GH200 NVLink-connected node can serve as a single Amazon EC2 instance. Combined with AWS Nitro and EFA networking, customers can then connect GH200 NVL32 instances together to scale to thousands of GH200 Superchips. “With AWS Nitro, that becomes basically one giant virtual GPU instance,” Huang said.
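As an illustration of what provisioning such an instance could look like, here is a minimal sketch of an EC2 `RunInstances` request. The instance type string is a hypothetical placeholder (no GH200 NVL32 instance name had been published at announcement time); only the `"InterfaceType": "efa"` field reflects the real EC2 API mechanism for requesting EFA networking.

```python
# Sketch of an EC2 RunInstances request for a GH200 NVL32-class node.
# The instance type below is a HYPOTHETICAL placeholder, not a real
# AWS identifier; requesting EFA via the network interface's
# InterfaceType field is the real EC2 mechanism.
def build_run_instances_request(subnet_id: str, count: int = 1) -> dict:
    return {
        "InstanceType": "gh200-nvl32.metal",  # hypothetical placeholder
        "MinCount": count,
        "MaxCount": count,
        "NetworkInterfaces": [
            {
                "DeviceIndex": 0,
                "SubnetId": subnet_id,
                "InterfaceType": "efa",  # Elastic Fabric Adapter
            }
        ],
    }

request = build_run_instances_request("subnet-0123456789abcdef0")
# With AWS credentials configured, this dict would be passed to
# boto3.client("ec2").run_instances(**request).
print(request["NetworkInterfaces"][0]["InterfaceType"])  # efa
```

Keeping the request as a plain dict makes the sketch runnable without an AWS account; the EFA interface is the piece that enables the UltraCluster-scale fabric the article describes.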
Nvidia DGX Cloud on AWS
AWS and Nvidia will collaborate to host Nvidia DGX Cloud on AWS, featuring the GH200 NVL32. This move aims to accelerate the training of advanced generative AI models and large language models that can reach beyond 1 trillion parameters.
Project Ceiba – World’s Fastest GPU-powered AI Supercomputer
This partnership will bring about the first DGX Cloud AI supercomputer powered by the GH200 Superchips, demonstrating the power of AWS’s cloud infrastructure and Nvidia’s AI expertise.
Huang announced that this new DGX Cloud supercomputer design in AWS, codenamed Project Ceiba (named after the majestic Amazonian Ceiba tree), will serve as Nvidia’s newest AI supercomputer as well, for its own AI research and development.
“This first-of-its-kind supercomputer—featuring 16,384 NVIDIA GH200 Superchips and capable of processing 65 exaflops of AI—will be used by NVIDIA to propel its next wave of generative AI innovation.”
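A quick back-of-the-envelope check of those headline figures; the per-superchip number below is derived from the announcement’s totals, not quoted from Nvidia:

```python
# Project Ceiba headline figures from the announcement.
superchips = 16_384
total_ai_flops = 65e18  # 65 exaflops of AI compute

# Implied per-superchip AI throughput.
per_chip = total_ai_flops / superchips
print(f"~{per_chip / 1e15:.2f} PFLOPS per GH200 Superchip")
```

The result works out to roughly 4 petaflops of AI compute per superchip, which is in the ballpark of Hopper-generation low-precision throughput figures.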
New Amazon EC2 Instances
AWS will introduce three additional Amazon EC2 instance types powered by Nvidia H200, L40S, and L4 GPUs, which, alongside the GH200 NVL32 instances, cater to a wide range of applications, including generative AI, high-performance computing (HPC), design, graphics, and simulation workloads.
A single Amazon EC2 instance with GH200 NVL32 can provide up to 20 TB of shared memory to power terabyte-scale workloads. These instances will take advantage of AWS’s third-generation Elastic Fabric Adapter (EFA) interconnect, providing up to 400 Gbps per Superchip of low-latency, high-bandwidth networking throughput, enabling customers to scale to thousands of GH200 Superchips in EC2 UltraClusters.
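The 20 TB figure is consistent with the GH200’s published memory configuration. A quick sketch, assuming roughly 480 GB of Grace LPDDR5X CPU memory and ~144 GB of HBM3e GPU memory per superchip (approximate values; HBM capacity varies by GH200 variant):

```python
# Approximate per-superchip memory (from Nvidia GH200 materials;
# treat as round numbers, since capacities vary by variant).
cpu_mem_gb = 480   # Grace LPDDR5X
gpu_mem_gb = 144   # Hopper HBM3e
superchips = 32    # one GH200 NVL32 instance

shared_mem_tb = superchips * (cpu_mem_gb + gpu_mem_gb) / 1000
print(f"~{shared_mem_tb:.1f} TB shared memory per instance")

# Aggregate EFA bandwidth at 400 Gbps per superchip.
aggregate_tbps = superchips * 400 / 1000
print(f"{aggregate_tbps} Tbps aggregate EFA bandwidth per instance")
```

That lands on roughly 20 TB of shared memory and 12.8 Tbps of aggregate EFA bandwidth per NVL32 instance, matching the figures above.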
Nvidia Software Integration
Nvidia’s software, including the NeMo LLM framework, NeMo Retriever, and BioNeMo, will be made available on AWS. This integration aims to accelerate generative AI development, supporting custom model building, semantic retrieval, and drug discovery.