NVIDIA and Japanese telecommunications company SoftBank have partnered to create a platform for generative artificial intelligence (AI) and 5G/6G applications based on NVIDIA’s GH200 Grace Hopper Superchip.
SoftBank plans to build data centers in Japan that, in collaboration with NVIDIA, will host generative AI and wireless applications on a shared server platform. This multi-tenant approach is expected to reduce costs and improve energy efficiency.
The proposed platform will utilize NVIDIA’s new MGX modular reference architecture with the GH200 Superchip, boosting the performance of application workloads. The Grace Hopper Superchip pairs a Grace CPU, with 72 Arm Neoverse V2 cores and LPDDR5X memory, with a Hopper GPU on a single module, combining central processing unit (CPU) and graphics processing unit (GPU) resources for accelerated computing.
NVIDIA will be rolling out MGX later this year, with SoftBank as the first customer testing the architecture.
Telco Modernization Requires Accelerated Computing
For the last six decades, most computing architectures have been primarily CPU-focused. These traditional designs are versatile and adapt well to various workloads, but they are increasingly giving way to accelerated computing.
Accelerated computing pairs a CPU with an accelerator, such as a GPU, and divides processing tasks between them: the CPU handles control flow and serial work, while the accelerator handles data-parallel workloads it is optimized for. High-bandwidth networking between these two elements is crucial for maintaining the best possible performance.
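The pattern described above can be sketched in miniature. The code below is an illustrative simulation, not NVIDIA's API: the "host" (CPU) prepares data and orchestrates, while a thread pool stands in for the accelerator applying one kernel across many data elements. On real hardware the hand-off travels over an interconnect such as NVLink or PCIe, which is why bandwidth between the two elements matters.

```python
# Illustrative sketch of the accelerated-computing pattern. The thread pool
# simulates an accelerator; this is not NVIDIA's actual programming model.
from concurrent.futures import ThreadPoolExecutor

def accelerate(kernel, data, lanes=4):
    """Split `data` into chunks and run `kernel` over them in parallel,
    mimicking how a GPU applies one kernel across many data elements."""
    chunk = max(1, len(data) // lanes)
    chunks = [data[i:i + chunk] for i in range(0, len(data), chunk)]
    with ThreadPoolExecutor(max_workers=lanes) as pool:
        results = pool.map(lambda c: [kernel(x) for x in c], chunks)
    return [y for part in results for y in part]  # host gathers the results

if __name__ == "__main__":
    # Host (CPU) prepares the data, offloads the data-parallel math,
    # then continues with serial post-processing.
    signal = list(range(8))
    print(accelerate(lambda x: x * x, signal))  # [0, 1, 4, 9, 16, 25, 36, 49]
```

The division of labor, not the thread pool itself, is the point: serial orchestration stays on the host, and uniform per-element work is fanned out to the accelerator.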
The joint platform will also leverage NVIDIA’s BlueField-3 data processing units (DPUs) to accelerate 5G virtualized radio access network (vRAN) and generative AI applications. It is expected to deliver downlink capacity of roughly 36 Gbps.
Compared to a single-purpose 5G virtual RAN, this approach offers about four times the return on investment, because the same data center can also serve AI workloads, explained Ronnie Vasishta, Sr. VP of Telecom at NVIDIA, during a briefing with analysts.
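The roughly 4x figure can be sanity-checked with back-of-the-envelope utilization math. All numbers below are illustrative assumptions for the sake of the arithmetic, not figures disclosed by NVIDIA or SoftBank:

```python
# Back-of-the-envelope sketch of the multi-tenant ROI argument.
# Every number here is an illustrative assumption.

capex = 100.0  # cost of the data center (arbitrary units)

# Single-purpose 5G vRAN: sized for peak traffic, so most capacity sits idle.
ran_utilization = 0.25   # assumed fraction of capacity earning revenue
revenue_rate = 1.0       # revenue per unit of utilized capacity
single_purpose_return = ran_utilization * revenue_rate * capex

# Multi-tenant platform: the same hardware sells its idle capacity to AI
# workloads, so nearly all of it earns revenue.
ai_utilization = 0.75    # assumed idle capacity resold for AI jobs
multi_tenant_return = (ran_utilization + ai_utilization) * revenue_rate * capex

print(multi_tenant_return / single_purpose_return)  # 4.0 under these assumptions
```

If a RAN sized for peak demand earns on only a quarter of its capacity, filling the rest with AI jobs quadruples the return on the same capital, which is the shape of the argument Vasishta is making.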
Open RAN Enables Greater Agility
Until recently, the main model for telecommunications was a proprietary RAN, developed as a single monolithic stack. This approach offers high performance but is expensive to maintain. Many operators have since transitioned to an open, virtualized RAN, moving their compute workloads onto standard server architectures.
However, a virtualized RAN has struggled to match the performance of a proprietary RAN. To overcome this, purpose-built accelerators were introduced. Yet these accelerators could only be used for RAN workloads, resulting in poor utilization and subpar cloud economics.
“Single-purpose networks purely for 5G must be built for peak demand. As new AI applications come in, that peak demand will grow. The power requirements are going to grow. The compute requirements are also going to grow,” said Vasishta. “We see a significant underutilization of the networks being built, and the return on investment (ROI) on 5G has been relatively low.”
NVIDIA has developed a GPU-accelerated, software-defined architecture, where one accelerator can run both AI tasks and RAN. This allows RAN and AI to coexist within a data center, which can be public, distributed, or on-premises. This approach essentially allows 5G to run as a software overlay on AI clouds, with the hardware remaining the same. In fact, as 6G algorithms get developed, they can be incorporated into the existing hardware, without requiring new hardware to be deployed.
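The coexistence idea can be sketched as a toy capacity scheduler: latency-critical RAN traffic is allocated capacity first, and best-effort AI jobs soak up whatever remains each interval. This is a hypothetical illustration of the scheduling principle only, not NVIDIA's actual orchestration software:

```python
# Toy scheduler illustrating RAN/AI coexistence on shared accelerators.
# Hypothetical sketch of the principle; not NVIDIA's orchestration logic.

def schedule(total_capacity, ran_demand, ai_demand):
    """Give latency-critical RAN traffic capacity first; let best-effort
    AI workloads fill whatever is left over."""
    ran_alloc = min(ran_demand, total_capacity)
    ai_alloc = min(ai_demand, total_capacity - ran_alloc)
    return {"ran": ran_alloc, "ai": ai_alloc,
            "idle": total_capacity - ran_alloc - ai_alloc}

if __name__ == "__main__":
    # Off-peak: light 5G load, so AI jobs absorb the spare capacity.
    print(schedule(100, ran_demand=20, ai_demand=90))
    # Peak: RAN takes what it needs; AI is squeezed, but nothing sits idle.
    print(schedule(100, ran_demand=80, ai_demand=90))
```

Under this policy the hardware stays busy at every point in the traffic cycle, which is precisely the underutilization problem Vasishta describes with single-purpose 5G networks.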
Generative AI is an “iPhone Moment”
Vasishta called this an “iPhone moment for AI,” echoing a phrase NVIDIA CEO Jensen Huang has used repeatedly: the game-changing impact of artificial intelligence on data centers, akin to how the iPhone revolutionized smartphones.
This moment is marked by the convergence of two transformative factors in the tech industry: a change in computing architectures and the emergence of generative AI. Generative AI requires scale-out architectures, which is driving tremendous demand for networked “AI factories,” or data centers, said Vasishta. Applications such as chatbots and AI-enhanced video conferencing are placing significant demand on telecom networks.
NVIDIA is addressing this issue by making 5G infrastructure not just virtualized, but also completely software-defined. So, it’s possible to run a high-performance, efficient 5G network alongside AI applications within the same data center. This opens up new monetization opportunities for telcos like SoftBank and others, allowing them to become regional cloud service providers. By building an AI factory, they can also provide RAN services, thereby using their purchased spectrum more effectively.
SoftBank is exploring 5G applications in various sectors, including autonomous driving, augmented and virtual reality (AR/VR), computer vision, and digital twins. The partnership with NVIDIA is a significant step in the evolution of data centers, where the demand for accelerated computing and generative AI drives fundamental changes.