Vast Data Announces AI Cloud Architecture Built on Nvidia DPUs – High-Performance Computing News Analysis
Vast said the Nvidia BlueField networking platform combines compute power and integrated hardware accelerators to create software-defined accelerated computing infrastructure for AI. By outfitting each GPU server with a dedicated Nvidia BlueField DPU running a stateless container that powers the VAST parallel services operating system, this new architecture design embeds storage and database processing services directly into AI servers and delivers “true linear data services designed to scale to hundreds of thousands of GPUs,” according to Vast.
In addition, the company said, by removing multiple layers of x86 hardware and networking from VAST’s network-attached Data Platform infrastructure, this new AI factory architecture reduces the cost, footprint, and power associated with AI data services.
The company said:
- VAST’s Disaggregated, Shared Everything (DASE) architecture leverages the processing of Nvidia BlueField-3 to require fewer independent compute and networking resources, reducing the power usage and data center footprint for VAST infrastructure by 70 percent. The result is net energy consumption savings of over 5 percent compared to deploying Nvidia-powered supercomputers with the previous VAST distributed data services infrastructure.
- By providing each GPU server with a dedicated and parallel storage and database container, the architecture eliminates contention for data services infrastructure. VAST’s DASE architecture features parallelism such that each Nvidia BlueField-3 can read and write into shared namespaces of the VAST Data Platform without coordinating IO across containers. This is important for multi-tenant service providers that need to meet the contractual Service Level Objectives of their clients while maximizing GPU computing assets.
- The architecture is designed to ensure that data and data management remain protected and isolated from host operating systems. Compared to AI computers that use parallel file system clients (which have an intimate understanding of the data services layer), VAST eliminates many attack vectors in a multi-tenant environment by hosting industry-standard network-attached services, object services and database services from Nvidia BlueField-3 DPUs via standard client protocols, such as NFS, SMB, S3 and Apache Arrow, that do not expose the underlying data platform system topology.
- VAST systems, powered by the Nvidia DOCA software framework that enables the rapid development of containerized services, now provide block storage services natively to host operating systems, combining with VAST’s file, object and database services to provide a set of data presentations to high-performance applications.
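The two energy figures in the first item of the list above can be related with simple arithmetic. The following is a minimal sketch, not Vast's own calculation: it assumes the net saving equals the data-services share of total system power multiplied by the 70 percent reduction, and it treats "over 5 percent" as exactly 5 percent. The function name and parameters are hypothetical and chosen only for illustration.

```python
def implied_data_services_share(net_savings: float, infra_reduction: float) -> float:
    """Estimate the fraction of total system power used by data services.

    Assumes (as a simplification, not a Vast-published model) that
    net_savings = share * infra_reduction, so share = net_savings / infra_reduction.
    """
    if not 0 < infra_reduction <= 1:
        raise ValueError("infra_reduction must be in (0, 1]")
    if not 0 <= net_savings <= infra_reduction:
        raise ValueError("net_savings must be between 0 and infra_reduction")
    return net_savings / infra_reduction


# Figures quoted in the announcement: 70 percent reduction, 5 percent net saving.
share = implied_data_services_share(net_savings=0.05, infra_reduction=0.70)
print(f"Implied data-services share of total power: {share:.1%}")
```

Under this assumption, the previous data services infrastructure would have accounted for roughly 7 percent of total system power. Because the announcement says "over 5 percent," the implied share would be at least that much.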
“We’re extremely proud to partner with Nvidia to help industrialize AI computing,” said Jeff Denworth, co-founder at VAST Data. “With Nvidia BlueField-3 DPUs, we can now realize the full potential of our vision for disaggregated data centers that we’ve been working toward since the company was founded.”
This architecture is being tested at CoreWeave, a specialized GPU cloud provider. VAST and CoreWeave began partnering in 2023 to build scalable AI machinery and to help LLM builders and enterprise customers build AI factories.
“With VAST’s operating system, next-generation accelerated computing solutions are paired with next-generation accelerated network infrastructure, enabling enterprises and service providers to benefit from simpler, more secure experiences with high-performance systems,” said Rob Davis, Vice President of Storage Technology at Nvidia.
“VAST’s revolutionary architecture is a game-changer for CoreWeave, enabling us to fully disaggregate our data centers. We’re seamlessly integrating VAST’s advanced software directly into our GPU clusters,” said Peter Salanki, vice president of Engineering at CoreWeave. “Leveraging Nvidia BlueField DPUs, we’ve been at the forefront of creating sophisticated, software-defined data center abstractions. Now, by natively incorporating storage and database services onto BlueField, we’re not just streamlining our infrastructure but we are also elevating the user experience for our customers by removing bottlenecks in the AI data computing pipeline.”