Overcoming Generative AI-Driven Supply Chain Delays and Resource Shortages


AI applications will continue to flourish in the coming years, so back-end data center ecosystems must prepare to supply the necessary infrastructure.

The unprecedented demand for artificial intelligence (AI), and the new paradigm of accelerated computing propelling it, are the latest drivers transforming the data center landscape, causing a ripple effect of supply chain delays and resource shortages. Between rumors of Sam Altman raising $7 trillion to build a new AI and semiconductor ecosystem and the steep rise in NVIDIA’s stock over the last year, investors are seeking additional direct and indirect opportunities to meet the surging demands of AI and build the higher-density infrastructure needed to support it.

Data centers are a key component of AI enablement, hence the scramble to secure the land, power, long-lead equipment, building materials, and talent required to construct data center campuses that can accommodate customers’ ever-evolving AI technology. Providing certainty amid supply chain constraints and challenges — as well as preparing for massive growth, scale, and resource procurement — will remain a top priority in 2024.

AI Advancements and Effects on Data Center Capacity

We’re at the beginning of a period of rapid development in the field of AI, with hyperscalers (the world’s largest cloud and internet providers) requiring even greater capacity, density, and, therefore, power. According to Newmark’s 2023 U.S. Data Center Market Overview Report, the U.S. data center market is slated to more than double by 2030 thanks to the AI boom, with its power needs rising from 17 gigawatts (GW) to over 35 GW. An even more bullish forecast by SemiAnalysis, the specialty analyst firm focused on AI and semiconductors, points to the need to grow even faster and sooner in the U.S., from 22.5 GW in 2023 to 83.5 GW in 2028, a nearly fourfold increase. We’ve seen other turning points in the data center industry’s scale and growth trajectory over the last 30 years, driven by major shifts such as the dot-com boom, the proliferation of the internet, and the emergence of cloud computing. However, AI represents a trajectory of demand and pressure upon the data center industry unlike anything we’ve seen before.
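To put the two forecasts on a common footing, the capacity figures above can be converted into implied compound annual growth rates. The sketch below is illustrative only: the function name is hypothetical, and treating 2023 as the starting year for the Newmark figure is an assumption, since the report's baseline year is not stated here.

```python
def cagr(start_gw: float, end_gw: float, years: float) -> float:
    """Compound annual growth rate implied by growing from start_gw to end_gw."""
    if start_gw <= 0 or end_gw <= 0 or years <= 0:
        raise ValueError("start_gw, end_gw, and years must all be positive")
    return (end_gw / start_gw) ** (1 / years) - 1


# SemiAnalysis forecast quoted above: 22.5 GW (2023) to 83.5 GW (2028).
semianalysis_rate = cagr(22.5, 83.5, years=2028 - 2023)

# Newmark forecast quoted above: 17 GW to 35 GW by 2030.
# ASSUMPTION: the 17 GW baseline is taken as 2023, the report's publication year.
newmark_rate = cagr(17.0, 35.0, years=2030 - 2023)

print(f"SemiAnalysis implied growth: {semianalysis_rate:.1%} per year")
print(f"Newmark implied growth:      {newmark_rate:.1%} per year")
```

Under these assumptions, the SemiAnalysis forecast implies growth of roughly 30 percent per year, against roughly 11 percent per year for the Newmark forecast.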

A key insight is that the size of the largest AI data sets has started to meaningfully outpace Moore’s Law. Starting in 2013, AI data sets doubled every 3-6 months, compared with Moore’s Law’s doubling every 18-24 months. Demand that was once on the periphery is now compounding so quickly that there is also concern about running out of the data needed to train new models, yet another supply constraint for scaling AI. Computing resources need to keep pace with data set growth, and that points to scaling dynamics greater than Moore’s Law.
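The gap between those doubling periods compounds quickly. As a rough illustration, the sketch below compares the growth factor over a two-year window for each doubling period quoted above. The 24-month window and the function name are arbitrary choices for illustration, not figures from any cited source.

```python
def growth_factor(elapsed_months: float, doubling_months: float) -> float:
    """How many times larger a quantity becomes if it doubles every doubling_months."""
    if doubling_months <= 0:
        raise ValueError("doubling_months must be positive")
    return 2 ** (elapsed_months / doubling_months)


WINDOW_MONTHS = 24  # illustrative comparison window (an assumption)

for label, doubling in [
    ("AI data sets, 3-month doubling", 3),
    ("AI data sets, 6-month doubling", 6),
    ("Moore's Law, 18-month doubling", 18),
    ("Moore's Law, 24-month doubling", 24),
]:
    factor = growth_factor(WINDOW_MONTHS, doubling)
    print(f"{label}: {factor:.1f}x over {WINDOW_MONTHS} months")
```

Over the same two years, a 3-month doubling period yields a 256x increase and a 6-month period yields 16x, while Moore’s Law at a 24-month period yields only 2x.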

The prominent AI offerings of OpenAI (ChatGPT), Microsoft (Copilot), and Google (Gemini) have captured widespread public attention, but AI is also being used by companies for very industry-specific applications, ranging from healthcare to financial services and retail. Every company generates vast troves of data, and making sense of that data via inference (the application of the LLMs being trained) is also creating a new class of powerful GPU-driven computing that lives alongside traditional cloud architectures. Huge amounts of data will drive even greater computing needs. Recent OpenAI corporate clashes highlight existential concerns about AI, but whether physical digital infrastructure can keep pace with AI’s growth needs is a more pressing concern. Innovation may stall if density and capacity can’t be provided to enable AI software and hardware to operate.

The GPU Shortage and the Data Center Industry

In addition to capacity and resource shortages, there are chip packaging shortages in the current supply chain that may further delay innovation and AI developments. Accelerated computing is driven by graphics processing units (GPUs), which deliver greater efficiency and performance on these workloads than sequential processors (CPUs). The density and growth of GPUs relative to CPUs therefore remain critical for generative AI companies. However, this explosion in demand has increased backorders for the chips, ultimately delaying the data center builds needed to support the rapidly developing technology.

To support industry growth, NVIDIA alone reported that 1.6 to 2 million H100 GPUs are slated for production in 2024, and AMD also announced the delivery of 300,000 to 400,000 units of its latest AI GPU series, the Instinct MI300, in 2024. An estimated 70 percent of these shipments will land domestically, adding 2.5 to 3.0 gigawatts of data center and power demand pressure on the domestic market, and recent announcements indicate deliveries may be higher.
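As a sanity check on those figures, the shipment counts and the power estimate can be combined to back out the all-in facility power implied per GPU. This is a back-of-the-envelope sketch using only the numbers quoted above. Pairing the low shipment estimate with the low power estimate (and high with high) is an assumption, and the function names are hypothetical.

```python
def domestic_gpu_count(nvidia_units: float, amd_units: float,
                       domestic_share: float = 0.70) -> float:
    """GPUs expected to land domestically, given total shipments and a share."""
    return (nvidia_units + amd_units) * domestic_share


def implied_watts_per_gpu(power_gw: float, gpu_count: float) -> float:
    """Facility-level power per GPU implied by a total power estimate."""
    return power_gw * 1e9 / gpu_count


# Low case: 1.6M NVIDIA + 300K AMD units, paired with 2.5 GW (pairing is an assumption).
low_count = domestic_gpu_count(1.6e6, 300e3)
low_watts = implied_watts_per_gpu(2.5, low_count)

# High case: 2.0M NVIDIA + 400K AMD units, paired with 3.0 GW.
high_count = domestic_gpu_count(2.0e6, 400e3)
high_watts = implied_watts_per_gpu(3.0, high_count)

print(f"Low case:  {low_count:,.0f} GPUs, about {low_watts:,.0f} W per GPU")
print(f"High case: {high_count:,.0f} GPUs, about {high_watts:,.0f} W per GPU")
```

Both cases work out to a little under 2 kW per GPU. That is well above the power draw of the chip alone, which is consistent with the estimate covering servers, networking, and cooling rather than just the accelerators.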

Simply put, neither NVIDIA nor AMD can keep up with the demand pressure that AI has created without also maintaining a robust supply of the key subcomponents underlying GPUs (CoWoS packaging and other elements), which is another challenge data center providers must be prepared for this year. This is also why rumors of a $7 trillion capital raise to effectively double the semiconductor ecosystem linger on.

Addressing Demand and Resource Shortages: Density

To get ahead of resource shortages, both in capacity and in GPUs, and continue to meet exponential AI demand, data center providers will need to do more than simply build more facilities. They must focus on building facilities designed for density and capable of rapidly scaling to meet the capacity and power demands of emerging technology. Density has emerged as a critical element of modern data center design that allows facilities to meet the increasing capacity demands of AI workloads.

On a micro level, density is measured in kilowatts (kW) per rack. Data centers will now be pressured to support about 5-10 times the kW per rack of current designs, meaning next-generation data centers may need to provide well north of 100 kW per rack. NVIDIA’s Blackwell announcement pointed specifically to liquid cooling technologies being part of the baseline specification for deployment. However, focusing only on micro-densities misses the greater point that supporting multiple large clusters is even more important. AI-driven GPU environments involve the coordination of many thousands of GPUs working together as one large supercomputer. On a macro level, providers will look to set up multiple large-capacity clusters of various micro-densities, with total aggregate MW at a campus being of highest value, particularly when single models may require as much as 50 MW.
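To make the micro and macro views concrete, the sketch below estimates how many racks a single 50 MW cluster would occupy at different rack densities. It is a simplification under stated assumptions: all cluster power is treated as rack (IT) load with no allowance for cooling or distribution overhead, and the 10 kW and 20 kW figures are illustrative stand-ins for current designs, inferred from the 5-10x multiple above rather than taken from a source.

```python
import math


def racks_needed(cluster_mw: float, kw_per_rack: float) -> int:
    """Racks required to host a cluster, assuming all power is rack (IT) load."""
    if cluster_mw <= 0 or kw_per_rack <= 0:
        raise ValueError("cluster_mw and kw_per_rack must be positive")
    return math.ceil(cluster_mw * 1000 / kw_per_rack)


CLUSTER_MW = 50  # the single-model cluster size mentioned above

# ASSUMPTION: 10 and 20 kW per rack are illustrative values for current designs.
for kw_per_rack in (10, 20, 100):
    racks = racks_needed(CLUSTER_MW, kw_per_rack)
    print(f"{kw_per_rack:>3} kW per rack: {racks:,} racks for {CLUSTER_MW} MW")
```

At 100 kW per rack the same 50 MW cluster fits in 500 racks instead of several thousand, which is why density and total campus megawatts have to be planned together.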

Big Iron infrastructure such as step-down transformers, backup generators, and air-cooled chillers takes time to source, procure, fabricate, deliver, and commission, and has a greater impact on a data center’s delivery schedule than the building itself. Data center providers are making proactive investments in this infrastructure to bring delivery time frames to within around 24 months (and faster where possible).

The applications of AI are only going to flourish in the coming years, so back-end data center ecosystems need to prepare properly to supply the infrastructure necessary for these workloads. Deploying the infrastructure ahead of time for high-density environments offers hyperscalers a path to support new AI requirements and train multiple generative AI models on one campus, both now and in the future.

About the Author

Tom Traugott leads all activities related to strategy at EdgeCore Digital Infrastructure. While interacting with a broad spectrum of technologists, industry thought leaders, and community stakeholders, Tom ensures that EdgeCore stays at the forefront of innovation, geographic expansion, and future trends to deliver on the company’s promise to provide safe, sustainable, and future-proof data center solutions to the world’s largest cloud and technology companies. Tom’s industry experience began in the post-dot-com early 2000s period and stretches through the rise of enterprise wholesale colocation in the 2000s into today’s hyperscale cloud-driven world. He is closely monitoring the potential for generative AI to define the next era for data centers.




