AI Infrastructure & the Compute Arms Race

GPU Clusters, Energy Constraints & the Geopolitics of Compute

April 2026  |  humAIne Research

Executive Summary

  • Hyperscaler capital expenditure on AI infrastructure is approaching $700 billion in 2026, nearly doubling the ~$365 billion spent in 2025.
  • The dominant investment thesis has shifted from training to inference: serving AI models to billions of users in real time now absorbs the vast majority of new spending.
  • Power availability has replaced chip supply as the binding constraint. US data centre energy demand is projected to nearly double between 2025 and 2028.
  • NVIDIA's product cadence has accelerated to roughly six-month cycles, with Vera Rubin NVL72 arriving in H2 2026 and gigawatt-class "AI factories" under construction.
  • The combined 2026 capex of Amazon, Google, Meta, and Microsoft exceeds the GDP of all but the top 20 national economies.

The Spending Surge

$700 Billion in Hyperscaler Capex

2026 Hyperscaler Capital Expenditure

  • Total: ~$700B across the major hyperscalers, up from ~$365B in 2025.
  • Amazon (AWS): $200B (AI chips, robotics, data centres, LEO satellites).
  • Google / Alphabet: $175–185B (nearly doubled from $91B in 2025).
  • Meta: $115–135B ($600B committed to US infrastructure through 2028).
  • Microsoft: ~$145B (Azure, Copilot, and the OpenAI partnership).
  • Oracle: $50B+ (Stargate initiative with OpenAI).

[Chart: Hyperscaler AI Capex, 2025 vs 2026]

Sources: Company earnings reports, S&P Global estimates, analyst projections. 2026 figures are guidance midpoints or estimates.

The Structural Shift

From Training Clusters to Inference Infrastructure

The Pivot from Training to Inference

  1. 2023–24: Training Era. Massive GPU clusters built to train foundation models. The race to 100K-GPU clusters drove NVIDIA past a $3T market cap.
  2. 2025: Transition. Clusters scale to 300K+ GPUs. Enterprise adoption spreads across finance, government, automotive, and education.
  3. 2026: Inference Scale. The majority of investment shifts to inference: serving AI to billions of users in real time.

NVIDIA GPU Roadmap Acceleration

  • 2024: Blackwell GB200 at 400G. 100K-GPU clusters operational.
  • 2025: GB300 at 800G. Blackwell Ultra: 35x throughput, 30x energy efficiency over Hopper.
  • 2026: Vera Rubin NVL72 rack-scale systems. Google Cloud among the first to offer them.
  • 2028: Feynman architecture. Six-month product cadence now established.

Mega-Clusters

  • Oracle Zettascale10: 800,000 GPUs, 16 ZettaFLOPS claimed peak. Computing backbone of OpenAI Stargate (Abilene, TX).
  • Meta's $10B campus in Lebanon, Indiana: 1 GW power consumption.
  • B200 sold out through mid-2026. New buyers face 12–18 month lead times.

Custom Silicon

  • Microsoft: Maia 100 AI accelerator and Cobalt 100 ARM CPU alongside NVIDIA Blackwell.
  • Google: G4 VMs with RTX Pro 6000 Blackwell. Industry-first fractional vGPU support.
  • Amazon: Trainium2 and Graviton4 expanding AWS custom chip portfolio.

The Binding Constraint

Energy, Water & Physical Infrastructure

Power Has Replaced Chips as the Bottleneck

  • A single 10,000-GPU cluster draws 10–15 MW, comparable to a small town.
  • US data centres consume ~4% of national electricity, up from 2% in 2020.
  • Projected to reach 8–12% by 2028, with demand nearly doubling from 80 to 150 GW.
  • Northern Virginia grid wait times: 3–5 years for large interconnections.
  • Training a large model consumes ~50 GWh, enough to power San Francisco for roughly three days.
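The figures above can be sanity-checked with simple arithmetic. A minimal sketch, assuming ~1.2 kW all-in per GPU (including cooling, networking, and power-conversion overhead) and a hypothetical ~100 MW training run over three weeks; these inputs are illustrative assumptions, not vendor specifications:

```python
def cluster_power_mw(num_gpus: int, kw_per_gpu: float = 1.2) -> float:
    """Total facility draw in megawatts for a GPU cluster.

    kw_per_gpu is an assumed all-in figure covering the accelerator plus
    cooling, networking, and power-conversion overhead.
    """
    return num_gpus * kw_per_gpu / 1000

def training_energy_gwh(power_mw: float, days: float) -> float:
    """Energy consumed by a training run at constant power, in GWh."""
    return power_mw * 24 * days / 1000

print(cluster_power_mw(10_000))      # 12.0 MW, within the 10-15 MW range cited
print(training_energy_gwh(100, 21))  # ~50 GWh for a 100 MW run over 3 weeks
```

At these assumptions, a 10,000-GPU cluster lands at 12 MW and a three-week 100 MW run at roughly 50 GWh, consistent with the figures quoted above.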

Key Figures

  • 80→150 GW: projected US data centre demand, 2025–2028
  • 3–5 yrs: grid interconnection wait, Northern Virginia
  • 5M gal/day: water consumption at a large AI campus
  • 500 ml: water per 100-word prompt

Cooling, Density & Data Centre Design

Liquid Cooling

Mainstream adoption driven by GPU thermal requirements. Rear-door chillers and liquid-to-chip solutions replacing air cooling. Cable routing now driven by thermal constraints.

Rack Density

Racks are populated based on what facilities can realistically supply, not theoretical limits. Many sites cannot power fully populated GPU racks, and physical design is reshaping around power density.

AI-First Architecture

Purpose-built modular data centres replacing retrofits. 74% of organisations prefer hybrid cloud. AI back-end networks generating dense east-west traffic at unprecedented scale.

Geopolitics of Compute

Sovereignty, Supply Chains & Regional Dynamics

Regional Dynamics & Sovereign AI

North America: Dominant

  • Leads global AI buildout by a wide margin. Majority of mega-clusters in the US.
  • Stargate (Oracle/OpenAI, Abilene TX) as flagship sovereign-scale project.
  • Meta's $600B US infrastructure commitment through 2028.
  • Constraint: energy availability, not demand or capital.

Rest of World: Accelerating

  • Asia-Pacific: strong momentum, second behind North America.
  • Europe: lagging due to energy costs, permitting timelines, fragmented policy.
  • Sovereign AI strategies driving parallel buildout for domestic compute capacity.
  • 2026 brings rising retrofit activity as enterprises integrate GPUs into existing facilities.

GPU-Backed Lending & Capital Structures

| Financing Model | Structure | Typical Deal Size | Key Feature |
| --- | --- | --- | --- |
| Direct capex | Cash reserves / balance sheet | $50M–$500M+ | Full ownership, no leverage |
| Asset-backed lending | GPUs as collateral in SPV | $5M–$500M+ | 60–80% LTV, non-recourse |
| Cloud rental | Reserved instances / on-demand | Variable | No upfront capex, opex model |
| Strategic co-investment | Chip vendor co-funds buildout | $1B+ | NVIDIA's $2B in CoreWeave |
| Bank equipment finance | Traditional term loan | $50M+ minimum | 40–50% LTV, 60–90 day approval |

Investment Implications

Where Value Accrues in the Compute Stack

Landscape Summary

| Dimension | Current State (2026) | Direction of Travel |
| --- | --- | --- |
| Primary bottleneck | Energy, not chips | Grid wait times 3–5 years in key markets |
| GPU supply | B200 sold out through mid-2026 | Vera Rubin NVL72 arriving H2 2026 |
| Cluster scale | 100K-GPU clusters; 300K+ building | Gigawatt-class AI factories |
| Cooling | Liquid cooling mainstream | Density governed by power supply |
| Financing | GPU-backed SPV lending emerging | $5M–$500M+ deals, 60–80% LTV |
| Free cash flow | Microsoft FCF down ~28% in 2026 | Expected recovery 2027 |

What to Watch: Key Indicators

Energy Availability

Grid interconnection timelines in key data centre markets. Utility capex plans for AI-driven demand.

GPU Supply / Pricing

B200/Vera Rubin lead times. Secondary market premiums. Custom silicon adoption rates.

Hyperscaler FCF

Whether capex translates into revenue growth. Azure, AWS, GCP margin trajectories.

Inference Economics

API pricing trends. Tokens served per dollar. Inference-to-training spend ratio.
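The "tokens served per dollar" lens above reduces to simple unit economics. A minimal sketch, assuming a hypothetical $2.00/hour GPU sustaining 1,000 tokens/second at full utilisation; both inputs are illustrative assumptions, not vendor figures:

```python
def cost_per_million_tokens(gpu_hour_cost: float, tokens_per_second: float) -> float:
    """Serving cost per million tokens for one GPU at full utilisation."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_hour_cost / tokens_per_hour * 1_000_000

def tokens_per_dollar(gpu_hour_cost: float, tokens_per_second: float) -> float:
    """Tokens served per dollar of GPU time."""
    return tokens_per_second * 3600 / gpu_hour_cost

# A hypothetical $2.00/hr GPU sustaining 1,000 tokens/s:
print(round(cost_per_million_tokens(2.0, 1000), 3))  # ~0.556 dollars per million tokens
```

Comparing this serving cost against prevailing API prices, and tracking how both move as new silicon lands, is one concrete way to watch the inference-economics indicator.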

Sovereign AI Policy

National compute strategies. Export controls on advanced chips. EU digital sovereignty initiatives.

Cooling Technology

Liquid cooling adoption curves. Data centre REIT capex on retrofit. PUE trends.

Conclusion & Takeaways

  • The AI infrastructure buildout is the largest coordinated capital expenditure programme in technology history. At ~$700B in 2026, it dwarfs prior cycles in cloud, mobile, and fibre.
  • The shift from training to inference changes the investment thesis: inference workloads are more distributed, more predictable, and more directly tied to revenue.
  • Energy is the binding constraint. Companies solving power, cooling, and grid access will capture outsized value in this cycle.
  • GPU financing is creating a new asset class. Asset-backed lending with GPUs as collateral is emerging at scale, with non-recourse SPV structures.
  • The near-term financial strain is real: Microsoft's projected 28% FCF decline illustrates the tension. The question is whether long-term inference revenue justifies the capital at risk.