NVIDIA CEO Jensen Huang's recent statement has made it clear: the primary bottleneck for the AI revolution is no longer the GPU chip itself, but the specialized memory required to power it.
This shift is driven by the growing appetite of new AI accelerators for memory. First, the amount of High Bandwidth Memory (HBM) packed into each new GPU is rising sharply. For example, NVIDIA's H100 shipped with 80 GB of HBM, while the B200 carries 192 GB, a 140% increase. Second, the number of AI servers being deployed is growing quickly, with projections showing over 28% year-over-year growth in 2026. Combine more servers with more memory per GPU and the two growth rates multiply rather than add, producing a surge in demand for HBM.
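The compounding effect described above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it makes the simplifying assumption that every newly deployed server moves from H100-class to B200-class memory within the same period, which overstates how quickly a real fleet transitions.

```python
def percent_increase(old: float, new: float) -> float:
    """Percentage increase going from `old` to `new`."""
    return (new - old) / old * 100.0


def combined_growth(unit_growth_pct: float, per_unit_growth_pct: float) -> float:
    """Total demand growth (%) when unit count and content per unit both grow.

    Growth factors multiply: (1 + a) * (1 + b) - 1.
    """
    factor = (1 + unit_growth_pct / 100.0) * (1 + per_unit_growth_pct / 100.0)
    return (factor - 1) * 100.0


# Figures quoted in the article.
H100_HBM_GB = 80
B200_HBM_GB = 192
SERVER_GROWTH_PCT = 28.0  # projected year-over-year server growth for 2026

per_gpu_growth = percent_increase(H100_HBM_GB, B200_HBM_GB)

# Assumption (not from the article): the whole 28% larger fleet uses
# B200-class memory, so both growth rates apply at once.
total_growth = combined_growth(SERVER_GROWTH_PCT, per_gpu_growth)

print(f"HBM per GPU growth:   {per_gpu_growth:.1f}%")
print(f"Combined HBM demand:  {total_growth:.1f}%")
```

Under that assumption the combined figure is well above the sum of the two individual rates, which is the point of the paragraph: demand scales with the product of server count and memory per GPU.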
On the supply side, manufacturers are struggling to keep up. This isn't just an NVIDIA problem; it's an industry-wide constraint acknowledged by the key players themselves. SK hynix's chairman predicted the AI memory shortage could last until 2030, even with plans to double production capacity. Samsung has also warned of 'significant shortages' lasting through 2027. The challenge goes beyond making the memory chips: it extends to advanced packaging technology such as TSMC's CoWoS, which is needed to integrate the HBM stacks with the GPU. This packaging capacity remains a critical choke point.
The financial markets have been signaling this reality for months. While NVIDIA's stock performance has been impressive, the returns for memory makers have been even more remarkable. Between late 2025 and mid-2026, share-price gains for SK hynix, Micron, and Samsung far outpaced NVIDIA's. This suggests investors believe the memory suppliers, not just the GPU designers, will capture a significant portion of the value in the AI supply chain, because they control the scarcest resource.
Geopolitical factors add another layer of complexity, particularly U.S. export controls on advanced AI chips and memory. These regulations can create uncertainty and re-route the limited supply, further tightening the market for everyone. In essence, the pace of AI development is now dictated less by NVIDIA's design prowess and more by the manufacturing capacity of a handful of memory and packaging companies. The world needs more HBM and more CoWoS capacity, and until that capacity arrives, the supply constraint will remain a central theme.
- HBM (High Bandwidth Memory): A type of high-performance RAM that uses stacked silicon dies to achieve a much wider memory interface and higher bandwidth, essential for AI accelerators.
- CoWoS (Chip-on-Wafer-on-Substrate): An advanced packaging technology developed by TSMC that allows multiple chips, such as GPUs and HBM stacks, to be integrated side-by-side on a silicon interposer, enabling high-speed communication between them.
- AI Accelerator: Specialized hardware, like a GPU (Graphics Processing Unit) or a custom ASIC, designed to dramatically speed up the mathematical operations required for artificial intelligence tasks.
