A significant shift in AI computing architecture, long discussed in theory, is now becoming a reality. KAIST Professor Kim Jung-ho, often called the “father of HBM,” recently warned that the era of GPU dominance is ending, predicting a future where “memory swallows the GPU.” This isn't just a bold prediction; it's a trend supported by concrete developments in the semiconductor industry.
The primary driver behind this change is the rise of agentic AI. Unlike previous AI models, these advanced systems require vast amounts of data to be accessed instantly and persistently, creating a bottleneck not in computation speed but in memory bandwidth and capacity. Professor Kim’s vision addresses this by proposing a new memory hierarchy: ultra-fast HBM (High Bandwidth Memory) acts as short-term memory, while new HBF (High Bandwidth Flash) serves as a vast, long-term memory library. In this model, GPUs and CPUs become smaller components embedded within a powerful, memory-centric system.
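The two-tier idea can be sketched in software terms as a small, fast hot store backed by a large, cheap cold store. This is a toy illustration only (real tiering would happen in hardware and the runtime, not application code); the class name, capacities, and promotion-on-access policy are assumptions for the sketch:

```python
from collections import OrderedDict

class TieredStore:
    """Toy two-tier store: a small 'HBM-like' hot tier (LRU) backed by a
    large 'HBF-like' cold tier. Illustrative sketch, not a real design."""

    def __init__(self, hot_capacity):
        self.hot = OrderedDict()   # small, fast tier, kept in LRU order
        self.cold = {}             # large, cheap capacity tier
        self.hot_capacity = hot_capacity

    def put(self, key, value):
        self.hot[key] = value
        self.hot.move_to_end(key)
        # Spill least-recently-used entries down to the cold tier.
        while len(self.hot) > self.hot_capacity:
            old_key, old_val = self.hot.popitem(last=False)
            self.cold[old_key] = old_val

    def get(self, key):
        if key in self.hot:        # fast path: already in the hot tier
            self.hot.move_to_end(key)
            return self.hot[key]
        if key in self.cold:       # slow path: promote on access
            return_value = self.cold.pop(key)
            self.put(key, return_value)
            return return_value
        raise KeyError(key)
```

Accessing a cold entry promotes it into the hot tier and spills the least-recently-used hot entry downward, mirroring the "short-term vs. long-term memory" split described above.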
This vision is credible today for several key reasons. First, the supply chain for the most advanced memory is finally robust. Samsung has begun commercial shipments of HBM4, joining SK hynix and Micron to form a stable, three-supplier ecosystem for major customers like Nvidia. This de-risks the development of next-generation AI accelerators, such as Nvidia's upcoming Rubin platform, which can now be designed with a heavy reliance on a steady supply of HBM4.
Second, the concept of a larger, second-tier memory is materializing. The collaboration between SanDisk and SK hynix to standardize HBF provides a clear path for this technology. As HBM becomes more expensive and remains in short supply, it becomes economically sensible to offload less frequently accessed data to a cheaper, high-capacity HBF layer. This tiered approach optimizes both performance and cost for demanding agentic AI workloads.
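The cost argument can be made concrete with back-of-the-envelope arithmetic. The per-GB prices and the hot-data fraction below are purely hypothetical placeholders, not market data; only the structure of the calculation matters:

```python
# Hypothetical per-GB prices -- illustrative placeholders, not real pricing.
hbm_per_gb = 20.0   # fast HBM tier, $/GB (assumed)
hbf_per_gb = 2.0    # high-capacity HBF tier, $/GB (assumed)

total_gb = 1000
hot_fraction = 0.1  # assumed share of data that must stay in the fast tier

# Blended cost of a tiered system vs. putting everything in HBM.
tiered_cost = total_gb * (hot_fraction * hbm_per_gb
                          + (1 - hot_fraction) * hbf_per_gb)
all_hbm_cost = total_gb * hbm_per_gb

print(tiered_cost)    # 3800.0
print(all_hbm_cost)   # 20000.0
```

Under these assumed numbers, the tiered layout costs a fraction of the all-HBM layout while keeping the frequently accessed 10% of data in the fast tier, which is the economic logic behind offloading to HBF.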
Finally, the financial markets are already signaling this shift. In 2026, shares of memory leaders like Samsung and SK hynix have significantly outperformed those of GPU-focused companies. This capital rotation suggests that investors believe memory suppliers will hold greater pricing power and capture more value as the AI hardware paradigm moves from compute-centric to memory-centric. The pieces are in place for a fundamental re-architecting of AI hardware.
- HBM (High Bandwidth Memory): A type of high-performance RAM that stacks memory chips vertically to achieve significantly higher bandwidth than conventional memory, essential for training large AI models.
- HBF (High Bandwidth Flash): An emerging standard for a new tier of memory based on flash technology, designed to offer much higher capacity than HBM at a lower cost, acting as a “long-term memory” for AI.
- Agentic AI: Advanced AI systems that can autonomously reason, plan, and execute complex, multi-step tasks to achieve a goal, requiring large and persistent memory to maintain context and learn over time.
