The ongoing AI revolution is causing a significant memory shortage, but the story is now expanding beyond the well-known HBM chips used for GPUs.
A major shift is underway in how AI systems are built and used. Until recently, the primary focus was on training massive AI models. This process is like teaching a student the contents of an entire library; it requires immense computing power, supplied by many GPUs working together and managed by just a few CPUs. In this setup, the critical memory was HBM (High Bandwidth Memory), which is tightly integrated with GPUs.
However, the industry is now moving into the 'agentic era,' where the focus is on inference—the stage where the trained AI actually performs tasks. Think of it as the student, now graduated, applying their knowledge to solve real-world problems. Recent announcements from tech giants like Intel and Google confirm that this inference stage requires a different computer architecture. Instead of one CPU managing eight GPUs, the ratio is shifting closer to one-to-one. The CPU's role evolves from a simple manager to an 'orchestrator,' coordinating complex tasks across the system.
This architectural change has a profound impact on memory demand. First, increasing the number of CPUs by up to eight times per system naturally multiplies the demand for the memory they use: DDR5 DRAM. Unlike HBM, which is specialized for GPUs, DDR5 is the standard workhorse memory for modern CPUs. A server CPU can have up to 3 terabytes of DDR5 attached, so more CPUs directly translate into a huge increase in DDR5 demand.
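The arithmetic behind this claim can be made concrete with a back-of-envelope sketch. The figures below (8 GPUs per server, 3 TB of DDR5 per CPU, a 1,000-server fleet) are illustrative assumptions drawn from the ratios mentioned above, not vendor specifications:

```python
# Back-of-envelope sketch: how the CPU-to-GPU ratio shift changes DDR5 demand.
# All figures are illustrative assumptions, not vendor specifications.

GPUS_PER_SERVER = 8   # a typical training server pairs 8 GPUs with 1 CPU
DDR5_PER_CPU_TB = 3   # upper bound cited for DDR5 attached to one server CPU


def ddr5_demand_tb(cpus_per_gpu: float, servers: int) -> float:
    """Total DDR5 (in TB) attached across a fleet, for a given CPU:GPU ratio."""
    cpus_per_server = GPUS_PER_SERVER * cpus_per_gpu
    return cpus_per_server * DDR5_PER_CPU_TB * servers


fleet = 1_000  # hypothetical fleet size

training = ddr5_demand_tb(1 / 8, fleet)  # old ratio: 1 CPU per 8 GPUs
inference = ddr5_demand_tb(1.0, fleet)   # new ratio: roughly 1 CPU per GPU

print(f"Training-era fleet:  {training:,.0f} TB of DDR5")
print(f"Inference-era fleet: {inference:,.0f} TB of DDR5")
print(f"Multiplier: {inference / training:.0f}x")
```

Under these assumptions the same fleet goes from one CPU per server to eight, so attached DDR5 scales by the same factor of eight, which is the multiplication the paragraph above describes.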
Second, this demand surge is happening at a challenging time. Memory manufacturers like Samsung and SK hynix are already diverting their production capacity to create the highly profitable HBM chips that AI training still requires. This strategic shift creates a supply squeeze for conventional DDR5, just as demand from new inference-focused servers is taking off.
As a result, we're seeing a classic supply-and-demand crunch. Market analysts at TrendForce are forecasting a dramatic price increase for DRAM, and the memory makers themselves have stated that they expect this tightness to persist into 2027, the earliest that significant new production capacity is expected to come online. The AI boom isn't just an HBM story anymore; it's now engulfing the entire DRAM market.
- Glossary
- Inference: The process of using a trained AI model to make predictions or perform tasks based on new data. It's the 'execution' phase, as opposed to the 'learning' or training phase.
- DDR5 / HBM: Both are types of DRAM (memory). DDR5 is the standard, high-performance memory used by CPUs in servers and PCs. HBM (High Bandwidth Memory) is a more specialized, ultra-high-performance memory stacked vertically and placed very close to a processor, typically a GPU, for maximum speed.
- Agentic AI: A more advanced form of AI that can proactively take actions, make decisions, and perform multi-step tasks to achieve a goal, acting like an autonomous 'agent'.
