A significant rumor is circulating that NVIDIA may be changing a key component in its next-generation AI accelerator, the Rubin CPX.
Originally announced with GDDR7 memory, the CPX is now rumored to be pivoting to the much faster and higher-capacity HBM (High Bandwidth Memory). This isn't just a minor tweak; it's a fundamental design change with major implications for the entire AI hardware supply chain. So, why would NVIDIA make such a significant shift?
The primary driver appears to be the relentless growth of AI models. First, we are firmly in the era of 'million-token context windows,' where AI must process vast amounts of information at once. Second, the initial phase of inference, known as 'prefill,' in which the model ingests the entire prompt before generating its first token, becomes a massive bottleneck at these context lengths. Third, while GDDR7 is fast, offering around 2 TB/s of bandwidth, it pales next to HBM3e (around 4.8 TB/s) or the upcoming HBM4. For production-scale AI, the superior bandwidth and on-package capacity of HBM are becoming essential even for specialized accelerators like CPX.
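The bandwidth gap can be made concrete with a rough back-of-envelope sketch. The snippet below estimates how long it takes to stream a hypothetical 100 GB working set (model weights plus a long-context KV cache; this figure is an illustrative assumption, not an NVIDIA specification) once through memory at the article's quoted bandwidths:

```python
# Back-of-envelope: time to stream a working set once at a given bandwidth.
# Handy unit fact: GB divided by (TB/s) comes out directly in milliseconds,
# since 1 GB = (1/1000) TB and (1/1000) s = 1 ms.

def stream_time_ms(working_set_gb: float, bandwidth_tbps: float) -> float:
    """Milliseconds to read `working_set_gb` GB at `bandwidth_tbps` TB/s."""
    return working_set_gb / bandwidth_tbps

# Hypothetical 100 GB working set (assumption for illustration).
working_set = 100.0

gddr7_ms = stream_time_ms(working_set, 2.0)  # ~2 TB/s, per the article
hbm3e_ms = stream_time_ms(working_set, 4.8)  # ~4.8 TB/s, per the article

print(f"GDDR7: {gddr7_ms:.1f} ms per full pass")  # 50.0 ms
print(f"HBM3e: {hbm3e_ms:.1f} ms per full pass")  # ~20.8 ms
```

Even in this crude memory-bound model, HBM3e cuts the per-pass time by more than half, which compounds across the many passes a long prefill requires.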
This potential change comes at a critical time. The AI industry is already grappling with what DeepMind's CEO recently called a memory 'choke point.' Shifting CPX to HBM would introduce a major new consumer into an already sold-out market, intensifying competition for limited supply. Furthermore, HBM requires sophisticated 2.5D advanced packaging technology, such as TSMC's CoWoS, which is also in severe shortage. With NVIDIA already booking the majority of CoWoS capacity for its flagship GPUs, adding CPX to the queue would squeeze supply even further for everyone else.
Ultimately, this rumor, while unconfirmed, aligns perfectly with recent industry trends: the explosion in AI model size and HBM becoming the de facto standard for high-performance computing. If confirmed, this pivot would signal that the memory bottleneck is even more severe than anticipated, forcing major design changes and putting further strain on a fragile supply chain. All eyes are now on NVIDIA's upcoming earnings call and GTC conference for clarification.
- HBM (High Bandwidth Memory): A type of high-performance memory that stacks memory chips vertically to achieve significantly higher bandwidth than traditional memory like GDDR. It's essential for high-end AI accelerators.
- GDDR7 (Graphics Double Data Rate 7): The latest generation of memory typically used in consumer graphics cards. It offers high speeds but generally less bandwidth and capacity than HBM.
- CoWoS (Chip-on-Wafer-on-Substrate): An advanced packaging technology from TSMC that allows multiple chips, like a processor and HBM, to be integrated closely together on a single package, enabling ultra-fast communication between them.