At the GTC 2026 conference came an announcement that could reshape the AI hardware landscape: Groq's third-generation Language Processing Unit (LPU) will be manufactured by Samsung Foundry on its 4nm process, with shipments starting in the third quarter of 2026.
So, what exactly is an LPU? Think of it as a highly specialized processor designed for one thing: AI inference. While GPUs excel at training AI models, inference is the process of actually using those models to get answers, such as powering a chatbot. Groq's LPU is engineered to do this incredibly fast. Its secret weapon is on-die SRAM instead of the more common HBM (High Bandwidth Memory). This design allows for extremely low latency, meaning near-instantaneous responses.
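The latency argument above can be made concrete with a rough, memory-bound model of autoregressive decoding: generating each token requires streaming roughly all model weights through the processor, so time per token is dominated by memory bandwidth. All figures in this sketch (model size, bandwidth numbers) are illustrative assumptions for the sake of the arithmetic, not Groq or Samsung specifications, and it ignores that in practice weights are spread across many chips:

```python
# Back-of-envelope sketch: why memory bandwidth dominates inference latency.
# All numbers below are illustrative assumptions, not vendor specifications.

def time_per_token_ms(model_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Autoregressive decoding is roughly memory-bound: each generated
    token requires reading (approximately) all model weights once."""
    return model_bytes / bandwidth_bytes_per_s * 1000

MODEL_BYTES = 70e9   # assumed: a 70B-parameter model with 8-bit weights
HBM_BW      = 3e12   # assumed: ~3 TB/s, ballpark for a modern HBM subsystem
SRAM_BW     = 80e12  # assumed: ~80 TB/s aggregate on-die SRAM bandwidth

print(f"HBM:  {time_per_token_ms(MODEL_BYTES, HBM_BW):.1f} ms/token")
print(f"SRAM: {time_per_token_ms(MODEL_BYTES, SRAM_BW):.2f} ms/token")
```

Under these assumed numbers, the on-die design is more than an order of magnitude faster per token, which is the intuition behind the low-latency claim, even though the real picture (multi-chip scaling, compute limits, batching) is more complicated.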
This architectural choice is a brilliant strategic move. For the past couple of years, the entire AI industry has been constrained by a severe bottleneck in the supply of HBM and of the advanced packaging technologies, such as CoWoS, needed to assemble it. This shortage matters for Groq in two ways. First, it has created intense demand for alternative solutions. Second, because its chip doesn't rely on HBM, Groq can sidestep the bottleneck entirely, potentially delivering powerful inference chips to customers faster and more reliably.
Furthermore, Samsung's role as the manufacturer is highly significant. For a long time, Taiwan's TSMC has been the undisputed leader in advanced chip manufacturing. Groq's decision to partner with Samsung, a collaboration that officially began back in 2023, is a major win for the Korean tech giant. It validates Samsung's efforts to compete at the highest level, especially as it has been offering aggressive pricing to attract major clients like Groq.
Finally, this isn't just Groq's story; it's also about NVIDIA. In a February 2026 earnings call, NVIDIA revealed it had licensed Groq's low-latency technology. This move suggests NVIDIA is evolving its strategy. Instead of relying solely on its own GPUs, it's building a more diverse ecosystem. The GTC announcement confirms this vision of a hybrid "GPU+LPU" stack, where each processor is used for the task it's best suited for. This event, therefore, is more than a new product launch; it's a signal of diversifying supply chains, intensifying foundry competition, and a more sophisticated future for AI computing.
A few key terms:
- LPU (Language Processing Unit): A specialized processor designed specifically for running AI language models very quickly (inference).
- SRAM (Static Random-Access Memory): A type of very fast memory located directly on the chip. It's faster than external memory like HBM but typically available in smaller capacities.
- Foundry: A company that manufactures semiconductor chips for other companies that design them, like Samsung making chips for Groq.
