Google has just unveiled its next-generation AI accelerators, a move that signals a significant shift in both its AI hardware strategy and the global supply chain. This new generation is the first to be split into two distinct chips: the TPU 8t, built for training large models, and the TPU 8i, optimized for running them, a process known as inference.
The key reason for this split is a fundamental change in the AI landscape. Until now, the primary focus has been on 'training'—the computationally intensive process of teaching an AI model with vast amounts of data. However, the economic bottleneck is rapidly shifting to 'inference,' which is the act of using the trained model to generate answers, create images, or make predictions. Since inference happens far more frequently than training, making it faster and cheaper is now the top priority for major tech companies.
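To make that distinction concrete, here is a minimal sketch in JAX, Google's own numerical library for TPUs. The toy model, shapes, and numbers are illustrative assumptions, not Google's actual workloads; the point is that a training step runs a forward pass, a backward pass, and a weight update, while inference runs the forward pass alone, and at serving scale those forward passes are executed vastly more often.

```python
# Hypothetical toy example: one training step vs. one inference call in JAX.
import jax
import jax.numpy as jnp

def predict(params, x):
    # Forward pass: a single linear layer with a tanh nonlinearity.
    w, b = params
    return jnp.tanh(x @ w + b)

def loss(params, x, y):
    # Mean squared error between predictions and targets.
    return jnp.mean((predict(params, x) - y) ** 2)

@jax.jit
def train_step(params, x, y, lr=0.1):
    # Training: forward pass + backward pass (gradients) + weight update.
    grads = jax.grad(loss)(params, x, y)
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)

# Inference: the forward pass only -- no gradients, no parameter updates.
infer = jax.jit(predict)

params = (jax.random.normal(jax.random.PRNGKey(0), (4, 2)), jnp.zeros(2))
x = jax.random.normal(jax.random.PRNGKey(1), (8, 4))
y = jnp.ones((8, 2))

params = train_step(params, x, y)  # runs comparatively rarely
preds = infer(params, x)           # runs on every user request
```

Training happens a bounded number of times per model; the inference path runs on every query, which is why a chip tuned for that forward pass alone can dominate the overall cost equation.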
To achieve this, Google is turning to ASICs, custom-designed chips tailored to these specific tasks. This is where the supply chain story begins. First, to avoid relying on a single chip designer, Google is reportedly in talks with Marvell in addition to its long-standing partner Broadcom. This multi-sourcing strategy gives Google more bargaining power and protects it from production bottlenecks, especially as key components like high-bandwidth memory (HBM) remain in short supply.
Second, this strategic shift directly benefits Taiwanese manufacturers. Because the TPU 8t and 8i require new, specialized server designs, Google is expanding its partnerships with experienced assembly firms. Companies like Inventec and Foxconn are set to handle a larger portion of motherboard production and final server assembly. Reports suggest Inventec's share of certain components could triple, indicating a deliberate move to diversify production beyond existing partners.
In essence, Google's new chip strategy is a direct response to the evolving demands of the AI era. By optimizing for inference and building a more resilient, multi-sourced supply chain, the company is preparing for a future where the efficiency of running AI is just as important as the power to build it.
- ASIC (Application-Specific Integrated Circuit): A type of chip designed for a single, specific purpose, like running AI models, making it much more efficient for that task than a general-purpose chip like a CPU or GPU.
- Inference vs. Training: Training is the process of 'teaching' an AI model using massive datasets. Inference is the process of 'using' the trained model to perform a task, such as answering a question or generating an image.
- ODM/EMS (Original Design Manufacturer / Electronics Manufacturing Service): Companies that design and/or manufacture products and components for other companies. For example, Foxconn is an EMS that assembles iPhones for Apple.
