China's artificial intelligence sector has reached a significant milestone, with daily large-model token consumption surpassing 140 trillion by March 2026.
This number is staggering: it represents a more than 1,000-fold increase from early 2024. To put it in perspective, 140 trillion tokens per day works out to about 1.62 billion tokens every second. Sustaining this level of activity requires immense computing power, estimated to be in the range of tens of EFLOPS. This demonstrates that China's AI usage is no longer experimental but has entered an era of industrial-scale application.
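A quick back-of-envelope check makes these figures concrete. The sketch below recomputes the per-second rate and then estimates the implied compute load, assuming the common rule of thumb of roughly 2 FLOPs per parameter per generated token for a dense transformer; the 10-billion-parameter 'average model' is purely an illustrative assumption, not a figure from the text.

```python
# Back-of-envelope check of the throughput and compute figures.
# The 2-FLOPs-per-parameter-per-token rule and the 10B average model
# size are illustrative assumptions, not figures from the article.

DAILY_TOKENS = 140e12            # 140 trillion tokens per day
SECONDS_PER_DAY = 24 * 60 * 60

tokens_per_second = DAILY_TOKENS / SECONDS_PER_DAY
print(f"Tokens per second: {tokens_per_second:.3g}")          # ~1.62e9

# Rough inference cost for a dense transformer: ~2 FLOPs per
# parameter per generated token (forward pass only).
FLOPS_PER_PARAM_PER_TOKEN = 2
ASSUMED_AVG_PARAMS = 10e9        # hypothetical 10B-parameter average

total_flops = tokens_per_second * FLOPS_PER_PARAM_PER_TOKEN * ASSUMED_AVG_PARAMS
print(f"Sustained compute: {total_flops / 1e18:.0f} EFLOPS")  # ~32 EFLOPS
```

Under these assumptions the requirement lands in the low tens of EFLOPS, consistent with the estimate above; a heavier average model would push the figure higher.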
So, what caused this explosive growth? We can trace it back to three key factors. First is the supportive yet supervised policy environment. China's 'Interim Measures for Generative AI Services,' effective from August 2023, established a clear regulatory pathway. Hundreds of AI models have since been officially filed and deployed for public use, laying a broad foundation for widespread adoption.
Second, there was a massive investment in digital infrastructure. The national strategy known as 'East Data, West Computing' has been crucial. By building large-scale data centers in western regions, where energy is cheaper, and routing computing workloads there from the data-heavy east, China has significantly expanded its computing capacity while lowering operational costs and improving energy efficiency. This provided the physical backbone needed to handle the surge in token traffic.
Finally, a ferocious price war among tech giants acted as a powerful catalyst. In May 2024, Alibaba slashed its API prices by up to 97%, with competitors such as ByteDance making similarly deep cuts in the same period. This price collapse made AI models accessible to a much wider range of developers and businesses, and because demand for inference proved highly price-elastic, usage surged. The 'token' has effectively become a commoditized unit of compute.
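To see why a cut of this depth matters, the sketch below works through the arithmetic with a hypothetical starting price and a hypothetical demand elasticity; neither number is a quoted figure.

```python
# Illustrative arithmetic for a 97% API price cut. The starting
# price and the elasticity are made-up values for demonstration.

old_price = 0.020                 # yuan per 1,000 tokens (assumed)
new_price = old_price * (1 - 0.97)
print(f"New price: {new_price:.4f} yuan per 1k tokens")     # 0.0006

# The same budget now buys ~33x more tokens.
print(f"Volume multiplier: {old_price / new_price:.1f}x")   # ~33.3x

# Constant-elasticity demand sketch, Q = k * P**e. With |e| > 1,
# a price drop raises both volume and total revenue.
e = -1.2                          # assumed elasticity, illustration only
price_ratio = new_price / old_price
print(f"Demand multiplier: {price_ratio ** e:.0f}x")        # ~67x
print(f"Revenue multiplier: {price_ratio ** (1 + e):.1f}x") # ~2.0x
```

The key point is in the last two lines: whenever demand is price-elastic, volume grows faster than the price falls, which is exactly the pattern the token statistics suggest.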
In conclusion, the combination of a clear regulatory framework, robust infrastructure, and aggressive price competition has propelled China's 'token economy' to its current scale. The future trajectory now depends heavily on overcoming physical constraints, namely the availability of advanced AI accelerators and the supply of electricity to power them.
Glossary
- Token: The basic unit of data that AI models process. For English text, one token is roughly equivalent to 4 characters, or about three-quarters of a word (see the sketch after this glossary).
- EFLOPS (Exaflops): A unit of computing speed equal to one quintillion (10^18) floating-point operations per second, commonly used to describe supercomputers and large AI clusters.
- PUE (Power Usage Effectiveness): The ratio of a data center's total energy consumption to the energy delivered to its IT equipment. A value closer to 1.0 indicates a more energy-efficient facility.
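The first and third entries lend themselves to a minimal sketch. The code below applies the 4-characters-per-token heuristic and the standard PUE ratio; every input value is made up for illustration.

```python
# Minimal sketch of the glossary heuristics above. All inputs are
# made-up illustrative values, not measurements from the article.

def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token count via the ~4-characters-per-token heuristic
    for English text; real tokenizers vary by model and language."""
    return round(len(text) / chars_per_token)

def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power Usage Effectiveness: total facility energy divided by
    the energy delivered to IT equipment; 1.0 is the ideal floor."""
    return total_facility_kwh / it_equipment_kwh

sample = "China's token economy has entered industrial scale."
print(estimate_tokens(sample))           # ~13 tokens for 51 characters
print(pue(total_facility_kwh=1_300.0,    # hypothetical facility total
          it_equipment_kwh=1_000.0))     # hypothetical IT load -> 1.3
```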
