AI is redefining what chips should look like, and chips in turn determine how far AI can go. I see three trends:

1. From general-purpose to specialized. Until now, everyone ran AI on general-purpose GPUs, but it turns out that large-model training, inference, and edge deployment call for completely different chip design logic. I think we will see more dedicated AI chips over the next 3-5 years: training chips need raw compute, inference chips need power efficiency, and edge chips need low latency. Nvidia is already differentiating its product lines, with the H series for training and the L series for inference, and domestic companies like Biren and Suiyuan are also looking for differentiated positioning. In the future there won't be a single dominant player; instead we'll see a landscape with "kings in training, champions in inference, and a crowd of contenders on the edge."

2. In-memory computing breaks through the memory wall. The biggest bottleneck for large models today is not compute but data movement: the chip constantly reads data from memory, performs calculations, and writes results back, which is slow and power-hungry. In-memory computing fuses compute and storage so data no longer has to shuttle back and forth, and if the technology breaks through, the impact on AI will be huge. (A back-of-envelope sketch of the memory wall follows this list.) Tsinghua University, the Chinese Academy of Sciences, and several startups are all working in this direction. If in-memory computing chips reach mass production in the next 3-5 years, large-model inference costs could drop by an order of magnitude, making many applications feasible that are impossible to build today.

3. Chips and algorithms are optimized together. Algorithm engineers used to write code while chip engineers built chips, each doing their own thing. Now many companies practice co-design: the algorithm is written with the chip's characteristics in mind, and the chip is tuned for the algorithm. Apple is one example; its Neural Engine and iOS AI features are designed together, which is why AI models run so smoothly on the iPhone. Tesla's FSD chip is the same, customized for its autonomous-driving algorithms. Domestically, Huawei is arguably furthest along here: the Ascend chips are integrated with the Pangu large models and the HarmonyOS system. Going forward, this integrated hardware-software capability will become a core competitive advantage.
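To make the memory wall in point 2 concrete, here is a minimal back-of-envelope sketch in Python (not from the original thread). It compares the arithmetic intensity of single-batch LLM token decoding against an assumed accelerator's compute-to-bandwidth ratio; the peak-FLOPS and bandwidth figures and the 70B parameter count are illustrative assumptions, not vendor specs.

```python
# Back-of-envelope sketch of the "memory wall" for LLM inference.
# All hardware numbers below are illustrative assumptions.

PEAK_FLOPS = 1000e12    # assumed 1000 TFLOPS of low-precision compute
MEM_BANDWIDTH = 3.3e12  # assumed 3.3 TB/s of memory bandwidth

# Machine balance: FLOPs the chip can perform per byte it moves.
machine_balance = PEAK_FLOPS / MEM_BANDWIDTH  # ~300 FLOPs/byte

def decode_intensity(n_params: float, bytes_per_weight: float = 2.0) -> float:
    """Arithmetic intensity of decoding one token at batch size 1.

    Each generated token reads every weight once (~n_params * bytes each)
    and performs ~2 * n_params FLOPs (one multiply-add per weight).
    """
    flops = 2.0 * n_params
    bytes_moved = n_params * bytes_per_weight
    return flops / bytes_moved  # ~1 FLOP/byte at fp16

params = 70e9  # hypothetical 70B-parameter model
intensity = decode_intensity(params)

print(f"machine balance:     {machine_balance:6.0f} FLOPs/byte")
print(f"decode intensity:    {intensity:6.1f} FLOPs/byte")
print(f"compute utilization: {intensity / machine_balance:.1%}")
# With intensity ~1 vs. balance ~300, decoding is >99% memory-bound:
# the ALUs sit idle while weights stream in from DRAM. Compute-in-memory
# attacks exactly this ratio by doing the multiply where the weight lives.
```

Under these assumed numbers the compute units are utilized well under 1% of the time during decoding, which is why the thread's claim that "the bottleneck is data movement, not compute" holds for this workload regardless of how much raw FLOPS the chip adds.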