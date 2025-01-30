Chinese tech giant Alibaba is also feeling pressure after Chinese startup DeepSeek released an AI model that triggered an uproar in the West because DeepSeek claimed to have trained the model, comparable in capabilities to advanced Western models, at a fraction of the cost and with far fewer AI chips.

Alibaba released a new AI language model, Qwen 2.5-Max, on Jan. 28, a day before the Chinese New Year, when the country’s economy traditionally shuts down for 15 days.

Qwen 2.5-Max is a mixture-of-experts (MoE) model pre-trained on 20 trillion tokens and post-trained with curated supervised fine-tuning and reinforcement learning from human feedback.

MoE is a technique in which a model is structured with multiple “minds” and each mind is compartmentalized so that whenever there is a query, the model uses adaptive routing to go to the specific mind, or region, that has the answer. For example, if a model is geared towards coding, the model routes queries to that mind.
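The routing idea can be sketched in a few lines of Python. This is a toy illustration, not Alibaba’s or DeepSeek’s implementation: the gate weights, the four stand-in "experts" and the `top_k` value are all assumptions chosen for readability. In published MoE designs, routing is typically done per token by a small learned gating network, and only the selected experts run.

```python
import math

def softmax(scores):
    """Turn raw gate scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route(token, gate_weights, top_k=2):
    """Pick the top_k experts for one token; return [(expert_index, weight), ...]."""
    # One score per expert: dot product of the token with that expert's gate vector.
    scores = [sum(t * w for t, w in zip(token, row)) for row in gate_weights]
    probs = softmax(scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    # Renormalize so the selected experts' weights sum to 1.
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]

def moe_layer(token, gate_weights, experts, top_k=2):
    """Run only the selected experts and blend their outputs by gate weight."""
    routed = route(token, gate_weights, top_k)
    output = [0.0] * len(token)
    for index, weight in routed:
        expert_out = experts[index](token)  # the unselected experts never execute
        output = [o + weight * e for o, e in zip(output, expert_out)]
    return output, [index for index, _ in routed]

# Hypothetical setup: four toy experts that simply scale their input.
EXPERTS = [lambda x, s=s: [s * v for v in x] for s in (1.0, 2.0, 3.0, 4.0)]
# Hypothetical gate weights (one row per expert); a real model learns these.
GATE_WEIGHTS = [
    [2.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [1.0, 0.0, 0.0],
    [-1.0, 0.0, 0.0],
]

token = [1.0, 0.0, 0.0]
output, used = moe_layer(token, GATE_WEIGHTS, EXPERTS, top_k=2)
print("experts used:", used)  # experts used: [0, 2]
```

With `top_k=2`, only two of the four experts run for this token, which is where the compute savings in an MoE model come from.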

MoE allows a model to be trained with less compute, so training can be faster and more cost-efficient. Other AI vendors, such as France-based Mistral, have also used this technique.

Pressure in China

While Qwen 2.5-Max is not comparable to the DeepSeek R1 model, whose release on Jan. 20 caused a global selloff of AI companies’ stock, it is similar to DeepSeek V3, another MoE model released earlier this month.

Alibaba’s release shows the threat that the tech giant, the world’s fourth-ranked public cloud vendor in terms of market share, and other Chinese tech vendors feel from the startup. Following the DeepSeek R1 release, TikTok owner ByteDance also released an update to its AI model.

Moreover, Chinese tech giants engaged in a price war with DeepSeek last year after the AI startup released V2 at a user cost of only 1 yuan, or $0.14, per million tokens. By comparison, OpenAI’s GPT-4 model’s lowest price tier costs $10 per million tokens.

The timing of the Alibaba and ByteDance releases shows that DeepSeek has spurred bigger AI technology vendors to launch their products more quickly than they originally planned.

“We know that Alibaba’s cloud unit has been voraciously beefing up its AI technology, but I think this underscores the immense pressure put on all AI companies in the wake of DeepSeek’s spectacular rise,” said Lisa Martin, an analyst with Futurum Group.

A shift in the AI market

The competitive edge that DeepSeek brings also reflects a new shift in the AI market.

“The progress around building leaner and more powerful models continues,” said Arun Chandrasekaran, a Gartner analyst. “We will see a lot more innovation in algorithmic and the software layer, in terms of building more efficient models that run on constrained infrastructure, and that are also more price competitive from an inferencing API standpoint.”

The apparently distinct innovations work together and are not standalone, Chandrasekaran said.

“It’s almost like one model company is building on top of the other,” he continued. “These model companies are becoming very, very good at reverse engineering these techniques and then quickly improving on those techniques to do something bigger, better, cheaper and smaller.”

The innovation shows that previous assumptions about model training and inferencing no longer hold, said Bradley Shimmin, an Omdia analyst.

The AI market has shifted to the degree that the massive costs previously thought necessary to build a large AI model are no longer required. GPT-4 cost upwards of $100 million to train, according to OpenAI CEO Sam Altman, while DeepSeek said it spent about $6 million to build R1.

“We’ve spent the last almost three years now really trying to optimize how transformers function, and these are the gains that you’re seeing right now,” Shimmin said. “There are a number of these now that are showcasing just how efficient we’ve been able to push these basic machine learning ideas that we’ve had and been working with for the last 60 years.”