Date: 2025-01-31

The Allen Institute for Artificial Intelligence released the 405 billion-parameter version of its latest language model this week, claiming that Tülu 3 405B performs better than Chinese startup DeepSeek's V3 and OpenAI's GPT-4o.

Ai2's open source model was created using reinforcement learning from verifiable rewards (RLVR). This approach involves training the model to enhance specific skills, such as mathematical problem-solving and instruction following.
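To make the idea of a "verifiable reward" concrete, the following is a minimal Python sketch, not Ai2's implementation. It assumes a reward is simply 1.0 when a model's output passes an automatic check and 0.0 otherwise. The function names `math_reward` and `max_words_reward`, the last-number heuristic and the word-limit constraint are all hypothetical, chosen only to illustrate the two skills mentioned above.

```python
import re


def math_reward(completion: str, expected: str) -> float:
    """Hypothetical check for math problems.

    Treats the last number in the completion as the final answer and
    returns 1.0 if it equals the expected answer, else 0.0.
    """
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
    if not numbers:
        return 0.0
    return 1.0 if float(numbers[-1]) == float(expected) else 0.0


def max_words_reward(completion: str, limit: int) -> float:
    """Hypothetical check for instruction following.

    Returns 1.0 if the completion respects a "use at most `limit` words"
    instruction, else 0.0.
    """
    return 1.0 if len(completion.split()) <= limit else 0.0


if __name__ == "__main__":
    print(math_reward("2 + 2 = 4, so the answer is 4", "4"))
    print(max_words_reward("one two three", 3))
```

Because checks like these can be computed automatically, a training loop could in principle score many sampled outputs without a separate learned reward model.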

The AI research lab first introduced Tülu 3 in November, using the same RLVR approach. Although it succeeded in scaling the method to Tülu 3 405B, Ai2 faced some technical challenges with the model. For one, Tülu 3 405B required 256 GPUs running in parallel; those computational costs limited hyperparameter tuning, the research lab said.

Ai2's release of Tülu 3 405B comes at a time when Chinese startup DeepSeek has disrupted both the U.S. and Chinese AI markets with its reasoning models DeepSeek-R1-Zero and DeepSeek-R1. Its DeepSeek-V3 model was released last year.

Tülu 3 405B was also released on the same day that French startup Mistral AI released its open source model Mistral Small 3, a 24 billion-parameter model, under the Apache 2.0 license.

Innovation and openness

DeepSeek, Mistral Small 3 and Tülu 3 405B all show the continual growth of the open source market and ongoing innovation in the AI market.

"We're seeing the iterative and evolutionary change ... morphing of these models," said Mark Beccue, an analyst at Enterprise Strategy Group, now part of Omdia.

While it's important to see the models get better in terms of performance and accuracy, Ai2's strength comes from its openness, he said.

"They were very open about [the fact that] this wasn't cheap," Beccue said.

This is different from DeepSeek's reasoning model DeepSeek-R1, which the Chinese startup claims is open source. However, many experts are questioning whether DeepSeek-R1 is truly open source because the data it was trained on and the components used to build it are not publicly available. There are also questions about whether DeepSeek-R1 is as cost-efficient as claimed.

In contrast with DeepSeek and others, Ai2 is known for releasing not just its training code and models, but also its data sets.

"Ai2's fully open approach ... ensures users can easily customize their pipeline for everything from data selection through evaluation," Constellation Research analyst Andy Thurai said.

An open approach is also better for accuracy, Beccue said.

"I hope Ai2 [and] this kind of model really takes off," he said.