Date: 2025-02-18

XAI CEO Elon Musk is again challenging OpenAI through his startup's updated large language model, Grok 3, about one week after he made a bid to buy OpenAI.

XAI introduced Grok 3 and Grok 3 mini Monday through a live stream with Musk, xAI co-founders Jimmy Ba and Yuhuai Wu, and lead engineer Igor Babuschkin. The new models have 10x more compute power than Grok 2, according to the vendor. Grok 3 and Grok 3 mini surpassed OpenAI GPT-4o, Gemini and DeepSeek's V3 across benchmarks testing math, science and coding, xAI said.

The startup also built reasoning capabilities into Grok 3 and Grok 3 mini, which it said surpass other reasoning models such as OpenAI o1, DeepSeek-R1 and Gemini 2.0 Flash Thinking on benchmarks testing math, science and coding.

The AI startup claimed that an early version of Grok 3 achieved a high score on Chatbot Arena, a public LLM benchmarking site that shows users answers from two anonymous models and asks them to vote for the better one. The early version's code name was Chocolate.

XAI also revealed that X will now have a new "Deep Search" tool, which will act as a next-generation search engine.

More reasoning models

Grok 3 comes as competition among AI vendors has intensified in the past few weeks, starting with Chinese AI startup DeepSeek. Since then, AI vendors, including xAI's rival OpenAI, have refined their reasoning models or introduced new ones.

With DeepSeek-R1 being an open source model, many vendors can now turn any of their models into reasoning models, said Bradley Shimmin, an analyst with Omdia.

"You can train any model to behave as a test time reasoner," he said. "That's what they're doing with Grok 3."

XAI is not the only vendor that can do this. For example, on February 12, Open Thoughts, a community of researchers, released OpenThinker-32B, an open-data reasoning model built from DeepSeek-R1 reasoning traces.

Grok 3 also resembles DeepSeek's reasoning model, said David Nicholson, an analyst with Futurum Group.

"I don't see huge differences, except that it isn't encumbered by the censorship built into DeepSeek," Nicholson said.

Open vs. closed

It's unclear whether xAI used DeepSeek, and it's also unclear how it added reasoning and thinking to its models, because the vendor did not release any supporting material or information outside its live stream.

"There's no transparency into how this thing was made, what it's doing and why it is, as Elon so eloquently put it, so based," Shimmin said.

The lack of supporting material marks a significant departure from xAI's initial approach of releasing Grok-1 as open source. Musk said on Monday's live stream that while the vendor has yet to open source Grok-2, it plans to do so once Grok 3 is fully available and mature.

Shimmin said that the strategy of open sourcing only the previous version of the model, rather than the current one, helps xAI protect its value proposition.

XAI's strategy is a reasonable middle ground in the debate over open source and AI vendors making money from their technology, Nicholson said.

"That's a reasonable balancing act to say we reserve the right to keep secret close to the vest, the leading edge of what we do and then over time, we will open this stuff up for developers to use with, like unlimited licensing," he said.