XAI Grok 3 highlights openness and transparency concerns
The AI startup's new Grok model has 10x more compute power than the previous generation. XAI also introduced reasoning capabilities that it said surpass rival reasoning models.
XAI CEO Elon Musk is again challenging OpenAI through his startup's updated large language model, Grok 3, about one week after he made a bid to buy OpenAI.
XAI introduced Grok 3 and Grok 3 mini Monday through a live stream with Musk, xAI co-founders Jimmy Ba and Yuhuai Wu, and lead engineer Igor Babuschkin. The new models have 10x more compute power than Grok 2, according to the vendor. Grok 3 and Grok 3 mini surpassed OpenAI GPT-4o, Gemini and DeepSeek's V3 across benchmarks testing math, science and coding, xAI said.
The startup also created reasoning capabilities in Grok 3 and Grok 3 mini, which it said surpass other models like OpenAI o1, DeepSeek-R1 and Gemini 2.0 Flash Thinking on benchmarks testing for math, science and coding.
The AI startup claimed that an early version of Grok 3, code-named Chocolate, achieved a high score on Chatbot Arena, a public LLM benchmarking site that shows users answers from two anonymous models and asks them to vote for the better one.
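To illustrate how head-to-head votes between two models can be turned into a leaderboard score, here is a minimal sketch using an Elo-style rating update. It is an assumption for illustration only, not Chatbot Arena's actual code or xAI's methodology; the function names, the update step of 32, the starting rating of 1000 and the model names are all hypothetical.

```python
from collections import defaultdict

K = 32          # assumed update step size (hypothetical value)
START = 1000.0  # assumed starting rating for every model (hypothetical value)


def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(votes, k=K, start=START):
    """Replay (model_a, model_b, winner) votes in order and return ratings.

    `winner` is the name of the preferred model; any other value
    (for example "tie") is scored as a draw.
    """
    ratings = defaultdict(lambda: start)
    for model_a, model_b, winner in votes:
        r_a, r_b = ratings[model_a], ratings[model_b]
        e_a = expected_score(r_a, r_b)
        if winner == model_a:
            s_a = 1.0
        elif winner == model_b:
            s_a = 0.0
        else:
            s_a = 0.5
        # The two updates mirror each other, so total rating is conserved.
        ratings[model_a] = r_a + k * (s_a - e_a)
        ratings[model_b] = r_b + k * ((1.0 - s_a) - (1.0 - e_a))
    return dict(ratings)


# Hypothetical votes from anonymous side-by-side comparisons.
votes = [
    ("model_x", "model_y", "model_x"),
    ("model_x", "model_z", "model_x"),
    ("model_y", "model_z", "tie"),
]
ratings = update_ratings(votes)
for name in sorted(ratings, key=ratings.get, reverse=True):
    print(name, round(ratings[name], 1))
```

A model that keeps winning blind comparisons climbs the ranking, which is why a high score on this kind of site is read as a signal of user preference rather than of accuracy on a fixed test set.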
XAI also revealed that X will now have a new "Deep Search" tool, which will act as a next-generation search engine.
More reasoning models
Grok 3 comes as the competition between AI vendors has grown in the past few weeks, starting with Chinese AI startup DeepSeek. Since then, AI vendors, including xAI's rival OpenAI, have refined their reasoning models or introduced new ones.
With DeepSeek-R1 being an open source model, many vendors can now turn any of their models into reasoning models, said Bradley Shimmin, an analyst with Omdia.
"You can train any model to behave as a test time reasoner," he said. "That's what they're doing with Grok 3."
XAI is not the only vendor that can do this. For example, on February 12, Open Thoughts, a community of researchers, released OpenThinker-32B, an open-data reasoning model that sprouted from reasoning traces from DeepSeek-R1.
Grok 3 also appears similar to DeepSeek's reasoning model, said David Nicholson, an analyst with Futurum Group.
"I don't see huge differences, except that it isn't encumbered by the censorship built into DeepSeek," Nicholson said.
Open vs. closed
It's unclear whether xAI used DeepSeek, and it's also unclear how the vendor added reasoning and thinking to its models, because it did not release any supporting material or information outside its live stream.
"There's no transparency into how this thing was made, what it's doing and why it is, as Elon so eloquently put it, so based," Shimmin said.
The lack of supporting material significantly departs from xAI's initial approach of releasing Grok-1 as open source.
Musk said on Monday's live stream that while the vendor has yet to open source Grok-2, it plans to do so once Grok 3 is fully available and mature.
Shimmin said that the strategy of open sourcing only the previous version of the model, rather than the current one, helps xAI protect its value proposition.
XAI's strategy is a reasonable middle ground in the debate between open sourcing models and letting AI vendors make money from their technology, Nicholson said.
"That's a reasonable balancing act to say we reserve the right to keep secret close to the vest, the leading edge of what we do and then over time, we will open this stuff up for developers to use with, like unlimited licensing," he said.
Enterprise use of Grok
However, the lack of transparency could also mean many enterprises will take a wait-and-see approach before adopting Grok 3.
Enterprises tend to prefer vendors like IBM that are very transparent and make even their pre-training data open, versus those that are closed, Shimmin said.
"That level of transparency is crucial for companies to make a choice of a model that they know ... is indemnified against any sort of future litigation, or at least lets them address any sort of biases that they want to address in their solution," Shimmin said. "We don't know at all what those based biases are in Grok 3."
There is also a question of whether enterprises are ready for the kind of "honesty" that Grok 3 may have, Nicholson said.
"It remains to be seen whether enterprise customers will embrace an approach that is personified by the kinds of behavior that Elon Musk exhibits," he said. Musk has been clear that Grok does not embody what he calls a "woke" agenda. This contrasts starkly with OpenAI's and Google's approach to censoring their respective LLMs, and it's unclear which approach enterprises prefer.
However, Nicholson added that Grok 3 being a contender in the AI market is beneficial.
"It's good news that another contender is jumping in, and ultimately, it will drive down the cost of AI for everyone," he said.
According to Musk, the current version of Grok 3 will have some imperfections, but improvements will be made daily. Moreover, xAI will introduce voice capability in the coming months.
XAI also revealed it's starting a new SuperGrok subscription and a website called Grok.com.
Esther Shittu is an Informa TechTarget news writer and podcast host covering artificial intelligence software and systems.