Date: 2025-04-17

OpenAI introduced the next iteration of its reasoning models, which it says are its smartest and most capable models to date.

The new o3 and o4-mini models, released on April 16, support tool calling and can use and combine every tool within ChatGPT, the vendor’s AI chatbot, including searching the web and analyzing uploaded files and other data with Python.

According to OpenAI, the new model versions -- which succeed the o1 generation -- are a step toward a more agentic ChatGPT that is capable of acting autonomously or semi-autonomously.

The models are trained on when and how to use the tools within ChatGPT.

The o3 model is adapted to handle complex questions that require in-depth analysis, with answers that are not obvious. It can analyze visuals such as images, charts and graphics. Meanwhile, o4-mini is optimized for fast, cost-efficient reasoning. It is good for math, coding and visual tasks.

Both models demonstrate improved instruction following and more useful responses than OpenAI's previous reasoning models, the vendor said.

The AI vendor said the models can also think with images. Users can upload a photo of a whiteboard or hand-drawn sketch, and the models can interpret it, even if the image is blurry.

The new iteration of OpenAI reasoning models comes a few days after the AI vendor launched GPT-4.1 while telling users it plans to turn off GPT-4.5. It also comes three months after the vendor introduced the o3-mini reasoning model.

Test-time reasoning and agentic AI

The new models are based on the same technology as other reasoning models, such as test-time reasoning, with a slight improvement. Test-time reasoning is a technique that enables a model to think more and use different problem-solving skills instead of just regurgitating responses from the web and other data sources.

"Perhaps, if anything, what they're doing is setting expectations a little bit better for how long different questions or tasks will take with it," said Bradley Shimmin, an analyst with Futurum Group.

However, many model providers are doing similar things with their models, and OpenAI's moves are unsurprising, Shimmin continued.

In any case, the new image thinking capabilities of o3 and o4-mini are a significant step for foundation models meant to support agentic AI, said Lian Jye Su, an analyst with Omdia, a division of Informa TechTarget.

"The 'o' series models have always meant to be multimodal," Su said. "They're meant to be less text-based and on the image side. It's a significant thing more in the sense that ... they're almost like an agent because of their capability. As the models become more powerful, they will continue to improve, meaning they have almost the agentic capability."

He added that the “o” series models could be considered to have agentic capabilities because they differ from traditional models, which follow a specific instruction given straightforwardly. Instead, these and other multimodal models can be given more complex instructions or targets and can use problem-solving skills to address them.

"I do expect the multimodal foundation models to have that capability as the model becomes smarter," Su said. "It doesn't mean they will completely replace [agents]; it just means, or more of the complex tasks that AI agents do, they can now fulfill as well."

On the other hand, the agentic capabilities that OpenAI highlighted in o3 and o4-mini, such as web searching and analyzing data, are not necessarily new and are a natural evolution of how AI models have been used, Shimmin said.

He added that the image capability of the o3 and o4-mini models is not "a new class of models."

"It's just saying the model is multimodal in that it can take in audio, visual, image, video and text, and that it can reason about how to work with those and do things with those because it's a model that features test-time reasoning," he continued.