Nvidia new world models power physical AI systems, apps

The hardware and software vendor's new world models target the development of robotic applications and reasoning in robots.

While conversations about agentic AI and GenAI technology permeate the industry, Nvidia and other IT vendors are also introducing technologies that support the physical AI ecosystem.

On Monday, Nvidia introduced new Omniverse SDKs for building and deploying industrial AI and robotics applications, and new world foundation models.

Omniverse and world models

New Omniverse SDKs allow robot learning developers to simulate robots across frameworks such as Universal Scene Description, a 3D scene description format, and MuJoCo, a physics engine used in robotics, biomechanics and machine learning.

Omniverse NuRec libraries and AI models introduce a new rendering technique that lets developers capture, reconstruct and simulate the real world with sensor data.

Nvidia Isaac Sim 5.0 and Nvidia Isaac Lab 2.2 are open source robot simulation and learning frameworks now available on GitHub. Isaac Sim includes sensor schemas that robot developers can use to close the gap between simulation and reality.

Nvidia also revealed that Cosmos Transfer-2, a world foundation model simplifying prompting and accelerating photorealistic synthetic data generation, is coming soon.

The AI vendor introduced a distilled version of Cosmos Transfer, which requires only one step of distillation instead of 70, so developers can run the model on Nvidia RTX Pro Servers.

Contributing to the open market, the company introduced Nvidia Cosmos Reason, a new open, customizable 7-billion-parameter reasoning vision language model for physical AI and robotics. The model lets robots and vision AI agents reason like humans, Nvidia said.

The open model is for applications such as data curation and annotation, robot planning and reasoning, and video analytics AI agents.

Physical AI and robots

The release of these new models for physical AI shows the growing interest in the market as GenAI and agentic AI technology continue to mature.

"The idea around world models and all the associated technologies ... is a huge next step in AI," said Tuong Huy Nguyen, an analyst at Gartner. "We are not talking about something mature or final yet. We are talking about different techniques and architectures being built so that AI can understand, anticipate and react to the world better. ... Each of these is a step in that direction."

World models like the ones Nvidia released are geared toward helping robots figure out how to better interact with the world, Nguyen added.

These models help address the need for robots to understand gravity, mass, speed, light, sound and objects.

Nvidia is not the only vendor working within this market. On Tuesday, AI research vendor Ai2 released a new class of models called Action Reasoning Models (ARM) to help robots and machines overcome some of the challenges and limitations of using only language or vision language models to reason. The first ARM is called MolmoAct, built on Ai2's Molmo, an open source family of vision language models. MolmoAct bridges the gap between language and action, Ai2 said. It helps robots or machines follow instructions.

The challenge with physical AI

Specialized models, like the ones Nvidia and Ai2 provide, are needed because training physical AI systems is complex.

"The sort of software that powers humanoids is very complex," said Ray Wang, an analyst at Futurum Group. "You need a model designed particularly for physical AI workloads to train a humanoid."

He said humanoids need to process images and objects.

Nvidia is not only providing technology for developers to create physical AI applications but also expanding the AI technology ecosystem, Wang added.

He added that while Nvidia's software and hardware technologies are trusted, giving the vendor a significant advantage, there is still much to do to make physical AI technologies commercially viable for customers.

"It's not yet mature, but we can see the software development on this evolve rapidly over the past two to three years," Wang said.

In related news, Ansys, an engineering simulation software vendor that is part of Synopsys, will offer access to Nvidia Omniverse technology within its software.

Esther Shittu is an Informa TechTarget news writer and podcast host covering artificial intelligence software and systems.
