Dremio Cloud: An autonomous lakehouse powered by AI agents
The new platform automates complex data preparation and pipeline management tasks that fuel AI development and could be a differentiator for the vendor.
Dremio has launched a new, agentic AI-powered version of its data lakehouse aimed at making it faster and easier to operationalize data for developing AI and analytics applications.
Released on Thursday, Dremio Cloud is a self-managing platform that autonomously learns, adapts and optimizes performance to save data engineers and other data experts from monotonous, time-consuming work such as discovering relevant data and building pipelines.
Features include an open catalog to govern data across lakehouses and databases federated throughout an organization, a semantic layer to contextualize data and support for Model Context Protocol to provide a framework for agents to access myriad data sources. In addition, Dremio Cloud provides an active metadata system that independently analyzes patterns and trends to help make decisions.
Given that lakehouses are complex data management environments that require time and expertise to operate, Dremio Cloud is a valuable new version of the vendor's platform, according to Kevin Petrie, an analyst at BARC U.S.
"Dremio is executing on a cool vision of the modern lakehouse -- federated, easier to manage and readily accessible to agentic applications," he said. "This addresses the key requirements we see among AI adopters."
Federated query capabilities that allow users to query distributed data, including unstructured data, and the ability for agents to quickly access high-priority datasets are especially valuable, Petrie continued.
"Lakehouse platforms can be hard to manage, so the more you can automate the tuning with AI, the better," he said.
Dremio, based in Santa Clara, Calif., offers a lakehouse platform optimized for storing Apache Iceberg tables. In April, the vendor released AI-powered semantic search capabilities designed to help customers discover relevant data for AI and analytics tools.
An autonomous lakehouse
Many enterprises have significantly increased their investments in AI development since OpenAI's November 2022 launch of ChatGPT marked a major improvement in generative AI (GenAI) technology.
GenAI tools, such as chatbots, can make a company's workforce better informed by enabling non-technical employees to explore and analyze data using natural language. They can also make technical workers more efficient by automating certain processes, such as writing documentation and generating code.
Now, AI development has evolved to include agents that, unlike chatbots and other GenAI tools, can act autonomously to search data for insights that might not otherwise be found and to execute jobs without being prompted.
However, data management platforms -- including lakehouses -- developed before the burgeoning era of AI were not built for the demands of developing and managing agents and other AI tools. Such applications require constant training and tuning using large volumes of high-quality data to be effective.
As a result, training and maintaining AI tools often requires constant manual work.
Dremio Cloud automates much of that work, acting not merely as a copilot that assists data engineers, data scientists and other data workers but as the actual executor of the work.
As a result, the new version of Dremio's platform is a significant upgrade, according to William McKnight, president of McKnight Consulting. In particular, its value lies in allowing non-experts to take on more responsibilities and in enabling the easy exploration of both structured and unstructured data.
"The significance of Dremio Cloud lies in its shift to supporting a fully autonomous, AI-first platform where AI agents interface with the data to deliver instant answers and optimized queries," he said.
Beyond aiding customers, Dremio Cloud could help the vendor differentiate its capabilities from peers such as Databricks, Starburst and Teradata, which also provide lakehouse platforms, McKnight continued.
"Dremio Cloud has unique capabilities in the data lakehouse space, offering autonomous capabilities and AI-driven features that set it apart from competitors," he said. "Dremio's focus … enables Dremio Cloud to eliminate manual toil and optimize performance."
Petrie similarly noted that Dremio Cloud is a unique offering.
"Dremio's big competitors are the lakehouse gorillas, but those companies are not as invested in hybrid or multi-platform data management, which gives Dremio a competitive edge," he said.
Specific features of Dremio Cloud include the following:
Unified and governed access to data through Dremio's open catalog, including the vendor's Intelligent Query Engine.
An automatically generated semantic layer to give AI tools the context about data they need to be accurate.
Learning capabilities that enable the platform to improve its performance.
AI Agents to enable users to query and analyze data.
Native MCP support to give users a choice of AI vendors when building agents.
Active metadata to provide an intelligence layer for agents.
Clustering capabilities that reorganize data layouts and access patterns to optimize query performance.
A mix of customer feedback and Dremio's own evolution over the past couple of years provided the impetus for developing Dremio Cloud, according to Rahim Bhojani, the vendor's chief technology officer.
"The message [from customers] was clear that ease of use matters above everything else," he said. "[Additionally], as AI changes how people interact with data, the next logical step was to remove infrastructure and data management complexity."
Enabling access to unstructured data along with capabilities that use AI to simplify data preparation and study metadata to optimize performance are perhaps Dremio Cloud's most significant features, according to Petrie.
McKnight similarly highlighted the value of helping users operationalize unstructured data, which now makes up the overwhelming majority of all data. Meanwhile, he noted that the overall construction of Dremio Cloud supports the vendor's aim of building an autonomous lakehouse.
"The Dremio Cloud platform appears logically put together," he said. "Dremio Cloud is built for agents."
However, the platform could be limited if its integrated AI functions aren't seamlessly connected to data residing in external federated databases, McKnight continued.
"It represents a potential limitation to the goal of 'unified data access' and could prevent the AI agent from fully addressing organization-wide data silo challenges," he said.
Looking ahead
AI will remain a focal point for Dremio in 2026 with specific initiatives including improving Dremio Cloud and helping advance Apache Iceberg and Apache Polaris, according to Bhojani.
"AI represents an incredible opportunity to empower users through self-service and simplicity," he said. "We see AI driving a step-change in ease of use. At the same time, our commitment to an open foundation … ensures interoperability and freedom of choice for customers."
Focusing on AI is an appropriate way for Dremio to serve the needs of its users and potentially attract new ones, according to McKnight. Specifically, extending AI features to a wider array of federated data sources would be wise, he said.
Petrie, meanwhile, suggested that Dremio add partnerships with AI and machine learning platforms to better enable customers to build AI tools.
"I would recommend deepening their ecosystem support with partnerships in the AI/ML platform space," he said.
Eric Avidon is a senior news writer for Informa TechTarget and a journalist with more than 25 years of experience. He covers analytics and data management.