Dremio, at its Subsurface virtual conference on March 2, made its Sonar query engine generally available and released a preview of the new Arctic metadata management service for its data lakehouse cloud platform.
The new Dremio Sonar query engine is built on top of the open source Apache Iceberg technology, which provides data table services for data lakes.
Sonar supports the SQL Data Manipulation Language (DML) that enables users to insert, update and delete information directly in a data lake. The other new feature is Dremio's Arctic metastore for data, which aims to replace Apache Hive technology.
"The Lakehouse concept, the idea that organizations will be able to consolidate multiple workloads onto a single data platform, is certainly gaining advocates and vendor support," said Constellation Research analyst Doug Henschen.
"The promise is consolidation of platforms and reduced cost, but organizations will have to make sure that a single platform meets their BI [business intelligence], analytics, data science and engineering needs," he continued.
Building out the data lakehouse to replace data warehouses
Henschen said he sees the new functionality that Dremio unveiled on Wednesday as aimed at BI and analytics professionals.
For example, he noted that Dremio is adding DML update and delete capabilities, which fill out the full record-level manipulation ability that data professionals expect from a data warehouse platform.
In the opening keynote for the Subsurface event, Dremio's co-founder and chief product officer, Tomer Shiran, fleshed out the data lakehouse concept.
With the data lakehouse, rather than bringing data into a query engine, users bring the query engines to the data, Shiran said. As a result, data stored in cloud object storage such as Amazon S3 can be queried by any number of different technologies, and users don't have to move data into a data warehouse to use it.
Dremio Sonar provides new data lakehouse query engine
The new Dremio Sonar query engine is powered by the open source Apache Arrow technology.
One capability that Sonar enables is querying across different types of data sources. Shiran said queries can be run against a data metastore, like Apache Hive, directly against data in a data lake or even against a relational database.
Sonar also supports DML queries that enable users to insert, update and delete records in data lakes. The DML capability uses the open source Apache Iceberg technology for data lake tables and the Apache Parquet data format.
"Apache Iceberg is a table format that is built on top of Parquet, so you can start thinking of your data not as files but as tables," Shiran said.
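To make the files-versus-tables distinction concrete, the following is a minimal sketch of how a table layer can offer record-level insert, update and delete over data files that are never edited in place. It is a toy illustration only, not how Apache Iceberg or Dremio Sonar is implemented: JSON files stand in for Parquet, each change is committed as a new snapshot listing the files that make up the table, and every name in it (ToyTable, demo and the sample rows) is hypothetical.

```python
import json
import tempfile
import uuid
from pathlib import Path


class ToyTable:
    """Toy table layer: immutable data files plus a list of snapshots."""

    def __init__(self, root):
        self.root = Path(root)
        # Each snapshot is the list of data files that make up the table.
        self.snapshots = [[]]

    def _write_file(self, rows):
        # Data files are written once and never modified afterward.
        path = self.root / f"data-{uuid.uuid4().hex}.json"
        path.write_text(json.dumps(rows))
        return path

    def _commit(self, files):
        self.snapshots.append(files)

    def scan(self, snapshot=-1):
        # Read every row from the files listed in the chosen snapshot.
        rows = []
        for path in self.snapshots[snapshot]:
            rows.extend(json.loads(path.read_text()))
        return rows

    def insert(self, rows):
        self._commit(self.snapshots[-1] + [self._write_file(rows)])

    def _rewrite(self, predicate, transform):
        # Copy-on-write: only files containing matching rows are rewritten.
        new_files = []
        for path in self.snapshots[-1]:
            rows = json.loads(path.read_text())
            if not any(predicate(r) for r in rows):
                new_files.append(path)  # untouched file is reused as-is
                continue
            kept = [out for r in rows
                    for out in (transform(r) if predicate(r) else [r])]
            if kept:
                new_files.append(self._write_file(kept))
        self._commit(new_files)

    def update(self, predicate, changes):
        self._rewrite(predicate, lambda r: [{**r, **changes}])

    def delete(self, predicate):
        self._rewrite(predicate, lambda r: [])


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        table = ToyTable(tmp)
        table.insert([{"id": 1, "city": "Oslo"}, {"id": 2, "city": "Lima"}])
        table.insert([{"id": 3, "city": "Pune"}])
        table.update(lambda r: r["id"] == 2, {"city": "Quito"})
        table.delete(lambda r: r["id"] == 1)
        current = sorted(table.scan(), key=lambda r: r["id"])
        # Snapshot 2 is the table state after both inserts, before any DML.
        before_dml = sorted(table.scan(snapshot=2), key=lambda r: r["id"])
        return current, before_dml
```

In this sketch, callers work only with rows and predicates, while the table layer decides which files to reuse and which to rewrite. Because older snapshots still point at the original files, the table's earlier state remains readable after the update and delete.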
Dremio Arctic enables a data lake metastore
Shiran, in his keynote, also publicly previewed Dremio Arctic, which he described as an intelligent metastore for Apache Iceberg.
Shiran explained that Arctic will work with other data lake query engines, including Apache Spark, Trino and Presto -- not only Dremio Sonar. Dremio's goal is to create a modern metastore for data lakehouse deployments.
"For a very long time, the only kind of metadata management capability in the lake was the Hive metastore, which is one of the last remaining pieces of the original Hadoop stack," Shiran said. "We thought it was the right time and it is actually necessary to provide something a lot more sophisticated, much more capable than what Hive metastore can provide."