Dremio accelerates cloud data lake queries for AWS

New features in the AWS Edition of Dremio's data lake engine make it easier for organizations to scale and manage data lake queries in the cloud.

On Tuesday, Dremio made its cloud data lake engine generally available in a new, purpose-built AWS edition that provides enhanced data query capabilities.

The Dremio AWS Edition expands on the Santa Clara, Calif., data lake vendor's Data Lake Engine technology base with a specially optimized system for AWS users.

Among the new features in the AWS edition are an elastic engines capability that can help accelerate cloud data lake queries and a parallel projects feature that helps organizations scale and automate across multiple Dremio instances. Dremio had previously made its data lake engine available on AWS but had not developed a version optimized for Amazon's cloud.

The parallel projects and elastic engines capabilities in Dremio's AWS Edition can help data consumers manage their time and infrastructure more efficiently, said Kevin Petrie, vice president of research at Eckerson Group.

The Dremio platform provides simple access for a wide range of analysis and fast results for reporting, which is becoming increasingly important to enterprises with the sudden onset of a new business era triggered by the COVID-19 pandemic, Petrie said.

"COVID-19 accelerates the cloud modernization trend and therefore the adoption of cloud-native object stores for data lakes," Petrie said. "Dremio's AWS marketplace offering provides enterprises the opportunity to modernize their data lakes on AWS infrastructure."

Figure: The AWS Edition dashboard provides visibility into data lake storage and data sets.

Big money for Dremio's cloud data lake efforts

The AWS Edition release is the first major launch for Dremio since it made public a $70 million Series C funding round on March 26, bringing total funding to $212 million.

Tomer Shiran, co-founder and chief product officer at Dremio, said the funding was a "great vote of confidence" for his firm, especially given the current global pandemic. Analytics and business intelligence are two key categories that many of the large organizations Dremio targets will continue to spend on, even during the COVID-19 crisis, he said.

"Part of the reason for the large investment even during an economic crisis, and obviously a health crisis, is the fact that we're playing in such a hot space," Shiran said.

How Dremio's elastic engines improve cloud data lake queries

Most of Dremio's customers already use the vendor's data lake engine in the cloud, on either AWS or Microsoft Azure, but the new edition specifically advances Dremio's AWS offering, Shiran noted.

"The idea is to drastically reduce the complexity and make it much easier for companies to get started with Dremio on AWS, and to take advantage of all the unique capabilities that Amazon brings as a platform," Shiran said.

He added that query engines typically run on a single execution cluster, even when multiple workloads and different users share the same system. That approach requires organizations to size their query engine deployment for peak workload.

The elastic engines feature debuts in the new AWS Edition and provides a separate query engine for each workload. Each engine scales up or down based on demand, so organizations no longer need to run one large cluster sized for peak utilization.

"This is really taking advantage of the fact that, in the cloud Amazon is willing to rent you servers by the second," Shiran said.

How elastic engines work

Dremio manages the Amazon EC2 (Elastic Compute Cloud) instances on the user's behalf, handling the configuration and optimization needed to autoscale the resources that run the data lake query engine.

"So, what you do with this AWS Edition of Dremio is you spin up literally one instance of Dremio from the Amazon Marketplace and that's all you're interacting with ever is that one instance," Shiran said. "Automatically, behind the scenes it is using Amazon APIs to provision and deprovision resources."

The elastic engines feature is first available in the AWS Edition of Dremio, but the vendor plans to expand the capability with future support for Microsoft Azure and Google Cloud Platform, as well as on-premises Kubernetes environments.
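
The sizing tradeoff Shiran describes can be illustrated with some simple arithmetic. The following Python sketch is not Dremio code and does not use any Dremio or AWS API; the workload names, demand figures and function names are all hypothetical. It compares the compute consumed by one shared cluster sized for peak combined demand against separate engines that each scale with their own workload.

```python
# Illustrative only: hypothetical node demand per time interval for two workloads.
# None of these numbers or names come from Dremio or AWS.
reporting = [2, 2, 8, 8, 2, 0, 0, 0]
data_science = [0, 4, 4, 0, 0, 6, 6, 0]


def peak_sized_usage(workloads):
    """Node-intervals used by one shared cluster sized for peak combined demand."""
    combined = [sum(step) for step in zip(*workloads)]
    return max(combined) * len(combined)


def elastic_usage(workloads):
    """Node-intervals used when each workload's engine tracks its own demand."""
    return sum(sum(demand) for demand in workloads)


fixed = peak_sized_usage([reporting, data_science])
elastic = elastic_usage([reporting, data_science])
print(f"peak-sized cluster: {fixed} node-intervals")
print(f"elastic engines:    {elastic} node-intervals")
```

Under these made-up numbers, the peak-sized cluster is billed for every interval at its maximum size, while the elastic model is billed only for the nodes each workload actually needs at a given moment. Real savings would depend on actual demand patterns, scaling latency and pricing.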

Parallel projects enable multi-tenancy

Another new capability in the Dremio AWS Edition is a feature the company has dubbed parallel projects. Shiran said the feature is an effort to make it easier to achieve multi-tenancy for Dremio deployments.

"So, now we have a notion of a project, where all the state of your Dremio environment is saved and you can shut it down entirely and then bring it back up later," he said.

With parallel projects, an organization can choose to have different environments for development and production. Each environment is automatically backed up and receives automated patching for upgrades.

Dremio will continue to focus on the cloud and ease of use for customers, Shiran said.

"We are investing in making Dremio easier for people who want to run in the cloud and you're seeing the first step of that with the AWS Edition, but we're going to extend that to other clouds as well," he said.
