Data warehouse vendor Yellowbrick Data has made version 6.0 of its namesake platform generally available, marking the debut of a new architecture for cloud deployments.
Based in Mountain View, Calif., Yellowbrick develops an on-premises platform with purpose-built hardware and in recent years has been advocating for a hybrid approach to data warehousing.
Part of the hybrid strategy is providing a cloud data warehouse service as well as an on-premises version.
With Yellowbrick 6.0, the vendor has designed a new architecture for its cloud service that uses the Kubernetes cloud-native container orchestration system, providing a service that can scale up or down as required. The new cloud service also supports the open source Apache Parquet data format, which is commonly used for data lake files.
Yellowbrick can now be deployed in an organization's own AWS virtual private cloud (VPC) instance. The new Yellowbrick cloud data warehouse service is not yet available in Google Cloud or Microsoft Azure, though that is on the vendor's roadmap for the third quarter.
The continued focus on hybrid environments is a differentiator for Yellowbrick, said Kevin Petrie, an analyst at Eckerson Group.
"Despite various consolidation efforts, the reality remains that most enterprises need to manage and analyze data across distributed, heterogeneous environments," Petrie said. "Yellowbrick seeks to address this opportunity with its distributed data cloud architecture on premises and for AWS. The next step will be to support Azure and Google Cloud."
Yellowbrick cloud data warehouse grows hybrid options
With the on-premises version of the vendor's data warehouse technology, the service is provided on specialized hardware instances that combine compute and storage.
The original idea behind the specialized hardware was to provide an optimized platform, designed by Yellowbrick, to run its data warehouse software. With Yellowbrick 6.0, compute and storage are now separated in the cloud data warehouse deployment to support the scale and elasticity that cloud workloads demand.
Most of the performance advantages -- including speed, storage and memory management -- of Yellowbrick's on-premises technology are now available in the cloud version, said Mark Cusack, CTO of Yellowbrick.
Delivering that performance in the cloud required engineering work on the networking stack, memory management and storage. Yellowbrick had to write its own non-volatile memory express (NVMe) storage drivers to accelerate access to high-performance storage in the cloud, Cusack said.
Before the 6.0 update, Yellowbrick maintained a cloud service, but it was a managed private cloud service running in Yellowbrick's data centers.
"This is the kind of the first time where we've moved fully into the public cloud," Cusack said.
How Yellowbrick is deployed in the cloud
The Yellowbrick cloud data warehouse is not available in an as-a-service model, in which the platform is entirely deployed and managed in a cloud instance that Yellowbrick owns.
Rather, organizations deploy Yellowbrick into their own VPC. With the VPC approach, all the data is still owned and managed by the customer within its own cloud account.
Though Yellowbrick is deployed into a VPC, Cusack said organizations are not left on their own to get up and running.
"We've automated the whole installation and lifecycle management of Yellowbrick running in a customer's own cloud account," he said.