Cloud data platform vendor Varada updated its namesake service with a new 3.0 release today.
Varada, based in Tel Aviv, introduced its 2.0 release in December 2020, providing a data virtualization approach to help organizations with data queries on cloud data lakes. A core component of Varada's stack is the open source Trino query engine, which was known as PrestoSQL until December 2020.
With the new update, the vendor added improved data lake query acceleration capabilities with elastic scaling that can help organizations grow or reduce the size of their query cluster as needed.
Varada 3.0 also integrates a new data tiering approach with several layers of data storage, including nodes with attached non-volatile memory express solid-state drives (SSD NVMe) in users' virtual private clouds, as well as cloud object storage that the query engine can access.
Cloud data platform update accelerates data lake insights
Matt Aslett, an analyst at S&P Global Market Intelligence, said Varada was initially focused on providing virtualized access to data from multiple sources in multiple locations. Aslett noted that Varada uses a combination of its own query orchestration technology with the Trino open source distributed SQL query engine and some machine learning-driven optimization.
"The company has added elastic scaling, thanks to its use of cloud storage as a cache for indexed data, which enables more rapid provisioning of the compute tier to meet demand," Aslett said. "This is particularly relevant to making it quicker and easier to generate insight from data stored in data lake environments."
Why Varada is adding more scalability to its cloud data platform
Eran Vanounou, CEO of Varada, said the new capabilities in the platform update were driven by customer demand.
He noted that a Varada customer came to him with a requirement to scale rapidly to meet peaks in query volume. While the Varada platform was able to scale, the user needed it to scale faster as demand peaks surfaced.
Before the update, Varada had two data tiers: one for hot data and the other for cold data. The hot tier, which uses SSD NVMe-attached nodes, holds the data index and the most frequently accessed data. The hot data tier pulls data up from the cold data tier, which comes from basic cloud object storage.
The new release introduces the warm tier, which also uses cloud object storage. However, it has been configured and optimized to enable rapid data loading -- as demand scales, the warm data can now be rapidly queried. Vanounou explained that the warm data tier will enable Varada to scale out quickly to handle demand peaks.
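The three tiers described above can be pictured with a small lookup sketch. This is only an illustration of the general tiering idea, not Varada's implementation: the `TieredStore` class, its dictionary-backed tiers and the promote-on-read behavior are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class TieredStore:
    """Toy model of hot, warm and cold tiers (hypothetical, not Varada's API)."""

    hot: dict = field(default_factory=dict)   # stands in for SSD NVMe-attached nodes
    warm: dict = field(default_factory=dict)  # stands in for optimized object storage
    cold: dict = field(default_factory=dict)  # stands in for basic object storage

    def read(self, key):
        """Return (value, tier_name) and copy the value into the hot tier on a miss."""
        if key in self.hot:
            return self.hot[key], "hot"
        # Check the warm tier before falling back to the cold tier.
        for name, tier in (("warm", self.warm), ("cold", self.cold)):
            if key in tier:
                value = tier[key]
                self.hot[key] = value  # assumed promotion so later reads are served hot
                return value, name
        raise KeyError(key)


store = TieredStore(warm={"orders_index": 1}, cold={"archive_2019": 2})
first_read = store.read("orders_index")
second_read = store.read("orders_index")
cold_read = store.read("archive_2019")
```

In this sketch the first read of `orders_index` is served from the warm tier and the second from the hot tier, which mirrors the idea that warm data can be loaded quickly when demand rises.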
How Varada data platform elasticity works
A key component of how elasticity is enabled in the Varada 3.0 cloud data platform is a feature called Varada Control Center (VCC).
The VCC provides full observability of everything running in the cluster. That includes CPU and memory utilization, as well as insight into what queries are running and how they can be optimized. The ability to grow or shrink a cluster is managed in the VCC by way of what Vanounou called an elasticity definition.
An elasticity definition can include a cost-based limit on how large a cluster can grow. Vanounou noted that the VCC can also help users shrink cluster sizes, based on actual workload behavior, to avoid having underutilized capacity.
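As a rough sketch of how a cost-capped elasticity definition might drive cluster sizing, consider the following. The article does not describe the VCC's actual format or algorithm, so everything here is hypothetical: the `ElasticityDefinition` fields, the `desired_nodes` function and the utilization-based sizing rule are assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ElasticityDefinition:
    """Hypothetical elasticity definition; field names are assumptions."""

    min_nodes: int
    max_hourly_cost: float       # cost ceiling for the whole cluster
    node_hourly_cost: float      # assumed flat price per node
    target_utilization: float = 0.7

    @property
    def max_nodes(self) -> int:
        # The cost ceiling translates into a cap on the number of nodes.
        return max(self.min_nodes, int(self.max_hourly_cost // self.node_hourly_cost))


def desired_nodes(defn: ElasticityDefinition, current_nodes: int, cpu_utilization: float) -> int:
    """Size the cluster so average CPU utilization moves toward the target."""
    needed = math.ceil(current_nodes * cpu_utilization / defn.target_utilization)
    # Clamp between the minimum size and the cost-derived maximum.
    return min(max(needed, defn.min_nodes), defn.max_nodes)


defn = ElasticityDefinition(min_nodes=2, max_hourly_cost=20.0, node_hourly_cost=2.5)
grow = desired_nodes(defn, current_nodes=4, cpu_utilization=0.95)
capped = desired_nodes(defn, current_nodes=8, cpu_utilization=0.95)
shrink = desired_nodes(defn, current_nodes=6, cpu_utilization=0.2)
```

Under these assumed numbers, a busy four-node cluster grows, a busy eight-node cluster stays at the cost cap, and a lightly loaded six-node cluster shrinks to the minimum, reflecting both the growth limit and the shrinking of underutilized capacity.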
Looking forward, Vanounou said Varada will likely add more capabilities to the VCC to help users manage the cost of operating and scaling data clusters in the cloud.