Ocient Hyperscale Data Warehouse scales data operations
The data warehouse vendor is targeting enterprises that need to use a trillion rows of data or more for analysis, with hyperscale technology that is now ready for broader adoption.
Data warehouse vendor Ocient updated its namesake platform with a new release that is generally available today.
Ocient, based in Chicago, was founded in 2016 and has been iterating its Hyperscale Data Warehouse platform over the last six years.
While it has had production customers, the vendor considers version 19 of its Hyperscale Data Warehouse to be the first release of the platform that is ready for broader enterprise adoption, particularly in the industries Ocient targets: ad tech, telecommunications, government, logistics and financial services.
A core focus for Ocient is executing queries quickly at large scale, on data sets that can include a trillion rows or more. To support that scale, version 19 of the Ocient Hyperscale Data Warehouse platform includes a SQL optimizer to accelerate queries.
Ocient also integrates what it calls a hyperscale extract, transform and load (ETL) service to help get large volumes of data from source locations into the data warehouse. The new data warehouse service also supports multiple data types, including geospatial data.
Version 19 can run on premises and in the cloud.
"It's interesting to see a startup supporting the option to run high-scale analytical data warehouse workloads on premises and in private clouds, as well as on AWS and Google Cloud," said Doug Henschen, an analyst at Constellation Research.
"I do like Ocient's focus on very specific use cases," Henschen said, referring to the industries Ocient specializes in. "These are the sorts of organizations that still have their own data centers and that often prefer the performance and control advantages and the predictable costs of running mission-critical workloads on premises."
Accelerated data access is the foundation of the Ocient Hyperscale Data Warehouse
Ocient was co-founded by Chris Gladwin, who is also the company's CEO. Before Ocient, Gladwin founded storage vendor Cleversafe, which he sold to IBM in 2015.
At Cleversafe, his team helped to develop technology that made storage access faster. Gladwin said he realized there was also a need to accelerate data access for large data sets inside of data warehouses, which led to the birth of Ocient.
Gladwin said building out the Ocient Hyperscale Data Warehouse took considerable time and effort as his team learned how to accelerate data access for large deployments in which there could be a trillion or more rows of data.
While the new release is numbered 19, Gladwin said it's the first broadly available version of the data warehouse. He said the first pilot deployments of Ocient with early users were in 2020, and the first production users were onboarded in 2021.
"We could only handle a handful of customers and that's it," Gladwin said. "We just had to focus on them and now this is the first release that's generally available."
Building scale for the Ocient Hyperscale Data Warehouse
Ocient has had to develop its own approach to handle the large-scale deployments it targets in industries such as ad tech, which use large amounts of rapidly changing data.
One such area is ETL. Gladwin said the vendor had to build a whole new engine to rapidly extract, transform and load data.
Ad tech involves millions of data auctions every second, and both suppliers and buyers gain a lot of value from analyzing that data. Because of the volume, many organizations have had to analyze a sample of the data, rather than all of it.
The Ocient system includes a cluster of data loaders that scale linearly to extract data from the source. The extracted data is then transformed into a relational schema, indexed and compressed. The new data is available for queries within seconds, Gladwin said.
"The reason we had to do this is [that] we really only deal with data sets that have at least a trillion rows," he said. "If you have a trillion things in your data set, it's never some static blob that you batch load one time and forget about it."