Google grows data cloud capabilities for data management
The tech giant brings open source Apache Iceberg table format support to its BigLake data lake as it extends BigQuery support for unstructured data and Apache Spark.
Google expanded its data cloud offerings with a series of preview capabilities released on Tuesday.
Among the previews unveiled at the Google Cloud Next '22 conference were expanded capabilities in the BigLake data lakehouse service, which was originally released in April.
Google is also bringing support for the open source Apache Iceberg data lake table format that is being widely adopted by vendors, including Cloudera, Starburst and Snowflake.
In addition, Google is extending its BigQuery data warehouse service to support unstructured data and the Apache Spark query engine. The tech giant is also integrating BigQuery with the Google Datastream service to bring data into BigQuery from multiple databases, including Oracle, PostgreSQL and MySQL.
"Google continues to aggressively expand its market footprint for data management," said Gartner analyst Merv Adrian.
Google is following the trend of supporting open table formats with its support for Apache Iceberg, Adrian said.
The BigQuery previews -- including support for Apache Spark, unstructured data and connections with Datastream -- signal Google's intent to pursue both database management system data and new data lake opportunities more aggressively together with the BigLake platform, he said.
Iceberg ahead for Google's data cloud
In a media briefing, Gerrit Kazmaier, vice president and general manager for database, data analytics and Looker at Google Cloud, said adding support for more formats is a key focus for Google.
Kazmaier said that with support for Apache Iceberg, Google users will be able to store and manage data in the open source table format and still use BigQuery to query it.
Iceberg is the first of a trio of open source data lake table formats that Google will support over time. Kazmaier said that after the Iceberg preview, Google plans to support the open source Apache Hudi and Delta Lake table formats. Hudi is already used by some large organizations, including Walmart. The Delta Lake table format, which was created by Databricks, is now an open source project run by the Linux Foundation.
Open data cloud brings updates to BigQuery
Support for open source data lake table formats is part of Google's larger effort to enable an open data cloud, according to the vendor.
Kazmaier said the idea of an open data cloud is built on the core principle of being open to all possible types of data in every format. To that end, Google is now previewing support for unstructured data for BigQuery.
"In the past, there was a big divide between structured data and unstructured data," he said. "This is now converging as customers seek to unify their data architecture, and they are seeking to unify all types of data they are working with."
As part of that unification, Google said it also wants to help organizations ensure data quality. On Tuesday, the vendor introduced preview capabilities for automating data quality management, as well as features for data lineage.
"Data management is one of the key concerns in an open data cloud ecosystem because we want to give our customers the ability to understand, manage and secure their data landscape," Kazmaier said.