Snowflake grows cloud data platform with unstructured data

The cloud data vendor's winter release updates its data platform with new capabilities to enable organizations to query and manage more data types across different environments.

Snowflake today released a series of platform updates as part of the vendor's Snowday virtual event.

Snowflake, which got its start as a cloud data warehouse platform, has been growing its suite of services in recent years into what it refers to as the Snowflake data cloud.

Among the services the platform supports are data application development with Snowpark, as well as data lake query support.

With its Winter 2021 update introduced at Snowday, Snowflake is now set to add support for unstructured data that can be queried and governed by the data cloud platform.

Snowpark is also being expanded with support for the open source Python programming language, which has become increasingly popular for data science.

To date, Snowpark has supported only the Java and Scala languages. Snowpark for Python and the unstructured file support capabilities are currently in a private preview, with wider availability set for 2022.

How Pacific Life is set to use Data Cloud unstructured data support

In a media session Monday, John Damalas, vice president and CTO at Pacific Life, outlined how the insurer is using Snowflake.

Pacific Life was founded in 1868. Damalas said that over its long history, the company has used many different technologies and in recent years has begun to widely adopt Snowflake for its data platform.

"We have standardized on Snowflake as our enterprise data platform," Damalas said. "Considering we have been around for 153 years at this point, we'll still probably have some lingering on-premise assets, but our critical data workloads we are moving to Snowflake."

Damalas said Snowflake's new support for Python is a positive move, noting that Python is the language of choice for data science and analytics teams. He said he is also particularly enthusiastic about Snowflake's other platform advancements.

"I am extremely excited about unstructured data support; it's just part of our ecosystem and always will be," Damalas said.

In particular, Damalas said he's looking forward to applying the same governance and data querying capabilities that Pacific Life already uses for structured data to its unstructured data, through the new support in Snowflake.

Unstructured data support comes to Snowflake data cloud

Christian Kleinerman, senior vice president of product at Snowflake, explained that the vendor has supported structured and semi-structured data since it was founded in 2012.

Those data types have some form of organization that can be parsed for data analytics and queries. Unstructured data, by contrast, lacks a predefined schema, making it more difficult to analyze.
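The distinction can be illustrated with a minimal Python sketch. It is not Snowflake code; the sample records, the `describe` helper and the PDF stand-in are all hypothetical and use only the Python standard library. It shows that structured and semi-structured records yield named fields directly, while an unstructured file initially offers little more than file-level metadata.

```python
import csv
import io
import json

# Structured: fixed columns known ahead of time (hypothetical policy record).
structured = "policy_id,premium\nP-100,250.00\n"
rows = list(csv.DictReader(io.StringIO(structured)))

# Semi-structured: self-describing keys, but fields may vary from record to record.
semi_structured = '{"policy_id": "P-100", "riders": ["dental"]}'
record = json.loads(semi_structured)

# Unstructured: raw bytes with no schema (hypothetical stand-in for a PDF file).
unstructured = b"%PDF-1.7 ...binary content..."


def describe(blob: bytes) -> dict:
    """Return the file-level metadata available without content-specific parsing."""
    return {"size_bytes": len(blob), "looks_like_pdf": blob.startswith(b"%PDF")}


print(rows[0]["premium"])      # a named column can be read directly
print(record["riders"])        # a named key can be read directly
print(describe(unstructured))  # only coarse metadata without further processing
```

Extracting anything more from the third case, such as the text of a scanned document, would require a format-specific parser or model, which is why unstructured files are harder to bring into ordinary analytics.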

Among the types of unstructured data that Kleinerman said Snowflake users commonly encounter are PDF documents, scanned text and even call center voice recordings. He noted that organizations want to be able to execute the same types of data analytics and business intelligence operations on unstructured data as they do with structured data.

Kleinerman explained that with the new capabilities, Snowflake can directly query unstructured data from where it resides in a data lake.

The unstructured data can also be ingested and loaded into Snowflake as an unstructured data file. By having the data loaded in Snowflake, Kleinerman said users will get data governance and replication capabilities that they wouldn't be able to take advantage of if the data stays in a data lake.

"We want to provide choice," Kleinerman said. "There are performance and governance benefits of ingesting data into Snowflake, but we don't want to force anyone to do that so we can interact with data that resides out of Snowflake."
