The Snowflake partner network is finding optimization opportunities around the company's cloud data management platform.
Snowflake's customer base, which includes 590 Forbes Global 2000 companies, would seem to offer partners plenty of room for optimization work. Customers collectively run more than 515 million data workloads daily on the platform, according to the company. The average number of daily queries for April 2023, the most recent month for which data is available, was 2.9 billion.
Forms of Snowflake optimization
Snowflake's architecture includes a cloud storage foundation, a compute layer and cloud services. Query processing takes place in the compute layer's virtual data warehouses, while cloud services manage authentication, metadata and query parsing, among other functions.
Optimization tends to be multifaceted in such an environment.
"Optimization can mean a lot of things to a lot of different people," said Laura McKinley, principal analytics consultant at DAS42, a data consultancy and Snowflake services partner based in New York City.
A customer new to Snowflake might first focus on managing costs. Snowflake offers a pay-for-what-you-use, consumption-based pricing approach, where credits are the unit of measure. Customers coming from an on-premises data warehouse setting might have concerns around the consumption model and tracking cloud spend. Initial discussions with consultants or Snowflake's solutions architects revolve around the amount of data the customer plans to store and the nature of the I/O workload.
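The arithmetic behind those early conversations can be sketched simply. The following is an illustrative Python model, not a pricing tool: the credits-per-hour figures follow Snowflake's published doubling pattern by warehouse size, but the dollar rate per credit is a placeholder, since actual rates vary by edition, cloud and region.

```python
# Illustrative sketch of Snowflake's consumption-based billing model.
# Credit rates follow the published per-size doubling pattern
# (XS = 1 credit/hour, each size up doubles). The price per credit
# is an assumed placeholder, not an actual Snowflake rate.

CREDITS_PER_HOUR = {
    "XS": 1, "S": 2, "M": 4, "L": 8, "XL": 16, "2XL": 32,
}

def estimate_monthly_cost(size: str, hours_per_day: float,
                          price_per_credit: float = 3.00,
                          days: int = 30) -> float:
    """Rough monthly compute cost for one warehouse running a steady workload."""
    credits = CREDITS_PER_HOUR[size] * hours_per_day * days
    return credits * price_per_credit

# A Medium warehouse running 8 hours a day at the assumed rate:
cost = estimate_monthly_cost("M", hours_per_day=8)  # 960 credits -> $2,880
```

Even a rough model like this shows why rightsizing matters: moving one size up doubles the hourly burn rate, so idle or oversized warehouses dominate the bill.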
"They estimate that with the client in order to negotiate the right pricing based on how big and how much consumption they're going to have," said Erik Duffield, co-founder and CEO at Hakkoda, a Snowflake services partner that offers data modernization and managed services.
Achieving a balance
But customers quickly move on from focusing strictly on costs. McKinley said she hesitates to call cost cutting "optimization" because such measures don't always meet an organization's broader business goals.
Smart optimization, which looks to balance cost and performance, helps businesses make better use of data -- while also driving down credit usage or cost, she said.
"What I'm really seeing in the market is more around, 'How do I better use Snowflake, use it in a more intelligent way?' rather than, 'How do I cut costs?'" McKinley noted.
Duffield also cited balancing price and performance as an optimization opportunity. Snowflake actually makes this task easier because its consumption model illuminates use, he noted. Other data warehousing approaches bury costs in numerous on-premises appliances or cloud services, obscuring visibility.
With Snowflake, "you have a finer-grained ability to balance costs and performance," Duffield said. "But there is a learning curve to get good at that."
For example, businesses need to determine whether they should balance cost-performance for individual workloads or optimize their overall infrastructure costs as a whole, he added.
Tuning Snowflake infrastructure to balance performance, cost
Tuning, or rightsizing, Snowflake infrastructure falls within the scope of optimization. Partners offer consulting services and tools for doing so.
Capital One Software, the financial services company's enterprise B2B software business, launched a Snowflake optimization product, Slingshot, in 2022. The offering originated as a tool for simplifying Capital One's internal Snowflake provisioning processes. The latest release, which rolled out in June 2023, includes an updated recommendation engine that targets infrastructure efficiency.
"We learned a lot about the term efficiency over the past year," said Patrick Barch, senior director of product management at Capital One Software, a Snowflake Marketplace partner. "For some customers, that just means running at the lowest possible cost. But for most customers, efficiency really does mean striking that perfect balance between cost savings and performance."
While the previous recommendation system focused on helping customers save money, the updated engine takes on performance as well.
"We've broadened the scope to also help customers understand when it might be worth applying a little extra horsepower to warehouses that are performing poorly," Barch said. "Or you have some workloads that are taking longer than you might expect. So, you might want to spend a little bit more to give your users a better experience."
Keeping user personas in mind can help identify effective infrastructure optimization opportunities. McKinley pointed to the example of senior executive personas who emphasize speed and aren't willing to wait for data much longer than 30 seconds -- and might want to see results in less than 10 seconds. This use case might call for adding aggregation tables or provisioning a new warehouse, she said. Such measures reduce queuing, which happens when a warehouse's compute resources become overloaded and queries must wait for available resources.
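Queuing of this kind is visible in query history, where Snowflake records how long each query waited for warehouse resources. The sketch below is a simplified illustration: the field name mirrors the queued_overload_time column (reported in milliseconds) in Snowflake's ACCOUNT_USAGE.QUERY_HISTORY view, but the rows are fabricated for the example.

```python
# Sketch of spotting queuing pressure from query-history records.
# The field name mirrors Snowflake's ACCOUNT_USAGE.QUERY_HISTORY
# column queued_overload_time (milliseconds); the rows are fabricated.

def flag_queued_queries(rows, threshold_ms=1000):
    """Return IDs of queries that waited in the queue longer than threshold_ms."""
    return [r["query_id"] for r in rows
            if r.get("queued_overload_time", 0) > threshold_ms]

history = [
    {"query_id": "q1", "queued_overload_time": 0},
    {"query_id": "q2", "queued_overload_time": 4500},   # waited 4.5 s
    {"query_id": "q3", "queued_overload_time": 12000},  # waited 12 s
]

slow_starts = flag_queued_queries(history)  # ["q2", "q3"]
```

A pattern of nonzero queue times on one warehouse is the signal McKinley describes: it suggests adding a warehouse or scaling out, rather than letting an executive's 10-second query wait behind batch work.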
Snowflake query optimization
Rightsizing infrastructure represents half of the Snowflake optimization task, Barch noted. The other half is optimizing workloads. Slingshot's Query Advisor identifies an organization's most costly workloads and provides recommendations on how to rewrite queries to be more efficient and cost-effective, he said.
In a product demonstration, Barch used Query Advisor to analyze a query's text; the tool identified 19 performance improvement opportunities. For example, it flagged OR operators and JOIN clauses that can contribute to poor performance.
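The kind of static check described above can be illustrated with a toy heuristic. To be clear, this is not Capital One's Query Advisor, just a minimal sketch of flagging OR operators and join-heavy SQL text; the thresholds and advice strings are assumptions.

```python
# Toy heuristic in the spirit of the anti-pattern checks described above.
# NOT the actual Query Advisor -- an illustration only. Thresholds and
# messages are assumed for the example.

import re

def flag_antipatterns(sql: str) -> list[str]:
    """Scan SQL text for a couple of common performance anti-patterns."""
    findings = []
    upper = sql.upper()
    if re.search(r"\bOR\b", upper):
        findings.append("OR operator: can hinder pruning; consider IN lists "
                        "or UNION ALL")
    joins = len(re.findall(r"\bJOIN\b", upper))
    if joins >= 3:  # assumed threshold for "join-heavy"
        findings.append(f"{joins} JOIN clauses: check join keys and order")
    return findings

query = """
SELECT a.id FROM a
JOIN b ON a.id = b.a_id
JOIN c ON b.id = c.b_id
JOIN d ON c.id = d.c_id
WHERE a.region = 'US' OR a.region = 'EU'
"""
issues = flag_antipatterns(query)  # flags both the OR and the three joins
```

Production tools pair text-level checks like these with execution statistics, so a flagged pattern is only surfaced when it actually correlates with cost.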
Hakkoda, meanwhile, offers a tool that addresses workload issues. HakkodaRM, a resource monitoring and observability service, lets customers respond to alerts through Slack. The tool identifies long-running queries, which customers can reengineer for better performance, Duffield said.
HakkodaRM doesn't aim to create new capabilities around Snowflake, he added. Instead, the objective is to integrate with components such as Slack to help technologists more easily manage Snowflake environments.
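The integration pattern Duffield describes, detecting a long-running query and surfacing it in Slack, can be sketched as below. The threshold, payload shape and webhook mechanism are assumptions for illustration (Slack's standard incoming-webhook JSON format), not HakkodaRM internals.

```python
# Illustrative sketch of the integration pattern described above:
# detect a long-running query and post an alert to a Slack incoming
# webhook. Threshold and message wording are assumptions, not
# HakkodaRM internals.

import json
import urllib.request

def build_alert(query_id: str, elapsed_s: float, threshold_s: float = 300.0):
    """Return a Slack-style payload if the query exceeds the threshold, else None."""
    if elapsed_s <= threshold_s:
        return None
    return {"text": f"Query {query_id} has run {elapsed_s:.0f}s "
                    f"(threshold {threshold_s:.0f}s) -- consider reengineering it."}

def post_to_slack(webhook_url: str, payload: dict) -> None:
    """POST the payload as JSON, matching Slack's incoming-webhook format."""
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)  # fire-and-forget for the sketch

alert = build_alert("q42", elapsed_s=1800)  # 30 minutes -> alert payload
```

The design point matches the article's: the value is not new Snowflake capability but routing existing signals into tools, such as Slack, where engineers already work.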
Limited engagement or ongoing opportunity?
Service providers differ on the durability of Snowflake optimization as a service.
Duffield said he sees optimization as primarily a concern for early Snowflake adopters. Customers, as they gain experience and grow on the platform, will have less cause to discuss optimization.
"As customers have matured, this conversation has kind of slowed down," Duffield said.
In the typical customer lifecycle, an organization initially adopts Snowflake for a narrow workload, gets comfortable with it and then looks to add more workloads. Environments become more efficient as they grow, Duffield noted. Organizations operate fewer data silos as Snowflake expands its reach as a unified platform. As a result, less data moves among different systems and data management requirements shrink, he added.
Customer conversations, therefore, shift from optimization to determining ways to expand Snowflake's use in the business, Duffield said.
McKinley, however, said she sees a continuing need for optimization. She cited the example of a fintech client that asked DAS42 to assess its Snowflake-connected business intelligence (BI) platform, which users had deemed too slow. The company had been using Snowflake for several years. DAS42 optimized the client's Snowflake environment, driving "better usage patterns" for downstream BI, analyst and data science users.
Other reasons to revisit older Snowflake environments include events such as adding a new data source, McKinley said.
"I always think there is an opportunity to be optimizing any data warehouse," she said.
John Moore is a writer for TechTarget Editorial covering the CIO role, economic trends and the IT services industry.