Databricks has launched a new lakehouse designed specifically for manufacturers.
Based in San Francisco, Databricks is a data management vendor whose data lakehouses combine the structured data storage capabilities of data warehouses with the unstructured data storage capabilities of data lakes.
The vendor unveiled its first industry-specific lakehouse in January 2022 when it made Databricks Lakehouse for Retail generally available. Since then, it has also launched industry-specific lakehouses for finance, healthcare and life sciences, and media and entertainment.
Introduced on April 4, the Databricks Lakehouse for Manufacturing is now the fifth industry-specific lakehouse the vendor has released.
Each contains prebuilt capabilities -- including connectors, a series of best practices and external datasets -- that tailor it to the needs of enterprises in its target industry and make it easier for those organizations to get started using Databricks.
The concept isn't new, according to Doug Henschen, an analyst at Constellation Research. In fact, Databricks competitor Snowflake has also launched five industry-specific versions of its data cloud, starting with financial services in 2021.
But given that they can make data analysis and data science simpler for organizations in a particular industry, the targeted industry platforms are significant.
"These industry clouds are an update of the industry blueprints, templates and prebuilt solutions that enterprise vendors have offered for decades," Henschen said. "They typically include prebuilt, industry-specific content including data connectors, data transformation aids and industry data models. In Databricks' case, the emphasis seems to be on data science for typical manufacturing use cases."
The new lakehouse
While many industries require that systems stay up and running, manufacturing is perhaps more reliant than most others on making sure systems remain operational.
When one part of a manufacturing enterprise's operations fails, it can affect the entire production process and effectively shut down a business until the failure is fixed. For example, when one part of an energy grid breaks down, the entire grid can fail.
Therefore, predictive maintenance and preventative action are particularly important in manufacturing.
The lakehouse's predictive analytics capabilities enable customers to ingest data from IoT devices in real time and perform time-series processing and analysis aimed at maximizing operational efficiency and minimizing maintenance costs.
Its support for digital twins, meanwhile, enables engineers to assess risk and optimize designs before delivering insights to applications.
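To make the idea of time-series analysis for predictive maintenance concrete, here is a minimal sketch in plain Python. It is not Databricks code and does not use any Databricks API. The function name `flag_anomalies`, the window size, the threshold and the sample vibration readings are all assumptions made for illustration. The sketch flags a sensor reading when it strays far from the average of the readings just before it, which is one simple way an unusual vibration or temperature value might be surfaced before a failure.

```python
from collections import deque
from statistics import mean, pstdev


def flag_anomalies(readings, window=5, threshold=3.0):
    """Return the indices of readings that deviate from the mean of the
    preceding `window` readings by more than `threshold` standard deviations.

    Hypothetical helper for illustration only; not part of any vendor product.
    """
    recent = deque(maxlen=window)  # trailing window of earlier readings
    flagged = []
    for index, value in enumerate(readings):
        if len(recent) == window:
            mu = mean(recent)
            sigma = pstdev(recent)
            # Skip the check when the window is perfectly flat (sigma == 0).
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                flagged.append(index)
        recent.append(value)
    return flagged


# Made-up vibration readings with one obvious spike at index 6.
vibration = [0.50, 0.52, 0.49, 0.51, 0.50, 0.51, 0.95, 0.50]
print(flag_anomalies(vibration))  # prints [6]
```

A production system would work on streaming data at far larger scale and would likely use a trained model rather than a fixed threshold, but the underlying pattern of comparing each new reading with recent history is the same.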
In addition, the Databricks Lakehouse for Manufacturing includes the following:
- A feature that enables organizations to forecast demand for individual parts rather than overall demand, which helps organizations more precisely predict demand for parts and maintain inventory.
- Tools to monitor the overall effectiveness of equipment by ingesting and processing data collected by sensors and other IoT devices.
- Computer vision that helps organizations automate certain manufacturing processes to improve quality and reduce expenses.
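Equipment effectiveness is commonly tracked with a metric known as overall equipment effectiveness (OEE), the product of availability, performance and quality. The article does not say how Databricks calculates it, so the following is only a sketch of the standard textbook formula in plain Python. The function name `overall_equipment_effectiveness`, its parameters and the shift figures are assumptions chosen for illustration.

```python
def overall_equipment_effectiveness(
    planned_minutes,
    downtime_minutes,
    ideal_cycle_minutes,
    total_units,
    good_units,
):
    """Compute OEE as availability * performance * quality.

    Hypothetical helper using the standard textbook definition; it is not
    taken from any vendor product.
    """
    run_minutes = planned_minutes - downtime_minutes
    if planned_minutes <= 0 or run_minutes <= 0 or total_units <= 0:
        raise ValueError("planned time, run time and total units must be positive")

    # Share of planned time the machine was actually running.
    availability = run_minutes / planned_minutes
    # Actual output compared with the ideal output for the time spent running.
    performance = (ideal_cycle_minutes * total_units) / run_minutes
    # Share of produced units that passed inspection.
    quality = good_units / total_units
    return availability * performance * quality


# Made-up figures for one eight-hour shift on a single machine.
score = overall_equipment_effectiveness(
    planned_minutes=480,
    downtime_minutes=80,
    ideal_cycle_minutes=0.5,
    total_units=720,
    good_units=684,
)
print(f"OEE: {score:.1%}")
```

In practice, the downtime and unit counts would come from sensor and IoT data rather than hand-entered numbers, which is where the ingestion and processing tools described above would come in.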
"The emphasis seems to be on data science for typical manufacturing use cases, with examples here including support for digital twins, predictive maintenance, part-level forecasting and computer vision," Henschen said.
One of the key reasons Databricks began developing industry-specific lakehouses is to help organizations better succeed in their analytics and data science initiatives, according to Shiv Trisal, global industry leader of manufacturing at Databricks.
The majority of data science projects never make it into production, with some estimating a failure rate approaching 90%. By providing prebuilt features geared specifically toward organizations in a given industry, Databricks is attempting to make it easier to build data products and models that lead to data-informed decisions.
"Our intent with these accelerators is to … help [customers] get quick wins and replicate them in a standardized manner," Trisal said. "That can then lead to greater benefits as [organizations] scale the number of assets, users, locations and customers."
Manufacturing, meanwhile, is ideal for an industry-specific lakehouse. The amount of data collected by manufacturers has grown rapidly, and that growth has made it difficult to harness the data and make it usable for analysis.
The COVID-19 pandemic forced manufacturers to become more digital as a means of continuing to do business, Trisal noted. As a result, while there continues to be exponential growth in the amount of data generated worldwide, that growth is even more rapid in manufacturing.
Manufacturers' data volumes are now between two and four times greater than Databricks has seen in any other industry, largely the result of IoT devices, according to Trisal.
In addition, the data can be structured, unstructured or semi-structured, making lakehouses a good landing point rather than data warehouses or data lakes that are geared to handle only one type of data.
"The complexity is increasing every day," Trisal said. "Digitization of the products themselves is driving the market in this direction. It comes down to massive data volumes and how we can create value for these new IoT data sets."
Down the road
April 4 marked the initial launch of the Databricks Lakehouse for Manufacturing. As with any other product, Databricks plans to add to and enhance its capabilities over time.
Based on customer feedback during the testing process, Databricks added a use case for quality control. Going forward, the vendor plans to add more applications to the system, according to Trisal.
Databricks is seeing manufacturers evolve to become more like technology companies. As a result, manufacturers are developing more digital products that Databricks can subsequently make more intelligent with AI, Trisal noted.
"That's something we want to emphasize in the next few quarters," he said.
As far as adding the next industry-specific lakehouse, Henschen said that telecommunications would be a potential choice.
He noted that Snowflake launched a data cloud for telecommunications in February, while Databricks does not yet have a telecommunications-specific lakehouse.
"They're obviously considering which vertical industry has the biggest potential for Databricks," Henschen said. "I'm sure they're also considering which industries are most in need of industry-specific capabilities."
Eric Avidon is a senior news writer for TechTarget Editorial and a journalist with more than 25 years of experience. He covers analytics and data management.