Databricks and Microsoft collaboration boosts MLflow tool

Databricks' MLflow gets support from Microsoft and new features, including easier model tracking, as the open source tool prepares for its 1.0 release next month.

SAN FRANCISCO -- Microsoft now natively supports MLflow, an open source machine learning management tool first developed by Databricks, within its Microsoft Azure Machine Learning service. Also, the tech giant, which is a longtime partner of Databricks, said it will actively contribute to MLflow.

Unveiled at the Spark + AI Summit 2019, sponsored by Databricks, the new Databricks and Microsoft collaboration is a sign of the companies' deepening ties, but it is too new to say how effectively the partnership will advance MLflow for developers, said Mike Gualtieri, a Forrester analyst.

Microsoft has sold Azure Databricks, an Apache Spark-based analytics service that uses Databricks technology, since 2017. Previously, MLflow was only available natively on that Microsoft service.

"MLflow is a welcome tool for ML [machine learning] developers, but I think it is very overhyped, because this is still early days for these types of tools," Gualtieri said.

Gualtieri noted that similar open source tools, such as Google's Kubeflow, have benefited from the support of big cloud providers.

The rise of MLflow

Databricks first introduced MLflow in June 2018. Right away, startups and larger enterprises started using it to manage their machine learning lifecycles. Since its release, more than 80 contributors from some 40 companies have worked on the open source machine learning tool, and it regularly sees more than 500,000 downloads per month.

Matei Zaharia, co-founder and chief technologist at Databricks, speaks on MLflow at the Spark + AI Summit 2019 in San Francisco on April 25.

Hotels.com, a travel booking site that is a part of the multibillion dollar travel technology company Expedia, uses MLflow to augment some of the many data science platforms it uses.

"Most companies -- ours included -- have huge experimentation platforms, but those platforms are often missing some of those core machine-learning-specific metric sets, ability sets," Matthew Fryer, vice president and chief data science officer at Hotels.com, based in Dallas, said in an interview at the conference.

"MLflow allows us to augment these platforms. It allows us to pick up some of those core aspects you don't typically see in more generic experimentation platforms," Fryer continued.

Hotels.com was an early adopter of MLflow, Fryer noted. The site uses other Databricks products, too, as well as a number of other data science and machine learning tools, including TensorFlow and Amazon platforms.

"Clearly, it's a product that is developing and evolving," Fryer said of MLflow. "The use cases it's trying to solve, it's super important to us. It's already helping, but it's very exciting to see where development is going to go."

New features and the Databricks and Microsoft collaboration

At the same time Databricks and Microsoft made public their new collaboration, Databricks revealed that MLflow 1.0 is set to be released in May. MLflow 0.9.1 came out on April 21 in preparation for the 1.0 release, which will help stabilize the MLflow API for long-term use, the vendor said.

Databricks also presented two new features: MLflow Workflows and MLflow Model Registry.

Matei Zaharia, co-founder and chief technologist at Databricks, based in San Francisco, explained during a keynote on April 25 that the Workflows component will enable users to go into their data and change parameters in real time, without having to overhaul their code.

As for Model Registry, it "lets you manage, tag and version models in the server, and then keep track [of] where it's deployed, what version is deployed and so on," Zaharia said.

The Databricks and Microsoft collaboration was the headlining MLflow story of the Spark + AI Summit 2019, however.

Microsoft, a longtime user of Spark-based products, is "embracing the open source culture," Shivani Patel, a program manager at Microsoft, said in an interview at the conference.

"You can still use MLflow, but you can also use those APIs with our machine learning service," she said.

With the new collaboration between Microsoft Azure and Databricks, Azure Machine Learning users can use MLflow, but don't have to use Microsoft code.

"They can write all their code in the MLflow API, and then they can pull it right up on the portal," Patel said. "We are continuously investing in MLflow to make sure it's integrating with Machine Learning."
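For readers unfamiliar with what writing code "in the MLflow API" looks like, the following is a minimal sketch, not Microsoft's or Databricks' implementation. It assumes only the long-standing MLflow tracking calls `mlflow.start_run`, `mlflow.log_param` and `mlflow.log_metric`. The `InMemoryTracker` class, the `train_and_log` function and the toy loss calculation are hypothetical, introduced here purely for illustration.

```python
# Minimal sketch of MLflow-style experiment tracking (assumptions noted inline).


class InMemoryTracker:
    """Hypothetical stand-in that mimics mlflow.log_param / mlflow.log_metric."""

    def __init__(self):
        self.params = {}
        self.metrics = {}

    def log_param(self, key, value):
        self.params[key] = value

    def log_metric(self, key, value):
        self.metrics[key] = value


def train_and_log(learning_rate, epochs, tracker):
    """Toy 'training' loop that records its inputs and a made-up loss value."""
    tracker.log_param("learning_rate", learning_rate)
    tracker.log_param("epochs", epochs)
    loss = 1.0
    for _ in range(epochs):
        loss *= 1.0 - learning_rate  # illustrative decay, not a real model
    tracker.log_metric("loss", loss)
    return loss


def run_with_mlflow(learning_rate=0.1, epochs=3):
    """Same code path, logged through the real MLflow tracking API."""
    import mlflow  # imported lazily so the sketch loads without MLflow installed

    with mlflow.start_run():
        # The mlflow module itself exposes log_param and log_metric,
        # so it can be passed wherever the stand-in tracker is accepted.
        return train_and_log(learning_rate, epochs, tracker=mlflow)
```

Calling `run_with_mlflow()` on a machine with MLflow installed should record the run in whatever tracking backend is configured. Pointing MLflow at a different backend is typically done with `mlflow.set_tracking_uri`; how a given Azure Machine Learning workspace exposes that address is not covered in this article and is left out of the sketch.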

The Spark + AI Summit 2019 was held at the Moscone Center April 23 to 25.
