TechTarget.com/whatis

https://www.techtarget.com/whatis/definition/machine-learning-operations-MLOps

What is machine learning operations (MLOps)?

By Cameron Hashemi-Pour

Machine learning operations (MLOps) is the development and use of machine learning models by development operations (DevOps) teams. MLOps adds discipline to the development and deployment of ML models, making the development process more reliable and productive.

MLOps encompasses a set of processes, rather than a single framework, that machine learning developers use to build, deploy, continuously monitor and retrain their models. It's at the heart of machine learning engineering, blending artificial intelligence (AI) and machine learning techniques with DevOps and data engineering practices.

There are many steps needed before an ML model is ready for production, and several players are involved. The MLOps development philosophy is relevant to IT pros who develop ML models, deploy the models and manage the infrastructure that supports them. Producing iterations of ML models requires collaboration and skill sets from multiple IT groups, such as data science teams, software engineers and ML engineers.

Development of deep learning and other ML models is considered experimental, and failures are part of the process in real-world use cases. The discipline is evolving, and it's understood that, sometimes, even a successful ML model might not function the same way from one day to the next.

How MLOps works

MLOps implements the machine learning lifecycle, the series of stages that an ML model must pass through to become production-ready. The following are the four cycles that make up the ML lifecycle:

  1. Data cycle. The data cycle entails gathering and preparing data for training. First, raw data is culled from appropriate sources, and then techniques such as feature engineering and data labeling are used to transform, manipulate and organize the raw data into labeled data that's ready for model training.
  2. Model cycle. This cycle is where the model is trained on the prepared data. Once a model is trained, it's important to track each subsequent version of it as it moves through the rest of the lifecycle. Tools such as the open source MLflow can simplify this tracking.
  3. Development cycle. Here, the model is further developed, tested and validated so that it can be deployed to a production environment. Deployment can be automated using continuous integration/continuous delivery (CI/CD) pipelines and configurations that reduce the number of manual tasks.
  4. Operations cycle. The operations cycle is an end-to-end monitoring process that ensures the production model continues working and is retrained to improve performance over time. MLOps can automatically retrain an ML model either on a set schedule or when triggered by an event, such as a model performance metric falling below a certain threshold.
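
The two retraining triggers described in the operations cycle, a set schedule and a performance metric falling below a threshold, can be sketched in a few lines of Python. This is a minimal illustration rather than the API of any particular MLOps tool: the `RetrainPolicy` class, the `retrain_reason` function, the accuracy threshold of 0.90 and the 30-day schedule are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class RetrainPolicy:
    """Hypothetical policy: retrain on a schedule or when a metric degrades."""
    min_accuracy: float        # event trigger: retrain if accuracy drops below this
    max_model_age: timedelta   # schedule trigger: retrain once the model is this old


def retrain_reason(
    policy: RetrainPolicy,
    current_accuracy: float,
    last_trained: datetime,
    now: datetime,
) -> Optional[str]:
    """Return the reason retraining is due, or None if no trigger has fired."""
    # Event-based trigger: the monitored metric fell below the threshold.
    if current_accuracy < policy.min_accuracy:
        return "metric_below_threshold"
    # Schedule-based trigger: the model is older than the allowed age.
    if now - last_trained >= policy.max_model_age:
        return "schedule"
    return None


# Illustrative values only; real thresholds depend on the model and use case.
policy = RetrainPolicy(min_accuracy=0.90, max_model_age=timedelta(days=30))
trained = datetime(2024, 9, 1)

print(retrain_reason(policy, 0.95, trained, datetime(2024, 9, 10)))  # None
print(retrain_reason(policy, 0.85, trained, datetime(2024, 9, 10)))  # metric_below_threshold
print(retrain_reason(policy, 0.95, trained, datetime(2024, 10, 5)))  # schedule
```

In practice, a check like this would typically run inside a monitoring job, and a non-empty result would start the training pipeline instead of printing a string.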

Main components of MLOps

Various components make up the MLOps model-building process. They're usually implemented sequentially and ensure the reproducibility of the process. The four cycles in the MLOps lifecycle provide an overview of the process, but these cycles can be broken down into more detailed components:

Why is MLOps necessary?

Machine learning models aren't built once and forgotten; they require continuous training so that they improve over time. That's where MLOps comes in. It provides the ongoing training and constant monitoring needed to ensure ML models operate successfully.

MLOps documents reliable processes and governance strategies to prevent problems, reduce development time and create better models. MLOps uses repeatable processes in the same way businesses use workflows for organization and consistency. In addition, MLOps automation ensures time isn't wasted on tasks that are repeated each time new models are built.

What are the benefits of MLOps?

MLOps provides a range of benefits, such as the following:

MLOps challenges

MLOps might be more efficient than traditional approaches, but it's not without its challenges. They include the following:

Key use cases for MLOps

On the surface, MLOps appears to be exclusive to the tech industry; however, other industries find value in using MLOps practices to enhance their operations:

MLOps vs. DevOps

The most obvious similarity between DevOps and MLOps is the emphasis on streamlining design and production processes. However, the clearest difference between the two is that DevOps produces the most up-to-date versions of software applications for customers as fast as possible, a key goal of software vendors. MLOps is instead focused on surmounting the challenges that are unique to machine learning to produce, optimize and sustain a model.

DevOps typically involves development teams that program, test and deploy software apps into production. MLOps aims to do the same with ML systems and models but with a handful of additional phases. These include extracting raw data for analysis, preparing data, training models, evaluating model performance, and continuously monitoring and retraining models.
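
To make the additional phases concrete, the following Python sketch chains extraction, preparation, training and evaluation into one small pipeline. It is a toy under stated assumptions, not a production design: the function names, the hard-coded records and the choice of a one-variable least-squares model are all hypothetical, and the continuous monitoring and retraining phase is left out for brevity.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

Row = Tuple[float, float]  # (feature, label)


def extract() -> List[Dict[str, str]]:
    """Stand-in for pulling raw records from a source system."""
    return [
        {"x": "1", "y": "2.0"},
        {"x": "2", "y": "4.1"},
        {"x": "bad", "y": "1.0"},  # malformed record, dropped during preparation
        {"x": "3", "y": "5.9"},
        {"x": "4", "y": "8.0"},
    ]


def prepare(raw: List[Dict[str, str]]) -> List[Row]:
    """Convert raw string fields to numbers and discard unparseable records."""
    rows: List[Row] = []
    for record in raw:
        try:
            rows.append((float(record["x"]), float(record["y"])))
        except ValueError:
            continue
    return rows


def train(rows: List[Row]) -> Callable[[float], float]:
    """Fit y = slope * x + intercept by ordinary least squares."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in rows) / sum((x - mx) ** 2 for x in xs)
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def evaluate(model: Callable[[float], float], rows: List[Row]) -> float:
    """Mean absolute error of the model on the given rows."""
    return mean(abs(model(x) - y) for x, y in rows)


def run_pipeline() -> float:
    rows = prepare(extract())
    train_rows, holdout_rows = rows[:-1], rows[-1:]  # hold out the last row
    model = train(train_rows)
    return evaluate(model, holdout_rows)


print(f"holdout MAE: {run_pipeline():.2f}")
```

Each function corresponds to one phase, so a CI/CD system could run, test and rerun the phases independently, which is the same discipline DevOps applies to application builds.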

MLOps vs. ML engineering

The term ML engineering is sometimes used interchangeably with MLOps; however, there are key differences. MLOps encompasses all processes in the lifecycle of an ML model, including predevelopment data aggregation, data preparation, and post-deployment upkeep and retraining. Meanwhile, ML engineering is focused on the stages of developing and testing a model for production, similar to what software engineers do.

For example, an MLOps team designates ML engineers to handle the training, deployment and testing stages of the MLOps lifecycle. These professionals possess the same skills as typical software developers. Others on the operations team may have data analytics skills and perform predevelopment tasks related to data. Once the ML engineering tasks are completed, the team at large performs continual maintenance and adapts to changing end-user needs, which might call for retraining the model with new data.

Best practices for MLOps

There are many useful strategies that MLOps teams adhere to. The following set of practices can help guide a successful machine learning project to completion and reduce its likelihood of failure:

How an organization can implement MLOps

There is no single right way to acquire the skilled employees, tools and infrastructure needed to run an MLOps operation. That said, there are three levels of MLOps implementation that coincide with an organization's needs:

There are four types of ML training approaches. Supervised machine learning is the most common, but there's also unsupervised learning, semisupervised learning and reinforcement learning.

05 Sep 2024
