machine learning operations (MLOps)
Machine learning operations (MLOps) is the use of machine learning models by development/operations (DevOps) teams. MLOps seeks to add discipline to the development and deployment of machine learning models by defining processes to make ML development more reliable and productive.
The development of machine learning models is inherently experimental, and failures are often part of the process. The discipline is still evolving, and it is understood that even a successful ML model may not function the same way from one day to the next. Documenting reliable processes and creating safeguards can reduce development time and lead to better models.
The MLOps development philosophy is used by those who develop machine learning models, those who deploy them and those who manage the infrastructure that supports them. Standard practices for MLOps include:
- Starting with product APIs from existing AI services.
- Taking a modular approach.
- Running parallel model development, so that the failure of a single model does not halt the whole effort.
- Having pre-trained models ready to show proof of concept.
- Further training generalized algorithms that show some success so they fit their specific task.
- Bridging gaps in training data with publicly available data sources.
- Taking time to develop generalized AI in order to broaden opportunities.
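The modular and parallel-development practices in the list above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a prescribed MLOps implementation: the toy dataset, the three candidate "trainers" and all function names are hypothetical, and the candidates are trained one after another in a loop for simplicity rather than on separate infrastructure.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical toy data: (feature, label) pairs, for illustration only.
Dataset = List[Tuple[float, int]]
Model = Callable[[float], int]

DATA: Dataset = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1)]


def train_threshold(data: Dataset) -> Model:
    """Candidate 1: predict 1 when the feature exceeds the mean feature value."""
    cutoff = sum(x for x, _ in data) / len(data)
    return lambda x: int(x > cutoff)


def train_majority(data: Dataset) -> Model:
    """Candidate 2: always predict the most common label."""
    labels = [y for _, y in data]
    majority = max(set(labels), key=labels.count)
    return lambda x: majority


def train_broken(data: Dataset) -> Model:
    """Candidate 3: simulates an experiment that fails during training."""
    raise RuntimeError("training diverged")


def accuracy(model: Model, data: Dataset) -> float:
    """Fraction of examples the model labels correctly."""
    return sum(model(x) == y for x, y in data) / len(data)


def run_candidates(
    trainers: Dict[str, Callable[[Dataset], Model]], data: Dataset
) -> Dict[str, Optional[float]]:
    """Train each candidate independently; a failure is recorded, not fatal."""
    results: Dict[str, Optional[float]] = {}
    for name, trainer in trainers.items():
        try:
            results[name] = accuracy(trainer(data), data)
        except Exception:
            results[name] = None  # This candidate failed; the others continue.
    return results


RESULTS = run_candidates(
    {
        "threshold": train_threshold,
        "majority": train_majority,
        "broken": train_broken,
    },
    DATA,
)

# Pick the best candidate among those that trained successfully.
BEST = max(
    (name for name, score in RESULTS.items() if score is not None),
    key=lambda name: RESULTS[name],
)
print(RESULTS, BEST)
```

Because each candidate is a separate module behind the same interface, the failed experiment is simply recorded as a missing score while the remaining candidates are still evaluated and compared.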
Staffing is a challenging and important part of developing MLOps. The data scientists who develop machine learning algorithms may not be the most effective at deploying them, or at explaining to software developers how to use them. Some of the best MLOps teams embrace the idea of cognitive diversity: the inclusion of people who have different styles of problem-solving and can offer unique perspectives because they think differently.