Machine learning is being used to create predictive algorithms for everything from stock market forecasts and customer prospecting to predicting when manufacturing equipment will fail and which patients are at risk of opioid abuse.
With so many possible applications, it's no wonder leaders in the C-suite want their IT teams to get going with machine learning projects. But as any machine learning practitioner will tell you, it isn't the solution for every problem.
When to use machine learning to create a predictive algorithm and how to make it work is a common question for Nick Patience, co-founder and research vice president at 451 Research. He explained that there are four necessary components you need before you can move into machine learning for predictive modeling:
- a chosen algorithm with which to work;
- a large amount of good data that isn't siloed;
- compute resources, mostly in the cloud; and
- a team with the right skills.
Machine learning musts
Machine learning only works when you understand the business problem and the established rules around it. When you have these two things, you know the process in question is valuable because it has been hardcoded. But business rules often become too rigid over time, so they could be made more flexible, Patience said.
Key components of machine learning
A large portion of machine learning data is stored in the cloud, so companies need to be comfortable with having their data stored outside of their own data centers; otherwise, the investment in on-premises infrastructure will be significant, said Patience, who also led a recent AI Trends webinar called "2018 Trends in AI and Machine Learning."
Another important factor to note is that the data targeted for analysis with a predictive algorithm can't be siloed in different proprietary systems; it needs to be integrated, so there has to be a data preparation process. That's where Hadoop data lakes, Spark and other big data platforms come into play, as well as data warehouses.
You also have to make sure you're integrating not only data and platforms, but domain experts who bring invaluable information and skills to the data science team, according to David Ledbetter, a data scientist at Children's Hospital Los Angeles.
"The machine learning community often isolates themselves and thinks they can solve all the problems, but domain experts bring value," Ledbetter said during a panel discussion at the AI World Conference & Expo in Boston in December. "Every time we meet with the clinical team, we learn something about what's going on with the data."
Machine learning algorithms and outcomes
The project team, with its mix of skills, needs to also identify good vs. bad outcomes based on the business problem you're trying to solve with a predictive algorithm.
"It's important to set clear success criteria at the beginning of a project, and [to] pick something that has a reasonable likelihood of success," said William Mark, president of SRI International, a research and development firm that works on AI projects for customers, during the same panel discussion at AI World. "This requires discussion with technical experts to define a problem that there's a reasonable probability of solving."
William Markpresident, SRI International
Another critical aspect is the possible impact on the user experience, as well as communicating to business execs how to interpret the data that results from a predictive algorithm.
"This isn't being thought through much yet," 451's Patience said, "but say you are looking at a lead scoring or prediction application in a sales automation process, and you put 78% on a screen. That's the score for this particular client, and you need to make sure the user understands what to do with that data. It's great to be able to produce all this smart data, but you need to think through the impact on the user experience."
In addition, users are more likely to tolerate the errors that are bound to surface in any machine learning project when they involve tasks people don't like doing themselves -- especially highly repetitive tasks, Ledbetter said.
Choosing the right machine learning algorithm is, of course, critical -- and there are plenty to consider, most of them open source. In the realm of supervised machine learning, you can choose from classification algorithms, including support vector machines, nearest neighbor, Naive Bayes and discriminant analysis, or regression algorithms, such as decision trees and neural networks.