10 AI tech trends data scientists should know
The rising environmental and monetary costs of deep learning are catching enterprises' attention, as are new AI techniques like graph neural networks and contrastive learning.
AI adoption is accelerating across industries, driven by a combination of concrete results, high expectations and a lot of money. Among the many new AI concepts and techniques launching almost daily, 10 AI tech trends in particular grab data scientists' attention.
1. Machine learning operations (MLOps)
Machine learning operations (MLOps) isn't a new concept, but it is a relatively new "Ops" practice that operationalizes machine learning models. MLOps seeks to understand what does and doesn't work in a model in order to build more reliable models in the future.
It's the last mile of machine learning model building, and a practice that historically hasn't been given much attention, said Lee Rehwinkel, VP of science at B2B pricing and sales software company Zilliant.
"It's one of the reasons a lot of models never see the light of day, but it's super important [because] you build a model but how do you know the uptime of that model? How fast is it going to make predictions? Does it need to be trained or retrained?" he said.
2. Contrastive learning
Contrastive learning is a machine learning technique that finds similar and dissimilar things in a data set without labels. It can be used on an image database, for example, to find images like each other.
"Contrastive learning is becoming the new paradigm in unsupervised learning. The reason unsupervised learning is so useful is that the internet is a treasure trove of unlabeled data of text and pictures," said Cameron Fen, head of research at A.I. Capital Management.
"Typically, you could do this with transfer learning, but what makes contrastive learning so exciting is that you can do this with data that are too expensive to label and with a much larger data set than fine-tuning a prebuilt image classifier on ImageNet," he said.
3. Transformers
A transformer is a neural network architecture that, like recurrent neural networks (RNNs), handles sequential input data. It's used widely in language models, including language translation and speech-to-text applications.
Created by Google researchers in 2017, transformers have come to replace popular RNN models, such as the long short-term memory (LSTM) algorithm, used in natural language processing applications.
A transformer "learns to put higher weights on time periods that it wants to pay attention to, creating a weighted average of your inputs to feed into the model," Fen said. "This allows the model to be parallelized and have a longer memory [than LSTM models].
4. Carbon footprint
Higher data storage and compute needs for AI workloads increase a company's carbon emissions in an era when many countries are party to the Paris Agreement and U.S. state governors are joining the United States Climate Alliance. As companies use more storage and compute to take advantage of deep learning, their growing carbon footprint conflicts directly with corporate sustainability imperatives to reduce emissions.
"There are pitfalls around the cost of running deep learning," said Ravi Guntur, head of machine learning at Traceable.ai, which enables API and app security for cloud-native apps. "[The University of Massachusetts at Amherst] found that training one deep learning model [produces 626,000 pounds of planet-warming carbon dioxide], equal to the emission of five cars during their lifetime."
5. The monetary cost of deep learning
Machine learning also has a monetary cost. For example, it's entirely possible to run a neural network for an entire day, only to find out there's an overfitting problem. There's the cost of data storage and compute, and potentially a data scientist's wasted time spent waiting for the results.
"The cost of machine learning is impacting practitioners," said Guntur. "We constantly think about whether we need this cluster or this cluster of machines and GPUs. So, the question back to the engineering team is, is there an alternative algorithm we can use so we don't have to pay upfront for the CPUs and GPUs we want? Why can't you build an algorithm that's more efficient?"
6. Graph neural networks
Graphs are all about relationships. Made up of nodes -- representing a subject, such as a person, object or place -- and edges -- representing the relationships between nodes -- graphs can capture complex relationships.
Graph neural networks (GNNs) are a type of neural network architecture that can help make sense of graphs, enabling people to make node or edge predictions. For example, using GNNs, someone could predict which movie genre an actor will star in, or the side effects a new drug might elicit.
"These kinds of graphs are becoming more and more popular because it's rich information," said Guntur. It's challenging to work with graphs due to how much information they contain, he added.
7. Integrated tool sets that are easier to use
Data science team leads and data scientists have traditionally been forced to cobble together tools to build, test, train and deploy. In recent years, however, big-name tech vendors have acquired capabilities to round out their offerings so they can be a one-stop shop.
This enables data scientists to work in a single platform instead of juggling multiple platforms and tools, eliminating the problems that arise from moving data and models between them. Many of these platforms also feature low-code or no-code applications, making them faster and easier for data scientists to use.
"I can build a very good predictive model without ever necessarily getting my hands super deep in any type of code," Rehwinkel said. "It really helps me accelerate my ability to problem-solve."
8. Models that explain other models
In 2020, there was a major uptick in AI regulation and in work to draft more. Of significant note were the guidelines released by the U.S. Federal Trade Commission on "truth, fairness and equity" in AI, which warned companies against using biased algorithms. The European Commission also released a proposal for regulating AI that includes hefty fines for noncompliance.
As regulation ramps up, more AI vendors are releasing models that help explain other models, making it easier for enterprises to see the underlying reasons why their models make certain predictions.
"Soon we'll be using models to explain models," said Josh Poduska, chief data scientist at Domino Data Lab. "Interpretability, explanation and audit of machine and deep learning models is becoming vital due to this increased regulatory pressure and the need to be able to explain the why and how of predictions, not just what."
Since some AI systems are automating decisions, this creates "equity as code," said Chris Bergh, CEO and founder of DataOps platform vendor DataKitchen.
"Data scientists and business stakeholders must first work together to develop application-specific metrics that test for bias. These metrics can then be applied during the model development process to ensure that a biased application is never deployed," said Bergh. "Equity-as-code can be run on demand to detect bias and make sure it isn't deployed."
9. Contextualized word embeddings
Static word embeddings represent words as mathematical entities (e.g., vectors in a vector space), allowing the use of mathematics to analyze the semantic relatedness of words by the similarities of their embeddings. For example, "apple" is closer to "lemon" than "house."
"One of the most influential trends has been the move from stasis word embeddings like word2vec and GloVe to contextualized word embeddings like ELMo and BERT," said Silke Dodel, machine learning architect at conversational translation solution provider Language I/O.
Besides being Sesame Street characters, BERT and ELMo are language models that reduce training times and increase performance of state-of-the-art models.
"Contextual word embeddings resolve the problem of semantic dependence of a word on its context, such as 'bank' in the context of 'park' has a different meaning than 'bank' in the context of 'money,'" she said.
10. Small data
In today's big data era, there's a general misconception that big data is necessary to understand anything. Yet, there is also value in small data.
Small data is data that's small enough for people to understand, such as U.S. postal codes.
"When you're dealing with small data, you need to go back to some old concepts in machine learning and data scientists. You need to read up on some old papers to solve some of these small data and proprietary data problems," Guntur said. "Processing small data and coming up with algorithms for small data is very different from the current trend where everyone tries to use a neural network or all the variations of deep learning."