Generative adversarial networks and variational autoencoders are two of the most popular approaches used for producing AI-generated content. In general, GANs tend to be more widely used for generating multimedia, while VAEs see more use in signal analysis.
How does this translate to real-world, pragmatic value? Generative AI techniques help create AI models, synthetic data and realistic multimedia, such as voices and images. Although these techniques are sometimes used for creating deepfakes, they can also create realistic dubs for movies and generate images from brief text descriptions. They also generate drug discovery targets, recommend product design choices and improve security algorithms.
How do GANs work?
Ian Goodfellow and fellow researchers at the University of Montreal introduced GANs in 2014. Since then, GANs have shown tremendous promise in generating many types of realistic data. Yann LeCun, chief AI scientist at Meta, has written that GANs and their variations are "the most interesting idea in the last ten years in machine learning."
For starters, GANs have been used to generate realistic speech, including matching voices and lip movements to produce better translations. They have also translated imagery, differentiated between night and day and delineated dance moves between bodies. Combined with other AI techniques, they improve security and build better AI classifiers.
The actual mechanics of GANs involve the interplay of two neural networks that work together to generate and then classify data that is representative of reality. GANs generate content using a generator neural network that is tested against a second neural network: the discriminator, which determines whether the content looks "real." This feedback helps train a better generator network. The discriminator can also detect fake content or a piece of content that is not part of the domain. Over time, both neural networks get better, and the feedback helps them learn to generate data that's as close to reality as possible.
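The generator and discriminator feedback loop described above can be illustrated with a deliberately tiny example. The following is a minimal sketch, not a production GAN: it assumes one-dimensional "real" data drawn from a normal distribution with mean 4, a linear generator G(z) = a*z + b and a logistic discriminator D(x) = sigmoid(w*x + c). All names, starting values and the learning rate are illustrative assumptions, and the gradients are written out by hand so the example needs only the Python standard library.

```python
import math
import random


def sigmoid(u):
    # Numerically safe logistic function.
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def train_toy_gan(steps=2000, lr=0.01, seed=0):
    """Alternate discriminator and generator updates on 1D toy data."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator parameters: G(z) = a*z + b (assumed form)
    w, c = 0.1, 0.0  # discriminator parameters: D(x) = sigmoid(w*x + c)
    for _ in range(steps):
        x_real = rng.gauss(4.0, 1.0)  # a sample of "real" data
        z = rng.gauss(0.0, 1.0)       # random noise fed to the generator
        x_fake = a * z + b            # generated sample

        # Discriminator step: minimize -log D(x_real) - log(1 - D(x_fake)).
        s_real = sigmoid(w * x_real + c)
        s_fake = sigmoid(w * x_fake + c)
        grad_w = -(1.0 - s_real) * x_real + s_fake * x_fake
        grad_c = -(1.0 - s_real) + s_fake
        w -= lr * grad_w
        c -= lr * grad_c

        # Generator step: minimize -log D(x_fake), using the discriminator's
        # updated judgment as the feedback signal.
        s_fake = sigmoid(w * x_fake + c)
        grad_x = -(1.0 - s_fake) * w  # gradient of the loss w.r.t. x_fake
        a -= lr * grad_x * z
        b -= lr * grad_x
    return (a, b), (w, c)


if __name__ == "__main__":
    (a, b), (w, c) = train_toy_gan()
    print("generator:", a, b)
    print("discriminator:", w, c)
```

In this setup, the generator's only training signal is the discriminator's score for its samples, which is the feedback relationship the article describes. Real GANs use deep networks and minibatches, and adversarial training like this is not guaranteed to converge.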
How do VAEs work and compare with GANs?
VAEs were also introduced in 2014, in this case by Diederik Kingma, a research scientist at Google, and Max Welling, research chair in machine learning at the University of Amsterdam. Like GANs, VAEs promise to create more effective classification engines for various tasks, but their mechanics differ. At their core, they build on neural network autoencoders made up of two neural networks: an encoder and a decoder. The encoder network optimizes for more efficient ways of representing data, while the decoder network optimizes for more efficient ways of regenerating the original data set.
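Traditionally, autoencoder techniques clean data, improve predictive analysis, compress data and reduce the dimensionality of data sets for other algorithms. VAEs take this further: They are trained to minimize the error between the raw signal and its reconstruction while also learning a probability distribution over the compressed representation, which makes it possible to sample new data.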
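To make the VAE training objective more concrete, the sketch below shows the two ingredients that are commonly combined in a VAE loss: a reconstruction error and a Kullback-Leibler (KL) term that keeps the encoder's output distribution close to a standard normal. It assumes the encoder outputs a mean and a log-variance per latent dimension and uses squared error for reconstruction. The function names are illustrative, and the encoder and decoder networks themselves are omitted.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    # Sample z = mu + sigma * eps with eps ~ N(0, 1), the usual way to draw
    # a latent vector from the encoder's predicted distribution.
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    # Closed-form KL divergence between N(mu, sigma^2) and N(0, 1),
    # summed over latent dimensions.
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))


def vae_loss(x, x_recon, mu, log_var):
    # Reconstruction error (squared error, an assumed choice) plus KL term.
    recon = sum((a - b) ** 2 for a, b in zip(x, x_recon))
    return recon + kl_to_standard_normal(mu, log_var)


if __name__ == "__main__":
    rng = random.Random(0)
    mu, log_var = [0.2, -0.1], [0.0, -0.5]  # made-up encoder outputs
    z = reparameterize(mu, log_var, rng)
    print("latent sample:", z)
    print("loss:", vae_loss([1.0, 2.0], [0.9, 2.2], mu, log_var))
```

A plain autoencoder would minimize only the reconstruction term. The KL term is what regularizes the latent space so that sampling a vector from a standard normal and decoding it can produce plausible new data.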
"VAEs are extraordinarily strong in providing near-original content with just a reduced vector. It also allows us to generate inexistent content that can be used free of licensing," said Tiago Cardoso, group product manager at Hyland Software.
The biggest difference found when juxtaposing GANs vs. VAEs is how they are applied. Pratik Agrawal, partner in the digital transformation and AI practice at management consulting company Kearney, said that GANs are typically employed when dealing with any kind of imagery or visual data. He finds that VAEs work better for signal processing use cases, such as anomaly detection for predictive maintenance or security analytics applications.
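One common way to use an autoencoder-style model for anomaly detection is to flag inputs that it reconstructs poorly. The sketch below illustrates that general idea under stated assumptions and is not a description of any specific product or of Agrawal's approach. The `reconstruct` argument stands for a trained model's encode-then-decode pass, and `mean_reconstruct` is a hypothetical stand-in that can only reproduce flat signals, used here so the example runs without a trained VAE.

```python
import statistics


def reconstruction_errors(windows, reconstruct):
    # Mean squared error between each signal window and its reconstruction.
    errors = []
    for window in windows:
        recon = reconstruct(window)
        mse = sum((a - b) ** 2 for a, b in zip(window, recon)) / len(window)
        errors.append(mse)
    return errors


def fit_threshold(normal_errors, k=3.0):
    # Assumed rule of thumb: mean error on normal data plus k standard deviations.
    return statistics.mean(normal_errors) + k * statistics.pstdev(normal_errors)


def flag_anomalies(windows, reconstruct, threshold):
    # True where the reconstruction error exceeds the threshold.
    return [e > threshold for e in reconstruction_errors(windows, reconstruct)]


def mean_reconstruct(window):
    # Hypothetical stand-in for a trained VAE: reconstructs a window as its mean.
    m = sum(window) / len(window)
    return [m] * len(window)


if __name__ == "__main__":
    normal = [[1.0, 1.1, 0.9, 1.0], [1.0, 1.2, 0.8, 1.0], [0.9, 1.0, 1.0, 1.1]]
    threshold = fit_threshold(reconstruction_errors(normal, mean_reconstruct))
    new_windows = [[1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 5.0, 1.0]]
    print(flag_anomalies(new_windows, mean_reconstruct, threshold))
```

In a predictive maintenance setting, the windows might be sensor readings, and the threshold would be tuned on data from equipment known to be healthy.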
Generative AI use cases
Generative AI techniques like GANs and VAEs can be deployed in a seemingly limitless variety of use cases, including the following:
- Implementing chatbots for customer service and technical support.
- Deploying deepfakes for mimicking people.
- Improving dubbing for movies.
- Writing email responses, dating profiles, resumes and term papers.
- Creating photorealistic art in a particular style.
- Suggesting new drug compounds to test.
- Designing physical products and buildings.
- Optimizing new chip designs.
- Writing music in a specific style or tone.
Since both VAEs and GANs are examples of neural networks, their applications can be limited in actual live business examples, Agrawal said. Data scientists and developers working with these techniques must tie results back to inputs and run sensitivity analysis. It is also essential to consider factors such as the sustainability of these solutions and to address who runs them, how often they are maintained and the technology resources needed to update them.
It's worth noting that a variety of other techniques have recently emerged in generative AI, including diffusion models, which are used for generating and optimizing images; transformer-based models such as OpenAI's ChatGPT, widely used in language generation; and neural radiance fields, or NeRFs, a new technique being used to create realistic 3D media from 2D data.