GANs vs. VAEs: What is the best generative AI approach?

Generative AI is gaining steam in the tech sector. Two popular approaches are GANs, which are used more for multimedia, and VAEs, which are used more for signal analysis.

Generative adversarial networks (GANs) and variational autoencoders (VAEs) are two of the most popular approaches to generative AI. In general, GANs tend to be more widely used with multimedia, while VAEs see more use in signal analysis.

How does this translate to practical, real-world value? Generative AI techniques help create AI models, synthetic data and realistic multimedia, such as voices and images. Although these techniques are sometimes used to create deepfakes, they can also create realistic dubs for movies and generate images from brief text descriptions. They also generate drug discovery targets, recommend product design choices and improve security algorithms.

How do GANs work?

GANs were first introduced by Ian Goodfellow and fellow researchers at the University of Montreal in 2014. They have shown tremendous promise in generating many types of realistic data. Yann LeCun, chief AI scientist at Meta, has written that GANs and their variations were "the most interesting idea in the last ten years in machine learning."

For starters, they have been used to generate realistic speech and to mimic speakers for better translations, including matching voices and lip movements. They have also translated imagery, for example between night and day scenes, and transferred dance moves from one body to another. They are also combined with other AI techniques to improve security and build better AI classifiers.

The mechanics of GANs involve the interplay of two neural networks that are trained against each other to generate, and then classify, data that is representative of reality. A generator network produces content, and a second network, the discriminator, judges whether that content looks "real." The discriminator's verdict is fed back to train a better generator. The discriminator can also be used to detect fake content or content that does not belong to the domain. Over time, both networks improve, and the feedback pushes the generator toward data that is as close to reality as possible.
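The generator-discriminator feedback loop can be sketched in a few lines of plain Python. This is a toy under loud assumptions, not a production GAN: the "real" data is one-dimensional and drawn from a standard normal distribution, the generator has a single learnable offset, the discriminator is a logistic regression, and the generator uses the common non-saturating objective. All names (`train_toy_gan`, `b`, `w`, `c`) are hypothetical.

```python
import math
import random


def sigmoid(t):
    # Numerically safe logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def train_toy_gan(steps=2000, batch=64, lr=0.05, seed=0):
    rng = random.Random(seed)
    b = 3.0          # generator parameter: fake sample = noise + b
    w, c = 0.0, 0.0  # discriminator parameters: D(x) = sigmoid(w * x + c)
    for _ in range(steps):
        real = [rng.gauss(0.0, 1.0) for _ in range(batch)]      # assumed "real" data
        fake = [rng.gauss(0.0, 1.0) + b for _ in range(batch)]  # generator output

        # Discriminator step: gradient ascent on log D(real) + log(1 - D(fake)).
        grad_w = grad_c = 0.0
        for x in real:
            d = sigmoid(w * x + c)
            grad_w += (1.0 - d) * x
            grad_c += 1.0 - d
        for x in fake:
            d = sigmoid(w * x + c)
            grad_w -= d * x
            grad_c -= d
        w += lr * grad_w / batch
        c += lr * grad_c / batch

        # Generator step: gradient ascent on log D(fake), i.e. the generator
        # uses the discriminator's verdict as its training signal.
        grad_b = sum((1.0 - sigmoid(w * x + c)) * w for x in fake) / batch
        b += lr * grad_b
    return b, w, c


final_b, final_w, final_c = train_toy_gan()
print("generator offset:", final_b, "discriminator slope:", final_w)
```

In this sketch the generator starts far from the real data (offset 3) and is expected to drift toward the real mean of 0 as the discriminator's feedback accumulates; exact values depend on the seed and step size. Real GANs replace both one-line models with deep networks and rely on automatic differentiation rather than hand-written gradients.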

How do VAEs work and compare with GANs?

VAEs were also introduced in 2014, by Diederik Kingma, a research scientist at Google, and Max Welling, research chair in machine learning at the University of Amsterdam. Like GANs, VAEs promise more effective classification engines for various tasks, but their mechanics differ. At their core, they build on neural network autoencoders, which are made up of two neural networks: an encoder and a decoder. The encoder optimizes for more efficient ways of representing data, while the decoder optimizes for more efficient ways of regenerating the original data.

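Traditionally, autoencoder techniques clean data, improve predictive analysis, compress data and reduce the dimensionality of data sets for other algorithms. VAEs take this further. Instead of mapping each input to a single compressed point, the encoder maps it to a probability distribution over the compressed representation, and training balances the error between the raw signal and the reconstruction against a regularization term that keeps those distributions well behaved. Sampling from that representation is what allows a VAE to generate new content.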
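As a rough sketch of the encoder-decoder mechanics described above, the snippet below computes the loss for one input in the way VAEs are commonly formulated: the encoder outputs a mean and log-variance per latent dimension, a latent vector is sampled from that distribution, the decoder reconstructs the input, and the loss adds a KL-divergence penalty to the reconstruction error. The linear encoder and decoder, the fixed weights and the squared-error reconstruction term are all illustrative assumptions; a real VAE would use trained neural networks.

```python
import math
import random


def linear(rows, v):
    # Multiply a weight matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in rows]


def encode(x, w_mu, w_logvar):
    # Hypothetical linear encoder: mean and log-variance of the latent distribution.
    return linear(w_mu, x), linear(w_logvar, x)


def sample_latent(mu, log_var, rng):
    # Reparameterization: z = mu + sigma * eps, with eps drawn from N(0, 1).
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def decode(z, w_dec):
    # Hypothetical linear decoder: maps the latent vector back to input space.
    return linear(w_dec, z)


def kl_to_standard_normal(mu, log_var):
    # KL divergence between N(mu, sigma^2) and N(0, 1), summed over dimensions.
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var))


def vae_loss(x, x_hat, mu, log_var):
    # Reconstruction error (squared error, an assumption) plus the KL penalty.
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return recon + kl_to_standard_normal(mu, log_var)


# Illustrative, untrained weights: 4-dimensional input, 2-dimensional latent.
W_MU = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
W_LOGVAR = [[-0.5, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.5]]
W_DEC = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

rng = random.Random(0)
x = [1.0, 0.5, -0.5, -1.0]
mu, log_var = encode(x, W_MU, W_LOGVAR)
z = sample_latent(mu, log_var, rng)
x_hat = decode(z, W_DEC)
print("loss:", vae_loss(x, x_hat, mu, log_var))
```

Generating new content then amounts to drawing `z` from a standard normal distribution and passing it through the decoder, which is why the KL term matters: it keeps the encoder's distributions close to the one the decoder will later be sampled from.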

Tiago Cardoso, product manager at enterprise content management software provider Hyland, said, "VAEs are extraordinarily strong in providing near-original content with just a reduced vector. It also allows us to generate inexistent content that can be used free of licensing."

The biggest difference between GANs and VAEs is how they are applied. Pratik Agrawal, partner in the digital transformation practice at management consulting company Kearney, said that GANs are typically employed when dealing with any kind of imagery or visual data. He finds that VAEs work better for signal processing use cases, such as anomaly detection for predictive maintenance or security analytics applications.
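One common way autoencoder-style models are used for anomaly detection is to flag inputs the model reconstructs poorly. The sketch below illustrates that idea only; it is not a description of Agrawal's or Kearney's method. The `reconstruct` function is a hypothetical stand-in for a trained VAE's encode-then-decode pass (here it simply returns the average "normal" window), and the threshold rule of mean plus three standard deviations is an assumption.

```python
import math
import random


def reconstruction_error(window, reconstruct):
    # Mean squared error between a signal window and its reconstruction.
    rebuilt = reconstruct(window)
    return sum((a - b) ** 2 for a, b in zip(window, rebuilt)) / len(window)


def fit_threshold(normal_windows, reconstruct, k=3.0):
    # Threshold = mean + k * std of the errors seen on known-normal data.
    errors = [reconstruction_error(w, reconstruct) for w in normal_windows]
    mean = sum(errors) / len(errors)
    std = math.sqrt(sum((e - mean) ** 2 for e in errors) / len(errors))
    return mean + k * std


def is_anomaly(window, reconstruct, threshold):
    return reconstruction_error(window, reconstruct) > threshold


# Synthetic "normal" sensor windows: a fixed pattern plus small noise.
rng = random.Random(0)
pattern = [0.0, 1.0, 0.0, -1.0]
normal_windows = [[v + rng.gauss(0.0, 0.05) for v in pattern] for _ in range(50)]
template = [sum(col) / len(col) for col in zip(*normal_windows)]


def reconstruct(window):
    # Stand-in for decoder(encoder(window)) of a trained model (hypothetical).
    return list(template)


threshold = fit_threshold(normal_windows, reconstruct)
spike = [0.0, 1.0, 5.0, -1.0]  # a window with an out-of-pattern reading
print("spike flagged:", is_anomaly(spike, reconstruct, threshold))
print("normal flagged:", is_anomaly(pattern, reconstruct, threshold))
```

In a predictive maintenance setting, the windows would be slices of real sensor readings and the threshold would be tuned against the cost of false alarms.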

Because both VAEs and GANs are neural networks, their use in live business applications can be limited, Agrawal said. Data scientists and developers working with these techniques must tie results back to inputs and run sensitivity analysis. It is also essential to consider the sustainability of these solutions: who runs them, how often they are maintained and what technology resources are needed to update them.
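A simple form of the sensitivity analysis mentioned above is to nudge one input at a time and record how much the model's output moves. The sketch below assumes a model that maps a list of numbers to a single number; `toy_model` is a hypothetical placeholder for a trained network's scoring function, and the step size `delta` is an arbitrary choice.

```python
def sensitivity(model, inputs, delta=1e-3):
    # One-at-a-time sensitivity: change in output per unit change in each input.
    base = model(inputs)
    scores = []
    for i in range(len(inputs)):
        bumped = list(inputs)
        bumped[i] += delta
        scores.append((model(bumped) - base) / delta)
    return scores


def toy_model(x):
    # Hypothetical stand-in for a trained model's output.
    return 3.0 * x[0] + 0.5 * x[1] ** 2


scores = sensitivity(toy_model, [1.0, 2.0])
print("sensitivity per input:", scores)
```

Inputs with large scores are the ones a result depends on most, which gives teams a starting point for tying outputs back to inputs. This approach ignores interactions between inputs, so it is a first check rather than a full explanation.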
