A generative adversarial network (GAN) is a machine learning (ML) model in which two neural networks compete with each other to become more accurate in their predictions. GANs typically run unsupervised and learn through a competitive zero-sum game framework, in which one network's gain is the other network's loss.
The two neural networks that make up a GAN are referred to as the generator and the discriminator. In image-oriented GANs, the generator is typically a deconvolutional (transposed-convolution) neural network and the discriminator is a convolutional neural network. The goal of the generator is to artificially manufacture outputs that could easily be mistaken for real data. The goal of the discriminator is to identify which of the outputs it receives have been artificially created.
Essentially, GANs create their own training data. As the feedback loop between the adversarial networks continues, the generator will begin to produce higher-quality output and the discriminator will become better at flagging data that has been artificially created.
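The zero-sum game between the two networks is commonly written as a single minimax objective. The formula below is the standard formulation from the GAN literature rather than something stated in this article, and its symbols are assumptions introduced here: D(x) is the discriminator's estimated probability that x is real, G(z) is the generator's output for a random noise input z, p_data is the distribution of the real data and p_z is the noise distribution.

```latex
\min_{G} \max_{D} \; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator tries to maximize V by scoring real data close to 1 and generated data close to 0, while the generator tries to minimize the same quantity by producing outputs that the discriminator scores close to 1.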
How GANs work
The first step in establishing a GAN is to identify the desired end output and gather an initial training dataset based on those parameters. The generator is then fed randomized input, typically random noise, and trained until it acquires basic accuracy in producing outputs.
After this, the generated images are fed into the discriminator along with actual data points from the original dataset. The discriminator evaluates each input and returns a probability between 0 and 1 to represent each image's authenticity (1 corresponds to real and 0 corresponds to fake). These values are then compared against the known labels (real or generated), the resulting error is used to update both networks, and the cycle is repeated until the desired outcome is reached.
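The feedback loop described above can be sketched with a deliberately tiny example. This is a minimal illustration under stated assumptions, not a production GAN: the "images" are single numbers, the generator and discriminator are one-line linear models instead of deep networks, and every name and value here (train_toy_gan, generate, discriminate, the learning rate, the real-data distribution centered on 4.0) is hypothetical and chosen only for illustration. It also uses the common non-saturating generator loss, which is a choice made for this sketch.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function, mapping any real number into [0, 1]."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def generate(z, a, b):
    """Toy generator: maps a random noise value z to a synthetic sample."""
    return a * z + b


def discriminate(x, w, c):
    """Toy discriminator: estimated probability that x is real (1 = real, 0 = fake)."""
    return sigmoid(w * x + c)


def train_toy_gan(steps=2000, lr=0.01, seed=0):
    """Alternate one discriminator update and one generator update per step."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator parameters (hypothetical starting values)
    w, c = 0.1, 0.0  # discriminator parameters (hypothetical starting values)
    for _ in range(steps):
        x_real = rng.gauss(4.0, 0.5)  # stand-in for one real data point
        z = rng.gauss(0.0, 1.0)       # random noise fed to the generator
        x_fake = generate(z, a, b)

        # Discriminator step: minimize -log D(x_real) - log(1 - D(x_fake)).
        d_real = discriminate(x_real, w, c)
        d_fake = discriminate(x_fake, w, c)
        grad_w = (d_real - 1.0) * x_real + d_fake * x_fake
        grad_c = (d_real - 1.0) + d_fake
        w -= lr * grad_w
        c -= lr * grad_c

        # Generator step: minimize -log D(G(z)) using the updated discriminator.
        d_fake = discriminate(x_fake, w, c)
        grad_x = -(1.0 - d_fake) * w  # gradient of the loss w.r.t. the fake sample
        a -= lr * grad_x * z
        b -= lr * grad_x
    return (a, b), (w, c)


if __name__ == "__main__":
    (a, b), (w, c) = train_toy_gan()
    sample = generate(0.0, a, b)
    print("generated sample:", sample)
    print("discriminator score for it:", discriminate(sample, w, c))
```

Real GANs replace the two linear models with deep networks and rely on an automatic differentiation library instead of hand-written gradients, but the alternating update pattern is the same. GAN training can oscillate rather than settle, so the printed values should be read as a snapshot, not a converged result.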
Popular use cases for GANs
GANs are becoming a popular ML model in online retail because of their ability to understand and recreate visual content with increasing accuracy. Use cases include:
- Filling in images from an outline.
- Generating a realistic image from text.
- Producing photorealistic depictions of product prototypes.
- Converting black and white imagery into color.
In video production, GANs can be used to:
- Model patterns of human behavior and movement within a frame.
- Predict subsequent video frames.
- Create deepfake