An explanation of AI model collapse

By Sabrina Polin, Managing Editor

In this video, TechTarget editor Sabrina Polin talks about AI model collapse and the threat it poses to data.

Just as a healthy ecosystem needs biodiversity, AI needs diversity in its training data to be effective. Otherwise, you get model collapse.

Model collapse is the degradation that occurs when AI models are trained on synthetic, AI-generated content rather than human-generated content. Simply put, it's a feedback loop.

As generative AI models create more and more content that gets shared on the internet, the next generations of AI models eventually train on that content, instead of human-generated content.

These new models will rely too heavily on the patterns in that generated content, overestimating probable events and underestimating improbable events. This means these synthetically trained models will compound errors, misinterpret data and give increasingly wrong and homogeneous outputs.
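The feedback loop described above can be illustrated with a toy simulation. This is a minimal sketch, not how any real generative model is trained: it assumes a hypothetical "model" that is nothing more than a table of token frequencies, and every name in it (next_generation, simulate, support_size, human_data) is invented for illustration. Each generation samples a small data set from the previous model and refits by counting. Once a rare token fails to appear in a sample, its estimated probability becomes zero and it can never return, so the outputs grow more homogeneous over time.

```python
import random
from collections import Counter


def next_generation(probs, sample_size, rng):
    """Sample synthetic data from the current model, then refit by counting."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    samples = rng.choices(tokens, weights=weights, k=sample_size)
    counts = Counter(samples)
    # Tokens that were never sampled get probability 0 in the new model.
    return {t: counts[t] / sample_size for t in tokens}


def simulate(probs, generations, sample_size, seed=0):
    """Return the list of models, starting with the original distribution."""
    rng = random.Random(seed)
    history = [dict(probs)]
    for _ in range(generations):
        probs = next_generation(probs, sample_size, rng)
        history.append(probs)
    return history


def support_size(probs):
    """Count how many distinct tokens the model can still produce."""
    return sum(1 for p in probs.values() if p > 0)


# Hypothetical "human-generated" distribution with a few rare events.
human_data = {"common": 0.70, "frequent": 0.20, "unusual": 0.08, "rare": 0.02}

for gen, model in enumerate(simulate(human_data, generations=30, sample_size=20)):
    if gen % 5 == 0:
        print(gen, support_size(model), model)
```

The small sample size is a deliberate assumption that makes the effect visible within a few generations. With larger samples the loss of rare tokens is slower, but in this toy setup a token that has dropped out still never comes back.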

This phenomenon has the potential to create data pollution on a large scale. Although generative AI enables text generation at an unprecedented scale, model collapse implies that much of this data will have little value for training the next generation of AI models.

Sabrina Polin is a managing editor of video content for the Learning Content team. She plans and develops video content for TechTarget's editorial YouTube channel, Eye on Tech. Previously, Sabrina was a reporter for the Products Content team.
