
Foundation models explained: Everything you need to know

By Ben Lutkevich

Foundation models will form the basis of generative AI's future in the enterprise.

Large language models (LLMs) fall into a broader category called foundation models. Language models take language as input and generate synthesized language as output. Foundation models are not limited to language: many are multimodal, meaning they can work with other types of data, such as images and audio, in addition to text.

This enables businesses to draw new connections across data types and expand the range of tasks that AI can be used for. As a starting point, a company can use foundation models to create custom generative AI models, using a tool such as LangChain, with features tailored to its use case.
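
As a rough sketch of what that can look like, the following LangChain snippet wraps a general-purpose model behind a task-specific prompt. The support-ticket task, the prompt wording and the example message are illustrative assumptions, not part of any particular product:

    # Minimal sketch: specializing a foundation model with LangChain.
    # Assumes the langchain-openai package and an OPENAI_API_KEY in the
    # environment; the task and prompt are hypothetical examples.
    from langchain_core.prompts import ChatPromptTemplate
    from langchain_openai import ChatOpenAI

    llm = ChatOpenAI(model="gpt-4", temperature=0)  # general-purpose model

    # The prompt template is what tailors the model to one business task.
    prompt = ChatPromptTemplate.from_template(
        "You are a support assistant for an insurance company. "
        "Classify this customer message as CLAIM, BILLING or OTHER: {message}"
    )

    chain = prompt | llm  # LangChain Expression Language pipeline
    print(chain.invoke({"message": "I was charged twice last month."}).content)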

The GPT-n (generative pre-trained transformer) family of LLMs has become a prime example. The release of powerful LLMs such as OpenAI's GPT-4 spurred discussions of artificial general intelligence -- AI capable of performing essentially any intellectual task. Since their release, numerous applications powered by GPT models have been created.

GPT-4 and other foundation models are trained on a broad corpus of unlabeled data and can be adapted to many tasks.

What is a foundation model?

Foundation models are a new paradigm in AI system development. Previously, AI models were trained on task-specific data to perform a narrow range of functions.

A foundation model is a large-scale machine learning model trained on a broad data set that can be adapted and fine-tuned for a wide variety of applications and downstream tasks. Foundation models are known for their generality and adaptability.

GPT-4, DALL-E 2 and BERT -- which stands for Bidirectional Encoder Representations from Transformers -- are all foundation models. The term was coined by authors at the Stanford Center for Research on Foundation Models and the Stanford Institute for Human-Centered Artificial Intelligence (HAI) in a 2021 paper called "On the Opportunities and Risks of Foundation Models."

The authors of the paper stated: "While many of the iconic foundation models at the time of writing are language models, the term language model is simply too narrow for our purpose: as we describe, the scope of foundation models goes well beyond language."

The name foundation model underscores the fundamental incompleteness of the models, according to the paper. They are the foundation for specific spinoff models that are trained to accomplish a narrower, more specialized set of tasks. The authors of the Stanford HAI paper stated: "We also chose the term 'foundation' to connote the significance of architectural stability, safety, and security: poorly-constructed foundations are a recipe for disaster and well-executed foundations are a reliable bedrock for future applications."

How are foundation models used?

Foundation models serve as the base for more specific applications. A business can take a foundation model, train it on its own data, and fine-tune it to a specific task or a set of domain-specific tasks.

Several platforms, including Amazon SageMaker, IBM Watsonx, Google Cloud Vertex AI and Microsoft Azure AI, provide organizations with a service for building, training and deploying AI models.

For example, an organization could use one of these platforms to take a model from Hugging Face, train it on proprietary data, and adapt it to its task through fine-tuning or prompt engineering. Hugging Face is an open source repository of machine learning models, including many LLMs -- something like a GitHub for AI. It provides tools that enable users to build, train and deploy machine learning models.
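
As a concrete, hedged sketch of that workflow, the snippet below fine-tunes a foundation model from Hugging Face with the transformers library. The base model choice, the tickets.csv file and the two-label task are assumptions for illustration:

    # Sketch: fine-tuning a Hugging Face foundation model on proprietary
    # data. "tickets.csv" (columns: text, label) is a hypothetical file.
    from datasets import load_dataset
    from transformers import (AutoModelForSequenceClassification,
                              AutoTokenizer, Trainer, TrainingArguments)

    base = "bert-base-uncased"  # a foundation model from the Hugging Face hub
    tokenizer = AutoTokenizer.from_pretrained(base)
    model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=2)

    data = load_dataset("csv", data_files="tickets.csv")["train"]
    data = data.map(lambda rows: tokenizer(rows["text"], truncation=True,
                                           padding="max_length"), batched=True)
    data = data.train_test_split(test_size=0.1)

    trainer = Trainer(
        model=model,  # starts from general pretrained weights
        args=TrainingArguments(output_dir="ticket-model", num_train_epochs=3),
        train_dataset=data["train"],
        eval_dataset=data["test"],
    )
    trainer.train()  # adapts the general model to the domain-specific task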

How do foundation models work?

Foundation models use predictive algorithms to learn patterns in data and generate the next item in a pattern. The architectures they use vary and include transformers, variational autoencoders and generative adversarial networks.

Applied to text, a foundation model learns common patterns in that text and predicts the next word based on the patterns it has learned and any additional input a user provides. Applied to video, a foundation model learns underlying patterns in a large collection of videos and generates new videos that adhere to those patterns. Foundation models are generative AI programs: they learn from existing corpora of content in order to produce new content.

There are three broad steps underlying foundation models' functionality, illustrated in the toy sketch after this list:

  1. Pretraining. The foundation model learns patterns from a large data set.
  2. Fine-tuning. The model is fine-tuned for specific tasks with smaller, domain-specific data sets.
  3. Implementation. The model is ready to receive new data as input and generate predictions about that data based on patterns learned in pretraining and fine-tuning.
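
The toy sketch below walks through those three steps with a deliberately simple stand-in: a bigram model that predicts the next word from word-pair counts. Real foundation models use deep neural networks rather than counts, and the corpora here are made up, but the pretrain, fine-tune, predict loop has the same shape:

    # Toy stand-in for the three steps: a bigram "model" that predicts
    # the next word from observed word-pair counts.
    from collections import Counter, defaultdict

    counts = defaultdict(Counter)

    def train(corpus):
        """Learn which word tends to follow which."""
        words = corpus.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1

    def predict(word):
        """Return the most frequently observed next word."""
        return counts[word].most_common(1)[0][0] if counts[word] else None

    # 1. Pretraining: learn general patterns from a broad (toy) corpus.
    train("the model learns patterns and the model predicts the next word")

    # 2. Fine-tuning: continue training on a smaller domain-specific corpus.
    train("the claim was filed and the claim was approved and the claim was paid")

    # 3. Implementation: new input in, prediction out.
    print(predict("the"))    # -> "claim"; fine-tuning shifted the prediction
    print(predict("claim"))  # -> "was"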

Foundation models are expensive to train and run. The compute hardware underlying them typically consists of many GPUs operating in parallel.
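
To make "many GPUs operating in parallel" concrete, here is a minimal, hedged sketch of data-parallel training with PyTorch's DistributedDataParallel; the tiny linear model and random batches are placeholders for a real architecture and data pipeline:

    # Sketch: one training process per GPU, gradients synchronized
    # automatically. Launch with: torchrun --nproc_per_node=4 train.py
    import os
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    dist.init_process_group("nccl")       # one process per GPU
    rank = int(os.environ["LOCAL_RANK"])  # set by torchrun
    torch.cuda.set_device(rank)

    model = DDP(torch.nn.Linear(512, 512).cuda(rank), device_ids=[rank])
    opt = torch.optim.SGD(model.parameters(), lr=0.01)

    for step in range(100):
        x = torch.randn(32, 512, device=rank)  # placeholder batch
        loss = model(x).square().mean()        # placeholder loss
        opt.zero_grad()
        loss.backward()                        # gradients sync across GPUs here
        opt.step()

    dist.destroy_process_group()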

Importance of foundation models

Foundation models are important because of their adaptability. Instead of training specialized models from the ground up for a narrow set of tasks, engineers can use pretrained foundation models to develop new applications for their specific use case.

Despite the energy and compute costs of developing, training and maintaining foundation models, their ability to scale predictably and serve as the basis for downstream AI applications makes them a worthy investment for organizations with the necessary resources.
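
"Scale predictably" refers to empirical scaling laws. Kaplan et al.'s 2020 paper "Scaling Laws for Neural Language Models," for example, reported that a language model's test loss falls off as a power law in its parameter count N (constants approximate, taken from that paper):

    L(N) \approx \left( \frac{N_c}{N} \right)^{\alpha_N},
    \qquad \alpha_N \approx 0.076, \quad N_c \approx 8.8 \times 10^{13}

In rough terms, each tenfold increase in parameters cuts the loss by a predictable factor, which lets organizations estimate the return on additional compute before spending it.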

Characteristics of foundation models

The main traits of foundation models include the following:

  - Trained on broad, largely unlabeled data sets, typically through self-supervision.
  - General-purpose, rather than built for a single narrow task.
  - Adaptable to many downstream tasks through fine-tuning.
  - Large in scale, with correspondingly high compute, energy and cost requirements.

Examples of foundation model applications

Foundation models are fine-tuned to create apps. Below are a few examples of foundation models and the applications they underlie:

  - The GPT-n models underlie OpenAI's ChatGPT.
  - BERT underlies Google's interpretation of search queries.
  - DALL-E 2 underlies image generation tools such as Microsoft's Bing Image Creator.

Opportunities and challenges of foundation models

Many foundation models are multimodal, meaning a single model can work across several types of data, including language, audio and vision.

Because of their general adaptability, foundation models could provide numerous opportunities and use cases in a variety of different industries, including the following:

  - Healthcare, such as summarizing clinical notes and supporting diagnosis.
  - Law, such as drafting and reviewing legal documents.
  - Education, such as personalized tutoring and feedback.
  - Customer service, such as chatbots fine-tuned on a company's own data.

Despite their broad potential, foundation models pose many challenges, including the following:

  - High compute, energy and financial costs to develop, train and maintain.
  - Bias absorbed from broad, largely uncurated training data and propagated to every downstream application.
  - Factually incorrect or fabricated output, often called hallucination.
  - Opacity, as it is difficult to explain why a model produced a particular output.

Other important AI research papers

"On the Opportunities and Risks of Foundation Models" is just one of the influential research papers about foundation models. AI research is being published at a significant clip. Here are some other foundational AI research papers to know about:

Ben Lutkevich is the site editor for Software Quality. Previously, he wrote definitions and features for WhatIs.com.

06 Jan 2025
