
An explanation of large language models

In this video, TechTarget editor Sabrina Polin talks about the benefits and challenges of large language models.

Humans need language to communicate, so it makes sense that AI does, too.

A large language model -- or LLM -- is a type of AI algorithm that uses deep learning techniques and massive data sets to understand, generate and predict new content.

Language models aren't new -- the first AI language model can be traced back to 1966 -- but large language models train on a significantly larger pool of data, which translates into a significant increase in the model's capabilities.

So, just how large are large language models?

Well, there's no universally accepted figure for how large an LLM training data set is, but it's typically in the petabyte range. For context, a single petabyte is equivalent to 1 million gigabytes; the human brain is believed to store about 2.5 petabytes of memory data.
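
To make that scale concrete, here's a quick back-of-the-envelope conversion in Python, using decimal units where 1 petabyte = 1,000 terabytes = 1,000,000 gigabytes:

    # 1 PB expressed in GB, using decimal (SI) units
    petabyte_in_gb = 1000 ** 2               # 1 PB -> 1,000,000 GB
    brain_estimate_pb = 2.5                  # rough brain-capacity estimate cited above
    print(f"1 PB = {petabyte_in_gb:,} GB")
    print(f"{brain_estimate_pb} PB = {brain_estimate_pb * petabyte_in_gb:,.0f} GB")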

LLM training consists of multiple steps, usually starting with unsupervised learning, where the model begins to derive relationships between words and concepts; the model is then fine-tuned with supervised learning. The training data passes through a transformer architecture, which enables the LLM to recognize relationships and connections between words using a self-attention mechanism.
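
The self-attention step can be sketched in a few lines of code. Below is a minimal, illustrative NumPy implementation of scaled dot-product self-attention -- a simplified sketch, not any production model's code -- where the query (Q), key (K) and value (V) matrices are assumed to be learned projections of the token embeddings:

    import numpy as np

    def self_attention(Q, K, V):
        # Q, K, V: (seq_len, d) arrays; in a real transformer these are
        # learned linear projections of the token embeddings.
        d = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d)        # how strongly each token attends to every other
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
        return weights @ V                   # each output mixes all value vectors by attention weight

    # Toy example: a "sentence" of 4 tokens with 8-dimensional embeddings
    rng = np.random.default_rng(0)
    tokens = rng.normal(size=(4, 8))
    output = self_attention(tokens, tokens, tokens)  # self-attention: Q, K, V from the same sequence
    print(output.shape)                              # (4, 8)

Each output row is a context-aware blend of every token in the sequence, which is what lets the model capture relationships between words regardless of how far apart they sit.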

Once the LLM is trained, it can serve as the foundation for a range of AI uses, including the following:

  • Generate text.
  • Translate languages.
  • Summarize or rewrite content.
  • Organize content.
  • Analyze the sentiment of content, such as humor or tone.
  • And converse naturally with a user, unlike older generations of AI chatbot technologies.
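
As an illustration of the first item on that list, here's a minimal text-generation sketch. It assumes the open source Hugging Face Transformers library and the small GPT-2 model purely as an example -- any LLM exposed through a similar interface would work the same way:

    # Assumes: pip install transformers torch
    from transformers import pipeline

    # Load a text-generation pipeline with GPT-2, a small openly
    # available LLM, chosen here only for illustration.
    generator = pipeline("text-generation", model="gpt2")

    result = generator(
        "Large language models are",
        max_new_tokens=30,        # cap the length of the continuation
        num_return_sequences=1,
    )
    print(result[0]["generated_text"])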

LLMs can be particularly useful as a foundation for customized uses for both businesses and individuals. They're fast, accurate, flexible and easy to train. However, users should exercise caution: LLMs come with a number of challenges, including the following:

  • The cost of deployment and operation.
  • Bias, depending on the data the model was trained on.
  • AI hallucinations, where the model produces a plausible-sounding response that isn't grounded in its training data.
  • Troubleshooting complexity.
  • And glitch tokens -- words or inputs maliciously designed to make the LLM malfunction.

Sabrina Polin is a managing editor of video content for the Learning Content team. She plans and develops video content for TechTarget's editorial YouTube channel, Eye on Tech. Previously, Sabrina was a reporter for the Products Content team.
