27 of the best large language models in 2025

Large language models have been affecting search for years and have been brought to the forefront by ChatGPT and other chatbots.

Sean Michael Kerner
Ben Lutkevich, Site Editor

Published: 10 Jul 2025

Large language models are the dynamite behind the generative AI boom. However, they've been around for a while.

LLMs are black box AI systems that use deep learning on extremely large datasets to understand and generate new text. Modern LLMs began taking shape in 2014 when the attention mechanism -- a machine learning technique designed to mimic human cognitive attention -- was introduced in a research paper titled "Neural Machine Translation by Jointly Learning to Align and Translate." In 2017, that attention mechanism was honed with the introduction of the transformer model in another paper, "Attention Is All You Need."
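To make the attention idea concrete, here is a minimal sketch of the scaled dot-product form described in "Attention Is All You Need." It is an illustration only: the toy vectors and function names are assumptions, and real models apply this to learned, high-dimensional representations across many attention heads.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    # Score each key by how well it matches the query.
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # The output is a weighted average of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

# Toy data (hypothetical): three tokens with two-dimensional vectors.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
query = [1.0, 0.0]

weights, output = attention(query, keys, values)
```

In this toy case, the first and third keys overlap with the query and receive equal weight, while the second key, which does not overlap, receives less.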

Some of the most well-known language models today are based on the transformer model, including the generative pre-trained transformer series of LLMs and bidirectional encoder representations from transformers (BERT).

ChatGPT, which runs on a set of language models from OpenAI, attracted more than 100 million users just two months after its release in 2022. Since then, many competing models have been released. Some belong to big companies such as Google, Amazon and Microsoft; others are open source.

Constant developments in the field can be difficult to keep track of. Here are some of the most influential models, both past and present. The list includes models that paved the way for today's leaders as well as those that could have a significant effect in the future.

Top current LLMs

Below are some of the most relevant large language models today. They perform natural language processing tasks and influence the architecture of future models.

BERT

BERT is a family of LLMs that Google introduced in 2018. BERT is a transformer-based model that can convert sequences of data to other sequences of data. BERT's architecture is a stack of transformer encoders, and the larger of the two original models has about 340 million parameters. BERT was pre-trained on a large corpus of data and then fine-tuned to perform specific tasks such as natural language inference and sentence text similarity. It was used to improve query understanding in the 2019 iteration of Google search.

Claude

The Claude LLM focuses on constitutional AI, which shapes AI outputs guided by a set of principles that aim to make the AI assistant it powers helpful, harmless and honest. Claude was created by the company Anthropic. Claude's latest iterations understand nuance, humor and complex instructions better than earlier versions of the LLM. They also have broad programming capabilities that make them well-suited for application development.

There are three primary branches of Claude -- Opus, Haiku and Sonnet. The Claude Sonnet 4 and Claude Opus 4 models debuted in May 2025. Opus 4, the premium model, can perform long-running tasks and agentic workflows. Sonnet 4, the efficiency-focused model, shows continued improvement in coding, reasoning and instruction-following compared to previous iterations. Both models also include:

Extended thinking with tool-use.
Improved memory and instruction-following.
Integrations with IDEs and APIs.
Code execution.
MCP connector.
Files API.
Prompt caching.

In October 2024, Claude added an experimental computer-use AI tool in public beta that enables the LLM to use a computer like a human does. It's available to developers via the API.

Cohere

Cohere is an enterprise AI platform that provides several LLMs including Command, Rerank and Embed. These LLMs can be custom-trained and fine-tuned to a specific company's use case. The company that created the Cohere LLMs was co-founded by one of the authors of "Attention Is All You Need."

DeepSeek-R1

DeepSeek-R1 is an open-source reasoning model for tasks with complex reasoning, mathematical problem-solving and logical inference. The model uses reinforcement learning techniques to refine its reasoning ability and solve complex problems. DeepSeek-R1 can perform critical problem-solving through self-verification, chain-of-thought reasoning and reflection.

Ernie

Ernie is Baidu's large language model powering the Ernie chatbot. The bot was released in August 2023 and has garnered more than 45 million users. Near the time of its release, it was rumored to have 10 trillion parameters, which turned out to be an overestimation -- later models have parameter counts in the billions. More recent versions of the Ernie chatbot include Ernie 4.5 and Ernie X1. The recent models are based on a mixture-of-experts architecture. Baidu open sourced its Ernie 4.5 LLM series in 2025.
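A mixture-of-experts architecture, which several models in this list use, sends each input to a few specialized subnetworks instead of running the whole model. The sketch below shows the general routing idea only, not Baidu's implementation. The expert functions and gate scores are hypothetical stand-ins. In a real model, the experts are neural network layers and a learned router produces the scores for each token.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route(gate_scores, top_k=2):
    """Pick the top_k experts and renormalize their gate weights."""
    probs = softmax(gate_scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:top_k]
    total = sum(probs[i] for i in chosen)
    return {i: probs[i] / total for i in chosen}

# Hypothetical experts: simple functions standing in for neural sublayers.
experts = [
    lambda x: x + 1,
    lambda x: x * 2,
    lambda x: x - 3,
    lambda x: x ** 2,
]

def moe_layer(x, gate_scores, top_k=2):
    """Run only the selected experts and blend their outputs."""
    weights = route(gate_scores, top_k)
    return sum(w * experts[i](x) for i, w in weights.items())

# Assumed router output for one input; experts 0 and 2 score highest.
gate_scores = [2.0, 0.1, 2.0, -1.0]
routing = route(gate_scores)
result = moe_layer(3.0, gate_scores)
```

Only two of the four experts run for this input, which is why mixture-of-experts models can have a large total parameter count while using a fraction of those parameters on any one token.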

Falcon

Falcon is a family of transformer-based models developed by the Technology Innovation Institute. It is open source and has multi-lingual capabilities. Falcon 2 is available in an 11 billion parameter version that provides multimodal capabilities for both text and vision. Falcon 3 is available in several sizes ranging from 1-10 billion parameters.

The Falcon series also includes a pair of larger models, Falcon 40B and Falcon 180B, as well as several specialized models. Falcon models are available on GitHub as well as on cloud providers including Amazon.

Gemini

Gemini is Google's family of LLMs that power the company's chatbot of the same name. The model replaced Palm in powering the chatbot, which was rebranded from Bard to Gemini upon the model switch. Gemini models are multimodal, meaning they can handle images, audio and video as well as text. Gemini is also integrated in many Google applications and products. It comes in several sizes -- Ultra, Pro, Flash and Nano. Ultra is the largest and most capable model, Pro is the mid-tier model, Flash prioritizes speed for agentic systems and real-time applications, and Nano is the smallest model, designed for efficiency with on-device tasks.

Among the most recent models at the time of this writing are Gemini 2.5 Pro and Gemini 2.5 Flash.

Gemma

Gemma is a family of open-source language models from Google that were trained on the same resources as Gemini. Gemma 2 was released in June 2024 in two sizes -- a 9 billion parameter model and a 27 billion parameter model. Gemma 3 was released in March 2025, with 1B, 4B, 12B and 27B versions, and has expanded capabilities. Gemma models can run locally on a personal computer, and are also available in Google Vertex AI.

GPT-3

GPT-3 is OpenAI's large language model with 175 billion parameters, released in 2020. GPT-3 uses a decoder-only transformer architecture. GPT-3 is more than 100 times larger than its predecessor, GPT-2, which has 1.5 billion parameters. GPT-3's training data includes Common Crawl, WebText2, Books1, Books2 and Wikipedia.

GPT-3 is the last of the GPT series of models in which OpenAI made the precise parameter counts publicly available. The GPT series was first introduced in 2018 with OpenAI's paper "Improving Language Understanding by Generative Pre-Training."

GPT-3.5

GPT-3.5 is an upgraded version of GPT-3. It was fine-tuned using reinforcement learning from human feedback. There are several models, with GPT-3.5 Turbo being the most capable, according to OpenAI. GPT-3.5's training data extends to September 2021.

GPT-3.5 was also integrated into the Bing search engine but was later replaced there by GPT-4.

GPT-4

GPT-4 was released in 2023. Like the others in the OpenAI GPT family, it's a transformer-based model. Unlike the others, its parameter count has not been released to the public, though there are rumors that the model has more than 1 trillion. OpenAI describes GPT-4 as a multimodal model, meaning it can accept both images and text as input, as opposed to being limited to only language.

GPT-4 demonstrated human-level performance in multiple academic exams. At the model's release, some speculated that GPT-4 came close to artificial general intelligence, which means it is as smart as or smarter than a human. That speculation turned out to be unfounded.

GPT-4o

GPT-4 Omni (GPT-4o) is OpenAI's successor to GPT-4 and offers several improvements over the previous model. GPT-4o creates a more natural human interaction for ChatGPT and is a large multimodal model, accepting various inputs including audio, image and text. The conversations let users engage as they would in a normal human conversation, and the real-time interactivity can also pick up on emotions. GPT-4o can see photos or screens and answer questions about them during interaction.

GPT-4o can respond to audio input in as little as 232 milliseconds, which is similar to human response time and faster than GPT-4 Turbo. The free tier of ChatGPT runs on GPT-4o at the time of this writing.

Granite

The IBM Granite family of models is fully open source under the Apache 2.0 license. The first iteration of the open source models debuted in May 2024, followed by Granite 3.0 in October 2024, Granite 3.1 in December 2024, Granite 3.2 in February 2025 and Granite 3.3 in April 2025.

There are multiple variants in the Granite model family, including general-purpose models (8B and 2B variants), a guardrail model and mixture-of-experts models. While the models can be used for general-purpose deployments, IBM itself is focusing deployment and optimization on enterprise use cases such as customer service, IT automation and cybersecurity.

Grok

Grok is an LLM from xAI that powers a chatbot of the same name. Grok 3 was released in February 2025. Grok 3 mini is a smaller, more cost-efficient version of Grok 3. The Grok 3 chatbot gives the user two modes that augment the chatbot's default state -- Think mode and DeepSearch mode. In Think mode, Grok uses chain-of-thought reasoning, explaining outputs in step-by-step detail. DeepSearch delves more deeply into internet research to produce an output. Grok performs particularly well -- relative to other top models -- on reasoning and mathematics benchmarks such as GPQA and AIME. Grok 3 is closed source and written primarily in Rust and Python.

Grok's training infrastructure is the Colossus supercomputer, which contains more than 100,000 GPUs from Nvidia. The supercomputer was built in a repurposed Electrolux factory near Memphis, Tenn. xAI and Colossus have drawn criticism from residents and activists for a lack of transparency surrounding the environmental effects of the facility's emissions.

The name Grok comes from Robert Heinlein's 1961 novel, Stranger in a Strange Land. The book coined the term to describe the ability to understand something deeply.

Lamda

Lamda (Language Model for Dialogue Applications) is a family of LLMs developed by Google Brain in 2021. Lamda used a decoder-only transformer language model and was pre-trained on a large corpus of text. In 2022, Lamda gained widespread attention when then-Google engineer Blake Lemoine went public with claims that the program was sentient.

Llama

Large Language Model Meta AI (Llama) is Meta's LLM family, which was first released in 2023. The Llama 3.1 models were released in July 2024, including both a 405 billion and a 70 billion parameter model.

The most recent version is Llama 4, which was released in April 2025. There are three main models -- Llama 4 Scout, Llama 4 Maverick and Llama 4 Behemoth. Behemoth is only available for preview at the time of this writing. Llama 4 is the first iteration of the Llama family to use a mixture-of-experts architecture.

Previous iterations of Llama used a dense transformer architecture and were trained on a variety of public data sources, including webpages from CommonCrawl, GitHub, Wikipedia and Project Gutenberg. Earlier versions of Llama were effectively leaked and spawned many descendants, including Vicuna and Orca. Llama is available under an open license, allowing for free use of the models. Llama models are available in many locations including llama.com and Hugging Face.

Mistral

Mistral is a family of models from Mistral AI that includes both dense and mixture-of-experts designs. Mistral Large 2 was first released in July 2024. The model operates with 123 billion parameters and a 128k context window, supporting dozens of languages including French, German, Spanish and Italian, along with more than 80 coding languages. In November 2024, Mistral released Pixtral Large, a 124-billion-parameter multimodal model that can handle text and visual data. Mistral Medium 3, which the company touts as its "frontier-class multimodal model," was released in May 2025.

Mistral models are available via Mistral's API to those with a Mistral billing account.

o1

The OpenAI o1 model family was first introduced in September 2024. The o1 family consists of what OpenAI refers to as reasoning models, which reason through a problem or query before offering a response.

The o1 models excel in STEM fields, with strong results in mathematical reasoning (scoring 83% on a qualifying exam for the International Mathematics Olympiad compared to GPT-4o's 13%), code generation and scientific research tasks. While they offer enhanced reasoning and improved safety features, they operate more slowly than previous models due to their thorough reasoning processes and come with certain limitations, such as restricted access features and higher API costs. The models are available to ChatGPT Plus and Team users, with varying access levels for different user categories.

o3

OpenAI introduced the successor model, o3, in December 2024. According to OpenAI, o3 is designed to handle tasks that require more analytical thinking, problem-solving and complex reasoning, and it improves on o1's capabilities and performance. The o3 model became available to the public in April 2025.

o4-mini

Like others in the o-series, o4-mini is a reasoning model that aims to excel at tasks that require complex reasoning and problem-solving. OpenAI claims that o4-mini is superior to o3-mini across all key benchmarks. It comes in two versions, o4-mini and o4-mini-high; the latter uses more extensive reasoning for complex problems. Just like other mini variants from OpenAI, it is designed to be especially cost-efficient. The model also uses a technique called deliberative alignment, which aims to identify attempts to exploit the system and create unsafe content.

Orca

Orca is an LLM developed by Microsoft that has 13 billion parameters. It aims to improve on advancements made by other models by imitating the reasoning procedures achieved by LLMs. The research surrounding Orca involved teaching smaller models to reason the same way larger models do. Orca 2 was built on top of the 7 billion and 13 billion parameter versions of Llama 2.

Palm

The Pathways Language Model is a 540 billion parameter transformer-based model from Google that powered its AI chatbot Bard. It was trained across multiple TPU v4 Pods -- Google's custom hardware for machine learning. Palm specializes in reasoning tasks such as coding, math, classification and question answering. Palm also excels at decomposing complex tasks into simpler subtasks.

Palm gets its name from a Google research initiative to build Pathways, aiming to create a single model that serves as a foundation for multiple use cases. In October 2024, the Palm API was deprecated, and users were encouraged to migrate to Gemini.

Phi

Phi is a family of transformer-based language models from Microsoft. The Phi 3.5 models were first released in August 2024. Phi-4 models were released in late 2024 and early 2025. The series includes the base model, Phi-4-reasoning, Phi-4-reasoning-plus, Phi-4-mini-reasoning and Phi-4-mini-instruct.

Released under a Microsoft-branded MIT License, they are available for developers to download, use, and modify without restrictions, including for commercial purposes.

Qwen

Qwen is a large family of open models developed by Alibaba Cloud, part of Chinese internet giant Alibaba. The newest set of models is the Qwen 3 suite, which was pre-trained on almost twice the number of tokens that its predecessor was trained on. These models are suitable for a wide range of tasks, including code generation, structured data understanding, mathematical problem-solving and general language understanding and generation.

StableLM

StableLM is a series of open language models developed by Stability AI, the company behind image generator Stable Diffusion.

StableLM 2 debuted in January 2024 initially with a 1.6 billion parameter model. In April 2024 that was expanded to also include a 12 billion parameter model. StableLM 2 supports seven languages: English, Spanish, German, Italian, French, Portuguese, and Dutch. Stability AI positions these models as offering different options for various use cases, with the 1.6B model suitable for specific, narrow tasks and faster processing while the 12B model provides more capability but requires more computational resources.

Tülu 3

Allen Institute for AI's Tülu 3 is an open-source 405 billion-parameter LLM. The Tülu 3 405B model has post-training methods that combine supervised fine-tuning and reinforcement learning at a larger scale. Tülu 3 uses a "reinforcement learning from verifiable rewards" framework for fine-tuning tasks with verifiable outcomes -- such as solving mathematical problems and following instructions.
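The idea behind a verifiable reward is that some tasks have answers a program can check, so no learned reward model is needed to score them. The sketch below is a simplified illustration, not the Allen Institute for AI's code. The function names and the rule of treating the last number in a response as the final answer are assumptions.

```python
import re

def extract_final_answer(response):
    """Return the last number in a model response, or None if there is none."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", response)
    return numbers[-1] if numbers else None

def verifiable_reward(response, expected):
    """Return 1.0 when the final answer matches the known result, else 0.0."""
    answer = extract_final_answer(response)
    if answer is None:
        return 0.0
    return 1.0 if float(answer) == float(expected) else 0.0

# Hypothetical model responses paired with the known correct answer.
samples = [
    ("12 times 12 is 144", "144"),
    ("I think the answer is 150", "144"),
    ("I am not sure", "144"),
]
rewards = [verifiable_reward(response, expected) for response, expected in samples]
```

During reinforcement learning, scores such as these would feed the training signal, rewarding responses that can be confirmed as correct.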

Vicuna 33B

Vicuna is another influential open source LLM derived from Llama. It was developed by LMSYS and was fine-tuned using data from sharegpt.com. It is smaller and less capable than GPT-4 according to several benchmarks but does well for a model of its size. Vicuna has only 33 billion parameters.

LLM precursors

Although LLMs are a recent phenomenon, their precursors go back decades. Learn how recent precursor Seq2Seq and distant precursor ELIZA set the stage for modern LLMs.

Seq2Seq

Seq2Seq is a deep learning approach used for machine translation, image captioning and natural language processing. It was developed by Google and underlies some more modern LLMs, including Lamda. Seq2Seq also underlies AlexaTM 20B, Amazon's large language model. It uses a mix of encoders and decoders.

Eliza

Eliza was an early natural language processing program created in 1966. It is one of the earliest examples of a language model. Eliza simulated conversation using pattern matching and substitution. Eliza, running a certain script, could parody the interaction between a patient and therapist by applying weights to certain keywords and responding to the user accordingly. The creator of Eliza, Joseph Weizenbaum, wrote a book on the limits of computation and artificial intelligence.
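Pattern matching and substitution can be shown in a few lines. The sketch below is written in the spirit of Eliza and is not Weizenbaum's original script. The rules, reply templates and pronoun table are invented for illustration, and rule order stands in for the keyword weights that the original used.

```python
import re

# Hypothetical rules: a pattern to find and a reply template to fill in.
# Earlier rules take priority, a simple stand-in for keyword weights.
RULES = [
    (re.compile(r"\bI need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (mother|father)\b", re.IGNORECASE), "Tell me more about your {0}."),
]

# Swap first-person and second-person words so the reply points back at the user.
REFLECTIONS = {"my": "your", "me": "you", "i": "you", "am": "are", "your": "my"}

def reflect(fragment):
    """Substitute pronouns in the captured fragment."""
    words = fragment.rstrip(".!?").split()
    return " ".join(REFLECTIONS.get(word.lower(), word) for word in words)

def respond(text):
    """Return the reply for the first rule that matches, or a fallback."""
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(reflect(match.group(1)))
    return "Please go on."

reply = respond("I need a break from my job.")
```

Here, `reply` is "Why do you need a break from your job?" The program has no understanding of the sentence. It matches a keyword pattern, swaps the pronouns and slots the fragment into a canned response.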

Sean Michael Kerner is an IT consultant, technology enthusiast and tinkerer. He has pulled Token Ring, configured NetWare and been known to compile his own Linux kernel. He consults with industry and media organizations on technology issues.

Ben Lutkevich is site editor for Informa TechTarget Software Quality. Previously, he wrote definitions and features for Whatis.com.
