
Meta Llama 4 explained: Everything you need to know
Meta released Llama 4 -- a multimodal LLM that analyzes and understands text, images, and video data. There are three primary versions of Llama 4 -- Scout, Maverick and Behemoth.
When Meta, formerly Facebook, launched its large language model (LLM) Llama in February 2023, it was originally spelled LLaMA. The acronym stands for Large Language Model Meta AI.
Since Llama 2's release in July 2023, Meta has provided the model under an open permissive license, easing organizational access and use. Its multiple iterations have expanded Llama's capabilities and improved its standing among rivals, including models from OpenAI, Anthropic and Google.
On April 5, 2025, Meta released the Llama 4 model family, which it calls the Llama 4 herd.
What is Meta Llama 4?
Meta Llama 4 is a multimodal LLM that analyzes and understands text, images and video data. This fourth-generation model also supports text in a dozen languages, from Arabic and English to Thai and Vietnamese.
The Llama 4 models are the first LLMs in the Llama family to employ a mixture-of-experts architecture, in which only a subset of the model's total parameters activates for each input token. The approach aims to balance capability with efficiency.
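The mixture-of-experts idea is easier to see in code. Below is a minimal sketch of top-k expert routing in the general style Llama 4 describes; it is not Meta's implementation, and the layer sizes, expert count and routing depth are illustrative assumptions only.

```python
# Minimal sketch of mixture-of-experts (MoE) routing: a router picks a few
# experts per token, so only a fraction of total parameters runs for each token.
# Illustrative only -- not Meta's implementation; all sizes are made up.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleMoELayer(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, num_experts=16, top_k=2):
        super().__init__()
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        ])
        self.router = nn.Linear(d_model, num_experts)  # scores each expert per token
        self.top_k = top_k

    def forward(self, x):                      # x: (tokens, d_model)
        scores = self.router(x)                # (tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e       # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(4, 512)                   # 4 token embeddings
print(SimpleMoELayer()(tokens).shape)          # torch.Size([4, 512])
```

Even in this toy version, each token touches only 2 of the 16 expert feed-forward networks, which is why a model such as Maverick can hold 400 billion total parameters while activating only 17 billion per token.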
The Llama 4 community license is not an official Open Source Initiative-approved license, but Meta refers to its Llama 4 models as open source. The Meta Llama license provides free usage and modification of the Llama 4 models with certain limits. As of April 2025, the limit was 700 million monthly users. After that point, a commercial license is required.
The three primary versions of Llama 4 are Scout, Maverick and Behemoth. The Scout and Maverick models were available at launch, with the Behemoth model still in training. This chart compares them.
| Feature | Llama 4 Scout | Llama 4 Maverick | Llama 4 Behemoth |
| --- | --- | --- | --- |
| Active parameters | 17 billion | 17 billion | 288 billion |
| Number of experts | 16 | 128 | 16 |
| Total parameters | 109 billion | 400 billion | 2 trillion |
| Context window | 10 million tokens | 1 million tokens | Not specified |
| Knowledge cutoff | August 2024 | August 2024 | Not specified |
| Release date | April 5, 2025 | April 5, 2025 | Not yet released |
What can Meta Llama 4 do?
The Meta Llama 4 models are applicable across a wide range of operations, including:
- Native multimodality. Llama 4 models understand text, images and video simultaneously.
- Content summarization. Llama 4 models summarize multiple content types as part of the multimodal understanding.
- Long-context processing. The Llama 4 Scout examines and processes large volumes of content, thanks to its 10 million-token context window.
- Multilingual support. All Llama 4 models support multiple languages for text: Arabic, English, French, German, Hindi, Indonesian, Italian, Portuguese, Spanish, Tagalog, Thai and Vietnamese. However, image understanding is supported in English only.
- Text generation. Full text generation, including creative writing, is available with Llama 4 models; a brief code sketch follows this list.
- Advanced reasoning. The models reason through complex science and math problems.
- Code generation. Llama 4 understands and generates application code.
- Act as a base model. Perhaps most importantly, as an open model, Llama 4 can serve as the foundation for other models distilled from it.
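As a concrete example of the text-generation capability, here is a minimal sketch of loading a Llama 4 checkpoint through Hugging Face's transformers pipeline. The repo ID shown is an assumption based on how Meta publishes Llama releases; in practice, access requires accepting Meta's license on the model page, authenticating with a token and having substantial GPU memory available.

```python
# Hedged sketch: text generation with a Llama 4 checkpoint via Hugging Face.
# The repo ID below is an assumption; real access is gated behind Meta's
# license on huggingface.co and a valid access token.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-4-Scout-17B-16E-Instruct",  # assumed repo ID
    device_map="auto",  # spread the large model across available GPUs
)

prompt = "Summarize the difference between Llama 4 Scout and Maverick in two sentences."
result = generator(prompt, max_new_tokens=120)
print(result[0]["generated_text"])
```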
How was Meta Llama 4 trained?
Meta used a series of advanced techniques to train its fourth-generation Llama LLMs, aiming to improve accuracy and performance over prior iterations. The techniques used to train Llama 4 include the following:
- Training data. The foundation of all LLM training is training data; more is better. To that end, Meta trained Llama 4 on more than 30 trillion tokens, doubling the size of Llama 3's training data.
- Early fusion multimodality. The Llama 4 series was trained with the "early fusion" approach, which integrates text and vision tokens into a unified model. According to Meta, the approach creates a more natural understanding between visual and text information without separate encoders and decoders. A sketch of the pattern follows this list.
- Hyperparameter optimization. This technique sets critical model hyperparameters, such as per-layer learning rates, with the goal of more reliable and consistent training results.
- iRoPE architecture. The interleaved attention layers without positional embeddings architecture, or iRoPE, improves the handling of long sequences in training and supports the 10 million-token context window in Llama 4 Scout.
- MetaCLIP vision encoder. The new Meta vision encoder helps translate images into token representations, leading to better multimodal understanding.
- GOAT safety training. Meta used the Generative Offensive Agent Tester throughout training, a technique meant to highlight LLM susceptibilities and improve model safety.
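To make the early fusion idea more concrete, the sketch below shows the general pattern: image patches are projected into the same embedding space as text tokens and concatenated into a single sequence before the transformer backbone sees them. This is an illustrative outline under assumed dimensions and module names, not Meta's training code.

```python
# Illustrative sketch of early-fusion multimodality: vision tokens and text
# tokens share one embedding space and one sequence. Not Meta's actual code;
# dimensions and module names are assumptions for illustration.
import torch
import torch.nn as nn

d_model = 512
text_embed = nn.Embedding(32000, d_model)        # token IDs -> embeddings
vision_proj = nn.Linear(768, d_model)            # patch features -> same space

text_ids = torch.randint(0, 32000, (1, 12))      # a short text prompt
patch_feats = torch.randn(1, 64, 768)            # 64 image patches from a vision encoder

fused = torch.cat([vision_proj(patch_feats),     # image tokens first...
                   text_embed(text_ids)], dim=1)  # ...then text tokens
# 'fused' (1, 76, d_model) is what a single transformer backbone would consume,
# so attention mixes visual and textual information from the first layer onward.
print(fused.shape)
```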
Previous iterations of Llama
After the big debut of ChatGPT in November 2022, vendors of all sizes scrambled to find their footing in the LLM market. Meta was among the many that responded with in-house models. Its first Llama models, publicly announced in early 2023, provided only limited access. From Llama 2's mid-2023 release onward, all Llama models have been available under open licenses.
- Llama 1. The original was released in February 2023 with limited access.
- Llama 2. Released in July 2023 as the first Llama with an open license, Llama 2 was accessible and usable for free. This iteration has 7B, 13B and 70B parameter versions.
- Llama 3. The Llama 3 models debuted in April 2024, initially with 8B and 70B parameter versions.
- Llama 3.1. Llama 3.1, released in July 2024, added a 405B parameter model.
- Llama 3.2. This model, Meta's first fully multimodal LLM, was released in September 2024.
- Llama 3.3. Meta claimed at its December 2024 release that Llama 3.3's 70B variant offered the same performance as 3.1's 405B variant, with fewer compute requirements.
How Llama 4 compares to other models
The generative AI landscape is increasingly competitive, featuring major players such as OpenAI's GPT-4o and Google's Gemini 2.0, as well as open source projects such as DeepSeek.
Here's how Llama 4 rates on three benchmarks: MMMU, or Massive Multi-discipline Multimodal Understanding, for image reasoning; LiveCodeBench, for coding; and GPQA Diamond, or Graduate-Level Google-Proof Q&A Diamond, for reasoning and knowledge. A higher score is better.
| Benchmark | Llama 4 Maverick | Gemini 2.0 Flash | GPT-4o |
| --- | --- | --- | --- |
| MMMU image reasoning | 73.4 | 71.7 | 69.1 |
| LiveCodeBench | 43.4 | 34.05 | 32.3 |
| GPQA Diamond | 69.8 | 60.1 | 53.6 |
Where is Llama 4 available?
Meta Llama 4 Maverick and Scout are available from several sources.
- Llama.com. Download Scout and Maverick directly from the Meta-operated llama.com website for free.
- Meta.ai. The Meta.ai web interface provides browser-based access to Llama 4.
- Hugging Face. Llama 4 is also available at https://huggingface.co/meta-llama; see the download sketch after this list.
- Meta AI app. Llama 4 powers Meta's AI virtual assistant, which users can access through voice or text across various platforms to complete tasks such as summarizing text, creating content and answering queries.
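For the Hugging Face route, the weights can also be pulled programmatically with the huggingface_hub library, as in this minimal sketch. The repo ID is an assumption, and access is gated behind Meta's license acceptance and a Hugging Face access token.

```python
# Hedged sketch: downloading Llama 4 weights from Hugging Face.
# Requires `pip install huggingface_hub`, accepting Meta's license on the
# model page, and a valid access token. The repo ID is an assumption.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="meta-llama/Llama-4-Scout-17B-16E-Instruct",  # assumed repo ID
    token="hf_...",  # placeholder; substitute your own Hugging Face token
)
print("Model files downloaded to:", local_dir)
```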
Sean Michael Kerner is an IT consultant, technology enthusiast and tinkerer. He has pulled Token Ring, configured NetWare and been known to compile his own Linux kernel. He consults with industry and media organizations on technology issues.