Meta Unveils Llama 4 Scout and Maverick with Massive Token Window

Munyaradzi Mafaro · Apr 7, 2025

Meta has released Llama 4, a new family of AI models that are natively multimodal, handling both text and images from the start. Llama 4 Scout has 17 billion active parameters spread across 16 experts. It runs on a single NVIDIA H100 GPU using 4-bit (Int4) quantization. Scout supports a context window of up to 10 million tokens, surpassing earlier models such as Google Gemini 1.5 Pro, which topped out at 2 million tokens.

Scout uses a mixture-of-experts (MoE) design that activates only a subset of its parameters for each token, which speeds up inference and lowers cost. A second model, Llama 4 Maverick, also has 17 billion active parameters but draws on 128 experts, for about 400 billion total parameters. Maverick performs well on coding, image understanding, multilingual tasks, and reasoning, and according to Meta's benchmarks it outperforms many leading models from other companies.
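
To make the routing idea concrete, here is a minimal sketch of how a mixture-of-experts layer might pick experts for one token. It is an illustration under stated assumptions, not Meta's implementation: the function names (`route_token`, `moe_layer`), the softmax router, the `top_k` setting, and the toy scalar "experts" are all hypothetical.

```python
import math

def softmax(scores):
    """Convert raw router scores into probabilities."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_token(router_scores, top_k=1):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts.

    Weights are renormalized so the chosen experts' weights sum to 1.
    """
    probs = softmax(router_scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:top_k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]

def moe_layer(token, experts, router_scores, top_k=1):
    """Run only the selected experts and combine their weighted outputs."""
    output = 0.0
    for index, weight in route_token(router_scores, top_k):
        output += weight * experts[index](token)
    return output

# Toy setup (assumption): 16 experts, each just scales a scalar "token".
NUM_EXPERTS = 16
experts = [lambda x, scale=i + 1: x * scale for i in range(NUM_EXPERTS)]

# Hypothetical router scores that favor expert 3 for this token.
scores = [0.0] * NUM_EXPERTS
scores[3] = 5.0

result = moe_layer(2.0, experts, scores, top_k=1)
```

With `top_k=1`, only one of the 16 toy experts runs for this token, so the other 15 cost nothing at inference time. That is the sense in which a model can hold many more total parameters than it activates per token.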

Both Scout and Maverick use early fusion, combining image and text tokens at the start of processing. A vision encoder based on MetaCLIP lets them handle multiple images alongside text, and all of these tokens feed into a single shared model backbone. This helps with image understanding and with grounding, that is, locating exactly where objects are in an image. The models can describe images, answer questions about what they see, and analyze how images change over time.

The largest model Meta has built is Llama 4 Behemoth, with 288 billion active parameters and nearly two trillion total parameters. It serves as a teacher for both Scout and Maverick through distillation. Behemoth is still in training. Once finished, Meta expects it to rank among the best AI systems available. Meta trains these models in FP8 precision, unlike the Llama 3 models, which used FP16 and FP8. The company found ways to use lower-precision arithmetic while maintaining quality.
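
The parameter counts and precisions above lend themselves to some back-of-the-envelope arithmetic. The sketch below is a rough illustration only: it counts weight storage alone (no activations, optimizer state, or key-value cache), treats "about 400 billion" and "almost two trillion" as round numbers, and uses hypothetical helper names (`weight_memory_gb`, `active_fraction`).

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"FP16": 2.0, "FP8": 1.0, "4-bit": 0.5}

def weight_memory_gb(num_params, precision):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9

def active_fraction(active_params, total_params):
    """Share of the model's parameters used for any single token."""
    return active_params / total_params

# Figures from the article, rounded (assumption: totals are approximate).
MAVERICK_ACTIVE, MAVERICK_TOTAL = 17e9, 400e9
BEHEMOTH_ACTIVE, BEHEMOTH_TOTAL = 288e9, 2e12

maverick_share = active_fraction(MAVERICK_ACTIVE, MAVERICK_TOTAL)   # 0.0425
behemoth_share = active_fraction(BEHEMOTH_ACTIVE, BEHEMOTH_TOTAL)   # 0.144

maverick_fp16_gb = weight_memory_gb(MAVERICK_TOTAL, "FP16")  # 800.0
maverick_fp8_gb = weight_memory_gb(MAVERICK_TOTAL, "FP8")    # 400.0
```

Under these assumptions, Maverick activates roughly 4 percent of its parameters per token, and halving the bytes per parameter halves the weight storage, which is the basic appeal of lower-precision formats.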

Meta benchmarked these models against competitors from Google, Anthropic, and OpenAI, and the results show Llama 4 performing well against them. Much of the gain comes from combining text and images from the start. The long context window lets the models work with far more information than before, and the mixture-of-experts design speeds up inference while using less compute than the total parameter count would suggest.
