Image created with Flux Pro Ultra. Image prompt: A Minecraft screenshot displaying a vast interconnected network of portals linking different biomes and realms with players teleporting between them, with “META” written in pixelated Minecraft font across the top

“22/ @ferdousbhai built a new LLM client that runs in your terminal, supports all the models with your own api key and custom functions via Model Context Protocol. Less than 250 lines of code, made possible thanks to @Pydantic_AI and @textualizeio https://x.com/AtomSilverman/status/1910057879656448041
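
For readers who want a feel for that stack, a minimal Pydantic AI agent with one custom tool looks roughly like this (a sketch, not ferdousbhai's actual client; the model string and tool are illustrative, and an OPENAI_API_KEY is assumed in the environment):

```python
# Minimal Pydantic AI agent sketch with one custom tool.
# Illustrative only; not the terminal client from the tweet.
from pydantic_ai import Agent

agent = Agent("openai:gpt-4o")  # any supported "provider:model" string works

@agent.tool_plain
def word_count(text: str) -> int:
    """Count the words in a piece of text."""
    return len(text.split())

result = agent.run_sync("Use the word_count tool on 'hello brave new world'.")
print(result.output)  # called .data in older pydantic-ai releases
```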

“Introducing Mocha ✨ We’re incredibly excited to launch the absolute best way to build full-stack web apps. No code. No templates. Just describe what you want and watch it come to life. https://x.com/nichochar/status/1906748998322655529

“We’re excited to introduce a brand-new layout agent within LlamaParse that gives you the best-in-class document parsing and extraction with precise visual citations. It uses SOTA VLM models to 1) detect all the blocks on a page (tables/charts/paragraphs), and 2) dynamically https://x.com/llama_index/status/1909264185034506590
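
The tweet doesn't name the option that enables the new layout agent, so here is just the basic SDK shape for context (a hedged sketch; the result_type and file path are illustrative, and a LLAMA_CLOUD_API_KEY is assumed):

```python
# Generic LlamaParse usage sketch; the new layout-agent option is not shown
# because the tweet does not name it. Check the LlamaParse docs for the flag.
from llama_parse import LlamaParse

parser = LlamaParse(result_type="markdown")  # needs LLAMA_CLOUD_API_KEY set
docs = parser.load_data("report.pdf")        # hypothetical local file
print(docs[0].text[:500])
```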

“We evaluated Llama 4 ourselves: On GPQA Diamond, Maverick and Scout scored 67% and 52%, similar to Meta’s reported 69.8% and 57%. On MATH Level 5, Maverick and Scout got 73% and 62%. Maverick is competitive with leading open or low-cost models, and both outperform Llama 3. https://x.com/EpochAIResearch/status/1909700016249479506

“Llama 4 analysis v1: 1. Maverick mixes MoE layers & dense – every odd MoE 2. Scout uses L2 Norm on QK (not QK Norm) 3. Both n_experts = 1 4. Official repo uses torch.bmm (not efficient) 5. Maverick layers 1, 3, 45 MoE are “special” layers 6. 8192 chunked attention Details: 1. https://x.com/danielhanchen/status/1909726119500431685
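
Point 2 is easy to miss: QK-Norm applies a learned RMSNorm to queries and keys, while plain L2 normalization just rescales them to unit length with no learned parameters. A toy PyTorch illustration of the latter (shapes are arbitrary, not Scout's actual dimensions):

```python
# Toy illustration of L2-normalizing queries/keys before attention,
# as distinct from QK-Norm (which uses a learned RMSNorm). Shapes arbitrary.
import torch
import torch.nn.functional as F

def l2_norm_qk(q, k, eps=1e-6):
    q = q / (q.norm(dim=-1, keepdim=True) + eps)  # unit length per head, no learned scale
    k = k / (k.norm(dim=-1, keepdim=True) + eps)
    return q, k

q = torch.randn(1, 8, 16, 64)  # (batch, heads, seq, head_dim)
k = torch.randn(1, 8, 16, 64)
q, k = l2_norm_qk(q, k)
attn = F.softmax(q @ k.transpose(-2, -1) / 64 ** 0.5, dim=-1)
print(attn.shape)  # torch.Size([1, 8, 16, 16])
```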

“Llama 4 is here! Meta has released two smaller versions of its new Llama 4 family of models: Llama 4 Scout and Maverick, and announced a larger version called Behemoth that is still in training. In this thread, we dig into their training details and benchmark performance 🧵” / X https://x.com/EpochAIResearch/status/1909699970594394173

“Llama 4 independent tests suggest Maverick is a very solid model, but not enough to beat DeepSeek v3 (non-reasoner version), though size-performance trade-offs make it hard to do exact comparisons.” / X https://x.com/emollick/status/1909632109964108285

“Llama-4 doesn’t disappoint! My notes: – Ease of deployment is now a more important OSS feature than sheer size. There’s emphasis that Llama 4 Scout can run on a single H100, as opposed to Llama-3-405B, which was powerful but ultimately had lesser adoption. Mixture of Expert is a https://x.com/DrJimFan/status/1908615861650547081

“You’re being unfair in a few ways. I disagree that: 1. “The soul of the Llama series died by not releasing enough models frequently enough.” Better to have fewer, better releases than more worse releases. It’s an impossible balance; I prefer Meta’s strategy to what you suggest.” / X https://x.com/jefrankle/status/1909244633764261987

“NEW: Llama 3.1 Nemotron Ultra 253B – beats Llama 4 Behemoth, Maverick & competitive with DeepSeek R1 – Commercially permissive! 🔥🔥🔥 > Open weights on the hub! https://x.com/reach_vb/status/1909584596401815691

“We’ve seen questions from the community about the latest release of Llama-4 on Arena. To ensure full transparency, we’re releasing 2,000+ head-to-head battle results for public review. This includes user prompts, model responses, and user preferences. (link in next tweet) Early” / X https://x.com/lmarena_ai/status/1909397817434816562

“New free & open source Together AI example app! Screenshot -> code powered by Llama 4. https://x.com/togethercompute/status/1910369366056882217

“Llama-4 Maverick on BigCodeBench-Full 61.9% Complete 49.7% Instruct Average 55.8% Both GPT-4o-2024-05-13 & DeepSeek V3 got 56.1% on average. There may be some gaps between Llama-4 Maverick and the recent (3-month) frontier models, given the fast pace in AI development these” / X https://x.com/terryyuezhuo/status/1909275015511687179

“We’ve linearized the experts in Llama-4 Scout; you can now fine-tune w/ 2x48GB GPUs @ 4k context. (adapters on self attention and shared experts). 8k context + adapters on experts uses only 2x53GB Support for linearized Llama-4 now in Axolotl OSS v0.8.1. Details & Model in 🧵” / X https://x.com/winglian/status/1909413876669558967
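
To make "adapters on self attention" concrete, here is a rough PEFT-style equivalent rather than the actual Axolotl config from the thread (the model id and LoRA hyperparameters are illustrative stand-ins):

```python
# Rough PEFT sketch of attaching LoRA adapters to self-attention only.
# Not the Axolotl v0.8.1 config from the tweet; values are illustrative.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-135M")  # small stand-in
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # self-attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()
```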

“@UnslothAI Smaller Llama size, same Llama power 💪 Absolutely stoked to see what the world builds with Llama 4” / X https://x.com/AIatMeta/status/1910010433576264036

“Today is the start of a new era of natively multimodal AI innovation. Today, we’re introducing the first Llama 4 models: Llama 4 Scout and Llama 4 Maverick — our most advanced models yet and the best in their class for multimodality. Llama 4 Scout • 17B-active-parameter model https://x.com/AIatMeta/status/1908598456144531660

“Some clarifications about Llama-4.” / X https://x.com/ylecun/status/1909313264460378114

“chat, Llama 4 is so back! – with some more thoughtful post-training, we’ve got a pretty strong model here! 🔥 https://x.com/reach_vb/status/1909658152234033200

“Meta announced MoCha, an AI that turns speech and text into movie-grade talking/singing character animations It enables multi-character conversations with turn-based dialogue generation, and near-perfect lip-sync https://x.com/adcock_brett/status/1908913575403323545

“We’re glad to start getting Llama 4 in all your hands. We’re already hearing lots of great results people are getting with these models. That said, we’re also hearing some reports of mixed quality across different services. Since we dropped the models as soon as they were” / X https://x.com/Ahmad_Al_Dahle/status/1909302532306092107

“The Llama 4 model that won in LM Arena is different than the released version. I have been comparing the answers from Arena to the released model. They aren’t close. The data is worth a look also as it shows how LM Arena results can be manipulated to be more pleasing to humans. https://x.com/emollick/status/1909414182962790467

“Just built a Llama 4 company researcher 🌐 Ask any information about a company then it will search the web and extract structured data for you using Firecrawl’s new /extract endpoint. Built with Meta’s new Llama 4 Maverick Model, @togethercompute, and @firecrawl_dev. https://x.com/nickscamara_/status/1910361430970515787
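
For a sense of what a call to that endpoint looks like, a hedged raw-HTTP sketch (the path, payload fields, and schema here are assumptions drawn from Firecrawl's public docs, not the app's actual code):

```python
# Hedged sketch of Firecrawl's /extract endpoint over plain HTTP.
# Path and payload fields are assumptions; verify against current Firecrawl docs.
import os
import requests

resp = requests.post(
    "https://api.firecrawl.dev/v1/extract",
    headers={"Authorization": f"Bearer {os.environ['FIRECRAWL_API_KEY']}"},
    json={
        "urls": ["https://example.com/*"],
        "prompt": "Extract the company name, founding year, and main product.",
        "schema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "founded": {"type": "integer"},
                "product": {"type": "string"},
            },
        },
    },
)
print(resp.json())
```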

“The power of Llama 4 🤝 The simplicity of Vertex AI What will you build?” / X https://x.com/AIatMeta/status/1910034596638646584

“Llama 4 just released with a 10M context window. You could fit the entire Harry Potter series, A Song of Ice and Fire (Books 1–5), The Lord of the Rings, The Hobbit, The Bible, The Quran, and Dune. Still with millions of tokens to spare. This is library-scale reasoning. https://x.com/skirano/status/1908613559069635032
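
The arithmetic roughly checks out, using commonly cited word counts and the usual ~1.33 tokens-per-word rule of thumb for English (both rough estimates, not exact figures):

```python
# Back-of-envelope check of the "library in one context window" claim.
# Word counts are commonly cited rough estimates; 1.33 tokens/word is a rule of thumb.
words = {
    "Harry Potter (7 books)": 1_084_000,
    "A Song of Ice and Fire (1-5)": 1_737_000,
    "The Lord of the Rings": 481_000,
    "The Hobbit": 95_000,
    "The Bible (KJV)": 783_000,
    "The Quran (English)": 78_000,
    "Dune": 188_000,
}
tokens = sum(words.values()) * 1.33
print(f"~{tokens / 1e6:.1f}M of a 10M-token window")  # ~5.9M, millions to spare
```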

“Llama 4 Intelligence Index Update: We have now replicated Meta’s claimed values for MMLU Pro and GPQA Diamond, pushing our Intelligence Index scores for both Scout and Maverick higher Key update details: ➤ We noted in our first post 48 hours ago that we noticed discrepancies https://x.com/ArtificialAnlys/status/1909624239747182989

“See that purple banner on the Llama 4 models? It’s Xet storage, and this is actually huge for anyone building with AI models. With models getting bigger and downloads exploding, Git LFS is becoming less practical. Xet lets you version large files like code, with compression and https://x.com/fdaudens/status/1908646412125941989

“3 important updates from last week: 1. @AIatMeta’s surprise Saturday release of the Llama 4 herd: It sparked initial hype, quickly followed by widespread criticism. While mixture-of-experts (MoE) architecture allows for massive parameter counts, users report underwhelming https://x.com/TheTuringPost/status/1909933246823223668

“Llama 4 Scout is now live on the @vercel AI SDK Playground. Bewildering how fast @groqinc serves it. https://x.com/rauchg/status/1908616519430631552

“Hopefully the Llama 4 models improve rapidly, as they did in the Llama 3 generation. The initial launch got pretty mixed feedback (including from me) but a good open weights model from Meta would be very useful for many people.” / X https://x.com/emollick/status/1909306675174977637

“Me (earlier this year): “Llama models aren’t optimized for production.” Meta: “Bet. Here’s the Llama 4 suite, MoE models with 16 & 128 experts” Me: “Yeah… maybe dense wasn’t so bad after all.”” / X https://x.com/rasbt/status/1909041971970072707

“Run Llama 3.2 from scratch – smol PyTorch implementation of Llama 3.2 text models with minimal code dependencies ⚡ by @rasbt on Hugging Face 🤗 https://x.com/reach_vb/status/1910353750746681805

“This is the clearest evidence that no one should take these rankings seriously. In this example it’s super yappy and factually inaccurate, and yet the user voted for Llama 4. The rest aren’t any better. https://x.com/vikhyatk/status/1909403603409969533

The Llama 4 herd: The beginning of a new era of natively multimodal AI innovation https://ai.meta.com/blog/llama-4-multimodal-intelligence/

“Pretty impressive outcomes in the leaderboard as well for Llama 4.” / X https://x.com/emollick/status/1908625383257436165

“@omarsar0 have you tested llama 4? It’s generating slop for me, but some saying it’s good.” / X https://x.com/Yuchenj_UW/status/1909062763789566100

“👀 Curious about Llama 4? Jump into Smol Arena and test it yourself! Play around with any AI model under 30B – no fancy stuff needed. Come geek out with us! 🤖” / X https://x.com/fdaudens/status/1909707381933568036

“We are excited to partner with @AIatMeta to welcome Llama 4 Maverick (402B) & Scout (109B) natively multimodal Language Models on the Hugging Face Hub with Xet 🤗 Both MoE models trained on up to 40 Trillion tokens, pre-trained on 200 languages and significantly outperforms its https://x.com/huggingface/status/1908600868074639806

“🎊 Llama Nemotron Ultra 253B is here 🎊 ✅ 4x higher inference throughput over DeepSeek R1 671B 🏆Highest accuracy on reasoning benchmarks: 💎 GPQA-Diamond for advanced scientific reasoning 💎 AIME 2024/25 for complex math 💎 LiveCodeBench for code generation and completion https://x.com/NVIDIAAIDev/status/1909742262814490840

“If Meta actually did this for Llama 4 training to maximize benchmark scores, it’s fucked. https://x.com/Yuchenj_UW/status/1909061004207816960

“Gemini 2.5 Pro and Llama-4 results on Tic-Tac-Toe-Bench! playing as O (depends more on generalization) – Gemini 2.5 Pro is the 6th best model, but surprisingly worse than all the other frontier thinking models! – Llama-4 Maverick scores below both Llama-3 70B versions because https://x.com/scaling01/status/1909028821396836369

“Meta announced three MoE-based Llama 4 models: 109B param Scout, 400B param Maverick, and 2T param Behemoth (in training) Scout features a 10M context window and beats Gemma 3 & Mistral 3 Maverick, with a 1M window, outperforms GPT-4o and Gemini 2.0 https://x.com/rowancheung/status/1909154238464147838
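
For anyone new to the architecture, a toy top-1 MoE layer (one routed expert per token plus a shared expert, the scheme Meta describes for Llama 4) might look like this; dimensions are illustrative and far smaller than the real models:

```python
# Toy top-1 mixture-of-experts layer: every token goes through a shared expert,
# plus the single routed expert with the highest router score. Illustrative only.
import torch
import torch.nn as nn

class Top1MoE(nn.Module):
    def __init__(self, dim: int = 64, n_experts: int = 4):
        super().__init__()
        self.router = nn.Linear(dim, n_experts)
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_experts))
        self.shared = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, dim)
        weight, idx = self.router(x).softmax(dim=-1).max(dim=-1)  # top-1 per token
        routed = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = idx == e
            if mask.any():
                routed[mask] = weight[mask].unsqueeze(-1) * expert(x[mask])
        return self.shared(x) + routed

moe = Top1MoE()
print(moe(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
```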

“Llama-4 Series on BigCodeBench-Hard *Inference via NVIDIA NIM Llama-4 Maverick Ranked 41st/192 Similar to Gemini-2.0-Flash-Thinking & GPT-4o-2024-05-13 29.1% Complete 25% Instruct Llama-4-Scout Ranked 97th/192 16.9% Complete 16.9% Instruct Also, new visuals on the leaderboard! https://x.com/terryyuezhuo/status/1909247540379148439

nvidia/Llama-3_1-Nemotron-Ultra-253B-v1 · Hugging Face https://huggingface.co/nvidia/Llama-3_1-Nemotron-Ultra-253B-v1#evaluation-results

“On vibes, I don’t think the currently released Llama 4 models are a big enough increase in capability that they definitively take the lead in open weights models from the main Chinese models (and open weights continues to lag frontier closed). Meta is still training, though.” / X https://x.com/emollick/status/1908921025057653175

“@Ahmad_Al_Dahle Congrats on the release again! Quite excited about future llamas 🦙 Always in awe of your commitment to open science and weights! It always takes some time to iron out all edge cases etc 🤗” / X https://x.com/reach_vb/status/1909316136526832054

“Github 👨‍🔧: TTS Towards Human-Sounding Speech ——- → Leverages a Llama-3b LLM backbone for speech synthesis, showcasing emergent capabilities. → Produces highly natural speech output, focusing on realistic intonation, emotion, and rhythm. → Supports zero-shot voice https://x.com/rohanpaul_ai/status/1909126492971536685

5/Apr/2025 – ASI checklist item #5, Llama 4 Behemoth 2T on ~30T tokens, 1X NEO AGI – YouTube https://www.youtube.com/watch?v=oipjJwRVW20&t=888s
