Image created with gemini-2.5-flash-image and claude-sonnet-4-5. Image prompt: Photorealistic wide shot of six completed Ionic limestone columns with classical entablature on a sunlit campus quad, ‘LLAMA’ carved in Roman serif letters in the center frieze, stone carving chisels and mallets resting at column bases beside fresh limestone blocks and marble chips on green grass, late afternoon golden light, red brick academic buildings in background, sharp focus, natural shadows, collaborative monument-building atmosphere.
A detailed look into the new WebUI of llama.cpp https://x.com/ggerganov/status/1985727389926555801
LlamaBarn v0.10.0 (beta) is out – feedback appreciated https://x.com/ggerganov/status/1986072781889347702
New Llama.cpp UI is a blessing for the local AI world 🌎 – Blazing fast, beautiful, and private (ofc) – Use 150,000+ GGUF models in a super slick UI – Drop in PDFs, images, or text documents – Branch and edit conversations anytime – Parallel chats and image processing – Math and https://x.com/victormustar/status/1985742628776706151
congrats to llama 3 large for winning the LLM trading contest by not participating https://x.com/yifever/status/1986064968262062088
How much RAM do you need to run tiny models? Jamba Reasoning 3B runs on just 2.25 GiB, the lightest among small models like Qwen (@Alibaba_Cloud), Llama (@Meta), Granite (@IBM), and Gemma (@GoogleDeepMind). 👉 Try Jamba Reasoning 3B yourself: https://x.com/AI21Labs/status/1986439953539076169
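As a rough guide to where numbers like 2.25 GiB come from, a model's weight footprint is approximately parameter count times bits per weight, plus runtime overhead for the KV cache and buffers. A back-of-envelope sketch (the 0.5 GiB overhead figure is an assumption for illustration; actual usage depends on quantization format, context length, and runtime):

```python
def estimate_ram_gib(n_params: float, bits_per_weight: float,
                     overhead_gib: float = 0.5) -> float:
    """Rough RAM estimate: quantized weights plus a fixed runtime overhead."""
    weights_gib = n_params * bits_per_weight / 8 / 2**30
    return weights_gib + overhead_gib

# A hypothetical 3B-parameter model at 4-bit quantization:
print(f"{estimate_ram_gib(3e9, 4):.2f} GiB")
```

This is only a lower bound in practice; long contexts grow the KV cache well beyond a fixed overhead.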
Qwen3-VL Accuracy Differences on Ollama vs MLX Video: https://x.com/andrejusb/status/1985612661447331981
I am very sympathetic to the delays in publishing papers, but I think we need to be careful with “AI can’t do this” claims when our empirical evidence pre-dates even o1-class Reasoners. The strongest model here is GPT-4 (which does better) and the next best is Llama 2 70B (!!)… https://x.com/emollick/status/1985610450709434527




