Llama: AI News Week Ending 08/08/2025

Image created with Flux Pro v1.1 Ultra. Image prompt: Ornate showgirl glamour in orange-and-teal tones, sparkling rhinestone llama prop onstage, stylized text “Llama” glowing in playful glitter letters above the performer’s head; spotlit, dramatic contrast, vintage grain, cinematic, high-detail

just one more cup before ollama gets ready https://x.com/ollama/status/1952762052755480893

LMStudio are using the upstream ggml implementation which is significantly better and well optimized. Looking at ollama’s modifications in ggml, they have too much branching in their MXFP4 kernels and the attention sinks implementation is really inefficient. Along with other https://x.com/ggerganov/status/1953088008816619637

getting ready for the day. @nvidia GeForce RTX is powered on. https://x.com/ollama/status/1952764954484027727

We’re excited to introduce a new parsing mode within LlamaCloud that lets you get complex visual recognition capabilities over documents 🖼️📑 at a cheaper price compared to pretty much anything else out there ⚡️ There’s a variety of VLM-enabled document parsing solutions out https://x.com/jerryjliu0/status/1953227974716665996