Image created with gemini-3.1-flash-image-preview and claude-sonnet-4-5. Image prompt: Using the provided reference image, keep the exact compositional layout with subject dominating left third in tight profile and atmospheric blue-purple smoke dissolving rightward, but replace the central figure with a delivery driver in worn jacket, head bowed, glitter catching light on sleeve, with blurred small-town storefront fading into misty haze behind, maintaining the same cinematic melancholy and post-party emotional weight, category name ‘local’ in thin lowercase white Helvetica Neue Light on right two-thirds.
Gemma 4 E2B on iPhone 17 Pro Max in AI Edge Gallery! Using skills to query wikipedia. 🔥 App link below. [cr: @mweinbach]
https://x.com/_philschmid/status/2041171039598543064
Insane: I’m running Gemma 4 on my iPhone 16 Pro Max. Vibe coded the app in under 1h. Singularity is here
https://x.com/enjojoyy/status/2040563245925151229
Gemma 4 E4B is impressive for an on-device LLM: GPT-4-ish quality, though expect hallucinations. Here is: “List five sociological theories starting with u and what they are. Then describe them in a rhyming verse.” It’s in real time; the last one is a little bit of a stretch, but not bad!
https://x.com/emollick/status/2040851723774808310
I cancelled my Claude subscription. Gemma 4 is free, runs locally, and hits 80% … The gap is basically gone. Why are you still paying? 💵💰
https://x.com/AlexEngineerAI/status/2040260903053197525
GLM-5.1 can now be run locally!🔥 GLM-5.1 is a new open model for SOTA agentic coding & chat. We shrank the 744B model from 1.65TB to 220GB (-86%) via Dynamic 2-bit. Runs on a 256GB Mac or RAM/VRAM setups. Guide:
https://t.co/LgWFkhQ5rr GGUF:
https://x.com/UnslothAI/status/2041552121259249850
Google quietly launched an AI dictation app that works offline | TechCrunch
Google quietly launched an AI dictation app that works offline
Google’s Gemma 4 E2B running on-device on iPhone 17 Pro. Gemma 4 is built from the same research as Gemini 3, has image understanding capabilities, and can reason if needed. Running at ~40 tk/s with MLX optimized for Apple Silicon.
https://x.com/adrgrondin/status/2040512861953270226
Lots of people want Gemma 4! Google AI Edge is #8 on the iOS App Store for productivity apps.
https://x.com/OfficialLoganK/status/2040874501777317982
Gemma 2 Release – a google Collection
https://huggingface.co/collections/google/gemma-2-release
Gemma 3 Release – a google Collection
https://huggingface.co/collections/google/gemma-3-release
Gemma 4 – a google Collection
https://huggingface.co/collections/google/gemma-4
Gemma 4 is now available in the Gemini API and Google AI Studio. Use `gemma-4-26b-a4b-it` and `gemma-4-31b-it` with the same `google-genai` SDK as Gemini. 📝 Text generation with generate_content. 🧭 System instruction + Function Calling example. 🖼️ Image understanding example.
https://x.com/_philschmid/status/2041532358969446596
Run Gemma 4 locally with OpenClaw 🦀 in 3 steps:
https://x.com/googlegemma/status/2041512106269319328
There were some exceptionally cool demos from @ollama and omlx using MLX to run Qwen 3.5 and Gemma 4 on Apple silicon. The capabilities of local LLMs and the surrounding ecosystem have come a long way in the past couple years.
https://x.com/awnihannun/status/2042456446122803275
Gemma 4 finetuning for 2B, 4B, 26B, and 31B all works in Unsloth! We also fixed a few issues: 1. Gradient accumulation no longer causes losses to explode 2. IndexError for 26B and 31B during inference 3. use_cache=False produced gibberish for E2B and E4B 4. The -1e9 audio mask value overflows in float16
https://x.com/danielhanchen/status/2041516671119327590
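The gradient-accumulation fix is worth unpacking: naively averaging each accumulation step's mean loss is not the same as averaging over all tokens when the steps contain different numbers of non-padding tokens. A numbers-only sketch of the discrepancy (the token losses below are made up):

```python
# Two accumulation steps with unequal token counts:
# step 1 has 4 tokens, step 2 has 1 token.
step_token_losses = [[1.0, 1.0, 1.0, 1.0], [3.0]]

# Naive accumulation: average of per-step mean losses.
naive = sum(sum(s) / len(s) for s in step_token_losses) / len(step_token_losses)

# Correct accumulation: sum every token loss, divide by total token count.
all_tokens = [loss for step in step_token_losses for loss in step]
correct = sum(all_tokens) / len(all_tokens)

print(naive, correct)  # 2.0 vs 1.4 -- short steps get over-weighted
```

The short step's single token dominates the naive average, which is how accumulated losses drift away from (and can blow up relative to) full-batch training.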
Introducing Gemma 4, our series of open weight (Apache 2.0 licensed) models, which are byte for byte the most capable open models in the world! Gemma 4 is built to run on your hardware: phones, laptops, and desktops. Frontier intelligence with a 26B MoE and a 31B dense model!
https://x.com/OfficialLoganK/status/2039735606268314071
People underestimate the level of collaboration that needs to happen for a model such as Gemma 4 to land Before the launch, we worked with HF, VLLM, llama.cpp, Ollama, NVIDIA, Unsloth, Cactus, SGLang, Docker, CloudFlare, and so many others This ecosystem is amazing 🔥
https://x.com/osanseviero/status/2041154555530932578
Gemma 4 31B, quantized and evaluated. Instruction following evals are live on our NVFP4 and FP8-block model cards. Results look great. Reasoning and vision evals coming later this week. NVFP4:
https://t.co/GIc7y1Abkc FP8:
https://x.com/RedHat_AI/status/2040766645480628589
Gemma 4 is #1 on @huggingface!
https://x.com/ClementDelangue/status/2040911131108069692
Gemma 4 is a beast.
https://x.com/Yampeleg/status/2040495537598648357
Speculative decoding for Gemma 4 31B (EAGLE-3) A 2B draft model predicts tokens ahead; the 31B verifier validates them. Same output, faster inference. Early release. vLLM main branch support is in progress (PR #39450). Reasoning support coming soon.
https://x.com/RedHat_AI/status/2042660544797110649
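The draft-and-verify loop described above can be illustrated with a toy greedy version. Here `draft_next` and `verify_next` are stand-in next-token functions rather than real models, and the loop is a simplification of what EAGLE-style systems do:

```python
def speculative_decode(draft_next, verify_next, prompt, k=4, max_new=8):
    """Greedy speculative decoding sketch: the draft proposes k tokens,
    the verifier keeps the longest agreeing prefix and supplies its own
    token at the first mismatch. Output matches plain verifier decoding."""
    tokens = list(prompt)
    while len(tokens) - len(prompt) < max_new:
        # Draft model speculates k tokens ahead of the current context.
        ctx = tokens[:]
        draft = []
        for _ in range(k):
            t = draft_next(ctx)
            draft.append(t)
            ctx.append(t)
        # Verifier checks the proposals one by one.
        ctx = tokens[:]
        for t in draft:
            v = verify_next(ctx)
            tokens.append(v)
            ctx.append(v)
            if v != t:  # first mismatch: discard the rest of the draft
                break
    return tokens[len(prompt):len(prompt) + max_new]

next_int = lambda ctx: ctx[-1] + 1  # toy "model": count upward

# A perfect draft gets all k proposals accepted each round:
print(speculative_decode(next_int, next_int, [0]))       # [1, 2, 3, 4, 5, 6, 7, 8]
# A useless draft yields the same output; only the speed changes:
print(speculative_decode(lambda ctx: 0, next_int, [0]))  # [1, 2, 3, 4, 5, 6, 7, 8]
```

The second call is the key property the tweet highlights: verification guarantees the verifier's output, so a bad draft model costs speed, never correctness.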
Gemma 4 is the #1 trending model on @huggingface 🤗
https://x.com/GlennCameronjr/status/2040529333794824456
We taught a 1.3M parameter model to play DOOM. It outperforms LLMs up to 92,000x its size. Happy Easter Monday! Here’s our Easter egg release: SauerkrautLM-Doom-MultiVec-1.3M. 17.8 average points per episode. We benchmarked our tiny model against GPT-4o-mini (via OpenAI API),
https://x.com/DavidGFar/status/2041063368656585002
Some folks try to spin a narrative that I don’t like local models; meanwhile, I spent a lot of time making it easy to use OpenClaw with them. The latest release adds support for inferrs, a new, super-efficient TurboQuant inference server:
https://x.com/steipete/status/2041935840935371034
It’s insane! I have jumpy WiFi on a plane, but who cares. As long as a message goes through, my #openclaw (Alan Tea) is working; it’s running on an old computer 24/7. So so fun, no need so far to buy any other equipment
https://x.com/TheTuringPost/status/2041163715031261258