Image created with gemini-3.1-flash-image-preview and claude-sonnet-4-5. Image prompt: Using the provided reference image, preserve the exact square faceted perfume bottle with amber-gold liquid, crystal stopper, pure white background, soft shadow, and high-key studio lighting. Replace the label text with ‘Google’ in matching black serif typography. Add a delicate sterling silver chain draped naturally around the bottle neck below the stopper, with a small dainty magnifying glass pendant hanging from it, rendered as refined high-fashion jewelry, miniature scale, polished silver with circular lens detail.

Build autonomous agents that plan, navigate apps, and execute multi-step tasks – like searching databases or triggering APIs – with native tool use. With up to 256K context, it can analyze full codebases and retain complex action histories without losing focus.
https://x.com/GoogleDeepMind/status/2039735455533453316

Inbox Zero is a thing of the past. Introducing AI Inbox: cut through your email clutter with smart prioritization and daily personalized briefings. Rolling out today in Beta for Google AI Ultra subscribers in the US. →
https://x.com/gmail/status/2039107985281008078

NEW paper from Google DeepMind: The biggest threat to AI agents isn’t a smarter attacker. It’s the web itself. This work introduces the first systematic framework for understanding how the open web can be weaponized against autonomous agents. The paper defines “AI Agent Traps”:
https://x.com/omarsar0/status/2039383554510217707

GOOGLE IS CLOSE TO STRIKING A DEAL TO FUND ANTHROPIC’S DATA CENTER, ACCORDING TO FT.
https://x.com/FirstSquawk/status/2037586926375743904

. @googlegemma have open sourced the perfect model for local open source agents. Gemma 4 comes in all the sizes we need for mobile, local, and code. This is how I’ll be switching my @thdxr opencode agent over. Let’s go local agents.
https://x.com/ben_burtenshaw/status/2039740590091362749

🎉 Gemma 4 is officially available on vLLM! Byte-for-byte, these are the most capable open models for advanced reasoning and agentic workflows. Key features include: – Native Multimodal Support: Full vision and audio capabilities with up to a 256K context window. – Broad
https://x.com/vllm_project/status/2039762998563418385

A 12-month time difference between Gemma 3 27b and Gemma 4 31b. The jump is absolutely enormous. Just look at the evaluations between the two models. GPQA doubled, AIME 2026 went from ~20% to ~90%, and so on. Crazy.
https://x.com/kimmonismus/status/2039759264680747219?s=20

A Visual Guide to Gemma 4 With almost 40 (!) custom visuals, explore the new models from Google DeepMind. We explore various techniques, ranging from Mixture of Experts and the Vision Encoder all the way up to Per-Layer Embeddings and the Audio Encoder. Link below 👇
https://x.com/MaartenGr/status/2040099556948390075

Gemma 4 — Google DeepMind
https://deepmind.google/models/gemma/gemma-4/

Gemma 4 31B (Reasoning) is very token efficient, using ~1.2M tokens on the GPQA Diamond evaluation, fewer than peer models such as Qwen3.5 27B (~1.5M) and Qwen3.5 35B A3B (~1.6M).
https://x.com/ArtificialAnlys/status/2039752015811866652

Gemma 4 31B running with TurboQuant KV cache on MLX 🔥 128K context: → KV Memory: 13.3 GB → 4.9 GB (63% reduction) → Peak Memory: 75.2 GB → 65.8 GB (-9.4 GB) → Quality preserved TurboQuant compression scales with sequence length, so the longer the context, the bigger the
https://x.com/Prince_Canuma/status/2039840313074753896
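The reduction percentages in the tweet above check out against the raw figures it quotes; a quick sanity check (my own arithmetic, not from the tweet):

```python
# Sanity-check the quoted TurboQuant KV-cache figures for Gemma 4 31B at 128K context.
kv_before, kv_after = 13.3, 4.9        # KV memory, GB
peak_before, peak_after = 75.2, 65.8   # peak memory, GB

kv_reduction = (kv_before - kv_after) / kv_before
peak_saved = peak_before - peak_after

print(f"KV reduction: {kv_reduction:.0%}")   # 63%
print(f"Peak memory saved: {peak_saved:.1f} GB")  # 9.4 GB
```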

Gemma 4 outperforms models over 10x their size! (note the x-axis is log scale!)
https://x.com/demishassabis/status/2040067244349063326

Gemma 4: Our most capable open models to date
https://blog.google/innovation-and-ai/technology/developers-tools/gemma-4/

Gemma-4-31B is now live in Text Arena – ranking #3 among open models (#27 overall), matching much larger models at 10× smaller scale! A significant jump from Gemma-3-27B (+87 pts). Highlights: – #3 open (#27 overall), on par with the best open models Kimi-K2.5, Qwen-3.5-397b –
https://x.com/arena/status/2039739427715735645

Getting Started with Gemma 4 in AI Studio
https://x.com/GoogleAIStudio/status/2040090067709075732

Google just open-sourced Gemma 4. Unprecedented performance for advanced reasoning and agentic workflows, and a big leap in per-parameter efficiency. Use it now in KerasHub. I recommend the JAX backend – best performance!
https://x.com/fchollet/status/2039845249334510016

Google just re-entered the game 🔥🔥 They want to take the crown 👑 back from Chinese open source AI. And… Gemma 4 is FINALLY Apache 2.0 aka real-open-source-licensed. From what I’ve seen it’s going to be a pretty significant model. But give it a try yourself today: brew
https://x.com/ClementDelangue/status/2039941213244072173

got Gemma 4 up and running at 34 tokens per second this is the 26B-A4B model, running on my mac mini m4 with 16GB ram next time i hit my claude session limits i’ll have this fast free local AI as a backup :]
https://x.com/measure_plan/status/2040069272613834847

Got Gemma-4-26B-A4 MoE running on iPhone w/Flash SSD in Swift MLX. Still pretty slow, I expect 10+ t/s once optimized properly for Swift.
https://x.com/anemll/status/2040126326708031969

Introducing a Visual Guide to Gemma 4 👀 An in-depth, architectural deep dive of the Gemma 4 family of models. From Per-Layer Embeddings to the vision and audio encoders. Take a look!
https://x.com/osanseviero/status/2040105484061954349

Let’s look at how the open model Gemma has progressed across its last three versions. – Gemma 4 ranks 100 places above Gemma 3 – Gemma 3 ranks 87 places above Gemma 2 All three models from @GoogleDeepMind are roughly the same size (31B, 27B, 27B), and these gains came only 9 and 13
https://x.com/arena/status/2039848959301361716

Let’s go: Running a full AI assistant locally on a MacBook Air M4 with 16GB, completely free, open source, no API keys needed. Atomic Bot makes it really simple: install, pick Gemma 4, and you have an always-on AI agent running on your machine. No cloud. No subscription. No data
https://x.com/kimmonismus/status/2039989730901623049

Meet Gemma 4: our new family of open models you can run on your own hardware. Built for advanced reasoning and agentic workflows, we’re releasing them under an Apache 2.0 license. Here’s what’s new 🧵
https://x.com/GoogleDeepMind/status/2039735446628925907

NEW: Google releases Gemma 4, their most capable open models yet! 🤯 Apache-2.0, multimodal (text, image, and audio input), and multilingual (140 languages)! They can even run 100% locally in your browser on WebGPU. Watch it describe the Artemis II launch! 🚀 Try the demo! 👇
https://x.com/xenovacom/status/2039741226337935430

To explain why I consider Gemma 4 a bigger release than most people realize. This is a big deal because models like Gemma 4 E4B can run directly on devices, bringing powerful AI (even a 2B model ~60% on MMLU Pro) to phones, laptops, and edge systems without relying on the cloud,
https://x.com/kimmonismus/status/2039978863644537048

Today, we’re launching Gemma 4, our most intelligent open models to date. Built with the same breakthrough technology as Gemini 3, Gemma 4 brings advanced reasoning to your personal hardware and devices. Here’s what Gemma 4 unlocks for developers: — Intelligence-per-parameter:
https://x.com/GoogleAI/status/2039735543068504476

We just released Gemma 4 — our most intelligent open models to date. Built from the same world-class research as Gemini 3, Gemma 4 brings breakthrough intelligence directly to your own hardware for advanced reasoning and agentic workflows. Released under a commercially
https://x.com/Google/status/2039736220834480233

You can run Gemma 4 100% locally in your browser thanks to HF transformers.js. That means 100% private and 100% free! @xenovacom created a demo for it here:
https://x.com/ClementDelangue/status/2039782910996148508

run OpenClaw, Hermes Agent and Pi with Gemma 4 with few lines of change 🔥
https://x.com/mervenoyann/status/2039788257815261400

So happy to see Google release Gemma 4 today in apache 2.0 that gives you frontier capabilities locally. You can use it right away in all your favorite open agent platforms like openclaw, opencode, pi, Hermes by asking it to change your model to local gemma 4 with
https://x.com/ClementDelangue/status/2039740419899056152

Improve coding agents’ performance with Gemini API Docs MCP and Agent Skills.
https://blog.google/innovation-and-ai/technology/developers-tools/gemini-api-docsmcp-agent-skills/

Introducing Gemini 3.1 Flash Live, our new realtime model to build voice and vision agents!! We have spent more than a year improving the model + infra + experience, the results? A step function improvement in quality, reliability, and latency.
https://x.com/OfficialLoganK/status/2037187750005240307

Apple’s Gemini-powered Siri upgrade could still arrive this month – 9to5Mac

I would say that Suno is generally a better music generator at this point (though not for all songs), but Lyria is the first music creator available through an API, and seems to not have the same copyright & training issues that plague Suno and restrict how it can be used.
https://x.com/emollick/status/2036962853861662917

2 years at Google, time flies, we are just getting started : )
https://x.com/OfficialLoganK/status/2039362994832822454

Apache 2.0!! 👀
https://x.com/matvelloso/status/2039736260529635836

Flex and Priority tiers in the Gemini API
https://blog.google/innovation-and-ai/technology/developers-tools/introducing-flex-and-priority-inference/

Shawn doesn’t know… the robot will never forget this
https://x.com/adcock_brett/status/2039171786051629275

You asked, we delivered. If you’re a U.S. Google user, you can now change your account username for tools like Gmail, Photos, Drive and more — while keeping your emails, data and account history. Here’s what to know: 1️⃣ You can choose any available @gmail.com username. 2️⃣
https://x.com/Google/status/2038969843701989773

AIE Europe Day 1: Keynotes & OpenClaw/Personal Agents ft. OpenAI, Vercel, Google DeepMind & more – YouTube

Love that Google DeepMind is following OpenAI’s suit w/ using Apache 2.0 license for their open weights models – congrats! but, can we please stop using Arena Elo as the de facto measure of performance?
https://x.com/reach_vb/status/2040070816247734720

Quality of life updates to @GoogleAIStudio we just shipped (using Gemini): – You can now (optionally) save a temp chat in the playground – You can now turn a playground chat into an app in 2 clicks – Updated colors for playground to add some soul to it – Simplified the mobile
https://x.com/OfficialLoganK/status/2039137446932185266

So much of this, every day. You really have to develop thick skin (exoskeleton?) when working on successful open source. (The Chrome extension has been removed since Google added native access in 144+, which is simpler, but yes, it does require a one-time setting change)
https://x.com/steipete/status/2037988925818519763

How developers can use Veo 3.1 Lite for AI video generation
https://blog.google/innovation-and-ai/technology/ai/veo-3-1-lite/

Veo 3.1 Lite now available in Gemini API and @GoogleAIStudio. Designed for rapid prototyping and high-volume video generation, starting at $0.05/sec. 🪶 – 1/2 the cost of Veo 3.1 Fast. – Text-to-Video (T2V) & Image-to-Video (I2V). – Landscape (16:9) and Portrait (9:16) format –
https://x.com/_philschmid/status/2039014102811427263

meet the music playground, with Lyria 3. construct the perfect prompt with composer mode: describe it, hear it, then export to code and build.
https://x.com/GoogleAIStudio/status/2039055128276148454

Today we announce a new evaluation framework to improve AI benchmark reproducibility. By optimizing the ratio of the number of items to human raters per item, we can better capture the nuance of human disagreement in subjective tasks. Learn more:
https://x.com/GoogleResearch/status/2039014600927043926

.@GoogleDeepMind Gemma 4 is here with state-of-the-art models targeting edge and workstations. Requires Ollama 0.20+, now rolling out. 4 models: – 4B Effective (E4B): ollama run gemma4:e4b – 2B Effective (E2B): ollama run gemma4:e2b – 26B (4B active MoE): ollama run gemma4:26b
https://x.com/ollama/status/2039738348647108680

.@UnslothAI supports @GoogleGemma 4 models, optimized for RTX GPUs. 🦥 Run & fine-tune locally in Unsloth Studio.
https://x.com/NVIDIA_AI_PC/status/2040096993800761579

Axolotl support for Gemma 4 lands in the newly released v0.16.1! Finetune @GoogleAIStudio Gemma 4 26B-A4B on your own 5090 using our optimized fused MoE+LoRA kernels!
https://x.com/winglian/status/2039823559363629432

Deploy Gemma4 31B and 26B-A4B with one click on Hugging Face Inference Endpoints 🔥👇
https://x.com/ErikKaum/status/2040008281796513939

Excited to launch Gemma 4: the best open models in the world for their respective sizes. Available in 4 sizes that can be fine-tuned for your specific task: 31B dense for great raw performance, 26B MoE for low latency, and effective 2B & 4B for edge device use – happy building!
https://x.com/demishassabis/status/2039736628659269901

Flagship open-weight release days are always exciting. Was just reading through the Gemma 4 reports, configs, and code, and here are my takeaways: Architecture-wise, besides multimodal support, Gemma 4 (31B) looks pretty much unchanged compared to Gemma 3 (27B). Gemma 4
https://x.com/rasbt/status/2039780905619705902

future is local 🔥 Google DeepMind just released Gemma 4: local frontier in many sizes, all modalities with free license 🤯 we ship Gemma 4 in transformers, llama.cpp, transformers.js and more for your convenience 🫡 plug-and-play with your agents 🙌🏻 read our blog ⤵️
https://x.com/mervenoyann/status/2039739097611215344

Gemma
https://x.com/OfficialLoganK/status/2039486016751366431

Gemma 4 26B MoE (4B active) on a single RTX 4090: – 162 t/s decode – 8,400 t/s prefill – Full 262K native context — 19.5 GB VRAM – Only 10 Elo below the 31B dense Q8_0 on dual 4090+3090: 9,024 t/s prefill at 10K. 2,537 t/s at full 262K — that’s a novel in about 100
https://x.com/basecampbernie/status/2039847254534852783
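The tweet above is cut off mid-sentence, but the arithmetic on the numbers it quotes is easy to reproduce (my own back-of-envelope, not from the tweet):

```python
# Back-of-envelope: time to prefill the full native context at the quoted rate.
context_tokens = 262_144   # "full 262K native context"
prefill_rate = 2_537       # tokens/sec at full context (dual 4090+3090)

seconds = context_tokens / prefill_rate
print(f"{seconds:.0f} s")  # 103 s — a novel-length context in under two minutes
```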

Gemma 4 architecture analysis thread Just as Gemma3n, this thing has a galaxybrained architecture, very much not a standard transformer
https://x.com/norpadon/status/2039740827975500251

Gemma 4 by @GoogleDeepMind debuts at 3rd and 6th on the open source leaderboard, making it the #1 ranked US open source model. By total parameter count, Gemma 4 31B is 24× smaller than GLM-5 and 34× smaller than Kimi-K2.5-Thinking, delivering comparable performance at a
https://x.com/arena/status/2039782449648214247

Gemma 4 is here! The best open-source model you can run on your machine. Day-0 support in a llama.cpp. Check it out!
https://x.com/ggerganov/status/2039744468899811419

Gemma 4 is live on Baseten and available to all customers on day 0 via the Baseten model library. All models in the Gemma 4 family are multimodal, supporting text and image inputs with text output. Key capabilities include: -> Advanced reasoning and thinking -> Coding and
https://x.com/baseten/status/2039751071284015393

Gemma 4 is amazing. You’ll read that everywhere. Let’s focus on what is HUGE here: the revenge of dense models… Throw away your b200, not needed anymore, throw away the millions of lines of code we had to write to make MoEs faster, training stable etc… throw away your
https://x.com/art_zucker/status/2039740402517893361

Google DeepMind’s impressive fully-open Gemma 4 is live day-zero on Modular Cloud. Modular provides the fastest performance on NVIDIA Blackwell and AMD MI355X, thanks to MAX and Mojo🔥. The team took this impressive new model to production inference in days.🚀
https://x.com/clattner_llvm/status/2039738590213910558

google gemma 4 architecture is very interesting and every model has some subtle differences, here is a recap: > per layer embedding only on the small variant > no attention scale (usually you divide qk^T by sqrt(d), they don’t) > they do QK norm + V norm as well > they share
https://x.com/eliebakouch/status/2039751171556954531
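Two of the quirks in the recap above (QK/V norm, no 1/sqrt(d) divisor) are easy to sketch. A toy pure-Python single-head attention illustrating just those two points; the RMS-norm details and dimensions are my assumptions for illustration, not Gemma 4's actual implementation:

```python
import math

def rms_norm(v, eps=1e-6):
    """RMS-normalize a vector (stand-in for per-head QK/V norm)."""
    scale = math.sqrt(sum(x * x for x in v) / len(v) + eps)
    return [x / scale for x in v]

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention_no_scale(queries, keys, values):
    """Toy attention: normalize Q, K, and V, then skip the usual 1/sqrt(d) divisor."""
    keys = [rms_norm(k) for k in keys]
    values = [rms_norm(v) for v in values]
    out = []
    for q in queries:
        q = rms_norm(q)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in keys]  # no sqrt(d) scale
        probs = softmax(scores)
        out.append([sum(p * v[i] for p, v in zip(probs, values))
                    for i in range(len(values[0]))])
    return out
```

With Q and K normalized, the dot products are already bounded, which is one intuition for why the sqrt(d) divisor can be dropped.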

Google has released Gemma 4, a new family of multimodal open-weight models including Gemma 4 E2B, Gemma 4 E4B, Gemma 4 31B and Gemma 4 26B A4B @GoogleDeepMind’s new Gemma 4 family introduces four multimodal models supporting text, image, and video inputs. We evaluated Gemma 4
https://x.com/ArtificialAnlys/status/2039752013249212600

Google releases Gemma 4. ✨ Gemma 4 introduces 4 models: E2B, E4B, 26B-A4B, 31B. The multimodal reasoning models are under Apache 2.0. Run E2B and E4B on ~6GB RAM, and on phones. Run 26B-A4B and 31B on ~18GB. GGUFs: https://t.co/fpX21yWbge Guide:
https://x.com/UnslothAI/status/2039739190536286313

I have to give credit to Google for Apache 2.0 on Gemma 4! This is huge!
https://x.com/QuixiAI/status/2039862230452252926

Intel is partnering with @GoogleAI to deliver fully functional #Gemma4 models on Intel hardware from day zero–across Intel Xeon CPUs, Intel Xe GPUs, and Intel Core Ultra processors, with support across open frameworks including @vllm_project and @huggingface. This means
https://x.com/intelnews/status/2040106767258906707

Just do this: brew install llama.cpp --HEAD Then: llama-server -hf ggml-org/gemma-4-26B-A4B-it-GGUF:Q4_K_M
https://x.com/julien_c/status/2039746054355067002
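Once llama-server is up it serves an OpenAI-compatible chat endpoint on port 8080 by default. A minimal stdlib-only client sketch; host, port, and temperature here are assumptions (llama-server defaults), not anything specific to the Gemma GGUF above:

```python
import json
import urllib.request

def build_payload(prompt, temperature=0.7):
    """OpenAI-style chat payload accepted by llama-server."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

def chat(prompt, host="http://localhost:8080"):
    """Send one chat turn to a local llama-server and return the reply text."""
    req = urllib.request.Request(
        f"{host}/v1/chat/completions",
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Because the endpoint mimics the OpenAI API shape, existing OpenAI client libraries pointed at localhost:8080 generally work too.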

Let me demonstrate the true power of llama.cpp: – Running on Mac Studio M2 Ultra (3 years old) – Gemma 4 26B A4B Q8_0 (full quality) – Built-in WebUI (ships with llama.cpp) – MCP support out of the box (web-search, HF, github, etc.) – Prompt speculative decoding The result:
https://x.com/ggerganov/status/2039752638384709661

Say hello to Gemma 4 from @GoogleDeepMind 🚀🔥 💎 Comes in 4 sizes: E2B, E4B, 26B A4B, 31B 💎 Supports vision and reasoning 💎 Apache 2.0 💎 Available now in LM Studio
https://x.com/lmstudio/status/2039738625525502426

Son lead the development on HF/llama.cpp side for adding support for the new Gemma 4 models. As always, he did an outstanding job throughout the collaboration with the Google DeepMind team. Day-0 support is possible thanks to his hard work!
https://x.com/ggerganov/status/2039943099284140286

Thanks for following us! We’re excited to see what you all build with Gemma 4! In case you missed it, you can find all our checkpoints, with an Apache 2.0 License, on Hugging Face:
https://x.com/googlegemma/status/2040107948010242075

thinking about google’s gemma 4 and what it means a few months ago running something this capable locally meant serious hardware and serious tradeoffs on quality now it runs on your laptop, works offline on your phone (!!!), speaks 140 languages natively, 256k context window,
https://x.com/gregisenberg/status/2039853864082424198

Today we’re releasing Gemma 4, our new family of open foundation models, built on the same research and technology as our Gemini 3 series. These models set a new standard for open intelligence, offering SOTA reasoning capabilities from edge-scale (2B and 4B w/ vision/audio) up
https://x.com/JeffDean/status/2039748604232122707

Two years ago, we released Gemma, Google DeepMind’s family of open models. Today, I’m thrilled to share a new milestone: 400M Gemma downloads and 100,000 variants! Thank you to every developer, partner, and contributor. We can’t wait to see what you build next!👀
https://x.com/osanseviero/status/2039120000095547722

What you need to know about @googlegemma 4: 4️⃣ 4 sizes (E2B, E4B, 26B4A, 31B) 🪟 Up to 256K context window 🛠️ Native function-calling, structured JSON output 👁️ + audio on edge models (E2B/E4B) 🌍 Trained on 140+ languages 🏆 31B ranks #3 open model on Arena AI 🪪 Apache 2.0
https://x.com/_philschmid/status/2039736207676965264

Yowza! @ollama is on it with new Gemma 4 models
https://x.com/MichaelGannotti/status/2039903041642508541

Gemma 4 31B shifts the Pareto frontier, scoring +30 Arena points above similarly priced models like DeepSeek 3.2. Its position on the Pareto frontier is based on early pricing indicators from third parties.
https://x.com/arena/status/2040128319719670101

impressive, very nice. now let’s compare a 31b dense to a 31b active 670b total instead. flop for flop
https://x.com/stochasticchasm/status/2039912148676264334

MoE models differ from the likes of DeepSeek and Qwen: instead of using shared experts in parallel to the routed ones, Gemma adds MoE blocks as separate layers in addition to the normal MLP blocks. So the architecture is Attention -> MLP -> MoE
https://x.com/norpadon/status/2039750841754697767
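The block ordering described above (MoE as an extra residual sublayer after the dense MLP, not a replacement for it) can be sketched in a few lines. This is a toy illustration of the wiring only; the router, sublayer internals, and missing norms are placeholders, not Gemma 4's real code:

```python
def top1_route(x, gates):
    """Pick the expert whose gate vector has the largest dot product with x (toy router)."""
    scores = [sum(g_i * x_i for g_i, x_i in zip(g, x)) for g in gates]
    return max(range(len(scores)), key=scores.__getitem__)

def block(x, attn, mlp, experts, gates):
    """Sublayer order per the tweet: Attention -> dense MLP -> MoE,
    each applied as a residual add (norms omitted for brevity)."""
    x = [a + b for a, b in zip(x, attn(x))]   # attention sublayer
    x = [a + b for a, b in zip(x, mlp(x))]    # dense MLP sublayer
    expert = experts[top1_route(x, gates)]    # routed MoE sublayer, in addition to the MLP
    x = [a + b for a, b in zip(x, expert(x))]
    return x
```

Contrast with DeepSeek-style designs, where shared experts run in parallel with the routed ones inside a single MoE sublayer.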

Nemotron Super / Ultra Arcee Trinity Large (soon) Gemma 4 (eventually) Reflection’s first models (maybe) GPT OSS 2? (maybe) Thinky? Other neolabs? Things looking up for open models built in the US in 2026. We had 0 for a bit there.
https://x.com/natolambert/status/2039499358325129530

Almost everyone is talking about @GoogleResearch’s TurboQuant (and for good reason) ➡️ It lets you run a 3-bit system with the accuracy of a full-precision model. Technically, TurboQuant is a compression algorithm that shrinks high‑dimensional vectors to low precision without
https://x.com/TheTuringPost/status/2037182800466698718
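To make the "shrink high-dimensional vectors to low precision" idea concrete, here is plain uniform 3-bit quantization of a vector. This is NOT the TurboQuant algorithm (which adds transforms to get near-full-precision accuracy), just the baseline low-bit idea it improves on:

```python
def quantize(vec, bits=3):
    """Uniformly quantize a float vector to `bits` bits per element.
    Returns integer codes plus the (lo, step) needed to dequantize."""
    lo, hi = min(vec), max(vec)
    levels = (1 << bits) - 1          # 7 levels for 3 bits
    step = (hi - lo) / levels or 1.0  # avoid divide-by-zero on constant vectors
    codes = [round((x - lo) / step) for x in vec]
    return codes, lo, step

def dequantize(codes, lo, step):
    return [lo + c * step for c in codes]

v = [0.0, 0.1, 0.5, 0.9, 1.0]
codes, lo, step = quantize(v)
approx = dequantize(codes, lo, step)
# reconstruction error is bounded by half a quantization step
max_err = max(abs(a - b) for a, b in zip(v, approx))
```

Each element now needs only 3 bits plus a shared (lo, step) pair, which is where the memory savings come from; the research challenge is keeping accuracy at that budget.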

We need to publicly clarify serious issues in Google’s ICLR 2026 paper TurboQuant. TurboQuant misrepresents RaBitQ in three ways: 1. Avoids acknowledging key methodological similarity (JL transform) 2. Calls our theory “suboptimal” with no evidence 3. Reports results under
https://x.com/gaoj0017/status/2037552350924042488

Video’s here to stay – introducing Veo 3.1 Lite, our most cost efficient video generation model to date, and on April 7th we are also reducing the price for Veo 3.1 Fast : )
https://x.com/OfficialLoganK/status/2039015034286694618
