Image created with gemini-2.5-flash-image and claude-sonnet-4-5. Image prompt: Photorealistic 35mm cinema shot of child aged 6-8 from side angle sitting on plush rug in warm-lit bedroom, holding silicon wafer disc like treasure, surrounded by arc of glowing TV screens displaying abstract circuit patterns, scattered semiconductor wafers on open books and newspapers, colorful RAM sticks and processors arranged like toys nearby, shallow depth of field, warm pastels contrasted with cool blue screen glow, soft focus, gentle composition, large bold text CHIPS at top of frame, tender and intimate mood, cozy uncanny aesthetic.
Tinker is now generally available. We also added support for advanced vision input models, Kimi K2 Thinking, and a simpler way to sample from models. https://x.com/thinkymachines/status/1999543421631946888
Tinker: General Availability and Vision Input – Thinking Machines Lab https://thinkingmachines.ai/blog/tinker-general-availability/
Today we are releasing Tinker to everyone, and now with vision input! You can now finetune a frontier Qwen3-VL-235B on your own image+text data, bringing your own algorithm (SFT, RL, something else?). We’ll take care of the GPU infra. Full update: https://x.com/rown/status/1999544121984245872
NVIDIA Debuts Nemotron 3 Family of Open Models | NVIDIA Newsroom https://nvidianews.nvidia.com/news/nvidia-debuts-nemotron-3-family-of-open-models
OpenAI in Talks to Raise At Least $10 Billion From Amazon and Use Its AI Chips — The Information https://www.theinformation.com/articles/openai-talks-raise-least-10-billion-amazon-use-ai-chips
OpenAI in talks with Amazon about investment that could top $10 billion https://www.cnbc.com/2025/12/16/openai-in-talks-with-amazon-about-investment-could-top-10-billion.html
Inside xAI All-Hands: AGI, Funding, and Data Centers in Space – Business Insider https://www.businessinsider.com/xai-all-hands-agi-superintelligence-funding-success-optimus-space
Sergey Brin on the genius of Jeff Dean. He credits Jeff’s early obsession with neural networks, back when they were “telling cats from dogs”, as the spark for everything. TPU was Jeff’s idea. He calculated that if users spoke to Google for just three minutes a day, Google would … https://x.com/Yuchenj_UW/status/2000627610561458682
Inference Economics 101: Reserved Compute versus Inference APIs https://www.datagravity.dev/p/inference-economics-101-reserved
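The article’s framing reduces to simple arithmetic: reserved compute is a fixed hourly cost amortized over whatever tokens you actually serve, while an API charges a fixed price per token. A minimal sketch of that comparison (every number below is an illustrative assumption, not a figure from the article):

```python
# Illustrative reserved-GPU vs. pay-per-token API comparison.
# All numbers are assumptions for illustration only.

gpu_hourly_cost = 2.50        # $/hr for one reserved GPU (assumed)
throughput_tok_s = 2000       # sustained output tokens/sec at batch (assumed)
utilization = 0.60            # fraction of each hour actually serving traffic

tokens_per_hour = throughput_tok_s * 3600 * utilization
reserved_cost_per_mtok = gpu_hourly_cost / tokens_per_hour * 1e6

api_cost_per_mtok = 3.00      # $/1M output tokens from an API (assumed)

print(f"reserved: ${reserved_cost_per_mtok:.2f}/Mtok  api: ${api_cost_per_mtok:.2f}/Mtok")
# Reserved compute wins only while utilization stays high; an idle reserved
# GPU still bills by the hour, so at low utilization the API is cheaper.
```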
@AMD × vLLM Semantic Router — we’re building the System Intelligence together. With AMD, vLLM Semantic Router is evolving into an Intelligence Control Plane: • World IN: secure what comes IN (inputs) • World OUT: govern … https://x.com/vllm_project/status/2000819485834698813
From @tri_dao’s (the FlashAttention creator’s) shop comes a new super-optimized gift to the ML community, this time in the form of a CuTe-DSL MoE implementation on Hopper. Blackwell is coming next! https://x.com/StasBekman/status/2001823298360086787
On-policy distillation would revolutionize multi-turn tool-use training beyond RL, but neither Tinker nor TRL’s on-policy distillation implementation supports anything other than single-turn distillation. We have therefore taken this upon ourselves and implemented this feature in native … https://x.com/HeMuyu0327/status/1999316923885191376
Tinker is now open to everyone! We are also adding: – Vision support with Qwen3-VL – New model: Kimi K2 Thinking (1T params) – OpenAI API-compatible inference Start training models within minutes: https://x.com/dchaplot/status/1999543675765031289
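Since the new inference endpoint is OpenAI API-compatible, sampling should work with the stock openai client. A minimal sketch; the base URL and model id below are placeholders (check the Tinker docs for the real values):

```python
# Sampling from a Tinker-hosted model via its OpenAI-compatible endpoint.
# base_url and model are placeholders, not Tinker's documented values.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.tinker.example/v1",  # placeholder endpoint
    api_key="YOUR_TINKER_API_KEY",
)
resp = client.chat.completions.create(
    model="moonshotai/Kimi-K2-Thinking",  # assumed model id
    messages=[{"role": "user", "content": "Say hello."}],
)
print(resp.choices[0].message.content)
```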
Turn any Autoregressive LLM into a Diffusion LM. dLLM is a Python library that unifies the training & evaluation of diffusion language models. You can also use it to turn ANY autoregressive LM into a diffusion LM with minimal compute. 100% open-source. https://x.com/akshay_pachaar/status/2001562985043783908
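The idea underneath such conversions is a masked-denoising objective: corrupt a random fraction of tokens and train the model to recover them with bidirectional attention. A conceptual sketch of that objective, not dLLM’s actual API (all names here are illustrative):

```python
# Conceptual masked-denoising training step behind diffusion LMs.
# This is NOT dLLM's API; it only illustrates the objective.
import torch
import torch.nn.functional as F

def masked_denoising_loss(model, input_ids, mask_token_id):
    t = torch.rand(())                               # noise level ~ U(0, 1)
    mask = torch.rand(input_ids.shape, device=input_ids.device) < t
    corrupted = torch.where(                         # replace masked tokens
        mask, torch.full_like(input_ids, mask_token_id), input_ids
    )
    logits = model(corrupted).logits                 # bidirectional attention assumed
    # Predict the original tokens only at masked positions; full diffusion
    # training additionally reweights this loss by the noise level t.
    return F.cross_entropy(logits[mask], input_ids[mask])
```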
We are deeply grateful for the message and example set by AI research leaders who have pledged $1M through this public letter supporting OpenReview. Their encouragement strengthens the infrastructure behind open scientific dialogue and peer review innovation. Others are welcome … https://x.com/openreviewnet/status/2001837352692675007
we’re open-sourcing a new frontier science eval in biology, chemistry, and physics. there are 2 tracks: olympiad level and advanced research level. as models become saturated on GPQA, this is a nice unsaturated alternative with clean test-time compute scaling. kudos to … https://x.com/tejalpatwardhan/status/2000982763500175683
Google names new chief of AI infrastructure buildout | Semafor https://www.semafor.com/article/12/10/2025/google-names-new-chief-of-ai-infrastructure-buildout
Nvidia and Alphabet VC arms back vibe coding startup Lovable https://www.cnbc.com/2025/12/18/google-and-n.html
🚀 Introducing Nemotron-Cascade! 🚀 We’re thrilled to release Nemotron-Cascade, a family of general-purpose reasoning models trained with cascaded, domain-wise reinforcement learning (Cascade RL), delivering best-in-class performance across a wide range of benchmarks. 💻 Coding … https://x.com/_weiping/status/2000947255088701628
Those with Blackwell access have been waiting for FlashAttention 4 to come out, since FA2 on Blackwell is really slow (Blackwell dropped WGMMA and requires a rewrite). But you’re likely to see a 50%-or-higher end-to-end speedup with FA3 if you’re on Hopper, the longer the … https://x.com/StasBekman/status/2001839591243026593
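For reference, the kernel call itself is a one-liner. A minimal sketch using the FA2 interface; the FA3 beta for Hopper ships a similar flash_attn_func under its own package, so the import path may differ on your install:

```python
# Minimal FlashAttention call (FA2 interface shown; FA3's differs slightly).
import torch
from flash_attn import flash_attn_func

b, s, h, d = 2, 4096, 16, 128  # batch, seqlen, heads, head dim
q = torch.randn(b, s, h, d, device="cuda", dtype=torch.bfloat16)
k = torch.randn(b, s, h, d, device="cuda", dtype=torch.bfloat16)
v = torch.randn(b, s, h, d, device="cuda", dtype=torch.bfloat16)

out = flash_attn_func(q, k, v, causal=True)  # (batch, seqlen, heads, head dim)
```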
Nemotron 3 Nano runs nicely with mlx-lm on an M4 Max. Could be a great model for local use on Mac: MoE + hybrid attention make it fast even for very long context. Generating in realtime with 4-bit model: https://x.com/awnihannun/status/2000718403380691417
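A minimal mlx-lm sketch of that kind of local run; the repo id is an assumption, so substitute the actual 4-bit MLX conversion from the Hugging Face hub:

```python
# Local generation with mlx-lm on Apple Silicon.
# The repo id below is a placeholder for the real MLX conversion.
from mlx_lm import load, generate

model, tokenizer = load("nvidia/Nemotron-3-Nano")  # assumed repo id
text = generate(
    model, tokenizer,
    prompt="Explain hybrid Mamba-Transformer models in one paragraph.",
    max_tokens=256,
    verbose=True,  # stream tokens as they are generated
)
```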
Nemotron 3 Nano for MLX is now available in LM Studio. General purpose reasoning and chat model trained from scratch by @nvidia. 30B, 3.5B active MoE runs blazingly fast on Apple Silicon 🍎🚀 https://x.com/lmstudio/status/2001015687003963730
🚀🚀🚀 We’re excited to support @NVIDIA and their new open family of models: NVIDIA Nemotron 3! Open in weights, data, tools, and training, Nemotron 3 is built for multi-agent apps and features: ⚡️An efficient hybrid Mamba‑Transformer MoE architecture 🧾1M token context for … https://x.com/vllm_project/status/2000623058076492276
Agent demos often fail for reasons that are hard to see: unclear tool traces, silent failures, and changes that improve one behavior but break another. Our new course with @Nvidia shows how to use their NeMo Agent Toolkit to surface these issues with OpenTelemetry tracing, run … https://x.com/DeepLearningAI/status/2001329113622073611
Baseten supports @nvidia Nemotron 3 Nano on day zero Up to 4× faster token generation, high accuracy, and predictable inference built for agentic AI. Available to deploy today on Baseten for high-performance inference. Read more here: https://x.com/basetenco/status/2000582868532121688
Introducing NVIDIA Nemotron 3 Nano, a fully open hybrid MoE model with 30B total and 3B active parameters, engineered for maximum efficiency and benchmark-leading accuracy. AI natives can now use Nemotron 3 Nano on Together AI — with fast, reliable inference for specialized agentic systems https://x.com/togethercompute/status/2000572943718314392
NEWS: NVIDIA announces the NVIDIA Nemotron 3 family of open models, data, and libraries, offering a transparent and efficient foundation for building specialized agentic AI across industries. Nemotron 3 features a hybrid mixture-of-experts (MoE) architecture and new open … https://x.com/nvidianewsroom/status/2000588337896198481
.@nvidia Nemotron 3 Nano is now available on Ollama! Local: ollama run nemotron-3-nano · Cloud: ollama run nemotron-3-nano:30b-cloud https://x.com/ollama/status/2000820163231232167
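Once pulled, the local model can also be queried programmatically: Ollama exposes an OpenAI-compatible endpoint on localhost:11434, so the stock openai client works as-is:

```python
# Querying a local Ollama server through its OpenAI-compatible API.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
resp = client.chat.completions.create(
    model="nemotron-3-nano",
    messages=[{"role": "user", "content": "Summarize the Nemotron 3 release."}],
)
print(resp.choices[0].message.content)
```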
🚀 Day-0 support for @NVIDIA Nemotron 3 Nano in SGLang SGLang now supports Nemotron 3 Nano on Day 0 🎉 A highly efficient, fully open Hybrid MoE model with 1M context, thinking budget, and industry-leading accuracy per compute. ✅ Open weights, data, and recipes ⚡ Fast, … https://x.com/lmsysorg/status/2000567938949243111
As AI Grows More Complex, Model Builders Rely on NVIDIA | NVIDIA Blog https://blogs.nvidia.com/blog/leading-models-nvidia/
BREAKING CUDA MOAT EXPANDS: Today, NVIDIA has acquired SchedMD, makers of SLURM, a widely used “open source” workload scheduler. Many AI companies such as Mistral, Thinking Machines, parts of Meta’s FAIR division, and university academic labs use SLURM. NVIDIA’s acquisition expands … https://x.com/SemiAnalysis_/status/2000620209262985641
BREAKING: NVIDIA just dropped an open 30B model that beats GPT-OSS and Qwen3-30B — and runs 2.2-3.3× faster. Nemotron 3 Nano: • Up to 1M-token context • MoE: 31.6B total params, 3.6B active • Best-in-class performance for SWE-Bench • Open weights + training recipe + … https://x.com/AskPerplexity/status/2000589984818954719
First time I see a major org release @huggingface collections inside collections 🤯 Kudos @nvidia for this brilliant release https://x.com/NielsRogge/status/2000639749514760465
In collaboration with NVIDIA, the new Nemotron 3 Nano model is fully supported in llama.cpp. Nemotron 3 Nano features an efficient hybrid Mamba MoE architecture. It’s a promising model, suitable for local AI applications on mid-range hardware. The large context window makes it … https://x.com/ggerganov/status/2000574990425415765
Inside NVIDIA Nemotron 3: Techniques, Tools, and Data That Make It Efficient and Accurate | NVIDIA Technical Blog https://developer.nvidia.com/blog/inside-nvidia-nemotron-3-techniques-tools-and-data-that-make-it-efficient-and-accurate/
New mlx-lm release: pip install -U mlx-lm Includes support for a few new models: – Nemotron 3 Nano (Nvidia) – Devstral (Mistral) – rnj-1 (Essential AI) https://x.com/awnihannun/status/2000974327660077298
Nvidia continues to put out some of the strongest and fastest open models. Pretraining and post-training data are released as well, something very few orgs have done. https://x.com/tri_dao/status/2000707760288092655
NVIDIA has just released Nemotron 3 Nano, a ~30B MoE model that scores 52 on the Artificial Analysis Intelligence Index with just ~3B active parameters. Hybrid Mamba-Transformer architecture: Nemotron 3 Nano combines the hybrid Mamba-Transformer approach @NVIDIAAI has used on … https://x.com/ArtificialAnlys/status/2000602570092675402
NVIDIA just released Nemotron-Agentic-v1 on Hugging Face This dataset empowers LLMs as interactive, tool-using agents for multi-turn conversations and reliable task completion. Ready for commercial use. https://x.com/HuggingPapers/status/2000628009049760072
NVIDIA just released Nemotron-Cascade-8B on Hugging Face A powerful 8B general-purpose reasoning model that achieves best-in-class performance across diverse benchmarks, from math to coding, by using novel Cascade RL. https://x.com/HuggingPapers/status/2001065870676603333
NVIDIA releases Nemotron 3 Nano, a new 30B hybrid reasoning model! 🔥 Nemotron 3 has a 1M context window and best-in-class performance on SWE-Bench, reasoning, and chat. Run the MoE model locally with 24GB RAM. Guide: https://x.com/UnslothAI/status/2000568378407452746
Really impressive release from NVIDIA, who not only went head-to-head with Qwen3, but: – innovated on the architecture (risky for most open labs) – did legit multi-env RL, complete with agentic evals (first time I see this from an open lab) – plan to open source the pretraining … https://x.com/_lewtun/status/2000599470099099990
SemiAnalysis InferenceMAX showing GPT-OSS on Blackwell is 33% more tokens per $ in just 1 month, thanks to the awesome work of @vllm_project and @nvidia. https://x.com/dylan522p/status/2002135815233970295
This is not just another strong open model. Nemotron actually releases training data (!), RL environments, and training code. This is a big difference: almost all model developers just want people to use their models; NVIDIA is enabling people to make their own models. We are … https://x.com/percyliang/status/2000608134205985169
Today, @NVIDIA is launching the open Nemotron 3 model family, starting with Nano (30B-3A), which pushes the frontier of accuracy and inference efficiency with a novel hybrid SSM Mixture of Experts architecture. Super and Ultra are coming in the next few months. https://x.com/ctnzr/status/2000567572065091791
vLLM delivers even more inference performance with the same GPU platform. In just 1 month, we’ve worked with NVIDIA to increase @nvidia Blackwell maximum throughput per GPU by up to 33% — significantly reducing cost per token — while also enabling even higher peak speed for … https://x.com/vllm_project/status/2001449658984632699
When @NVIDIA announced Nemotron 3 – it marked a symbolic turning point in a year that fundamentally reshaped open-source AI leadership. Is NVIDIA the new open-source king? What’s behind this strategy? Let’s see. ▪️ It releases 3 trillion tokens of new pretraining, 18 million … https://x.com/TheTuringPost/status/2001087448299065372
Compute enabled our first image generation launch (and a +32% jump in WAU over the following weeks) as well as our latest image generation launch yesterday. We have a lot more coming… and need a lot more compute. https://x.com/OpenAI/status/2001336514786017417
SkyPilot + @NVIDIA Dynamo: unmatched flexibility for inference workloads ⚡ Blazing fast MoE inference with PD disaggregation & KV-aware routing ☁️ Deploy on any cloud or k8s in minutes 🔌 Drop-in OpenAI API replacement Recipe: https://x.com/skypilot_org/status/2000999292333666339
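A rough sketch of what launching such a serving task looks like with SkyPilot’s Python API; the Dynamo package name and run command below are placeholders (follow the linked recipe for the real setup):

```python
# Launching an inference task on any cloud via SkyPilot's Python API.
# The setup/run commands are placeholders, not the official Dynamo recipe.
import sky

task = sky.Task(
    setup="pip install 'ai-dynamo[vllm]'",           # assumed package name
    run="python -m dynamo.serve --model $MODEL_ID",  # placeholder command
)
task.set_resources(sky.Resources(accelerators="H100:8"))
sky.launch(task, cluster_name="dynamo-infer")
```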
First look at a learning system running on real hardware. This is a simple demo, but still a meaningful milestone. What stands out is not complexity, but consistency. So far, what’s visible: • Smooth and repeatable motions • Stable execution from demonstration-only policies https://x.com/IlirAliu_/status/1999554110933062022
Mech interp question: do the new Nemotron models make use of negative zero for any circuits? https://x.com/andrew_n_carr/status/2000744793480270236