Ethan B. Holland

Over 54,900 manually organized AI links and counting

Chips and Hardware: AI News Week Ending 12/05/2025

December 5, 2025

Image created with gemini-2.5-flash-image with claude-sonnet-4-5. Image prompt: Ultra-minimalist photograph of a single silicon wafer on a chrome examination table in a vast empty semiconductor clean room, harsh overhead lighting creating dramatic shadows, cold blue-grey color palette with metallic accents, architectural photography style emphasizing negative space and isolation, the word CHIPS in bold white sans-serif text prominently overlaid

Microsoft and Nvidia are investing up to $10 billion and $5 billion respectively in Anthropic, making Claude available on all three major cloud platforms (Microsoft, Google, Amazon). As part of the deal, Anthropic will buy $30 billion in computing capacity from Microsoft. The https://x.com/DeepLearningAI/status/1996081964395200773

NVIDIA and AWS Expand Full-Stack Partnership | NVIDIA Blog https://blogs.nvidia.com/blog/aws-partnership-expansion-reinvent/

🐣 It happened. Our decentralized confidential compute network, Cocoon, is live. The first AI requests from users are now being processed by Cocoon with 100% confidentiality. GPU owners are already earning TON. https://x.com/durov/status/1995208789600182391?s=20

Databricks reportedly in talks to raise $5B at $134B valuation – SiliconANGLE https://siliconangle.com/2025/11/30/databricks-reportedly-talks-raise-5b-134b-valuation/

Deutsche Telekom and Schwarz Group to build AI data centre, German newspaper reports | Reuters https://www.reuters.com/business/media-telecom/deutsche-telekom-schwarz-group-build-ai-data-centre-german-newspaper-reports-2025-11-30/

Excited to announce that @Azaliamirh and I are launching @RicursiveAI, a frontier AI lab creating a recursive self-improving loop between AI and the hardware that fuels it. Today, chip design takes 2-3 years and requires thousands of human experts. We will reduce that to weeks. https://x.com/annadgoldie/status/1995936368959389809?s=20

Introducing Ricursive Intelligence, a frontier AI lab enabling a recursive self-improvement loop between AI and the chips that fuel it. Learn more at https://x.com/RicursiveAI/status/1995932204703346946

LLMs are getting crazily good at reasoning — but also crazily slow. Hard problems can make them think for hours. Why? Even with tons of GPUs, they still decode one. token. at. a. time.⏳ More GPUs ≠ faster answers Our ThreadWeaver🧵⚡asks: “Why not make LLMs think in parallel?” https://x.com/LongTonyLian/status/1995561005557186963

Soooo satisfying to watch… Most teams underestimate how much reliability work happens after the product is built. This photo shows a small detail that decides if your hardware survives real deployments: An IP67 seal applied directly on the board to protect against water, dust, https://x.com/IlirAliu_/status/1995931745255092391

Touching the Elephant – TPUs | Consider the Bulldog https://considerthebulldog.com/tte-tpu/

@awnihannun added batched generation to MLX-LM >2 months ago. Everybody, since, has been asking for batching in the MLX-LM server. Well, enjoy the first version in the latest MLX-LM release. The following video is serving 4 consecutive requests for Qwen3 30B on an M2 Ultra. https://x.com/angeloskath/status/1996364526749639032

EU to Open Bidding for AI Gigafactories in Early 2026 – WSJ https://www.wsj.com/tech/ai/eu-to-open-bidding-for-ai-gigafactories-in-early-2026-809b7570?st=k7U8kH&reflink=desktopwebshare_permalink

A new visual document retriever model just outperformed NVIDIA’s models on ViDoRe v2 😮 EvoQwen2.5-VL comes in two sizes, 3B and 7B, both outperforming NVIDIA’s latest retrievers, and all are commercially permissive 🙌🏻 https://x.com/mervenoyann/status/1996221079757439374

Beyond the engine, v0.12.0 ships EAGLE speculative decoding improvements, new model families, NVFP4 / W4A8 / AWQ quantization options, and tuned kernels across NVIDIA, AMD ROCm, and CPU. We recommend building new images with PyTorch 2.9.0 + CUDA 12.9, validating on staging https://x.com/vllm_project/status/1996947375827701892

Great to see things heading this direction. But be warned: it’s Blackwell-only, i.e., only the very latest, most expensive GPUs. So if you make stuff with this, most people won’t be able to use it. https://x.com/jeremyphoward/status/1997087621085122999

NVIDIA Advances Open Model Development for Digital and Physical AI | NVIDIA Blog https://blogs.nvidia.com/blog/neurips-open-source-digital-physical-ai/

NVIDIA and Synopsys Announce Strategic Partnership to Revolutionize Engineering and Design | NVIDIA Newsroom https://nvidianews.nvidia.com/news/nvidia-and-synopsys-announce-strategic-partnership-to-revolutionize-engineering-and-design

NVIDIA just introduced CUDA Tile – the biggest change to CUDA since 2006. ▪️ It shifts GPU programming from thread-level SIMT to tile-based operations, where you define data chunks (tiles) and the system optimizes how they run. • At its core is CUDA Tile IR, a virtual https://x.com/TheTuringPost/status/1997096340611019089
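
The contrast drawn in the item above, per-element (thread-style) work versus work on data chunks (tiles), can be illustrated with a small plain-Python sketch. This is not CUDA Tile or CUDA Tile IR and uses none of NVIDIA's APIs; the function names `matmul_elementwise` and `matmul_tiled` and the `tile` parameter are hypothetical, chosen only for illustration. The first function computes each output element on its own, roughly how one thread per element would; the second walks the same computation block by block.

```python
from typing import List

Matrix = List[List[float]]


def matmul_elementwise(a: Matrix, b: Matrix) -> Matrix:
    """Compute each output element independently (thread-per-element style)."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = 0.0
            for kk in range(k):
                acc += a[i][kk] * b[kk][j]
            c[i][j] = acc
    return c


def matmul_tiled(a: Matrix, b: Matrix, tile: int = 2) -> Matrix:
    """Compute the same product one tile (block) of the output at a time.

    `tile` is a hypothetical block size for this sketch. The three outer
    loops pick a block of rows, columns, and the shared dimension; the
    inner loops do the work for that block only.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, tile):          # block of output rows
        for j0 in range(0, m, tile):      # block of output columns
            for k0 in range(0, k, tile):  # block of the shared dimension
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = c[i][j]
                        for kk in range(k0, min(k0 + tile, k)):
                            acc += a[i][kk] * b[kk][j]
                        c[i][j] = acc
    return c


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    b = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
    print(matmul_tiled(a, b, tile=2) == matmul_elementwise(a, b))
```

Both functions return the same matrix; only the order in which the work is visited differs. In a real tile-based system the compiler and runtime, not hand-written loops, would decide how each block maps onto the hardware.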

We joined forces with NVIDIA to unlock high-speed AI inference on RTX AI PCs and DGX Spark using llama.cpp. The latest Ministral-3B models reach 385+ tok/s on @NVIDIA_AI_PC GeForce RTX 5090 systems. Blog: https://x.com/ggerganov/status/1995931445425271232

NVIDIA Shatters MoE AI Performance Records With a Massive 10x Leap on GB200 ‘Blackwell’ NVL72 Servers, Fueled by Co-Design Breakthroughs https://wccftech.com/nvidia-shatters-moe-ai-performance-records-with-a-massive-10x-leap-on-gb200-nvl72/

Flexion Robotics is a rising embodied AI startup from Zurich, Switzerland. The company aims to build a general-purpose autonomy stack for humanoid robots. The founders are alumni from ETH Zurich and NVIDIA, specializing in reinforcement learning (RL) and control systems. The https://x.com/TheHumanoidHub/status/1995768893164413184

Intel presents SignRoundV2 Closing the Performance Gap in Extremely Low-Bit Post-Training Quantization for LLMs https://x.com/_akhaliq/status/1996975161854017702

The Art of Scaling Test-Time Compute for LLMs This is a large-scale study of test-time scaling (TTS). It also provides a practical recipe for selecting the best test-time scaling strategy. (bookmark it) My takeaways: Test-time compute scaling works – Allocating more https://x.com/omarsar0/status/1995862532310057320

Thrilled to share that @annadgoldie and I are launching @RicursiveAI, a frontier lab enabling recursive self-improvement through AIs that design their own chips. Our vision for transforming chip design began with AlphaChip, an AI for layout optimization used to design four https://x.com/Azaliamirh/status/1995937492194001367

Transformers v5’s first release candidate is out 🔥 The biggest release of my life. It’s been five years since the last major (v4). From 20 architectures to 400, 20k daily downloads to 3 million. The release is huge, w/ tokenization (no slow tokenizers!), modeling & processing. https://x.com/LysandreJik/status/1995558230567878975