Ethan B. Holland

Over 54,900 manually organized AI links and counting

Chips and Hardware: AI News Week Ending 12/12/2025

December 12, 2025

Image created with gemini-2.5-flash-image with claude-sonnet-4-5. Image prompt: Black and white photograph of stratocumulus clouds arranged in precise horizontal layers against dark sky, resembling semiconductor cross-section with distinct strata from dense low clouds to wispy high cirrus, bold sans-serif title card CHIPS anchored in bottom dark layer, high contrast film photography, minimal composition looking straight up

Broadcom reveals its mystery $10 billion customer is Anthropic https://www.cnbc.com/2025/12/11/broadcom-reveals-its-mystery-10-billion-customer-is-anthropic.html

A new open-source physics engine just dropped… and it could change how robots learn. Newton, built by @nvidia with support from @GoogleDeepMind and Disney Research, is now part of The Linux Foundation. It’s designed to bring precise, GPU-powered physics to robotics and https://x.com/IlirAliu_/status/1997014378508423344

Nvidia Gets US Approval for H200 AI Chip Exports to China – Bloomberg https://www.bloomberg.com/news/articles/2025-12-08/nvidia-set-to-win-us-approval-to-export-h200-ai-chips-to-china

Trump: Nvidia can sell H200 AI chips to China if U.S. gets 25% cut https://www.cnbc.com/2025/12/08/trump-nvidia-h200-sales-china.html

We have just used the @Nvidia H100 onboard Starcloud-1 to train the first LLM in space! We trained the nano-GPT model from Andrej @Karpathy on the complete works of Shakespeare and successfully ran inference on it. We have also run inference on a preloaded Gemma model, and we https://x.com/AdiOltean/status/1998769997431058927

Nvidia-backed Starcloud trains first AI model in space, orbital data centers https://www.cnbc.com/2025/12/10/nvidia-backed-starcloud-trains-first-ai-model-in-space-orbital-data-centers.html

Microsoft Deepens Its Commitment to Canada with Landmark $19B AI Investment – Microsoft On the Issues https://blogs.microsoft.com/on-the-issues/2025/12/09/microsoft-deepens-its-commitment-to-canada-with-landmark-19b-ai-investment/

Actual first principles would have you building nuclear-powered AI data centers here, not in space. https://x.com/YIMBYLAND/status/1998785782082056626

Everyone throws around “first principle thinking” deserves to be challenged w a follow-up question: what principle are you referring? The common misconception that cooling in space is “automatic” stems from space’s near-absolute-zero temperature (~3 K), but in a vacuum, there’s https://x.com/jenzhuscott/status/1998591718338486757

I really don’t get this. Terrestrial power via gas & solar & batteries is very cheap. Build generation onsite and you don’t have to pay for transmission. These are 30+ year assets. How is it remotely possible that this will be cheaper in space? https://x.com/clawrence/status/1998753444598010254

One common issue of sample packing is it changes the loss curves and dynamics. Unsloth now has padding free support for Llama, Qwen, Mistral, Gemma and more, and removes all padding automatically, enabling identical loss and grad norms, but 2x faster and uses 50% less VRAM! https://x.com/danielhanchen/status/1998770349975155060

We just rebooted CoreWeave Mission Control — the operating standard for AI at scale. https://x.com/CoreWeave/status/1998381210884571452

What does it take to scale self-distillation across many GPUs? Self-distillation replaces the fixed, pre-trained teacher in standard knowledge distillation with a teacher that is an exponential moving average (EMA) of the student’s weights: – The student is trained with https://x.com/TheTuringPost/status/1997090382689993090
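A minimal sketch of the EMA teacher update described in the post above, in plain Python. The function name, the decay value, and the toy weight lists are illustrative assumptions, not taken from the linked thread, and real training code would apply this per parameter tensor.

```python
def ema_update(teacher, student, decay=0.99):
    """Blend student weights into the teacher.

    Each teacher weight becomes decay * teacher + (1 - decay) * student,
    so the teacher is an exponential moving average of the student.
    The decay value is a hypothetical choice for illustration.
    """
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]


# Toy example: two scalar "weights" (made-up numbers).
teacher = [0.0, 1.0]
student = [1.0, 1.0]

# After each student optimizer step, the teacher drifts toward the student.
teacher = ema_update(teacher, student, decay=0.9)
print(teacher)
```

A decay closer to 1 makes the teacher change more slowly, which is the usual reason to prefer it over copying the student outright.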

We’ve identified an Alibaba data center in Zhangbei, China with an estimated 200-500 MW capacity. Roughly half of that predates the AI boom and likely hosts little modern compute. But several newer buildings show a high power density consistent with advanced AI chips. 🧵 https://x.com/EpochAIResearch/status/1997013150072557759

Low-bit LLM quantization doesn’t have to mean painful accuracy trade-offs or massive tuning runs. Intel’s AutoRound PTQ algorithm is now integrated into LLM Compressor, producing W4A16 compressed-tensor checkpoints you can serve directly with vLLM across Intel Xeon, Gaudi, Arc https://x.com/vllm_project/status/1998710451312771532

Google is taking two different paths to build its next-generation TPU v8 systems: TPUv8ax (Sunfish) and TPUv8x (Zebrafish). The key difference is how much is fully built by a supplier versus how much Google sources and assembles on its own. With TPUv8ax “Sunfish,” Broadcom https://x.com/SemiAnalysis_/status/1998830078629724596

13 secs for 5 GB between @huggingface & @googlecloud thanks to our new collaboration 🤯🤯🤯 https://x.com/ClementDelangue/status/1998157804020941044

This Google paper presented at #NeurIPS2025 is a true gem. In their search for a better backbone for sequence models, they: • Reframe Transformers & RNNs as associative memory systems driven by attentional bias • Reinterpret “forgetting” as retention regularization, not as https://x.com/TheTuringPost/status/1997808277116338266

US accuses 2 men of smuggling Nvidia chips to China amid Trump AI announcement https://www.axios.com/2025/12/09/us-doj-nvidia-chips-smuggling-china-ai

At reported selling prices of $30-40k, this implies chip-level gross margins around 80%. But NVIDIA’s realized Blackwell margins are likely lower since most revenue comes from servers and racks bundled with other components, which likely carry lower margins. NVIDIA’s overall https://x.com/EpochAIResearch/status/1998819296353595424

Deep Dive into NVIDIA’s Virtuous Cycle – Philippe Oger https://philippeoger.com/pages/deep-dive-into-nvidias-virtuous-cycle

How much does it cost NVIDIA to make a B200 GPU? By our estimate, around $6,400, implying chip-level gross margins of ~80%. The counterintuitive part: the logic die, the core component of the GPU, is one of the cheaper parts, accounting for less than 15% of the total cost. 🧵 https://x.com/EpochAIResearch/status/1998819237251657890
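The margin figure can be checked with quick arithmetic. This sketch uses only the numbers quoted in the Epoch AI thread (an estimated $6,400 unit cost and reported selling prices of $30-40k); the function name is an illustrative assumption, and it treats the estimate as the full cost of goods, ignoring bundling.

```python
def gross_margin(unit_cost, selling_price):
    """Gross margin as a fraction: (price - cost) / price."""
    return 1.0 - unit_cost / selling_price


B200_COST = 6_400  # Epoch AI's estimate, as quoted above

# Reported selling-price range from the same thread.
for price in (30_000, 40_000):
    print(f"${price:,}: {gross_margin(B200_COST, price):.1%}")
```

The two ends of the price range give roughly 79% and 84%, which is consistent with the "~80%" in the post.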

Just released: Multi2Vec CLIP inference container 1.5.0 🎉 This release contains: – Support for facebook MetaClip2 models – Support for ModernVBERT/modernvbert-embed model – Added support for running inference container on NVIDIA Jetson devices Check out the docs to spin it https://x.com/weaviate_io/status/1998060177501614130

NVIDIA just introduced CUDA Tile – the biggest change to CUDA since 2006. ▪️ It shifts GPU programming from thread-level SIMT to tile-based operations, where you define data chunks (tiles) and the system optimizes how they run. • At its core is CUDA Tile IR, a virtual https://x.com/TheTuringPost/status/1997096340611019089

🚀 New InferenceMAX results are live! The team at @NVIDIA has pushed the boundaries of sglang-dsr1-1k1k-FP8 on the @SemiAnalysis_ InferenceMAX dashboard. The new submission delivers: 🔹 20% higher peak throughput 🔹 4260 tok/s/GPU at 30 TPS/user 🔹 Interactivity extended to 102 https://x.com/lmsysorg/status/1998454089903226967
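One way to read the throughput numbers above: dividing per-GPU throughput by per-user speed gives a rough count of concurrent streams each GPU serves. This is a back-of-envelope sketch that assumes every user decodes at the same rate and that the per-GPU figure is the sum of those streams; the function name is hypothetical and the calculation is not from the InferenceMAX dashboard.

```python
def concurrent_streams(tokens_per_sec_per_gpu, tokens_per_sec_per_user):
    """Approximate number of simultaneous user streams per GPU."""
    return tokens_per_sec_per_gpu / tokens_per_sec_per_user


# Figures quoted in the post: 4260 tok/s/GPU at 30 TPS/user.
streams = concurrent_streams(4260, 30)
print(f"~{streams:.0f} concurrent streams per GPU")
```

Under those assumptions the quoted operating point works out to about 142 streams per GPU.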

Long horizon robotics is hard. Robots drift, forget the goal, and fall apart when a task takes hundreds of steps. This small independent team hit rank 1 on Stanford’s Behavior1K benchmark by Fei Fei Li’s lab… Ahead of teams from NVIDIA, CMU, and several big labs. What they https://x.com/IlirAliu_/status/1997322375965143399

Robotics feels upside down sometimes. We can get humanoids to kick and jump, but opening a door with RGB is still a frontier. This new work from NVIDIA GEAR shows we are finally getting closer. DoorMan is a sim to real pipeline that learns the full door opening behavior from https://x.com/IlirAliu_/status/1996520475673899112

DeepSeek is Using Banned Nvidia Chips in Race to Build Next Model — The Information https://www.theinformation.com/articles/deepseek-using-banned-nvidia-chips-race-build-next-model

Microsoft’s Fairwater Atlanta (today’s largest data center) could likely train over 20 models the size of GPT-4 in the course of a month. This computational power will enable AI companies to increase the number and scale of both experiments and training runs. https://x.com/EpochAIResearch/status/1997040687561449710

nanoGPT – the first LLM to train and inference in space 🥹. It begins. https://x.com/karpathy/status/1998806260783919434

As AI grows more complex, model builders rely on NVIDIA. @OpenAI’s GPT-5.2 and other leading models including @Runwayml leverage NVIDIA’s tech stack to advance frontier of AI. Read More: https://x.com/nvidia/status/1999198240407699710