Image created with Flux Pro v1.1 Ultra. Image prompt: Chips, “banana chips” pun: dried banana chips arranged as a microchip die, gold connector pins formed by aligned small bananas, photorealistic, editorial, minimal, high detail, 3:2 landscape

Alibaba shares jump 19% on cloud unit growth, report of new AI chip https://www.cnbc.com/2025/09/01/alibaba-shares-hong-kong-today.html

Alibaba reportedly developing new AI chip as China’s Xi rejects AI’s ‘Cold War mentality’ | Euronews https://www.euronews.com/next/2025/09/01/alibaba-reportedly-developing-new-ai-chip-as-chinas-xi-rejects-ais-cold-war-mentality

NVIDIA continues to lead on open-sourcing pretraining data — Nemotron-CC-v2 has dropped! https://x.com/ZeyuanAllenZhu/status/1962119316427706828
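
If you want to poke at the data yourself, something like the snippet below should work with the `datasets` library. The repo id and record schema here are assumptions on my part; check the dataset card on Hugging Face for the actual id, configs, and fields.

```python
# Minimal sketch: streaming a few records from Nemotron-CC-v2.
# "nvidia/Nemotron-CC-v2" is an assumed repo id; verify on the HF Hub.
from datasets import load_dataset

ds = load_dataset("nvidia/Nemotron-CC-v2", split="train", streaming=True)
for i, record in enumerate(ds):
    print(record)  # inspect the schema of one pretraining record
    if i >= 2:
        break
```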

Jensen on NVIDIA Q2 Earnings Call: “Our new robotics computing platform, Thor, is now available. Thor delivers an order of magnitude greater AI performance and energy efficiency than NVIDIA’s AGX Orin. It runs the latest generative and reasoning AI models at the edge in real…” https://x.com/TheHumanoidHub/status/1961342309209100670

Nvidia launched Jetson AGX Thor, a $3,499 chip for real-time physical AI. It uses a 2,560-core Blackwell GPU, 96 fifth-generation Tensor cores, and 128GB of memory to deliver up to 2,070 FP4 teraflops of AI compute. https://x.com/adcock_brett/status/1962184408246415687

NVIDIA’s Jetson AGX Thor, a $3,499 ‘robot brain,’ is now available. Powered by a Blackwell GPU with 128GB memory, it delivers up to 2,070 FP4 teraflops at 130W. Early adopters include Boston Dynamics, Agility, and Figure—pushing humanoid robotics into a new era. 🤖✨ https://x.com/StarSnap_1/status/1960153258389053561
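
A quick back-of-the-envelope on those quoted specs (numbers taken straight from the two posts above):

```python
# Efficiency math from the quoted Jetson AGX Thor specs.
fp4_tflops = 2070   # peak FP4 compute, teraflops
power_w = 130       # stated power, watts
price_usd = 3499

print(f"{fp4_tflops / power_w:.1f} FP4 TFLOPS per watt")   # ~15.9
print(f"${price_usd / fp4_tflops:.2f} per FP4 TFLOPS")     # ~$1.69
```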

OpenAI Plans to Build Data Center in India in Major Stargate Expansion in Asia – Bloomberg https://www.bloomberg.com/news/articles/2025-09-01/openai-plans-india-data-center-in-major-stargate-expansion?srnd=phx-technology&embedded-checkout=true

NEW: Google is talking to several GPU cloud providers about putting its tensor processing units in their data centers.
The push to expand in the data centers of Nvidia-focused cloud providers is a new strategy for Google. https://x.com/anissagardizy8/status/1963228123144819167

60 years of exponential growth in chip density was achieved not through one breakthrough or technology, but a series of problems solved and new paradigms explored as old ones hit limits. I don’t think current AI has hit a wall, but even if it does, there are many paths forward now. https://x.com/emollick/status/1962742358404948087

A mini factory in a box – Runs 24/7 – Costs $5,000 – Tools can be swapped – Builds electronics by itself. Credit: MicroFactory https://x.com/IlirAliu_/status/1961332536350527542

Disappointingly, AMD currently has over 200 unit tests in PyTorch that are skipped exclusively on ROCm (via skipIfRocm) and not on CUDA, along with another 200+ tests explicitly disabled for ROCm. The situation has deteriorated since the AMD Advancing AI event in June 2025. Since… https://x.com/SemiAnalysis_/status/1963708743218339907
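
For readers who haven’t seen the mechanism: skipIfRocm-style guards key off the build, not any measured behavior—ROCm builds of PyTorch report a non-None `torch.version.hip`, and the test is skipped on that basis alone. A minimal standalone sketch (PyTorch’s real decorator lives in its internal test utilities; the helper name below is illustrative):

```python
# Sketch of a skipIfRocm-style guard. torch.version.hip is None on
# CUDA/CPU builds and set on ROCm builds, so the skip is build-wide.
import unittest
import torch

IS_ROCM = torch.version.hip is not None

def skip_if_rocm(reason="test disabled on ROCm"):
    return unittest.skipIf(IS_ROCM, reason)

class TestMatmul(unittest.TestCase):
    @skip_if_rocm("hypothetical example of an exclusively-ROCm skip")
    def test_matmul_matches_reference(self):
        a, b = torch.randn(8, 8), torch.randn(8, 8)
        torch.testing.assert_close(a @ b, torch.mm(a, b))

if __name__ == "__main__":
    unittest.main()
```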

Finland unveils world’s largest sand battery for heating https://newatlas.com/energy/largest-sand-battery-finland-pornainen/

Never felt that GPU poor, but the results are kinda impressive (despite the y-axis crime chart). Blog: https://x.com/eliebakouch/status/1962806132193333668

Since AMD’s Advancing AI event, there has been a net change of 160+ newly disabled PyTorch unit tests exclusively on ROCm. There is a direct correlation between the number of disabled/skipped tests and the end-user experience. We are glad that @AnushElangovan and his team are… https://x.com/dylan522p/status/1963711185225687267

📣Groq’s first agentic system is ready for production at scale. Already battle tested by 100K+ developers across 5M+ requests. Compound is now GA, available to everyone on GroqCloud. Go Build ⬇️ https://x.com/GroqInc/status/1963635205899710798

We tested our controllers in hardware on the real LIGO system. Our results show that Deep Loop Shaping: 🔹controls noise up to 30-100 times better than existing controllers. 🔹can eliminate the most unstable, difficult feedback loop as a meaningful source of noise on LIGO for… https://x.com/GoogleDeepMind/status/1963664045216579999

For the first time, we show that GPU-accelerated database systems can be both faster AND cheaper than their CPU counterparts https://x.com/bailuding/status/1962269979262542044

🇸🇪 Together AI now has GPU infrastructure located in Sweden – Lower latency across Europe – EU data residency & compliance – GPU clusters + endpoints on demand – Serverless API for GPT-OSS, DeepSeek, Llama, Qwen https://x.com/togethercompute/status/1963498998720872686

We are ending strong with GPU Programming 🚀! 2 talks today back to back! First @exists_forall for intro to CUDA and then @simran_s_arora for Thunder Kittens 🐈! Today at: 1:00pm EST / 11:00am PT – https://x.com/jyo_pari/status/1961442690249216491

ZeroGPU on 🤗 HF Spaces enables anyone to build delightful ML demos, benefitting from powerful compute. But, due to its serverless nature, it is hard to optimize these demos. That CHANGES today 🪖 Use AoT compilation to melt our ZeroGPU servers 🔥 Details ⬇️ https://x.com/RisingSayak/status/1962844485118996545

ZeroGPU on Hugging Face enables anyone to build and deploy AI apps; it dynamically allocates and releases NVIDIA H200 GPUs as needed. But, due to its serverless nature, it is hard to optimize these apps. Now use AoT compilation to melt ZeroGPU servers on Hugging Face for vibe… https://x.com/_akhaliq/status/1962920105186115621
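
The core idea in both posts is stock PyTorch ahead-of-time compilation: export the model graph once, compile and package it, and let each serverless worker load the precompiled artifact instead of paying compile time on every cold start. A minimal sketch on a toy module (the ZeroGPU-specific wrappers live in the linked posts and aren’t shown here):

```python
# Export -> AoT compile -> reload flow with stock PyTorch (2.5+).
import torch
from torch._inductor import aoti_compile_and_package, aoti_load_package

class Toy(torch.nn.Module):
    def forward(self, x):
        return torch.nn.functional.gelu(x @ x.T)

model = Toy().eval()
example = (torch.randn(64, 64),)

# Export the graph once and compile it ahead of time into a package.
exported = torch.export.export(model, example)
pkg_path = aoti_compile_and_package(exported, package_path="toy.pt2")

# A fresh process (e.g. a serverless worker) loads the precompiled
# artifact and runs it without per-request compilation.
compiled = aoti_load_package(pkg_path)
print(compiled(*example).shape)
```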

NVIDIA Blackwell GPUs are incredible, but getting the most out of them requires knowing the hardware. This blog post series aims to demystify what it takes to get peak performance out of this sophisticated device! https://x.com/clattner_llvm/status/1961491323875455029

So, Nvidia is doing an ablation on a 13B model for 10T tokens only to show their 4-bit (NVFP4) training is stable? https://x.com/eliebakouch/status/1962805948184998064

Rest of the world is finally picking up on Google selling TPUs externally.
In the Accelerator Model, we discussed the details over a month ago, regarding both TPUv7 Ghostfish and TPUv8 Sunfish / Zebrafish https://x.com/dylan522p/status/1963355683170246659

🍁 In collaboration with @NVIDIAAIDev, @RedHat_AI, and @VectorInst, vLLM is hosting a meetup in Toronto on September 25th! Come hear about project updates, distributed inference, EAGLE spec decode, and FlashInfer! https://x.com/vllm_project/status/1963736578674893071

What is @OpenAI’s Responses API, and should you use it instead of Chat Completions? 🤔 TL;DW: → Built for agents (& remote MCPs 🤫) → Better streaming control → Better structured + multimodal outputs The Responses API is available now on @GroqInc in beta. https://x.com/benankdev/status/1961444239327240500
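
In practice the switch is mostly a different method and payload shape. A hedged sketch of calling it through Groq’s OpenAI-compatible endpoint; the model id below is an assumption, so check Groq’s docs for what the Responses beta actually exposes:

```python
# Responses API call via the OpenAI SDK pointed at Groq's endpoint.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.groq.com/openai/v1",
    api_key="YOUR_GROQ_API_KEY",
)

response = client.responses.create(
    model="openai/gpt-oss-120b",  # assumed model id for the beta
    input="Summarize the difference between Responses and Chat Completions.",
)
print(response.output_text)
```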

Hmmm… @Jack2LOneill how come @MrBeast gets a tour of SGC before all the Stargate enjoyers? Maybe he *is* one himself? 😂 https://x.com/bilawalsidhu/status/1962560442842181656

We said “coming weeks” and proceeded to add three over the holiday weekend.😅 Now live on @OpenRouterAI via our W&B Inference service: • @deepseek_ai V3.1 • @OpenAI’s gpt-oss-120b & gpt-oss-20b https://x.com/weights_biases/status/1962943063711744115

Modern AI teams need hyperscalers & neoclouds, but legacy tools like SLURM can’t keep up. @AbridgeHQ moved from SLURM to multi-cloud AI infra with @skypilot_org. ✅ 10x faster dev cycles ✅ SLURM-like convenience, K8s’ reliability ✅ Scale on any infra https://x.com/skypilot_org/status/1963637217055646139
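
For a flavor of what replaces an sbatch script, here’s a minimal SkyPilot task via its Python API; the setup/run commands and accelerator request are placeholders, not Abridge’s actual config:

```python
# Minimal SkyPilot task: declare the job, let SkyPilot pick the infra.
import sky

task = sky.Task(
    setup="pip install -r requirements.txt",  # placeholder setup step
    run="python train.py",                    # placeholder training command
)
task.set_resources(sky.Resources(accelerators="H100:8"))

# SkyPilot provisions on whichever cloud/cluster satisfies the request,
# which is the multi-cloud part of the migration described above.
sky.launch(task, cluster_name="train-cluster")
```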

GPT-OSS uses MXFP4 quantization (which MLX now supports). There are two FP4 formats circulating right now: MXFP4 and NVFP4 (NV for Nvidia). From looking at how GPT-OSS uses MXFP4, it is somewhat suboptimal. I’m thinking NVFP4 will be the more commonly used format in the… https://x.com/awnihannun/status/1961500133990043967
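
To make the comparison concrete: FP4 (E2M1) elements can only take the values ±{0, 0.5, 1, 1.5, 2, 3, 4, 6}, and the two formats differ in how blocks share a scale—MXFP4 uses 32-element blocks with a power-of-two scale, NVFP4 uses 16-element blocks with an FP8 (E4M3) scale, which tracks block ranges more finely. A toy sketch of the MXFP4 side (illustrative, not the bit-exact OCP spec):

```python
# Toy MXFP4-style block quantization: snap values to the FP4 grid
# under one shared power-of-two scale per block.
import math

FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # E2M1 magnitudes

def quantize_block_mxfp4(block):
    """Quantize one block (up to 32 floats) to FP4 with a shared 2^k scale."""
    amax = max(abs(v) for v in block) or 1.0
    # Align the block max with E2M1's max exponent (6 = 1.5 * 2^2, emax = 2).
    scale = 2.0 ** (math.floor(math.log2(amax)) - 2)
    def to_fp4(v):
        mag = min(FP4_GRID, key=lambda g: abs(g - abs(v) / scale))
        return math.copysign(mag, v) * scale
    return [to_fp4(v) for v in block]

block = [0.03, -0.9, 2.2, 5.5, -0.001, 1.1, -3.3, 0.4]
print(quantize_block_mxfp4(block))  # values snapped to the scaled FP4 grid
```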
