Ethan B. Holland

Over 54,900 manually organized AI links and counting

Chips and Hardware: AI News Week Ending 04/18/2025

April 18, 2025

Image created with Ideogram V2. Image prompt: A vibrant spring meadow with exaggerated blooming flowers in bright colors. Hidden comically in the middle is a massive computer chip/processor the size of a small house, with circuit pathways glowing and tiny transistors blinking. It’s ineffectively covered with leaves and has a few birds nesting in its heat sinks. Silicon wafers reflect sunlight like mirrors. Woodland animals are playing hopscotch across the chip’s architecture. RAM sticks sprout from the ground like strange technological plants. The whole scene is bathed in golden sunshine with lens flares. Vibrant colors and high detail. The word “CHIPS” integrated into the scene.

Nvidia to produce AI servers worth up to $500 billion in US over four years | Reuters https://www.reuters.com/technology/artificial-intelligence/nvidia-says-working-with-partners-make-ai-supercomputers-us-2025-04-14/

Jerry Tworek on X: “Scaling is incredibly hard and demanding and leaves very little room for error in every little part of the training stack But once it works, it’s beautiful to see it https://t.co/I13hW5gAuE” / X https://x.com/MillionInt/status/1912568397419954642

“Jim Fan’s predictions: ⦿ In the next 2–5 years, robotics will uncover its own scaling laws – similar to those seen in LLMs – by analyzing how model size, real‑world data, simulation data, and compute affect performance. ⦿ Within the next 20 years, robotics will accelerate https://x.com/TheHumanoidHub/status/1910367639425384568

“There is an alternate reality where Cray took their vector supercomputers, ditched FP64 calculations, and went with one FP32 pipe and a BF16 tensor core pipe. The same instruction set, memory architecture, and vector registers would have made a sweet deep learning machine, in https://x.com/ID_AA_Carmack/status/1911872001507016826

“Three key enablers, according to Jim Fan, are aligning to bring robotics into the mainstream: ⦿ Foundation models that can understand and reason with the 3D world ⦿ GPU-accelerated simulation addressing data scarcity ⦿ Cheaper and more capable robot hardware https://x.com/TheHumanoidHub/status/1910606309210415406

“Anyone purchased a RTX 5090 windows laptop they’re exceedingly happy with? I might wait it out for the Razer Blade 18, but some of the OEM alternatives look pretty good.” / X https://x.com/bilawalsidhu/status/1912253808958403073

“If you’re excited about optimizing code that runs equally well on a single or thousands of GPUs and if you have the ability to submit a single substantial PR to a major OSS library, we want you on the PyTorch team – especially if you’re early in your career.” / X https://x.com/marksaroufim/status/1912540037625094457

“🔔 Epic sight on the floor of the @NYSE! The NYSE lit up its iconic boards to celebrate Together AI’s selection to the 2025 @Forbes AI 50 list. Seeing https://x.com/togethercompute/status/1912990460416803085

“Glad to share Seaweed-7B, a cost-effective foundation model for video generation. Our tech report highlights the key designs that significantly improve compute efficiency and performance given limited resources, achieving comparable quality against other industry-level models. To https://x.com/CeyuanY/status/1911618555210334350

Stable Diffusion Now Optimized for AMD Radeon™ GPUs and Ryzen™ AI APUs — Stability AI https://stability.ai/news/stable-diffusion-now-optimized-for-amd-radeon-gpus

[2504.10449] M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models https://arxiv.org/abs/2504.10449

“If GPU optimization and systems problems excite you, why limit your impact to a single company or lab? Working on PyTorch allows you to ship impact to the entire AI industry! We’re hiring across experiences — junior and senior engineers. Read more below 👇” / X https://x.com/soumithchintala/status/1912600604657975595

“OpenAI will change their naming scheme from GPT-4 to GPU-4, GPV-4, GPW-4, GPX-4 as they have run out of possible numbers” / X https://x.com/scaling01/status/1911912903260721331

“Transformers process long sequences inefficiently; Mamba models face unstable learning and generalization. This paper introduces TransMamba, unifying Transformer and Mamba architectures using shared parameters. It dynamically switches between Attention and State Space Model https://x.com/rohanpaul_ai/status/1911543921957880186

“if you are interested in infrastructure and very large-scale computing systems, the scale of what’s happening at openai right now is insane and we have very hard/interesting challenges. please consider joining us! we could desperately use your help.” / X https://x.com/sama/status/1911504090989035824

“Github 👨‍🔧: Serverless AI Workflows for Data & ML Teams Use it to build and orchestrate sophisticated AI workflows and agents. Julep manages complex operations, state across interactions, and integrates with your data infrastructure and tools, letting you compose and scale https://x.com/rohanpaul_ai/status/1912018297476026431

“Dating advice: if you go on your first date with GPU indexing, stay as logical as you can. Whatever happens, don’t get too physical.” / X https://x.com/hyhieu226/status/1912933636879585518

“As we all know by now, reasoning models often generate longer responses, which raises compute costs. Now, this new paper ( https://x.com/rasbt/status/1911494805101986135

“.@Google’s new TPU Ironwood, built for the “age of inference”. The performance numbers are unbelievable.. 🔥 – 192 GB per chip, 6x that of Trillium, which enables processing of larger models and datasets, reducing the need for frequent data transfers and improving https://x.com/rohanpaul_ai/status/1911378316285616301

“384 Huawei Ascend 910Cs > GB300NVL72. 300 PFLOPS/server? I guess they compare to NVL72’s 180, that’s TF32, naively means 600 PFLOPS FP16, and 1 910C being 3.2x slower than 1 Blackwell Ultra. Or… 1.6x? should be possible to make 2000 such units with TSMC loot as reported by CSIS. https://x.com/teortaxesTex/status/1911683572953493750

“Huawei’s new AI server is insanely good People need to reset their priors This is why banning H20 without banning tools and sub components is idiotic because Huawei is not far behind H20. The admin needs to act fast to slow down Huawei’s ramp or the H20 ban will be useless” / X https://x.com/dylan522p/status/1912373100668137883

Nvidia H20 chip exports hit with license requirement by US government | TechCrunch https://techcrunch.com/2025/04/15/nvidia-h20-chip-exports-hit-with-license-requirement-by-us-government/

NVIDIA 8K Other Events | NVDA 15 Apr 25 https://capedge.com/filing/1045810/0001045810-25-000082/NVDA-8K

NVIDIA to Manufacture American-Made AI Supercomputers in US for First Time | NVIDIA Blog https://blogs.nvidia.com/blog/nvidia-manufacture-american-made-ai-supercomputers-us/

“Still not entirely finished, but here are some WIP ansible playbooks for AMD-SEV with nvidia confidential compute. Need to finish the intel TDX version also. What a pain. https://x.com/jon_durbin/status/1911710236529852787

nvidia/Nemotron-H-8B-Base-8K · Hugging Face https://huggingface.co/nvidia/Nemotron-H-8B-Base-8K

“Nvidia released Llama Nemotron-Ultra, a 253B param reasoning AI that beats DeepSeek R1, Llama 4 Behemoth and Maverick The model includes a reasoning toggle to optimize for cost and is fully open-source with code, weights, and post-training data on HF https://x.com/adcock_brett/status/1911450216164700252

Exclusive: Alphabet, Nvidia invest in OpenAI co-founder Sutskever’s SSI, source says | Reuters https://www.reuters.com/technology/artificial-intelligence/alphabet-nvidia-invest-openai-co-founder-sutskevers-ssi-source-says-2025-04-12/

EquiVDM https://research.nvidia.com/labs/genair/equivdm/

“Nvidia and Stanford researchers unveiled an AI technique to generate consistent, minute-long cartoons Test-Time Training works on top of existing models, producing multi-scene stories with dynamic motion and character interactions Here’s an example: https://x.com/adcock_brett/status/1911450240143536333

“Our new @OpenAI o3 and o4-mini models further confirm that scaling inference improves intelligence, and that scaling RL shifts up the whole compute vs. intelligence curve. There is still a lot of room to scale both of these further. https://x.com/polynoamial/status/1912564068168450396

“Something magic about these charts with compute on the x-axis, and clear steady improvements (across many metrics) on the y-axis. Really makes you feel like the field is uncovering some underlying fundamental laws of intelligence.” / X https://x.com/gdb/status/1912590623955382445

“We’re partnering with @CerebrasSystems to bring the fastest Llama 4 experience right to you! 🔥 Join us tomorrow in our hackathon to build real-time systems, code agents/ assistants AND more! Bonus: We’re giving 20USD free inference credits and one month Pro subscription to all https://x.com/huggingface/status/1910801830126174632

[2410.16144] 1-bit AI Infra: Part 1.1, Fast and Lossless BitNet b1.58 Inference on CPUs https://arxiv.org/abs/2410.16144