Image created with gemini-3.1-flash-image-preview and claude-sonnet-4-5. Image prompt: High altitude aerial photography of a reflective silicon wafer with etched circuit patterns freefalling through crisp blue sky, spinning and catching sunlight with prismatic reflections, golden bond wires trailing like streamers, the word CHIPS in large bold typography integrated into the sky, miniature circuit board landscape far below, bright daylight, wide angle, dynamic action shot, clean composition.

Dang. Not to mention all the GPUs and TPUs Amazon and Google provide to Anthropic. https://x.com/bilawalsidhu/status/2027530947051045011

A statement on the comments from Secretary of War Pete Hegseth. https://x.com/AnthropicAI/status/2027555481699446918

AI Defense Contractors: The Economics Behind the Pentagon Pivot – philippdubach.com https://philippdubach.com/posts/when-ai-labs-become-defense-contractors/

Anthropic and Alignment: Anthropic is in a standoff with the Department of War; while the company’s concerns are legitimate, its position is intolerable and misaligned with reality. https://x.com/stratechery/status/2028425096054931921

Anthropic Labeled Supply-Chain Risk, Pentagon Says – Bloomberg https://www.bloomberg.com/news/articles/2026-03-05/pentagon-says-it-s-told-anthropic-the-firm-is-supply-chain-risk

Anthropic Made Pitch in Drone Swarm Contest During Pentagon Feud – Bloomberg https://www.bloomberg.com/news/articles/2026-03-02/anthropic-made-pitch-in-drone-swarm-contest-during-pentagon-feud

Anthropic-Palantir Partnership at Risk After Pentagon Decision – The Information https://www.theinformation.com/articles/anthropic-palantir-partnership-risk-pentagon-ruling

NEW: After the Pentagon threatened to designate Anthropic a “supply chain risk,” Anthropic’s relationship with government contractor Palantir may be at risk. Palantir could stop using Anthropic for federal work, which makes up ~42% of Palantir’s $4.5B in revenue last year. https://x.com/srimuppidi/status/2028943303581024412

NVIDIA CFO on the earnings call: “Physical AI is here, having already contributed north of $6B in NVIDIA Corporation revenue in fiscal year 2026.” Jensen: “Now, the thing that is the wave that we are seeing now is the agentic AI inflection, and the next inflection beyond…” https://x.com/TheHumanoidHub/status/2026815968807366703

Nvidia, Amazon, Google will have to divest from Anthropic if Hegseth gets his way. This is simply attempted corporate murder. I could not possibly recommend investing in American AI to any investor; I could not possibly recommend starting an AI company in the United States. https://x.com/deanwball/status/2027515599358730315

One legal point: The DoW “supply chain risk” designation applies to DoW *contracts,* not generally. DoW can tell suppliers “don’t use Anthropic when performing DoW contracts.” But DoW can’t, legally, tell its contractors “don’t use Anthropic even in your private contracts.” https://x.com/petereharrell/status/2027517998555160645

Scoop: Anthropic’s business partnership with Palantir could be the first casualty of its Pentagon spat. https://x.com/aaronpholmes/status/2028942999548297464

Sorry if this is woke or whatever but it is FUCKING INSANE that the DoD is explicitly, publicly trying to create an AI powered mass surveillance program of American citizens, attempting to destroy a company for refusing to comply with that directive, and *bragging about doing so*. https://x.com/quantian1/status/2027537341410160917

Thank you for your attention to this matter. cc: @AnthropicAI @DarioAmodei https://x.com/petehegseth/status/2027487514395832410?s=12

The amendment for the DoW-OAI deal may help, but I think it still fails to address key problems. The core surveillance prohibition is limited to “intentional”/“deliberate” surveillance. If the DoW says the use is incidental, it’s seemingly permitted, regardless of scale. 🧵 https://x.com/justanotherlaw/status/2028673906870223286

Think about the power Hegseth is asserting here. He is claiming that the DoD can force all contractors to stop doing business of any kind with arbitrary other companies. In other words, every operating system vendor, every manufacturer of hardware, every hyperscaler, every type… https://x.com/deanwball/status/2027521251263000765

“…threats do not change our position: we cannot in good conscience accede to their request.” @AnthropicAI drawing a moral line against enabling mass domestic surveillance & fully autonomous weapons, and holding it under pressure. Almost unheard of in BigTech. I stand in support. https://x.com/mmitchell_ai/status/2027478619430523273

Until a full account of the Anthropic-DoW negotiations eventually comes out in testimony under oath, it’s hard to know for sure how to interpret it. But the supply chain risk decision will be pretty consequential for the AI industry and likely not in a productive way. https://x.com/jachiam0/status/2027531473319055581

US government just announced they are looking for a new supplier for their *checks notes* mass domestic surveillance. https://x.com/janleike/status/2027521943491252501

What the frick: “I am directing the Department of War to designate Anthropic a Supply-Chain Risk to National Security. Effective immediately, no contractor, supplier, or partner that does business with the United States military may conduct any commercial activity with…” https://x.com/kimmonismus/status/2027517309120635120

As Nvidia pours $30 billion into OpenAI so OpenAI can spend $20+ billion on Nvidia chips, remember this story from last week. Reuters reported two weeks ago that OpenAI had become dissatisfied with the performance of Nvidia’s hardware for certain types of inference tasks… https://x.com/michaeljburry/status/2027499260279652459?s=12

Nvidia CEO Huang says $30 billion OpenAI investment ‘might be the last’ https://www.cnbc.com/2026/03/04/nvidia-huang-openai-investment.html

Helping AI reach more people requires deep collaboration across the ecosystem. Today we’re announcing new investment, with support from @SoftBank, @NVIDIA, and @Amazon, to scale the infrastructure needed to bring AI to everyone. https://x.com/OpenAI/status/2027376050263793814

Scaling AI for everyone | OpenAI
https://openai.com/index/scaling-ai-for-everyone/

We have raised a $110 billion round of funding from Amazon, NVIDIA, and SoftBank. We are grateful for the support from our partners, and have a lot of work to do to bring you the tools you deserve. https://x.com/sama/status/2027386252555919386

This is not an extension of the November deal. It is a fundamentally different relationship. The November 2025 agreement was straightforward. OpenAI committed $38 billion over seven years to rent NVIDIA GPU capacity on AWS. EC2 UltraServers, GB200s and GB300s, pure… https://x.com/patrickmoorhead/status/2027382529712349413

Anthropic’s Compute Advantage: Why Silicon Strategy is Becoming an AI Moat https://www.datagravity.dev/p/anthropics-compute-advantage-why

@AIatAMD and @EmbeddedLLM built 7 attention backends for vLLM on ROCm — and animated the internals. Shuffled KV cache layouts. Batch reordering. Log-sum-exp merging across chunks. This is how ROCM_AITER_FA gets 4.4x decode throughput on AMD GPUs 👇 https://x.com/vllm_project/status/2027572563547742264

2026 is the year of AI infrastructure. https://x.com/Zai_org/status/2028457036308947393

AMD open sourced rocprof-trace-decoder! This was one of the last pieces of closed source code on the CPU side — the definitions of the hardware SQTT traces are now public. AMD’s tracing infrastructure is better than NVIDIA’s, it can trace the timing of every instruction. https://x.com/__tinygrad__/status/2028679089650041069

Asymmetric hardware scaling is here. Blackwell tensor cores are now so fast that exp2 and shared memory are the wall. FlashAttention-4 changes the algorithm & pipeline so that softmax & SMEM bandwidth no longer dictate speed. Attn reaches ~1600 TFLOPs, pretty much at matmul speed! https://x.com/tedzadouri/status/2029569295806841236

Attack of the asynchronous machines. We’ve seen this a lot in GPU kernels. This time the same principle applies in speculative decoding. https://x.com/tri_dao/status/2029273056364118407

GPU vs. Taalas HC – two pieces of hardware with two different inference paradigms. Here’s the workflow breakdown. ➡️ GPU: Run the model in software. A GPU is a programmable compute engine. The model is software, and the chip executes it. ▪️ Inference on a GPU, step by step: 1. … https://x.com/TheTuringPost/status/2028458565917360363

I found an interesting bug: when you make CuTeDSL kernels torch compile compatible using a custom op, there is a speed regression vs. when the kernel is run as is. This becomes a problem when the operation is part of a larger block that must be torch compiled. It also… https://x.com/maharshii/status/2028863745641112008

Inference hardware: 7 notable ASICs (Application-Specific Integrated Circuits) ▪️ Taalas HC (Hardcore Models) ▪️ MatX One ▪️ Google TPU Ironwood ▪️ AWS Inferentia2 ▪️ Groq LPU (Language Processing Unit) ▪️ d-Matrix Corsair ▪️ Cerebras WSE-3 Explore their workflow and features. https://x.com/TheTuringPost/status/2027705308089618928

Key design choices: KV cache layout redesigned around AMD’s CDNA architecture. Decode ops skip layout conversion entirely → 15-20% gain. For DeepSeek MLA: ROCM_AITER_MLA compresses KV cache from ~8K → 576 dims (14x reduction) with hand-tuned assembly decode kernels. https://x.com/vllm_project/status/2027572573953724793
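The “log-sum-exp merging across chunks” these two vLLM posts mention is the standard way to split attention over a partitioned KV cache: each chunk yields a partial output plus its log-sum-exp, and the partials recombine exactly into full softmax attention. A minimal NumPy sketch of the math (function names are mine, not vLLM’s API):

```python
import numpy as np

def full_attn(q, k, v):
    """Reference softmax attention over the whole KV cache."""
    s = q @ k.T / np.sqrt(q.shape[-1])
    p = np.exp(s - s.max(axis=-1, keepdims=True))
    return (p / p.sum(axis=-1, keepdims=True)) @ v

def chunked_attn(q, k, v, n_chunks=4):
    """Attention computed per KV chunk, then merged via log-sum-exp."""
    outs, lses = [], []
    for kc, vc in zip(np.array_split(k, n_chunks), np.array_split(v, n_chunks)):
        s = q @ kc.T / np.sqrt(q.shape[-1])
        m = s.max(axis=-1, keepdims=True)      # per-chunk max, for stability
        p = np.exp(s - m)
        l = p.sum(axis=-1, keepdims=True)      # per-chunk normalizer
        outs.append((p @ vc) / l)              # chunk-local softmax output
        lses.append(m + np.log(l))             # chunk log-sum-exp
    lse = np.concatenate(lses, axis=-1)        # (n_queries, n_chunks)
    w = np.exp(lse - lse.max(axis=-1, keepdims=True))
    w = w / w.sum(axis=-1, keepdims=True)      # exact merge weights
    return sum(w[:, i:i + 1] * o for i, o in enumerate(outs))
```

The merge is exact, not approximate: weighting each chunk’s output by its exponentiated log-sum-exp and renormalizing reproduces the global softmax, which is what lets decode kernels process the KV cache in whatever layout the hardware prefers.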

Love seeing practical engineering posts. If you’re running @vLLM in production and hitting OOM or unstable load, this guide explains why workload profiling + tuned configs matter more than hardware alone. https://t.co/ItEVhNgGRi @AI21Labs https://x.com/DylanCouzon/status/2029208629312700592

Models can now be “baked” straight into a chip and run at 17,000 tokens/second. It’s the new reality from @taalas_inc and their HC1 chips. Here’s a breakdown of all the benefits and trade-offs. Is AI inference finally stable enough to move past the GPU era? https://x.com/TheTuringPost/status/2028261046075396568

RL post-training needs heterogeneous hardware – beefy GPUs for the trainer, cheap GPUs for rollouts, and high-memory CPU instances for replay buffers. Running it all on top-tier GPUs is wasteful. SkyPilot Job Groups simplifies workloads with heterogeneous requirements: • One… https://x.com/skypilot_org/status/2028878888211013907

This plot undersells how much of a compute multiplier Olmo Hybrid is: 2x compute multiplier on many downstream tasks (and solid LC performance!!!) https://x.com/soldni/status/2029594807723815372

Traditionally, most AI chips run models. @taalas_inc does the opposite – it physically encodes a model into hardware. A model can be a chip now – and that means 16-17k tokens per second inference speed. ▪️ So how are these chips, called Hardcore Models (HC), made? • A base… https://x.com/TheTuringPost/status/2027034591824097575

WTF you can now train a 5 million context window 8b model on a single node of 8xH100s ???? Most people don’t realize that even on long context pretrained frontier models, most RL post-training is only done on a small fraction of that context. Why? Long context RL is… https://x.com/rronak_/status/2028718679123497007

Under @POTUS leadership, the biggest AI companies in the world are committing to the Ratepayer Protection Pledge. Data centers are the foundation of the internet and next generation technologies, supporting the U.S. economy and national security. Although electricity demand is… https://x.com/WHOSTP47/status/2029297529301475705?s=20

3 trillion tokens. 512 NVIDIA Blackwell GPUs. 7 days. The Olmo team at @allen_ai and Lambda trained Olmo Hybrid 7B fully in the open, with every training log, recovery metric, and model weight published. 97% active training time, median recovery under 4 minutes. Read the full… https://x.com/LambdaAPI/status/2029591148529139771

An existential risk for near term open-weight models. In the coming years, the only places with business reasons for building them: 1) non-profits — good for research/the world 2) Nvidia — keep their hardware up with AI 3) Meta — commoditize their complements. https://x.com/natolambert/status/2029049751472357631

If you’re trying out FA4, you’re likely to run into not being able to load cutlass.cute – the issue is that the nvidia-cutlass-dsl wheel installs its python files into – wait, what – site-packages/nvidia_cutlass_dsl/python_packages/cutlass so of course python can’t find these… https://x.com/StasBekman/status/2029625103500317066
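Per the post, the nvidia-cutlass-dsl wheel drops the `cutlass` package into site-packages/nvidia_cutlass_dsl/python_packages/, one level below where Python searches. A hedged workaround sketch (the helper name is mine, and the wheel layout may change between versions) that prepends that directory to sys.path before importing:

```python
import site
import sys
from pathlib import Path

def add_cutlass_dsl_path(site_dirs=None):
    """If the wheel's nested python_packages dir exists, prepend it to sys.path.

    Returns the path that was added, or None if nothing was found.
    """
    for sp in site_dirs if site_dirs is not None else site.getsitepackages():
        candidate = Path(sp) / "nvidia_cutlass_dsl" / "python_packages"
        if candidate.is_dir():
            sys.path.insert(0, str(candidate))
            return str(candidate)
    return None

# Call this before `import cutlass.cute` so the package resolves.
add_cutlass_dsl_path()
```

A symlink from site-packages into python_packages/cutlass, or a `.pth` file listing that directory, would achieve the same thing without touching application code.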

Since our initial code release 8 months ago, it’s been fun collaborating with the cuDNN and CUTLASS teams at NVIDIA. Newer versions of cuDNN have now implemented many of the optimizations here, and latest cuDNN offers similar perf to FA4. 10/ https://x.com/tedzadouri/status/2029569315520086156

First steel beams went up today at our Stargate site in Milam County, Texas. Exciting to see this project taking shape with @SoftBank and @SBEnergy. https://x.com/gdb/status/2027241615992225930

The FA4 paper is finally out after a year of work. On Blackwell GPUs, attention now goes about as fast as matmul even though the bottlenecks are so different! Tensor cores are now so crazy fast that attn fwd is bottlenecked by the exponential, and attn bwd is bottlenecked by shared… https://x.com/tri_dao/status/2029569881151263082
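The exponential bottleneck these FA4 posts describe is why attention kernels route softmax through the GPU’s fast exp2 path: exp(x) = 2^(x·log2 e), with the constant typically folded into the preceding scale so the inner loop issues only exp2 instructions. A minimal NumPy sketch of the identity (this illustrates the arithmetic only, not FA4’s actual kernel):

```python
import numpy as np

LOG2E = 1.4426950408889634  # log2(e)

def softmax_ref(x):
    """Plain softmax via exp, for comparison."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def softmax_exp2(x):
    """Same softmax, via the identity exp(x) = 2**(x * log2(e))."""
    x = x - x.max(axis=-1, keepdims=True)
    p = np.exp2(x * LOG2E)  # on GPUs this maps to the fast exp2 unit
    return p / p.sum(axis=-1, keepdims=True)
```

In a real kernel the multiply by LOG2E disappears entirely, absorbed into the 1/sqrt(d) scale applied to the scores before the exponential.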
