Image created with gemini-3.1-flash-image-preview and claude-sonnet-4-5. Image prompt: Animation cel style image of a muscular blue-skinned genie with friendly expression emerging from a brass oil lamp, arms extended toward a large floating silicon microchip wafer with glowing golden circuit traces, magical teal wisps flowing from genie’s hands onto the chip surface, warm jewel-tone color palette, clean gradient background, Disney-quality 2D animation aesthetic with bold outlines, cinematic lighting, horizontal composition with space for text overlay at top.

Nvidia, Microsoft, Amazon in Talks to Invest Up to $60 Billion in OpenAI — The Information https://www.theinformation.com/articles/nvidia-microsoft-amazon-talks-invest-60-billion-openai

Source: Amazon could invest up to $50B in OpenAI in coming weeks https://www.cnbc.com/2026/01/29/amazon-openai-investment-jassy-altman.html

An orchestration framework for small models that coordinate powerful tools – ToolOrchestra from NVIDIA. It’s like a conductor model for agentic systems. Instead of solving everything itself, a small Orchestrator model reasons step-by-step and decides which tool or expert model… https://x.com/TheTuringPost/status/2015565947827110255
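
The tweet is cut off, but the pattern it describes is clear: a small model plans and routes, while external tools do the heavy lifting. Below is a minimal, hypothetical sketch of that conductor loop. It is not ToolOrchestra’s actual API; call_orchestrator() and the tool registry are placeholder stand-ins for the orchestrator model and its “powerful tools.”

```python
from typing import Callable

# Registry of "powerful tools" the small orchestrator delegates to.
TOOLS: dict[str, Callable[[str], str]] = {
    "calculator": lambda expr: str(eval(expr)),  # demo only: never eval untrusted input
    "search": lambda q: f"(stub) top result for: {q}",
}

def call_orchestrator(history: list[dict]) -> dict:
    """Placeholder for the small Orchestrator model. A real system would
    send `history` to an LLM and parse a JSON action out of its reply."""
    if len(history) == 1:  # hard-coded demo decision
        return {"action": "calculator", "input": "2 ** 10"}
    return {"action": "final", "input": "Answer: 1024"}

def run(task: str, max_steps: int = 5) -> str:
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        step = call_orchestrator(history)       # model reasons, picks a tool
        if step["action"] == "final":           # orchestrator decides it's done
            return step["input"]
        result = TOOLS[step["action"]](step["input"])  # run the chosen tool
        history.append({"role": "tool", "content": result})
    return "max steps reached"

print(run("What is 2 to the 10th power?"))
```

The point of the pattern is that the orchestrator itself can stay small and cheap; all the capability lives in the tools and expert models it routes to.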

Exclusive: China gives nod to ByteDance, Alibaba and Tencent to buy Nvidia’s H200 chips – sources | Reuters https://www.reuters.com/world/china/china-gives-green-light-importing-first-batch-nvidias-h200-ai-chips-sources-say-2026-01-28/

Welcome everyone to the Office Hours 👋 Quick reminder: the Events page on our website (https://t.co/cNKJF6oThw) has been synced with many SIG meetings, including Multimodal, Omni, AMD, torch-compile and CI. Choose the SIG you care about most and get involved. You can also join… https://x.com/vllm_project/status/2016526685869596974

Today we’re launching Model Vault, a dedicated, fully managed platform to run Cohere models securely and at scale. Model Vault delivers the control and isolation of self-hosting, without the operational burden: 🔒 Dedicated, isolated VPC ⚡ No noisy neighbors or rate limits 🔁 … https://x.com/cohere/status/2016512841751154739

Thrilled to announce our $300M Series A at a $4B valuation! Chips are the fuel for AI. At Ricursive Intelligence, we are using AI to design better chips faster, closing the recursive self-improving loop between AI and hardware. https://x.com/annadgoldie/status/2015806107470438685

Our Maia 200 inference chip, announced today, is the most performant first-party silicon of any hyperscaler: 3x the FP4 performance of Amazon’s Trainium v3, and FP8 performance above Google’s TPUv7. https://x.com/mustafasuleyman/status/2015845567138816326

Missed Dynamo Day 2026? Our session on large-scale LLM serving with vLLM from @simon_mo_ is now available on NVIDIA On-Demand. Covers disaggregated inference, Wide-EP for MoE, and rack-scale deployments on GB200 NVL72. Thanks @nvidia for hosting! Watch recording: … https://x.com/vllm_project/status/2017075057550618751

Nemotron 3 Nano in NVFP4 just dropped from @NVIDIA! 4x throughput on B200 (vs FP8 on H100) with accuracy preserved via Quantization-Aware Distillation. The checkpoint is already supported by vLLM https://t.co/xd6JETkS6o 🤝 Thanks NVIDIA × vLLM community! https://x.com/vllm_project/status/2016562169140433322

We just launched an ultra-efficient NVFP4 precision version of Nemotron 3 Nano that delivers up to 4x higher throughput on Blackwell B200. Using our new Quantization-Aware Distillation method, the NVFP4 version achieves up to 99.4% of BF16 accuracy. Nemotron 3 Nano NVFP4: … https://x.com/NVIDIAAIDev/status/2016556881712472570
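
For context, vLLM’s offline API makes trying a quantized checkpoint like this a few lines. A sketch follows; the Hugging Face model id is a guess at the release name (check NVIDIA’s model page for the exact id), and NVFP4 execution assumes Blackwell-class hardware such as B200.

```python
from vllm import LLM, SamplingParams

# Model id below is hypothetical; substitute the exact id from NVIDIA's
# Hugging Face page. The NVFP4 format is read from the checkpoint config.
llm = LLM(model="nvidia/Nemotron-3-Nano-NVFP4")
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(
    ["Explain quantization-aware distillation in one sentence."], params
)
print(outputs[0].outputs[0].text)
```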

NVIDIA and CoreWeave Strengthen Collaboration to Accelerate Buildout of AI Factories | NVIDIA Newsroom https://nvidianews.nvidia.com/news/nvidia-and-coreweave-strengthen-collaboration-to-accelerate-buildout-of-ai-factories

A new Stanford and NVIDIA paper that is really worth your attention. They introduce Test-Time Training to Discover (TTT-Discover), which lets models keep learning at inference time, using RL to find breakthrough solutions. It’s a new way to effectively solve scientific problems. https://x.com/TheTuringPost/status/2015377899168424073

I don’t think people have realized how crazy the results are from this new TTT + RL paper from Stanford/NVIDIA. Training an open-source model, they beat DeepMind’s AlphaEvolve, discovered a new upper bound for Erdős’s minimum overlap problem, and developed new A100 GPU kernels 2x… https://x.com/rronak_/status/2015649459552850113
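
Neither tweet spells out the method, but the core idea of test-time training with RL is to keep optimizing on the single problem in front of you at inference time, guided by a task-specific reward. Here is a toy illustration using a cross-entropy-method-style search as a stand-in for an LLM policy; nothing below is the paper’s actual algorithm, and the reward is a deliberately trivial placeholder.

```python
import random

def reward(x: float) -> float:
    """Toy scorable problem: maximize -(x - 3)^2, so the best solution is x = 3.
    In TTT-style discovery this would be a real evaluator (e.g. a kernel benchmark)."""
    return -((x - 3.0) ** 2)

def test_time_search(steps: int = 200, lr: float = 0.2) -> float:
    mu, sigma = 0.0, 2.0                # "policy": a Gaussian proposal over solutions
    best_x, best_r = mu, reward(mu)
    for _ in range(steps):
        samples = [random.gauss(mu, sigma) for _ in range(8)]
        scored = sorted(samples, key=reward, reverse=True)
        elite = scored[:2]                            # reinforce high-reward samples
        mu += lr * (sum(elite) / len(elite) - mu)     # move the policy toward elites
        if reward(scored[0]) > best_r:
            best_x, best_r = scored[0], reward(scored[0])
    return best_x

print(f"discovered solution ≈ {test_time_search():.3f}")  # should approach 3
```

The same loop shape, with an LLM proposing candidate programs or proofs and a verifier scoring them, is what lets inference-time compute keep improving a solution long after pretraining is done.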
