Ethan B. Holland

Over 54,900 manually organized AI links and counting

World Models: AI News Week Ending 03/20/2026

March 20, 2026

Image created with gemini-3.1-flash-image-preview with claude-sonnet-4-5. Image prompt: Using the provided reference image hoodornament.jpg, preserve the deep midnight navy car hood, chrome pedestal base, shallow depth-of-field sky background, and dramatic upward camera angle exactly as shown. Replace the Mercedes star with a single chrome hood ornament: a miniature crystalline Earth globe showing raised continents and polished ocean depressions, all rendered in seamless polished metal at realistic ornament scale, mounted on the same pedestal. Add bold white sans-serif text reading WORLD MODELS across the upper portion of the image as a clean headline.

People are undoubtedly a little alarmed at having unwittingly helped build a 3D map of the world for Niantic by contributing 30 billion crowdsourced images. I interviewed Niantic’s CTO Brian McClendon about exactly this in a TED interview last year — he’s also the guy who
https://x.com/bilawalsidhu/status/2033350363982471182

GR00T is moving away from VLM-based backbones in favor of integrated world models. Jensen Huang teased GR00T N2 during his keynote; NVIDIA’s next-gen foundation model built on DreamZero research. Utilizing a new world-action model architecture, it succeeds at novel tasks in
https://x.com/TheHumanoidHub/status/2034279221372321940

What if a robot could simulate the physical world from a single image. [📍Bookmark Paper & GitHub for later] PointWorld-1B from Stanford and NVIDIA is a large 3D world model that predicts how an entire scene will move, given RGB-D input and robot actions. The key idea is
https://x.com/IlirAliu_/status/2032895393407660380

Breaking: 1 trillion revenue for NVIDIA in 2027 Jensen Huang: “One year after last GTC, right here where I stand… I see, going down so much, through 2027. At least… one trillion dollars, you know? Now, does it make any sense? I’m certain computer demand will be much
https://x.com/TheTuringPost/status/2033622628385362068

Jensen just said NVIDIA’s $1T projection for 2025-27 covers only Blackwell and Rubin to keep it consistent with the previous projection. He mentioned he could have included Groq in that number: “so if I would’ve included that, theoretically, not actually, but theoretically,
https://x.com/TheHumanoidHub/status/2033990614824665421

Nvidia targets data center revenue of $1+ trillion for 2025-2027. That’s already quite ridiculous, with the AI physical world only in its zeroth innings. $NVDA
https://x.com/TheHumanoidHub/status/2033627322331660784

A breakthrough in real-time video generation. As a research preview developed with @NVIDIA and shared at @NVIDIAGTC this week, we trained a new real-time video model running on Vera Rubin. HD videos generate instantly, with time-to-first-frame under 100ms. Unlocking an entirely
https://x.com/runwayml/status/2034284298769985914

NVIDIA GTC 2026 Keynote: Everything That Happened in 12 Minutes – YouTube https://www.youtube.com/watch?v=X2i_8O75_Os

DoorDash’s New Paid Tasks Turn Couriers Into AI and Robot Trainers – Bloomberg https://www.bloomberg.com/news/articles/2026-03-19/doordash-s-new-paid-tasks-turn-couriers-into-ai-and-robot-trainers

LiTo: Joint Geometry and Appearance Modeling for Image-to-3D Generation TL;DR: Generates high-fidelity 3D objects from a single image by jointly modeling geometry + view-dependent appearance (lighting, reflections) in a unified latent space
https://x.com/Almorgand/status/2033987312451731904

AMI Labs just raised $1.03B. World Labs raised $1B a few weeks earlier. Both are betting on world models. But almost nobody means the same thing by that term. Here are, in my view, five categories of world models. — 1. Joint Embedding Predictive Architecture (JEPA)
https://x.com/zhuokaiz/status/2032201769053212682

Google Maps 3d basemap and navigation experience just became a lot more immersive 😍
https://x.com/bilawalsidhu/status/2032122828992962704

Introducing our biggest upgrade to @googlemaps since the original launch, featuring Ask Gemini (with personalization), Immersive Navigation, and much more!! 🗺️
https://x.com/OfficialLoganK/status/2032101245763149908

it’s all over when google realizes the treasure trove that is street view + aerial data and launches a version of genie grounded in the real world…
https://x.com/bilawalsidhu/status/2033954619181654114

GTC 2026 News | NVIDIA Newsroom https://nvidianews.nvidia.com/online-press-kit/gtc-2026-news

Jensen says he can’t think of a company building robots that isn’t working with Nvidia.
https://x.com/TheHumanoidHub/status/2033642974492659894

NVIDIA GTC 2026: Live Updates on What’s Next in AI | NVIDIA Blog https://blogs.nvidia.com/blog/gtc-2026-news/

Developers used to argue about programming languages; now they argue about harnesses. NemoClaw is NVIDIA’s answer to your OpenClaw safety woes — zero permissions by default, sandboxed subagents, private inference enforced at the infra layer. Here’s a guide on how to start:
https://x.com/baseten/status/2034649896523874356

Go from “hello world” to “hello claw!” 🦞 We’re hosting a Build-A-Claw extravaganza in the #NVIDIAGTC Park Mon-Thur where you can BYOD or buy a DGX Spark on-site and our NVIDIA experts will help you install @OpenClaw. See you there! 🙌 Full details 👉
https://x.com/NVIDIAAIDev/status/2032847578404888907

We’re going live at #NVIDIAGTC in 30 minutes. ⏱️ Join us for GTC Live at 8 a.m. PT as we get ready for Jensen Huang’s keynote 11 a.m. Featuring industry leaders from: @bfl_ml, @Cadence, @CaterpillarInc, @cohere, @CoreWeave, @DellTech, @EdisonSci, @FireworksAI_HQ, @IBM,
https://x.com/nvidia/status/2033551362210865371

🚀 Live from @NVIDIAGTC, we’re releasing Holotron-12B! Developed with @nvidia, it’s a high-throughput, open-source, multimodal model engineered specifically for the age of computer-use agents. Get started today! 🤗Hugging Face: https://t.co/SyAuqLIacS 📖Technical Deep Dive:
https://x.com/hcompany_ai/status/2033851052714320083

AI is already redesigning chip design itself! And the biggest bottleneck left is validation. Here is Bill Dally describing to @JeffDean how @nvidia uses AI to design chips: “We’re already using AI across multiple parts of the chip design process, and it’s delivering real
https://x.com/TheTuringPost/status/2034413469542588613

How NVIDIA Dynamo 1.0 Powers Multi-Node Inference at Production Scale | NVIDIA Technical Blog https://developer.nvidia.com/blog/nvidia-dynamo-1-production-ready/

With Nemotron 3 Nano 4B in the NVIDIA Nemotron 3 family, llama.cpp users get a compact model for action-taking conversational personas, available across NVIDIA GPU-enabled systems and @NVIDIA_AI_PC
https://x.com/ggerganov/status/2033947673825337477

The frontier has increasingly shifted to hybrid models – from Qwen to Kimi-Linear and now with NVIDIA’s Nemotron-3 Super – that rely on a strong linear sequence model. Today we release Mamba-3, the most powerful linear model to date.
https://x.com/tri_dao/status/2033948569502413245

NVIDIA thanks all its partners: the message? There is no way around NVIDIA. NVIDIA is the center of the revolution.
https://x.com/kimmonismus/status/2033615181415387610

Straight from NVIDIA GTC: Jensen Huang just unveiled a new vision for AI infrastructure For the first time, Rubin GPUs+Groq LPUs are paired: > 35× higher inference throughput > 10× more revenue from trillion-parameter models Architecture & why it’s needed
https://x.com/TheTuringPost/status/2033700480975520097

Thank you Jensen and NVIDIA! She’s a real beauty! I was told I’d be getting a secret gift, with a hint that it requires 20 amps. (So I knew it had to be good). She’ll make for a beautiful, spacious home for my Dobby the House Elf claw, among lots of other tinkering, thank you!!
https://x.com/karpathy/status/2034321875506196585

What if a world model could render not an imagined place, but the actual city? We introduce Seoul World Model, the first world simulation model grounded in a real-world metropolis. TL;DR: We made a world model RAG over millions of street-views. proj:
https://x.com/jyseo_cv/status/2033739972264792430

World Models: Computing the Uncomputable https://www.notboring.co/p/world-models

At this nerdiest of all nerdy sessions 💞, Jeff Dean said he doesn’t think we’re running out of data. “I think there’s still an enormous amount of data in the world that we haven’t really used yet for training these models. We train on some video data, for example, but there’s a
https://x.com/TheTuringPost/status/2034411360302567803

Learning from robot data? Standard. Direct Video-Action Models (DVA) is different: treat robot control as video generation, then translate the generated video into actions. Built by @rhoda_ai_, the system pre-trains causal video models from scratch and can run complex
https://x.com/IlirAliu_/status/2032742738853048413

Video generation might be a much better backbone for robot learning than image-text models. DiT4DiT couples a video Diffusion Transformer with an action Diffusion Transformer, letting robot policies learn directly from spatiotemporal video dynamics instead of static visual
https://x.com/IlirAliu_/status/2032380216962691114