Image created with gemini-2.5-flash-image with claude-sonnet-4-5. Image prompt: Wide-angle view inside a massive spherical zero-gravity training chamber in orbit, translucent holographic tactical overlays and glowing AR waypoints floating in space, a single figure in flight suit suspended at center surrounded by virtual enemy formations, curved transparent walls revealing deep space beyond, cool blue lighting with neon green highlights, cinematic sci-fi realism, high contrast dramatic lighting, sense of vastness and isolation

Super excited to announce SIMA 2! It’s a general agent that can understand & reason about complex instructions and complete tasks in simulated game worlds, even ones it has never seen before. Incredible to see how it can learn just from self-play… a crucial step towards AGI https://x.com/demishassabis/status/1989096784870928721

SIMA 2: A Gemini-Powered AI Agent for 3D Virtual Worlds – Google DeepMind https://deepmind.google/blog/sima-2-an-agent-that-plays-reasons-and-learns-with-you-in-virtual-3d-worlds/

SIMA 2 is our most capable AI agent for virtual 3D worlds. 👾🌐 Powered by Gemini, it goes beyond following basic instructions to think, understand, and take actions in interactive environments – meaning you can talk to it through text, voice, or even images. Here’s how 🧵 https://x.com/GoogleDeepMind/status/1988986218722291877

Our SIMA 2 research offers a strong path towards applications in robotics and another step towards AGI in the real world. Find out more → https://x.com/GoogleDeepMind/status/1988987865401798898

SIMA 2 🤝 Genie 3 We tested SIMA 2’s abilities in simulated 3D worlds created by our world model Genie 3. It demonstrated unprecedented adaptability by navigating its surroundings and took meaningful steps toward goals. https://x.com/GoogleDeepMind/status/1989024090414309622

One of the coolest 3d real estate demos I’ve seen in a while. Treedis brings together gaussian splats for area understanding → BIM data to check which units are available → Matterport for indoor views → and image editing AI to restyle decor and see different times of day. The https://x.com/bilawalsidhu/status/1988328067140644953

After a year of team work, we’re thrilled to introduce Depth Anything 3 (DA3)! 🚀 Aiming for human-like spatial perception, DA3 extends monocular depth estimation to any-view scenarios, including single images, multi-view images, and video. In pursuit of minimal modeling, DA3 https://x.com/bingyikang/status/1989358267668336841

Paper page – Depth Anything 3: Recovering the Visual Space from Any Views https://huggingface.co/papers/2511.10647

Depth Anything 3 Recovering the Visual Space from Any Views https://x.com/_akhaliq/status/1989336687529619858

Depth Anything 3 – a Hugging Face Space by depth-anything
https://huggingface.co/spaces/depth-anything/depth-anything-3

I genuinely think we’re on the cusp of a new type of creation engine. Feels less like prompting and more like puppeteering reality itself. MotionStream is a taste of what’s to come: https://x.com/bilawalsidhu/status/1986877076839014462

MotionStream: Real-Time Video Generation with Interactive Motion Controls
https://joonghyuk.com/motionstream-web/

Wildminder on X: “MotionStream: Real-time, interactive video generation with mouse-based motion control; runs at 29 FPS with 0.4s latency on one H100; uses point tracks to control object/camera motion and enables real-time video editing. https://t.co/fFi9iB9ty7 https://t.co/zKb9u3bj9g”
https://x.com/wildmindai/status/1985828041566941576

Marble: A Multimodal World Model | World Labs https://www.worldlabs.ai/blog/marble-world-model

A robot could learn a task just by watching a generated video? PhysWorld connects video generation with real-world robot learning. It turns visual imagination into physical skill. ✅ Takes one image and a task prompt ✅ Generates a video showing how to complete the task ✅ https://x.com/IlirAliu_/status/1988678189527273831

[2511.07416] Robot Learning from a Physical World Model https://arxiv.org/abs/2511.07416

Fei-Fei Li’s World Labs speeds up the world model race with Marble, its first commercial product | TechCrunch https://techcrunch.com/2025/11/12/fei-fei-lis-world-labs-speeds-up-the-world-model-race-with-marble-its-first-commercial-product/

From Words to Worlds: Spatial Intelligence is AI’s Next Frontier https://drfeifei.substack.com/p/from-words-to-worlds-spatial-intelligence

.@drfeifei started her new blog. We believe this will be one of the most interesting reads about Spatial Intelligence. She writes that Spatial Intelligence depends on world models built on 3 core principles: – They must be generative – able to create coherent, https://x.com/TheTuringPost/status/1988727531353305524

Perceptron’s platform is here — built for Physical AI Developers can now use Isaac-0.1 or Qwen3VL 235B via: Perceptron API — fast, reliable multimodal intelligence Python SDK — simple, grounded prompting for vision + language Build apps that see and understand the world. https://x.com/perceptroninc/status/1988713482460750290

We’ve been integrating Isaac across the industry and have realized developers are missing a single platform for Physical AI – prompt engineering, deployment, and integration. Today we are excited to release Perceptron’s Platform – supporting our API – supporting chat https://x.com/AkshatS07/status/1988713765152649711

We are just scratching the surface of precise control over AI video generation. MotionStream unlocks real-time video with interactive motion controls. You can interactively generate video based on motion inputs (like drawn trajectories, camera movements, or motion transfer). https://x.com/bilawalsidhu/status/1986838921712701833

@giffmana A joint isotropic gaussian has the property that marginals along all projections are also gaussian (and vice versa). Not true of Laplace. https://x.com/ylecun/status/1988999683801510063
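
The claim above can be checked numerically with a small sketch. It is an illustration under stated assumptions, not anything taken from the linked thread: "Laplace" is assumed here to mean a 2D vector with independent standard Laplace coordinates, kurtosis is used as a simple shape fingerprint, and the helper names (`kurtosis`, `laplace`, `projected_kurtosis`) are hypothetical. In theory a Gaussian has kurtosis 3 along every direction, while the independent-Laplace vector has kurtosis 6 along an axis but 4.5 along the diagonal, so its diagonal projection is no longer Laplace.

```python
import math
import random


def kurtosis(xs):
    """Sample kurtosis: fourth central moment over squared variance."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / (m2 * m2)


def gauss(rng):
    """One standard normal draw."""
    return rng.gauss(0.0, 1.0)


def laplace(rng):
    """One standard Laplace draw: the difference of two iid Exp(1) variables."""
    return rng.expovariate(1.0) - rng.expovariate(1.0)


def projected_kurtosis(sampler, direction, n, rng):
    """Kurtosis of u . x, where x has two independent coordinates from sampler."""
    ux, uy = direction
    return kurtosis([ux * sampler(rng) + uy * sampler(rng) for _ in range(n)])


rng = random.Random(0)  # fixed seed so repeated runs give the same estimates
n = 200_000
axis = (1.0, 0.0)
diag = (1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0))

results = {
    "gauss_axis": projected_kurtosis(gauss, axis, n, rng),
    "gauss_diag": projected_kurtosis(gauss, diag, n, rng),
    "laplace_axis": projected_kurtosis(laplace, axis, n, rng),
    "laplace_diag": projected_kurtosis(laplace, diag, n, rng),
}

for name, value in results.items():
    print(f"{name}: {value:.2f}")
```

The two Gaussian estimates should agree with each other up to sampling noise, while the two Laplace estimates should differ clearly, which is the direction dependence the post points at.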

You can just bike around your city with a camera and auto identify vacant & derelict buildings. Ireland has a housing crisis. Thousands of empty buildings. Nobody knows where they all are. UCD’s Spatial Dynamics Lab is using the latest in AI to map them automatically — https://x.com/bilawalsidhu/status/1987238945739186444

OpenReal2Sim Demo https://hesic73.github.io/OpenReal2Sim_demo/

We’re back with another update to Veo 3.1: Rolling out now on mobile and desktop, you can upload multiple reference images alongside your video prompts, to create entirely new worlds and more nuanced videos that are true to your vision. https://x.com/GeminiApp/status/1989440642179801192

AI’s next frontier is Spatial Intelligence, a technology that will turn seeing into reasoning, perception into action, and imagination into creation. But what is it? Why does it matter? How do we build it? And how can we use it? Today, I want to share with you my thoughts on https://x.com/drfeifei/status/1987891210699379091

We will transcend space & time by teleoperating humanoid robots anywhere in the world using VR headsets. It’s wild how far we’ve come from iPads on wheels. https://x.com/bilawalsidhu/status/1986336092111708459

Tesla vehicles and Optimus share core technology: actuators, power electronics, battery, manufacturing, data communication, audio system, cameras, AI4/AI5 chips, training cluster, neural simulation, and real-world AI. https://x.com/TheHumanoidHub/status/1986565979841962377

America and China build robots. Europe builds committees. Tesla just showed Optimus in pilot production. Humanoids being assembled like cars. Meanwhile, in Europe, we’re still arguing over regulation, ethics boards, and frameworks that nobody in the real world reads. This https://x.com/IlirAliu_/status/1986869259226456142
