Image created with Flux Pro v1.1 Ultra. Image prompt: Newsroom lab with headsets and AR teleprompters overlaying headlines in space; the word “AR/VR” set as a lower-third title in clean overlay sans, high legibility; holographic UI panels float over the skyline view; futuristic, volumetric light

Control an avatar in real time (in this case a dog on a beach!) https://x.com/demishassabis/status/1958696900489523633

Genie 3 has advanced spatial memory, when you make changes in the world, they persist in the simulation even when out of view! https://x.com/demishassabis/status/1958696898488840414

Simulations are the future, & one of the main tools we’ll ultimately use to understand and predict things about the universe. This is why I’m so excited about Genie 3, our latest interactive world simulator – here are some insanely cool things you might have missed about it 🧵: https://x.com/demishassabis/status/1958696882105995312

You can prompt Genie 3 with text, photos, or videos – like this cool game example created using Imagen 4 -> Veo 3 -> Genie 3 https://x.com/demishassabis/status/1958696891639595148

Nvidia announced Cosmos Reason 7B, an open-source VLM to enable robots to see, reason, and act in the physical world, solving multistep tasks. The company also made Isaac Sim 5.0 and Isaac Lab 2.2 generally available. https://x.com/adcock_brett/status/1957111085481242892

when you engage “hovercraft mode” on your new whip (made w/ nvidia cosmos) https://x.com/bilawalsidhu/status/1956160140404777142

TalkVid: A Large-Scale Diversified Dataset for Audio-Driven Talking Head Synthesis https://freedomintelligence.github.io/talk-vid/

Testing out the new Helix walking controller. it’s unstoppable https://x.com/adcock_brett/status/1958193476639826383

Join @xAI and help build a purely AI software company called Macrohard. It’s a tongue-in-cheek name, but the project is very real! In principle, given that software companies like Microsoft do not themselves manufacture any physical hardware, it should be possible to simulate… Macrohard is xAI’s playful project name (a nod to Microsoft) for building a fully AI-simulated software company. The goal: since firms like Microsoft produce no physical hardware, AI could theoretically replicate their entire operations, from coding to management. It’s real, and we’re hiring! 🚀 https://x.com/elonmusk/status/1958852874236305793

We can now train AI inside the mind of another AI. 🤯 🌍 Our world model, Genie 3, imagines and generates new worlds on the fly. 🤖 Our embodied agent, Sima, is dropped in and learns to navigate them autonomously. The entire loop—from the environment to the action—is generated https://x.com/bonniesjli/status/1958948293523767561

3d digital twin of a new light rail project in australia. bim/cad data + google maps 3d tiles –> interactive 3d engine. imagine if every major infrastructure project in the world did this https://x.com/bilawalsidhu/status/1957529179794133147

Sun’s out, models out. 😎 @IBM & @NASA dropped Surya, an open-source heliophysics model trained on 14 years of observations from NASA’s Solar Dynamics Observatory, and it’s 🔥🔥🔥. https://x.com/huggingface/status/1958163027238223985

What is the best all-in-one platform for vibe coding games and interactive experiences? Has anyone made sharing & discovery a first class citizen too — so you can play & remix community creations? https://x.com/bilawalsidhu/status/1958240030637302223

@snowglobe_so has sharable links now! You can share a public read-only view of any simulation. Each link lasts for 3 days. Share deep insights and data with the rest of your team! https://x.com/zaydsimjee/status/1958938033811869735

MeshCoder: LLM-Powered Structured Mesh Code Generation from Point Clouds https://x.com/_akhaliq/status/1958379365147775414

PERSONA: Personalized Whole-Body 3D Avatar with Pose-Driven Deformations from a Single Image. TL;DR: new framework for animated 3D human avatar from a single image; pose-driven deformation; high authenticity; correct facial expression and hand gestures; SMPL-X parameters https://x.com/Almorgand/status/1956023405234246057 https://mks0601.github.io/PERSONA/

We haven’t had the GPT moment for spatial intelligence yet — one model so general purpose it’s equally game changing for Palantir and Disney. Something that can understand and reason about 3D worlds — then generate them, faithfully or fantastically. That’s the convergence https://x.com/bilawalsidhu/status/1956010027828842647

Tencent’s Hunyuan team dropped big models: Hunyuan-GameCraft, for generating playable videos from a single scene image and user actions, and Hunyuan-Large-Vision, a versatile and powerful multimodal understanding model https://x.com/adcock_brett/status/1957111107933409607

Well, this happened sooner than I expected… Tencent has dropped their version of Genie 3. https://x.com/bilawalsidhu/status/1955968609940873624

Wow! Chinese lab Tencent Hunyuan has released an open source alternative to Genie 3 🔥 You can generate realistic videos that you can control in real time. – Long-term consistency – No need for expensive rendering – Trained on 1M+ gameplay recordings Already available ↓ https://x.com/itsPaulAi/status/1957182570309013714

Congrats @jparkerholder, @shlomifruchter, & the Genie & Veo teams! If you are interested to know more, the latest episode of the @GoogleDeepMind Podcast with the brilliant @FryRsquared has just dropped, and is all about Genie 3 & its incredible potential: https://x.com/demishassabis/status/1958696904146976927

Meta’s ‘Hypernova’ smart glasses will cost a lot less than we thought, new report claims | Mashable https://mashable.com/article/meta-smart-glasses-display-hypernova?test_uuid=003aGE6xTMbhuvdzpnH5X4Q&test_variant=b

Adapt3R: Adaptive 3D Scene Representation for Domain Transfer in Imitation Learning https://www.pair.toronto.edu/Adapt3R/

introducing Halo X, always listening AI glasses. vibe think. never use your brain again. pre-order now. https://x.com/AnhPhuNguyen1/status/1958199821048705312

pairlab/Adapt3R: Official implementation of Adapt3R: Adaptive 3D Scene Representation for Domain Transfer in Imitation Learning https://github.com/pairlab/Adapt3R

HTC launched Vive Eagle AI smartglasses in Taiwan, taking on Meta AI glasses. The glasses use Google and OpenAI’s models for assistance, and offer similar features such as live photo-based translation. Price starts at $520. https://x.com/adcock_brett/status/1957111220474892360

Most imitation learning policies break when the camera moves or the robot changes. NOT THIS ONE 👇 [📍 Bookmark for later] A new 3D scene representation encoder tackles this by enabling zero-shot generalization to unseen embodiments and viewpoints… And it works with any IL https://x.com/IlirAliu_/status/1956409667908936185

Mechaverse https://mechaverse.dev/

Tired: building AI for waifus & chatbots Wired: building AI for space exploration! Excited to introduce Surya, the first open-source AI foundation model for heliophysics, released by @NASA & @IBM on @huggingface! It’s a 366M-parameter transformer model pretrained on 9 years https://x.com/ClementDelangue/status/1958181104034156781

[2503.04877] Adapt3R: Adaptive 3D Scene Representation for Domain Transfer in Imitation Learning https://arxiv.org/abs/2503.04877

Tinker: Diffusion’s Gift to 3D–Multi-View Consistent Editing From Sparse Inputs without Per-Scene Optimization https://x.com/_akhaliq/status/1958380980000981208

Turning a beefy 23gb explosion simulation into a 300mb asset you can run in real-time inside unreal is a different kind of cool. ZibraVDB uses AI to compress OpenVDB volumes and has a generous free tier. https://x.com/bilawalsidhu/status/1958202154893648331

Today, we’re launching the Runway Game Worlds Beta. Over the last few months, we have been working on research and products that are moving us closer toward a future where you will be able to explore any character, story or world in real time. While generating the pixels of https://x.com/runwayml/status/1958516860149997672

Big upgrade coming to Helix https://x.com/adcock_brett/status/1957526592793838038

Major TOM @GoogleDeepMind’s AlphaEarth Embeddings are now on @huggingface! 🚀 A new 6 TB prototype dataset for the community. Get it here: https://x.com/mikonvergence/status/1958767622176039019

Masquerade: Learning from In-the-wild Human Videos using Data-Editing https://arxiv.org/pdf/2508.09976

Developed a more generalized version. The pattern 東南西北 can be displaced, but remains conserved. https://x.com/RavenKwok/status/1958157337187020973

Has GPT-5 Achieved Spatial Intelligence? An Empirical Study https://x.com/_akhaliq/status/1957833219992080581

Robots learn best from robot data, but… what if we could turn any human video into a robot demo? [❗️Every clip is fake] That’s exactly what Masquerade does. It takes raw human egocentric videos, replaces the human arms with a rendered robot, and produces “robotized” demos from https://x.com/IlirAliu_/status/1957788396505432258

Synchronized robots, tested in 3D, to boost uptime and cut integration errors: This setup uses RobotStudio to simulate dual-robot coordination with precise axis timing and collision checks. Every move is validated in a digital twin before going live. benefits: ✅ Full axis https://x.com/IlirAliu_/status/1957498982466211911

The first universal 3D robot viewer for the browser… load URDF, MJCF, or USD in seconds. No installs. No setup. Developer Victor Oldensand has built Mechaverse, a universal 3D robotics viewer that works on desktop and even mobile (if your phone is strong enough). No installs. https://x.com/IlirAliu_/status/1956260218523828459

BeyondMimic: From Motion Tracking to Versatile Humanoid Control via Guided Diffusion https://arxiv.org/pdf/2508.08241

Discover more from Ethan B. Holland
