Image created with gemini-2.5-flash-image, prompted via claude-sonnet-4-5. Image prompt: A person wearing modern AR glasses reaches toward a glowing translucent holographic chess piece floating above an empty wooden table in a bright sunlit room, hand gesture intersecting the hologram, photorealistic style with soft light reflections on the table surface

Amazon is developing smart glasses that let delivery drivers work hands-free https://www.aboutamazon.com/news/transportation/smart-glasses-amazon-delivery-drivers

AI video models may not be complete world models, but they are oddly capable of fairly sophisticated (if flawed) "simulations" of novel situations. Veo 3.1: "three toy ships, one made of iron, the other of wood, and one out of loosely packed sugar, fall into a pool of water" https://x.com/emollick/status/1980126684306424155

Ethan Mollick on X: "This is not perfect, but it also doesn't seem like a model trained on video should be able to get so many details of the dynamics right: 'honey pours down a marble statue of a toucan, the nose cracks and falls off'"
https://x.com/emollick/status/1980128284294938661

Datasets you need to build an AI JARVIS — Meta dropped 500 hours of 3D motion data spanning everything from individual gestures to multi-person conversations and co-living scenarios, complete with motion tracking, annotations, and audio tracks. https://x.com/bilawalsidhu/status/1980719297669525925

🤖 NVIDIA’s Gr00t N1.5 is now available in LeRobot! This is the result of a great collaboration between the @huggingface LeRobot team and @NVIDIARobotics ! Gr00t N1.5 highlights: 🦾 Cross-embodiment foundation model for robots 🧠 Multimodal inputs: vision, language, and https://x.com/LeRobotHF/status/1981334159801929947

Apple Vision Pro's M5 decoder is freaking insane for wireless PC VR: 4K x 4K per eye HEVC, 10-bit, at 120 Hz, with no struggle at all, even while screen recording this clip and using the wacky breakthrough features https://x.com/SadlyItsBradley/status/1981594915982147652

Snapchat makes its first open prompt AI Lens available for free in the US | TechCrunch https://techcrunch.com/2025/10/22/snapchat-makes-its-first-open-prompt-ai-lens-available-for-free-in-the-us/

Spatial intelligence is so hot right now. And it will only get hotter. https://x.com/bilawalsidhu/status/1979182318553305597

World models: No one knows what it means, but it's provocative. It gets the VCs going. https://x.com/bilawalsidhu/status/1979928032967094730

Now this is fking cool. Genie 3 made the World Labs approach look dated. But now they drop a step towards the best of both worlds. Generative video w/ spatial memory. And a live demo to boot. Making the bitter lesson sweet. Nicely done World Labs team! https://x.com/bilawalsidhu/status/1978917822228045858

Virtually Being https://eyeline-labs.github.io/Virtually-Being/

RTFM, a new generative interactive world model by World Labs, generates real-time video from a single image for exploring 3D worlds. Trained on large-scale video data to predict the next frame. Achieves unbounded persistence via spatial memory. try it: https://x.com/TheHumanoidHub/status/1978878123275096579

Tencent Hunyuan 3D https://3d-models.hunyuan.tencent.com/world/

Today, we are open-sourcing Hunyuan World 1.1 (WorldMirror), a universal feed-forward 3D reconstruction model. 🚀🚀🚀   While our previously released Hunyuan World 1.0 (open-sourced, lite version deployable on consumer GPUs) focused on generating 3D worlds from text or https://x.com/TencentHunyuan/status/1980930623536837013

So these researchers figured out you can basically hallucinate 3D cities into existence using just satellite photos & a diffusion model. The problem’s pretty straightforward: satellites only see rooftops. Building facades? Invisible. Street-level detail? Doesn’t exist. But https://x.com/bilawalsidhu/status/1981036580153544854

Google prepares Genie 3 public experiment with AI worlds https://www.testingcatalog.com/google-prepares-genie-3-public-experiment-with-ai-generated-worlds/

I like light probes and I cannot lie. Especially when they're being hallucinated by AI. Simulon is building a much higher quality version of the image-based lighting estimation that the Vision Pro or ARCore does in real time. https://x.com/bilawalsidhu/status/1979291609230659947

Sora 2 Pro is very impressive in my tests but really needs its own interface – using a storyboard maker shoved inside a video creation tool shoved inside of a drafts folder of a social media app is awkward. https://x.com/emollick/status/1980132516406473126

Introducing: Interactive Sora! A choose-your-own adventure GAME powered by Sora 2. Every choice spins up a brand-new scene instantly. Open source, link here: https://x.com/mattshumer_/status/1978848940083839162

Skild robot brain in action. The company is building a unified, omni-bodied brain to control any robot for any task. https://x.com/TheHumanoidHub/status/1979263197078282680

World-in-World: World Models in a Closed-Loop World https://world-in-world.github.io/

You can just transfer your entire performance to any character now. Fed it my guitar playing to drive this avatar, and it holds up well. Wan 2.2 Fun Control and Animate are powerful primitives. At the rate things are going, this’ll be a real-time AI filter. And at that point https://x.com/bilawalsidhu/status/1978871705599905892
