Ethan B. Holland

Over 56,100 manually organized AI links and counting

Augmented Reality (AR/VR): AI News Week Ending 06/20/2025

June 20, 2025

Image created with OpenAI GPT-Image-1. Image prompt: Cheesy late-night infomercial freeze-frame—garage workbench demo with goggles popping out of screen labelled “ARVR REALITY-BLASTER™”; laser grid, CRT ghosting, high-resolution

Our vision is for AI that uses world models to adapt in new and dynamic environments and efficiently learn new skills. We’re sharing V-JEPA 2, a new world model with state-of-the-art performance in visual understanding and prediction. V-JEPA 2 is a 1.2 billion-parameter model, https://x.com/AIatMeta/status/1932808881627148450

Cool demo of a GUI for LLMs! Obviously it has a bit silly feel of a “horseless carriage” in that it exactly replicates conventional UI in the new paradigm, but the high level idea is to generate a completely ephemeral UI on demand depending on the specific task at hand. https://x.com/karpathy/status/1935779463536755062

#NVIDIAIsaac Sim 5.0 and Isaac Lab 2.2 are now available in early developer preview on GitHub. 🎉 These releases give #Robotics developers early access to cutting-edge tools to simulate, train, and validate robots in a physics-based simulation environment. What’s new? https://x.com/NVIDIARobotics/status/1934768379652665403

Behind ANCESTRA: combining generative AI with live-action filmmaking https://blog.google/technology/google-deepmind/ancestra-behind-the-scenes/

Nvidia launched Isaac Sim 5.0 and Isaac Lab 2.2 in early preview on GitHub. These open frameworks now come with extensions for synthetic data generation and robot models, streamlining how devs build, train, and test AI robots in physics-based simulations. https://x.com/rowancheung/status/1934881540263018666

#CVPR2025 Picks #3 Alibaba just released VideoRefer-VideoLLaMA3 (2B & 7B video LLMs with Apache 2.0 license!) These models can understand videos, segment objects, and answer questions about them throughout the video at the same time 🤯 see it in action ⤵️ https://x.com/mervenoyann/status/1935739721772081336

AI stitching together multiple video feeds into one omniscient traffic god. This is what happens when cameras start talking to each other — mapping the trajectory of every vehicle and pedestrian seamlessly across cameras. Spatial intelligence is coming to a city near you. https://x.com/bilawalsidhu/status/1933346880336941297

How good is @runwayml Gen-4 References for visual effects? Here is a sample of what’s possible, all generated. https://x.com/c_valenzuelab/status/1934312626021949687

Woah. First person camera view being warped to a third person camera view using a unified flow and matching model. https://x.com/bilawalsidhu/status/1932975764992868427

Controllable and Expressive One-Shot Video Head Swapping https://humanaigc.github.io/SwapAnyHead/

What if a livestream had two digital avatars—talking, reacting, and engaging in real time? Luo Yonghao, one of China’s top livestreamers, made his digital avatar debut on Baidu’s e-commerce platform. Powered by the ERNIE foundation model, the livestream was the first to feature https://x.com/Baidu_Inc/status/1934982099112751197

Meta announces Oakley smart glasses that shoot 3K video | The Verge https://www.theverge.com/news/690133/meta-oakley-hstn-ai-glasses-price-date

🚀 Introducing Cosmos-Predict2! Our most powerful open video foundation model for Physical AI. Cosmos-Predict2 significantly improves upon Predict1 in visual quality, prompt alignment, and motion dynamics—outperforming popular open-source video foundation models. It’s openly https://x.com/qsh_zh/status/1933024567011995865

BREAKING at #BAAI Conference 2025: BAAI unveils RoboOS 2.0 (cross-embodied brain and cerebrum collaboration framework) & RoboBrain 2.0! Outperforms mainstream embodied AI models - World’s strongest open-source embodied brain model! #Robotics #EmbodiedAI #OpenSource https://x.com/BAAIBeijing/status/1931916124473499676

1X World Model Scaling Evaluation for Robots https://x.com/1x_tech/status/1934634700758520053

Sometimes I feel like we give too much credit to Apple and overthink what they do and why they do it. Liquid Glass is the latest case in point. “The text is hard to read because they’re preparing us for how UI will look on smart glasses” proclaim a slew of influencers and tech https://x.com/bilawalsidhu/status/1934271512258728093

Japan’s TDK acquires US-based smart glasses company SoftEye | Reuters https://www.reuters.com/business/japans-tdk-acquires-us-based-smart-glasses-company-softeye-source-says-2025-06-19/

Snap plans to sell lightweight, consumer AR glasses in 2026 | TechCrunch https://techcrunch.com/2025/06/10/snap-plans-to-sell-lightweight-consumer-ar-glasses-in-2026/

This is what TSA actually sees when they scan your luggage at the airport. CT scanning is peak spatial intelligence — reconstructing reality in 3D from raw x-ray data. https://x.com/bilawalsidhu/status/1934740598327841009

WebSim, WorldSim, and The Summer of Simulative AI — with Joscha Bach of Liquid AI, Karan Malhotra of Nous Research, Rob Haisfield of WebSim.ai https://www.latent.space/p/sim-ai

Introducing the V-JEPA 2 world model and new benchmarks for physical reasoning https://ai.meta.com/blog/v-jepa-2-world-model-benchmarks/

China unveils a new brain for humanoids. The Beijing Academy of Artificial Intelligence (BAAI) has announced RoboBrain 2.0, an open-source general-purpose AI model designed to power humanoids and other general-purpose robots. BAAI is already collaborating with 20 Chinese https://x.com/TheHumanoidHub/status/1934326215382569132

The Beijing Academy of Artificial Intelligence dropped RoboBrain 2.0, an open-source AI for humanoids/robots. It ingests multi-image and long videos as inputs and delivers capabilities like spatial perception, temporal perception, and scene reasoning. https://x.com/rowancheung/status/1934518213687029851

Paradromics Ready First Human Brain-Computer Interface (BCI) https://cannadelics.com/2025/06/05/paradromics-brain-computer-interface-implant/

1X showcased 1XWM, a ‘world model’ that simulates a realistic, interactive world around a virtual robot. With a few initial real-world frames and action trajectories, it simulates the result of those exact actions, including the physics of objects. https://x.com/rowancheung/status/1934881641224196183

AnimaX: Animating the Inanimate in 3D with Joint Video-Pose Diffusion Models https://anima-x.github.io/

Bernt Børnich says teleoperation data and synthetic data are like a crutch to bootstrap your way to robots learning in the real world. https://x.com/TheHumanoidHub/status/1933761018897052006

🚀 Exciting News! We’re thrilled to introduce OpenWBT — the open-source, whole-body VR teleoperation system designed for humanoids across various robot types & virtual/physical worlds! 🤖✨ 💡 Developed through collaboration between Galbot and Tsinghua University researchers! 🔥 https://x.com/GalbotRobotics/status/1932082760736190588

🚀 Hunyuan 3D 2.1 is here! The first fully open-source, production-ready PBR 3D generative model! ✅Cinema-grade visuals: PBR material synthesis brings leather, bronze, and more to life with stunning light interactions. ✅ Fully open-source: Model weights, training/inference https://x.com/TencentHunyuan/status/1933563360320385527

Hunyuan 3D 2.1 released – I like the idea of models like Text->3D because there really aren’t a lot of 3D models easily available to your exact specifications for 3D printing, and 3D modeling is hard. https://x.com/Teknium1/status/1935656421506654256

Tencent released Hunyuan 3D 2.1, an open-source model for generating 3D assets, including PBR materials. It can synthesize objects with cinematic textures and realism while covering their light interactions. Fully open-source with model weights and code. https://x.com/rowancheung/status/1934518092891086855

Create TikTok-Style Videos Faster With New AI Tools From TikTok Symphony | TikTok For Business Blog https://ads.tiktok.com/business/en-US/blog/tiktok-symphony-ai-tools?acq_banner_version=73758464

CLONE: Closed-Loop Whole-Body Humanoid Teleoperation for Long-Horizon Tasks https://humanoid-clone.github.io/

Today, Figure officially no longer has a Controls division. The entire Controls team is now part of Helix, helping to accelerate our AI roadmap. https://x.com/adcock_brett/status/1934641122565099919

Nvidia open-sourced GR00T N1.5 3B, a foundation model for humanoid reasoning skills. It’s available under a commercially permissive license with a full fine-tuning tutorial to use it with Hugging Face’s LeRobot SO-101 Arm. https://x.com/rowancheung/status/1933072152363762035

Can’t wait till this AI video tech is running in real time at 60 leather jackets per second https://x.com/bilawalsidhu/status/1933599865424261306