Image created with Flux Pro v1.1 Ultra. Image prompt: photorealistic still image of a middle-aged man standing behind a woman, woman covering part of her face with her hand, man looking over her shoulder, both illuminated with warm stadium jumbotron lighting, natural skin tones, subtle lens flare, shallow depth of field, exact color temperature of a live event projection, both wearing VR headsets with “AR/VR” labels on the straps, cinematic realism –no text, captions, watermarks
AlphaEarth Foundations helps map our planet in unprecedented detail – Google DeepMind https://deepmind.google/discover/blog/alphaearth-foundations-helps-map-our-planet-in-unprecedented-detail/
Google just took a big step toward building ChatGPT for Earth. AlphaEarth Foundations does something clever: instead of drowning in petabytes of Earth observation data, it creates compact summaries of every 10x10 m square on Earth by fusing optical, radar, LiDAR, and climate data. https://x.com/bilawalsidhu/status/1950580970907648234
We're plugging ViTPose into Basketball AI. According to @NBA rules, a player is considered to be in the paint only if both feet are inside the paint. Notebook: https://x.com/skalskip92/status/1950231824933982428
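The both-feet rule above reduces to a simple geometric check. A minimal sketch, assuming ankle keypoints from a pose model like ViTPose have already been projected into court coordinates and the paint is an axis-aligned rectangle (all names and coordinates here are illustrative, not the notebook's actual code):

```python
# Check the NBA "both feet in the paint" rule from pose keypoints.
# Assumes ankle keypoints (x, y) are already in court coordinates (feet).

def point_in_rect(point, rect):
    """rect = (x_min, y_min, x_max, y_max) in court coordinates."""
    x, y = point
    x_min, y_min, x_max, y_max = rect
    return x_min <= x <= x_max and y_min <= y <= y_max

def player_in_paint(left_ankle, right_ankle, paint_rect):
    """A player counts as 'in the paint' only if BOTH feet are inside it."""
    return point_in_rect(left_ankle, paint_rect) and point_in_rect(right_ankle, paint_rect)

# The NBA key is 16 ft wide and 19 ft deep; on a 50 ft wide court with the
# baseline as y = 0, a centered paint spans x = 17..33, y = 0..19.
PAINT = (17.0, 0.0, 33.0, 19.0)

print(player_in_paint((20.0, 5.0), (25.0, 10.0), PAINT))  # both feet inside -> True
print(player_in_paint((20.0, 5.0), (35.0, 10.0), PAINT))  # one foot outside -> False
```

In practice the hard part is the homography from image pixels to court coordinates; once the ankles are in court space, the rule itself is two rectangle tests.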
What player is that? In the upcoming supervision-0.27.0 release, you'll be able to freely control text position, including applying custom offsets from the detection box. Supervision annotators are now so advanced you can literally use them to create full visual content. Link: https://x.com/skalskip92/status/1950984077617799534
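The custom-offset idea boils down to picking an anchor point on the detection box and nudging it by (dx, dy). Since the actual supervision-0.27.0 parameter names aren't confirmed here, this sketch shows only the underlying geometry with illustrative names:

```python
# Illustrative geometry for placing a label at a custom offset from a
# detection box. The real supervision API may name things differently.

def label_anchor(box, position="top_left", offset=(0, 0)):
    """box = (x1, y1, x2, y2) in pixels; returns the (x, y) text anchor."""
    x1, y1, x2, y2 = box
    dx, dy = offset
    anchors = {
        "top_left": (x1, y1),
        "top_center": ((x1 + x2) / 2, y1),
        "bottom_left": (x1, y2),
        "center": ((x1 + x2) / 2, (y1 + y2) / 2),
    }
    ax, ay = anchors[position]
    return (ax + dx, ay + dy)

# Put the label 10 px above the box's top-left corner:
print(label_anchor((100, 50, 200, 150), "top_left", offset=(0, -10)))  # (100, 40)
```

Free offsets like this are what let annotators double as a general overlay/compositing tool rather than just a debugging aid.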
Google just discovered a powerful emergent capability in Veo 3 – visually annotate your instructions on the start frame, and Veo just does it for you! Instead of iterating endlessly on the perfect prompt, defining complex spatial relationships in words, you can just draw it out https://x.com/bilawalsidhu/status/1948844167603310660
Alibaba to launch AI-powered smart glasses, creating a rival to Meta https://www.cnbc.com/2025/07/28/alibaba-ai-smart-glasses-creates-rival-to-meta.html
DreamVVT: Mastering Realistic Video Virtual Try-On in the Wild via a Stage-Wise Diffusion Transformer Framework https://virtu-lab.github.io/
[2507.19481] HairCUP: Hair Compositional Universal Prior for 3D Gaussian Avatars https://arxiv.org/abs/2507.19481
3DGS-LM: Faster Gaussian-Splatting Optimization with Levenberg-Marquardt https://lukashoel.github.io/3DGS-LM/
Avat3r: Large Animatable Gaussian Reconstruction Model for High-fidelity 3D Head Avatars https://tobias-kirschstein.github.io/avat3r/
Dens3R: A Foundation Model for 3D Geometry Prediction. TL;DR: feed-forward visual foundation model; takes unposed images as input and produces high-quality 3D pointmaps with unified dense geometric prediction; supports multi-view and multi-resolution inputs; robust dense prediction. https://x.com/Almorgand/status/1948760040891990066
geospatial intelligence https://x.com/bilawalsidhu/status/1949123293316800565
ICCV Poster ATLAS: Decoupling Skeletal and Shape Parameters for Expressive Parametric Human Modeling https://iccv.thecvf.com/virtual/2025/poster/2270
ICCV Poster Generative Modeling of Shape-Dependent Self-Contact Human Poses https://iccv.thecvf.com/virtual/2025/poster/876
Repurposing 2D Diffusion Models with Gaussian Atlas for 3D Generation https://arxiv.org/abs/2503.15877
ROMAN(Robust Object Map Alignment Anywhere) https://x.com/rsasaki0109/status/1948578963153891727
Splat and Replace: 3D Reconstruction with Repetitive Elements. TL;DR: scenes contain lots of repetition; handles poor coverage and occlusions via 3DGS + registration; evaluated on a variety of synthetic and real scenes with typical repetitive elements; improves novel view synthesis quality. https://x.com/Almorgand/status/1950134517584257190
The one place facial recognition AR glasses actually make sense is conferences. You're already wearing your PII on a lanyard. You're already using 5 different networking apps. AR just removes the friction. "Who in this room shares my interests?" "Is anyone from my network…" https://x.com/bilawalsidhu/status/1948466301522862369
walking around this big ass 3d gaussian splat of a forest i swear i’m touching grass https://x.com/bilawalsidhu/status/1949981457419497690
We’re thrilled to release & open-source Hunyuan3D World Model 1.0! This model enables you to generate immersive, explorable, and interactive 3D worlds from just a sentence or an image. It’s the industry’s first open-source 3D world generation model, compatible with CG pipelines https://x.com/TencentHunyuan/status/1949288986192834718
Demis Hassabis says Veo 3's surprising grasp of physics and human dynamics challenges the notion that embodied interaction is required to understand intuitive physics. Passive observation alone may build a world model – and that's what you'd need for a true AGI system. https://x.com/TheHumanoidHub/status/1948625791719465151
[2502.13133] AV-Flow: Transforming Text to Audio-Visual Human-like Interactions https://arxiv.org/abs/2502.13133
google earth renders work really well as inputs to runway aleph and luma’s v2v models. https://x.com/bilawalsidhu/status/1950338692465193415
New Unity package available: Reachy 2’s digital twin! – Gives immersive 3D experience through AR/VR – Fully controllable via Reachy 2 stack – Perfect for robotics courses & HRI research Explore robotics without the physical robot! https://x.com/pollenrobotics/status/1948335305246789925
RT @torchcompiled: they did diffusion on * checks notes * a house https://x.com/sedielem/status/1950190227475046877
Import your explosion stock footage or create a particle simulation. Position it next to the window in 3D space, matching the camera angle and perspective. Color grade the explosion to match your scene's lighting: cooler tones for overcast, warmer for sunny conditions. Adjust… https://x.com/c_valenzuelab/status/1950257984715571606
The way Wan2.2 5B handles I2V and timesteps is awesome! Each latent frame has its own denoising timestep. The first frame is just set as completely denoised. This means you should be able to do a sliding denoise timestep window and have infinite long-form video generation. https://x.com/ostrisai/status/1950129158618591646
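The sliding-window idea above can be sketched as a toy timestep schedule (not Wan2.2's actual code): each latent frame carries its own timestep, frames behind the window are finished, frames inside the window ramp from clean to noisy, and sliding the window forward extends the video indefinitely.

```python
# Toy sliding denoise-timestep schedule for long-form video generation.
# Per-frame timestep convention: 0.0 = fully denoised, 1.0 = pure noise.

def frame_timesteps(num_frames, window_start, window_size):
    ts = []
    for f in range(num_frames):
        if f < window_start:
            ts.append(0.0)  # already fully denoised, acts as context
        elif f < window_start + window_size:
            # Ramp from nearly clean to pure noise across the window.
            ts.append((f - window_start + 1) / window_size)
        else:
            ts.append(1.0)  # not started yet, still pure noise
    return ts

# A 4-frame window starting at frame 2, over 8 latent frames total:
print(frame_timesteps(8, 2, 4))  # [0.0, 0.0, 0.25, 0.5, 0.75, 1.0, 1.0, 1.0]
```

Each denoising step would advance every frame's timestep toward 0 and then slide `window_start` forward, so the finished frames keep conditioning the ones still being generated.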
Zuck would like you to be unable to think about superintelligence, and therefore has an incentive to redefine the word as meaning smart glasses. https://x.com/ESYudkowsky/status/1950685204684972495
11/ @davidlinjiahao built a game engine that turns any prompt into fully playable 3D assets using Grok 4. https://x.com/AtomSilverman/status/1949955594644795722
An interactive 3D simulation and visualization of a black hole, leveraging @threejs for rendering and custom GLSL shaders to achieve stunning details in the event horizon, starfield, and accretion disk. Coded using newly released @grok 4 from @xai https://x.com/techartist_/status/1943193486842323301
Finger painting an augmented reality filter in real time and embarrassing 😳 myself for the vibes. Hand gestures + voice controls. Coded with Grok 4 from @xAI using @threejs and mediapipe. https://x.com/renderfiction/status/1944731355666956291




