Image created with gemini-3.1-flash-image-preview with claude-opus-4.7. Image prompt: Using the provided reference image, preserve the marigold orange backdrop, the seated woman’s closed-eyes smile and purple windbreaker, the tattooed singer’s red beanie and vest and his hand position and pose exactly, but replace the black handheld microphone with a sleek matte-black-and-white VR headset gripped by its strap and held to his mouth like a microphone with the visor facing forward, photographed with the same cinematic studio lighting, shallow depth of field, and seamless realism so the swap feels uncanny and natural. After generating the image, overlay the text “AR/VR” in the upper-left corner of the frame in large, bold, all-caps ITC Avant Garde Gothic Pro Medium (or a near-identical geometric sans-serif if unavailable), pure white (#FFFFFF), with no date, subtitle, drop shadow, or outline. The text should be substantial in scale — taking up a meaningful portion of the upper-left area — with comfortable margin from the top and left edges, set against the negative space of the orange backdrop so it does not overlap or obscure the singer, the seated woman, or the replaced object.

Long live bullet time. The future of sports is 4D Gaussian splatting.
https://x.com/bilawalsidhu/status/2042470014964396343

"Masked Depth Modeling for Spatial Perception" TL;DR: treats missing depth as a learning signal to reconstruct accurate 3D geometry from noisy RGB-D inputs, enabling robust perception in real-world conditions
https://x.com/Almorgand/status/2042639933194575985
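The core idea — supervising only where the sensor actually returned depth, so the model has to infer geometry in the holes — can be illustrated with a toy masked-reconstruction loss (a hedged numpy sketch of the general technique, not the paper's method; all names are illustrative):

```python
import numpy as np

def masked_depth_loss(pred, raw_depth):
    """L1 loss computed only where the sensor returned valid depth.

    Invalid pixels (zeros/NaNs from sensor dropout) are excluded from
    supervision, so the network must infer geometry there rather than
    fit noise.
    """
    valid = np.isfinite(raw_depth) & (raw_depth > 0)  # sensor holes -> False
    if not valid.any():
        return 0.0
    return float(np.abs(pred[valid] - raw_depth[valid]).mean())

# Toy example: a 2x2 depth map with one dropout pixel (0 = missing).
raw = np.array([[1.0, 0.0], [2.0, 3.0]])
pred = np.array([[1.1, 9.9], [2.0, 3.0]])  # the wild guess at the hole costs nothing
print(masked_depth_loss(pred, raw))
```

The masked pixel contributes no gradient; only the three valid pixels are averaged.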

A 5-min video from an Insta360 camera turned into a big-ass 3D Gaussian splat using the new Niantic Scaniverse app. It's kinda wild how well we can model the complexity of reality, and run it in real time in a browser at 100 fps
https://x.com/bilawalsidhu/status/2042069658363117654

📢📢A double launch today! We’re releasing a paper analyzing the rapidly growing trend of “open-world evaluations” for measuring frontier AI capabilities. We’re also launching a new project, CRUX (Collaborative Research for Updating AI eXpectations), an effort to regularly
https://x.com/random_walker/status/2044841045867778365

Meet @HappyOysterAI from Alibaba ATH, an open‑ended world model built for real‑time world creation and interaction. Be part of the first wave and see what you can build. 🌍✨ #AlibabaAI #HappyOyster
https://x.com/AlibabaGroup/status/2044634595937882394?s=20

Most Physical AI models recognize patterns. They don’t understand the world. That’s why they fail on edge cases. BADAS 2.0 is a V-JEPA2 world model trained by @getnexar on real-world videos. We used the model to find what it didn’t understand, then trained on that. It
https://x.com/eranshir/status/2044759951340388611

Must-read research of the week ▪️ Neural Computers ▪️ The Illusion of Stochasticity in LLMs ▪️ Learning is Forgetting: LLM Training as Lossy Compression ▪️ A Frame is Worth One Token: Efficient Generative World Modeling with Delta Tokens ▪️ INSPATIO-WORLD: A Real-Time 4D World
https://x.com/TheTuringPost/status/2044113565771280775

We’re open-sourcing HY-World 2.0, a multimodal world model that generates, reconstructs, and simulates interactive *3D worlds* from text, images, and videos. Outputs can be integrated into game engines and embodied simulation pipelines. Key highlights: 🔹 One-click world
https://x.com/TencentHunyuan/status/2044604754836505076?s=20

"Scal3R: Scalable Test-Time Training for Large-Scale 3D Reconstruction" TL;DR: scalable test-time training with global memory modules enables accurate kilometer-scale 3D reconstruction from long RGB video sequences
https://x.com/Almorgand/status/2044468554754412564

"Selfi: Self Improving Reconstruction Engine via 3D Geometric Feature Alignment" TL;DR: improves 3D reconstruction by aligning features across views using self-distilled geometry-aware representations
https://x.com/Almorgand/status/2042631239601930681

Spark 2.0 is here! 🚀 We’re redefining what’s possible on the web with a streamable LoD system for 3D Gaussian Splatting. Built on Three.js, you can now stream massive 100M+ splat worlds to any device from mobile to VR using WebGL2. All open-source. Dive into the tech 👇
https://x.com/sparkjsdev/status/2044090505982816449
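The streaming trick is classic level-of-detail: pick a chunk's resolution from its screen-space footprint so distant splats cost less bandwidth and fill rate. A minimal distance-based LoD selector (a hedged TypeScript sketch; `SplatChunk`, the 256 px threshold, and everything else here are hypothetical, not Spark's actual API):

```typescript
// Hypothetical chunk descriptor: each successive LoD level halves the splat count.
interface SplatChunk {
  center: [number, number, number];
  radius: number;     // bounding-sphere radius in world units
  lodLevels: number;  // number of available levels; 0 = full resolution
}

// Pick an LoD from projected size: chunks subtending a small screen
// angle stream a coarser (higher-index) level.
function selectLod(
  chunk: SplatChunk,
  camPos: [number, number, number],
  fovY: number,
  screenHeightPx: number
): number {
  const dx = chunk.center[0] - camPos[0];
  const dy = chunk.center[1] - camPos[1];
  const dz = chunk.center[2] - camPos[2];
  const dist = Math.max(Math.hypot(dx, dy, dz), 1e-6);
  // Approximate projected height of the bounding sphere in pixels.
  const px = (chunk.radius / dist) * (screenHeightPx / Math.tan(fovY / 2));
  // Each halving of on-screen size drops one LoD level (full res at >= 256 px).
  const lod = Math.floor(Math.log2(Math.max(256 / px, 1)));
  return Math.min(lod, chunk.lodLevels - 1);
}

const chunk: SplatChunk = { center: [0, 0, -10], radius: 1, lodLevels: 5 };
console.log(selectLod(chunk, [0, 0, 0], Math.PI / 3, 1080)); // nearby -> level 0
```

A real streamer would also prioritize fetches by that same projected size, so the chunks you're looking at resolve first.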

Static 3D generation isn’t enough. We need assets ready for animation. Our new #SIGGRAPH work, AniGen, takes a single image and generates the 3D shape, skeleton, and skinning weights all at once. Code is fully open-sourced! Kudos to @KyrieIr31012755 and @VastAIResearch 🧵(1/4)
https://x.com/yanpei_cao/status/2044094818872377720
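"Skinning weights" here means the per-vertex weights used in linear blend skinning: each deformed vertex is a weighted sum of its position under every bone's transform, v' = Σᵢ wᵢ Tᵢ v. A tiny numpy sketch of LBS for orientation (illustrative of the standard formulation, not AniGen's representation):

```python
import numpy as np

def linear_blend_skinning(verts, weights, bone_mats):
    """Deform rest-pose vertices by blending per-bone rigid transforms.

    verts:     (V, 3) rest-pose positions
    weights:   (V, B) skinning weights, each row summing to 1
    bone_mats: (B, 4, 4) homogeneous bone transforms
    """
    v_h = np.concatenate([verts, np.ones((len(verts), 1))], axis=1)  # (V, 4)
    per_bone = np.einsum('bij,vj->vbi', bone_mats, v_h)              # (V, B, 4)
    blended = np.einsum('vb,vbi->vi', weights, per_bone)             # (V, 4)
    return blended[:, :3]

# One vertex influenced equally by a static bone and a bone translated +2 in x.
verts = np.array([[1.0, 0.0, 0.0]])
weights = np.array([[0.5, 0.5]])
identity = np.eye(4)
shifted = np.eye(4); shifted[0, 3] = 2.0
print(linear_blend_skinning(verts, weights, np.stack([identity, shifted])))
# -> [[2. 0. 0.]] (halfway between the two bones' results)
```

Generating shape, skeleton, and these weights jointly is the point: a mesh without weights and a skeleton can't be animated.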

The future of sports is immersive. We can already reproduce entire games in 3D, track every player down to their pose and heartbeat – but barely any of that makes it to your living room. Here's everything you need to know about 3D sports tech.
https://x.com/bilawalsidhu/status/2043085376349442077

[2604.13036] Lyra 2.0: Explorable Generative 3D Worlds
https://arxiv.org/abs/2604.13036
