Image created with gemini-2.5-flash-image, with the prompt written by claude-sonnet-4-5. Image prompt: Photorealistic wide shot of six Ionic limestone columns on Mizzou quad at golden hour with classical entablature carved ‘AR/VR’ spanning the top, the space between columns shows subtle holographic wireframe overlay and depth shimmer suggesting portal between physical campus and augmented virtual layer, warm beige stone, red brick buildings, green lawn, clear blue sky, no people, cinematic natural lighting with long shadows.

EdgeTAM, a real-time segment tracker by Meta, is now in @huggingface transformers under an Apache-2.0 license 🔥 > 22x faster than SAM 2, runs at 16 FPS on an iPhone 15 Pro Max with no quantization > supports single/multiple/refined point prompting and bounding-box prompts https://x.com/mervenoyann/status/1986785795424788812
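For context, here is a minimal sketch of point-prompted segmentation through transformers, assuming EdgeTAM follows the SAM-style processor/model API; the Auto-class mapping and the checkpoint id are assumptions, so check the model card for official usage.

```python
# Hypothetical sketch: point-prompted segmentation with EdgeTAM via transformers.
# The checkpoint id and Auto-class mapping below are assumptions.
import torch
from PIL import Image
from transformers import AutoModelForMaskGeneration, AutoProcessor

processor = AutoProcessor.from_pretrained("facebook/EdgeTAM")  # assumed repo id
model = AutoModelForMaskGeneration.from_pretrained("facebook/EdgeTAM")

image = Image.open("frame.jpg")
# One positive click at pixel (450, 600); nesting follows the SAM-style API.
inputs = processor(image, input_points=[[[450, 600]]], return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Resize predicted masks back to the original image resolution.
masks = processor.image_processor.post_process_masks(
    outputs.pred_masks, inputs["original_sizes"], inputs["reshaped_input_sizes"]
)
```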

Any2Track: Galbot’s whole-body motion tracking system. https://x.com/TheHumanoidHub/status/1985219089015664676

Okay, coolest Adobe Sneak of the year award goes to Project Light Touch. Adobe calls this “spatial lighting mode” — interactively move your light source around within a 3D volume and voilà — your image is accurately relit. They’re probably using ML to infer a PBR + depth map. https://x.com/bilawalsidhu/status/1983982560054296843
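To make that inferred-depth relighting idea concrete, here is a toy sketch (emphatically not Adobe's pipeline): derive surface normals from a depth map, then shade the image with a movable point light under a simple Lambertian model.

```python
# Toy illustration of depth-based relighting: normals from depth gradients,
# then Lambertian shading from a movable point light. Not Adobe's method.
import numpy as np

def relight(albedo, depth, light_pos, ambient=0.15):
    """albedo: HxWx3 floats in [0,1]; depth: HxW floats; light_pos: (x, y, z)."""
    h, w = depth.shape
    # Normals from depth gradients: n ~ (-dz/dx, -dz/dy, 1), normalized.
    dzdy, dzdx = np.gradient(depth)
    n = np.dstack([-dzdx, -dzdy, np.ones_like(depth)])
    n /= np.linalg.norm(n, axis=2, keepdims=True)
    # Per-pixel direction toward the light source.
    xs, ys = np.meshgrid(np.arange(w), np.arange(h))
    to_light = np.dstack([light_pos[0] - xs, light_pos[1] - ys, light_pos[2] - depth])
    to_light /= np.linalg.norm(to_light, axis=2, keepdims=True)
    # Lambertian term max(0, n . l), plus an ambient floor.
    lambert = np.clip((n * to_light).sum(axis=2), 0.0, 1.0)
    return albedo * (ambient + (1 - ambient) * lambert)[..., None]

# "Dragging" the light through the 3D volume is just re-rendering with a new light_pos.
```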

NVIDIA World Simulation with Video Foundation Models for Physical AI https://huggingface.co/papers/2511.00062

Meta, Google, Apple – they’re all building AI replicas that capture your face, expressions, movements, personality. This goes way beyond Face ID. They’re basically creating a version of you that knows you better than you know yourself. The fidelity is remarkable too. We went… https://x.com/bilawalsidhu/status/1985398951407722901

New episode: Spencer Huang is a Product Lead for @NVIDIARobotics, focusing on open-source simulation frameworks, synthetic data generation, and robot autonomy. Spencer breaks down RL breakthroughs unlocked by hardware, open-source simulation, synthetic data flywheels… https://x.com/TheHumanoidHub/status/1984641886230102217

Apple Sr. Director Jeff Norris on how Vision Pro Personas went from looking like cadavers to lifelike avatars: “First thing I’d want to point out is that we changed the whole rendering approach of Personas to a completely new technique that’s based on Gaussian splats.” … https://x.com/bilawalsidhu/status/1984267919035957704

The secret’s out — Apple’s Persona avatars and 3D photos owe their quality boost to none other than Gaussian splatting. “In our meeting, Norris explains that Persona technology uses Gaussian splatting to create those surprisingly convincing 3D facial scans.” “Norris says Apple… https://x.com/bilawalsidhu/status/1983949413325402542
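If the primitive is unfamiliar: a Gaussian splat scene is a cloud of translucent Gaussians composited front-to-back. A purely illustrative 2D version of that compositing step, unrelated to Apple's actual renderer:

```python
# Toy 2D Gaussian splatting: accumulate sorted splats with alpha compositing.
import numpy as np

H, W = 128, 128
ys, xs = np.mgrid[0:H, 0:W]
img = np.zeros((H, W, 3))
transmittance = np.ones((H, W))  # how much light still passes each pixel

# Each splat: (center, isotropic sigma, RGB color, opacity), sorted near-to-far.
splats = [((64, 64), 12.0, (1.0, 0.6, 0.5), 0.9),
          ((80, 50), 18.0, (0.4, 0.5, 1.0), 0.7)]

for (cy, cx), sigma, color, opacity in splats:
    g = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))
    alpha = opacity * g
    img += (transmittance * alpha)[..., None] * np.array(color)
    transmittance *= 1 - alpha  # nearer splats occlude farther ones
```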

Excited to build models that simulate the world? 🌐🌐🌐 Our team is hiring!! See posts below for Research Scientists and Engineers. The roles are somewhat fluid, but rule of thumb is the RS typically require a PhD. Preference for London or Toronto. Apply here: 👇👇”” / X https://x.com/jparkerholder/status/1985729367469596843

(32) PercHead: Perceptual Head Model for Single-Image 3D Head Reconstruction & Editing – YouTube https://www.youtube.com/watch?v=4hFybgTk4kE

Instant Skinned Gaussian Avatars for Web, Mobile and VR Applications. TL;DR: animates Gaussian splats by leveraging parallel splat-wise processing to dynamically follow the underlying skinned mesh in real time while preserving high visual fidelity. https://x.com/Almorgand/status/1985377664526323886
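The core trick is easy to sketch: apply the mesh's linear-blend-skinning transform to every splat center in one vectorized pass. A minimal illustration, with the skinning weights and bone matrices as assumed inputs; the paper's actual pipeline may differ.

```python
# Minimal sketch of splat-wise linear blend skinning (LBS): pose every
# Gaussian center with the same blended bone transforms that deform the mesh.
import numpy as np

def skin_splats(centers, weights, bone_mats):
    """
    centers:   (N, 3)    splat positions in the rest pose
    weights:   (N, B)    per-splat skinning weights (rows sum to 1)
    bone_mats: (B, 4, 4) current bone transforms (rest -> posed)
    returns:   (N, 3)    posed splat positions
    """
    homo = np.concatenate([centers, np.ones((len(centers), 1))], axis=1)  # (N, 4)
    # Blend bone matrices per splat: (N, B) @ (B, 16) -> (N, 4, 4).
    blended = (weights @ bone_mats.reshape(len(bone_mats), 16)).reshape(-1, 4, 4)
    posed = np.einsum("nij,nj->ni", blended, homo)  # apply each blended matrix
    return posed[:, :3]
```

Because every splat is independent, the whole update is one batched matrix multiply, which is what makes a real-time web/mobile/VR implementation plausible.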

PercHead: Perceptual Head Model for Single-Image 3D Head Reconstruction & Editing https://antoniooroz.github.io/PercHead/

SAM 2++: Tracking Anything at Any Granularity. TL;DR: unifies video tracking across masks, boxes & points; uses task-specific prompts, a unified decoder, and a task-adaptive memory to track at any granularity. Backed by a new large-scale dataset. https://x.com/Almorgand/status/1986112315050369103

AI-video-to-robot transfer: generate AI video with Google Veo -> reconstruct the 3D motion -> train sim-to-real. https://x.com/TheHumanoidHub/status/1985802136568123421

We’re introducing EnvHub to LeRobot! Upload simulation environments to the @huggingface hub and load them in lerobot with one line of code: env = lerobot.make_env("username/my-env", trust_remote_code=True). Back in 2017, @OpenAI called on the community to build Gym… https://x.com/jadechoghari/status/1986482455235469710
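Building on that one-liner, here is a hypothetical rollout, assuming EnvHub environments expose the standard Gym reset/step API (the env id is the announcement's placeholder, not a real repo):

```python
# Hypothetical EnvHub rollout; assumes a Gym-style reset/step interface.
import lerobot

env = lerobot.make_env("username/my-env", trust_remote_code=True)
obs, info = env.reset(seed=0)
for _ in range(100):
    action = env.action_space.sample()  # random policy as a stand-in
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()
env.close()
```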

adding camera control to the list of things Qwen Image Edit is great at + with a specialized multi-angle LoRA it’s even better✨ > rotate the camera > tilt between bird’s-eye and worm’s-eye views > adjust lens (wide, close-up) of course we built a demo for it 🤝📹 https://x.com/linoy_tsaban/status/1986090375409533338
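A hedged sketch of that workflow in diffusers; the LoRA repo id below is a placeholder for whichever multi-angle LoRA you use, and the sampler settings are illustrative defaults.

```python
# Sketch: camera-control editing with Qwen-Image-Edit plus a multi-angle LoRA.
# The LoRA repo id is a placeholder, not the actual adapter from the post.
import torch
from diffusers import QwenImageEditPipeline
from PIL import Image

pipe = QwenImageEditPipeline.from_pretrained(
    "Qwen/Qwen-Image-Edit", torch_dtype=torch.bfloat16
).to("cuda")
pipe.load_lora_weights("your-username/multi-angle-lora")  # placeholder id

image = Image.open("scene.jpg")
out = pipe(
    image=image,
    prompt="rotate the camera 45 degrees to the left, same scene and character",
    num_inference_steps=30,
).images[0]
out.save("rotated.jpg")
```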

Qwen Image Multiple Angles LoRA is an exquisitely trained LoRA! 📐˚₊‧꒰ა Keep character and scenes consistent, and flies the camera around! Open source got there! One of the best LoRAs I’ve come across lately 🙌 https://x.com/multimodalart/status/1986174924038218087

Excited to introduce TWIST2, our next-generation humanoid data collection system. TWIST2 is portable (use anywhere, no MoCap), scalable (100+ demos in 15 mins), and holistic (unlock major whole-body human skills). Fully open-sourced: https://x.com/ZeYanjie/status/1986126096480587941

RayZer: A Self-supervised Large View Synthesis Model. TL;DR: modern self-supervised methods share the same principle of learning by predicting “missing” data. GPT predicts the missing next token, leveraging language’s sequential prior. Can we do the same for a self-supervised 3D model? https://x.com/Almorgand/status/1984295391219617863

Introducing Odyssey-2: instant, interactive AI video https://odyssey.ml/introducing-odyssey-2
