Image created with gemini-2.5-flash-image and claude-sonnet-4-5. Image prompt: A child wearing a chunky VR headset stands on an 80s suburban street at twilight lined with Halloween decorations and carved pumpkins, while translucent holographic ghosts and digital monsters overlay the scene in neon blue and purple, autumn leaves scattered on the pavement, warm porch lights glowing, cinematic composition with split reality effect showing both physical decorations and augmented digital layer.
Google just took another big step towards becoming ChatGPT for planet earth. I can’t overstate how important this is — geospatial AI commodified. Here’s what it can do: https://x.com/bilawalsidhu/status/1981566109863289028
Nvidia just released ChronoEdit-14B on Hugging Face. It enables physics-aware image editing and action-conditioned world simulation through temporal reasoning, distilling priors from a 14B-parameter pretrained video generative model and separating inference into (i) a video … https://x.com/_akhaliq/status/1983953896415604836
👀 Meet @NVIDIAAI Nemotron Nano 2 VL, now hosted on Nebius AI Studio – 10× higher throughput – Document + video intelligence – Open weights, open data – Ready for production Build multimodal assistants today → https://x.com/nebiusaistudio/status/1983243873317974318
NVIDIA Launches Open Models and Data to Accelerate AI Innovation | NVIDIA Blog https://blogs.nvidia.com/blog/open-models-data-ai/
A new open-source physics engine just dropped… and it could change how robots learn. Newton, built by @nvidia with support from @GoogleDeepMind and Disney Research, is now part of The Linux Foundation. It’s designed to bring precise, GPU-powered physics to robotics and … https://x.com/IlirAliu_/status/1982726852507521065
Tesla’s learned world simulator, a neural network-based system, tackles the challenge of evaluating autonomous driving and Optimus robot AI. Trained on a massive, curated dataset from Tesla’s fleet, it synthesizes future states (e.g., high-resolution, multi-camera video streams) https://x.com/TheHumanoidHub/status/1981802545594216845
You know Tesla? Yeah the world model company https://x.com/bilawalsidhu/status/1982804738899935481
You can capture reality with remarkable fidelity – here’s a 1:1 3D Gaussian splat perfectly aligned with reality. Once Meta layers in a scalable version of their codec avatar tech, this stuff is about to get trippy. Literally transcending time & space. https://x.com/bilawalsidhu/status/1982441732186022219
I like big splats and I cannot lie. PlayCanvas’s new LOD streaming system for 3D Gaussian Splatting is a big deal. It’s a key step towards splatting the world and making it explorable. Check out a live demo below in your browser. https://x.com/bilawalsidhu/status/1981353167918129354
Watching Old Delhi with a billion eyes (experimenting with generative AI + computer vision) https://x.com/bilawalsidhu/status/1983590151294243254
Just-in-time AI advertising is near. Here are the early signs. Even if these massive data center build-outs don’t amount to AGI, you best believe personalized ads will be generated for you just a few videos before you scroll to them on your feed. The economics practically … https://x.com/bilawalsidhu/status/1982130030760296944
Full circle moment. Six years of work on 3D maps and VR media — seeds planted years ago turn into this. The Google team is absolutely crushing it. https://x.com/bilawalsidhu/status/1981786698117001534
🚀Excited to team up with @NVIDIAAIDev to bring Nemotron Nano 2 VL to vLLM – a multimodal model powered by a hybrid Transformer-Mamba language backbone, built for video understanding and document intelligence✨ Full post here👇 https://x.com/vllm_project/status/1984334926972592193
We are so excited to be a launch partner for @nvidia Nemotron Nano 2 VL today and offer day-zero support for this highly accurate and efficient vision language model, alongside other models in the Nemotron family. To learn more, read our blog here https://x.com/basetenco/status/1983243273171845596
Got to say hi to @adcock_brett at the NVIDIA GTC pregame. Jensen’s keynote starts in one hour. https://x.com/TheHumanoidHub/status/1983187543349965082
People are sleeping on this release. NVIDIA blessed us with a new family of Nemotron RAG models 🔥 It comes with text retrievers, multimodal retrievers, as well as layout detectors, with a commercially permissive license 👏 https://x.com/mervenoyann/status/1984302303570960666
Nemotron Nano VL 12B V2 by @nvidia is now on Replicate. A 12B vision-language beast for document intelligence & video understanding. Handles up to 4 images or 1 video: extracts data from invoices, compares pics, summarizes clips, all in 10 languages! https://x.com/replicate/status/1983242266836890026
Brett Adcock at NVIDIA GTC: Solving general-purpose AI for humanoid robots is ten to a hundred times harder than making the humanoid robots. https://x.com/TheHumanoidHub/status/1983199160875790426
NVIDIA Isaac GR00T N open reasoning VLA models are now integrated into @huggingface’s LeRobot with the v0.4.0 release. 🤖 Making it easier than ever for the open-source robotics community to customize and deploy robot foundation models. 👉 https://x.com/NVIDIARobotics/status/1983564485588549657
Introducing Mika, the newest Grok Companion. Video made using Grok Imagine. https://x.com/xai/status/1981917247095685441
Introducing Odyssey-2: instant, interactive AI video https://odyssey.ml/introducing-odyssey-2
Everyone is going to be able to vibe code video games by the end of 2025 https://x.com/OfficialLoganK/status/1982222911231377696
It is interesting that building a consistent video across multiple generations turns out to be much easier than building consistent sound. This is across 6 Veo 3.1 clips. You can see how details get lost (the shape of the background building), but the visuals are generally good. Not the sound. https://x.com/emollick/status/1981162893631770892