Ethan B. Holland

Over 54,900 manually organized AI links and counting

Video: AI News Week Ending 02/06/2026

February 6, 2026

Image created with gemini-3.1-flash-image-preview with claude-sonnet-4-5. Image prompt: Flat cartoon illustration of a cute coral-red lobster mascot holding a movie clapperboard against a dark charcoal background, white speech bubble above containing the word VIDEO in bold sans-serif font, minimal filmstrip perforations on edges, simple geometric shapes, kawaii style, high contrast, clean outlines, landscape format

New milestone: we trained a robot foundation model on a world model backbone, and enabled zero-shot, open-world prompting capability for new verbs, nouns, and environments. If the world model can "dream" the right future in pixels, then the robot can execute well in motors. We… https://x.com/DrJimFan/status/2019112603637920237

Eleven v3 — Most Expressive AI Voice Model https://elevenlabs.io/v3

ElevenLabs CEO: Voice is the next interface for AI | TechCrunch https://techcrunch.com/2026/02/05/elevenlabs-ceo-voice-is-the-next-interface-for-ai/

ElevenLabs raises $500M Series D at $11B valuation https://elevenlabs.io/blog/series-d

Genie is out of the bottle. Google is rolling out access to Project Genie: – Design your world and character using text and visual prompts. – The Genie 3 world model generates the environment in real-time as you move through it. Only available for Google AI Ultra subscribers in… https://x.com/TheHumanoidHub/status/2016987944809353260

getting meta over here… prompted genie 3 to generate a zoom call and i can take control of the cursor and take it off the screen into the world. lmao. https://x.com/bilawalsidhu/status/2017346682116084079

Giving the world’s first photograph, the View from the Window at Le Gras, from 1822, to Genie 3. https://x.com/emollick/status/2018494862178316725

Google Genie just let me walk through 1900s San Francisco. I gave it one black-and-white photo. It gave me back a city — explorable from the sky or the street. This is the closest thing we have to a time machine. https://x.com/bilawalsidhu/status/2017045841836405035

I tested Google’s world model Genie 3… Then DeepMind told me everything 00:00 – Intro & Authoring Workflow 00:27 – Genie 3 Playtesting & Demos 05:33 – Interview w/ Google DeepMind (Genie 3 co-lead @jparkerholder and Sr. PM Diego Rivas) 06:54 – Wildest emergent behaviors https://x.com/bilawalsidhu/status/2018487746508018051

Much debate over Genie vs 3D engines. You can have both – the control of 3D scene graphs + the creativity of generative ai. Wrote this in 2024 breaking down the vision. The models are almost there. Now just imagine if Unreal / Unity productized this. https://x.com/bilawalsidhu/status/2018119240612536587

One of the wildest emergent capabilities of Genie 3 is that maps actually work. As I walk around the forest, the GPS display updates its heading in real time. Remember. There is no game engine here. This is an AI hallucinating a working navigational instrument purely from next… https://x.com/bilawalsidhu/status/2017252036719657193

Today is the day. Google DeepMind just shipped playable reality: https://t.co/ct43xo4G43 I went hands-on with their Genie 3 world model that spawns interactive, 3D simulations from simple text. We’ve moved past watching videos; we’re now stepping *inside* them. Stick around to… https://x.com/bilawalsidhu/status/2016925493552206113

Took an old photo of a WWI battlecruiser, gave it to Genie 3, and prompted it to let me play as a torpedo boat at the Battle of Jutland. Considering this is a research preview, astonishing how fast this has come. An AI dynamically generating the world with no game engine… https://x.com/emollick/status/2018198584508760108

📢 New paper from GEAR team @NVIDIARobotics We released DreamZero, a World Action Model that turns video world models into zero-shot robot policies. Built on a pretrained video diffusion backbone, it jointly predicts future video frames and actions. 🌐 https://x.com/yukez/status/2019096072690553112

Introducing NVIDIA Cosmos Policy for Advanced Robot Control https://huggingface.co/blog/nvidia/cosmos-policy-for-robot-control

Waypoint-1.1 is live, and we’re kicking off weekly updates. This release crosses an important line from impressive short rollouts to local, real-time world models that are coherent, controllable, and playable. New model. Better prompting. Smoother rollouts. https://x.com/overworld_ai/status/2019109415023178208

Planning is one of the most exciting uses of world models, but existing planners struggle on long horizons. Introducing GRASP: a fast gradient-based planner for world models that outperforms prior methods on long-horizon tasks. Two key ideas: 1. jointly optimize actions and… https://x.com/_amirbar/status/2019903658792497482

tl;dr New planner for world models! GRASP: gradient-based, stochastic, parallelized. Long range planning for world models has always been an issue. 0th order methods like CEM/MPPI dominate, but have degrading performance at longer contexts or higher-dimensional actions. We… https://x.com/michaelpsenka/status/2019870377032503595

Robbyant has announced LingBot-VLA: an open-source Vision-Language-Action model – Pretrained on ~20k hours of real-world dual-arm robot data – Strong generalization across 9 embodiments – Improves consistently with more data – Claims outperformance over π₀.₅, GR00T N1.6 &… https://x.com/TheHumanoidHub/status/2017337216054575513

World Model meets robot policy! Robbyant’s LingBot-VA: unifies video world modeling and robotic policy learning. – A single model generates both future video and the actions to make it real. – Long-term memory enables long-horizon tasks. – Claims significant outperformance over… https://x.com/TheHumanoidHub/status/2017638555741552672

self-driving <as a 2D robot with a low-dim action space that focused mostly on avoidance rather than interaction> will reach real-world impact faster than anything else. the really cool part is that the world model isn’t just about videos; it’s about modeling continuous,… https://x.com/sainingxie/status/2019841784990351381

Accelerating Creation, Powered by Roblox’s Cube Foundation Model | Roblox https://about.roblox.com/newsroom/2026/02/accelerating-creation-powered-roblox-cube-foundation-model

🚀 Introducing the Kling 3.0 Model: Everyone a Director. It’s Time. An all-in-one creative engine that enables truly native multimodal creation. – Superb Consistency: Your characters and elements, always locked in. – Flexible Video Production: Create 15s clips with precise… https://x.com/Kling_ai/status/2019064918960668819?s=20

JUST-DUB-IT: Video Dubbing via Joint Audio-Visual Diffusion https://justdubit.github.io/

Kling 3.0 is here. Improved detail, character references and native audio is here. The absolute standout feature is Custom Multishot. Take control of your outputs by prompting for individual shots for up to 15 seconds. Fantastic release from @Kling_ai! https://x.com/jerrod_lew/status/2019099988429795740

Introducing the Artificial Analysis Video with Audio Arena! Compare video models with native audio generation including Veo 3.1, Grok Imagine, Sora 2, and Kling 2.6 Pro. Since Google’s Veo 3 launched last May as the first major video model with native audio generation, many… https://x.com/ArtificialAnlys/status/2019132516897288501

one side tangent from the @yitayml pod I am still thinking about is how people still underestimate the potential of World Models based on moving around in pretty 3D worlds. @ylecun and @jacob_d_kahn showed you can have world models in text and code. currently editing a BANGER… https://x.com/swyx/status/2019605135689937405

BREAKING: @xAI’s Grok-Imagine-Video now #1 in Video Arena! For the first time, Grok-Imagine-Video-720p takes the top spot on the Image-to-Video leaderboard, overtaking Google’s Veo 3.1 while being 5x cheaper. Its 480p version released a few days ago ranks #4. Huge congrats to… https://x.com/arena/status/2019204821551837665

Grok Imagine rank 1 https://x.com/elonmusk/status/2019164163906629852

Introducing Grok Imagine 1.0, our biggest leap yet. 1.0 unlocks 10-second videos, 720p resolution, and dramatically better audio. Imagine has generated 1.245 billion videos in the last 30 days alone. Try it now: https://x.com/xai/status/2018164753810764061?s=20

Playing as Godot, finally arriving. Just as Beckett intended, thanks to AI. https://x.com/emollick/status/2018213227503534572