Image created with gemini-2.5-flash-image, with the prompt written by claude-sonnet-4-5. Image prompt: Cinematic wide shot of a metallic humanoid robot frozen mid-leap in an art deco Emerald City plaza, body segmented into color-coded technical regions showing joints and components, dramatic green and silver theatrical lighting with moody atmosphere, bold movie title text ROBOTS overlaid at top
For AI to be able to help humans in the physical world, we need systems that can understand and simulate the universe. To exponentially accelerate Luma’s path to Multimodal AGI we are building a 2GW compute cluster with Humain and we have raised a $900M Series C. I am incredibly… https://x.com/gravicle/status/1991202746871988680
Meta just dropped SAM 3D, but more interestingly, they basically cracked the 3D data bottleneck that’s been holding the field back for years. Manually creating or scanning 3D ground truth for the messy real world is basically impossible at scale. But what if you just have https://x.com/bilawalsidhu/status/1991237143898017854
Introducing SAM 3D: Powerful 3D Reconstruction for Physical World Images https://ai.meta.com/blog/sam-3d/
SAM 3D enables accurate 3D reconstruction from a single image, supporting real-world applications in editing, robotics, and interactive scene generation. Matt, a SAM 3D researcher, explains how the two-model design makes this possible for both people and complex environments. https://x.com/AIatMeta/status/1991605451809513685
Introducing SAM 3D, the newest addition to the SAM collection, bringing common sense 3D understanding of everyday images. SAM 3D includes two models: 🛋️ SAM 3D Objects for object and scene reconstruction 🧑‍🤝‍🧑 SAM 3D Body for human pose and shape estimation Both models achieve https://x.com/AIatMeta/status/1991184188402237877
We’re sharing model checkpoints, an evaluation benchmark, human body training data, and inference code with the community to support creative applications in fields like robotics, interactive media, science, sports medicine, and beyond. 🔗 SAM 3D Body: https://x.com/AIatMeta/status/1991184190323212661
Meta AI Demos https://aidemos.meta.com/segment-anything
Introducing Meta Segment Anything Model 3 and Segment Anything Playground https://ai.meta.com/blog/segment-anything-model-3/
SAM-3 is out on @huggingface! A big upgrade from SAM-2, and Meta finally added support for text prompts. Here I tried it out on @hazardeden10’s magical goal against @Arsenal using the text prompt “Chelsea player”. Works pretty well! https://x.com/NielsRogge/status/1991213874687758799
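For readers who want to try the text-prompt flow themselves, here is a minimal sketch assuming SAM 3 exposes a standard transformers processor/model pair; the `facebook/sam3` model id and the output field name are placeholders, not the confirmed API, so check the official release for the real entry points.

```python
# Minimal sketch of text-prompted segmentation, assuming SAM 3 ships with a
# transformers-style processor/model pair. Model id and output field names
# below are illustrative placeholders, not the confirmed SAM 3 API.
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModel

processor = AutoProcessor.from_pretrained("facebook/sam3")  # placeholder id
model = AutoModel.from_pretrained("facebook/sam3")          # placeholder id

image = Image.open("match_frame.jpg")

# SAM 3 accepts short noun phrases ("concept prompts") instead of clicks/boxes.
inputs = processor(images=image, text="Chelsea player", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Expected output: one mask per detected instance matching the phrase.
masks = outputs.pred_masks  # assumed shape: (num_instances, H, W)
```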
Collecting a high quality dataset with 4M unique phrases and 52M corresponding object masks helped SAM 3 achieve 2x the performance of baseline models. Kate, a researcher on SAM 3, explains how the data engine made this leap possible. 🔗 Read the SAM 3 research paper: https://x.com/AIatMeta/status/1991640180185317644
SAM3 video tracking is so good yesterday: collect data, train custom object detector, use tracker to estimate object motion – days today: track anything with text prompt – seconds https://x.com/skalskip92/status/1991232397686219032
We’ve partnered with @Roboflow to enable people to annotate data, fine-tune, and deploy SAM 3 for their particular needs. Try it here: https://x.com/AIatMeta/status/1991191530367799379
SAM 3 tackles a challenging problem in vision: unifying a model architecture for detection and tracking. Christoph, a researcher on SAM 3, shares how the team made it possible. 🔗 Read the SAM 3 research paper: https://x.com/AIatMeta/status/1991538570402934980
SAM 3 is an open-source model. You can use the models commercially, modify or fine-tune them, and keep ownership of your modifications. You do not need to release your source code. https://x.com/skalskip92/status/1991626755782877234
Today we are releasing & open-sourcing Segment Anything 3 (SAM 3). It is a state-of-the-art model for image & video segmentation, and builds upon the work of SAM & SAM 2. SAM3 will also power features in Edits, Meta AI, & Facebook Marketplace soon. https://x.com/alexandr_wang/status/1991198465628459494
Today we’re excited to unveil a new generation of Segment Anything Models: 1️⃣ SAM 3 enables detecting, segmenting and tracking of objects across images and videos, now with short text phrases and exemplar prompts. 🔗 Learn more about SAM 3: https://x.com/AIatMeta/status/1991178519557046380
Figure has shared numbers on its 11-month humanoid deployment at BMW’s Spartanburg factory. – Contributed to the production of 30,000+ cars (X3 vehicles). – 90,000+ parts loaded. – Ran 10-hour shifts, Monday to Friday. – Estimated 200+ miles of walking. – A single Figure 02 https://x.com/TheHumanoidHub/status/1991205599846269220
AI can now create AND explore 3D worlds. World models and agentic AI are on a collision course. World Labs is making world-building effortless. Google DeepMind’s SIMA-2 is making agency inside those worlds possible. Together, they hint at a new paradigm–AI that both creates https://x.com/bilawalsidhu/status/1990994808626950579
Google DeepMind has introduced SIMA 2, a reasoning, conversational AI agent for 3D worlds including games and generative world-model scenes. – Handles complex goals, explains steps, supports multilingual/emojis for collaborative play. – Adapts to real-time generated 3D worlds https://x.com/TheHumanoidHub/status/1989424462085960082
Google DeepMind’s SIMA 1 vs SIMA 2 The bitter lesson continues to be bittersweet https://x.com/bilawalsidhu/status/1989001120849735898
Gemini Robotics 1.5 features a separate reasoning engine (ER), but its VLA model is also capable of thinking due to interleaved reasoning tokens. The VLA is able to independently operate long autonomous sequences (15+ minutes) without aid from the ER/VLM. https://x.com/TheHumanoidHub/status/1989393094631199088
Genki Robotics, a new humanoid robotics startup, is headquartered in Tokyo, Japan. It’s founded by Andy Rubin, the founder of Android, who was a Google executive for nine years. Rubin led Google’s robotics division during 2013-14, which included Boston Dynamics, which Google https://x.com/TheHumanoidHub/status/1990313434567844000
NVIDIA researchers present SONIC, a generalist humanoid controller: It scales motion tracking on a single policy to achieve natural, robust whole-body movement. The scalable foundation avoids manual reward engineering and features a universal token space and kinematic planner to https://x.com/TheHumanoidHub/status/1989409669983736306
If you work with robotics, AV, or 3D vision, this update will save you months of engineering. Most models need complex engineering to get reliable 3D geometry. This one does it with a plain transformer. Depth Anything 3 is the new model from @BytedanceTalk that predicts stable, https://x.com/IlirAliu_/status/1989622721366446190
Depth Anything 3 proves most 3D vision research has been overengineering the problem. Vanilla DINOv2 transformer + depth-ray pairs crushes SOTA by 44% on pose, 25% on geometry. One approach for SOTA monocular depth, multi-view geometry, pose estimation, and novel view synthesis. https://x.com/bilawalsidhu/status/1989444908357488832
ByteDance-Seed/Depth-Anything-3: Depth Anything 3 https://github.com/ByteDance-Seed/Depth-Anything-3
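Depth Anything 2 is usable through the transformers depth-estimation pipeline; the sketch below assumes DA3 follows the same pattern. The model id is a placeholder, so check the repo above for the released checkpoints.

```python
# Minimal sketch of single-image depth inference, assuming Depth Anything 3
# follows the same transformers "depth-estimation" pipeline pattern as DA2.
# The model id is a placeholder -- see the DA3 repo for real checkpoints.
from transformers import pipeline
from PIL import Image

depth = pipeline("depth-estimation",
                 model="depth-anything/Depth-Anything-3")  # placeholder id
result = depth(Image.open("scene.jpg"))

depth_map = result["depth"]   # PIL image of per-pixel relative depth
depth_map.save("scene_depth.png")
```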
Depth Anything 3 is here! It’s a beefy one! https://x.com/Almorgand/status/1989370456131215514
After a year of team work, we’re thrilled to introduce Depth Anything 3 (DA3)! 🚀 Aiming for human-like spatial perception, DA3 extends monocular depth estimation to any-view scenarios, including single images, multi-view images, and video. In pursuit of minimal modeling, DA3 https://x.com/bingyikang/status/1989358267668336841
Damn. DeepMind’s generalist AI agent SIMA 2 evolved from basic instruction-following to actual reasoning companion. Uses vision and keyboard/mouse like a human player, works across dozens of games without touching game code. The robotics angle is obvious – if you can generalize https://x.com/bilawalsidhu/status/1988986033669828985
Hot new Mountain View startup awakens from stealth: Sunday Robotics. Co-founders: – Tony Zhao: Stanford PhD dropout, ex-DeepMind, Tesla, GoogleX – Cheng Chi: PhD from Stanford & Columbia Full reveal on Wednesday, let’s see the substance behind the hype https://x.com/TheHumanoidHub/status/1990603992997769467
Simulating the Visual World with Artificial Intelligence: A Roadmap https://world-model-roadmap.github.io/
Efficient drone service with fewer risks: No humans, no danger… Imagine a drone that goes into stinky, dark sewer tunnels so people don’t have to. This cool machine checks for problems without anyone crawling in. It’s like having a robot superhero for dangerous places! ✅ https://x.com/IlirAliu_/status/1990858504274653230
From informed sources: The initial closing of the funding round, totaling $400M, was completed last month. A second close for an additional $100M is in the works. This will bring the post-money valuation of Apptronik to about $5.5 billion. https://x.com/TheHumanoidHub/status/1990329014981198279
3 generations of humanoid robots at Figure Designed in-house, manufactured, and walked in 3 years https://x.com/adcock_brett/status/1989755336929284565
Stress tested https://x.com/adcock_brett/status/1990975878227243441
Uneven terrain https://x.com/adcock_brett/status/1990099767435915681
Over the last 6 months, our robots loaded over 90,000 parts to the BMW production line You can access the full write-up here: https://x.com/adcock_brett/status/1991178821848936630
Today I’m proud to share that our F.02 robots have contributed to the production of 30,000 cars at BMW We’re sharing our learnings from an 11-month real world deployment as the F.02 fleet retires https://x.com/adcock_brett/status/1991178640848007676
Excited to share our F.02 robots have contributed to the production of 30,000 cars at BMW Today we’re sharing our learnings from an 11-month real world deployment as the F.02 fleet retires https://x.com/Figure_robot/status/1991178512510951782
Brett Adcock accused UBTech of faking its “hundreds delivered” Walker S2 video. UBTech has published another “behind the scenes” video of the humanoid robot fleet saying, “They said it looked too perfect to be real. But perfection isn’t fabricated–it’s delicately engineered.” https://x.com/TheHumanoidHub/status/1989357328999813464
Look at the reflections on this bot, then compare them to the ones behind it. The bot in front is real – everything behind it is fake If you see a head unit reflecting a bunch of ceiling lights, that’s a giveaway it’s CGI https://x.com/adcock_brett/status/1989019691004883205
Physical Intelligence unveiled π*0.6 (Pi-Star 0.6): a vision-language-action (VLA) model upgraded via their new Recap method (RL with Experience & Corrections via Advantage-conditioned Policies). Recap combines three human-like learning stages: initial demonstrations, real-time https://x.com/TheHumanoidHub/status/1990585956269965743
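Recap’s full recipe is in the paper, but the idea named in the acronym, conditioning a policy on an advantage signal, can be sketched in a few lines of PyTorch. This is an illustrative toy, not Physical Intelligence’s implementation: label each logged action with whether it beat a value baseline, feed that label in as an extra input, and at deployment condition on “good” to sample above-average behavior.

```python
import torch
import torch.nn as nn

# Toy advantage-conditioned policy (the "A" in Recap), illustrative only.
class AdvantageConditionedPolicy(nn.Module):
    def __init__(self, obs_dim: int, act_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + 1, 256), nn.ReLU(), nn.Linear(256, act_dim)
        )

    def forward(self, obs, advantage_flag):
        # advantage_flag: (B, 1), 1.0 = action beat the value baseline
        return self.net(torch.cat([obs, advantage_flag], dim=-1))

def training_step(policy, opt, obs, act, returns, baseline):
    flag = (returns > baseline).float().unsqueeze(-1)  # 1 = above baseline
    loss = ((policy(obs, flag) - act) ** 2).mean()     # cloning w/ flag input
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# Usage sketch -- at deployment, always condition on the "good" flag:
# policy = AdvantageConditionedPolicy(obs_dim=32, act_dim=7)
# opt = torch.optim.Adam(policy.parameters(), lr=3e-4)
# action = policy(obs, torch.ones(obs.shape[0], 1))
```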
The Verge just called Matic “smarter, quieter, and [able to] get the job done.” Under the hood, Matic is running a live 3D voxel occupancy map from regular cameras. This is what “robot vac that cleans like a human” looks like. https://x.com/maticrobots/status/1988068395385057617
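Matic’s stack is proprietary, but the underlying idea of fusing camera-derived depth into a sparse voxel occupancy grid is standard. A toy sketch of that fusion step, under the assumption of a calibrated pinhole depth source:

```python
import numpy as np

# Toy voxel occupancy mapping (not Matic's implementation): back-project
# depth pixels to 3D points in the camera frame, then mark the voxels they
# land in as occupied. A real system would also transform into a world frame.
VOXEL = 0.05  # 5 cm voxels
grid = {}     # sparse grid: (i, j, k) -> occupancy evidence count

def integrate(depth, fx, fy, cx, cy):
    """Fuse one depth image (meters) from a pinhole camera into the grid."""
    h, w = depth.shape
    us, vs = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (us - cx) * z / fx
    y = (vs - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    pts = pts[pts[:, 2] > 0]  # drop invalid (zero-depth) pixels
    for idx in map(tuple, np.floor(pts / VOXEL).astype(int)):
        grid[idx] = grid.get(idx, 0) + 1

# Usage: integrate(np.random.uniform(0.5, 3.0, (480, 640)), 525, 525, 320, 240)
```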
TextOp: A real-time text-to-motion framework for humanoid robots. It allows users to instruct the robot using natural language and to modify commands on the fly, producing smooth, whole-body motions instantly. https://x.com/TheHumanoidHub/status/1990865719693844990
Bringing robots into the real world needs something we rarely talk about: human-scale data that captures how tasks actually unfold. Spatial AI just released SEA, a new egocentric dataset designed for training robot foundation models. It is focused on long sequences, natural https://x.com/IlirAliu_/status/1991214327777706072
Munich-based industrial robots maker Agile Robots has unveiled Agile ONE, its first humanoid robot. 174 cm [5′ 8″] tall, 69 kg [152 lb] weight, 20 kg [44 lb] payload. It has dexterous hands with integrated fingertip and force-torque sensors. Production is scheduled to begin in https://x.com/TheHumanoidHub/status/1991235395686920457
.@KWRoboticsAI is a robotics enthusiast and software engineer. Besides his day job, Kevin creates educational content on AI and robotics while also working on building his own humanoid robot. In this episode, we chat about the current state of humanoid robotics and break down https://x.com/TheHumanoidHub/status/1990864992904884474
Fascinating to see π*0.6 working on a pair of pants for 10 minutes. Comically confused at times but 100% committed to figuring it out. The robot is low-key sentient. https://x.com/TheHumanoidHub/status/1990972546414460942
UBTech exec Michael Tam, at a forum in Hong Kong, stated: – expects UBTech humanoid production to 10x to 5,000 units next year. – expects manufacturing cost to decline 20 to 30% annually. – “By roughly 2027–30, we believe the production cost can fall to under $20k.” https://x.com/TheHumanoidHub/status/1991228205580128482
Sunday Robotics unveiled its home robot, Memo, a wheeled robot with two arms and pincerlike hands. Training method: Sunday pays remote workers to perform household tasks wearing gloves that resemble Memo’s hands. The Mountain View-based company is focused on a full-stack https://x.com/TheHumanoidHub/status/1991219882818744427
How do you give a robot hand the strength to lift a pan and the precision to pinch a tiny nut with the same fingers? That is the core question behind Power to Precision, a new framework that tackles one of the oldest problems in dexterous robotics. Most multi-fingered hands can https://x.com/IlirAliu_/status/1991073280292262350
Apptronik is in advanced talks to secure a new funding round of at least $400 million, led by existing investor B Capital, potentially reaching a $5 billion pre-money valuation. This follows a previous Series A round in February 2025 where the Austin-based humanoid robotics https://x.com/TheHumanoidHub/status/1990296591476113471
Robot models get better only when humans feed them more demos. This one improves by learning from its own mistakes. pi*0.6 is a new VLA from @physical_int, that can refine its skills through real-world RL, not just teleop data. The team calls the method Recap, and from what I https://x.com/IlirAliu_/status/1990702714259472401
“If the only thing we rely on is teleoperation to get training data, it will take decades.” Sunday Robotics founder @tonyzzhao explains why Memo skips teleoperation, and instead trains on data from a human wearing “Skill Capture Gloves,” which capture higher-quality data at a https://x.com/tbpn/status/1991659658923352138
Most robots still need markers, checkerboards, or long calibration rituals just to know where their arms are. Now it works from raw images in seconds. roboreg is a markerless multi arm localization toolkit that plugs into ROS 2 and RViz. No special hardware. No custom setup. https://x.com/IlirAliu_/status/1990341378614636587
Almost no one talks about height control. This clip shows why it matters. A capacitive sensor on the cutting head keeps the nozzle only fractions of a millimeter above the metal sheet, even when the sheet is warped or vibrating. That tiny distance is everything for a clean cut. https://x.com/IlirAliu_/status/1990492791802114098
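The vendor’s controller is not public, but the basic closed loop the clip describes (a capacitive distance reading driving the Z axis toward a sub-millimeter setpoint) can be sketched as a simple PI loop; the setpoint, gains, and loop rate below are illustrative.

```python
import time

# Toy nozzle height controller (illustrative, not the vendor's code): a
# capacitive sensor reads the nozzle-to-sheet gap and a PI loop drives the
# Z axis to hold a sub-millimeter standoff as the sheet warps or vibrates.
SETPOINT_MM = 0.8   # illustrative standoff target
KP, KI = 4.0, 0.6   # illustrative gains; real machines tune per axis/material
DT = 0.001          # 1 kHz control loop

def height_loop(read_capacitive_mm, command_z_velocity_mm_s):
    """read_capacitive_mm() -> measured gap in mm (hardware callback).
    command_z_velocity_mm_s(v) -> drive Z axis, positive = up (callback)."""
    integral = 0.0
    while True:
        error = SETPOINT_MM - read_capacitive_mm()  # < 0: too high, move down
        integral += error * DT
        command_z_velocity_mm_s(KP * error + KI * integral)
        time.sleep(DT)
```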
After 18 months in stealth, dozens of prototypes, millions of real-home demonstrations, and one final all-nighter, we’re thrilled for you to say hello to Memo https://x.com/sundayrobotics/status/1991196264772387261
Robot tiles that let you walk forever in VR Researchers at the University of Tsukuba built a tile that moves under your feet as you walk. It uses sensors to read your gait and predict where your next step will land. ✅ Each tile slides into place before your foot touches down. https://x.com/IlirAliu_/status/1989407343591800921
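As a toy illustration of the prediction step (not the Tsukuba system’s actual gait model), extrapolating the mean stride from recent footfalls already gives a usable guess at where the next step lands:

```python
import numpy as np

# Toy next-footfall predictor: average the stride vector over recent steps
# and extrapolate, so a tile could be moved into place ahead of the foot.
def predict_next_footfall(footfalls: np.ndarray) -> np.ndarray:
    """footfalls: (N, 2) array of recent step positions, oldest first."""
    strides = np.diff(footfalls, axis=0)   # per-step displacement
    mean_stride = strides.mean(axis=0)     # assumes a steady gait
    return footfalls[-1] + mean_stride

steps = np.array([[0.0, 0.0], [0.3, 0.6], [0.0, 1.2], [0.3, 1.8]])
print(predict_next_footfall(steps))        # -> approximately [0.4, 2.4]
```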
Today, we present a step-change in robotic AI @sundayrobotics. Introducing ACT-1: A frontier robot foundation model trained on zero robot data. – Ultra long-horizon tasks – Zero-shot generalization – Advanced dexterity 🧵-> https://x.com/tonyzzhao/status/1991204839578300813
In this episode, I talk with @JonMSchwartz, co-founder and CEO of @Ultraroboticsco, about how to actually get robots deployed in warehouses: We walk through Jon’s journey from tearing apart electronics on a tiny New York City workbench to Harvey Mudd, early YC startups in 3D https://x.com/IlirAliu_/status/1988972117669343481
Robots correcting their own mistakes sounds great on paper. IMO the real question is: Can they do it on a busy production floor? The full video runs for more than 12 minutes of continuous autonomous behavior. You can see how the system handles real edge cases without pausing https://x.com/IlirAliu_/status/1988891081379889161
Robots that can learn real-world skills in minutes instead of months are not science fiction anymore. We are watching the shift happen in front of us. The team at @ToyotaResearch has been pushing this space for years. Now their work is leaving the lab and moving into real https://x.com/IlirAliu_/status/1989748656983941554
Robotics has an unwritten rule. Hardware first. Deployment later. A small team in Brooklyn ignored that rule and still shipped a working robot in months. Here is what they did differently 🧵: https://x.com/IlirAliu_/status/1990057889567043978
The team is crushing it lately. If you want the backstory and hear how co-founder @mehul thinks, here’s our conversation from last year: https://x.com/IlirAliu_/status/1990222029379756529
Ego-VCP: a learned ego-vision world model trained offline on demonstration-free random data that predicts dynamics in latent space for humanoid robots. Achieved robust real-time contact-rich planning on a real Unitree G1: bracing against walls, blocking flying objects, https://x.com/TheHumanoidHub/status/1990558687703019539
Chinese startup MindOn trained Unitree G1 to do house chores. “No speed up, no teleoperation” https://x.com/TheHumanoidHub/status/1989364406850044284