AI News #86: Week Ending May 23, 2025 with 18 Executive Summaries, Top 93 Links, and 12 Helpful Visuals

May 24, 2025

About This Week’s Covers

This week’s covers celebrate my great friend Mike Bernstein, who introduced me to the Beastie Boys’ album “Paul’s Boutique” many years ago. I asked GPT-o3 to create a rubric I could use for batch producing derivative newsletter covers in the spirit of the Paul’s Boutique album cover.

The main cover is a spin-off with the Figure O1 robot peeking out from amongst the goods for sale. The awning has the Anthropic name and logo. Everything was created with GPT Image-1 other than some color correction in Photoshop and the addition of the Figure logo and the Anthropic logo. It’s still easier for me that way, for now.

GPT o3 created a rubric that allowed me to give it 43 one-word category names, and from those 43 single words, the API returned 43 Paul’s Boutique-inspired cover images via API.

I’ve included my favorite six of the covers below:

This Week By The Numbers

Total Organized Headlines: 660

This Week’s Executive Summaries

I’m behind two weeks because I’ve been enjoying spending time with my family.

May has been a breakthrough month of practical impacts. The release of Claude 4 Opus marks the first time my serious developer friends have told me that they see the potential to use AI for enterprise development. One friend told me that Opus is as good as a mid-career PhD-level computer programmer.

The building blocks for AI agents to make true impacts have been put in place. I think 2025 is going to be the year that people start losing jobs in entry-level operational customer service roles from health to finance to law.

Anthropic’s new models are out, performing as well as some of the best human coders, and can process entire enterprise code bases within their memory.

If agents weren’t enough, interfaces are officially beginning to be disrupted. OpenAI acquired Apple designer Jony Ive’s company for $6.5 billion. The rumor is they are going to build a device that has no graphical user interface.

Google has launched an Internet agent that can manage up to 10 web-based tasks at the same time, including things like booking flights or making restaurant reservations. Even with a web interface like a browser, if the system takes a search query and goes out throughout the entire Internet to accommodate requests, that implies it might as well be a non-graphical interface. You just ask Google what you want and Chrome goes and does it.

Google’s Internet agent can also save tasks to repeat on a regular basis. For example, if you want to check real estate listings, instead of opening the apps, Google will check them every day for you and even schedule viewings of properties.

Anthropic has created the standard for interfacing artificial intelligence with existing systems. Zapier is now connected to Anthropic’s system and can connect to and automate over 8,000 applications without requiring development expertise.

AI search tool Perplexity reports that people are booking hotels directly through its AI search platform more and more every day. Hotel advertising is Google’s second-largest category.

A lot of companies are forcing each other’s hands. Perplexity is forcing Google to make its search more agentic. Meanwhile, Google introducing app usage is forcing Apple to integrate AI into its App Store. For a long time, I’ve said that Apple is sitting on a large action model that could destroy the App Store overnight.

As artificial intelligence models grow stronger, both Anthropic and Google have strengthened their security systems and raised their threat levels as measured by proximity to AGI (aka better than humans).

OpenAI’s software engineering agent Codex continues to get rave reviews in its second week of public availability. It’s a genuine horse race between Google, Anthropic, and OpenAI. The consensus is Anthropic’s Claude 4 Opus is the best; however, there are strong cases to be made for Google’s 2.5 Gemini (and its insane multimodal context window) as well as OpenAI’s o3.

Google has launched a new video generation model called Veo that has gone viral. It’s getting the same amount of buzz that GPT images received. However, Google’s model costs $200 per month. Even with the cost, this is the first time that artificial intelligence has been able to create convincing deepfake videos that even experts have trouble detecting.

Google quietly launched a tool that allows you to describe a web interface, and it will build the entire code for you.

Microsoft launched a science agent which discovered a new material in a few hours. Not only was the new material discovered, but scientists were able to synthesize the compound in the laboratory.

Nvidia continues to quietly launch robot training simulation models. There is a very good chance Nvidia leapfrogs everybody in robot training. The interesting twist is that Nvidia open-sources all of their models (that I know of).

OpenAI has partnered with the United Arab Emirates to build the first international deployment of its massive Stargate AI infrastructure platform.

All this and a lot more in this week’s newsletter below. Remember, I’m a few weeks behind and this is for May 23, 2025.

Anthropic launches Claude Opus 4 and Sonnet 4 models
Anthropic released two AI models that excel at coding and complex reasoning tasks. Both models can now use tools like web search while thinking through problems and work with multiple tools simultaneously. The models address a key limitation in AI development by maintaining performance during extended work sessions, with Opus 4 capable of working continuously for several hours on complex projects. Claude Opus 4 leads global benchmarks for software engineering, scoring 72.5% on SWE-bench, while Claude Sonnet 4 improves significantly on the previous version with a 72.7% score on the same coding test. Companies like GitHub, Cursor, and Replit report that the models provide more precise code edits and better understanding of large codebases compared to previous versions.

Claude Sonnet 4 is much better at codebase understanding. Paired with recent improvements in Cursor, it’s SOTA on large codebases https://x.com/amanrsanger/status/1925679410142691606

Introducing Claude 4 \ Anthropic https://www.anthropic.com/news/claude-4

Introducing the next generation: Claude Opus 4 and Claude Sonnet 4. Claude Opus 4 is our most powerful model yet, and the world’s best coding model. Claude Sonnet 4 is a significant upgrade from its predecessor, delivering superior coding and reasoning. https://x.com/AnthropicAI/status/1925591505332576377

OpenAI acquires Jony Ive’s startup for $6.5 billion to build pocket-sized AI device
OpenAI purchased former Apple designer Jony Ive’s hardware company io for $6.5 billion to develop a new category of AI devices. The first product will be pocket-sized, screen-free, and contextually aware of users’ surroundings, designed to serve as a “third core device” alongside laptops and phones. Altman aims to release the device by late 2026 and expects it to reach 100 million units faster than any previous new product category. The collaboration began two years ago between Ive’s design collective LoveFrom and OpenAI, eventually growing into io with former Apple executives. Ive has criticized current smartphone interfaces and wants to address the “unintended consequences” of screen-heavy devices, while Altman believes the partnership could add $1 trillion to OpenAI’s value.

Details leak about Jony Ive’s new ‘screen-free’ OpenAI device | The Verge https://www.theverge.com/news/672357/openai-ai-device-sam-altman-jony-ive

OpenAI goes public with its acquisition of the Jony Ive device startup, which we first scooped here: https://x.com/steph_palazzolo/status/1925237718994350466

Sam and Jony introduce io | OpenAI https://openai.com/sam-and-jony/

thrilled to be partnering with jony, imo the greatest designer in the world. excited to try to create a new generation of AI-powered computers. https://x.com/sama/status/1925242282523103408

thrilled to be partnering with jony, imo the greatest designer in the world. excited to try to create a new generation of AI-powered computers. https://x.com/sama/status/1925242282523103408?s=46&t=b7l37rB6wtbyAh6ah1NpZQ

Google Is Preparing to Turn Web Browsers Into Personal Research Agents, Racing OpenAI
Google’s new internet agent can manage up to 10 web-based tasks simultaneously, like booking flights, ordering items, and making restaurant reservations. The system can now learn from repeated actions and remember user preferences. Google is integrating these capabilities into Search and Chrome, allowing the AI to automatically purchase tickets when available or schedule apartment tours based on specific criteria. This is a new trend: AI agents that can handle routine online tasks without supervision. For example, instead of manually checking real estate apps daily, Google’s agent can routinely filter listings, schedule viewings, and create comparison charts across multiple properties and present daily findings. In related news, OpenAI updated their web browsing agent, Operator, with the GPT latest reasoning model.

@demishassabis Our ultimate vision for the @GeminiApp is to transform it into a universal AI assistant — an AI that’s personal, powerful and proactive and one of our key milestones on the road to artificial general intelligence (AGI). https://x.com/Google/status/1924882592236540085

@googlechrome @GeminiApp Agent Mode in @GeminiApp is a new experiment coming soon to subscribers that lets you delegate complex planning and tasks to Gemini to get stuff done. https://x.com/Google/status/1924877422761005352

@googlechrome @GeminiApp Say you’re looking for an apartment. Instead of you filtering through real-estate apps daily, Gemini can find listings that fit your criteria, schedule tours and add them to your calendar, and create side-by-side comparisons. https://x.com/Google/status/1924877428997939563

1️⃣ It’s better at multitasking & can tackle up to 10 tasks simultaneously 2️⃣ It’s using Teach and Repeat — you can show it a task once and & it learns a plan for future tasks 3️⃣ Its computer use capabilities are coming to the Gemini API this summer #GoogleIO”” / X https://x.com/Google/status/1924876543479714026

1/ Agent Mode is coming to the Gemini App Google introduced Agent Mode, enabling Gemini to autonomously execute multi-step tasks. 🏠 Find 2-bedroom apartments under $2,000 on Zillow and schedule a tour. 🍽️ Book a 7 PM reservation at the best-rated Thai restaurant nearby. https://x.com/AtomSilverman/status/1924960409062342676

3 updates to Project Mariner, our research prototype that can interact with the web and get things done:”” / X https://x.com/Google/status/1924876541147709897

5/ Integrating into AI Mode in Search and Chrome Mariner’s capabilities are being embedded into Google’s Search and Chrome 🎟️ Automatically purchase event tickets when they become available. 🛎️ Reserve tables at restaurants based on user preferences.”” / X https://x.com/AtomSilverman/status/1924960909686128810

Agent Mode in the @Geminiapp can help you get more done across the web – coming to subscribers soon. Plus a new multi-tasking version of Project Mariner is now available to Google AI Ultra subscribers in the US, and computer use capabilities are coming to the Gemini API. https://x.com/sundarpichai/status/1924909900033122466

Explore the future of AI agents with Project Mariner – a research prototype that can help you get things done, like: ⛱️Planning trips 🛒Ordering items 🍽️Making reservations ✅All with your oversight #GoogleIO https://x.com/GoogleDeepMind/status/1924936861983609194

Since releasing Project Mariner in December, we’ve been working with trusted testers to gather feedback. Today, we’re announcing updates, including: 📈Managing up to 10 tasks at once 🧑‍🏫Ability to learn and repeat tasks 🌐Easy access via a web app 1️⃣All in one dashboard https://x.com/GoogleDeepMind/status/1924936866597335107

We’re introducing Gemini in @GoogleChrome, rolling out first to Google AI Pro subscribers in the U.S. It’s your AI browsing assistant to help you get things done. Type or talk to help you quickly understand content or get tasks done using the context of your current webpage — https://x.com/Google/status/1924892719739973640

We’re starting to integrate agentic capabilities throughout our products, including @GoogleChrome, Search, and @GeminiApp. #GoogleIO”” / X https://x.com/Google/status/1924877381853978790

Operator 🤝 OpenAI o3 Operator in ChatGPT has been updated with our latest reasoning model. https://x.com/OpenAI/status/1925963018791178732

@GeminiApp Today, AI Overviews have more than 1.5 billion users every month. That means Google Search is bringing generative AI to more people than any other product in the world. https://x.com/Google/status/1924874920871526830

AI Mode in Google Search: Updates from Google I/O 2025 https://blog.google/products/search/google-search-ai-mode-update/#ai-mode-search

AI Mode is Search transformed with Gemini 2.5 at the core. It’s our most powerful AI search, with more advanced reasoning and multimodality, and the ability to go deeper through follow-up questions and helpful links to the web. Here’s a peek at what’s coming soon to AI Mode: 🧵”” / X https://x.com/Google/status/1924886582479171927

Last year, we introduced Project Astra: a research prototype exploring capabilities for a universal AI assistant. 🤝 We’ve been making it even better with improved voice output, memory and computer control – so it can be more personalized and proactive. Take a look ↓ #GoogleIO https://x.com/GoogleDeepMind/status/1924883244459425797

Google announces AI Ultra subscription plan https://blog.google/products/google-one/google-ai-ultra/

Anthropic adds tools for building AI agents, integrating with Zapier and thousands of automation tools
Anthropic released API capabilities that let developers build AI agents that execute code, connect to thousands of apps, and handle files. The code execution tool allows Claude to run Python programs and create data visualizations during conversations, while the MCP connector links Claude to over 8,000 applications through Zapier without requiring custom programming. A new Files API helps Claude work with documents, and extended prompt caching reduces costs by storing frequently used prompts for up to an hour. The MCP integration enables Claude to perform actions like tracking PayPal transactions, creating Google Docs, generating images through DALL-E, and sending automated email briefings that combine calendar and weather data.

🚀 Introducing Claude x Zapier MCP! Now Claude can work with 8,000+ apps and 30,000+ pre-built actions through Zapier—no custom integrations needed. The Model Context Protocol creates a secure bridge so Claude doesn’t just understand what you want, but actually takes action. ⚡”” / X https://x.com/zapier/status/1918007000363122829

Claude’s new MCP integration is INSANE! Connect PayPal, Gmail, and 7,000+ apps directly in chat. Top 5 things this update enables: • Financial tracking with PayPal transaction data • Daily briefings with calendar, email & weather • Research reports delivered to your https://x.com/JulianGoldieSEO/status/1919285937730617821

MCP is a true gift for AI developers! I recorded a video to show you how to connect AI agents to third-party tools that require authentication using MCP. If you’ve tried, you know this is as painful as it gets. Imagine your agent connects to GitHub, Gmail, and Slack. That’s https://x.com/svpino/status/1917194874497171510

Today, we’re announcing four new capabilities on the Anthropic API to help developers build more powerful AI agents. A code execution tool, MCP connector, Files API, and extended prompt caching: https://x.com/AnthropicAI/status/1925633118104416587

we’re launching a suite of new tools and new features today. a new MCP tool, code interpreter tool, and image generation tool – plus background mode, reasoning summaries, and file search within reasoning models https://x.com/stevenheidel/status/1925209984180380101

Zapier just gave your AI the keys their new MCP lets agents trigger 30,000+ actions across 8,000 apps with real access, not hacks. https://x.com/ProductHunt/status/1920550567153397977

Google’s Gemini 2.5 Pro tops coding leaderboards with new multimodal reasoning mode
Google released Gemini 2.5 Pro, which leads the WebDev Arena coding benchmark with a score of 1415. The model included a mode called “Deep Think,” a reasoning system that explores multiple solutions simultaneously before responding, improving performance on complex math and programming problems. Most importantly, Gemini 2.5 Pro, is multimodal, meaning it can process different types of content together – text, images, audio, and video/

2.5 Pro is now the best model for coding and learning. With a strong ELO score of 1415, it’s topping the WebDev Arena leaderboard – and it incorporates LearnLM, our family of models fine-tuned for learning built with educational experts. #GoogleIO https://x.com/GoogleDeepMind/status/1924878252172353851

Deep Think in 2.5 Pro has landed. 🤯 It’s a new enhanced reasoning mode using our research in parallel thinking techniques – meaning it explores multiple hypotheses before responding. This enables it to handle incredibly complex math and coding problems more effectively. https://x.com/GoogleDeepMind/status/1924881598102839373

Gemini 2.5 can now organize vast amounts of multimodal information, reason about everything it sees, and write code to simulate anything. ↓ #GoogleIO https://x.com/GoogleDeepMind/status/1924878250255516126

And starting this week, Gemini 2.5, our most intelligent model, is coming to Search, for both AI Mode and AI Overviews in the U.S. https://x.com/Google/status/1924885533609599187

Perplexity’s AI hotel booking feature gains traction
Perplexity reports that users are increasingly booking hotels directly through its AI search platform, representing a potential challenge to Google’s advertising business. Hotel bookings are Google’s second-largest advertising category. The feature allows users to complete reservations without leaving Perplexity’s interface, bypassing traditional booking sites and search ads. This development could signal a broader shift in how people discover and book travel, with AI platforms potentially capturing revenue that traditionally flows through search advertising and online travel agencies.

hotel bookings natively on perplexity are quietly growing. it’s one of the under-the-radar features we have right now that has a massive potential to disrupt the ad industry. google’s second biggest adword category i think.”” / X https://x.com/AravSrinivas/status/1923124236618469735

Anthropic activates enhanced safety measures for Claude Opus 4
Anthropic implemented AI Safety Level 3 protections for Claude Opus 4 as a precautionary measure, even though the company hasn’t determined if the model definitively requires these safeguards. The enhanced security makes it harder to steal the AI model’s core programming, while deployment restrictions specifically target potential misuse for developing chemical, biological, radiological, and nuclear weapons. These measures represent a step up from the baseline protections used for previous Claude models, designed to defend against sophisticated attackers, and should only cause Claude to refuse queries on a very narrow set of dangerous topics while Anthropic continues evaluating the model’s risk level.

Activating AI Safety Level 3 Protections \ Anthropic https://www.anthropic.com/news/activating-asl3-protections

Google strengthens Gemini 2.5 against hidden malicious instructions
Google published research on protecting its Gemini 2.5 AI models from “indirect prompt injection” attacks, where hackers embed malicious commands in emails, documents, or websites that trick AI agents into sharing private data or performing unauthorized actions. The company developed security measures to help Gemini distinguish between legitimate user requests and hidden manipulative instructions that could exploit AI agents with access to personal information like calendars, emails, and external websites. This security work addresses a growing cybersecurity risk as AI agents become more capable of accessing and acting on personal data across multiple platforms.

Advancing Gemini’s security safeguards – Google DeepMind https://deepmind.google/discover/blog/advancing-geminis-security-safeguards/

OpenAI launches Codex software engineering agent to rave reviews
OpenAI released Codex, a cloud-based AI agent that handles multiple coding tasks simultaneously in isolated sandbox environments preloaded with your repository. Powered by codex-1 (a version of OpenAI o3 optimized for software engineering), Codex can write features, fix bugs, answer codebase questions, and create pull requests while running tests until they pass. The agent works through ChatGPT’s sidebar, takes 1-30 minutes per task depending on complexity, and provides detailed logs of its actions for verification. Available now to ChatGPT Pro, Team, and Enterprise users, with Plus users getting access soon.

💥 Today we’re launching Codex: a software agent that operates in the cloud and can do many tasks in parallel. In the future most code will be written by AI; society will be accelerated because of it. This is a research preview, but we’re very excited to see what you build.”” / X https://x.com/kevinweil/status/1923403368849871329

A user can then review code suggestions made by the agent. It can show a preview of the test it ran. And the user can then create and push a PR. https://x.com/omarsar0/status/1923398310812918226

Asked Codex to internationalize our app and localize it into Japanese before bed last night. Woke up to complete Japanese support this morning 🇯🇵 What would have taken a few days was done overnight. https://x.com/kn/status/1923819590209220908

Best way to use Codex is to create PRs liberally. Feels like a very different way of writing code!”” / X https://x.com/gdb/status/1923530399692750978

BREAKING: OpenAI announces research preview of Codex in ChatGPT Next-level coding agent within ChatGPT. Pay attention, devs and non-devs! Here is all you need to know: https://x.com/omarsar0/status/1923394424622522394

Codex CLI keeps getting better. In the long run, I expect that “”local”” (e.g. Codex CLI) and “”remote”” (e.g. Codex) coding agents will come together — imagine their combination as a remote coworker who can also look over your shoulder. Excited for the future of programming!”” / X https://x.com/gdb/status/1923492615959478375

Codex for bug finding;”” / X https://x.com/gdb/status/1923509728124207587

Codex for code migrations:”” / X https://x.com/gdb/status/1923802002582319516

Codex for internationalization:”” / X https://x.com/gdb/status/1923897958954872903

Codex is powered by a new model called codex-1. OpenAI claims this is their best coding model to date.”” / X https://x.com/omarsar0/status/1923394428766437684

Introducing Codex | OpenAI https://openai.com/index/introducing-codex/

It seems there was a lot of alignment work that went into Codex. This led to the agent being able to produce cleaner patches and overall code that aligns with a coder’s preference, standards, and instructions.”” / X https://x.com/omarsar0/status/1923403068944580739

Just released Codex, a software engineering agent that can work on many tasks in parallel. It runs on its own cloud-based compute infrastructure, and can fix bugs, answer questions about your code, run tests, etc.. Feels like a step towards the future of software engineering.”” / X https://x.com/gdb/status/1923401740986052770

OpenAI introduces Codex: A cloud-based software engineering agent that can work on many tasks in parallel, powered by codex-1. https://x.com/iScienceLuvr/status/1923394959916273820

OpenAI shared that their engineers use Codex for the following: – refactoring – renaming – writing tests – scaffolding new features – wiring components – fixing bugs – drafting documentation They are noticing new habits emerging from offloading background work to the agents.”” / X https://x.com/omarsar0/status/1923403070806929877

The Codex agent can analyze a codebase and find areas of improvement. It suggest improvements, Then you can schedule tasks right within ChatGPT. https://x.com/omarsar0/status/1923394967008874889

today we are introducing codex. it is a software engineering agent that runs in the cloud and does tasks for you, like writing a new feature of fixing a bug. you can run many tasks in parallel.”” / X https://x.com/sama/status/1923398457747787817

What’s being released? A remote software engineering agent, Codex. Can run many coding tasks in parallel. Available for Pro, Enterprise, and Team ChatGPT users starting today.”” / X https://x.com/omarsar0/status/1923394427071918310

Microsoft declares 2025 the year of AI agents, OpenAI’s Greg Brockman agrees
Microsoft announced new AI agent capabilities across GitHub, Azure, and Windows platforms, positioning itself for what executives call “the age of AI agents.” The company revealed that 15 million developers use GitHub Copilot and over 230,000 organizations including 90% of Fortune 500 companies have built AI agents through Copilot Studio. New features include GitHub’s first asynchronous coding agent, Windows AI Foundry for local AI development, and access to over 1,900 AI models through Azure AI Foundry including xAI’s Grok models. Microsoft also introduced enterprise security tools like Entra Agent ID to manage AI agents and announced support for Anthopic’s Model Context Protocol across all its platforms, plus a new web standard called NLWeb designed to help websites work better with AI agents.

Microsoft Build 2025: The age of AI agents and building the open agentic web – The Official Microsoft Blog https://blogs.microsoft.com/blog/2025/05/19/microsoft-build-2025-the-age-of-ai-agents-and-building-the-open-agentic-web/

2025 is the year of agents.”” / X https://x.com/gdb/status/1923541152508281329

Google launches Veo 3 video generator with built-in audio (goes viral)
Google released Veo 3, an AI video creation tool that generates clips with synchronized audio including dialogue, sound effects, and background noise. The system can create videos from text prompts that capture realistic physics like water movement and snow crunching, while also handling complex scenarios like Broadway musicals or 1970s children’s shows with specific visual styles. Google combined Veo 3 with its Imagen and Gemini AI systems into a filmmaking platform called Flow, which helps users build scenes and save reusable elements like characters and locations, available now to Google AI Pro and Ultra subscribers in the United States.

Check out Veo 3 🔥🔥🔥 sound on 🔊”” / X https://x.com/_tim_brooks/status/1924895946967810234

From capturing real-world physics – like the noise and movement of water, or the look and sound of walking in snow – to lip syncing, Veo 3 is great at understanding what you want. You can tell a short story in your prompt, and the model gives you back a clip that brings it to https://x.com/GoogleDeepMind/status/1924893531300077675

In Flow, AI can help make clips from prompts, build them into scenes and then save your ingredients – such as characters, locations, objects or styles – all in one place. ↓ https://x.com/GoogleDeepMind/status/1924896542848090276

Say goodbye to the silent era of video generation: Introducing Veo 3 — with native audio generation. 🗣️ Quality is up from Veo 2, and now you can add dialogue between characters, sound effects and background noise. Veo 3 is available now in the @GeminiApp for Google AI Ultra https://x.com/Google/status/1924893837295546851

Veo 3 is available today for Ultra subscribers in the United States in the @GeminiApp. Find out more about where you can use it ↓ https://x.com/GoogleDeepMind/status/1924893533787332996

Veo 3, our SOTA video generation model, has native audio generation and is absolutely mindblowing. For filmmakers + creatives, we’re combining the best of Veo, Imagen and Gemini into a new filmmaking tool called Flow. Ready today for Google AI Pro and Ultra plan subscribers. https://x.com/sundarpichai/status/1924909490081825195

Veo 3: “”a big broadway musical about garlic bread, with elaborate costumes and a sondheim-like vibe”” https://x.com/emollick/status/1925065546082484418

Veo 3: “”a scene from an unnerving 1970s childrens show with live action puppets and Lovecraftian overtones singing a song”” https://x.com/emollick/status/1925047195738218505

Video, meet audio. 🎥🤝🔊 With Veo 3, our new state-of-the-art generative video model, you can add soundtracks to clips you make. Create talking characters, include sound effects, and more while developing videos in a range of cinematic styles. 🧵 https://x.com/GoogleDeepMind/status/1924893528062140417

Google labs launches stitch for UI design coding
Google Labs released Stitch, an AI tool that generates user interface designs and code for developers and designers. The tool aims to streamline the process of creating web and app interfaces by automatically producing both visual designs and the corresponding code. Stitch represents Google’s entry into the growing market of AI-powered design tools, competing with other platforms that help non-technical users build functional interfaces without extensive coding knowledge.

Google made an AI coding tool specifically for UI design | The Verge https://www.theverge.com/news/670773/google-labs-stitch-ui-coding-design-tool

Meet Stitch by @GoogleLabs, the easiest and fastest product to generate great designs and UIs. 🧵 https://x.com/stitchbygoogle/status/1924947794034622614

Google launches live camera sharing for Gemini AI assistant
Google rolled out Gemini Live’s camera and screen sharing features to Android users and began the iOS rollout, allowing people to have real-time conversations with the AI while showing it what they’re seeing through their phone’s camera or screen. The feature lets users get instant help with tasks like identifying objects, solving math problems, or getting explanations about what’s displayed on their screen during voice conversations with Gemini. This expands Gemini’s capabilities beyond text and voice to include visual context, making the AI assistant more interactive and practical for everyday situations where users need immediate visual assistance.

Gemini Live camera and screen sharing in @GeminiApp is available on @Android and rolling out to iOS, starting today. https://x.com/Google/status/1924876301573239061

Google is bringing real-time AI camera sharing to Search | The Verge https://www.theverge.com/news/670597/google-search-live-ai-mode-gemini-ios

Microsoft AI agents discover new cooling material in hours
Microsoft’s AI Discovery platform helped researchers find a chemical-free immersion coolant by using multiple AI agents working together, identifying a material previously unknown to scientists in just hours rather than the typical months-long research process. The team, led by John Link, successfully synthesized the discovered compound in their laboratory, demonstrating how AI can accelerate materials discovery by rapidly analyzing vast combinations of chemical properties and suggesting viable alternatives to harmful forever chemicals used in electronics cooling.

Mindblowing demo: John Link led a team of AI agents to discover a forever-chemical-free immersion coolant using Microsoft Discovery. The agents surfaced a material “”unknown to humans”” — in hours, not months — and the team synthesized it in the lab. “”It’s literally very cool.”” https://x.com/vitrupo/status/1924568771353841999

Nvidia launches Isaac GR00T N1.5 robotics foundation model
Nvidia released Isaac GR00T N1.5, an updated open-source foundation model designed to teach humanoid robots reasoning and physical skills, along with new tools for generating training data without relying on human demonstrations. The company introduced two synthetic data generation systems: Isaac GR00T-Mimic, which uses physics simulations to expand human motion data, and GR00T-Dreams, which creates new motion videos from single images using AI video generation. Nvidia also launched Cosmos-Reason1-7B, the first reasoning model specifically built for robotics that understands physical common sense and can make appropriate decisions for robotic actions.

Jensen just announced NVIDIA’s Isaac GR00T N1.5 and GR00T-Dreams blueprint at COMPUTEX 2025: ⦿ Isaac GR00T N1.5 is the first update to NVIDIA’s open, generalized, fully customizable foundation model for humanoid reasoning and skills. ⦿ “Human demonstrations aren’t scalable — https://x.com/TheHumanoidHub/status/1924332201862414495

JUST IN🚨: Nvidia open sourced Physical AI models reasoning models that understand physical common sense and generate appropriate embodied decisions 👀 https://x.com/reach_vb/status/1924525937443365193

NVIDIA offers two blueprints for synthetic data generation: ⦿ Isaac GR00T-Mimic: Uses a physics engine to amplify human motion data in simulation. ⦿ GR00T-Dreams (announced yesterday): Fine-tunes a video generation AI model to create new motion videos from a single image. https://x.com/TheHumanoidHub/status/1924538121687073167

NVIDIA released new vision reasoning model for robotics: Cosmos-Reason1-7B 🤖 > first reasoning model for robotics 😱 > based on Qwen 2.5-VL-7B, use with @huggingface transformers or vLLM 🤗 > comes with SFT & alignment dataset and a new benchmark 👏 https://x.com/mervenoyann/status/1924817927561183498

Nvidia develops reasoning AI without training examples
Nvidia created Nemotron-Research-Tool-N1, a family of AI models that learned to reason and use tools through rule-based reinforcement learning rather than being trained on existing reasoning examples. The approach eliminates the need for reasoning supervision or distillation from other models, allowing the AI to develop problem-solving skills independently. This method could reduce training costs and dependencies on human-generated reasoning data while still producing models capable of complex logical thinking and tool use.

Tool-using LLMs can learn to reason—without reasoning traces. 🔥 We present Nemotron-Research-Tool-N1, a family of tool-using reasoning LLMs trained entirely via rule-based reinforcement learning—no reasoning supervision, no distillation. 📄 Paper: https://x.com/ShaokunZhang1/status/1922105694167433501

Google develops Gemini Diffusion text generation model
Google created Gemini Diffusion, a text generation model that works differently from traditional AI by refining random noise into coherent text through multiple steps rather than predicting words directly. This approach helps the model excel at coding and math problems by allowing it to iterate and improve solutions quickly, similar to how image generation models like DALL-E create pictures by gradually removing noise from random pixels.

We’ve developed Gemini Diffusion: our state-of-the-art text diffusion model. Instead of predicting text directly, it learns to generate outputs by refining noise, step-by-step. This helps it excel at coding and math, where it can iterate over solutions quickly. #GoogleIO https://x.com/GoogleDeepMind/status/1924888095448825893

OpenAI launches first international AI infrastructure partnership with UAE
OpenAI partnered with the United Arab Emirates to build the first international deployment of its Stargate AI infrastructure platform, creating a 1-gigawatt computing cluster in Abu Dhabi expected to go live in 2026. The partnership, developed with U.S. government coordination and support from President Trump, includes companies like G42, Oracle, NVIDIA, Cisco, and SoftBank, while the UAE will invest in U.S. Stargate infrastructure and become the first country to enable ChatGPT nationwide. This marks the launch of OpenAI’s “OpenAI for Countries” initiative, which aims to help governments build sovereign AI capabilities in coordination with the U.S., with plans for 10 partnerships across key regions and the potential to serve up to half the world’s population within a 2,000-mile radius of the UAE facility.

Introducing Stargate UAE | OpenAI https://openai.com/index/introducing-stargate-uae/

12 AI Visuals and Charts: Week Ending May 23, 2025

StackOverflow questions over time, source SEDE; sadface, lunch has been eaten https://x.com/marcgravell/status/1922922817143660783

Microsoft just revealed its next big AI bets at Build 2025. I sat down with Microsoft CEO Satya Nadella to unpack: -Microsoft’s vision for the “agentic web” -Why your next job might be AI agent manager -What happens when 95% of code is AI-generated Timestamps: 0:00 Building https://x.com/rowancheung/status/1925228045415416297

Eric Schmidt on why AI is actually *underhyped* We dove into the big questions at TED — AGI, China, open source, and human agency. One of the rare leaders who can straddle the worlds of tech, policy, and Burning Man. Check out the full talk below 👇 https://x.com/bilawalsidhu/status/1923085454397616533

Code with Claude Opening Keynote – YouTube https://www.youtube.com/watch?v=EvtPBaaykdo

Mastering Claude Code in 30 minutes – YouTube https://www.youtube.com/watch?v=6eBSHbLKuN0

what are we doing here folks https://x.com/catehall/status/1925631571605827944

Report: Spring 2025 AI Model Usage Trends – Poe https://poe.com/blog/spring-2025-ai-model-usage-trends

Most important plot from IO today — AI usage is skyrocketing. This is real. https://x.com/natolambert/status/1924916998133129716?s=46

Google I/O 2025: Listen to a podcast recap https://blog.google/technology/ai/release-notes-podcast-io-2025/

o3, give me a screenshot from that one safety video in the 1950s about how to care for your giant killer squid”” “”great, a second shot from when it all goes wrong for poor Timmy.”” “”show me that one moment people always talk about”” (this was all the first output from the AI) https://x.com/emollick/status/1923861434271740005

I thought my “”otter on a plane using Wifi”” benchmark was already done, but Veo 3 adds higher quality… and sound Here is “”an otter on a plane using wifi on their phone, the flight attendant asks them “”do you want a drink ?”” and the otter nods”” (One of the first set of 4 videos) https://x.com/emollick/status/1925018308182524391

New video of fully autonomous Optimus. Performing many new tasks – instructed via natural language. All the tasks are done by a single neural net – learned directly from human videos. “”This breakthrough allows us to learn new tasks much faster.”” https://x.com/TheHumanoidHub/status/1925052725714419889