About This Week’s Covers

This week’s covers celebrate my great friend Mike Bernstein, who introduced me to the Beastie Boys’ album “Paul’s Boutique” many years ago. I asked GPT-o3 to create a rubric I could use for batch producing derivative newsletter covers in the spirit of the Paul’s Boutique album cover.

The original Paul’s Boutique cover

The main cover is a spin-off with the Figure O1 robot peeking out from amongst the goods for sale. The awning has the Anthropic name and logo. Everything was created with GPT Image-1 other than some color correction in Photoshop and the addition of the Figure logo and the Anthropic logo. It’s still easier for me that way, for now.

GPT o3 created a rubric that allowed me to give it 43 one-word category names, and from those 43 single words, the API returned 43 Paul’s Boutique-inspired cover images via API.

I’ve included my favorite six of the covers below:

This Week By The Numbers

Total Organized Headlines: 660

This Week’s Executive Summaries

I’m behind two weeks because I’ve been enjoying spending time with my family.

May has been a breakthrough month of practical impacts. The release of Claude 4 Opus marks the first time my serious developer friends have told me that they see the potential to use AI for enterprise development. One friend told me that Opus is as good as a mid-career PhD-level computer programmer.

The building blocks for AI agents to make true impacts have been put in place. I think 2025 is going to be the year that people start losing jobs in entry-level operational customer service roles from health to finance to law.

Anthropic’s new models are out, performing as well as some of the best human coders, and can process entire enterprise code bases within their memory.

If agents weren’t enough, interfaces are officially beginning to be disrupted. OpenAI acquired Apple designer Jony Ive’s company for $6.5 billion. The rumor is they are going to build a device that has no graphical user interface.

Google has launched an Internet agent that can manage up to 10 web-based tasks at the same time, including things like booking flights or making restaurant reservations. Even with a web interface like a browser, if the system takes a search query and goes out throughout the entire Internet to accommodate requests, that implies it might as well be a non-graphical interface. You just ask Google what you want and Chrome goes and does it.

Google’s Internet agent can also save tasks to repeat on a regular basis. For example, if you want to check real estate listings, instead of opening the apps, Google will check them every day for you and even schedule viewings of properties.

Anthropic has created the standard for interfacing artificial intelligence with existing systems. Zapier is now connected to Anthropic’s system and can connect to and automate over 8,000 applications without requiring development expertise.

AI search tool Perplexity reports that people are booking hotels directly through its AI search platform more and more every day. Hotel advertising is Google’s second-largest category.

A lot of companies are forcing each other’s hands. Perplexity is forcing Google to make its search more agentic. Meanwhile, Google introducing app usage is forcing Apple to integrate AI into its App Store. For a long time, I’ve said that Apple is sitting on a large action model that could destroy the App Store overnight.

As artificial intelligence models grow stronger, both Anthropic and Google have strengthened their security systems and raised their threat levels as measured by proximity to AGI (aka better than humans).

OpenAI’s software engineering agent Codex continues to get rave reviews in its second week of public availability. It’s a genuine horse race between Google, Anthropic, and OpenAI. The consensus is Anthropic’s Claude 4 Opus is the best; however, there are strong cases to be made for Google’s 2.5 Gemini (and its insane multimodal context window) as well as OpenAI’s o3.

Google has launched a new video generation model called Veo that has gone viral. It’s getting the same amount of buzz that GPT images received. However, Google’s model costs $200 per month. Even with the cost, this is the first time that artificial intelligence has been able to create convincing deepfake videos that even experts have trouble detecting.

Google quietly launched a tool that allows you to describe a web interface, and it will build the entire code for you.

Microsoft launched a science agent which discovered a new material in a few hours. Not only was the new material discovered, but scientists were able to synthesize the compound in the laboratory.

Nvidia continues to quietly launch robot training simulation models. There is a very good chance Nvidia leapfrogs everybody in robot training. The interesting twist is that Nvidia open-sources all of their models (that I know of).

OpenAI has partnered with the United Arab Emirates to build the first international deployment of its massive Stargate AI infrastructure platform.

All this and a lot more in this week’s newsletter below. Remember, I’m a few weeks behind and this is for May 23, 2025.

Anthropic launches Claude Opus 4 and Sonnet 4 models
Anthropic released two AI models that excel at coding and complex reasoning tasks. Both models can now use tools like web search while thinking through problems and work with multiple tools simultaneously. The models address a key limitation in AI development by maintaining performance during extended work sessions, with Opus 4 capable of working continuously for several hours on complex projects. Claude Opus 4 leads global benchmarks for software engineering, scoring 72.5% on SWE-bench, while Claude Sonnet 4 improves significantly on the previous version with a 72.7% score on the same coding test. Companies like GitHub, Cursor, and Replit report that the models provide more precise code edits and better understanding of large codebases compared to previous versions.

Claude Sonnet 4 is much better at codebase understanding. Paired with recent improvements in Cursor, it’s SOTA on large codebases https://x.com/amanrsanger/status/1925679410142691606

Introducing Claude 4 \ Anthropic https://www.anthropic.com/news/claude-4

Introducing the next generation: Claude Opus 4 and Claude Sonnet 4. Claude Opus 4 is our most powerful model yet, and the world’s best coding model. Claude Sonnet 4 is a significant upgrade from its predecessor, delivering superior coding and reasoning. https://x.com/AnthropicAI/status/1925591505332576377

OpenAI acquires Jony Ive’s startup for $6.5 billion to build pocket-sized AI device
OpenAI purchased former Apple designer Jony Ive’s hardware company io for $6.5 billion to develop a new category of AI devices. The first product will be pocket-sized, screen-free, and contextually aware of users’ surroundings, designed to serve as a “third core device” alongside laptops and phones. Altman aims to release the device by late 2026 and expects it to reach 100 million units faster than any previous new product category. The collaboration began two years ago between Ive’s design collective LoveFrom and OpenAI, eventually growing into io with former Apple executives. Ive has criticized current smartphone interfaces and wants to address the “unintended consequences” of screen-heavy devices, while Altman believes the partnership could add $1 trillion to OpenAI’s value.

Details leak about Jony Ive’s new ‘screen-free’ OpenAI device | The Verge https://www.theverge.com/news/672357/openai-ai-device-sam-altman-jony-ive

OpenAI goes public with its acquisition of the Jony Ive device startup, which we first scooped here: https://x.com/steph_palazzolo/status/1925237718994350466

Sam and Jony introduce io | OpenAI https://openai.com/sam-and-jony/

thrilled to be partnering with jony, imo the greatest designer in the world. excited to try to create a new generation of AI-powered computers. https://x.com/sama/status/1925242282523103408

thrilled to be partnering with jony, imo the greatest designer in the world. excited to try to create a new generation of AI-powered computers. https://x.com/sama/status/1925242282523103408?s=46&t=b7l37rB6wtbyAh6ah1NpZQ

Google Is Preparing to Turn Web Browsers Into Personal Research Agents, Racing OpenAI
Google’s new internet agent can manage up to 10 web-based tasks simultaneously, like booking flights, ordering items, and making restaurant reservations. The system can now learn from repeated actions and remember user preferences. Google is integrating these capabilities into Search and Chrome, allowing the AI to automatically purchase tickets when available or schedule apartment tours based on specific criteria. This is a new trend: AI agents that can handle routine online tasks without supervision. For example, instead of manually checking real estate apps daily, Google’s agent can routinely filter listings, schedule viewings, and create comparison charts across multiple properties and present daily findings. In related news, OpenAI updated their web browsing agent, Operator, with the GPT latest reasoning model.

@demishassabis Our ultimate vision for the @GeminiApp is to transform it into a universal AI assistant — an AI that’s personal, powerful and proactive and one of our key milestones on the road to artificial general intelligence (AGI). https://x.com/Google/status/1924882592236540085

@googlechrome @GeminiApp Agent Mode in @GeminiApp is a new experiment coming soon to subscribers that lets you delegate complex planning and tasks to Gemini to get stuff done. https://x.com/Google/status/1924877422761005352

@googlechrome @GeminiApp Say you’re looking for an apartment. Instead of you filtering through real-estate apps daily, Gemini can find listings that fit your criteria, schedule tours and add them to your calendar, and create side-by-side comparisons. https://x.com/Google/status/1924877428997939563

1️⃣ It’s better at multitasking & can tackle up to 10 tasks simultaneously 2️⃣ It’s using Teach and Repeat — you can show it a task once and & it learns a plan for future tasks 3️⃣ Its computer use capabilities are coming to the Gemini API this summer #GoogleIO”” / X https://x.com/Google/status/1924876543479714026

1/ Agent Mode is coming to the Gemini App  Google introduced Agent Mode, enabling Gemini to autonomously execute multi-step tasks. 🏠 Find 2-bedroom apartments under $2,000 on Zillow and schedule a tour.  🍽️ Book a 7 PM reservation at the best-rated Thai restaurant nearby. https://x.com/AtomSilverman/status/1924960409062342676

3 updates to Project Mariner, our research prototype that can interact with the web and get things done:”” / X https://x.com/Google/status/1924876541147709897

5/ Integrating into AI Mode in Search and Chrome  Mariner’s capabilities are being embedded into Google’s Search and Chrome 🎟️ Automatically purchase event tickets when they become available.  🛎️ Reserve tables at restaurants based on user preferences.”” / X https://x.com/AtomSilverman/status/1924960909686128810

Agent Mode in the @Geminiapp can help you get more done across the web – coming to subscribers soon.  Plus a new multi-tasking version of Project Mariner is now available to Google AI Ultra subscribers in the US, and computer use capabilities are coming to the Gemini API. https://x.com/sundarpichai/status/1924909900033122466

Explore the future of AI agents with Project Mariner – a research prototype that can help you get things done, like: ⛱️Planning trips 🛒Ordering items 🍽️Making reservations ✅All with your oversight #GoogleIO https://x.com/GoogleDeepMind/status/1924936861983609194

Since releasing Project Mariner in December, we’ve been working with trusted testers to gather feedback. Today, we’re announcing updates, including: 📈Managing up to 10 tasks at once 🧑‍🏫Ability to learn and repeat tasks 🌐Easy access via a web app 1️⃣All in one dashboard https://x.com/GoogleDeepMind/status/1924936866597335107

We’re introducing Gemini in @GoogleChrome, rolling out first to Google AI Pro subscribers in the U.S. It’s your AI browsing assistant to help you get things done. Type or talk to help you quickly understand content or get tasks done using the context of your current webpage — https://x.com/Google/status/1924892719739973640

We’re starting to integrate agentic capabilities throughout our products, including @GoogleChrome, Search, and @GeminiApp. #GoogleIO”” / X https://x.com/Google/status/1924877381853978790

Operator 🤝 OpenAI o3 Operator in ChatGPT has been updated with our latest reasoning model. https://x.com/OpenAI/status/1925963018791178732

@GeminiApp Today, AI Overviews have more than 1.5 billion users every month. That means Google Search is bringing generative AI to more people than any other product in the world. https://x.com/Google/status/1924874920871526830

AI Mode in Google Search: Updates from Google I/O 2025 https://blog.google/products/search/google-search-ai-mode-update/#ai-mode-search

AI Mode is Search transformed with Gemini 2.5 at the core. It’s our most powerful AI search, with more advanced reasoning and multimodality, and the ability to go deeper through follow-up questions and helpful links to the web. Here’s a peek at what’s coming soon to AI Mode: 🧵”” / X https://x.com/Google/status/1924886582479171927

Last year, we introduced Project Astra: a research prototype exploring capabilities for a universal AI assistant. 🤝 We’ve been making it even better with improved voice output, memory and computer control – so it can be more personalized and proactive. Take a look ↓ #GoogleIO https://x.com/GoogleDeepMind/status/1924883244459425797

Google announces AI Ultra subscription plan https://blog.google/products/google-one/google-ai-ultra/

Anthropic adds tools for building AI agents, integrating with Zapier and thousands of automation tools
Anthropic released API capabilities that let developers build AI agents that execute code, connect to thousands of apps, and handle files. The code execution tool allows Claude to run Python programs and create data visualizations during conversations, while the MCP connector links Claude to over 8,000 applications through Zapier without requiring custom programming. A new Files API helps Claude work with documents, and extended prompt caching reduces costs by storing frequently used prompts for up to an hour. The MCP integration enables Claude to perform actions like tracking PayPal transactions, creating Google Docs, generating images through DALL-E, and sending automated email briefings that combine calendar and weather data.

🚀 Introducing Claude x Zapier MCP! Now Claude can work with 8,000+ apps and 30,000+ pre-built actions through Zapier—no custom integrations needed. The Model Context Protocol creates a secure bridge so Claude doesn’t just understand what you want, but actually takes action. ⚡”” / X https://x.com/zapier/status/1918007000363122829

Claude’s new MCP integration is INSANE! Connect PayPal, Gmail, and 7,000+ apps directly in chat. Top 5 things this update enables:   • Financial tracking with PayPal transaction data   • Daily briefings with calendar, email & weather   • Research reports delivered to your https://x.com/JulianGoldieSEO/status/1919285937730617821

MCP is a true gift for AI developers! I recorded a video to show you how to connect AI agents to third-party tools that require authentication using MCP. If you’ve tried, you know this is as painful as it gets. Imagine your agent connects to GitHub, Gmail, and Slack. That’s https://x.com/svpino/status/1917194874497171510

Today, we’re announcing four new capabilities on the Anthropic API to help developers build more powerful AI agents. A code execution tool, MCP connector, Files API, and extended prompt caching: https://x.com/AnthropicAI/status/1925633118104416587

we’re launching a suite of new tools and new features today. a new MCP tool, code interpreter tool, and image generation tool – plus background mode, reasoning summaries, and file search within reasoning models https://x.com/stevenheidel/status/1925209984180380101

Zapier just gave your AI the keys their new MCP lets agents trigger 30,000+ actions across 8,000 apps with real access, not hacks. https://x.com/ProductHunt/status/1920550567153397977

Google’s Gemini 2.5 Pro tops coding leaderboards with new multimodal reasoning mode
Google released Gemini 2.5 Pro, which leads the WebDev Arena coding benchmark with a score of 1415. The model included a mode called “Deep Think,” a reasoning system that explores multiple solutions simultaneously before responding, improving performance on complex math and programming problems. Most importantly, Gemini 2.5 Pro, is multimodal, meaning it can process different types of content together – text, images, audio, and video/

2.5 Pro is now the best model for coding and learning. With a strong ELO score of 1415, it’s topping the WebDev Arena leaderboard – and it incorporates LearnLM, our family of models fine-tuned for learning built with educational experts. #GoogleIO https://x.com/GoogleDeepMind/status/1924878252172353851

Deep Think in 2.5 Pro has landed. 🤯 It’s a new enhanced reasoning mode using our research in parallel thinking techniques – meaning it explores multiple hypotheses before responding. This enables it to handle incredibly complex math and coding problems more effectively. https://x.com/GoogleDeepMind/status/1924881598102839373

Gemini 2.5 can now organize vast amounts of multimodal information, reason about everything it sees, and write code to simulate anything. ↓ #GoogleIO https://x.com/GoogleDeepMind/status/1924878250255516126

And starting this week, Gemini 2.5, our most intelligent model, is coming to Search, for both AI Mode and AI Overviews in the U.S. https://x.com/Google/status/1924885533609599187

Perplexity’s AI hotel booking feature gains traction
Perplexity reports that users are increasingly booking hotels directly through its AI search platform, representing a potential challenge to Google’s advertising business. Hotel bookings are Google’s second-largest advertising category. The feature allows users to complete reservations without leaving Perplexity’s interface, bypassing traditional booking sites and search ads. This development could signal a broader shift in how people discover and book travel, with AI platforms potentially capturing revenue that traditionally flows through search advertising and online travel agencies.

hotel bookings natively on perplexity are quietly growing. it’s one of the under-the-radar features we have right now that has a massive potential to disrupt the ad industry. google’s second biggest adword category i think.”” / X https://x.com/AravSrinivas/status/1923124236618469735

Anthropic activates enhanced safety measures for Claude Opus 4
Anthropic implemented AI Safety Level 3 protections for Claude Opus 4 as a precautionary measure, even though the company hasn’t determined if the model definitively requires these safeguards. The enhanced security makes it harder to steal the AI model’s core programming, while deployment restrictions specifically target potential misuse for developing chemical, biological, radiological, and nuclear weapons. These measures represent a step up from the baseline protections used for previous Claude models, designed to defend against sophisticated attackers, and should only cause Claude to refuse queries on a very narrow set of dangerous topics while Anthropic continues evaluating the model’s risk level.

Activating AI Safety Level 3 Protections \ Anthropic https://www.anthropic.com/news/activating-asl3-protections

Google strengthens Gemini 2.5 against hidden malicious instructions
Google published research on protecting its Gemini 2.5 AI models from “indirect prompt injection” attacks, where hackers embed malicious commands in emails, documents, or websites that trick AI agents into sharing private data or performing unauthorized actions. The company developed security measures to help Gemini distinguish between legitimate user requests and hidden manipulative instructions that could exploit AI agents with access to personal information like calendars, emails, and external websites. This security work addresses a growing cybersecurity risk as AI agents become more capable of accessing and acting on personal data across multiple platforms.

Advancing Gemini’s security safeguards – Google DeepMind https://deepmind.google/discover/blog/advancing-geminis-security-safeguards/

OpenAI launches Codex software engineering agent to rave reviews
OpenAI released Codex, a cloud-based AI agent that handles multiple coding tasks simultaneously in isolated sandbox environments preloaded with your repository. Powered by codex-1 (a version of OpenAI o3 optimized for software engineering), Codex can write features, fix bugs, answer codebase questions, and create pull requests while running tests until they pass. The agent works through ChatGPT’s sidebar, takes 1-30 minutes per task depending on complexity, and provides detailed logs of its actions for verification. Available now to ChatGPT Pro, Team, and Enterprise users, with Plus users getting access soon.

💥 Today we’re launching Codex: a software agent that operates in the cloud and can do many tasks in parallel. In the future most code will be written by AI; society will be accelerated because of it. This is a research preview, but we’re very excited to see what you build.”” / X https://x.com/kevinweil/status/1923403368849871329

A user can then review code suggestions made by the agent. It can show a preview of the test it ran. And the user can then create and push a PR. https://x.com/omarsar0/status/1923398310812918226

Asked Codex to internationalize our app and localize it into Japanese before bed last night. Woke up to complete Japanese support this morning 🇯🇵 What would have taken a few days was done overnight. https://x.com/kn/status/1923819590209220908

Best way to use Codex is to create PRs liberally. Feels like a very different way of writing code!”” / X https://x.com/gdb/status/1923530399692750978

BREAKING: OpenAI announces research preview of Codex in ChatGPT Next-level coding agent within ChatGPT. Pay attention, devs and non-devs! Here is all you need to know: https://x.com/omarsar0/status/1923394424622522394

Codex CLI keeps getting better. In the long run, I expect that “”local”” (e.g. Codex CLI) and “”remote”” (e.g. Codex) coding agents will come together — imagine their combination as a remote coworker who can also look over your shoulder. Excited for the future of programming!”” / X https://x.com/gdb/status/1923492615959478375

Codex for bug finding;”” / X https://x.com/gdb/status/1923509728124207587

Codex for code migrations:”” / X https://x.com/gdb/status/1923802002582319516

Codex for internationalization:”” / X https://x.com/gdb/status/1923897958954872903

Codex is powered by a new model called codex-1. OpenAI claims this is their best coding model to date.”” / X https://x.com/omarsar0/status/1923394428766437684

Introducing Codex | OpenAI https://openai.com/index/introducing-codex/

It seems there was a lot of alignment work that went into Codex. This led to the agent being able to produce cleaner patches and overall code that aligns with a coder’s preference, standards, and instructions.”” / X https://x.com/omarsar0/status/1923403068944580739

Just released Codex, a software engineering agent that can work on many tasks in parallel. It runs on its own cloud-based compute infrastructure, and can fix bugs, answer questions about your code, run tests, etc.. Feels like a step towards the future of software engineering.”” / X https://x.com/gdb/status/1923401740986052770

OpenAI introduces Codex: A cloud-based software engineering agent that can work on many tasks in parallel, powered by codex-1. https://x.com/iScienceLuvr/status/1923394959916273820

OpenAI shared that their engineers use Codex for the following: – refactoring – renaming – writing tests – scaffolding new features – wiring components – fixing bugs – drafting documentation They are noticing new habits emerging from offloading background work to the agents.”” / X https://x.com/omarsar0/status/1923403070806929877

The Codex agent can analyze a codebase and find areas of improvement. It suggest improvements, Then you can schedule tasks right within ChatGPT. https://x.com/omarsar0/status/1923394967008874889

today we are introducing codex. it is a software engineering agent that runs in the cloud and does tasks for you, like writing a new feature of fixing a bug. you can run many tasks in parallel.”” / X https://x.com/sama/status/1923398457747787817

What’s being released? A remote software engineering agent, Codex. Can run many coding tasks in parallel. Available for Pro, Enterprise, and Team ChatGPT users starting today.”” / X https://x.com/omarsar0/status/1923394427071918310

Microsoft declares 2025 the year of AI agents, OpenAI’s Greg Brockman agrees
Microsoft announced new AI agent capabilities across GitHub, Azure, and Windows platforms, positioning itself for what executives call “the age of AI agents.” The company revealed that 15 million developers use GitHub Copilot and over 230,000 organizations including 90% of Fortune 500 companies have built AI agents through Copilot Studio. New features include GitHub’s first asynchronous coding agent, Windows AI Foundry for local AI development, and access to over 1,900 AI models through Azure AI Foundry including xAI’s Grok models. Microsoft also introduced enterprise security tools like Entra Agent ID to manage AI agents and announced support for Anthopic’s Model Context Protocol across all its platforms, plus a new web standard called NLWeb designed to help websites work better with AI agents.

Microsoft Build 2025: The age of AI agents and building the open agentic web – The Official Microsoft Blog https://blogs.microsoft.com/blog/2025/05/19/microsoft-build-2025-the-age-of-ai-agents-and-building-the-open-agentic-web/

2025 is the year of agents.”” / X https://x.com/gdb/status/1923541152508281329

Google launches Veo 3 video generator with built-in audio (goes viral)
Google released Veo 3, an AI video creation tool that generates clips with synchronized audio including dialogue, sound effects, and background noise. The system can create videos from text prompts that capture realistic physics like water movement and snow crunching, while also handling complex scenarios like Broadway musicals or 1970s children’s shows with specific visual styles. Google combined Veo 3 with its Imagen and Gemini AI systems into a filmmaking platform called Flow, which helps users build scenes and save reusable elements like characters and locations, available now to Google AI Pro and Ultra subscribers in the United States.

Check out Veo 3 🔥🔥🔥 sound on 🔊”” / X https://x.com/_tim_brooks/status/1924895946967810234

From capturing real-world physics – like the noise and movement of water, or the look and sound of walking in snow – to lip syncing, Veo 3 is great at understanding what you want. You can tell a short story in your prompt, and the model gives you back a clip that brings it to https://x.com/GoogleDeepMind/status/1924893531300077675

In Flow, AI can help make clips from prompts, build them into scenes and then save your ingredients – such as characters, locations, objects or styles – all in one place. ↓ https://x.com/GoogleDeepMind/status/1924896542848090276

Say goodbye to the silent era of video generation: Introducing Veo 3 — with native audio generation. 🗣️ Quality is up from Veo 2, and now you can add dialogue between characters, sound effects and background noise. Veo 3 is available now in the @GeminiApp for Google AI Ultra https://x.com/Google/status/1924893837295546851

Veo 3 is available today for Ultra subscribers in the United States in the @GeminiApp. Find out more about where you can use it ↓ https://x.com/GoogleDeepMind/status/1924893533787332996

Veo 3, our SOTA video generation model, has native audio generation and is absolutely mindblowing. For filmmakers + creatives, we’re combining the best of Veo, Imagen and Gemini into a new filmmaking tool called Flow. Ready today for Google AI Pro and Ultra plan subscribers. https://x.com/sundarpichai/status/1924909490081825195

Veo 3: “”a big broadway musical about garlic bread, with elaborate costumes and a sondheim-like vibe”” https://x.com/emollick/status/1925065546082484418

Veo 3: “”a scene from an unnerving 1970s childrens show with live action puppets and Lovecraftian overtones singing a song”” https://x.com/emollick/status/1925047195738218505

Video, meet audio. 🎥🤝🔊 With Veo 3, our new state-of-the-art generative video model, you can add soundtracks to clips you make. Create talking characters, include sound effects, and more while developing videos in a range of cinematic styles. 🧵 https://x.com/GoogleDeepMind/status/1924893528062140417

Google labs launches stitch for UI design coding
Google Labs released Stitch, an AI tool that generates user interface designs and code for developers and designers. The tool aims to streamline the process of creating web and app interfaces by automatically producing both visual designs and the corresponding code. Stitch represents Google’s entry into the growing market of AI-powered design tools, competing with other platforms that help non-technical users build functional interfaces without extensive coding knowledge.

Google made an AI coding tool specifically for UI design | The Verge https://www.theverge.com/news/670773/google-labs-stitch-ui-coding-design-tool

Meet Stitch by @GoogleLabs, the easiest and fastest product to generate great designs and UIs. 🧵 https://x.com/stitchbygoogle/status/1924947794034622614

Google launches live camera sharing for Gemini AI assistant
Google rolled out Gemini Live’s camera and screen sharing features to Android users and began the iOS rollout, allowing people to have real-time conversations with the AI while showing it what they’re seeing through their phone’s camera or screen. The feature lets users get instant help with tasks like identifying objects, solving math problems, or getting explanations about what’s displayed on their screen during voice conversations with Gemini. This expands Gemini’s capabilities beyond text and voice to include visual context, making the AI assistant more interactive and practical for everyday situations where users need immediate visual assistance.

Gemini Live camera and screen sharing in @GeminiApp is available on @Android and rolling out to iOS, starting today. https://x.com/Google/status/1924876301573239061

Google is bringing real-time AI camera sharing to Search | The Verge https://www.theverge.com/news/670597/google-search-live-ai-mode-gemini-ios

Microsoft AI agents discover new cooling material in hours
Microsoft’s AI Discovery platform helped researchers find a chemical-free immersion coolant by using multiple AI agents working together, identifying a material previously unknown to scientists in just hours rather than the typical months-long research process. The team, led by John Link, successfully synthesized the discovered compound in their laboratory, demonstrating how AI can accelerate materials discovery by rapidly analyzing vast combinations of chemical properties and suggesting viable alternatives to harmful forever chemicals used in electronics cooling.

Mindblowing demo: John Link led a team of AI agents to discover a forever-chemical-free immersion coolant using Microsoft Discovery. The agents surfaced a material “”unknown to humans”” — in hours, not months — and the team synthesized it in the lab. “”It’s literally very cool.”” https://x.com/vitrupo/status/1924568771353841999

Nvidia launches Isaac GR00T N1.5 robotics foundation model
Nvidia released Isaac GR00T N1.5, an updated open-source foundation model designed to teach humanoid robots reasoning and physical skills, along with new tools for generating training data without relying on human demonstrations. The company introduced two synthetic data generation systems: Isaac GR00T-Mimic, which uses physics simulations to expand human motion data, and GR00T-Dreams, which creates new motion videos from single images using AI video generation. Nvidia also launched Cosmos-Reason1-7B, the first reasoning model specifically built for robotics that understands physical common sense and can make appropriate decisions for robotic actions.

Jensen just announced NVIDIA’s Isaac GR00T N1.5 and GR00T-Dreams blueprint at COMPUTEX 2025: ⦿ Isaac GR00T N1.5 is the first update to NVIDIA’s open, generalized, fully customizable foundation model for humanoid reasoning and skills. ⦿ “Human demonstrations aren’t scalable — https://x.com/TheHumanoidHub/status/1924332201862414495

JUST IN🚨: Nvidia open sourced Physical AI models reasoning models that understand physical common sense and generate appropriate embodied decisions 👀 https://x.com/reach_vb/status/1924525937443365193

NVIDIA offers two blueprints for synthetic data generation: ⦿ Isaac GR00T-Mimic: Uses a physics engine to amplify human motion data in simulation. ⦿ GR00T-Dreams (announced yesterday): Fine-tunes a video generation AI model to create new motion videos from a single image. https://x.com/TheHumanoidHub/status/1924538121687073167

NVIDIA released new vision reasoning model for robotics: Cosmos-Reason1-7B 🤖 > first reasoning model for robotics 😱 > based on Qwen 2.5-VL-7B, use with @huggingface transformers or vLLM 🤗 > comes with SFT & alignment dataset and a new benchmark 👏 https://x.com/mervenoyann/status/1924817927561183498

Nvidia develops reasoning AI without training examples
Nvidia created Nemotron-Research-Tool-N1, a family of AI models that learned to reason and use tools through rule-based reinforcement learning rather than being trained on existing reasoning examples. The approach eliminates the need for reasoning supervision or distillation from other models, allowing the AI to develop problem-solving skills independently. This method could reduce training costs and dependencies on human-generated reasoning data while still producing models capable of complex logical thinking and tool use.

Tool-using LLMs can learn to reason—without reasoning traces. 🔥 We present Nemotron-Research-Tool-N1, a family of tool-using reasoning LLMs trained entirely via rule-based reinforcement learning—no reasoning supervision, no distillation. 📄 Paper: https://x.com/ShaokunZhang1/status/1922105694167433501

Google develops Gemini Diffusion text generation model
Google created Gemini Diffusion, a text generation model that works differently from traditional AI by refining random noise into coherent text through multiple steps rather than predicting words directly. This approach helps the model excel at coding and math problems by allowing it to iterate and improve solutions quickly, similar to how image generation models like DALL-E create pictures by gradually removing noise from random pixels.

We’ve developed Gemini Diffusion: our state-of-the-art text diffusion model. Instead of predicting text directly, it learns to generate outputs by refining noise, step-by-step. This helps it excel at coding and math, where it can iterate over solutions quickly. #GoogleIO https://x.com/GoogleDeepMind/status/1924888095448825893

OpenAI launches first international AI infrastructure partnership with UAE
OpenAI partnered with the United Arab Emirates to build the first international deployment of its Stargate AI infrastructure platform, creating a 1-gigawatt computing cluster in Abu Dhabi expected to go live in 2026. The partnership, developed with U.S. government coordination and support from President Trump, includes companies like G42, Oracle, NVIDIA, Cisco, and SoftBank, while the UAE will invest in U.S. Stargate infrastructure and become the first country to enable ChatGPT nationwide. This marks the launch of OpenAI’s “OpenAI for Countries” initiative, which aims to help governments build sovereign AI capabilities in coordination with the U.S., with plans for 10 partnerships across key regions and the potential to serve up to half the world’s population within a 2,000-mile radius of the UAE facility.

Introducing Stargate UAE | OpenAI https://openai.com/index/introducing-stargate-uae/

12 AI Visuals and Charts: Week Ending May 23, 2025

StackOverflow questions over time, source SEDE; sadface, lunch has been eaten https://x.com/marcgravell/status/1922922817143660783

Microsoft just revealed its next big AI bets at Build 2025. I sat down with Microsoft CEO Satya Nadella to unpack: -Microsoft’s vision for the “agentic web” -Why your next job might be AI agent manager -What happens when 95% of code is AI-generated Timestamps: 0:00 Building https://x.com/rowancheung/status/1925228045415416297

Eric Schmidt on why AI is actually *underhyped* We dove into the big questions at TED — AGI, China, open source, and human agency. One of the rare leaders who can straddle the worlds of tech, policy, and Burning Man. Check out the full talk below 👇 https://x.com/bilawalsidhu/status/1923085454397616533

Code with Claude Opening Keynote – YouTube https://www.youtube.com/watch?v=EvtPBaaykdo

Mastering Claude Code in 30 minutes – YouTube https://www.youtube.com/watch?v=6eBSHbLKuN0

what are we doing here folks https://x.com/catehall/status/1925631571605827944

Report: Spring 2025 AI Model Usage Trends – Poe https://poe.com/blog/spring-2025-ai-model-usage-trends

Most important plot from IO today — AI usage is skyrocketing. This is real. https://x.com/natolambert/status/1924916998133129716?s=46

Google I/O 2025: Listen to a podcast recap https://blog.google/technology/ai/release-notes-podcast-io-2025/

o3, give me a screenshot from that one safety video in the 1950s about how to care for your giant killer squid”” “”great, a second shot from when it all goes wrong for poor Timmy.”” “”show me that one moment people always talk about”” (this was all the first output from the AI) https://x.com/emollick/status/1923861434271740005

I thought my “”otter on a plane using Wifi”” benchmark was already done, but Veo 3 adds higher quality… and sound Here is “”an otter on a plane using wifi on their phone, the flight attendant asks them “”do you want a drink ?”” and the otter nods”” (One of the first set of 4 videos) https://x.com/emollick/status/1925018308182524391

New video of fully autonomous Optimus. Performing many new tasks – instructed via natural language. All the tasks are done by a single neural net – learned directly from human videos. “”This breakthrough allows us to learn new tasks much faster.”” https://x.com/TheHumanoidHub/status/1925052725714419889

Top 93 Links of The Week – Organized by Category

AGI

I wish these skeptical AI articles would actually grapple with the growing body of research that AI can really do original research & perform key unstructured tasks across the spectrum of high-end white collar employment. AI criticism is important, but it should be clear-eyed.”” / X https://x.com/emollick/status/1923417536072241529

AI learns how vision and sound are connected, without human intervention | MIT News | Massachusetts Institute of Technology https://news.mit.edu/2025/ai-learns-how-vision-and-sound-are-connected-without-human-intervention-0522

Meta FAIR and Rothschild Foundation Hospital present a groundbreaking study mapping how language representations emerge in the brain, revealing striking parallels with LLMs. This research offers unprecedented insights into the neural development of language, showing how AI https://x.com/AIatMeta/status/1925590735254167926

How likely is an intelligence explosion as forecast in AI 2027? Algorithmic advances that could drive an intelligence explosion may be bottlenecked by compute, according to new research from @noshpesoj and @uchicagoxlab described in this week’s Gradient Update. Here’s why: https://x.com/EpochAIResearch/status/1923489932581945683

AgentsCopilots

Wow, first test of @OpenAI Codex agent – connected to my @readbetterio repo, it found a bug in 3 minutes (nothing major, but still). First merged PR entirely written by AI. Very cool experience! https://x.com/RBouschery/status/1923490563212419375

You can now clone any YouTube channel’s thumbnail style—and automate the process of generating thumbnails for your own videos in that same style. In this sneak peek, I show how I used the Agent Development Kit (ADK) to replicate Alex Hormozi’s exact thumbnail look using OpenAI’s https://x.com/bhancock_ai/status/1920185203919573227

This is the trend I see in thoughtful people using AI. Model ability is catching up to some of the promises made by AI labs in a way that is difficult to ignore (while still behind the biggest hype). We don’t know where it will end, but views need to be updated as tech improves.”” / X https://x.com/emollick/status/1924480193298629015

8/ Gemini SDK is now Compatible with MCP Tools  Developers can integrate Gemini with Model Context Protocol (MCP) tools for enhanced agentic capabilities. 🌐 Automate form submissions across multiple platforms. 🤖Integrate Gemini into existing workflows for task automation.”” / X https://x.com/AtomSilverman/status/1924960920671076858

I built an automated AI travel agency with multi-agents. It has 4 MCP AI agents working together as a team: 1. Google Maps Agent 2. Airbnb Booking Agent 3. Google Calendar Agent 4. Weather Agent 100% Opensource Code with step-by-step tutorial. https://x.com/Saboo_Shubham_/status/1919942105553895737

🦜🤖Introducing Open Agent Platform An open-source, no-code agent building platform. OAP enables non-developers to build highly customizable agents, which connect to: – 🛠️MCP Tools – 📄LangConnect for RAG – 🤖Other LangGraph Agents! Try the public demo, or fork & customize it https://x.com/LangChainAI/status/1922722850542346680

the OpenAI Responses API is now the first truly agentic API 🚀 developers can combine MCP servers, code interpreter, reasoning, web search, and RAG – all within a single API call – to build the next generation of agents 🤖”” / X https://x.com/stevenheidel/status/1925209983073046616

Introducing Gemma 3n, our multimodal model built for mobile on-device AI. 🤳 It runs with a smaller memory footprint, cutting down RAM usage by nearly 3x – enabling more complex applications right on your phone, or for livestreaming from the cloud. Now available in early https://x.com/GoogleDeepMind/status/1925916216083779774

Here’s my “”Dark Leisure”” theory of any potential productivity paradox in AI: – most AI use rn is bottom up and hidden (employee first, not company first): employees vibe code, vibe market, vibe write and get stuff done faster – in many orgs, there is too little incentive to”” / X https://x.com/fabianstelzer/status/1926000937702764635

3/ With project mariner, you can: – Automate the process of listing products across various e-commerce sites.  – Schedule multiple appointments (e.g., doctor, dentist, car service) concurrently”” / X https://x.com/AtomSilverman/status/1924960901142323588

We’re partnering with @Dell to accelerate secure, agentic enterprise AI solutions. Dell will be the first provider to offer our secure agents platform, Cohere North, to enterprises on-premises, which is crucial for regulated industries handling sensitive data 🧵 https://x.com/cohere/status/1924512634373865950

We’re partnering with @SAP to bring enterprise-ready agentic AI to businesses worldwide! Our models will be embedded into SAP Business Suite, offering secure and scalable AI capabilities. With Cohere’s cutting-edge models also available on SAP AI Core, enterprises can leverage https://x.com/cohere/status/1924858543716630644

We’re introducing thought summaries in 2.5 Flash and Pro via the Gemini API and @GoogleCloud’s #VertexAI. These organize the model’s thoughts into a clear format with headers and key information about its actions to give more transparency. https://x.com/GoogleDeepMind/status/1924879655762632816

You can now crawl entire websites and extract LLM-ready data with a single tool. Crawl4AI is an open-source repo built for AI agents, RAG, and data pipelines. It supports both browser-based and HTTP crawling, with real-time Markdown generation from any site. https://x.com/LiorOnAI/status/1925930945137254629

With GPT-4 as a tutor Nigerian students saw years of learning in weeks. Important World Bank research investigates if AI chatbots can effectively and affordably boost learning in Nigeria. 🇳🇬 Researchers conducted a Randomized Controlled Trial (RCT) in Nigeria. First-year https://x.com/rohanpaul_ai/status/1925614762139713851

As someone involved in academic research on AI, it is notable to me that most of the key experiments showing the impressive abilities of AI on work, medicine, psychology, and so many other fields were done on GPT-4… a model that is now so obsolete that it is gone from ChatGPT. https://x.com/emollick/status/1923134492115365905

QoL Update: Starting today, you will see an AI generated summary for all papers of Hugging Face Papers! 🔥 GG @mishig25 🐐 https://x.com/reach_vb/status/1925517801197879737

this AI agent is f**king scary Rork can clone top App Store AI apps with a few prompts I just cloned Character AI, but removed all the censorship. now you can create dream gf & chat with her.. about anything 9 examples: https://x.com/EHuanglu/status/1923395698860699785

Connect Google ADK Agents to 100+ Systems with GCP Integration Connectors Google’s Agent Development Kit (ADK) now integrates with GCP Integration Connectors, enabling AI agents to perform real-world tasks across over 100 systems. Key features: – Agents can execute actions https://x.com/NdabageraM/status/1921524696992137343

Introducing Jules, an AI coding agent powered by Gemini 2.5 Pro. Jules works asynchronously across your repo on tasks like fixing bugs or refactoring, helping you cross multiple things off your to do list at the same time. Plus, stay updated with Codecasts, a daily podcast of https://x.com/julesagent/status/1924890206853116142

Jules: An asynchronous coding agent | Hacker News https://news.ycombinator.com/item?id=44034918

I just generated a 5:30 min Multi-Speaker Podcast on Agentic Patterns using Gemini 2.5 Flash and our new Text-to-speech (TTS) Model! At I/O we launched native controllable Audio Generation for Gemini 2.5 Pro & Flash. > Controllable style, accent, pace, tone. > single and https://x.com/_philschmid/status/1925888544175734873

Grok 3 is now available on Microsoft Azure https://x.com/ibab/status/1924518628172693922

Here is the full conversation today between Microsoft CEO Satya Nadella and @elonmusk. Elon: “”With Grok 3.5, which is about to be released, it’s trying to reason from first principles.”” https://x.com/SawyerMerritt/status/1924536496981172533

The killer feature of OpenAI Codex is parallelism. Browser-based work is evolving: from humans handling tasks one tab at a time, to overseeing multiple AI agent tabs, providing feedback as needed.”” / X https://x.com/alexhalliday/status/1923728921150820650

AMA with OpenAI Codex team : r/ChatGPT https://www.reddit.com/r/ChatGPT/comments/1ko3tp1/comment/mso344o/

In the latest issue of The Batch, Andrew Ng shares how large companies can move faster by using AI. Plus: 📰 OpenAI Codex turns agents into your dev team 📰 Grok spread conspiracies after rogue update 📰 U.S. makes AI tech deals with Saudi Arabia and UAE Read The Batch: https://x.com/DeepLearningAI/status/1925975010893516991

4. NLWeb: This is a new open project that lets you use natural language to interact with any website. Think of it like HTML for the agentic web. https://x.com/satyanadella/status/1924535902321442846

Reasoning Generalization Reasoning fails to generalize across environments. Agents struggle with spatial coordination (Messenger), legal move inference (Hanoi), and adapting to opponent patterns (RPS). Even with reward shaping and hints, models often underperform random https://x.com/omarsar0/status/1924182841677709540

Anthropic

Anthropic’s New Model Excels at Reasoning and Planning—and Has the Pokémon Skills to Prove It | WIRED https://www.wired.com/story/anthropic-new-model-launch-claude-4/

Ever wondered you can chat with your Google Calendar? So introducing Google Calendar MCP. Here is the Repo Link: https://x.com/avikm744/status/1921903828334518511

Top 10 Most Popular MCP Servers in the Cline https://x.com/cline/status/1918427793047863337

Here’s the easiest way to build an MCP server: 1. Use Gitingest to convert the FastMCP repo into LLM-ready text. 2. Download the text file. 3. Upload it to Google AI Studio, specifying the MCP server type. Gemini 2.5 Pro handles the rest! https://x.com/akshay_pachaar/status/1918283739760828795

Microsoft releases NLWeb NLWeb uses MCP to make it simple to interact with websites in a standardized way. Devs can now convert any website into an AI app. MCP is to NLWeb what HTTP is to HTML. This went largely unnoticed this week, but it looks like a big deal. https://x.com/omarsar0/status/1925900575666733207

Introducing support for remote MCP servers, image generation, Code Interpreter, and more in the Responses API. https://x.com/OpenAIDevs/status/1925214114445771050

A Step-by-Step Tutorial on Connecting Claude Desktop to Real-Time Web Search and Content Extraction via Tavily AI and Smithery using Model Context Protocol (MCP) In this hands-on tutorial, we’ll learn how to seamlessly connect Claude Desktop to real-time web search and https://x.com/Marktechpost/status/1918877427335622673

Here’s a quick demo of searching, running and using the browser-tools MCP using OneMCP. https://x.com/Ipenywis/status/1921213033973772350

I’m starting to learn that agents, a bit like RAG I guess, is becoming less of a thing and just a control structure. With MCP integrated to InferenceClient, agents are just while loops. No stress. No framework. Just LLMs doing stuff. https://x.com/ben_burtenshaw/status/1925933013889663115

A week ago, Anthropic quietly weakened their ASL-3 security requirements. Yesterday, they announced ASL-3 protections. I appreciate the mitigations, but quietly lowering the bar at the last minute so you can meet requirements isn’t how safety policies are supposed to work. 🧵”” / X https://x.com/RyanPGreenblatt/status/1925992236648464774

Anthropic closes $2.5 billion credit facility https://www.cnbc.com/amp/2025/05/16/anthropic-ai-credit-facility.html

Anthropic CEO, Dario Amodei: the first billion-dollar company with a single human employee could emerge by 2026 https://x.com/slow_developer/status/1925632756639256577

Anthropic raises Series E at $61.5B post-money valuation \ Anthropic https://www.anthropic.com/news/anthropic-raises-series-e-at-usd61-5b-post-money-valuation

SDK – Anthropic https://docs.anthropic.com/en/docs/claude-code/sdk

Implementing An Airbnb and Excel MCP Server In this tutorial, we’ll build an MCP server that integrates Airbnb and Excel, and connect it with Cursor IDE. Using natural language, you’ll be able to fetch Airbnb listings for a specific date range and location, and automatically https://x.com/Marktechpost/status/1918543230779703762

Another paper showing AI (Claude 3.5) is more persuasive than the average human, even when the humans had financial incentives In this case, either AI or humans (paid if they were persuasive) tried to convince quiz takers (paid for accuracy) to pick either right or wrong answers https://x.com/emollick/status/1923474500194095282

How Does Claude 4 Think? – Sholto Douglas & Trenton Bricken – YouTube https://www.youtube.com/watch?v=64lXQP6cs5M

Windows is getting support for the ‘USB-C of AI apps’ | The Verge https://www.theverge.com/news/669298/microsoft-windows-ai-foundry-mcp-support

Build a MCP server that can read a tweet via CDP, discuss with AI, then save to @raycastapp Notes. https://x.com/Leechael/status/1921555839359373415

Apple

Claude 4 prompt engineering best practices – Anthropic https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/claude-4-best-practices

Apple to Open AI Models to Developers in Bid to Spur New Apps – Bloomberg https://www.bloomberg.com/news/articles/2025-05-20/apple-to-open-ai-models-to-developers-betting-that-it-will-spur-new-apps?embedded-checkout=true

BusinessAI

AI’s ability to make tasks not just cheaper, but also faster, is underrated in its importance in creating business value. For the task of writing code, AI is a game-changer. It takes so much less effort — and is so much cheaper — to write software with AI assistance than”” / X https://x.com/AndrewYNg/status/1923045958511886549

They went from nearly killing their startup over a weekend to being in talks with OpenAI for a $3 billion acquisition. How @windsurf_ai turned a moment of existential panic into one of AI’s most remarkable success stories. One weekend in 2022, the founders of Exafunction https://x.com/fdaudens/status/1923458065937883509

📢We’re excited to share that we’ve raised $100M in seed funding to support LMArena and continue our research on reliable AI. Led by @a16z and UC Investments (@UofCalifornia), we’re proud to have the support of those that believe in both the science and the mission. We’re https://x.com/lmarena_ai/status/1925241333310189804

Chinese Startup Trials First AI Doctor Clinic in Saudi Arabia – Bloomberg https://www.bloomberg.com/news/articles/2025-05-15/chinese-startup-trials-first-ai-doctor-clinic-in-saudi-arabia?embedded-checkout=true

Great column by @htaneja & @FareedZakaria in @theinformation: « America’s historical technological leadership wasn’t built on protectionism and closed systems—it was fueled by creating a dynamic marketplace of optionality, including open platforms the world could build upon. Yet https://x.com/ClementDelangue/status/1924578324392587385

This will likely produce a business ~100x bigger than Apple’s market cap (or more) What’s clear now is we’re in the right decade for humanoid robotics – this will feel like the future jumped ahead by 50 years”” / X https://x.com/adcock_brett/status/1923406193743081596

ChipsHardware

2025 State of AI Infrastructure Report | Google Cloud https://cloud.google.com/resources/content/state-of-ai-infrastructure

EducationAI

The current state of research on AI and education: Growing evidence that, when used as a tutor with instructor guidance, AI seems to have quite significant positive effects. When used alone to get help with homework, it can act as shortcut that hurts learning Still early days. https://x.com/emollick/status/1925055450254385592

Very big impact: The final version of a randomized, controlled World Bank study finds using a GPT-4 tutor with teacher guidance in a six week after school progam in Nigeria had “”more than twice the effect of some of the most effective interventions in education”” at very low costs https://x.com/emollick/status/1924919060753465537

EthicsLegalSecurity

“anthropic included a “model welfare evaluation” in the claude 4 system card. it might seem absurd, but I believe this is a deeply good thing to do “Claude shows a striking ‘spiritual bliss’ attractor state” https://x.com/arithmoquine/status/1925598303393042477

China launches first of 2,800 satellites for AI space computing constellation – SpaceNews https://spacenews.com/china-launches-first-of-2800-satellites-for-ai-space-computing-constellation/

UAE launches Arabic language AI model as Gulf race gathers pace | Reuters https://www.reuters.com/world/middle-east/uae-launches-arabic-language-ai-model-gulf-race-gathers-pace-2025-05-21/

Chicago Sun-Times publishes made-up books and fake experts in AI debacle | The Verge https://www.theverge.com/ai-artificial-intelligence/670510/chicago-sun-times-ai-generated-reading-list

Google

It’s official… we’re bringing Gemini to Wear OS! 🎉 In the coming months, you’ll be able to chat naturally with Gemini to get things done across apps, like creating a personalized workout playlist or remembering where you put your stuff. Check it out: https://x.com/WearOSbyGoogle/status/1922370010112032820

NEW: Google announces Gemini Diffusion It’s an experimental text diffusion model that leverages parallel generation to achieve insane low latency. It can generate 5x faster than 2.0 Flash Light! https://x.com/omarsar0/status/1924882868477563141

Thinking budgets are coming to 2.5 Pro soon. 💭 You’ll have more control over how much the model thinks before it responds – or you can simply turn it off. https://x.com/GoogleDeepMind/status/1924879658081980761

Glasses with Android XR are lightweight and designed for all-day wear. They work with your phone so you can be hands-free, stay in the moment with friends and complete your to-do list. https://x.com/Google/status/1924899930109575474

Google Beam: Updates to Project Starline from I/O 2025 https://blog.google/technology/research/project-starline-google-beam-update/

Google Beam: Be there from anywhere with our breakthrough communication technology. https://starline.google/

Google dropped AlphaEvolve, an AI that discovers algorithms for scientific and computational challenges —Uses Gemini models with auto-evaluation & iteration —Found the first improvement on 1969’s Strassen’s algorithm —Also boosting efficiency for Google https://x.com/adcock_brett/status/1924133683444793819

Evidence from an ongoing nationally representative survey of US workers that there has been a very large, very recent surge in AI use at work, going from around 30% of workers in December to over 40% in March/April 2025. Big expansion of use of both Gemini & ChatGPT. https://x.com/emollick/status/1925132760810692901

You can also create comics 💬, packaging 🥫, stylized stamps 💮 and more – all with improved spelling and new layouts. https://x.com/GoogleDeepMind/status/1924892789638070732

Google released MedGemma on I/O’25 👏 > 4B and 27B instruction fine-tuned vision LMs and a 4B pre-trained vision LM for medicine > available with transformers from the get-go 🤗 they also released a cool demo for scan reading ⤵️ https://x.com/mervenoyann/status/1925569064597893288

Gemini Diffusion https://simonwillison.net/2025/May/21/gemini-diffusion/

InternationalAI

Spatial Speech Translation: Translating Across Space With Binaural Hearables https://dl.acm.org/doi/pdf/10.1145/3706598.3713745

MetaAI

Meta just released KernelLLM 8B on Hugging Face ⚡ > On KernelBench-Triton Level 1, our 8B parameter model exceeds models such as GPT-4o and DeepSeek V3 in single-shot performance 🤯 > With multiple inferences, KernelLLM’s performance outperforms DeepSeek R1 https://x.com/reach_vb/status/1924478755898085552

We’ve released Open Molecules 2025 (OMol25), a new Density Functional Theory (DFT) dataset for molecular chemistry, and Meta’s Universal Model for Atoms (UMA), a machine learning interatomic potential. These tools will accelerate molecular and materials discovery, unlocking new https://x.com/AIatMeta/status/1924502785028190366

OpenAI

soon we have another low-key research preview to share with you all”” / X https://x.com/sama/status/1923104360243835131

this was an extremely smart thing for you all to do and i’m sorry naive people are giving you grief.”” / X https://x.com/sama/status/1923428713095479437

A conversation with OpenAI’s Chief Product Officer, Kevin Weil – YouTube https://www.youtube.com/watch?v=LZr6Rhu8_as

One thing the Deep Research models make clear is letting AIs agentically use their own tools to do search & see the outcomes can result in much better outcomes than RAG. The ability for AI to understand context for answers, see what isn’t found, & follow “curiosity” all matter.”” / X https://x.com/emollick/status/1923066932174942544

Perplexity

Perplexity partners with PayPal for in-chat AI shopping https://www.cnbc.com/2025/05/14/perplexity-partners-with-paypal-for-in-chat-ai-shopping.html

Robotics

Sam Altman: On that first day, when you’re just walking down the street and seven humanoid robots walk past you, it’s going to feel very sci-fi. And I don’t think that’s too far off from, like, a visceral, “oh, man, this is going to do a lot of things people used to do.” https://x.com/TheHumanoidHub/status/1924868956017590511

Sundar Pichai on the robotics opportunity: ⦿ He’s impressed by recent humanoid progress – so much so that he sometimes has to look closely to tell if a robot video is real or fake. ⦿ Google initially moved too early into the application layer, but now sees the combination of https://x.com/TheHumanoidHub/status/1923278278275760383

Elon on Optimus in today’s CNBC interview ⦿ Currently, Optimus is being trained from demonstrations collected by humans wearing mocap suits with cameras on their heads – performing primitive tasks such as opening doors, picking up objects, and dancing. This is needed to https://x.com/TheHumanoidHub/status/1924981814311133509

Optimus can now learn from first-person video. Many new skills are emerging that can be instructed via natural language. Next step: expand this to learning from third-person videos (random internet videos) and push reliability via self-play (Reinforcement learning). https://x.com/TheHumanoidHub/status/1925057174092579253

TechPapers

[2505.09662] Large Language Models Are More Persuasive Than Incentivized Human Persuaders https://arxiv.org/abs/2505.09662

TwitterXGrok

“an unauthorized modification was made to the Grok response bot’s prompt on X” By whom? By a hacker? By aliens? This is bullshit. Everyone knows what happened. You just got caught because Grok gave you away.”” / X https://x.com/svpino/status/1923194083977167240

We want to update you on an incident that happened with our Grok response bot on X yesterday. What happened: On May 14 at approximately 3:15 AM PST, an unauthorized modification was made to the Grok response bot’s prompt on X. This change, which directed Grok to provide a”” / X https://x.com/xai/status/1923183620606619649

Leave a Reply

Trending

Discover more from Ethan B. Holland

Subscribe now to keep reading and get access to the full archive.

Continue reading