Agents and Copilots: AI News Week Ending 05/23/2025

Image created with Ideogram 3.0. Image prompt: Lower-East-Side street-corner photograph reminiscent of a late-80s album cover: weathered red-brick tenement with exterior fire-escapes, canvas awning shading racks of vintage clothes; above the awning, a hand-painted board reads ‘Agents SPORTSWEAR’; a hanging blade sign in cursive script reads ‘Agents Boutique’; a delivery robot wearing a nametag reading ‘Agent-01’ rolls past the storefront; warm golden-hour light, subtle 35mm film grain, muted yet punchy color palette, gritty NYC vibe.

BREAKING: OpenAI announces research preview of Codex in ChatGPT Next-level coding agent within ChatGPT. Pay attention, devs and non-devs! Here is all you need to know: https://x.com/omarsar0/status/1923394424622522394

Explore the future of AI agents with Project Mariner – a research prototype that can help you get things done, like: ⛱️Planning trips 🛒Ordering items 🍽️Making reservations ✅All with your oversight #GoogleIO https://x.com/GoogleDeepMind/status/1924936861983609194

Claude 3.7 Sonnet and Claude Code \ Anthropic https://www.anthropic.com/news/claude-3-7-sonnet

Today, we’re announcing four new capabilities on the Anthropic API to help developers build more powerful AI agents. A code execution tool, MCP connector, Files API, and extended prompt caching: https://x.com/AnthropicAI/status/1925633118104416587

🚀 Introducing Claude x Zapier MCP! Now Claude can work with 8,000+ apps and 30,000+ pre-built actions through Zapier—no custom integrations needed. The Model Context Protocol creates a secure bridge so Claude doesn’t just understand what you want, but actually takes action. ⚡”” / X https://x.com/zapier/status/1918007000363122829

Zapier just gave your AI the keys their new MCP lets agents trigger 30,000+ actions across 8,000 apps with real access, not hacks. https://x.com/ProductHunt/status/1920550567153397977

Gemini Live camera and screen sharing in @GeminiApp is available on @Android and rolling out to iOS, starting today. https://x.com/Google/status/1924876301573239061

Google is bringing real-time AI camera sharing to Search | The Verge https://www.theverge.com/news/670597/google-search-live-ai-mode-gemini-ios

Gemini 2.5 can now organize vast amounts of multimodal information, reason about everything it sees, and write code to simulate anything. ↓ #GoogleIO https://x.com/GoogleDeepMind/status/1924878250255516126

Last year, we introduced Project Astra: a research prototype exploring capabilities for a universal AI assistant. 🤝 We’ve been making it even better with improved voice output, memory and computer control – so it can be more personalized and proactive. Take a look ↓ #GoogleIO https://x.com/GoogleDeepMind/status/1924883244459425797

2.5 Pro is now the best model for coding and learning. With a strong ELO score of 1415, it’s topping the WebDev Arena leaderboard – and it incorporates LearnLM, our family of models fine-tuned for learning built with educational experts. #GoogleIO https://x.com/GoogleDeepMind/status/1924878252172353851

Google announces AI Ultra subscription plan https://blog.google/products/google-one/google-ai-ultra/

Deep Think in 2.5 Pro has landed. 🤯 It’s a new enhanced reasoning mode using our research in parallel thinking techniques – meaning it explores multiple hypotheses before responding. This enables it to handle incredibly complex math and coding problems more effectively. https://x.com/GoogleDeepMind/status/1924881598102839373

@demishassabis Our ultimate vision for the @GeminiApp is to transform it into a universal AI assistant — an AI that’s personal, powerful and proactive and one of our key milestones on the road to artificial general intelligence (AGI). https://x.com/Google/status/1924882592236540085

Mindblowing demo: John Link led a team of AI agents to discover a forever-chemical-free immersion coolant using Microsoft Discovery. The agents surfaced a material “”unknown to humans”” — in hours, not months — and the team synthesized it in the lab. “”It’s literally very cool.”” https://x.com/vitrupo/status/1924568771353841999

@googlechrome @GeminiApp Agent Mode in @GeminiApp is a new experiment coming soon to subscribers that lets you delegate complex planning and tasks to Gemini to get stuff done. https://x.com/Google/status/1924877422761005352

1/ Agent Mode is coming to the Gemini App Google introduced Agent Mode, enabling Gemini to autonomously execute multi-step tasks. 🏠 Find 2-bedroom apartments under $2,000 on Zillow and schedule a tour. 🍽️ Book a 7 PM reservation at the best-rated Thai restaurant nearby. https://x.com/AtomSilverman/status/1924960409062342676

Agent Mode in the @Geminiapp can help you get more done across the web – coming to subscribers soon. Plus a new multi-tasking version of Project Mariner is now available to Google AI Ultra subscribers in the US, and computer use capabilities are coming to the Gemini API. https://x.com/sundarpichai/status/1924909900033122466

Meet Stitch by @GoogleLabs, the easiest and fastest product to generate great designs and UIs. 🧵 https://x.com/stitchbygoogle/status/1924947794034622614

We’re introducing Gemini in @GoogleChrome, rolling out first to Google AI Pro subscribers in the U.S. It’s your AI browsing assistant to help you get things done. Type or talk to help you quickly understand content or get tasks done using the context of your current webpage — https://x.com/Google/status/1924892719739973640

We’re starting to integrate agentic capabilities throughout our products, including @GoogleChrome, Search, and @GeminiApp. #GoogleIO”” / X https://x.com/Google/status/1924877381853978790

3 updates to Project Mariner, our research prototype that can interact with the web and get things done:”” / X https://x.com/Google/status/1924876541147709897

5/ Integrating into AI Mode in Search and Chrome Mariner’s capabilities are being embedded into Google’s Search and Chrome 🎟️ Automatically purchase event tickets when they become available. 🛎️ Reserve tables at restaurants based on user preferences.”” / X https://x.com/AtomSilverman/status/1924960909686128810

AI Mode is Search transformed with Gemini 2.5 at the core. It’s our most powerful AI search, with more advanced reasoning and multimodality, and the ability to go deeper through follow-up questions and helpful links to the web. Here’s a peek at what’s coming soon to AI Mode: 🧵”” / X https://x.com/Google/status/1924886582479171927

And starting this week, Gemini 2.5, our most intelligent model, is coming to Search, for both AI Mode and AI Overviews in the U.S. https://x.com/Google/status/1924885533609599187

Since releasing Project Mariner in December, we’ve been working with trusted testers to gather feedback. Today, we’re announcing updates, including: 📈Managing up to 10 tasks at once 🧑‍🏫Ability to learn and repeat tasks 🌐Easy access via a web app 1️⃣All in one dashboard https://x.com/GoogleDeepMind/status/1924936866597335107

Asked Codex to internationalize our app and localize it into Japanese before bed last night. Woke up to complete Japanese support this morning 🇯🇵 What would have taken a few days was done overnight. https://x.com/kn/status/1923819590209220908

MCP is a true gift for AI developers! I recorded a video to show you how to connect AI agents to third-party tools that require authentication using MCP. If you’ve tried, you know this is as painful as it gets. Imagine your agent connects to GitHub, Gmail, and Slack. That’s https://x.com/svpino/status/1917194874497171510

Microsoft Build 2025: The age of AI agents and building the open agentic web – The Official Microsoft Blog https://blogs.microsoft.com/blog/2025/05/19/microsoft-build-2025-the-age-of-ai-agents-and-building-the-open-agentic-web/

Tool-using LLMs can learn to reason—without reasoning traces. 🔥 We present Nemotron-Research-Tool-N1, a family of tool-using reasoning LLMs trained entirely via rule-based reinforcement learning—no reasoning supervision, no distillation. 📄 Paper: https://x.com/ShaokunZhang1/status/1922105694167433501

💥 Today we’re launching Codex: a software agent that operates in the cloud and can do many tasks in parallel. In the future most code will be written by AI; society will be accelerated because of it. This is a research preview, but we’re very excited to see what you build.”” / X https://x.com/kevinweil/status/1923403368849871329

2025 is the year of agents.”” / X https://x.com/gdb/status/1923541152508281329

A user can then review code suggestions made by the agent. It can show a preview of the test it ran. And the user can then create and push a PR. https://x.com/omarsar0/status/1923398310812918226

Best way to use Codex is to create PRs liberally. Feels like a very different way of writing code!”” / X https://x.com/gdb/status/1923530399692750978

Codex CLI keeps getting better. In the long run, I expect that “”local”” (e.g. Codex CLI) and “”remote”” (e.g. Codex) coding agents will come together — imagine their combination as a remote coworker who can also look over your shoulder. Excited for the future of programming!”” / X https://x.com/gdb/status/1923492615959478375

Codex for bug finding;”” / X https://x.com/gdb/status/1923509728124207587

Codex for code migrations:”” / X https://x.com/gdb/status/1923802002582319516

Codex for internationalization:”” / X https://x.com/gdb/status/1923897958954872903

Introducing Codex | OpenAI https://openai.com/index/introducing-codex/

Just released Codex, a software engineering agent that can work on many tasks in parallel. It runs on its own cloud-based compute infrastructure, and can fix bugs, answer questions about your code, run tests, etc.. Feels like a step towards the future of software engineering.”” / X https://x.com/gdb/status/1923401740986052770

OpenAI introduces Codex: A cloud-based software engineering agent that can work on many tasks in parallel, powered by codex-1. https://x.com/iScienceLuvr/status/1923394959916273820

The Codex agent can analyze a codebase and find areas of improvement. It suggest improvements, Then you can schedule tasks right within ChatGPT. https://x.com/omarsar0/status/1923394967008874889

today we are introducing codex. it is a software engineering agent that runs in the cloud and does tasks for you, like writing a new feature of fixing a bug. you can run many tasks in parallel.”” / X https://x.com/sama/status/1923398457747787817

What’s being released? A remote software engineering agent, Codex. Can run many coding tasks in parallel. Available for Pro, Enterprise, and Team ChatGPT users starting today.”” / X https://x.com/omarsar0/status/1923394427071918310

It seems there was a lot of alignment work that went into Codex. This led to the agent being able to produce cleaner patches and overall code that aligns with a coder’s preference, standards, and instructions.”” / X https://x.com/omarsar0/status/1923403068944580739

Operator 🤝 OpenAI o3 Operator in ChatGPT has been updated with our latest reasoning model. https://x.com/OpenAI/status/1925963018791178732

Introducing Claude 4 \ Anthropic https://www.anthropic.com/news/claude-4

Introducing the next generation: Claude Opus 4 and Claude Sonnet 4. Claude Opus 4 is our most powerful model yet, and the world’s best coding model. Claude Sonnet 4 is a significant upgrade from its predecessor, delivering superior coding and reasoning. https://x.com/AnthropicAI/status/1925591505332576377

Claude’s new MCP integration is INSANE! Connect PayPal, Gmail, and 7,000+ apps directly in chat. Top 5 things this update enables: • Financial tracking with PayPal transaction data • Daily briefings with calendar, email & weather • Research reports delivered to your https://x.com/JulianGoldieSEO/status/1919285937730617821

we’re launching a suite of new tools and new features today. a new MCP tool, code interpreter tool, and image generation tool – plus background mode, reasoning summaries, and file search within reasoning models https://x.com/stevenheidel/status/1925209984180380101

@googlechrome @GeminiApp Say you’re looking for an apartment. Instead of you filtering through real-estate apps daily, Gemini can find listings that fit your criteria, schedule tours and add them to your calendar, and create side-by-side comparisons. https://x.com/Google/status/1924877428997939563

Google made an AI coding tool specifically for UI design | The Verge https://www.theverge.com/news/670773/google-labs-stitch-ui-coding-design-tool

1️⃣ It’s better at multitasking & can tackle up to 10 tasks simultaneously 2️⃣ It’s using Teach and Repeat — you can show it a task once and & it learns a plan for future tasks 3️⃣ Its computer use capabilities are coming to the Gemini API this summer #GoogleIO”” / X https://x.com/Google/status/1924876543479714026

hotel bookings natively on perplexity are quietly growing. it’s one of the under-the-radar features we have right now that has a massive potential to disrupt the ad industry. google’s second biggest adword category i think.”” / X https://x.com/AravSrinivas/status/1923124236618469735

AI Mode in Google Search: Updates from Google I/O 2025 https://blog.google/products/search/google-search-ai-mode-update/#ai-mode-search

Codex is powered by a new model called codex-1. OpenAI claims this is their best coding model to date.”” / X https://x.com/omarsar0/status/1923394428766437684

OpenAI shared that their engineers use Codex for the following: – refactoring – renaming – writing tests – scaffolding new features – wiring components – fixing bugs – drafting documentation They are noticing new habits emerging from offloading background work to the agents.”” / X https://x.com/omarsar0/status/1923403070806929877

StackOverflow questions over time, source SEDE; sadface, lunch has been eaten https://x.com/marcgravell/status/1922922817143660783

Microsoft just revealed its next big AI bets at Build 2025. I sat down with Microsoft CEO Satya Nadella to unpack: -Microsoft’s vision for the “agentic web” -Why your next job might be AI agent manager -What happens when 95% of code is AI-generated Timestamps: 0:00 Building https://x.com/rowancheung/status/1925228045415416297

Code with Claude Opening Keynote – YouTube https://www.youtube.com/watch?v=EvtPBaaykdo

Mastering Claude Code in 30 minutes – YouTube https://www.youtube.com/watch?v=6eBSHbLKuN0

Wow, first test of @OpenAI Codex agent – connected to my @readbetterio repo, it found a bug in 3 minutes (nothing major, but still). First merged PR entirely written by AI. Very cool experience! https://x.com/RBouschery/status/1923490563212419375

You can now clone any YouTube channel’s thumbnail style—and automate the process of generating thumbnails for your own videos in that same style. In this sneak peek, I show how I used the Agent Development Kit (ADK) to replicate Alex Hormozi’s exact thumbnail look using OpenAI’s https://x.com/bhancock_ai/status/1920185203919573227

This is the trend I see in thoughtful people using AI. Model ability is catching up to some of the promises made by AI labs in a way that is difficult to ignore (while still behind the biggest hype). We don’t know where it will end, but views need to be updated as tech improves.”” / X https://x.com/emollick/status/1924480193298629015

8/ Gemini SDK is now Compatible with MCP Tools Developers can integrate Gemini with Model Context Protocol (MCP) tools for enhanced agentic capabilities. 🌐 Automate form submissions across multiple platforms. 🤖Integrate Gemini into existing workflows for task automation.”” / X https://x.com/AtomSilverman/status/1924960920671076858

I built an automated AI travel agency with multi-agents. It has 4 MCP AI agents working together as a team: 1. Google Maps Agent 2. Airbnb Booking Agent 3. Google Calendar Agent 4. Weather Agent 100% Opensource Code with step-by-step tutorial. https://x.com/Saboo_Shubham_/status/1919942105553895737

🦜🤖Introducing Open Agent Platform An open-source, no-code agent building platform. OAP enables non-developers to build highly customizable agents, which connect to: – 🛠️MCP Tools – 📄LangConnect for RAG – 🤖Other LangGraph Agents! Try the public demo, or fork & customize it https://x.com/LangChainAI/status/1922722850542346680

the OpenAI Responses API is now the first truly agentic API 🚀 developers can combine MCP servers, code interpreter, reasoning, web search, and RAG – all within a single API call – to build the next generation of agents 🤖”” / X https://x.com/stevenheidel/status/1925209983073046616

Introducing Gemma 3n, our multimodal model built for mobile on-device AI. 🤳 It runs with a smaller memory footprint, cutting down RAM usage by nearly 3x – enabling more complex applications right on your phone, or for livestreaming from the cloud. Now available in early https://x.com/GoogleDeepMind/status/1925916216083779774

Here’s my “”Dark Leisure”” theory of any potential productivity paradox in AI: – most AI use rn is bottom up and hidden (employee first, not company first): employees vibe code, vibe market, vibe write and get stuff done faster – in many orgs, there is too little incentive to”” / X https://x.com/fabianstelzer/status/1926000937702764635

3/ With project mariner, you can: – Automate the process of listing products across various e-commerce sites. – Schedule multiple appointments (e.g., doctor, dentist, car service) concurrently”” / X https://x.com/AtomSilverman/status/1924960901142323588

We’re partnering with @Dell to accelerate secure, agentic enterprise AI solutions. Dell will be the first provider to offer our secure agents platform, Cohere North, to enterprises on-premises, which is crucial for regulated industries handling sensitive data 🧵 https://x.com/cohere/status/1924512634373865950

We’re partnering with @SAP to bring enterprise-ready agentic AI to businesses worldwide! Our models will be embedded into SAP Business Suite, offering secure and scalable AI capabilities. With Cohere’s cutting-edge models also available on SAP AI Core, enterprises can leverage https://x.com/cohere/status/1924858543716630644

We’re introducing thought summaries in 2.5 Flash and Pro via the Gemini API and @GoogleCloud’s #VertexAI. These organize the model’s thoughts into a clear format with headers and key information about its actions to give more transparency. https://x.com/GoogleDeepMind/status/1924879655762632816

You can now crawl entire websites and extract LLM-ready data with a single tool. Crawl4AI is an open-source repo built for AI agents, RAG, and data pipelines. It supports both browser-based and HTTP crawling, with real-time Markdown generation from any site. https://x.com/LiorOnAI/status/1925930945137254629

With GPT-4 as a tutor Nigerian students saw years of learning in weeks. Important World Bank research investigates if AI chatbots can effectively and affordably boost learning in Nigeria. 🇳🇬 Researchers conducted a Randomized Controlled Trial (RCT) in Nigeria. First-year https://x.com/rohanpaul_ai/status/1925614762139713851

As someone involved in academic research on AI, it is notable to me that most of the key experiments showing the impressive abilities of AI on work, medicine, psychology, and so many other fields were done on GPT-4… a model that is now so obsolete that it is gone from ChatGPT. https://x.com/emollick/status/1923134492115365905

QoL Update: Starting today, you will see an AI generated summary for all papers of Hugging Face Papers! 🔥 GG @mishig25 🐐 https://x.com/reach_vb/status/1925517801197879737

this AI agent is f**king scary Rork can clone top App Store AI apps with a few prompts I just cloned Character AI, but removed all the censorship. now you can create dream gf & chat with her.. about anything 9 examples: https://x.com/EHuanglu/status/1923395698860699785

Connect Google ADK Agents to 100+ Systems with GCP Integration Connectors Google’s Agent Development Kit (ADK) now integrates with GCP Integration Connectors, enabling AI agents to perform real-world tasks across over 100 systems. Key features: – Agents can execute actions https://x.com/NdabageraM/status/1921524696992137343

Introducing Jules, an AI coding agent powered by Gemini 2.5 Pro. Jules works asynchronously across your repo on tasks like fixing bugs or refactoring, helping you cross multiple things off your to do list at the same time. Plus, stay updated with Codecasts, a daily podcast of https://x.com/julesagent/status/1924890206853116142

Jules: An asynchronous coding agent | Hacker News https://news.ycombinator.com/item?id=44034918

I just generated a 5:30 min Multi-Speaker Podcast on Agentic Patterns using Gemini 2.5 Flash and our new Text-to-speech (TTS) Model! At I/O we launched native controllable Audio Generation for Gemini 2.5 Pro & Flash. > Controllable style, accent, pace, tone. > single and https://x.com/_philschmid/status/1925888544175734873

Grok 3 is now available on Microsoft Azure https://x.com/ibab/status/1924518628172693922

Here is the full conversation today between Microsoft CEO Satya Nadella and @elonmusk. Elon: “”With Grok 3.5, which is about to be released, it’s trying to reason from first principles.”” https://x.com/SawyerMerritt/status/1924536496981172533

The killer feature of OpenAI Codex is parallelism. Browser-based work is evolving: from humans handling tasks one tab at a time, to overseeing multiple AI agent tabs, providing feedback as needed.”” / X https://x.com/alexhalliday/status/1923728921150820650

AMA with OpenAI Codex team : r/ChatGPT https://www.reddit.com/r/ChatGPT/comments/1ko3tp1/comment/mso344o/

In the latest issue of The Batch, Andrew Ng shares how large companies can move faster by using AI. Plus: 📰 OpenAI Codex turns agents into your dev team 📰 Grok spread conspiracies after rogue update 📰 U.S. makes AI tech deals with Saudi Arabia and UAE Read The Batch: https://x.com/DeepLearningAI/status/1925975010893516991

4. NLWeb: This is a new open project that lets you use natural language to interact with any website. Think of it like HTML for the agentic web. https://x.com/satyanadella/status/1924535902321442846

Reasoning Generalization Reasoning fails to generalize across environments. Agents struggle with spatial coordination (Messenger), legal move inference (Hanoi), and adapting to opponent patterns (RPS). Even with reward shaping and hints, models often underperform random https://x.com/omarsar0/status/1924182841677709540

I wish these skeptical AI articles would actually grapple with the growing body of research that AI can really do original research & perform key unstructured tasks across the spectrum of high-end white collar employment. AI criticism is important, but it should be clear-eyed.”” / X https://x.com/emollick/status/1923417536072241529

Alibaba’s Qwen team made Deep Research for Qwen Chat available for all users It’s pretty much like ChatGPT’s Deep Research, providing users the ability to prepare detailed reports on different subjects in a matter of minutes. https://x.com/adcock_brett/status/1924133804630753660

Anthropic’s New Model Excels at Reasoning and Planning—and Has the Pokémon Skills to Prove It | WIRED https://www.wired.com/story/anthropic-new-model-launch-claude-4/

Ever wondered you can chat with your Google Calendar? So introducing Google Calendar MCP. Here is the Repo Link: https://x.com/avikm744/status/1921903828334518511

Top 10 Most Popular MCP Servers in the Cline https://x.com/cline/status/1918427793047863337

Here’s the easiest way to build an MCP server: 1. Use Gitingest to convert the FastMCP repo into LLM-ready text. 2. Download the text file. 3. Upload it to Google AI Studio, specifying the MCP server type. Gemini 2.5 Pro handles the rest! https://x.com/akshay_pachaar/status/1918283739760828795

Microsoft releases NLWeb NLWeb uses MCP to make it simple to interact with websites in a standardized way. Devs can now convert any website into an AI app. MCP is to NLWeb what HTTP is to HTML. This went largely unnoticed this week, but it looks like a big deal. https://x.com/omarsar0/status/1925900575666733207

Introducing support for remote MCP servers, image generation, Code Interpreter, and more in the Responses API. https://x.com/OpenAIDevs/status/1925214114445771050

A Step-by-Step Tutorial on Connecting Claude Desktop to Real-Time Web Search and Content Extraction via Tavily AI and Smithery using Model Context Protocol (MCP) In this hands-on tutorial, we’ll learn how to seamlessly connect Claude Desktop to real-time web search and https://x.com/Marktechpost/status/1918877427335622673

Here’s a quick demo of searching, running and using the browser-tools MCP using OneMCP. https://x.com/Ipenywis/status/1921213033973772350

I’m starting to learn that agents, a bit like RAG I guess, is becoming less of a thing and just a control structure. With MCP integrated to InferenceClient, agents are just while loops. No stress. No framework. Just LLMs doing stuff. https://x.com/ben_burtenshaw/status/1925933013889663115

Implementing An Airbnb and Excel MCP Server In this tutorial, we’ll build an MCP server that integrates Airbnb and Excel, and connect it with Cursor IDE. Using natural language, you’ll be able to fetch Airbnb listings for a specific date range and location, and automatically https://x.com/Marktechpost/status/1918543230779703762

Build a MCP server that can read a tweet via CDP, discuss with AI, then save to @raycastapp Notes. https://x.com/Leechael/status/1921555839359373415

AI’s ability to make tasks not just cheaper, but also faster, is underrated in its importance in creating business value. For the task of writing code, AI is a game-changer. It takes so much less effort — and is so much cheaper — to write software with AI assistance than”” / X https://x.com/AndrewYNg/status/1923045958511886549

They went from nearly killing their startup over a weekend to being in talks with OpenAI for a $3 billion acquisition. How @windsurf_ai turned a moment of existential panic into one of AI’s most remarkable success stories. One weekend in 2022, the founders of Exafunction https://x.com/fdaudens/status/1923458065937883509

It’s official… we’re bringing Gemini to Wear OS! 🎉 In the coming months, you’ll be able to chat naturally with Gemini to get things done across apps, like creating a personalized workout playlist or remembering where you put your stuff. Check it out: https://x.com/WearOSbyGoogle/status/1922370010112032820

NEW: Google announces Gemini Diffusion It’s an experimental text diffusion model that leverages parallel generation to achieve insane low latency. It can generate 5x faster than 2.0 Flash Light! https://x.com/omarsar0/status/1924882868477563141

Thinking budgets are coming to 2.5 Pro soon. 💭 You’ll have more control over how much the model thinks before it responds – or you can simply turn it off. https://x.com/GoogleDeepMind/status/1924879658081980761

Glasses with Android XR are lightweight and designed for all-day wear. They work with your phone so you can be hands-free, stay in the moment with friends and complete your to-do list. https://x.com/Google/status/1924899930109575474

Google Beam: Updates to Project Starline from I/O 2025 https://blog.google/technology/research/project-starline-google-beam-update/

Google dropped AlphaEvolve, an AI that discovers algorithms for scientific and computational challenges —Uses Gemini models with auto-evaluation & iteration —Found the first improvement on 1969’s Strassen’s algorithm —Also boosting efficiency for Google https://x.com/adcock_brett/status/1924133683444793819

⭐️Introducing AG-UI; The Agent-User Interaction protocol 👾 Bring your AI agents into Frontend applications, and let them interact with users. Launching with day-0 integrations with LangGraph, CrewAI, Mastra, and AG2 – with more partnerships on the way 👀 https://x.com/CopilotKit/status/1921940427944702001

6/ Experimental Rollout and Developer Access An experimental version of Agent Mode will soon roll out for subscribers, with broader availability expected by summer 2025.”” / X https://x.com/AtomSilverman/status/1924960913347768769

Agent Chat UI now supports file uploads! You can now upload images & PDFs in your chats, and have them automatically passed to your graph with no additional configuration! Try it out with your own graph today 🔗👇 https://x.com/BraceSproul/status/1924888661369487830

AgentOps 0.4.12 is out!! – New @openai Agents SDK examples – Runs now save logs to the @AgentOpsAI dashboard – Support for @IBMwatsonx – Support for @iodotnet LLMs – Improved tool tracking and decorators – Upgraded support for @ag2oss – Critical bugfixes in our @mintlify docs https://x.com/AlexReibman/status/1923487777351664005

Agents from scratch This repo covers the basics of building agents: + Fundamentals + Build an agent + Agent eval + Agent w/ human-in-the-loop + Agent w/ long-term memory Builds to a deployable agent to run your email Code (all open source): https://x.com/RLanceMartin/status/1923061504028619159

AI agents (s/o @Replit) empower solopreneurs and small teams to deliver code at a fraction of the cost it used to. This lowers the barrier to entry for millions of business owners who previously would have to pay an arm and a leg for custom software. Thanks for the shoutout https://x.com/billyjhowell/status/1922460572706058503

AI agents stuck in silos? Meet ACP, an open REST protocol that lets agents from LangChain, CrewAI to AutoGen talk natively. No SDK lock-in, async first, offline discovery. Plug & play. More details here: https://x.com/armand_ruiz/status/1921603264191062248

AI Agents vs. Agentic AI Interesting paper summarizing distinctions between AI Agents and Agentic AI. It also talks about the key ideas, solutions, and the future. Here are my notes: https://x.com/omarsar0/status/1923817691455873420

ARI is the world’s most intelligent deep research agent…even according to OpenAI 👀 ARI beats OpenAI Deep Research 76% of the time—but don’t just take our word for it. Their own model was the judge (thanks, o3). And now, you can try ARI for free 🧵 https://x.com/youdotcom/status/1923030761369649268

Big news for developers! We’re launching two new products to bring code execution & dev environments to AI apps: 📦 Together Code Sandbox ⚡ Together Code Interpreter Now you can run LLM-generated code in secure, scalable, fully managed environments. Details below 👇 https://x.com/togethercompute/status/1924860124436238532

day 1 of OSS release: LangGraph now supports node level caching! Very useful for when you have common parts of workflows Also useful for when you are debugging an agent! Allows for much faster iteration”” / X https://x.com/hwchase17/status/1924557667634172099

Demonstrating end-to-end scientific discovery with Robin: a multi-agent system | FutureHouse https://www.futurehouse.org/research-announcements/demonstrating-end-to-end-scientific-discovery-with-robin-a-multi-agent-system

Deploy AI Pipelines Faster with Hayhooks | Haystack https://haystack.deepset.ai/blog/deploy-ai-pipelines-faster-with-hayhooks

Excited to share a fantastic blog post by @tuanacelik on the fundamentals of memory for agentic systems 🧠🤖 Most stateful agentic applications require a concept of memory to retain context, conversations, interactions across time, but their implementations have been quite https://x.com/jerryjliu0/status/1922866494557339809

First, a crucial mindset shift: stop treating AI like a vending machine for code. Effective AI Engineering is IDE-native collaboration. It’s a strategic partnership blending your insight with AI’s capabilities. Think of AI as a highly skilled (but forgetful) pair programmer.”” / X https://x.com/cline/status/1922846227709952206

For extended tasks, break them down (/smol). Start “”new tasks”” or sessions, carrying over only essential, summarized context to keep the AI focused. https://x.com/cline/status/1922846460489629825

Group Think Multiple Concurrent Reasoning Agents Collaborating at Token Level Granularity https://x.com/_akhaliq/status/1924504013963173961

I asked Codex to convert a legacy project from Python 2.7 to 3.11 and from Django 1.x to 5.0 It literally took 12 minutes If you know, that’s usually weeks of pain This is actually insane https://x.com/flavioAd/status/1923742238502220082

If you’re an engineer who’s feeling hesitant or overwhelmed by the innovation pace of AI coding, this thread is for you. Here’s the 10% of fundamentals that will put you in the 90th percentile of AI engineers. 🧵/many”” / X https://x.com/cline/status/1922846215894597996

Introducing Proxy 1.0 – the world’s most capable web-browsing agent. https://x.com/convergence_ai_/status/1892129466610073931

Introducing the AI Gateway – Vercel https://vercel.com/blog/ai-gateway

Introducing the new @aomniapp – an AI agent that builds other agents for you. Now you can go from idea -> fully working agent with 1 prompt. We provide all the tools and contact datasets to build agents that excel at revenue gen activities. Here are some cool use cases 🧵 https://x.com/dzhng/status/1922750467181814232

Introducing the Simple Agent API A minimal, open-source setup for serving Agents using FastAPI + Postgres – built for speed, clarity, and dev happiness. > clone repo > docker compose up > that’s it! Details below 👇 https://x.com/ashpreetbedi/status/1921602391901692021

Knowledge begets more knowledge, algorithms optimising other algorithms – we are using AlphaEvolve to optimise our AI ecosystem, the flywheels are spinning fast…”” / X https://x.com/demishassabis/status/1922855468549968007

Major milestone announcement! Convergence has signed a definitive agreement to be acquired by @salesforce. Since our founding, Convergence’s mission has been to push the boundaries of AI agents, creating systems that can handle complex, dynamic digital tasks with human-like”” / X https://x.com/convergence_ai_/status/1923022043970248736

Next up, you need to master the AI’s “”context window.”” This is its short-term memory, holding your instructions, code, chat history, etc. It’s finite. When it gets too full (often >50% for many models), AI performance can dip. It might start to “”forget”” earlier parts of your https://x.com/cline/status/1922846284656095468

One prompt → fully built slides, webpage, doc, sheets, podcast. Skywork Super Agents just dropped. Unmatched deep research capabilities, surfacing 10x more source materials than competitors, while delivering professional-grade results at 40% lower cost. They are also leading https://x.com/rohanpaul_ai/status/1925532536786075997

Proactive context management is key to avoiding this. Be aware of how full the window is. For long chats, use techniques to summarize the history (/newtask). https://x.com/cline/status/1922846376783905171

Project Mariner Enhancements Project Mariner now supports up to 10 simultaneous tasks and includes a “”Teach and Repeat”” feature. https://x.com/AtomSilverman/status/1924960890325254461

Say hello to Windows AI Foundry 💫 Meet the unified and reliable platform supporting the AI developer lifecycle from model selection, optimization, fine-tuning and deployment across silicon. Learn more: https://x.com/windowsdev/status/1924610295139299433

Stagehand is now in @crewAIInc, Allow multiple agents to automate a browser by adding the StagehandTool to your Crew. Multi-agent workflows have never been better. @Stagehanddev🤘 https://x.com/browserbasehq/status/1922747491025310199

tasked my team to learn + build agentic workflows for our employees using n8n and google adk… now we are building a sports betting agent lmfao @PropHolliday https://x.com/NickGattuso/status/1919592809906020456

Tasks scheduling is coming super soon. All browsers but will work best on Comet. https://x.com/AravSrinivas/status/1925683786664096051

The models were trained to have good code quality and style. The models are better at avoiding generating extra code that users didn’t ask for. Other neat features include being able to quickly verify summaries of what the agent did (e.g., a preview of test results). https://x.com/omarsar0/status/1923399152060239963

The single biggest lever for better AI-generated code? Planning before AI writes any code. Frontload all relevant context — files, existing patterns, overall goals. Then, collaboratively develop a strategy with your AI. (this is why Cline has Plan/Act modes) https://x.com/cline/status/1922846244185178189

This reflects a larger symptom of how software has historically been built versus how it is built now. Previous generations of software often focused on well-understood market needs. companies would either a) optimize for existing demand, b) introduce marginal improvements or new”” / X https://x.com/c_valenzuelab/status/1924509198848819347

Today, Box announced new AI Agents to work with enterprise content, powering Deep Research, Search, and enhanced Data Extraction. There’s a tremendous amount of value that’s trapped in unstructured data, from contracts to research data, that we can finally unlock with AI. https://x.com/levie/status/1923104047306859006

We launched Agentflow V2 yesterday. One of the most asked questions: What is the difference between Agentflow and automation platforms like n8n, Make, or Zapier? 👇 https://x.com/FlowiseAI/status/1923416261565825192

We’re excited to release an interactive guide highlighting the definitive set of principles for building AI agents 🔥 Based on the popular 12-Factor agents repo by @dexhorthy. We packaged the principles into an interactive website and Colab notebook with working code examples, https://x.com/jerryjliu0/status/1925961220948894101

Welcome – Strands Agents SDK https://strandsagents.com/latest/

When running the task, the agent runs automatic tests after making changes. It also runs linters to verify that the code matches style expectations. It can refer to its MD file for further guidance on how to carry out these important tasks. https://x.com/omarsar0/status/1923397665343041585

The most obvious, lowest risk way to use AI (and to get a sense of how good it is) is to ask it for second opinions in your area of expertise. This works across most fields. And I’d go further: increasingly, not using AI as a second opinion is going to lead to worse outcomes.”” / X https://x.com/emollick/status/1924152902907494870

🚀 AI x Bitcoin is here — and the Maestro MCP is making it real. With our Model Context Protocol-aware server, AI agents can natively interact with Bitcoin via the Maestro API: 🎵 Query blocks, transactions, and addresses 🎵 Track balances + UTXOs 🎵 Let LLMs reason about live https://x.com/GoMaestroOrg/status/1921922248753262791

I built an app that is 100% accessible via MCP plus two more and used all three together to accomplish a single task. Their ability to integrate with each other without having to do any of the “”glue”” code turns your AI assistant into an actual assistant! https://x.com/kentcdodds/status/1917983062912319627

MCP Observability Alpha on @AgentOpsAI DM for early access”” / X https://x.com/AtomSilverman/status/1923961881540354303

MCP vs Agent2Agent Protocol clearly explained: MCP connects LLMs to tools and data. A2A enables AI agents to communicate with each other. https://x.com/Saboo_Shubham_/status/1922116392079524345

OpenMemory MCP provides a persistent memory layer for AI tools like Claude, Cursor and Windsurf. It enables AI Agents to securely read and write to a shared memory. Runs 100% locally on your computer. https://x.com/Saboo_Shubham_/status/1923428646078779745

PraisonAI v2.2 is here! Create AI Agents as MCP server, MCP Client, MCP Stdio SSE Support & Deploy. Just three lines of code More information below: https://x.com/MervinPraison/status/1923034615410778485

This guy literally connected AI agents and real-world APIs using MCP in 10 minutes https://x.com/aaditsh/status/1923038665611280671

Wait wait this meme became a official collaboration??? https://x.com/cloneofsimo/status/1925993220468560003

𝐘𝐨𝐮𝐫 𝐂𝐀𝐌𝐄𝐋-𝐀𝐈 𝐀𝐠𝐞𝐧𝐭 𝐢𝐬 𝐧𝐨𝐰 𝐚 𝐭𝐨𝐨𝐥. You can now serve any CAMEL-AI agent as an MCP server, letting external clients like Claude Desktop call it directly. Here’s how it works: 🧵 Day 2/7 https://x.com/CamelAIOrg/status/1922258375686963639

Very strong agentic performance by the new Claude 4 Opus and Sonnet, placing 1st and 3rd on the GAIA benchmark. Notice, this leaderboard unfortunately doesn’t include Google’s latest Gemini models nor OpenAI’s o4-mini or o3. https://x.com/scaling01/status/1926017165108375607

Opus (makes a simple math error) https://x.com/lefthanddraft/status/1925617749704778145

For people who don’t like Claude’s behavior here (and I think it’s totally valid to disagree with it), I encourage you to describe your own recommended policy for agentic models should do when users ask them to help commit heinous crimes. Your options are (1) actively try to”” / X https://x.com/johnschulman2/status/1925960286281838757

Zed just dropped the fastest Agentic code editor built in Rust. Works with Claude Sonnet 3.7, Gemini 2.5 Pro and local models via Ollama. 100% opensource. https://x.com/Saboo_Shubham_/status/1921754009221906848

Now I know how to build fully working REST APIs without knowing any programming language or framework — in under 1 minute, without writing a single line of code manually 🤯 I used @neondatabase MCP server together with GitHub Copilot for Azure in VS code #MVPBuzz #GitHubCopilot https://x.com/BoburUmurzokov/status/1921509163487654214

What is the Agentic Web? 8 important updates from #MSBuild 1. Agents as first-class business & M365 entities. 2. Microsoft Entra Agent ID for knowing your agents. 3. NLWeb, MCP, Open Protocols as the foundation layer for an open agent ecosystem. 4. Agentic DevOps https://x.com/TheTuringPost/status/1924910543154119105

Agents & MCP are great… but how can you DEPLOY an agent with MCP tools? Here’s the fastest way I’ve found to build and deploy with the OpenAI Agents SDK in <3 minutes. https://x.com/mattppal/status/1921566074145001842

🚀 LangGraph Platform Now Supports MCP! Every deployed agent on LangGraph Platform now exposes its own MCP endpoint. Leverage your agents as tools in any client supporting streamable HTTP for MCP— no custom code or infrastructure required. 📚Docs: https://x.com/LangChainAI/status/1924863441862562279

Guide on how to Bridge @LangChainAI & @CamelAIOrg agents via the Model Context Protocol for seamless, cross-framework AI Agent collaboration. https://x.com/CamelAIOrg/status/1919750181622579627

The power of Perplexity right at your fingertips, with @code + the Perplexity MCP server in agent mode ✨ https://x.com/code/status/1919475022948692297

Codex on ChatGPT iOS:”” / X https://x.com/gdb/status/1924703367718388123

Exclusive: Google Sees Smart Glasses as the ‘Next Frontier’ for AI. And It’s Not Working Alone – CNET https://www.cnet.com/tech/computing/exclusive-google-sees-xr-smart-glasses-as-the-ultimate-use-for-ai-with-warby-parker-samsung-and-xreal-on-deck/#ftag=CAD590a51e

Google I/O 2025: Gemini on Android XR coming to glasses, headsets https://blog.google/products/android/android-xr-gemini-glasses-headsets/

How Apple Intelligence and Siri AI Went So Wrong – Bloomberg https://www.bloomberg.com/news/features/2025-05-18/how-apple-intelligence-and-siri-ai-went-so-wrong?embedded-checkout=true

Agents taking actions in your ad account🐳 Coming next week to Moby Agents… https://x.com/AY_Orbach/status/1923425142039822517

II-Agent – Intelligent Internet https://ii.inc/web/blog/post/ii-agent

We released Devstral. It is a 24B model released under the Apache 2.0 license. It the best open model on SWE-Bench verified today. You can check our blog post or test it with OpenHands (from @allhands_ai ) following the instructions here: https://x.com/b_roziere/status/1925194095359676768

Deep Think’ boosts the performance of Google’s flagship Google Gemini AI model | TechCrunch https://techcrunch.com/2025/05/20/deep-think-boosts-the-performance-of-googles-flagship-google-gemini-ai-model/

(4) Evaluation Driven Development for Agentic Systems. https://www.newsletter.swirlai.com/p/evaluation-driven-development-for

3 new products going out today! First: AI MEETING NOTES 🎙️✨ Syncs with @NotionCalendar, or works on any @NotionHQ page (just type /meet). – No more typing notes! – No creepy bots – State-of-the-art AI summaries w/ templates – Ask questions anytime – Transcriptions that https://x.com/ivanhzhao/status/1922312312486297857

At @AgentOpsAI we found $2.7m in revenue in a customers CRM How? By building an AI agent that analyzes our customer’s CRM + finds opportunities to engage with EXISTING customers. Comment @AgentOpsAI for early access https://x.com/AtomSilverman/status/1924521025666220232

Everybody is doing AI agents these days. Here’s a great example of an application that gets it right: @genspark_ai AI Sheets lets you literally talk to your spreadsheets. Upload your files, ask any data analysis question, and it automatically analyzes everything, pulls the info, https://x.com/fchollet/status/1924509605050327475

Spreadsheets haven’t changed much in decades. Until now. Here is an AI Agent that can take your data and run a complete data analysis on it. It can generate charts, summaries, and reports. You only need to know what questions to ask. Check out this video. I created it. https://x.com/svpino/status/1923092831532613864

it is amazing and exciting how much software one person is going to be able to create with tools like this. “”you can just do things”” is one of my favorite memes; i didn’t think it would apply to AI itself, and its users, in such an important way so soon.”” / X https://x.com/sama/status/1923399498019021298

[15 Apr 2025] Gemini’s AlphaEvolve agent uses Gemini 2.0 to find new Math and cuts Gemini cost 1% — without RL https://x.com/swyx/status/1923367096995443189

We only need ONE example for RLVR on LLMs to achieve significant improvement on math tasks! 📍RLVR with one training example can boost: – Qwen2.5-Math-1.5B: 36.0% → 73.6% – Qwen2.5-Math-7B: 51.0% → 79.2% on MATH500. 📄 Paper: https://x.com/ypwang61/status/1917596101953348000

Semantic Layer Summit https://www.semanticlayersummit.com/

There should be no AI button https://kojo.blog/ai-button/

To make sure your AI agent is not bullshitting you, you need to evaluate its reasoning… but to do so automatically, you need an LLM… 🤔so how do you evaluate the trace evaluator? With TRAIL, which contains: – a full taxonomy of agent errors and most frequent failure cases,”” / X https://x.com/clefourrier/status/1922923060622971360

“The opportunity with AI is truly as big as it gets. And it will be up to this wave of developers, technology builders and problem solvers to make sure its benefits reach as many people as possible.” – @sundarpichai #GoogleIO”” / X https://x.com/Google/status/1924901038424781307

1. AlphaEvolve AlphaEvolve is a coding agent developed by Google DeepMind that uses LLM-guided evolution to discover new algorithms and optimize computational systems. https://x.com/dair_ai/status/1924150361750655178

7/ ⚙️ Early Agent Mode adopters can test and provide feedback on new features. 🛠️ Developers can build and refine agentic applications using the Gemini API.”” / X https://x.com/AtomSilverman/status/1924960917030334558

A Github repo I made to help you learn and build simple agents using Google ADK (Agent Development Kit) – Google’s official framework to build powerful Al agents in a structured and simple way 😉 👇 https://x.com/hrishikeshhh_/status/1921930136120664556

Attending Build with AI Ibadan 2025 by @gdgibadan. Just built an Agent with Google ADK. #BuildwithAiibadan2025 #BWAI2025 https://x.com/Josylad/status/1921172852704793000

Google ADK has in built capability to serve your agents via FastAPI endpoints: `adk api_server` This allows you to create your own custom frontend UI for your agents. In this 3rd tutorial, let’s build a custom UI with Streamlit in Cursor for our ElevenLabs TTS MCP agent 👇 https://x.com/chongdashu/status/1921351038457585970

Google launches stand-alone NotebookLM apps for Android and iOS | TechCrunch https://techcrunch.com/2025/05/19/google-launches-standalone-notebooklm-app-for-android/

Jules: Google’s autonomous AI coding agent https://blog.google/technology/google-labs/jules/

Just built an A2A Agent with Google ADK + Gemini Follow me -> @theailanguage https://x.com/theailanguage/status/1920747094907769270

Simple multi-agent search system using Google ADK. Simple to orchestrate but quite powerful! https://x.com/omarsar0/status/1920853608716812398

This feels hard to describe! Our Research team is cooking. @GoogleDeepMind AlphaEvolve is evolutionary coding agent using an ensemble of Gemini 2.0 Flash & Pro to discover and optimize algorithms that solve complex problems in mathematics and computing. Compared to other SWE https://x.com/_philschmid/status/1922913381746352188

6. Gemini AI updates: —Gemini 2.5 Pro Deep Think, a new mode that uses parallel thinking to solve math and coding problems —Gemini 2.5 Flash, upgraded with improved performance across benchmarks Both also include native audio outputs across languages https://x.com/rowancheung/status/1925084387894300762

Tired of a never-ending code backlog? 🫥 Meet Jules: our AI dev agent that helps you clear your to-do list by multitasking across your codebases – freeing you up to focus on other things. Get caught up on @JulesAgent → https://x.com/GoogleDeepMind/status/1925283511902048537

VP @Google @JoshWoodward takes the stage at #GoogleIO to explain how we’re making @GeminiApp into the most personal, proactive and powerful AI assistant. ↓ https://x.com/GoogleDeepMind/status/1924890757401321516

Check out @shresbm and my blog on how get started building AI agents with Google Gemini and these awesome open-source tools: https://x.com/_philschmid/status/1924886346444710135

What is cool about AlphaEvolve? It’s an evolutionary coding agent from @GoogleDeepMind that finds new algorithms and scientific solutions for complex tasks like math problems and even chip design. It’s powered by top Gemini models and automated evaluators and works autonomously https://x.com/TheTuringPost/status/1925676395629298082

code agents > tool calling https://x.com/fdaudens/status/1923397074495627531

Let’s goo! Starting today you can access 5000+ LLMs powered by MLX directly from Hugging Face Hub! 🔥 All you need to do is click `Use this model` from any compatible model \o/ That’s it, all you need to get blazingly fast intelligence right at your terminal! What would you https://x.com/reach_vb/status/1924517049474101412

Wow, @jandotai is now Apache licensed – big win for on device community! 🔥 Way to go team! https://x.com/reach_vb/status/1925475572219568269

👀 Learn how @cognition_labs built Devin – a fully autonomous software engineer teammate! 🔥 In his LangChain Interrupt talk, Russell Kaplan, President at Cognition, shares how they brought Devin to life — and what it takes to build powerful, production-ready agents. In this https://x.com/LangChainAI/status/1926012891926286463

Azure AI Foundry Agent Service is now generally available, and it comes with first-class LlamaIndex support! The Agent Service allows enterprise customers to build ➡️ Customer support assistants that handle inquiries and reduce response times ➡️ Process automation bots that https://x.com/llama_index/status/1924502129974411504

GitHub Copilot now has a coding agent embedded right where you already collaborate with developers: on GitHub. And yes, you can access it from VS Code too. 🤖 https://x.com/ashtom/status/1924496497543901407

Transforming R&D with agentic AI: Introducing Microsoft Discovery | Microsoft Azure Blog https://azure.microsoft.com/en-us/blog/transforming-rd-with-agentic-ai-introducing-microsoft-discovery/

True game changer comes from GitHub! We are now in the era of Agentic DevOps 😎 GitHub Copilot has gone beyond suggesting lines of code. It now supports the entire software development lifecycle – from planning and implementation to updates, tests, and debugging. ▪️ Here’s how https://x.com/TheTuringPost/status/1924495827999031709

Use and share agents in SharePoint in Teams chats. At mention your agent, get instant responses and precise information for team discussions. See it here: https://x.com/MSFTMechanics/status/1924172685849801119

We are at @MicrosoftBuild and there will be tons of fascinating launches tomorrow but today, let us share a nice human story from proud dad – aka @Microsoft’s CTO – @kevin_scott And it’s true – that the moment we are in in AI – everyone becomes much more capable now to build https://x.com/TheTuringPost/status/1924296119582093752

You don’t need to be a data scientist or prompt engineer to get fine-tuned AI responses. Copilot Tuning lets anyone create task-specific models in just a few clicks. See how. https://x.com/MSFTMechanics/status/1924621134558908916

Inspired by Microsoft’s A2A vision, I experimented with a multi-agent setup! 🚀 Microsoft’s AutoGen & Google ADK agents discover each other via A2A, extract key metrics from a quarterly report, benchmark vs. industry, & auto-draft a 1-page exec summary. Demo Video👇 #A2A https://x.com/Prasan09V/status/1920867425735897225

Breaking: Microsoft adds @openai’s main rival’s model : @grok is coming to their foundry model collection. @elonmusk has spoken https://x.com/TheTuringPost/status/1924508051253653745

3. Agent factory: Foundry is the complete app platform for building apps and agents. We are adding support for more models from Grok, Hugging Face, Meta, Mistral, and more. Plus: Agentic retrieval in Azure AI Search, Foundry Agent Service, integration with Copilot Studio, and https://x.com/satyanadella/status/1924535900463366247

Devstral | Mistral AI https://mistral.ai/news/devstral

Meet Devstral, our SOTA open model designed specifically for coding agents and developed with @allhands_ai https://x.com/MistralAI/status/1925191937792901298

“I love using codex from my phone, kicking off tasks when I have an idea then finishing them when I’m back at my computer – feels like magic!” / X
https://x.com/fouadmatin/status/1924603959966330906

@OpenAIDevs Codex has created more than 50 PRs for me today. Thanks for the generous rate limits!”” / X https://x.com/AIMachineDream/status/1923521417481708019

Codex is neat, but I really wish that OpenAI had gone the extra step of making it accessible to non-coders. Not that non-coders should expect to make complex or high-quality applications with today’s SWE agents, but democratizing making of small tools can make a big difference.”” / X https://x.com/emollick/status/1923969586954645921

GPT 4.1 is now my senior software architect / product manager @openai GPT-4.1 is great at instruction following so I asked it to work with me to build an app It’s really good for iterating — great for planning and brainstorming! See how we built a plan together 👇🧵 https://x.com/donvito/status/1922864617241575462

I used to be a software engineer, but had never checked in code at OpenAI—until Codex. I checked in two bug fixes earlier this week, written by Codex while I did other work.”” / X https://x.com/kevinweil/status/1923403371307753510

It operates in OpenAI infrastructure, so you don’t need to worry about compute. Each task runs in its own micro VM sandbox, file system, CPU, memory, etc. The agent can operate freely in that environment. The agent has learned all kinds of commands to operate effectively.”” / X https://x.com/omarsar0/status/1923395546225528955

More info on the codex-mini-latest model that’s available in the Responses API. It’s a fine-tuned version of o4-mini, specifically designed for use in Codex CLI. – 200K context window – 100K max outputs tokens – reasoning tokens supported https://x.com/omarsar0/status/1923408662422311203

New tools and features in the Responses API | OpenAI https://openai.com/index/new-tools-and-features-in-the-responses-api/

OpenAI launched Codex, a new coding agent that builds features and fixes bugs autonomously It can work on many tasks in parallel, navigate codebases, implement changes, and propose pull requests for review Available for Pro, Enterprise, and Team users https://x.com/adcock_brett/status/1924133661072396293

OpenAI showed an environment that they fully configured, such as the Codex-CLI. As configuration, you can also provide steerability and important instructions to the model via MD files. https://x.com/omarsar0/status/1923396913123958830

OpenAI: Scaling PostgreSQL to the Next Level | PixelsTech https://www.pixelstech.net/article/1747708863-openai%3a-scaling-postgresql-to-the-next-level

structured outputs in the API just got even more structured – including support for regex!”” / X https://x.com/stevenheidel/status/1924924775266144565

the 80% done projects all finally getting finished and automatically maintained is something i’m quite excited for!”” / X https://x.com/sama/status/1924629906669109492

We’re launching a research preview of Codex: a cloud-based software engineering agent that can work on many tasks in parallel. Rolling out to Pro, Enterprise, and Team users in ChatGPT starting today. https://x.com/OpenAI/status/1923416740073033873

We’ve made some improvements to Codex CLI, based on your feedback: ⬥ Sign in with ChatGPT to quickly connect your API org ⬥ New model, codex-mini, optimized for low-latency code Q&A and editing https://x.com/OpenAIDevs/status/1923467278701498418

wow so far DMs in a shockingly even dead heat between “”you made a software engineer and didn’t include unlimited use in the $20 plan? fuck you!”” and “”you made a software engineer and you’re not charging $20k a month for it? what the fuck?”””” / X https://x.com/sama/status/1923403788947275954

codex-mini-latest is available on the Responses API and priced at $1.50 per 1M input tokens and $6 per 1M output tokens, with a 75% prompt caching discount. No image inputs yet. No way to course-correct the agent while it’s working. Asynchronous collaboration with code agents”” / X https://x.com/omarsar0/status/1923403072669102399

ChatGPT Codex: The Missing Manual – Latent.Space https://www.latent.space/p/codex

ChatGPT Codex: The Missing Manual – YouTube https://www.youtube.com/watch?v=LIHP4BqwSw0

Releasing the OpenAI to Z Challenge — using o3/o4 mini and GPT 4.1 models to discover previously unknown archaeological sites:”” / X https://x.com/gdb/status/1923105670464782516

🚀 Build no-code agents with Open Agent Platform (OAP), our open-source, citizen developer platform for building, prototyping, and deploying agents. With Open Agent Platform, you can: 🔧 Build agents via a web UI— no heavy coding required 🧠 Connect to RAG servers for better https://x.com/LangChainAI/status/1925224206473842691

best agent chat ui, all open source!”” / X https://x.com/hwchase17/status/1924892270085448072

That’s a wrap on Interrupt 2025! 🚀 🌎 800 agent engineers from across the globe gathered in San Francisco for LangChain’s first industry conference to hear stories of teams building agents – and we’re still riding the high! @Cisco, @Uber, @Replit, @LinkedIn, @BlackRock,”” / X https://x.com/LangChainAI/status/1923089610772807959

This Thursday, the LlamaIndex team is hosting our first Discord office hours session! Drop in to ask anything LlamaIndex, and for an events driven agent workflows run-through and live coding session. See you there Thursday! Join the LlamaIndex Discord and add yourself to the https://x.com/llama_index/status/1924527932258845178

What has just been open-sourced by @Microsoft: ▪️ GitHub Copilot in Visual Studio Code ▪️ Natural Language Web (NL Web) ▪️ TypeAgent ▪️ Windows Subsystem for Linux (WSL) ▪️ Edit command-line text editor + Microsoft showed strong commitment to MCP as its standard open protocol https://x.com/TheTuringPost/status/1924598434507743728

🚀 Qwen Web Dev just got even better! ✨ One prompt. One website. One click to deploy. 💡 Let your creativity shine — and share it with the world. 🔥 What will you build today? https://x.com/Alibaba_Qwen/status/1924299942614688111

🚀 Learn how to build lightweight, real-time @AgnoAgi agents for medical and legal tasks without hogging resources in the latest tutorial by @pavan_mantha1. How? ➡️ Modular agents that are easy to update without a full rebuild ➡️ Tracking performance and interactions with https://x.com/qdrant_engine/status/1924348846647259626

Today Demis announced Deep Think which marks our progression to greater test-time compute and stronger reasoning capabilities in Gemini 💎 Highlighting USAMO which is a very challenging set of held-out math problems, we’re now at 49% accuracy. This is equivalent to the top https://x.com/jack_w_rae/status/1924897579122491523

🔥 one-click install MCP servers in @windsurf_ai 🎯 >>> windsurf introduced ‘Cascade Plugins’ quick walkthrough for you below 👇 🎥 you can install an MCP Server with a single click includes: >> knowledge graph memory >> github >> postgres db …and much more. 💥 check https://x.com/daniel_mac8/status/1920472000654463161

🤖 From this week’s issue: The Haystack team announced Hayhooks, an open-source package that can turn Haystack pipelines into production-ready REST APIs or expose them as MCP tools with full customization and minimal code. https://x.com/dl_weekly/status/1925961718808649966

🪄 Our new MCP Server lets you manage data through simple prompts. You can create, read, update, or delete records by having natural conversations with AI—without needing any complex API knowledge. Check it out 👇 #ai #mcpserver #mcp #Baserow https://x.com/baserow/status/1920028513521778708

🚀 Announcing: MCP Startup Boilerplate v0.0.1! ⭐ https://x.com/fkadev/status/1920527751087288711

A2A and MCP – Tutorial with code Demo – YouTube https://www.youtube.com/watch?v=nSjj1ZaNP2c

Agno makes it incredibly easy to add multiple MCP servers. @AgnoAgi @ashpreetbedi The example shows 5 MCP servers — and you can add even more! Each MCP server comes with numerous tools, giving your LLM access to a powerful and diverse toolset. https://x.com/prompt48/status/1921069384899973397

All this vibe coding is making me hungry… So I built a remote MCP Server with @CloudflareDev that lets me order 🍕 directly from within Claude Code. https://x.com/rickyrobinett/status/1918049664466862349

An MCP server to chat with any GitHub repo! It is powered by GitIngest, and has two tools: – git_directory_structure → to read the directory structure. – git_read_important_files → to read files. 100% open-source! https://x.com/DailyDoseOfDS_/status/1920773272742097228

BIG NEWS for #PHP devs! ✨ Tired of finding only Python or TypeScript tools for building an MCP server? Introducing PHP MCP Server v1.0.0 🚀, a core, framework-agnostic Model Context Protocol implementation in PHP. Let’s dive in 🧵 https://x.com/CodeWithKyrian/status/1917850839546294630

Built an MCP server that decompiles smart contracts on the Monad testnet and explains what the code does. Now you can read unverified contracts, identify vulnerabilities, and analyze how protocols actually work under the hood. https://x.com/consolexyz/status/1917921098806816903

Claude web app now includes an “”Add more”” option in the Integrations section of settings, marked as BETA, using Model Context Protocol (MCP) “”Custom integrations”” – “”Connect data and tools from other sources.”” https://x.com/btibor91/status/1917484367955517677

I built an MCP server boilerplate* for vibe coders to launch paid tools in less than 20 minutes – hosted free on cloudflare – stripe for payments – google or github user login – easy to set up * it’s $25 if you want it: mcpboilerplate.⁠com https://x.com/iannuttall/status/1920484902752981012

I Built an MCP server for Reddit: https://x.com/Arindam_1729/status/1920932635850760492

I built this game in just 2 hours with the help of AI. Blender and Unity MCP, vibe coding, 3D & sound ML tools. 🪄✨ Join me on 16.05 @AMazeFest workshop — I’ll guide you through the process ✨ https://x.com/vladstorm_/status/1920546549777494382

I have some big news about FastHTML and @AnthropicAI Claude 4 🙂 https://x.com/jeremyphoward/status/1925679459098566687

I’m f*ing sick of cloning repos, setting them up, and debugging nonsense just to run a simple MCP. So I built a one-click desktop app that runs any MCP — with hundreds available out of the box. And yeah, it’s completely FREE. ➜ onemcp .io https://x.com/Ipenywis/status/1921212869636747464

Running MCP servers today = herding cats… on fire… in a hurricane. 🐈🔥🌪️ Docker said “”nah.”” 🛠️ Runtime fixes 🔐 Secrets locked down 🌐 Dynamic tool discovery Docker is making #MCP production-ready… coming soon! Read how: https://x.com/Docker/status/1918388293906948188

The @AgentOpsAI is co-hosting the Amazon MCP hackathon this Saturday! Check it out 👇”” / X https://x.com/AlexReibman/status/1923142838767829270

Turn your Api Endpoints into MCP Server with just 4 lines of code 🔥 I am talking about fastapi-mcp, do checkout this new library. Full video on YT: https://x.com/debsourya005/status/1918976584716898404

Using Firecrawl MCP as a Company Researcher is incredible 👀 Instantly uncover everything about any company from products and pricing to FAQs, blogs, and beyond right in Claude Desktop. Open source and powered by @firecrawl_dev 🔥 https://x.com/ericciarla/status/1920508021580706267

We’re excited to launch OpenMemory MCP, a private memory for MCP-compatible clients powered by @mem0ai Today, most AI assistants and dev tools operate without memory. You plan your roadmap in Claude, implement tasks in Cursor, but none of them know what the other did. Each tool https://x.com/taranjeetio/status/1922315139057070154

y’all know that @huggingface Spaces is the app store of AI what you don’t know is all these apps are MCP Servers thanks to @Gradio MCP server 😮 plug it to your favorite provider 🤠 insanely powerful! https://x.com/mervenoyann/status/1923406695000093095

MCP meets Ollama! In this video you’ll learn how build a 100% local MCP client that you can connect to any MCP server. 100% open-source code, step-by-step guide: https://x.com/akshay_pachaar/status/1921877497475485778

Hugging Face just dropped Tiny Agents into its own NPM package a squad of lightweight composable agents built on Hugging Face’s Inference Client and MCP stack https://x.com/_akhaliq/status/1924871432816783681

Codex is now available in the ChatGPT iOS app! Start new tasks, view diffs, ask for changes, and even push PRs—all on the go. And you can keep tabs on Codex with live activities on your lock screen, or pick things up again when you’re back at your computer. 🏃🤳 https://x.com/OpenAIDevs/status/1924601527898951914

We’ve also been working on upgrades to Project Astra, including more natural voice output with native audio, improved memory, & computer control. Over time we’ll bring these new capabilities to Gemini Live & new experiences in Search, Live API for devs, and new form factors like”” / X https://x.com/Google/status/1924883459253649494

📰 News in Arena: Mistral Medium 3 makes a strong debut with the community! Highlights: 💠 #11 overall in chat: a +90 point leap from Mistral Large 💠Top-tier in technical domains (#5 in Math, #7 in Hard Prompts & Coding) 💠#9 in WebDev Arena Congrats to @MistralAI on the https://x.com/lmarena_ai/status/1924482515244622120

Expanding your AI Horizons, Summer ’25 Edition (2025) – Shopify https://www.shopify.com/blog/expanding-your-ai-horizons-summer-edition-25

Google also announced plans to expand Gemini to more platforms, including smartwatches, cars, and XR headsets. Plus, it debuted ‘AI Futures Fund’ to give startups early access to advanced models, funding, and technical expertise to boost development https://x.com/adcock_brett/status/1924133705913655709

Together AI and Agentica launched DeepCoder-14B-Preview, a code generation model that competes with top reasoning models like OpenAI’s o1 and DeepSeek-R1, but at a fraction of the size. Built on a 14 billion parameter Qwen model, DeepCoder uses a highly optimized reinforcement https://x.com/DeepLearningAI/status/1924570759793369303

Watch Gemini 2.5 Pro Deep Think tackle the challenging “”catch a mole”” problem from @Codeforces. 🪤 This new mode is based on our research in parallel thinking and considers multiple hypotheses before responding. See it in action ↓ https://x.com/GoogleDeepMind/status/1925676461651791992

Google will let you ‘try on’ clothes with AI | The Verge https://www.theverge.com/news/670346/google-try-on-clothes-ai-shopping-io-2025

Dropping this Website to Image Generator tomorrow! Built with @GoogleAI’s new Imagen 4 and Gemini Flash 👀 https://x.com/ericciarla/status/1925340344109162721

AlphaEvolve is deeply disturbing for RL diehards like yours truly Maybe midtrain + good search is all you need for AI for scientific innovation And what an alpha move to keep it secret for a year Congrats big G”” / X https://x.com/_jasonwei/status/1923091260354531612

With Google’s AlphaEvolve, we have evidence that LLMs can discover novel & useful ideas, when put together with the right tooling. These results are impressive: given 50 open math problems, the AI rediscovered the leading approach 75% of the time & improved on it 20% of the time https://x.com/emollick/status/1922860271208456588

🤖 From this week’s issue: Google introduced Veo 3 and Imagen 4, and a new tool for filmmaking called Flow. https://x.com/dl_weekly/status/1925904865164689539

Flow is available for Google AI Pro and Ultra plan subscribers in the US, with more countries coming soon. Try it here ↓ https://x.com/GoogleDeepMind/status/1924896551496716667

Get into the zone with Flow. 🎬 It combines the best of our most advanced models Veo, Imagen and Gemini into 1️⃣ master filmmaking tool – helping you weave cinematic clips, dynamic scenes, and compelling narratives into stories with consistent results. https://x.com/GoogleDeepMind/status/1924896540138586528

Introducing Flow: a new type of AI filmmaking tool that combines the best of Veo, Imagen and Gemini — built with and for creatives. Flow helps you maintain character and visual consistency from one clip to the next. See how emerging filmmakers are using it 🎥 https://x.com/Google/status/1924896843441336440

Google Flow is the closest thing I’ve seen to a multimodal AI studio for creatives. And it’s available today. It feels like a generative camera and soundstage, where you can “capture” all the shots that you need — and feel confident you have everything to put it together in https://x.com/bilawalsidhu/status/1924901783664787942

ollama run devstral Devstral from @MistralAI and @allhands_ai is available on Ollama!”” / X https://x.com/ollama/status/1925198849263747147

.@NachoSoto builds faster with Codex, kicking off tasks from his phone and starting every change with a PR already drafted. https://x.com/OpenAI/status/1923416746150523221

@levie Thanks for having me! Always fun to catch up and excited for more together between @OpenAI and @Box.”” / X https://x.com/kevinweil/status/1924285487491240162

📢 Big News! We’re teaming up with @OpenAI to launch the OpenAI to Z Challenge – Kaggle’s first-ever featured Hackathon! https://x.com/kaggle/status/1923063468430610894

chatgpt daily active users have increased >4x over the last year. messages/day by much more than that. at the same time, the engineering team has greatly increased reliability and is now making real progress on speed. significant scale to be doing this at; great work!”” / X https://x.com/sama/status/1924844678144516576

GPT-4.1 for instruction following:”” / X https://x.com/gdb/status/1923228487885742547

Wolfram’s guess about why ChatGPT was suddenly so much more capable than expected is that, as the scale of the model increased, it found the hidden deep structural patterns of writing which is similar to the structure of human thought (“writing is thinking” is a common assertion) https://x.com/emollick/status/1923533622071480727

AM-Thinking-v1 looks like a strong 32B reasoning model. It outperforms DeepSeek-R1 and rivals Qwen3-235B-A22B. All built on top of open-source. The 32B scale is a great size for deployment and fine-tuning. Best part: the model is open-sourced! https://x.com/omarsar0/status/1922668488826741061

codex-1 shows strong performance even without AGENTS .md or custom scaffolding. It consistently outperforms o3-high on SWE tasks. https://x.com/omarsar0/status/1923401100419387759