Agents and Copilots: AI News Week Ending 08/29/2025

Image created with Flux Pro v1.1 Ultra. Image prompt: Giant “100” as pure white negative‑space cutout dominating the frame; minimalist poster style; task‑graph nodes and checklists weaving through the zeros as an autonomous agent orchestrates tasks; cobalt matte backdrop; high contrast, crisp edges, soft studio light, no other text, no logos

AGENTS.md is a simple, open format for guiding coding agents. It works like a README but is designed specifically for AI agents to understand your codebase. A single file works across Cursor, Claude, OpenAI Codex, Google Jules, and Factory AI. https://x.com/Saboo_Shubham_/status/1957992746372985213
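
As a rough illustration of how a tool might locate such a file, the sketch below walks up from a working directory and returns the nearest AGENTS.md. The helper name `find_agents_md` and the nearest-file-wins lookup rule are assumptions made for this example; the post above does not describe how any particular agent resolves the file.

```python
from pathlib import Path
from typing import Optional


def find_agents_md(start: Path, filename: str = "AGENTS.md") -> Optional[Path]:
    """Return the nearest `filename` at or above the directory `start`, or None.

    Hypothetical helper: the nearest-file-wins rule is an assumption.
    """
    current = start.resolve()
    # Check the starting directory first, then each parent up to the root.
    for directory in [current, *current.parents]:
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None
```

Calling `find_agents_md(Path.cwd())` from inside a nested package would return the closest file, which lets a subproject carry its own guidance without touching the one at the repository root.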

This is the word on the street I’ve been hearing in SF. Lots of people quietly prefer GPT-5 for code over Claude, counter to the memetic narrative on the TL https://x.com/BasedBeffJezos/status/1958942764747694593

Connectors and persistent conversations in the Responses API: https://x.com/gdb/status/1958691151139283454

Blue (@heyBlueX) lets you control your phone’s apps by voice so tasks actually get finished, hands-free. It handles messages, email, and actions across apps by tapping and typing as you would. https://x.com/ycombinator/status/1958182627422146811

ByteDance just opensourced a desktop automation AI Agent. This agent can use any desktop app, open files, and browse websites using vision models running locally. 100% Free, Opensource, and Local. https://x.com/unwind_ai_/status/1956538069311500514

Gemini Live updates: More Google app connections and visual help https://blog.google/products/gemini/gemini-live-updates-august-2025/

Two big updates to the Responses API today. 🖇️ Connectors — Pull context from Gmail, Google Calendar, Dropbox, and more in a single API call. 💬 Conversations — Persist chat threads for your users, without running your own database. More below: https://x.com/OpenAIDevs/status/1958660207745409120

Gemini for Government – brings together the best of Google’s AI-optimized & accredited commercial cloud, SOTA Gemini models, and agentic solutions to support the missions of government agencies! All for less than $0.50 per agency : ) https://x.com/OfficialLoganK/status/1958549753148408045

I just love this. The new =COPILOT() function in Excel lets you analyze, generate content, and brainstorm directly in the grid. https://x.com/satyanadella/status/1957493248718680571

Build rich experiences with connectors to: – Read emails – Fetch calendar events – Search files and chats Gmail, Google Calendar, Drive, Dropbox, Teams, Outlook Calendar + Email, and SharePoint are available now. They work with deep research, too! https://x.com/OpenAIDevs/status/1958660214057791853

💥 We launched a host of great features in Codex today: * A new extension for Cursor, VSCode, Windsurf, and the like * A much improved Codex CLI running in your local environment * Ability to manage both local and cloud Codex tasks seamlessly, including… * … Codex-driven https://x.com/kevinweil/status/1960854500278985189

📣 We shipped major improvements to the Codex CLI today GPT-5, with usage included in your ChatGPT Plan (no API key needed) Upgraded prompt, harness, approvals & sandboxing logic… you name it Get the latest: 1. `npm install -g @openai/codex` -> v0.16+ 2. `codex login` https://x.com/embirico/status/1953526045573059056

BTW, I’ve basically stopped using Opus entirely and I now have several Codex tabs with GPT-5-high working on different tasks across the 3 codebases (HVM, Bend, Kolmo). Progress has never been so intense. My job now is basically passing well-specified tasks to Codex, and reviewing https://x.com/VictorTaelin/status/1958543021324029980

codex cli with gpt-5 is getting pretty good https://x.com/gdb/status/1959209931267297586

Codex https://developers.openai.com/codex

Codex is becoming much more integrated into the full stack of development, including code review and integrating between local and remote: https://x.com/gdb/status/1960900413785563593

Finally got around to trying Codex CLI with my OpenAI Plus subscription – and I was not prepared for how good it is!! 🔥🔥 codex -m gpt-5 -c model_reasoning_effort="high" Blew away Gemini CLI on same tasks 💥 Try it – feels way smarter and more capable. https://x.com/TendiesOfWisdom/status/1958938621311955249

I confirm same feeling on Claude Code, not cancelling yet, but surely downgrading and moving to: codex -m gpt-5 -c model_reasoning_effort="high" https://x.com/ivanfioravanti/status/1959277577920536740

I really like MLX; it truly enables quick experimentation. I just spent a few minutes getting Codex to work with a local GLM 4.5 Air via the mlx_lm server. It works beautifully. https://x.com/LiMzba/status/1960277996172149103

Image inputs landed in codex cli. Update to 0.24 to try it along with many other improvements. https://x.com/thsottiaux/status/1960579534257820024

Meanwhile, I’ve been having a blast pair-programming with gpt-5 (medium+high) in codex-cli. I can really bounce API-design ideas off it, ask for pros/cons, alternative ideas, and it’s been spot-on. It doesn’t mind pushing back on bad ideas, it makes me aware of pitfalls I’ve https://x.com/giffmana/status/1959362175648084124

new features in codex cli! try it out: npm install -g @openai/codex https://x.com/gdb/status/1960759142089658798

Seems like people are really coming around to codex cli w/ gpt5-high!! What is codex cli still missing? How can we make it even better?? https://x.com/ericmitchellai/status/1959236423124492769

Using Codex with your ChatGPT plan | OpenAI Help Center https://help.openai.com/en/articles/11369540-using-codex-with-your-chatgpt-plan

We’re releasing new Codex features to make it a more effective coding collaborator: – A new IDE extension – Easily move tasks between the cloud and your local environment – Code reviews in GitHub – Revamped Codex CLI Powered by GPT-5 and available through your ChatGPT plan. https://x.com/OpenAIDevs/status/1960809814596182163

With these updates, Codex works as one agent across your IDE, terminal, cloud, GitHub, and even on your phone — all connected by your ChatGPT account. It’s all included in Plus, Pro, Team, Edu, and Enterprise plans. Check out the new Codex developer hub to get started. https://x.com/OpenAIDevs/status/1960809823387443479

yeah so OpenAI’s Codex CLI slaps. crank that up to High reasoning on the $20/month plan and let it cook. I needed to mock up some complex interactions in a graph model for an engineer. I fed it a list of specs; 15m later, hit 90% coverage. Claude never got past 10% on Opus https://x.com/frantzfries/status/1959700004781847017

Continuing the journey of optimal LLM-assisted coding experience. In particular, I find that instead of narrowing in on a perfect one thing my usage is increasingly diversifying across a few workflows that I “stitch up” the pros/cons of: Personally the bread & butter (~75%?) of https://x.com/karpathy/status/1959703967694545296

Agentic Browser Security: Indirect Prompt Injection in Perplexity Comet | Brave https://brave.com/blog/comet-prompt-injection/

Our custom LLM, gpt-4b micro, has helped achieve an advance in biology. It designed novel variants of the Nobel-winning Yamanaka factors that achieve a 50x increase in reprogramming efficiency in vitro compared to standard OSKM proteins. https://x.com/gdb/status/1958928877415510134

“Hey AI, give me a clever, moving one paragraph story about a paradox, in any genre you desire. make it good” These are the first attempts. A bit of the obvious time travel tales from Gemini and Grok. Claude loves to pull on your emotions. GPT-5 Pro goes in a stranger direction. https://x.com/emollick/status/1959817825729781837

LangGraph Studio just got a face-lift! ✨ A revamped Interact mode now supports: • 📝 Markdown • 📌 Sticky headers • 📡 Smarter log tailing •🧹 A cleaner, more readable design Use it with the recently released Trace mode to seamlessly go from high-level to in-the-weeds. https://x.com/LangChainAI/status/1960442209918218491

read more (about the new agent protocol – worst tweet ever) https://x.com/imjaredz/status/1960742382720442368

SIP API: https://x.com/juberti/status/1961118371090501972

Super excited to announce the official integration between ART and LangGraph! You can now easily train your LangGraph agents with reinforcement learning — automatically improving reasoning, tool use, and adaptability. More info below: https://x.com/corbtt/status/1960102502764036270

The @zeddotdev team just dropped Agent Client Protocol .. ACP… another protocol… actually some pretty cool stuff here “Language Server Protocol for AI agents” Aiming to decouple AI coding assistants from specific editors, making your prompts and agent behaviors portable https://x.com/imjaredz/status/1960742370229805552

Building AI that answers with confidence requires two things: fresh context and transparent execution. Static agents miss both. We paired @tavilyai (real-time web) with W&B Weave (tracing, evals, ops) to ship research agents that are accurate, current and trustworthy. https://x.com/weave_wb/status/1960428416236445931

Introducing gpt-realtime and Realtime API updates for production voice agents | OpenAI https://openai.com/index/introducing-gpt-realtime/#image-input

One of the most pressing questions in our AI Evals course is: “Why can’t I just have an LLM write my LLM pipeline?” The nuanced answer is that you can use LLMs to assist, but not for the whole pipeline. Knowing where to put the LLM in the loop is the hard part. To unpack this, https://x.com/sh_reya/status/1961110090314125524

Built an AI agent called Ava that you can CC in any email to find time and book meetings Love my Sundays https://x.com/rowancheung/status/1959671075526041815

I built a small coding agent that lets you build high-quality agentic document workflows through natural language 🤖📑 Describe a reference document and the task you want to do over it (e.g. “split the document into each fund section, analyze the financials for each fund”) and https://x.com/jerryjliu0/status/1961123785597505603

We just raised a $14.5 million Series A led by @nexusvp to connect AI agents to the web Alongside it, we’re launching 2 big upgrades to @firecrawl_dev – our v2 API that supports faster scraping, news & image search and a brand new website The new chapter starts today 🔥 https://x.com/nickscamara_/status/1957824588970103193

Meet LFM2 MCP, an in-browser tool calling MCP allowing for fast and local agentic workflows. Built by @KarnikShreyas, extending @xenovacom’s WebGPU work. https://x.com/LiquidAI_/status/1960735546960986216

[AMA] The Future of AI Agents in Coding with Guy Gur-Ari & Igor Ostrovsky, co-founders of Augment Code. Aug 29, 10am PT / 1pm ET. We’ll answer questions on the future of AI agents and why context matters in AI coding on r/webdev. Ask us anything! : r/webdev https://www.reddit.com/r/webdev/comments/1mzxiyq/ama_the_future_of_ai_agents_in_coding_with_guy/

🫡 Assistants We’re winding down the Assistants API beta. It will sunset one year from now, August 26, 2026. We’ve put together a guide to help you migrate to the Responses API: https://x.com/OpenAIDevs/status/1960409187122602172

Reacher (@ReacherApp) is your marketing agent for creator collaborations. They automate creator discovery, outreach, campaign management, and content strategy so marketing teams can focus on what humans do best. https://x.com/ycombinator/status/1957502756752658921

There are 3 pervasive patterns in coding agent development that we deliberately avoid at Cline: 1. Multi-agent orchestration 2. RAG (via indexed codebases) 3. Instruction overloading Here’s why. <thread> https://x.com/cline/status/1960175630907306325

Congrats to Eric and the team at @genspark_ai for launching Genspark AI Developer! It’s a zero-setup, complete IDE that runs in your browser, like Replit. You describe what you want, you get visual feedback and you iterate on the output. You can pick your model (e.g. Claude https://x.com/fchollet/status/1959083315878928808

Most takes on RL environments are bad. 1. There are hardly any high-quality RL environments and evals available. Most agentic environments and evals are flawed when you look at the details. It’s a crisis: and no one is talking about it because they’re being hoodwinked by labs https://x.com/rosstaylor90/status/1959494279077728549

📈 Process reward strikes back 🚨 I think it is obvious that eventually we need to rely on stepwise judges instead of final outcome rewards. As tasks get longer (or even endless), it is unreasonable to push up/down all steps involved. Here we show you can obtain stepwise labels https://x.com/tesatory/status/1960533462672400724
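
To make the contrast between outcome rewards and stepwise judges concrete, here is a toy sketch of how credit lands on each step under the two schemes. The function names, the trajectory, and the judge scores are all made up for illustration; this is not the method from the linked thread, only a picture of why a single final reward pushes every step the same way.

```python
from typing import List


def outcome_credit(num_steps: int, final_reward: float) -> List[float]:
    """Outcome-only reward: every step receives the same final signal."""
    return [final_reward] * num_steps


def stepwise_credit(step_scores: List[float]) -> List[float]:
    """Stepwise reward: each step keeps the score its own judge assigned."""
    return list(step_scores)


# Hypothetical four-step trajectory in which only the third step is a mistake.
judge_scores = [1.0, 1.0, -1.0, 1.0]  # made-up per-step judge labels
final_reward = -1.0                    # the overall answer was wrong

print(outcome_credit(len(judge_scores), final_reward))  # [-1.0, -1.0, -1.0, -1.0]
print(stepwise_credit(judge_scores))                    # [1.0, 1.0, -1.0, 1.0]
```

With the outcome-only signal, the three sound steps are penalized along with the faulty one; the stepwise labels isolate the mistake, which matters more as trajectories get longer.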

🪜Introducing: StepWiser🦉 📝: https://x.com/jaseweston/status/1960529697055355037

@LangChainAI launched Deep Agents which are super powerful, but coding agents can’t build them because they’re stuck without the latest docs. So, we built a docs MCP server for LangChain and successfully vibe coded a Deep Agent using it. Here’s the side-by-side: coding with vs. https://x.com/ToriSeidenstein/status/1960744792813658391

I’ve been waiting for this. The Unified MCP is here! Rube is a universal MCP server to connect your AI agents to all *your* apps. Works with your favorite IDE, Claude Code, and other MCP clients. Watch it research YouTube vids → generate a full content strategy doc. Insane! https://x.com/omarsar0/status/1960084088133398718

LiveMCP-101 Stress Testing and Diagnosing MCP-enabled Agents on Challenging Queries https://x.com/_akhaliq/status/1959073276937801737

There’s a new way to build production-grade MCP servers. – It takes less than a minute. – You don’t have to write any code. – You can integrate from 100k+ tools. Here’s a step-by-step breakdown (100% local): https://x.com/_avichawla/status/1960590605480026244

Huge Realtime API release today! Details below, but TLDR: – GA (out of beta) – better instruction following, naturalness, audio – MCP support – new voices – SIP (telephony) support – new WebRTC APIs and video support Demos: https://x.com/juberti/status/1961116594211364942

Introducing gpt-realtime and Realtime API updates for production voice agents | OpenAI https://openai.com/index/introducing-gpt-realtime/#additional-capabilities

Introducing gpt-realtime and Realtime API updates for production voice agents | OpenAI https://openai.com/index/introducing-gpt-realtime/#remote-mcp-server-support

The Realtime API is officially out of beta and ready for your production voice agents! We’re also introducing gpt-realtime—our most advanced speech-to-speech model yet—plus new voices and API capabilities: 🔌 Remote MCPs 🖼️ Image input 📞 SIP phone calling ♻️ Reusable prompts https://x.com/OpenAIDevs/status/1961124915719053589

Voice is the OG modality. So excited for image inputs, function calling & MCP support in the Realtime API GA! `gpt-realtime` is a lot more natural and expressive, and every time a SOTA voice model is released, you know what I gotta do… Here is the new voice Marin, on https://x.com/swyx/status/1961124194789499233

Anthropic made Claude Code available to Team and Enterprise users It features a new admin control for managing spend, policy settings, and more, ensuring teams get the flexibility to scale with their usage https://x.com/TheRundownAI/status/1958432715046265016

Qwen-Code weekly release (v0.0.8) : ✨ Deep VS Code Integration: Get context-aware suggestions & inline diffs directly in your editor! Initialize with /ide and supercharge your workflow. 🔌 Enhanced MCP Support: Add, remove, list MCP servers via CLI (qwen mcp add|remove|list), https://x.com/Alibaba_Qwen/status/1959170659583476026

Apple in talks to use Google’s Gemini AI to power revamped Siri, Bloomberg News reports | Reuters https://www.reuters.com/business/apple-talks-use-googles-gemini-ai-power-revamped-siri-bloomberg-news-reports-2025-08-22/

Eleven v3 (alpha), now available in the API | ElevenLabs https://elevenlabs.io/blog/eleven-v3-alpha-now-available-in-the-api

Our early studies (and many others) found 20-30% productivity gains in controlled experiments in fields ranging from consulting to coding But translating gains to the organizational level takes time, and leadership. I wrote about many of the reasons here. https://x.com/emollick/status/1958350546831630810

Introducing DeepSeek-V3.1: our first step toward the agent era! 🚀 🧠 Hybrid inference: Think & Non-Think — one model, two modes ⚡️ Faster thinking: DeepSeek-V3.1-Think reaches answers in less time vs. DeepSeek-R1-0528 🛠️ Stronger agent skills: Post-training boosts tool use and https://x.com/deepseek_ai/status/1958417062008918312

Build a market research agent with Gemini and @aisdk by Vercel. Learn how to research trends with Google Search, extract structured data into charts, and generate a final PDF report. https://x.com/googleaidevs/status/1958197342483472612

CFOs, PE sponsors diverge on AI adoption approach: Accordion https://finance.yahoo.com/news/cfos-pe-sponsors-diverge-ai-151025380.html

Native world knowledge Gemini’s world knowledge is a huge unlock here. Watch this interactive education tutor built with Gemini 2.5 Flash Image. Source: https://x.com/omarsar0/status/1960347789637878171

Google’s NotebookLM updates Audio and Video Overviews https://blog.google/technology/google-labs/notebook-lm-audio-video-overviews-more-languages-longer-content/

AI is getting REALLY meta. Genie 3 builds a simulation of reality by digesting YouTube. SIMA agent learns inside of it. Repeat. Your robots will be dreaming at night, reliving their mistakes, and figuring out how to do a better job next time. This world model tech really https://x.com/bilawalsidhu/status/1959295692515541143

🔔 Two months ago, we released #IneqMath, which revealed the Soundness Gap: LLMs can guess answers to Olympiad-level inequalities problems, but still struggle to make rigorous proof steps. Since then, it’s been downloaded 4K+ times on HuggingFace! ➡️ https://x.com/lupantech/status/1960384184842879444

Good news @Copilot users! With Deep Research, you get 5 free research reports a month for complex, thorough analysis + deep dives. For the extra curious, you can get even more with Copilot Pro. Free access available in all Copilot countries + languages, on mobile, web + Edge. https://x.com/mustafasuleyman/status/1958967409001603300

Introducing gpt-realtime and Realtime API updates for production voice agents | OpenAI https://openai.com/index/introducing-gpt-realtime/

A useful thing that GPT-5 can do that wasn’t previously possible before powerful AI is to monitor complex topics by asking it to give you scheduled reports. Example: I have a weekly report on “reproducible, benchmarked evidence of autonomous or recursive self‑improvement in AI” https://x.com/emollick/status/1959424313502961824

people seem to really like the new codex features! https://x.com/sama/status/1961096744533647501

Max Assistant is live for all Perplexity Max subscribers. Max Assistant thinks longer and uses advanced reasoning models to handle complex workflows and questions. https://x.com/PerplexityComet/status/1958235518086566239

‘pip install elysia’ and ‘elysia start’ That’s literally all it takes to get the most advanced open source agentic RAG app running on your data. We just released 𝗘𝗹𝘆𝘀𝗶𝗮, our open source, agentic RAG framework and an app so cool needed a cool video to go with it. Watch https://x.com/victorialslocum/status/1961095661719359624

Today we’re launching “Agentic Knowledge Graph Construction,” a short course built in collaboration with @Neo4j and taught by Andreas Kollegger (@akollegger). RAG retrieves relevant text, and knowledge graphs complement it by modeling relationships and provenance so answers https://x.com/DeepLearningAI/status/1960726499419676861

Can AI solve open problems in math, physics, coding, medical sciences & beyond? We collected unsolved questions (UQ) & tested frontier LLMs. Some solutions passed expert validation… https://x.com/Muennighoff/status/1960391987917402509

Classic deep state Washington thinking around tech is focused purely on *control* and *risk* and has a lack of understanding of how technology/developer ecosystems work. As @DavidSacks says: for the American AI stack to win, we need to maximize marketshare. This means maximizing https://x.com/sriramk/status/1961072926561550366

Parallel agents are emerging as an important new direction for scaling up AI. AI capabilities have scaled with more training data, training-time compute, and test-time compute. Having multiple agents run in parallel is growing as a technique to further scale and improve https://x.com/AndrewYNg/status/1961118026398617648
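
A minimal sketch of the parallel-agent idea, assuming each agent can be modeled as an independent async task. The `run_agent` stub, the agent names, and the task list are hypothetical; a real agent would call a model or tools where the stub simply sleeps.

```python
import asyncio
from typing import List


async def run_agent(name: str, task: str) -> str:
    """Stand-in for one agent; a real agent would call a model here."""
    await asyncio.sleep(0.01)  # simulate waiting on model or tool latency
    return f"{name} finished: {task}"


async def run_in_parallel(tasks: List[str]) -> List[str]:
    """Start one stub agent per task and wait for all of them together."""
    jobs = [run_agent(f"agent-{i}", task) for i, task in enumerate(tasks)]
    # asyncio.gather runs the jobs concurrently and keeps results in input order.
    return await asyncio.gather(*jobs)


results = asyncio.run(run_in_parallel(["write tests", "update docs", "fix lint"]))
for line in results:
    print(line)
```

Because the stubs wait concurrently, total wall-clock time stays close to that of the slowest agent rather than the sum of all of them, which is the appeal of running agents side by side.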

Introducing Grok Code Fast 1, a speedy and economical reasoning model that excels at agentic coding. Now available for free on GitHub Copilot, Cursor, Cline, Kilo Code, Roo Code, opencode, and Windsurf. https://x.com/xai/status/1961129789944627207

So many multipliers! Great to see that Grok2 was trained using μP. https://x.com/QuanquanGu/status/1959358955643080770

Three ways to code for free just landed in Cline v3.26.6. Cloud speed with @grok Code Fast 1, local privacy via @LMStudio, or generous daily limits via the Qwen Code provider –> pick your path. 🧵 https://x.com/cline/status/1961201105729401060

Elon could start by making X the best matchmaker on the planet. Put Grok to work to build connections in the real world. Not pull us away from it into the arms of an AI companion. https://x.com/bilawalsidhu/status/1958570141119037766