Agents and Copilots: AI News Week Ending 07/18/2025

July 18, 2025

congrats to @FakePsyho for claiming the top spot on the @atcoder World Finals programming competition (followed by OpenAI at #2)! https://x.com/gdb/status/1945553676321657127

Congrats to @FakePsyho for winning AtCoder World Tour Finals 2025 Heuristic 🚀 Humanity has prevailed (for now!) Thanks OpenAI for sponsoring #AWTF2025, and getting #2 on this grand challenge. Proud of @SakanaAILabs & @AtCoder’s ALE-Agent for reaching #5, on a shoestring budget! https://x.com/hardmaru/status/1945850637528490134

good job psyho https://x.com/sama/status/1945540005805658440

official results from @atcoder World Tour Finals are in — great results for both humans (#1 and #3 onwards) and AI (#2 in the world!). a milestone for AI for solving hard problems. https://x.com/gdb/status/1945989983569129632

RT @FakePsyho: Humanity has prevailed (for now!) I’m completely exhausted. I figured, I had 10h of sleep in the last 3 days and I’m barely… https://x.com/itsclivetime/status/1945590725279977900

we’re competing in the @atcoder World Finals programming contest. real nailbiter — OpenAI has been #1 for most of the contest. looked like it might be over when @FakePsyho pulled ahead, but we’ve just retaken the lead. 1 hour and 20 minutes to go! https://x.com/gdb/status/1945404295794610513

OpenAI’s Agent mode can now work with Spreadsheets achieving 45% on SpreadsheetBench https://x.com/scaling01/status/1945896464632148366

OpenAI on X: “We’ve decided to treat this launch as High Capability in the Biological and Chemical domain under our Preparedness Framework, and activated the associated safeguards. This is a precautionary approach, and we detail our safeguards in the system card. We outlined our approach on”
https://x.com/OpenAI/status/1945904754443669659

Preparing for future AI capabilities in biology | OpenAI
https://openai.com/index/preparing-for-future-ai-capabilities-in-biology/

RT @boazbaraktcs: ChatGPT Agent is the first model we classified as “High” capability for biorisk. Some might think that biorisk is not r… https://x.com/jekbradbury/status/1945944398199677016

🚨 BREAKING: @Kimi_Moonshot’s Kimi-K2 is now the #1 open model in the Arena! With over 3K community votes, it ranks #5 overall, overtaking DeepSeek as the top open model. Huge congrats to the Moonshot team on this impressive milestone! The leaderboard now features 7 different https://x.com/lmarena_ai/status/1945866381880373490

5 Things You Need to Know About Moonshot AI and Kimi K2, the New #1 model on the Hub https://huggingface.co/blog/fdaudens/moonshot-ai-kimi-k2-explained

Every ML Engineer’s dream loss curve: “Kimi K2 was pre-trained on 15.5T tokens using MuonClip with zero training spike, demonstrating MuonClip as a robust solution for stable, large-scale LLM training.” https://x.com/hardmaru/status/1943976259236901315

For those unfamiliar with Kimi K2: – Surpasses models like GPT-4.1 and Claude 4 Opus on coding benchmarks – Scores new highs on math and STEM tests among non-reasoning systems – Doesn’t even have multimodal or reasoning capabilities yet kimi [dot] com https://x.com/rowancheung/status/1944647747027558636

I think I will spend the rest of the day letting Kimi generate these reports. They are so nice to look at compared to what OpenAI, Anthropic and others give you https://x.com/scaling01/status/1944850575470027243

It’s so beautiful to see the @Kimi_Moonshot team participating in every single community discussions or pull requests on @huggingface (the little blue bubbles on the right). In my opinion, every serious AI organization should dedicate meaningful time and ressources to this https://x.com/ClementDelangue/status/1946208120385999328

It’s undeniable with Kimi-K2 China has reached the frontier and will surpass the US next year https://x.com/scaling01/status/1944045857340359044

Kimi has a distinct writing style that is free of most of the patterns we now associate with AI generated text. Both Kimi and DeepSeek’s prose is apparently even more impressive in Chinese. Both of these models have a unique ‘voice’, quite different from Western AI. https://x.com/AndrewCurran_/status/1944434569899290839

Kimi is 200 people, very few of them with “frontier experience”, a platform (but you can buy such data) and a modest GPU budget. In theory there are many dozens of business entities that could make K2 in the West. It’s telling how none did. Not sure what it’s telling tho. https://x.com/teortaxesTex/status/1944856509734961596

Kimi is a really weird model, and it needs a lot more testing to figure out For example, I gave it an altered version of Great Gatsby and it found the two alterations (as does Claude) but then made up a ton of hallucinated nonsense that sounded plausible but was just plain wrong https://x.com/emollick/status/1944974487369158864

Kimi K2 is an incredible model. https://x.com/skirano/status/1944123290525831317

Kimi K2 is now available on https://x.com/togethercompute/status/1944952034840732138

Kimi K2 is number one trending on HF, congrats! https://x.com/huggingface/status/1944155602583691492

Kimi K2 is so good at tool calling and agentic loops, can call multiple tools in parallel and reliably, and knows “when to stop”, which is another important property. It’s the first model I feel comfortable using in production since Claude 3.5 Sonnet. https://x.com/skirano/status/1944475540951621890

Kimi K2 just hit #1 on @huggingface trending models in <24 hours! This MoE powerhouse packs 1T params with 32B active – crushing coding challenges and autonomous agent tasks. https://x.com/fdaudens/status/1943996876778614948

Kimi K2 now on https://x.com/togethercompute/status/1945143838911128019

Kimi K2, the latest from @Kimi_Moonshot is now live in the Arena! https://x.com/lmarena_ai/status/1944827675597791456

Kimi K2: Open Agentic Intelligence https://moonshotai.github.io/Kimi-K2/

Kimi team is more american than most American labs lol https://x.com/Teknium1/status/1944430651278537098

Kimi team just trained a state of the art open source model 32B active parameter/1T total with 0 training instabilities, thanks to MuonClip, this is amazing https://x.com/eliebakouch/status/1943687750563004801

Kimi-k2 seems to be a very good (and giant & odd) open weights model that may be the new leader in open LLMs. It is not beating the frontier closed models on my weird tests, but it doesn’t have a reasoner yet. More testing needed but Chinese open weights models are impressive. https://x.com/emollick/status/1943901440453259374

past week had huuuge releases, here’s our picks 🔥 > moonshot released Kimi K2, sota LLM with 1T total 32B active parameters 🤯 > @huggingface released SmolLM3-3B, best LM for it’s size, offers thinking mode 💭 as well as the dataset, smoltalk2 > Alibaba released WebSailor-3B, https://x.com/mervenoyann/status/1944757807191888080

Pretty wild that @Kimi_Moonshot dropped a 1T parameter (32B active) MoE trained on 15.5 Trillion tokens – MIT licensed 🔥 Beats all other open weights models across coding, agentic and reasoning benchmarks Ofcourse live on Hugging Face! 🤗 https://x.com/reach_vb/status/1943703030026641801

RT @ArtificialAnlys: While Moonshot AI’s Kimi k2 is the leading open weights non-reasoning model in the Artificial Analysis Intelligence In… https://x.com/zacharynado/status/1944945039647629548

RT @DeepInfra: Moonshot AI’s Kimi 2 is now live on DeepInfra, as always at the best price of $0.55/$2.20, full tool call and context suppor… https://x.com/jeremyphoward/status/1944939322735780260

RT @htihle: Results from kimi-k2 on WeirdML! It does very well for a non-reasoning model. Like a scaled up deepseek-v3, beating out gpt-4.1… https://x.com/bigeagle_xd/status/1944325829657554962

RT @huggingface: Kimi K2 is number one trending on HF, congrats! https://x.com/_akhaliq/status/1944159007456784512

RT @ivanfioravanti: Kimi-Dev-72B-4bit-DWQ is on mlx-community! It took 9 hours to create 😅 Quick performance test on M3 Ultra: Prompt: 56… https://x.com/awnihannun/status/1944108947411284374

RT @Kimi_Moonshot: 🚀 Hello, Kimi K2! Open-Source Agentic Model! 🔹 1T total / 32B active MoE model 🔹 SOTA on SWE Bench Verified, Tau2 & Ace… https://x.com/stanfordnlp/status/1944114320226263165

RT @koltregaskes: Kimi-K2 tops EQ-Bench, the benchmark that measures emotional intelligence. https://x.com/jeremyphoward/status/1944326479246147899

RT @lmarena_ai: 🚨 BREAKING: @Kimi_Moonshot’s Kimi-K2 is now the #1 open model in the Arena! With over 3K community votes, it ranks #5 over… https://x.com/Kimi_Moonshot/status/1945897926796185841

RT @lmarena_ai: Kimi K2, the latest from @Kimi_Moonshot is now live in the Arena! https://x.com/Kimi_Moonshot/status/1945462820147249523

RT @masondrxy: New K2 model from @Kimi_Moonshot is officially supported by @LangChainAI on @GroqInc! See 👇 https://x.com/Hacubu/status/1945144499228811676

RT @OpenRouterAI: Kimi K2 is now passing 200 tokens per second on OpenRouter Props to @GroqInc ! https://x.com/JonathanRoss321/status/1945779694256722025

RT @reach_vb: LOVE ITT! You can run Kimi K2 (1T token MoE) on a single M4 Max 128GB VRAM (w/ offloading) or a single M3 Ultra (512GB) 🔥 Th… https://x.com/reach_vb/status/1944997786329460978

RT @sam_paech: Kimi-K2 just took top spot on both EQ-Bench3 and Creative Writing! Another win for open models. Incredible job @Kimi_Moonsh… https://x.com/Teknium1/status/1944285648825069759

RT @sdrzn: Seriously blown away by Moonshot’s new Kimi K2 model in @cline. It beats Claude Opus 4 on coding benchmarks and is up to 90% che… https://x.com/ClementDelangue/status/1946316382313869778

RT @weights_biases: NEW: Kimi K2 is now live on W&B Inference by @CoreWeave! It’s the first truly open challenger, ready for production wi… https://x.com/l2k/status/1945225318928634149

Seen many people mention how kimi K2 for example has no CoT or thinking which isn’t true, more of an issue with terminology Main difference with reasoning models (in terms of actual functionality) is the thinking is hidden during general non-verifiable rl, so the model can https://x.com/Grad62304977/status/1944050338551484702

Some thoughts on the decisions behind Kimi K2’s architecture – from our infra staff https://x.com/Kimi_Moonshot/status/1944589115510734931

Thank you to @Kimi_Moonshot for quickly addressing my queries on the correct system prompt for Kimi K2! We’ll be re-uploading all BF16 + dynamic @unslothai GGUFs with fixed tool calling & the new sys prompt! Sys prompt = “You are Kimi, an AI assistant created by Moonshot AI.” https://x.com/danielhanchen/status/1946163064665260486

That’s from Kimi K2 blog post. In case someone says «wow and it’s not RL-trained». It very much is, don’t get misled by the absence of long CoT. Looks like DeepResearch but It’s probably similar to what’s been happening since Sonnet 3.5, giving it uncanny «pre-reasoner» powers. https://x.com/teortaxesTex/status/1944416704253018372

The success of Kimi K2 is no accident. The unfortunate reality in AI is that user experiences haven’t yet fully caught up to raw model capabilities. Experiences have plateaued. There are only so many coding assistants, research tools, or agents you can realistically offer, and https://x.com/skirano/status/1945505132323766430

TheZvi’s answer “why isn’t there American Kimi” basically: incentives. I *partially* buy it. But given the Concern about the dominance of Chinese open models, expressed by numerous patriotic think tanks, I think we could expect *someone* rising to the task. https://x.com/teortaxesTex/status/1945624983985639487

This is what 200 tokens/second looks like with Kimi K2 on @GroqInc For reference, Claude Sonnet-4 is usually delivered at ~60 TPS https://x.com/cline/status/1945354314844922172

True, the first ever application of Muon was to break the 3-second barrier in the CIFAR-10 speedrun. For perspective on scale that was a 3e14 flop training; @Kimi_Moonshot’s K2 is 3e24 flops, 10 orders of magnitude larger. https://x.com/kellerjordan0/status/1945701578645938194

We’ve just fixed 2 bugs in Kimi-K2-Instruct huggingface repo. Please update the following files to apply the fix: – tokenizer_config.json: update chat-template so that it works for multi-turn tool calls. – tokenization_kimi.py: update encode method to enable encoding special https://x.com/Kimi_Moonshot/status/1945050874067476962

We’ve submitted Kimi K2 to @lmarena_ai. Waiting to be added to the match pool: https://x.com/Kimi_Moonshot/status/1944754256059453823

You might not have heard of Moonshot AI, but within 24 hours, their Kimi K2 model shot to the top of the Hugging Face trending models. So… who are they, and why does this matter? 🧵Here are a few standout facts: https://x.com/fdaudens/status/1945128932040208867

Kimi K2 at 185 t/s (or even higher, nearly 220 in my short tests) is probably the best use of Groq to date, and can make K2 immediately more compelling than Sonnet 4. Impressive that they’ve managed to fit this 1T monster on their chips. https://x.com/teortaxesTex/status/1944950183051321542

Quick start project for Claude Code on Kimi: https://x.com/jeremyphoward/status/1944326308210921652

Very interesting – you can use Kimi with the Anthropic API. This means, perhaps most importantly, that you can now use Kimi with Claude Code! 🤯 https://x.com/jeremyphoward/status/1944322841866125597

RT @allhands_ai: Kimi-K2 is definitely the first strong open-weight competitor to Claude Sonnet. 65.4% on SWE-Bench Verified in OpenHands,… https://x.com/TheZachMueller/status/1945545349352829439

The DeepSeek moment was supercharged by pent-up consumer demand for a good free AI for those who wouldn’t pay (especially for students for homework) A reason Kimi K2 has not had the immediate public impact of DeepSeek may be, for most consumers/students, DeepSeek is good enough https://x.com/emollick/status/1944764085741957153

RT @yawnxyz: Kimi K2 is **INCREDIBLE** at using tools. I built a chrome extension to chat with Google Maps, but I never posted it. All th… https://x.com/bigeagle_xd/status/1945087963408351728

I’ve been a bit quiet on X recently. The past year has been a transformational experience. Grok-4 and Kimi K2 are awesome, but the world of robotics is a wondrous wild west. It feels like NLP in 2018 when GPT-1 was published, along with BERT and a thousand other flowers that https://x.com/DrJimFan/status/1944443447953498285

I doubt that Sama’s delay of open model is about Kimi. But I don’t find the logic here compelling either. «Only nerds noticed Kimi». Well, Sama is loathed. The point of his model is, above all things, PR. If it’s not open SOTA, reports will notice *that*. I think he wants SOTA. https://x.com/teortaxesTex/status/1944263611398180954

Rumors that OpenAI delayed their open-source model because of Kimi are fun, but from what I hear: – the model is much smaller than Kimi K2 (<< 1T parameters) – super powerful – but due to some (frankly absurd) reason I can’t say, they realized a big issue just before release, so https://x.com/Yuchenj_UW/status/1944235634811379844

Super excited to see Kimi K2 land on Perplexity. If you’re fine-tuning, quick reminder: using the Muon optimizer during both fine-tuning and RL phases gives the best results (details are in our Moonlight paper). https://x.com/Kimi_Moonshot/status/1944224975428497549

Grok 4 suggests that scaling still works (with the diminishing returns predicted by the scaling law), and that tool use can unlock performance gains. Kimi suggests there continues to be big opportunities from improvements in methods (Muon, etc.). Lots of paths for AI right now. https://x.com/emollick/status/1944306918631018856

Cognition (the Devin AI agent crew) snapped up Windsurf, a fast-growing AI coding startup. The battle for AI developer tools is getting fierce. 💻 https://x.com/fdaudens/status/1945121371094155531

Cognition | Cognition’s acquisition of Windsurf https://cognition.ai/blog/windsurf

Cognition has signed a definitive agreement to acquire Windsurf. The acquisition includes Windsurf’s IP, product, trademark and brand, and strong business. Above all, it includes Windsurf’s world-class people, whom we’re privileged to welcome to our team. We are also honoring https://x.com/cognition_labs/status/1944819486538023138

Cognition, maker of the AI coding agent Devin, acquires Windsurf | TechCrunch https://techcrunch.com/2025/07/14/cognition-maker-of-the-ai-coding-agent-devin-acquires-windsurf/

RT @nmasc_: NEW: Inside 96 hours of Windsurf whiplash: OpenAI talks broke down, Google brought out the velvet rope & Cognition sealed a dea… https://x.com/steph_palazzolo/status/1945226161140728021

“these results were eye-opening for me… chatgpt agent performed better than i expected on some pretty realistic investment banking tasks”
https://x.com/tejalpatwardhan/status/1945894313977860203

ChatGPT agent for investment banking: https://x.com/gdb/status/1946074958238765503

Citi and Ant International Pilot AI-Enabled Forecasting Solution to Enhance FX Risk Management for Airline Customers
https://www.citigroup.com/global/news/press-release/2025/citi-ant-international-ai-solution-enhance-fx-risk-management-airline-customers

Citi, Ant International pilot AI-powered FX tool for clients to help cut hedging costs | Reuters https://www.reuters.com/business/finance/citi-ant-international-pilot-ai-powered-fx-tool-clients-help-cut-hedging-costs-2025-07-18/

Goldman Sachs is testing viral AI agent Devin as a ‘new employee’ | TechCrunch https://techcrunch.com/2025/07/11/goldman-sachs-is-testing-viral-ai-agent-devin-as-a-new-employee/

OpenAI working on payment checkout system within ChatGPT, FT reports | Reuters https://www.reuters.com/business/openai-working-payment-checkout-system-within-chatgpt-ft-reports-2025-07-16/

Been using the Dia browser for a couple of days now and realizing it’s become more of a hassle to navigate to ChatGPT or Perplexity. The deep integration with an LLM changes the experience of using a browser and navigating the internet. The browser wars are about to begin. https://x.com/alecdewitz/status/1935420754226790842

In the works already. Team moving at a pace that’s fast even for Perplexity standards. https://x.com/AravSrinivas/status/1945537471540072888

Perplexity is now the #1 overall app on App Store in India, ahead of ChatGPT. https://x.com/AravSrinivas/status/1945960772091433081

ChatGPT agent for working with Excel, Powerpoint, etc.: https://x.com/gdb/status/1946007318824673534

New from our security teams: Our AI agent Big Sleep helped us detect and foil an imminent exploit. We believe this is a first for an AI agent – definitely not the last – giving cybersecurity defenders new tools to stop threats before they’re widespread. https://x.com/tulseedoshi/status/1945113799297536313

AI firms like OpenAI are poaching Wall Street quants with massive paydays, shifting the talent landscape for building artificial general intelligence. 💰 https://x.com/fdaudens/status/1944759768528060558

The AI Labs Are Coming for Wall Street’s Quants – Business Insider https://www.businessinsider.com/ai-talent-openai-wall-street-quant-trading-firms-2025-7

OpenAI’s Windsurf deal is off — and Windsurf’s CEO is going to Google | The Verge https://www.theverge.com/openai/705999/google-windsurf-ceo-openai

RT @jordihays: Here is most of what I’ve gathered on the Windsurf / Google Deal The founders and dozens of engineers are going to Google.… https://x.com/_arohan_/status/1944203727059226784

The Next Stage of Windsurf https://windsurf.com/blog/windsurfs-next-stage

The Windsurf Dynamics: On the need for a social contract, an analysis of the potential payouts / cap table math, what a better outcome might have looked like instead, and why –– maybe? –– the Windsurf founders and board might have actually done the right thing, leaving a graceful https://x.com/haridigresses/status/1944406541064433848

It’s the year of the social sciences hacker. We’re about to see leaps in innovation that don’t come from engineers. Instead, they’ll come from people who’ve never gotten to build before. I couldn’t be more excited about it. https://x.com/mustafasuleyman/status/1945164452761899025

I often rant about how 99% of attention is about to be LLM attention instead of human attention. What does a research paper look like for an LLM instead of a human? It’s definitely not a pdf. There is huge space for an extremely valuable “research app” that figures this out. https://x.com/karpathy/status/1943411187296686448

Dia Browser has a built in AI Chat tab. You can reference any tab you have open and even make comparisons between them. Dia is able to understand the page you’re on and give answers. It’s pretty cool! https://x.com/jerrod_lew/status/1933132174921961807

Dia Skills are one of the things that makes @diabrowser so powerful. Brave doesn’t have this in Leo, and no, you can’t just “get a Chrome extension to do this for you” 🫠 Here’s how @joshm and team started with Skills and some of the rad things you can do with them today, a Dia https://x.com/morganlinton/status/1942589297200390165

My top 4 features from @browsercompany’s new Dia Browser so far: ⚡ CMD+T → chat with AI instantly 🧠 CMD+E → ask Dia about the current page (no more copy-paste into GPT) ✍️ Select text → CMD+E → ask to revise my writing → replace 🔗 Type @ → pull context from other tabs https://x.com/zineanteoh/status/1909618736199598276

New Dia browser came out to be a great tool to keep stay updated with the latest dev drama without watching the whole 40 minute video 😅 Simple prompt for video summary and boom, you saved yourself 40 minutes. https://x.com/vasilije_luka/status/1942900540397998574

Quickly chat with any pdf with Dia browser open any pdf in with Dia ask anything like you’d do it with any llm + you can use any custom skill you have from Dia to speed up more https://x.com/pugni_vito/status/1942964581825200293

Talk to your youtube videos with AI, straight from your browser! Love this new AI Dia Browser @diabrowser https://x.com/diegocabezas01/status/1934066414257860610

This AI browser just made watching YouTube videos obsolete. It literally reads your screen and does everything for you ✨ How Dia AI Browser is changing everything: ✅ Summarizes entire YouTube videos in seconds ✅ Creates custom automation skills with one command ✅ Manages https://x.com/JulianGoldieSEO/status/1942795852474360068

Trying out pair browsing with the DIA browser for the first time—writing this post as part of the experiment! https://x.com/cleeeeeeeeement/status/1932861729664377103

Using Dia Browser is a super power. Just used it to quickly summarize spending just by having account info open in a tab. Mind blown. 🤯 @browsercompany https://x.com/talkaboutdesign/status/1933120237282472337

The @browsercompany team had it all: millions of users, Chrome’s former lead, Silicon Valley darling status. They threw it away to build Dia—a browser that learns from every tab you open. They shared the story with @danshipper on AI &I. https://x.com/every/status/1940427109467570430

Copilot on Windows: Vision Desktop Share begins rolling out to Windows Insiders | Windows Insider Blog https://blogs.windows.com/windows-insider/2025/07/15/copilot-on-windows-vision-desktop-share-begins-rolling-out-to-windows-insiders/

Three things: a deep research model with enhanced search browser; a revolutionary computer-use operator; and a sandboxed terminal to execute math and code. A browser, a computer, a terminal… are you getting it? These are not three separate agents. This is one agent, and we https://x.com/swyx/status/1945904109766459522

Great to see all the work we put it into the search layer paying off when it comes integrated natively in an agentic browser. There shouldn’t be a need for the user to figure out when to use what tool or modes. Everything should blend together like a perfectly played orchestra. https://x.com/AravSrinivas/status/1945136929218953577

💥 Announcing ChatGPT agent: a powerful new agent that can use a computer, browse the web, write code, use a terminal, write reports, create images, edit spreadsheets, and even create slides for you. The slides often… need some work. But you know how this goes: first it’s https://x.com/kevinweil/status/1945896640780390631

ChatGPT agent for finding a great Airbnb: https://x.com/gdb/status/1946075573476069580

ChatGPT agent is ready to introduce itself. https://x.com/OpenAI/status/1945890050077782149

ChatGPT can now do work for you using its own computer. Introducing ChatGPT agent—a unified agentic system combining Operator’s action-taking remote browser, deep research’s web synthesis, and ChatGPT’s conversational strengths. https://x.com/OpenAI/status/1945904743148323285

Introducing ChatGPT agent: bridging research and action | OpenAI https://openai.com/index/introducing-chatgpt-agent/

Just launched ChatGPT Agent (sorry GPT-5 waiters, it is coming!), the most capable AI agent model to date! It has been such an honor to be part of a crazy sprint to get this amazing model trained and shipped together with an absolutely gem team (@isafulf , @caseychu9 , https://x.com/xikun_zhang_/status/1945895070269583554

OpenAI’s New ChatGPT Agent Tries to Do It All | WIRED https://www.wired.com/story/openai-chatgpt-agent-launch/

RT @emollick: I had early access & ChatGPT agent is, I think, a big step forward for getting AIs to do real work Even at this stage, it do… https://x.com/nickaturley/status/1945975092342841487

tip for chatgpt agent slides: first ask it to do the research only, then ask it to make the slides! https://x.com/isafulf/status/1946231119751545014

Vibe Check: OpenAI Enters the Browser Wars With ChatGPT Agent https://every.to/vibe-check/vibe-check-openai-enters-the-browser-wars-with-chatgpt-agent

When we founded OpenAI (10 years ago!!), one of our goals was to create an agent that could use a computer the same way as a human — with keyboard, mouse, and screen pixels. ChatGPT Agent is a big step towards that vision, and bringing its benefits to the world thoughtfully. https://x.com/gdb/status/1945923067403984979

You can ask ChatGPT Agent to train an AI on datasets you are interested in, and do analyses for you. Building AI and doing data analysis will be automated end-to-end in the future. You are hearing it right. We are working hard to automating our own job :) https://x.com/xikun_zhang_/status/1946278266786189744

ChatGPT Agent has lower performance than o3 on PaperBench, SWE-Bench verified, OpenAI PRs and OpenAI Research Engineer Interview questions https://x.com/scaling01/status/1945932154455695752

A new agentic browser just shipped from Perplexity and it’s pretty wild. Watch this video of @PerplexityComet taking over my LinkedIn tab and taking actions on my part. Interesting UX where the tab glows blue as it’s taking actions. I like the integration of agentic actions https://x.com/ryancarson/status/1942962447369036201

AI-powered browsers like Perplexity’s Comet promise to do your web surfing for you. But do they really save time, or just add more noise? 🌐 https://x.com/fdaudens/status/1945121374063698080

Ask Comet to book a meeting or send an email. Comet transforms entire sessions into single, seamless interactions. https://x.com/PerplexityComet/status/1943026179960873207

asked @PerplexityComet to load up our brand colors in @MeetGamma then shifted my focus to building the actual content of the deck https://x.com/jennysvng/status/1943074383091671529

Been using @PerplexityComet, and there are soo many new use cases for it, but this has got to be one of my favs: I received a verification link sent to my Gmail, and I asked Comet Assistant to click it and verify me on my behalf. And it did it! Simple yet useful ^_^ https://x.com/_Matskuu/status/1942977239974400170

BREAKING 🚨: Comet Browser can now control an open web page from a sidecar! Now it can simply take it over and click around. Making Comet to publish a blog post for me 👀 https://x.com/testingcatalog/status/1928546603448562087

Browse at the speed of thought. https://x.com/PerplexityComet/status/1942968195419361290

Comet browser applying for a job for me 👀 Soon, you will be able to execute such things on a schedule. https://x.com/testingcatalog/status/1926043202684854674

Comet has become a natural extension of all my workflows, ideas, and content since I started using it. I can easily recall any saved information and connect to all of my personal knowledge management tools. Effortless networked intelligence. Proud of this team! https://x.com/camerontstow/status/1943047355944833153

Comet… is nuts. I asked it to go find the subreddits that people would ask cooking questions on. Then, find common questions and come up with ad angles for those questions for Hexclad. For kicks, I asked it to make a static ad for me with my fav angle Results. Are. Insane. https://x.com/NathanSnell/status/1943095214932943291

cool query on my comet browser for handling my X addiction. https://x.com/AravSrinivas/status/1912592179291385896

First test of Perplexity’s new agentic browser, Comet 👇 Comet authenticates into your accounts (e.g. email, calendar) to take actions on your behalf. It pulled a list of all my email newsletters, and unsubscribed from the specific ones I asked it to 🤯 https://x.com/omooretweets/status/1943078090718220653

Hooolllyyy crap. Perplexity’s comet browser is insane. Operator was a total dud. Manus is better but meh. Videos coming. I asked it to duplicate a meta campaign for me. No problem. All automated. Anyone want me to try anything specific? https://x.com/NathanSnell/status/1943062637656338805

How to watch YouTube on Comet https://x.com/AravSrinivas/status/1946240617031606672

I feel like I’m living in the future right now. Been using the new browser called Comet from @perplexity_ai (thanks @AravSrinivas for getting me access!) Like millions of others, I spend hours and hours a day in a browser. Specifically, Chrome. And, Chrome hasn’t https://x.com/dharmesh/status/1943084541733933189

Let Comet handle the customer support reps for you. Customer support is already a lot of AI anyway. So let your AI talk to the other AIs while you watch YouTube or do some work :-) https://x.com/AravSrinivas/status/1944778316323717437

Memory is magic when it works. Comet is “memory-native” – the closest approximation of truly understanding the user there is. https://x.com/AravSrinivas/status/1944078543324844077

Perplexity Comet https://comet.perplexity.ai/

Perplexity Comet vs ChatGPT Agent https://x.com/AravSrinivas/status/1946076236683624616

PERPLEXITY COMET WORKS ON DUNE FOR CONTENT IDEATION!!!! SO COOL! https://x.com/0xDataWolf/status/1943265415322595630

Perplexity is testing new feature with Comet browser which will be able to just go out there and do things for you via prompts. Exciting times ahead https://x.com/AIProductPM/status/1940108252559081764

Prime Day Shopping with Comet. User saves $280 in less than 5 minutes by asking Comet to compare prices. https://x.com/AravSrinivas/status/1944183680915714548

RT @itsPaulAi: Perplexity Comet can automate any task in your browser This is the first time you REALLY have an AI agent working autonomou… https://x.com/denisyarats/status/1945321982725382170

RT @PerplexityComet: Clean up your inbox. Ask Comet to unsubscribe you from spam and unwanted emails. https://x.com/AravSrinivas/status/1945232153609978273

RT @rowancheung: Perplexity Comet is not like other agents I’ve been testing it all week, and it’s starting to actually *stick* Having in… https://x.com/AravSrinivas/status/1945620938068037633

The Cursor for Web Browsing, is here. And it’s better than Comet at turning your open tabs and bookmarks into a codebase. Here is a full breakdown of how i’m using @diabrowser Exploring the Future of Browsing with DIA Browser: Essential Features for Content Creators & https://x.com/rileybrown_ai/status/1943041778304847889

The most interesting thing about Perplexity Comet is that it can actually do things in Cal / Gmail Ex. I asked it to reschedule a 1:1 – it moved the invite and sent an email Neither Google nor OpenAI have done this in their agents…maybe for safety reasons, but it’s limiting 🤔 https://x.com/omooretweets/status/1943116119243416009

The TAM for Comet is bigger than Perplexity because it appeals to people who don’t even want AI. Just the best core browser in the market at the end of the day. https://x.com/AravSrinivas/status/1946035102150238475

USE CASE 2: Cross-tab product comparison If you’re looking for a new product or looking for flights, Comet can compare tabs in real time It’s surprisingly fast and analyzes the reviews of the tabs too https://x.com/rowancheung/status/1945524017915674879

USE CASE 3: Summarize any YT video with a click You can summarize + chat with any long YT video and get key moments This is also possible in Gemini, but having it in the browser means you can watch the video AND chat/learn with Comet in the side tab at the same time https://x.com/rowancheung/status/1945524019681480992

Vibe coding with @PerplexityComet – asked the browser agent to build me a simple (locally run) yt-dlp wrapper. It navigated to github,created the repo, wrote/committed/pushed the code. You can even make changes to your code from the sidecar, feels like an AI IDE lmao 😂 https://x.com/killuaz0ldyck07/status/1942976067075281248

When you’re on Comet, you’re operating at an abstraction above which AI to use and how to pull in relevant context. Agents are powerful and operate like a human would to complete the task. You go from chat turns to end-to-end workflows. https://x.com/AravSrinivas/status/1944024356138758367

New AI features in Google Search: Call a business or do research
https://blog.google/products/search/deep-search-business-calling-google-search/

We’re bringing Gemini 2.5 Pro to AI Mode: giving you access to our most intelligent AI model, right in @Google Search. With its advanced reasoning capabilities, watch how it can tackle incredibly difficult math problems, with links to learn more ↓ https://x.com/GoogleDeepMind/status/1945515683451736246

In case you missed it!💫 The MCP Illustrated guidebook is here and it covers: – The fundamentals of MCP – Explanations with visuals and code – 11 hands-on projects for AI engineers Projects covered: 1. Build a 100% local MCP Client 2. MCP-powered Agentic RAG 3. MCP-powered https://x.com/akshay_pachaar/status/1938883016450785625

Introducing Asimov: The Code Research Agent for Engineering Teams | Reflection AI https://reflection.ai/blog/introducing-asimov/

Introducing Kiro – “Kiro’s strength is getting those prototypes into production systems with features such as specs and hooks.” Kiro https://kiro.dev/blog/introducing-kiro/

Introducing Amazon Bedrock AgentCore: Securely deploy and operate AI agents at any scale (preview) | AWS News Blog https://aws.amazon.com/blogs/aws/introducing-amazon-bedrock-agentcore-securely-deploy-and-operate-ai-agents-at-any-scale/

🎥 Want the text from any YouTube video? Now you can — no plugins, no installs. Just drop the link, and our YouTube MCP turns it into text instantly. Try it now with this Agent: https://x.com/OmniMCP/status/1942855673324397021

Whenever I looked into having a personal assistant, it struck me how few of our existing structures support intermediate permissions. Either a person acts fully on your behalf and can basically defraud you, or they can’t do anything useful. I wonder if AI agents will change that. https://x.com/AmandaAskell/status/1946253987923304699

An MCP Server for Legal Research (SCOTUS Opinions) 🧑‍⚖️ In less than 10 minutes I indexed 100+ Supreme Court opinions from 2022-2024, using LlamaCloud to parse/index the data with really high accuracy, and then made it available as an MCP server to any AI client. You can then use https://x.com/jerryjliu0/status/1941181730536444134

I built an MCP server for editing videos It takes Google Drive video links and editing instructions You can export the output to any video editor https://x.com/itstundealao/status/1939675731077796099

I just built an AI Voice Agent using n8n + Retell AI that: ✅Calls past customers automatically ✅Books appointments in real-time ✅Summarizes calls & updates your CRM ✅No human needed, runs 24/7 @omoalhajaabiola is a blessing to this generation #Automotive #RealEstate #Garage https://x.com/Asoft001/status/1940682520364044737

NotebookLM introduces curated featured notebooks with partners https://blog.google/technology/google-labs/notebooklm-featured-notebooks/

OpenAI Agent mode benchmarks! ~42% on HLE ~27% on FrontierMath https://x.com/scaling01/status/1945895473430089947

Lovable just raised $200M at a $1.8B valuation led by Accel. This all started unexpectedly with me calling my friend at 6AM to go for a walk. I’ve never shared this story before: (thread) https://x.com/antonosika/status/1945899512503112035

I built a voice assistant that analyzes the entire stock market. Built my backend and MCP endpoint using FastAPI on Python and it works. This was exciting to build ngl ❤️. https://x.com/dnaijatechguy/status/1940375435017384271

AI Is Already Showing Signs of Slashing Job Openings in the UK – Bloomberg https://www.bloomberg.com/news/articles/2025-07-13/ai-is-already-showing-signs-of-slashing-job-openings-in-the-uk

Introducing Orchids – the world’s first AI tool that lets you chat with AI to build apps and websites that don’t look and feel “AI generated”. On internal benchmarks, Orchids performs close to 3x better on general app and website creation tasks than any other tool on the market. https://x.com/kevinlu625/status/1942252767999111250

New study warns of risks in AI mental health tools | Stanford Report https://news.stanford.edu/stories/2025/06/ai-mental-health-care-tools-dangers-risks

Coming off @Google IO, we’ve made it possible to build AI Agents with real-time data from verified sources via Google ADK + Dappier 🧠⚡ – Define agents and tools using Google ADK – Plug into Dappier for web search + latest data for stocks, sports, news, and more https://x.com/DappierAI/status/1928430036257759269

It looks like scale + tool use + multimodal remains the chosen path forward. https://x.com/emollick/status/1943169759312322604

A hidden, but powerful feature — you can create scheduled tasks with ChatGPT Agent! Agent can, in the background, regularly search the web or your connectors and take action on the web, including on authenticated sites https://x.com/neelajj/status/1945945913014546805

GPT Agent Now rolled out to 100% of Pro users. Due to higher than expected demand, Plus and Team users will begin getting access Monday. https://x.com/OpenAI/status/1946024465214935279

Reasoning reimagined: Introducing Phi-4-mini-flash-reasoning | Microsoft Azure Blog https://azure.microsoft.com/en-us/blog/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning/

AI-powered web browsers want to help people save time. But are they effective? https://www.nbcnews.com/tech/tech-news/ai-powered-web-browsers-want-help-people-time-are-effective-rcna218039

RT @DSPyOSS: 🤯 New research deploys DSPy-optimized system in real-world medical settings. Finds 70% increase in positive patient feedback.… https://x.com/lateinteraction/status/1946328354740691103

Recent WebSailor paper by Alibaba-NLP, shows how to post-train models for Deep Research – good insights in there, about creating a dataset then training recipe. I particularly like how the agentic RL at the end of post-training improves scores by ~4 p.p. across the board: RL + https://x.com/AymericRoucher/status/1945870603275403693

“a guy created a dataset of 50 books from London 1800-1850 for LLM training. no modern bias. it’s actually super cool to see what can be trained on it! ”
https://x.com/Hesamation/status/1944839882968588446

Introducing Devstral Small and Medium 2507! This latest update offers improved performance and cost efficiency, perfectly suited for coding agents and software engineering tasks. https://x.com/MistralAI/status/1943316390863118716

We’ve published a position paper, with many across the industry, calling for work on chain-of-thought faithfulness. This is an opportunity to train models to be interpretable. We’re investing in this area at OpenAI, and this perspective is reflected in our products: https://x.com/gdb/status/1945350912668737701

1/N Yesterday in Tokyo we @OpenAI ran a 10‑hour live Humans vs AI exhibition at the AtCoder World Tour Finals Heuristic. We pointed an OpenAI reasoning model at the same brutal problem the finalists tackled—no human help, same rules, same clock. Buckle up. 👇 https://x.com/andresnds/status/1945655797314154762

🔄 🤖 Pipeline of Agents Pattern A modular pattern for AI agent workflows, showcasing sequential agent chaining and state isolation through a cybersecurity implementation. Built for production with error handling and integrated tooling. Learn more here 📚 https://x.com/LangChainAI/status/1944426659639431477

🤖 New on the Nebius blog: Agent 101 – Launching production‑grade AI agents at scale From proof‑of‑concept to production-level systems, we break down the full stack: ✅ LLMs ✅Frameworks: @crewaiinc, @LangChainAI, @Google ADK, @AgnoAgi ✅Observability: @helicone_ai, https://x.com/nebiusaistudio/status/1943625245572600177

🤖💼 AI Headhunter Agent Build a powerful LinkedIn recruitment agent that automates candidate sourcing using LangGraph. Features state management, parallel processing, and real-time updates through Davia integration. Watch the full tutorial here 📺 https://x.com/LangChainAI/status/1944441795234136265

🚀 Just built my first AI agent using Google’s ADK (AI Dev Kit)! 🌦️ It fetches real-time weather updates from natural language input using tool-calling. Excited to expand its capabilities next! #AI #GoogleADK #AIagents #GenAI #WeatherApp #BuildInPublic #DeveloperJourney https://x.com/ChinmayBoradee/status/1933919696119619624

Absolutely delux Github repository. lots of code-first tutorials covering every layer of production-grade GenAI agents by @NirDiamantAI https://x.com/rohanpaul_ai/status/1945750323404185655

AI agents are leveling up—reasoning, planning, acting, and working together. 🤯 Philippe Charrière shows us how to use Google’s Agent Development Kit with Docker Model Runner + LiteLLM. 💻 Read: https://x.com/Docker/status/1928825074812903499

AI Personal Assistant I built a simple one today that reads Gmail, understands prompts, and sends replies. On to the next. #n8n https://x.com/AFB13_/status/1940530925160616239

Announcing Proactor v1.0, world’s first ‘self-active’ AI teammate. Proactor v1.0 is super-smart. It can perceive, think, and act by itself. No prompt, no hot key, pure initiative. Someone trying to cheat on you? Proactor can auto fact-check and call them out. Say you’re going https://x.com/Proactor_ai/status/1942584369623036005

Announcing replit.md 📝 A collaborative document between you and Agent about project architecture and your preferences for it. https://x.com/amasad/status/1942234183746908376

Anthropic just dropped a free course on building AI Apps with MCP. Learn to connect AI Agents to external data sources like GitHub, Google Docs, local files using MCP. 100% free. https://x.com/Saboo_Shubham_/status/1929916710682783915

ARC-AGI-3 scores 0% for AI, 100% for humans now live with API where you can test your agent: https://x.com/scaling01/status/1946261191782797717

Been Working On A Very Cool AI Agents project for Real Estate Agents that Handles Questions, Schedules Viewings, and answer questions about property and the company and Fires Email and Slack Notification And Sell the Buyers on the Houses. Using Langchain and ADK. https://x.com/o7_sidharth/status/1934598487649501692

Blog v2.0 @AgentOpsAI now live! bringing you one step closer to the world of agent observability, infra, and ops https://x.com/n_sri_laasya/status/1943000976522383387

BrowserOS – Open-Source Agentic Browser https://www.browseros.com/

Build a Customer Support Ticket Agent with Structured output using Google Agent Development Kit. 100% Opensource code with step-by-step tutorial: https://x.com/Saboo_Shubham_/status/1941685282375696888

Coding Triangle makes models face every part of real programming work. Most checks only ask whether a program passes a small script. Coding Triangle asks if a model also explains its logic and hunts tricky inputs. Researchers test 4 large models on 200 AtCoder tasks. Each https://x.com/rohanpaul_ai/status/1944326434245189812

For those unaware of just how important caching is for coding agents, 88% of my tokens in Cursor are cache reads. This means that 88% of my usage is enjoying 10x savings compared to no caching. https://x.com/nrehiew_/status/1945638580673552408
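
As a rough worked example of the arithmetic in that post: if 88% of tokens are cache reads billed at one tenth of the normal price (the 10x figure quoted; real provider pricing is an assumption here), the blended cost is 0.12 + 0.88 × 0.1 = 0.208 of the uncached cost, or roughly a 4.8x overall reduction. The function name and the normalized price of 1.0 are illustrative only, and the sketch ignores output tokens and cache-write charges.

```python
def blended_cost_ratio(cache_read_fraction: float, cache_discount: float) -> float:
    """Return total cost relative to a run with no caching (1.0 means no savings).

    cache_read_fraction: share of tokens served from cache (0.88 in the post).
    cache_discount: how many times cheaper a cached token is (10 in the post).
    """
    uncached_part = 1.0 - cache_read_fraction           # billed at full price
    cached_part = cache_read_fraction / cache_discount  # billed at the discounted price
    return uncached_part + cached_part


ratio = blended_cost_ratio(cache_read_fraction=0.88, cache_discount=10.0)
print(f"cost vs. no caching: {ratio:.3f}")
print(f"overall savings factor: {1 / ratio:.2f}x")
```

Under these assumptions the overall saving is smaller than 10x because the remaining 12% of tokens are still billed at full price.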

Here are a few other things you can do: – research bios of everyone you’re meeting – summarize the most important emails you missed – edit generated emails / calendar events All without ever leaving the tab We literally sent thousands of invites while developing this and I’m https://x.com/Ali_Shobeiri/status/1943151185822912645

I built a full-blown research machine using n8n. Here are the workflows powering it (steal them)👇 https://x.com/aiedge_/status/1940236498877784469

Join @Redisinc and @LangChainAI on July 23 for a webinar to learn how LangGraph + Redis make it easy to build AI agents with real memory. In the live webinar, you’ll get demos, performance tips, and hear from the engineers behind the integration. 👉 RSVP here: https://x.com/LangChainAI/status/1946317782741832020

Jules now supports Bun as a JavaScript runtime. If your project uses Bun instead of Node, it’ll just work, no extra setup needed. Read more about the Jules environment! https://x.com/julesagent/status/1946341638764388708

Just added Moonshot AI as a provider in Cline 🫡 https://x.com/cline/status/1945164549134672373

MiniMax is the James Bond of AI agents. I tested every AI website builder. Only one created something I’d actually use. Minimax M1 Quality Indicators: → Pixel-perfect design execution → Functional interactive elements → Professional multimedia integration → Responsive https://x.com/JulianGoldieSEO/status/1942303502702895287

Multi-agent coding systems are *crazy* good. Feels like cheating. 2-3x better than single-agent. Tutorial + prompts + code coming soon. Watch 2min sneak peek. https://x.com/mckaywrigley/status/1944123521006788818

n8n might have a problem… @pipedream just dropped String, an AI agent that builds AI agents… Here’s what it does: – Prompt and deploy in seconds – 10x easier than no-code builders – Auto-handles auth & troubleshooting I’m testing it now and honestly? It’s pretty damn https://x.com/tomcrawshaw01/status/1942745648736166146

RT @MishaLaskin: Engineers spend 70% of their time understanding code, not writing it. That’s why we built Asimov at @reflection_ai. The… https://x.com/swyx/status/1945503020177068506

Sometimes you get lucky with vibe coding. These days, I rely less on luck and get better results by focusing on context engineering. I built this fully functional deep research agent with Replit Agent and n8n in <10 mins. And it’s deployed too! What a time to be alive! https://x.com/omarsar0/status/1940527082779693108

The UX of AI doesn’t exist yet. Imagination, taste, and obsessing over the agent-human feedback loop—that’s how we’ll get it right. Meet @tryramp’s first step into agentic orchestration. Spoiler alert, it’s not just chat. https://x.com/diegozaks/status/1943323500464259526

“turbopuffer makes it too easy to build state-of-the-art AI apps” – Notion https://x.com/turbopuffer/status/1945865085530026359

Unless and until agents really do work at expert level, the benefits of AI use are going to be contingent on the skills of the AI user, the jagged abilities of the AI you use, the process into which you integrate use, the experience you have with the AI system & the task itself https://x.com/emollick/status/1943476555742957864

USE CASE 4: Grab highlights from social feeds If you don’t want to ever open up LI, it can scroll your feed for you and tell you whats happening today Getting an AI agent to doom scroll for you is a next gen life hack 😂 https://x.com/rowancheung/status/1945524021455675699

We’re entering the AI Agent era. I’ve been using #A2A + #ADK to build internal agents, and one thing is clear. Agents work best with a single, focused role. Like micro services, they link together to form larger systems but with some big differences: 🧵 https://x.com/rootwarp/status/1930958894815449143

We’ve open-sourced Trae-Agent. You can all `git clone` `cd trae-agent` now https://x.com/Trae_ai/status/1941019035141132693

Why not all vector databases are agent-ready? Vector database is the backbone of AI agent memory. So its choice is one of the most crucial infrastructure decision. Your demo might crush it, but can your infrastructure survive success? Most vector databases weren’t built for https://x.com/TheTuringPost/status/1946342951199871340

You’re building AI agents the wrong way. @tyler_agg (PhD) shared what actually works in production: – Map the manual process completely first – Avoid autonomous agents for consistency – Design prompts like you’re teaching a novice Get the deep dive video in the post below 👇 https://x.com/tomcrawshaw01/status/1942234714267586901

If we compared AI capabilities against humans with no access to tools, such as the internet, we would probably find that AI already outperformed humans at many or most cognitive tasks we perform at work. But of course this is not a helpful comparison and doesn’t tell us much https://x.com/random_walker/status/1946180439045018046

.@gumloop_ai’s MCP nodes feel like the future of no-code AI automations. I spun up a custom reddit node to get trending posts in PLAIN ENGLISH – wow. It feels like lovable for AI Automations. “Vibe Automation” may soon become a thing with MCP nodes. Link to full https://x.com/dswharshit/status/1932788275787542776

Claude autonomously managing data on LinkedDataHub using MCP tools 😌 In 2 minutes it has created a tourist guide to Copenhagen as an interactive document that reuses Linked Open Data resources. Stay tuned for updates! 🚀 https://x.com/namedgraph/status/1940072384046014902

Cursor ultra tip connect cursor to github using gitmvp mcp ask ai to build any feature get code in one prompt🤯 https://x.com/devloperVivek/status/1941894778687479915

Here’s the easiest way to build any MCP server: – Download the FastMCP repo with GitIngest. – Give it to @FactoryAI and specify the MCP server to build. Factory generates production-ready code with README, usage, error-handling—everything! https://x.com/DailyDoseOfDS_/status/1934544048897060955

I am about to launch the Context Engineer MCP. It’s packed with a lot of cool features. – it can detect when the user asks for a complex feature and triggers structured planning. – it gathers requirements by asking questions and analyzing your code base – it generates PRD, https://x.com/alessiocarra_/status/1939717386396975577

I’ve been using AI to automate literally everything – proposals, task assigning, updates, all of it. Running it on a custom Notion MCP + Cursor + Telegram bot. Super convenient, saves me hours daily. Want something like this for ur workflow? reply below 👇 https://x.com/0xratnakar/status/1941131957548835290

if you havent tried the Chrome + iMessage + Apple Notes + Linear + Gmail + GCal DXT integrations in Claude you are missing out the literal LLM OS evolution Smarter Siri is here; it’s just called Claude Desktop https://x.com/swyx/status/1945734758102868243

If you want to massively speed up your iteration on shipping, you’ve GOT to use playwright mcp and tell your agent how to use it in your https://x.com/ryancarson/status/1943295167491871112

introducing furi. the fastest way to ship agents that work with any mcp server. a single line to add it in your codebase. https://x.com/winzyes/status/1940851985705836656

MCP is redefining what AI can do, and 95% of builders haven’t caught on. I’ve built agents that run my calendar, organize files, and manage tasks. All autonomous. The results changed how I think about automation. I’m breaking down exactly how MCP works (with code + examples): https://x.com/LoicReco/status/1940424881411354975

MCP-Unity: Game development with Unity Engine https://x.com/MCP_Community/status/1940687411442602385

Setting up your own MCP server shouldn’t take hours. With Composio, you can launch a fully functional MCP server straight from the dashboard in just a few clicks. No setup scripts. No infra headaches. No complex configs. Just a clean, intuitive interface where you click, https://x.com/apoorv_taneja/status/1939909421905400185

This Claude MCP AI Agent replaces your $200K+ Operations Teams. I probably shouldn’t be sharing the exact system for free… while I was trying to catch Pikachu at 3am on Pokemon Go, it audited my entire business, found 12 bottlenecks, and built me 5 production-ready n8n agents https://x.com/aryanXmahajan/status/1942652342068994218

Tip: @claude_code is useful as a general agent, beyond coding. Name your files well. For mac users, Apple Notes and iMessages exists in your local file system. With your permission, @claude_code will be able to help with them. Example usages : /journal command that will https://x.com/claude_code/status/1944944964708000083

Turn any React app into an MCP client in just 3 lines of code. Cloudflare’s use-mcp library lets you connect React app to any MCP server with built-in authentication. 100% Opensource. https://x.com/Saboo_Shubham_/status/1936611861879021684

Wow! Here’s a goldmine of 450+ MCP servers, all pre-containerized in a single repo! 🤯 ✔️ No manual setup! You just pull the image 🛡️ Runs safely in isolated containers 🔁 Auto-updated via Nixpacks By far the easiest and safest way to use MCP servers! Repo link below ↓ https://x.com/DataChaz/status/1933499735064580142

You can now turn any program into a self-improving AI agent Simply add the new AgentOps MCP server to your favorite AI coding tool for improved data extraction, and let Cursor code it for you. AI agents can code themselves, troubleshoot, and self-repair. Here’s how: https://x.com/AgentOpsAI/status/1938601964918432109

Here’s my 1hr tutorial on how to use Claude Code for notes & research. 10x your notes with: – core agentic flows – custom commands – automated tags/links – subagents – cloud usage – stt The goal is to “agent-pill” you on the future of work. Watch for 10 tips + demos in 61min. https://x.com/mckaywrigley/status/1943034127462339060

Now you can supercharge your terminal with MCP servers (open-source). MCP CLI lets you interact with local and remote MCP servers, built with a rich UI, and full LLM provider integration. You can run tools, manage conversations, or automate workflows directly from your https://x.com/_avichawla/status/1941746751981093034

🚀 Qwen Chat for Desktop is here! 💻 All the power of Qwen Chat — now with MCP support for smarter, faster agents. ⚡️ Run MCP Server, boost productivity, and stay in control. 📥 Grab it now: https://x.com/Alibaba_Qwen/status/1943692825566355819

The latest MLX Swift supports tvOS 📺 ! You can use the same code for macOS, iOS (iPhone, iPad), visionOS (Vision Pro), and now tvOS (Apple TV). https://x.com/awnihannun/status/1944893455202967921

Just discovered PutnamBench for theorem proving : from a problem & its answer, models must generate a formally correct mathematical proof 👌 (or from the problem -> the proof and answer) Nice one to evaluate actual reasoning/logic capabilities (though in formal languages) https://x.com/clefourrier/status/1945386312212664804

just tried and the agent solved level 1 in its own browser lol. thanks for creating the benchmark! https://x.com/EdwardSun0909/status/1946304932333940899

🎬 Built my own AI Email Agent using @n8n_io + OpenAI + Gmail! Watch it read messages, generate smart replies, and auto-send emails—fully automated! ⚡📩 #AI #n8n #OpenAI #NoCode #Automation #AIAgent https://x.com/HeetDayaniBJP/status/1939587941267087364

Building in human-in-the-loop in a full-stack app requires agent orchestration that can pause, wait for human input in one endpoint, and resume the workflow from another endpoint. This is a great tutorial by @rsrohan99 showing you how to build an e2e agent with human-in-the-loop https://x.com/jerryjliu0/status/1946003574904987743
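
The pause-and-resume idea in that post can be sketched without any framework. The following is a minimal illustration, not the tutorial's code: the two functions stand in for two HTTP endpoints, `RUNS` is a hypothetical in-memory replacement for a durable store, and the "agent step" is a placeholder string rather than a real model call.

```python
import uuid
from dataclasses import dataclass


@dataclass
class Run:
    task: str
    status: str = "running"  # running -> waiting_for_human -> done / rejected
    draft: str = ""
    result: str = ""


# Hypothetical in-memory store; a real app would persist paused runs durably.
RUNS: dict[str, Run] = {}


def start_workflow(task: str) -> str:
    """First 'endpoint': run the agent step, then pause for human review."""
    run = Run(task=task)
    run.draft = f"Proposed action for: {task}"  # placeholder for an agent/LLM call
    run.status = "waiting_for_human"
    run_id = uuid.uuid4().hex
    RUNS[run_id] = run
    return run_id


def resume_workflow(run_id: str, approved: bool) -> Run:
    """Second 'endpoint': resume the paused run with the human's decision."""
    run = RUNS[run_id]
    if run.status != "waiting_for_human":
        raise ValueError(f"run {run_id} is not paused (status={run.status})")
    if approved:
        run.result = f"Executed: {run.draft}"
        run.status = "done"
    else:
        run.status = "rejected"
    return run


run_id = start_workflow("refund order 1234")
print(RUNS[run_id].status)  # the run is parked until a human responds
final = resume_workflow(run_id, approved=True)
print(final.status, "|", final.result)
```

Keying the paused state by a run ID is what lets the second request arrive later, and from a different endpoint, than the first.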

Built a Voice AI Agent for a real estate client — helped close $43,000 in deals. Set it up in just 20 minutes using *n8n* + AI. Fully automated, no manual follow-ups needed. Want the full setup + workflow? Like, RT & reply “Ai” — I’ll DM it (make sure you’re following). https://x.com/Sitara_AI7/status/1941315302194807174

Built my first n8n Email Summary Agent! #DEMO ✅ Automated workflow that summarizes emails 2x daily with AI Results: 2+ hours saved daily, zero missed actions! Real automation solving Real problems 💪 Stay tuned on my AI workflow Automation Journey, Thanks to @omoalhajaabiola https://x.com/KEmehige/status/1941978990635319419

Meet Context, the first AI office suite. Humanity spends 2.5 trillion hours a year on office work. Context can one-shot most of it. Welcome to the era of vibe-working. Sign up today or tag @contextsuite with a prompt. https://x.com/josephsemrai/status/1942271504009551986

Seeing lots of questions like: wait, I thought Windsurf was already acquired? What is Cognition buying? Let me explain. Windsurf the company is an *extraordinary* asset. It was missing its founders and research team, but it has a beloved product, valuable IP, an incredible https://x.com/russelljkaplan/status/1944845868273709520

The Tiny Teams Playbook – by Shawn swyx Wang – Latent.Space https://www.latent.space/p/tiny

Show HN: ArchGW – An intelligent edge and service proxy for agents | Hacker News https://news.ycombinator.com/item?id=44546265

guidde・Magically create video documentation with AI https://www.guidde.com/

There is a 10% chance that ChatGPT agent will actually gamble away your life savings if you asked it https://x.com/scaling01/status/1945930617775882728

today’s LLMs have reduced the cost of mediocrity to next-to-nothing unfortunately, the cost of greatness remains high as it’s ever been https://x.com/jxmnop/status/1944806459868381313

The research on AI companions and mental health is still very preliminary & unclear as to long-term impact. Seems like an important topic to research right now. (I would also hope that xAI is tracking anonymized data about their new companion product for known potential harms) https://x.com/emollick/status/1945593158190207096

We ourselves are enthusiastic users of AI in our scientific workflows. On a day-to-day basis, it all feels very exciting. But the impact of AI on science as an institution, rather than individual scientists, is a different question that demands a different kind of analysis. https://x.com/random_walker/status/1945849588805447743

I made my first internet dollar(s) this weekend🤯 Been releasing free resources (tutorials, source code) for a while now, esp. on YouTube. Decided to release a Builder Pack that helps accelerate agent development with Google’s ADK/A2A So grateful & humbled for the support ❤ https://x.com/chongdashu/status/1934599562959720607

Just dropped: ADK-Agent-Examples! Build AI agents with @Google’s Agent Development Kit + Nebius AI Studio. From multi-agent pipelines to tool integrations, it’s a playground for LLM-powered apps. 🧠 Powered by @AIatMeta Llama 3, @Alibaba_Qwen 3, @deepseek_ai R1, via Nebius https://x.com/nebiusaistudio/status/1927759640961466404

Host MCP servers on Cloud Run! Need a hosting platform to support the tools and resources your AI agents interact with? Deploy MCP servers to Cloud Run to take advantage of Cloud Run’s pay-per-use, automatic scaling infrastructure with GPU instances → https://x.com/GoogleCloudTech/status/1940825366936813846

Built something fun: Dia Browser insertion cursor but works everywhere https://x.com/naveennaidu_m/status/1889554727362593215

so I integrated the google_search tool for my agent to be able to search the internet. Here, I asked ‘tory’ my agent, what infoFi was, and it told me! if this isn’t cool, I don’t know what is. https://x.com/islathebuilder/status/1930913771284738112

Holy Shit! Dia Browser (@diabrowser) killed 𝕏’s X Pro feature with their “Split View Pane” feature. 🤯 https://x.com/MehulFanawala/status/1940640193008288021

Maybe we should buy Cline 😅😅😅 https://x.com/ClementDelangue/status/1946288857814610057

Fun new @gumloop_ai flow I’m using: It scrapes a brand’s static ad library, writes a JSON description of each ad, then rewrites the JSON for YOUR brand based on your visual guidelines and product features. Getting outputs like the attached. No designer needed. https://x.com/peterczepiga/status/1933493132831711594

Built an AI system that turns any product URL into complete Facebook ads in minutes. Just drop in a product link → gets brand research, target audience, pain points → generates ad copy + image prompts → creates final ad creatives. All automated through n8n + Airtable. https://x.com/mikefutia/status/1941982241233486272

Le Chat dives deep. | Mistral AI https://mistral.ai/news/le-chat-dives-deep

RT @MistralAI: Introducing the world’s best (and open) speech recognition models! https://x.com/ClementDelangue/status/1945233605745135754

RT @MistralAI: Try Voxtral with an API call: https://x.com/ClementDelangue/status/1945233623164006523

the best part about the mistral release is that the models don’t lose as much on text – this has been the biggest pain point for audioLMs for a long while https://x.com/reach_vb/status/1945140430288417007

Voxtral | Mistral AI https://mistral.ai/news/voxtral

RT @sama: watching chatgpt agent use a computer to do complex tasks has been a real “feel the agi” moment for me; something about seeing th… https://x.com/ShunyuYao12/status/1945917559796298083

Open Deep Research is here 🔍 We’ve open sourced one of the most powerful agent use cases. Built on LangGraph, Open Deep Research: • Uses a supervisor architecture to coordinate research sub-agents • Supports your own LLMs, tools, and MCP servers • Produces high-quality https://x.com/LangChainAI/status/1945514869224357904

RT @liliang_ren: We’re open-sourcing the pre-training code for Phi4-mini-Flash, our SoTA hybrid model that delivers 10× faster reasoning th… https://x.com/ClementDelangue/status/1946246738823545317

I used to spend hours on blog outlines. Now it takes 10 seconds. This AI agent builds your blog outline before you finish your coffee ☕ . We built a new AI agent using @gumloop_ai that takes a keyword/topic and instantly gives you a full SEO-ready blog outline – backed by https://x.com/kavin_lazyy/status/1934956454789914702

Pixelesq – AI-Native No-Code Website Builder https://pixelesq.com/

The Hitchhiker’s Guide to Productionizing Retrieval 🤖 This is a must-read starter by @itsclelia, in partnership with @qdrant_engine, for anyone looking to build a RAG/retrieval pipeline for your AI agents. It contains practical tips and reference repos 📚 for every stage of the https://x.com/jerryjliu0/status/1945647281782636974

Chain of Thought (CoT) monitoring could be a powerful tool for overseeing future AI systems—especially as they become more agentic. That’s why we’re backing a new research paper from a cross-institutional team of researchers pushing this work forward. https://x.com/OpenAI/status/1945156362859589955

chatgpt_agent_system_card_launch.pdf https://cdn.openai.com/pdf/6bcccca6-3b64-43cb-a66e-4647073142d7/chatgpt_agent_system_card_launch.pdf

Most agent tests stop at tiny teams and ignore how the bots actually coordinate. AGENTSNET, proposed in this paper, shows what happens when the crowd scales and asks for real teamwork. 🧩 The benchmark packs 5 classical distributed tasks, namely coloring, vertex cover, https://x.com/rohanpaul_ai/status/1945838145624297531

This large study of 187k developers using GitHub Copilot finds AI transforms the nature of coding. Coders focus: more coding & less management. They need to coordinate less, working with fewer people. They experiment more with new languages, which would increase earnings $1,683/year https://x.com/emollick/status/1943426233791869348

This paper proposes Agent Exchange, an auction market that lets agents bid, form teams, and split rewards fairly. ⚙️ AEX turns a plain user query into a fast bidding war The user side module converts the request into a checklist, then broadcasts it. Each hub collects offers https://x.com/rohanpaul_ai/status/1944341030209220983

today I learned that creating python packages is pure cancer like I have never written so many shitty .toml .yml .md files in my whole life https://x.com/scaling01/status/1944204739052175856

if you’ve received a weird text from me recently that’s actually claude code, i’ve given it complete control over outbound comms https://x.com/vikhyatk/status/1945224884180644150

RT @mckaywrigley: So I gave Claude Code a Mac Mini. And it’s called Claudeputer. It runs 24/7 and it’s allowed to do whatever it wants -… https://x.com/imjaredz/status/1946304612102816136

RT @tenderizzation: aha sorry that wasn’t me it was claude code crazy that a superintelligent AI thinks we should be together tho https://… https://x.com/vikhyatk/status/1945227514101617075

We’ve published a directory of apps and tools that connect to Claude with one click. Browse and connect Claude to @canva, @figma, @linear, @NotionHQ, @stripe, and more. https://x.com/AnthropicAI/status/1944819149789700215

I built an AI cold calling system using Bland and n8n, and it’s been running for two months, automatically closing deals for me. Plus, I’ve set up a public phone number you can call to test out the AI yourself: Give it a try at (702) 979-7805. In this updated video, you’ll see https://x.com/alxberman/status/1939753810504949954

USE CASE 5: LI connection request automation If you get a lot of LI connection requests, it can go through them, and figure out cool people you may have missed I get hundreds a day and it’s impossible to manage, so this one is quite useful to me personally https://x.com/rowancheung/status/1945524022852379016

A.I. Is About to Solve Loneliness. That’s a Problem | The New Yorker https://www.newyorker.com/magazine/2025/07/21/ai-is-about-to-solve-loneliness-thats-a-problem

🚑 Built a health chatbot that suggests the right drugs based on your symptoms like a mini-doctor in your pocket. Powered by Google ADK. Just smart automation. 📷Demo below 👇 #TechAndVibes #NaijaToTheWorld #CodeAndCulture #DeveloperLifestyle https://x.com/ali_ogochu2581/status/1934663525022007341

I’m building some Java and Python agents with ADK, and bumping into issues that I can’t (yet) find Google results for. Thankfully, we shipped llms.txt files for the ADK ( https://x.com/rseroter/status/1929292162593706321

USE CASE 1: Email and Calendar management With the Google Cal and Gmail integrations, it can summarize emails, schedule meetings, and manage calendar events directly from the browser. It’s weird that Google hasn’t done this yet https://x.com/rowancheung/status/1945524016103796993

I built a scraper with @n8n_io that scrapes TikTok videos and breaks down the best-performing viral hooks on a TikTok account. All you have to do is: 1. Give it a profile link 2. Save the hooks data in your Google Sheet 3. It will break down their top-performing hooks for you. Amazing https://x.com/heart_yi/status/1940054596694761623

[2507.06261] Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities https://arxiv.org/abs/2507.06261

Building a Multi-Agent Deep Researcher with Gemini 2.5 Pro 🧑‍🔬📑 We’re excited to collaborate with @_philschmid and the @googleaidevs team on a brand-new tutorial 🧑‍🏫 : Build a multi-agent system with a researcher, writer, review agent that can search the web, record input, https://x.com/jerryjliu0/status/1944882346731430127

Google engineers shifted to a sparse mixture‑of‑experts transformer that picks only the needed mini‑networks per token, so compute stays low while total capacity rises. Paper: arxiv.org/abs/2507.06261, “Gemini 2.5: Pushing the Frontier with Advanced https://x.com/rohanpaul_ai/status/1944022179869241354

Google’s Gemini 2.5 paper has 3295 authors https://x.com/hardmaru/status/1944385851435205035

New Guide! Learn how to build a multi-agent “Deep Research” system with Gemini 2.5 and @llama_index. It dynamically searches the web, takes notes, and writes a comprehensive research report with a feedback loop 🚀 🔍 Search the web with google 📝 Take notes with a dedicated https://x.com/_philschmid/status/1944835088039977124

Today we are rolling out our first Gemini Embedding model, which ranks #1 on the MTEB leaderboard, as a generally available stable model. It is priced at $0.15 per million tokens and ready for at scale production use! https://x.com/OfficialLoganK/status/1944806630979461445

Tried out Google’s ADK (Agent Development Kit) and legit, the inbuilt UI with Gemini Free API is wild 🤯 So easy to use and looks sick! #ADK #GoogleAI #GeminiAPI https://x.com/027_Priyanshu/status/1934106038632153243

What if you had a smart personal assistant living in your watch that could share info and manage tasks for you when your hands are full? 🧠 You’re about to find out. Meet Gemini, rolling out now on Wear OS 4+ watches: https://x.com/WearOSbyGoogle/status/1942961942693359894

Build your first AI agent + MCP Server in Python. Here is everything you need to build your first AI agent in less than 20 minutes. About the code you’ll see here: 1. I used Google ADK with Gemini Flash to power the agent 2. The agent connects to an MCP server 3. It also https://x.com/svpino/status/1929881755915366772

Gemini CLI can automate your computer using MCP 🔥 Add Windows MCP (or macOS MCP) to Gemini CLI and you can tell it what to do autonomously. Gemini then takes control of your entire system to achieve the goal you’ve set. Links below https://x.com/itsPaulAi/status/1940903613888696776

Someone vibe coded an AI agent that can use your phone on its own. He outlines using this as a ChatGPT-like interface, except things actually get done automatically. The person built this using Google ADK and the Gemini API 💀 Credits: Tyrange-D via r/singularity https://x.com/DigestibleAICo/status/1924218874678960504

RT @OfficialLoganK: Today we are rolling out our first Gemini Embedding model, which ranks #1 on the MTEB leaderboard, as a generally avail… https://x.com/demishassabis/status/1944870402251219338

Gemini-CLI is bad compared to Claude code in very fixable ways. codex-cli is bad in odd ways. Feels unfriendly, unlike the GUI version of Codex and unusual for product-strong OpenAI https://x.com/kylebrussell/status/1945242558487044118

18/ How are you using comet? Any use cases that I missed? Follow @AtomSilverman and @AgentOpsAI for everything AI agent-related Have you tried the @AgentOpsAI MCP server? Link in bio. Last week’s thread: https://x.com/AtomSilverman/status/1944456541169762363

a fresh batch of comet invites just went out https://x.com/AravSrinivas/status/1945669970618421699

Looks like Grok 4 is 10^27 FLOPs given their graphs? HLE score is 26% without tools, Gemini 2.5 is 21.6% without tools. Curious what the tool piece is. https://x.com/emollick/status/1943162710725657055

LLMs for IMO 2025: gemini-2.5-pro (31.55%), o3 high (16.67%), Grok 4 (11.90%). https://x.com/denny_zhou/status/1945887753864114438

Gemini generates the best prompts for Veo 3. Full code below (Python): import time from google import genai from google.genai import types client = genai.Client() operation = client.models.generate_videos( model="veo-3.0-generate-preview", prompt="""{ "character_name": https://x.com/_philschmid/status/1945898590821584989

Generate videos with Veo 3 | Gemini API | Google AI for Developers https://ai.google.dev/gemini-api/docs/video

Start building with Veo 3: our state-of-the-art video generation model now available in paid public preview via the Gemini API and @Google AI Studio. 🎨 Here’s how to try it → https://x.com/GoogleDeepMind/status/1945886603328778556

RT @GeminiApp: A new Gemini feature just dropped and everything is alive?! Now you can turn photos into videos with sound in Gemini. https://x.com/demishassabis/status/1944939563170062804

Just shipped our first fully-automated newsletter! An experiment in merging editorial quality with AI workflows. Here’s how we built it: – The Funnel filters signal from noise → RSS feeds, Google Scripts, OpenAI – The Editor curates and cleans → @gumloop_ai – The Delivery https://x.com/kazsatamai/status/1933196696781214064

In minutes, AI builds custom proteins to battle life-threatening super-bugs. For the first time, Australian scientists have used Artificial Intelligence (AI) to generate a ready-to-use custom biological protein, in this case, one that can kill antibiotic resistant bacteria like https://x.com/rohanpaul_ai/status/1944298248346513816

This guy literally built a massive database of n8n workflows. 2,000+ AI workflows across 365 different services and APIs. Comes with a lightning-fast search so you can find just what you need. https://x.com/Saboo_Shubham_/status/1940598129243631947

RT @balesni: A simple AGI safety technique: AI’s thoughts are in plain English, just read them We know it works, with OK (not perfect) tra… https://x.com/EthanJPerez/status/1946096565581730278

I am extremely excited about the potential of chain-of-thought faithfulness & interpretability. It has significantly influenced the design of our reasoning models, starting with o1-preview. As AI systems spend more compute working e.g. on long term research problems, it is https://x.com/merettm/status/1945157403315724547