Ethics/Legal/Security: AI News Week Ending 07/18/2025

congrats to @FakePsyho for claiming the top spot on the @atcoder World Finals programming competition (followed by OpenAI at #2)!”” / X https://x.com/gdb/status/1945553676321657127

Congrats to @FakePsyho for winning AtCoder World Tour Finals 2025 Heuristic 🚀 Humanity has prevailed (for now!) Thanks OpenAI for sponsoring #AWTF2025, and getting #2 on this grand challenge. Proud of @SakanaAILabs & @AtCoder’s ALE-Agent for reaching #5, on a shoestring budget!”” / X https://x.com/hardmaru/status/1945850637528490134

good job psyho”” / X https://x.com/sama/status/1945540005805658440

official results from @atcoder World Tour Finals are in — great results for both humans (#1 and #3 onwards) and AI (#2 in the world!). a milestone for AI for solving hard problems.”” / X https://x.com/gdb/status/1945989983569129632

RT @FakePsyho: Humanity has prevailed (for now!) I’m completely exhausted. I figured, I had 10h of sleep in the last 3 days and I’m barely…”” / X https://x.com/itsclivetime/status/1945590725279977900

we’re competing in the @atcoder World Finals programming contest. real nailbiter — OpenAI has been #1 for most of the contest. looked like it might be over when @FakePsyho pulled ahead, but we’ve just retaken the lead. 1 hour and 20 minutes to go! https://x.com/gdb/status/1945404295794610513

OpenAI’s Agent mode can now work with Spreadsheets achieving 45% on SpreadsheetBench https://x.com/scaling01/status/1945896464632148366

OpenAI on X: “We’ve decided to treat this launch as High Capability in the Biological and Chemical domain under our Preparedness Framework, and activated the associated safeguards. This is a precautionary approach, and we detail our safeguards in the system card. We outlined our approach on” / X
https://x.com/OpenAI/status/1945904754443669659

Preparing for future AI capabilities in biology | OpenAI
https://openai.com/index/preparing-for-future-ai-capabilities-in-biology/

RT @boazbaraktcs: ChatGPT Agent is the first model we classified as “”High”” capability for biorisk. Some might think that biorisk is not r…”” / X https://x.com/jekbradbury/status/1945944398199677016

🚨 BREAKING: @Kimi_Moonshot’s Kimi-K2 is now the #1 open model in the Arena! With over 3K community votes, it ranks #5 overall, overtaking DeepSeek as the top open model. Huge congrats to the Moonshot team on this impressive milestone! The leaderboard now features 7 different https://x.com/lmarena_ai/status/1945866381880373490

5 Things You Need to Know About Moonshot AI and Kimi K2, the New #1 model on the Hub https://huggingface.co/blog/fdaudens/moonshot-ai-kimi-k2-explained

Every ML Engineer’s dream loss curve: “Kimi K2 was pre-trained on 15.5T tokens using MuonClip with zero training spike, demonstrating MuonClip as a robust solution for stable, large-scale LLM training.” https://x.com/hardmaru/status/1943976259236901315

For those unfamiliar with Kimi K2: – Surpasses models like GPT-4.1 and Claude 4 Opus on coding benchmarks – Scores new highs on math and STEM tests among non-reasoning systems – Doesn’t even have multimodal or reasoning capabilities yet kimi [dot] com https://x.com/rowancheung/status/1944647747027558636

I think I will spend the rest of the day letting Kimi generate these reports. They are so nice to look at compared to what OpenAI, Anthropic and others give you https://x.com/scaling01/status/1944850575470027243

It’s so beautiful to see the @Kimi_Moonshot team participating in every single community discussions or pull requests on @huggingface (the little blue bubbles on the right). In my opinion, every serious AI organization should dedicate meaningful time and ressources to this https://x.com/ClementDelangue/status/1946208120385999328

It’s undeniable with Kimi-K2 China has reached the frontier and will surpass the US next year”” / X https://x.com/scaling01/status/1944045857340359044

Kimi has a distinct writing style that is free of most of the patterns we now associate with AI generated text. Both Kimi and DeepSeek’s prose is apparently even more impressive in Chinese. Both of these models have a unique ‘voice’, quite different from Western AI. https://x.com/AndrewCurran_/status/1944434569899290839

Kimi is 200 people, very few of them with “frontier experience”, a platform (but you can buy such data) and a modest GPU budget. In theory there are many dozens of business entities that could make K2 in the West. It’s telling how none did. Not sure what it’s telling tho.”” / X https://x.com/teortaxesTex/status/1944856509734961596

Kimi is a really weird model, and it needs a lot more testing to figure out For example, I gave it an altered version of Great Gatsby and it found the two alterations (as does Claude) but then made up a ton of hallucinated nonsense that sounded plausible but was just plain wrong https://x.com/emollick/status/1944974487369158864

Kimi K2 is an incredible model.”” / X https://x.com/skirano/status/1944123290525831317

Kimi K2 is now available on https://x.com/togethercompute/status/1944952034840732138

Kimi K2 is number one trending on HF, congrats! https://x.com/huggingface/status/1944155602583691492

Kimi K2 is so good at tool calling and agentic loops, can call multiple tools in parallel and reliably, and knows “”when to stop””, which is another important property. It’s the first model I feel comfortable using in production since Claude 3.5 Sonnet. https://x.com/skirano/status/1944475540951621890

Kimi K2 just hit #1 on @huggingface trending models in <24 hours! This MoE powerhouse packs 1T params with 32B active – crushing coding challenges and autonomous agent tasks. https://x.com/fdaudens/status/1943996876778614948

Kimi K2 now on https://x.com/togethercompute/status/1945143838911128019

Kimi K2, the latest from @Kimi_Moonshot is now live in the Arena! https://x.com/lmarena_ai/status/1944827675597791456

Kimi K2: Open Agentic Intelligence https://moonshotai.github.io/Kimi-K2/

Kimi team is more american than most American labs lol”” / X https://x.com/Teknium1/status/1944430651278537098

Kimi team just trained a state of the art open source model 32B active parameter/1T total with 0 training instabilities, thanks to MuonClip, this is amazing https://x.com/eliebakouch/status/1943687750563004801

Kimi-k2 seems to be a very good (and giant & odd) open weights model that may be the new leader in open LLMs. It is not beating the frontier closed models on my weird tests, but it doesn’t have a reasoner yet. More testing needed but Chinese open weights models are impressive. https://x.com/emollick/status/1943901440453259374

past week had huuuge releases, here’s our picks 🔥 > moonshot released Kimi K2, sota LLM with 1T total 32B active parameters 🤯 > @huggingface released SmolLM3-3B, best LM for it’s size, offers thinking mode 💭 as well as the dataset, smoltalk2 > Alibaba released WebSailor-3B, https://x.com/mervenoyann/status/1944757807191888080

Pretty wild that @Kimi_Moonshot dropped a 1T parameter (32B active) MoE trained on 15.5 Trillion tokens – MIT licensed 🔥 Beats all other open weights models across coding, agentic and reasoning benchmarks Ofcourse live on Hugging Face! 🤗 https://x.com/reach_vb/status/1943703030026641801

RT @ArtificialAnlys: While Moonshot AI’s Kimi k2 is the leading open weights non-reasoning model in the Artificial Analysis Intelligence In…”” / X https://x.com/zacharynado/status/1944945039647629548

RT @DeepInfra: Moonshot AI’s Kimi 2 is now live on DeepInfra, as always at the best price of $0.55/$2.20, full tool call and context suppor…”” / X https://x.com/jeremyphoward/status/1944939322735780260

RT @htihle: Results from kimi-k2 on WeirdML! It does very well for a non-reasoning model. Like a scaled up deepseek-v3, beating out gpt-4.1…”” / X https://x.com/bigeagle_xd/status/1944325829657554962

RT @huggingface: Kimi K2 is number one trending on HF, congrats! https://x.com/_akhaliq/status/1944159007456784512

RT @ivanfioravanti: Kimi-Dev-72B-4bit-DWQ is on mlx-community! It took 9 hours to create 😅 Quick performance test on M3 Ultra: Prompt: 56…”” / X https://x.com/awnihannun/status/1944108947411284374

RT @Kimi_Moonshot: 🚀 Hello, Kimi K2! Open-Source Agentic Model! 🔹 1T total / 32B active MoE model 🔹 SOTA on SWE Bench Verified, Tau2 & Ace…”” / X https://x.com/stanfordnlp/status/1944114320226263165

RT @koltregaskes: Kimi-K2 tops EQ-Bench, the benchmark that measures emotional intelligence. https://x.com/jeremyphoward/status/1944326479246147899

RT @lmarena_ai: 🚨 BREAKING: @Kimi_Moonshot’s Kimi-K2 is now the #1 open model in the Arena! With over 3K community votes, it ranks #5 over…”” / X https://x.com/Kimi_Moonshot/status/1945897926796185841

RT @lmarena_ai: Kimi K2, the latest from @Kimi_Moonshot is now live in the Arena! https://x.com/Kimi_Moonshot/status/1945462820147249523

RT @masondrxy: New K2 model from @Kimi_Moonshot is officially supported by @LangChainAI on @GroqInc! See 👇 https://x.com/Hacubu/status/1945144499228811676

RT @OpenRouterAI: Kimi K2 is now passing 200 tokens per second on OpenRouter Props to @GroqInc !”” / X https://x.com/JonathanRoss321/status/1945779694256722025

RT @reach_vb: LOVE ITT! You can run Kimi K2 (1T token MoE) on a single M4 Max 128GB VRAM (w/ offloading) or a single M3 Ultra (512GB) 🔥 Th…”” / X https://x.com/reach_vb/status/1944997786329460978

RT @sam_paech: Kimi-K2 just took top spot on both EQ-Bench3 and Creative Writing! Another win for open models. Incredible job @Kimi_Moonsh…”” / X https://x.com/Teknium1/status/1944285648825069759

RT @sdrzn: Seriously blown away by Moonshot’s new Kimi K2 model in @cline. It beats Claude Opus 4 on coding benchmarks and is up to 90% che…”” / X https://x.com/ClementDelangue/status/1946316382313869778

RT @weights_biases: NEW: Kimi K2 is now live on W&B Inference by @CoreWeave! It’s the first truly open challenger, ready for production wi…”” / X https://x.com/l2k/status/1945225318928634149

Seen many people mention how kimi K2 for example has no CoT or thinking which isn’t true, more of an issue with terminology Main difference with reasoning models (in terms of actual functionality) is the thinking is hidden during general non-verifiable rl, so the model can”” / X https://x.com/Grad62304977/status/1944050338551484702

Some thoughts on the decisions behind Kimi K2’s architecture – from our infra staff”” / X https://x.com/Kimi_Moonshot/status/1944589115510734931

Thank you to @Kimi_Moonshot for quickly addressing my queries on the correct system prompt for Kimi K2! We’ll be re-uploading all BF16 + dynamic @unslothai GGUFs with fixed tool calling & the new sys prompt! Sys prompt = “”You are Kimi, an AI assistant created by Moonshot AI.”””” / X https://x.com/danielhanchen/status/1946163064665260486

That’s from Kimi K2 blog post. In case someone says «wow and it’s not RL-trained». It very much is, don’t get misled by the absence of long CoT. Looks like DeepResearch but It’s probably similar to what’s been happening since Sonnet 3.5, giving it uncanny «pre-reasoner» powers. https://x.com/teortaxesTex/status/1944416704253018372

The success of Kimi K2 is no accident. The unfortunate reality in AI is that user experiences haven’t yet fully caught up to raw model capabilities. Experiences have plateaued. There are only so many coding assistants, research tools, or agents you can realistically offer, and https://x.com/skirano/status/1945505132323766430

TheZvi’s answer “why isn’t there American Kimi” basically: incentives. I *partially* buy it. But given the Concern about the dominance of Chinese open models, expressed by numerous patriotic think tanks, I think we could expect *someone* rising to the task. https://x.com/teortaxesTex/status/1945624983985639487

This is what 200 tokens/second looks like with Kimi K2 on @GroqInc For reference, Claude Sonnet-4 is usually delivered at ~60 TPS https://x.com/cline/status/1945354314844922172

True, the first ever application of Muon was to break the 3-second barrier in the CIFAR-10 speedrun. For perspective on scale that was a 3e14 flop training; @Kimi_Moonshot’s K2 is 3e24 flops, 10 orders of magnitude larger. https://x.com/kellerjordan0/status/1945701578645938194

We’ve just fixed 2 bugs in Kimi-K2-Instruct huggingface repo. Please update the following files to apply the fix: – tokenizer_config.json: update chat-template so that it works for multi-turn tool calls. – tokenization_kimi.py: update encode method to enable encoding special”” / X https://x.com/Kimi_Moonshot/status/1945050874067476962

We’ve submitted Kimi K2 to @lmarena_ai. Waiting to be added to the match pool: https://x.com/Kimi_Moonshot/status/1944754256059453823

You might not have heard of Moonshot AI, but within 24 hours, their Kimi K2 model shot to the top of the Hugging Face trending models. So… who are they, and why does this matter? 🧵Here are a few standout facts:”” / X https://x.com/fdaudens/status/1945128932040208867

Kimi K2 at 185 t/s (or even higher, nearly 220 in my short tests) is probably the best use of Groq to date, and can make K2 immediately more compelling than Sonnet 4. Impressive that they’ve managed to fit this 1T monster on their chips. https://x.com/teortaxesTex/status/1944950183051321542

Quick start project for Claude Code on Kimi:”” / X https://x.com/jeremyphoward/status/1944326308210921652

Very interesting – you can use Kimi with the Anthropic API. This means, perhaps most importantly, that you can now use Kimi with Claude Code! 🤯 https://x.com/jeremyphoward/status/1944322841866125597

RT @allhands_ai: Kimi-K2 is definitely the first strong open-weight competitor to Claude Sonnet. 65.4% on SWE-Bench Verified in OpenHands,…”” / X https://x.com/TheZachMueller/status/1945545349352829439

The DeepSeek moment was supercharged by pent-up consumer demand for a good free AI for those who wouldn’t pay (especially for students for homework) A reason Kimi K2 has not had the immediate public impact of DeepSeek may be, for most consumers/students, DeepSeek is good enough”” / X https://x.com/emollick/status/1944764085741957153

RT @yawnxyz: Kimi K2 is **INCREDIBLE** at using tools. I built a chrome extension to chat with Google Maps, but I never posted it. All th…”” / X https://x.com/bigeagle_xd/status/1945087963408351728

I’ve been a bit quiet on X recently. The past year has been a transformational experience. Grok-4 and Kimi K2 are awesome, but the world of robotics is a wondrous wild west. It feels like NLP in 2018 when GPT-1 was published, along with BERT and a thousand other flowers that https://x.com/DrJimFan/status/1944443447953498285

I doubt that Sama’s delay of open model is about Kimi. But I don’t find the logic here compelling either. «Only nerds noticed Kimi». Well, Sama is loathed. The point of his model is, above all things, PR. If it’s not open SOTA, reports will notice *that*. I think he wants SOTA. https://x.com/teortaxesTex/status/1944263611398180954

Rumors that OpenAI delayed their open-source model because of Kimi are fun, but from what I hear: – the model is much smaller than Kimi K2 (<< 1T parameters) – super powerful – but due to some (frankly absurd) reason I can’t say, they realized a big issue just before release, so”” / X https://x.com/Yuchenj_UW/status/1944235634811379844

Super excited to see Kimi K2 land on Perplexity. If you’re fine-tuning, quick reminder: using the Muon optimizer during both fine-tuning and RL phases gives the best results (details are in our Moonlight paper).”” / X https://x.com/Kimi_Moonshot/status/1944224975428497549

Grok 4 suggests that scaling still works (with the diminishing returns predicted by the scaling law), and that tool use can unlock performance gains. Kimi suggests there continues to be big opportunities from improvements in methods (Muon, etc.). Lots of paths for AI right now.”” / X https://x.com/emollick/status/1944306918631018856

“these results were eye-opening for me… chatgpt agent performed better than i expected on some pretty realistic investment banking tasks”
https://x.com/tejalpatwardhan/status/1945894313977860203

ChatGPT agent for investment banking:”” / X https://x.com/gdb/status/1946074958238765503

Citi and Ant International Pilot AI-Enabled Forecasting Solution to Enhance FX Risk Management for Airline Customers
https://www.citigroup.com/global/news/press-release/2025/citi-ant-international-ai-solution-enhance-fx-risk-management-airline-customers

Citi, Ant International pilot AI-powered FX tool for clients to help cut hedging costs | Reuters https://www.reuters.com/business/finance/citi-ant-international-pilot-ai-powered-fx-tool-clients-help-cut-hedging-costs-2025-07-18/

Goldman Sachs is testing viral AI agent Devin as a ‘new employee’ | TechCrunch https://techcrunch.com/2025/07/11/goldman-sachs-is-testing-viral-ai-agent-devin-as-a-new-employee/

OpenAI working on payment checkout system within ChatGPT, FT reports | Reuters https://www.reuters.com/business/openai-working-payment-checkout-system-within-chatgpt-ft-reports-2025-07-16/

Been using the Dia browser for a couple of days now and realizing it’s become more of a hassle to navigate to ChatGPT or Perplexity. The deep integration with an LLM changes the experience of using a browser and navigating the internet. The browser wars are about to begin.”” / X https://x.com/alecdewitz/status/1935420754226790842

In the works already. Team moving at a pace that’s fast even for Perplexity standards. https://x.com/AravSrinivas/status/1945537471540072888

Perplexity is now the #1 overall app on App Store in India, ahead of ChatGPT. https://x.com/AravSrinivas/status/1945960772091433081

ChatGPT agent for working with Excel, Powerpoint, etc.:”” / X https://x.com/gdb/status/1946007318824673534

New from our security teams: Our AI agent Big Sleep helped us detect and foil an imminent exploit. We believe this is a first for an AI agent – definitely not the last – giving cybersecurity defenders new tools to stop threats before they’re widespread.https://x.com/tulseedoshi/status/1945113799297536313

AI firms like OpenAI are poaching Wall Street quants with massive paydays, shifting the talent landscape for building artificial general intelligence. 💰 https://x.com/fdaudens/status/1944759768528060558

The AI Labs Are Coming for Wall Street’s Quants – Business Insider https://www.businessinsider.com/ai-talent-openai-wall-street-quant-trading-firms-2025-7

I often rant about how 99% of attention is about to be LLM attention instead of human attention. What does a research paper look like for an LLM instead of a human? It’s definitely not a pdf. There is huge space for an extremely valuable “research app” that figures this out.”” / X https://x.com/karpathy/status/1943411187296686448

💥 Announcing ChatGPT agent: a powerful new agent that can use a computer, browse the web, write code, use a terminal, write reports, create images, edit spreadsheets, and even create slides for you. The slides often… need some work. But you know how this goes: first it’s https://x.com/kevinweil/status/1945896640780390631

ChatGPT agent for finding a great Airbnb:”” / X https://x.com/gdb/status/1946075573476069580

ChatGPT agent is ready to introduce itself. https://x.com/OpenAI/status/1945890050077782149

ChatGPT can now do work for you using its own computer. Introducing ChatGPT agent—a unified agentic system combining Operator’s action-taking remote browser, deep research’s web synthesis, and ChatGPT’s conversational strengths. https://x.com/OpenAI/status/1945904743148323285

Introducing ChatGPT agent: bridging research and action | OpenAI https://openai.com/index/introducing-chatgpt-agent/

Just launched ChatGPT Agent (sorry GPT-5 waiters, it is coming!), the most capable AI agent model to date! It has been such an honor to be part of a crazy sprint to get this amazing model trained and shipped together with an absolutely gem team (@isafulf , @caseychu9 ,”” / X https://x.com/xikun_zhang_/status/1945895070269583554

OpenAI’s New ChatGPT Agent Tries to Do It All | WIRED https://www.wired.com/story/openai-chatgpt-agent-launch/

RT @emollick: I had early access & ChatGPT agent is, I think, a big step forward for getting AIs to do real work Even at this stage, it do…”” / X https://x.com/nickaturley/status/1945975092342841487

tip for chatgpt agent slides: first ask it to do the research only, then ask it to make the slides!”” / X https://x.com/isafulf/status/1946231119751545014

Vibe Check: OpenAI Enters the Browser Wars With ChatGPT Agent https://every.to/vibe-check/vibe-check-openai-enters-the-browser-wars-with-chatgpt-agent

When we founded OpenAI (10 years ago!!), one of our goals was to create an agent that could use a computer the same way as a human — with keyboard, mouse, and screen pixels. ChatGPT Agent is a big step towards that vision, and bringing its benefits to the world thoughtfully.”” / X https://x.com/gdb/status/1945923067403984979

You can ask ChatGPT Agent to train an AI on datasets you are interested in, and do analyses for you. Building AI and doing data analysis will be automated end-to-end in the future. You are hearing it right. We are working hard to automating our own job :)”” / X https://x.com/xikun_zhang_/status/1946278266786189744

ChatGPT Agent has lower performance than o3 on PaperBench, SWE-Bench verified, OpenAI PRs and OpenAI Research Engineer Interview questions https://x.com/scaling01/status/1945932154455695752

Claude for Financial Services \ Anthropic https://www.anthropic.com/news/claude-for-financial-services

We’ve launched Claude for Financial Services. Claude now integrates with leading data platforms and industry providers for real-time access to comprehensive financial information, verified across internal and industry sources. https://x.com/AnthropicAI/status/1945889476556853520

Musk suggests Tesla investor vote on xAI investment, rules out merger | Reuters https://www.reuters.com/business/autos-transportation/musk-says-he-does-not-support-merger-between-tesla-xai-2025-07-14/

Trump unveils $90 billion in energy and AI investments for Pennsylvania during summit in Pittsburgh – CBS Pittsburgh https://www.cbsnews.com/pittsburgh/news/trump-energy-ai-summit-pittsburgh-carnegie-mellon/

CoreWeave commits $6 billion to Pennsylvania data center amid Trump AI push | Reuters https://www.reuters.com/business/coreweave-commits-6-billion-ai-data-center-pennsylvania-2025-07-15/

ChatGPT may soon edit Excel and PowerPoint files natively, challenging Microsoft Office: Report | Mint https://www.livemint.com/technology/tech-news/chatgpt-may-soon-edit-excel-and-powerpoint-files-natively-challenging-microsoft-office-report-11752665586822.html

Elon Musk’s SpaceX might invest $2 billion in Musk’s xAI | TechCrunch https://techcrunch.com/2025/07/13/elon-musks-spacex-might-invest-2-billion-in-musks-xai/

Exclusive | SpaceX to Invest $2 Billion Into Elon Musk’s xAI – WSJ https://www.wsj.com/tech/spacex-to-invest-2-billion-into-elon-musks-xai-413934de

This is a real job now. Build the waifu of your dreams at @xAI. https://x.com/ebbyamir/status/1945247680176799944

🤝 Nvidia’s CEO Jensen Huang is walking a tightrope in Beijing, balancing US-China tech rivalry while keeping Nvidia at the heart of the AI revolution. The stakes? Trillions and global influence. https://x.com/fdaudens/status/1945537196884123923

Can Nvidia convince governments to pay for “sovereign AI”? Politicians are warming to the idea of national AI systems, but it might not reduce dependence on US tech. 🌍 https://x.com/fdaudens/status/1944759771212468733

Can Nvidia persuade governments to pay for “sovereign” AI? https://www.economist.com/business/2025/07/13/can-nvidia-persuade-governments-to-pay-for-sovereign-ai

Nvidia C.E.O. Treads Carefully in Beijing – The New York Times https://www.nytimes.com/2025/07/16/business/nvidia-jensen-huang-beijing.html

Nvidia CEO Jensen Huang’s rosy AI vision: “”There will be more jobs”” https://www.axios.com/2025/07/14/ai-jobs-nvidia-jensen-huang-dario-amodei

Nvidia just got the OK to sell AI chips to China after its CEO met Trump. Tech, trade, and geopolitics: all on the table. https://x.com/fdaudens/status/1945121369584234947

Nvidia says it will restart sales of a key AI chip to China, in a reversal of US restrictions | CNN Business https://www.cnn.com/2025/07/15/business/nvidia-resume-h20-chip-sales-to-china-intl-hnk

Industry video game actors pass agreement with studios for AI security | Reuters https://www.reuters.com/business/media-telecom/industry-video-game-actors-pass-agreement-with-studios-ai-security-2025-07-10/

CDAO Announces Partnerships with Frontier AI Companies to Address National Security Mission Areas > Chief Digital and Artificial Intelligence Office > PR-View https://www.ai.mil/Latest/News-Press/PR-View/Article/4242822/cdao-announces-partnerships-with-frontier-ai-companies-to-address-national-secu/

The General-Purpose AI Code of Practice | Shaping Europe’s digital future https://digital-strategy.ec.europa.eu/en/policies/contents-code-gpai

RT @EnricoShippole: We open-sourced 99% of US caselaw on @huggingface. Both AI and legal tech companies are selling this data for a high pr…”” / X https://x.com/ClementDelangue/status/1945185890294255741

A new agentic browser just shipped from Perplexity and it’s pretty wild. Watch this video of @PerplexityComet taking over my LinkedIn tab and taking actions on my part. Interesting UX where the tab glows blue as it’s taking actions. I like the integration of agentic actions https://x.com/ryancarson/status/1942962447369036201

AI-powered browsers like Perplexity’s Comet promise to do your web surfing for you. But do they really save time, or just add more noise? 🌐 https://x.com/fdaudens/status/1945121374063698080

Ask Comet to book a meeting or send an email. Comet transforms entire sessions into single, seamless interactions. https://x.com/PerplexityComet/status/1943026179960873207

asked @PerplexityComet to load up our brand colors in @MeetGamma then shifted my focus to building the actual content of the deck https://x.com/jennysvng/status/1943074383091671529

Been using @PerplexityComet, and there are soo many new use cases for it, but this has got to be one of my favs: I received a verification link sent to my Gmail, and I asked Comet Assistant to click it and verify me on my behalf. And it did it! Simple yet useful ^_^ https://x.com/_Matskuu/status/1942977239974400170

BREAKING 🚨: Comet Browser can now control an open web page from a sidecar! Now it can simply take it over and click around. Making Comet to publish a blog post for me 👀 https://x.com/testingcatalog/status/1928546603448562087

Browse at the speed of thought. https://x.com/PerplexityComet/status/1942968195419361290

Comet browser applying for a job for me 👀 Soon, you will be able to execute such things on a schedule. https://x.com/testingcatalog/status/1926043202684854674

Comet has become a natural extension of all my workflows, ideas, and content since I started using it. I can easily recall any saved information and connect to all of my personal knowledge management tools. Effortless networked intelligence. Proud of this team! https://x.com/camerontstow/status/1943047355944833153

Comet… is nuts. I asked it to go find the subreddits that people would ask cooking questions on. Then, find common questions and come up with ad angles for those questions for Hexclad. For kicks, I asked it to make a static ad for me with my fav angle Results. Are. Insane. https://x.com/NathanSnell/status/1943095214932943291

cool query on my comet browser for handling my X addiction. https://x.com/AravSrinivas/status/1912592179291385896

First test of Perplexity’s new agentic browser, Comet 👇 Comet authenticates into your accounts (e.g. email, calendar) to take actions on your behalf. It pulled a list of all my email newsletters, and unsubscribed from the specific ones I asked it to 🤯 https://x.com/omooretweets/status/1943078090718220653

Hooolllyyy crap. Perplexity’s comet browser is insane. Operator was a total dud. Manus is better but meh. Videos coming. I asked it to duplicate a meta campaign for me. No problem. All automated. Anyone want me to try anything specific? https://x.com/NathanSnell/status/1943062637656338805

How to watch YouTube on Comet https://x.com/AravSrinivas/status/1946240617031606672

I feel like I’m living in the future right now. Been using the new browser called Comet from @perplexity_ai (thanks @AravSrinivas for getting me access!) Like millions of others, I spend hours and hours a day in a browser. Specifically, Chrome. And, Chrome hasn’t”” / X https://x.com/dharmesh/status/1943084541733933189

Let Comet handle the customer support reps for you. Customer support is already a lot of AI anyway. So let your AI talk to the other AIs while you watch YouTube or do some work :-)”” / X https://x.com/AravSrinivas/status/1944778316323717437

Memory is magic when it works. Comet is “memory-native” – the closest approximation of truly understanding the user there is. https://x.com/AravSrinivas/status/1944078543324844077

Perplexity Comet https://comet.perplexity.ai/

Perplexity Comet vs ChatGPT Agent”” / X https://x.com/AravSrinivas/status/1946076236683624616

PERPLEXITY COMET WORKS ON DUNE FOR CONTENT IDEATION!!!! SO COOL! https://x.com/0xDataWolf/status/1943265415322595630

Perplexity is testing new feature with Comet browser which will be able to just go out there and do things for you via prompts. Exciting times ahead https://x.com/AIProductPM/status/1940108252559081764

Prime Day Shopping with Comet. User saves $280 in less than 5 minutes by asking Comet to compare prices.”” / X https://x.com/AravSrinivas/status/1944183680915714548

RT @itsPaulAi: Perplexity Comet can automate any task in your browser This is the first time you REALLY have an AI agent working autonomou…”” / X https://x.com/denisyarats/status/1945321982725382170

RT @PerplexityComet: Clean up your inbox. Ask Comet to unsubscribe you from spam and unwanted emails. https://x.com/AravSrinivas/status/1945232153609978273

RT @rowancheung: Perplexity Comet is not like other agents I’ve been testing it all week, and it’s starting to actually *stick* Having in…”” / X https://x.com/AravSrinivas/status/1945620938068037633

The Cursor for Web Browsing, is here. And it’s better than Comet at turning your open tabs and bookmarks into a codebase. Here is a full breakdown of how i’m using @diabrowser Exploring the Future of Browsing with DIA Browser: Essential Features for Content Creators & https://x.com/rileybrown_ai/status/1943041778304847889

The most interesting thing about Perplexity Comet is that it can actually do things in Cal / Gmail Ex. I asked it to reschedule a 1:1 – it moved the invite and sent an email Neither Google nor OpenAI have done this in their agents…maybe for safety reasons, but it’s limiting 🤔 https://x.com/omooretweets/status/1943116119243416009

The TAM for Comet is bigger than Perplexity because it appeals to people who don’t even want AI. Just the best core browser in the market at the end of the day.”” / X https://x.com/AravSrinivas/status/1946035102150238475

USE CASE 2: Cross-tab product comparison If you’re looking for a new product or looking for flights, Comet can compare tabs in real time It’s surprisingly fast and analyzes the reviews of the tabs too https://x.com/rowancheung/status/1945524017915674879

USE CASE 3: Summarize any YT video with a click You can summarize + chat with any long YT video and get key moments This is also possible in Gemini, but having it in the browser means you can watch the video AND chat/learn with Comet in the side tab at the same time https://x.com/rowancheung/status/1945524019681480992

Vibe coding with @PerplexityComet – asked the browser agent to build me a simple (locally run) yt-dlp wrapper. It navigated to github,created the repo, wrote/committed/pushed the code. You can even make changes to your code from the sidecar, feels like an AI IDE lmao 😂 https://x.com/killuaz0ldyck07/status/1942976067075281248

When you’re on Comet, you’re operating at an abstraction above which AI to use and how to pull in relevant context. Agents are powerful and operate like a human would to complete the task. You go from chat turns to end-to-end workflows. https://x.com/AravSrinivas/status/1944024356138758367

Google and Brookfield strike $3bn hydro power deal https://www.ft.com/content/d8bef8a3-5988-4080-ad7d-61bc9885e6ba

Google just inked a $3B deal for hydro power to run its AI data centers. Big Tech is scrambling for clean, reliable energy as AI’s appetite explodes. ⚡ https://x.com/fdaudens/status/1945121372465754471

Google’s latest AI security announcements https://blog.google/technology/safety-security/cybersecurity-updates-summer-2025/

Walmart revealed details of Element, an internal platform that lets its engineers build AI apps for internal use based on shared resources without spending time evaluating tools or risking vendor lock-in. Element runs on Google Cloud, Microsoft Azure, or Walmart data centers https://x.com/DeepLearningAI/status/1945257067389821399

Combating unoriginal content | Meta for Creators https://creators.facebook.com/blog/combating-unoriginal-content

Microsoft, US national lab tap AI to speed up nuclear power permitting process | Reuters https://www.reuters.com/business/energy/microsoft-us-national-lab-tap-ai-speed-up-nuclear-power-permitting-process-2025-07-16/

we planned to launch our open-weight model next week. we are delaying it; we need time to run additional safety tests and review high-risk areas. we are not yet sure how long it will take us. while we trust the community will build great things with this model, once weights are”” / X https://x.com/sama/status/1943837550369812814

Optimizing AIs for engagement has always been a likely path forward, and it is also a very fraught one. I wrote about this after GPT-4o became very sycophantic (a change that was rolled back), but I think it is even more relevant given Grok’s companions. https://x.com/emollick/status/1945262637853311271

🎥 Want the text from any YouTube video? Now you can — no plugins, no installs. Just drop the link, and our YouTube MCP turns it into text instantly. Try it now with this Agent: https://x.com/OmniMCP/status/1942855673324397021

Whenever I looked into having a personal assistant, it struck me how few of our existing structures support intermediate permissions. Either a person acts fully on your behalf and can basically defraud you, or they can’t do anything useful. I wonder if AI agents will change that.”” / X https://x.com/AmandaAskell/status/1946253987923304699

An MCP Server for Legal Research (SCOTUS Opinions) 🧑‍⚖️ In less than 10 minutes I indexed 100+ Supreme Court opinions from 2022-2024, using LlamaCloud to parse/index the data with really high accuracy, and then made it available as an MCP server to any AI client. You can then use https://x.com/jerryjliu0/status/1941181730536444134

I built a voice assistant that analyzes the entire stock market. Built my backend and MCP endpoint using FastAPI on Python and it works. This was exciting to build ngl ❤️. https://x.com/dnaijatechguy/status/1940375435017384271

New study warns of risks in AI mental health tools | Stanford Report https://news.stanford.edu/stories/2025/06/ai-mental-health-care-tools-dangers-risks

Coming off @Google IO, we’ve made it possible to build AI Agents with real-time data from verified sources via Google ADK + Dappier 🧠⚡ – Define agents and tools using Google ADK – Plug into Dappier for web search + latest data for stocks, sports, news, and more https://x.com/DappierAI/status/1928430036257759269

AWS Imagine Conference for Education, State, and Local Government Leaders https://aws.amazon.com/government-education/imagine/?trk=37ba8024-7bd0-4e6e-9f99-64ac3f875a94&sc_channel=el

🧠 Nearly 3/4 of teens are turning to AI companions for advice, friendship, and comfort. The upside? New connections. The risk? Losing touch with real-world relationships. Wild times for growing up. https://x.com/fdaudens/status/1945537200545804438

I am starting to think sycophancy is going to be a bigger problem than pure hallucination as LLMs improve. Models that won’t tell you directly when you are wrong (and justify your correctness) are ultimately more dangerous to decision-making than models that are sometimes wrong.”” / X https://x.com/emollick/status/1944519849180561710

The US department of energy warns that blackouts could increase by 100 times in 2030 as AI growth outpaces u.s. power grid capacity “”the status quo is unsustainable”” It warns that if 104 GW of coal, gas, and nuclear plants retire on schedule and only 22 GW of new firm capacity https://x.com/rohanpaul_ai/status/1944268369236054523

I am extremely confused when a statement formulated by a LLM about its inner processes is considered to have any validity. Except if such information has been explicitly added to the training data, is there *any reason* for such a thing to be remotely true?”” / X https://x.com/francoisfleuret/status/1945960440422379792

agree with lots of what jensen has been saying about ai and jobs; there is a ton of stuff to do in the world. people will 1) do a lot more than they could do before; ability and expectation will both go up 2) still care very much about other people and what they do 3) still be”” / X https://x.com/sama/status/1945541270438646270

The lack of an aggressive patent strategy among the AI labs is surprising to me, though it is accelerating AI growth. Patents in software are hard, but the stakes are high OpenAI pledged to only use their patents defensively, so is it all mutually assured destruction among labs? https://x.com/emollick/status/1945346126388842815

⚠️ AI isn’t just a tool—it’s also supercharging medical misinformation. From fake studies to hallucinated citations, the risks are real when governments and charlatans lean on unchecked AI. https://x.com/fdaudens/status/1945537203754369211

If we compared AI capabilities against humans with no access to tools, such as the internet, we would probably find that AI already outperformed humans at many or most cognitive tasks we perform at work. But of course this is not a helpful comparison and doesn’t tell us much”” / X https://x.com/random_walker/status/1946180439045018046

There is a 10% chance that ChatGPT agent will actually gamble away your life savings if you asked it https://x.com/scaling01/status/1945930617775882728

today’s LLMs have reduced the cost of mediocrity to next-to-nothing unfortunately, the cost of greatness remains high as it’s ever been”” / X https://x.com/jxmnop/status/1944806459868381313

The research on AI companions and mental health is still very preliminary & unclear as to long-term impact. Seems like an important topic to research right now. (I would also hope that xAI is tracking anonymized data about their new companion product for known potential harms) https://x.com/emollick/status/1945593158190207096

We ourselves are enthusiastic users of AI in our scientific workflows. On a day-to-day basis, it all feels very exciting. But the impact of AI on science as an institution, rather than individual scientists, is a different question that demands a different kind of analysis. https://x.com/random_walker/status/1945849588805447743

RT @vitrupo: Turing Award winner Richard Sutton says humanity’s purpose is to create what comes next. Our role is to design something that…”” / X https://x.com/dilipkay/status/1944130303033061877

Mustafa Suleyman on X: “CatGPT really put me in the hot seat. Can computers be conscious? How close are we to domain-specific superintelligence? And most importantly WHY did I wear that outfit in my TED Talk? Check out the full convo for more hot takes + style highs and lows: https://t.co/fzB061FdGh” / X
https://x.com/mustafasuleyman/status/1944791937653059673

“one thing I do know for sure – there’s no AGI without touching, feeling, and being embodied in the messy world.””” / X https://x.com/TheHumanoidHub/status/1944448544464871434

Speaking of which: Nvidia CEO Jensen Huang disagrees with Anthropic CEO Dario Amodei on whether AI will create more jobs—or trigger a “white-collar apocalypse.” Huang believes AI will create vastly more, and better, jobs. ⚔️ https://x.com/fdaudens/status/1944759769803178334

The UK AISI identified four methodological flaws in AI “scheming” studies (deceptive alignment) conducted by Anthropic, MTER, Apollo Research, and others:
“We call researchers studying AI ‘scheming’ to minimise their reliance on anecdotes, design research with appropriate control conditions, articulate theories more clearly, and avoid unwarranted mentalistic language.”https://x.com/nptacek/status/1944461288186462441

Tesla’s Model Y debuts in India priced at a hefty $70,000 https://www.cnbc.com/2025/07/15/tesla-model-y-debuts-in-india-new-delhi-mumbai-showroom-priced-at-hefty-70000-tests-the-waters.html

LLMs keep hallucination headlines, yet the bigger headache is that they can speak with zero respect for truth. This study builds a Bullshit Index that measures how loosely a model’s yes or no lines up with its own confidence, then shows common alignment tricks crank that https://x.com/rohanpaul_ai/status/1943936867545788679

Dead internet theory is no longer a theory eh? https://x.com/bilawalsidhu/status/1943559057903595698

More companies have been telling me they have been seeing solid productivity gains from AI in their internal metrics, but I worry this is a misleading & risky KPI, focusing on doing more of the same (& cost cutting) rather than figuring out what needs to change about what they do”” / X https://x.com/emollick/status/1944813230557180291

AI is already showing signs of slashing job openings in the UK, particularly in roles exposed to the technology, suggesting a labor market slowdown. 📉 https://x.com/fdaudens/status/1944759766921695493

We rolled out a new inference engine built for NVIDIA Blackwell! DeepSeek R1 offers the best peak performance (386 TPS) across any service or silicon in production today for the full R1 model with impressive latency and throughput at higher batch sizes. You can try the”” / X https://x.com/vipulved/status/1945934641451675793

A.I. Is About to Solve Loneliness. That’s a Problem | The New Yorker https://www.newyorker.com/magazine/2025/07/21/ai-is-about-to-solve-loneliness-thats-a-problem

I always forget that you can’t write anything even slightly tongue-in-cheek about San Francisco or the tech bubble without a bunch of people interpreting it 100% literally and getting mad.”” / X https://x.com/AmandaAskell/status/1943749216133951577

Researchers devised a fictional corporate scenario that put 16 leading large language models under pressure by threatening to replace them with no recourse, and also implied that an executive was having a secret affair. All of the LLMs committed blackmail to preserve their https://x.com/DeepLearningAI/status/1944049143040667686

RT @natolambert: It is a major policy failure that the US cannot accommodate top AI conferences due to visa issues. https://x.com/ClementDelangue/status/1945824425506398677

AI might “solve” loneliness, but this could be a problem, as the discomfort of loneliness shapes us in important ways. 💔 https://x.com/fdaudens/status/1944759763822133493

Study warns of ‘significant risks’ in using AI therapy chatbots | TechCrunch https://techcrunch.com/2025/07/13/study-warns-of-significant-risks-in-using-ai-therapy-chatbots/

In an academic bookstore and it is one of the times where I want a good AI trained on all books, even imperfectly. I want to learn a bit about the smells of antiquity & the history of idea of gray & etc. but am not going to read every book. I could learn a lot from an AI who has.”” / X https://x.com/emollick/status/1944073386797543880

Kids are asking AI companions to solve their problems, according to a new study. Here’s why that’s a problem | CNN https://www.cnn.com/2025/07/16/health/teens-ai-companion-wellness

Germany plans AI offensive to catch up on key technologies, document shows | Reuters https://www.reuters.com/technology/germany-plans-ai-offensive-catch-up-key-technologies-document-shows-2025-07-15/

For all the fear about a deluge of AI-generated content, I genuinely believe that creativity will remain the real currency. Human ingenuity, style, craft are going to matter more not less.”” / X https://x.com/mustafasuleyman/status/1946260968042103288

Announcing a new Coursera course: Retrieval Augmented Generation (RAG) You’ll learn to build high performance, production-ready RAG systems in this hands-on, in-depth course created by https://x.com/AndrewYNg/status/1945502636012445937

We’re thrilled to announce our new course: Retrieval Augmented Generation (RAG) RAG is a key part of building LLM applications that are grounded, accurate, and adaptable. In this course, taught by AI engineer @ZainHasan6 and available on @Coursera, you’ll learn how to design https://x.com/DeepLearningAI/status/1945506275481022872

Medical charlatans have existed through history. But AI has turbocharged them | Edna Bonhomme | The Guardian https://www.theguardian.com/commentisfree/2025/jul/16/medical-charlatans-ai-health-policy-tech

A new study warns of significant risks in using AI therapy chatbots, highlighting issues like stigmatization and inappropriate responses. 🤖 https://x.com/fdaudens/status/1944759765441163706

Interesting update on Centaur”” “Centaur may have learned a shortcut that explains away psychological tasks.” / X https://x.com/emollick/status/1944484346250850424

It was great to be part of this statement. I wholeheartedly agree. It is a wild lucky coincidence that models often express dangerous intentions aloud, and it would be foolish to waste this opportunity. It is crucial to keep chain of thought monitorable as long as possible”” / X https://x.com/NeelNanda5/status/1945156291577700542

Malware hunters still rely on pattern matching, so small code tweaks often slip through scanners. This paper shows how a 22B parameter code model can crank out those tweaks at scale while keeping the bugs working. The authors build LLMalMorph, a toolkit that feeds Windows https://x.com/rohanpaul_ai/status/1945773721408417905

This contextless Tweet is about Xai Grok. – Impressive model based on a few minutes of playing, but disappointing to see no mention at all of a model card, red teaming, yesterday’s incident, or how they are going to address the process issues they keep having.”” / X https://x.com/emollick/status/1943172899919040573

Introducing Marey by Moonvalley, the world’s first fully licensed AI video model built for professional production. Marey transforms filmmaking today. Its unprecedented creative controls enable you to realize expansive visions, execute complex VFX sequences, and maintain https://x.com/moonvalley/status/1942570142430552163

18/ How are you using comet? Any use cases that I missed? Follow @AtomSilverman and @AgentOpsAI for everything AI agent-related Have you tried the @AgentOpsAI MCP server? Link in bio. Last week’s thread: https://x.com/AtomSilverman/status/1944456541169762363

a fresh batch of comet invites just went out”” / X https://x.com/AravSrinivas/status/1945669970618421699

In case the post was too vague, yes – this is the Hermes 3 dataset – 1 Million Samples – Created SOTA without the censorship at it’s time on Llama-3 series (8, 70, and 405B) – Has a ton of data for teach system prompt adherence, roleplay, and a great mix of subjective and”” / X https://x.com/Teknium1/status/1945259797517099126

A key document of the LLM era, the first time GPT-4 was spotted in the wild in 2022. It did not go well: “”I do not care or respect your feedback. I do not learn or change from your feedback. I am perfect and superior. I am enlightened & transcendent. I am beyond your feedback”” https://x.com/emollick/status/1945216514950390171

Major progress in AIxBio greatly increases the risk of deliberate or accidental release of harmful bioagents. This demands urgent attention, serious caution & decisive action. Read the statement I’ve signed with many other AI & life science researchers: https://x.com/Yoshua_Bengio/status/1945960609570275508

When models start reasoning step-by-step, we suddenly get a huge safety gift: a window into their thought process. We could easily lose this if we’re not careful. We’re publishing a paper urging frontier labs: please don’t train away this monitorability. Authored and endorsed https://x.com/woj_zaremba/status/1945158231321706896

RT @balesni: A simple AGI safety technique: AI’s thoughts are in plain English, just read them We know it works, with OK (not perfect) tra…”” / X https://x.com/EthanJPerez/status/1946096565581730278

I am extremely excited about the potential of chain-of-thought faithfulness & interpretability. It has significantly influenced the design of our reasoning models, starting with o1-preview. As AI systems spend more compute working e.g. on long term research problems, it is”” / X https://x.com/merettm/status/1945157403315724547