Publishing: AI News Week Ending 09/05/2025

September 5, 2025

Image created with Flux Pro v1.1 Ultra. Image prompt: Publishing, stack of hardcovers with dust jackets in subtle small-banana patterns, tidy bookmark, photorealistic, editorial, minimal, high detail, 3:2 landscape

UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning https://x.com/_akhaliq/status/1963229296236937443

Can AI agents reliably navigate the web? Does the choice of agent scaffold affect web browsing ability? To answer these questions, we added Online Mind2Web, a web browsing benchmark, to the Holistic Agent Leaderboard (HAL). We evaluated 9 models (including GPT-5 and Sonnet 4) https://x.com/sayashk/status/1963343022252315112

We can finally share UI-TARS-2🥳🥳 — a native GUI agent trained with multi-turn agent RL ⚡️⚡️Key highlights (all-in-one model!): 💻Computer Use: 47.5 OSWorld · 50.6 WindowsAgentArena 📱Phone Use: 73.3 AndroidWorld 🛜Browser Use: 88.2% Online-Mind2Web 🎮Gameplay: ~60% human https://x.com/TsingYoga/status/1963629621326614940

We’ve developed Claude for Chrome, where Claude works directly in your browser and takes actions on your behalf. We’re releasing it at first as a research preview to 1,000 users, so we can gather real-world insights on how it’s used. https://x.com/AnthropicAI/status/1960417002469908903

🚨 Apple just released FastVLM on Hugging Face – 0.5, 1.5 and 7B real-time VLMs with WebGPU support 🤯 > 85x faster and 3.4x smaller than comparable sized VLMs > 7.9x faster TTFT for larger models > designed to output fewer output tokens and reduce encoding time for high https://x.com/reach_vb/status/1961471154197053769

And FastVLM was released by Apple today! 🚀 All about on-device use. Model sizes: 0.5B, 1.5B, 7B. Available in MLX and Core ML. Vision encoder designed to output fewer tokens and reduce encoding time. Which means much faster time-to-first-token. https://x.com/pcuenq/status/1961464859465269757

Holy crap! That is some fast video captioning — all happening locally in your browser 🤯 This is the aptly named FastVLM by Apple; available on HF: https://x.com/bilawalsidhu/status/1962545148136444380

NEW: Apple releases FastVLM and MobileCLIP2 on Hugging Face! 🤗 The models are up to 85x faster and 3.4x smaller than previous work, enabling real-time VLM applications! 🤯 It can even do live video captioning 100% locally in your browser (zero install). Huge for accessibility! https://x.com/xenovacom/status/1961454543503344036

Enterprise AI, Built Your Way | You.com https://you.com/home

We’re officially a YOUnicorn! Excited to share that @youdotcom just raised $100M Series C at a $1.5B valuation, led by @CoxEnterprises We’ve been heads down building the search infrastructure for the AI and agent future. Soon there will be more AI agents using the web than humans, but today’s search wasn’t built for this. Agents need deep, contextual information from both public web and internal private data to make real decisions. Our web search API delivers the most up-to-date, accurate, and fastest search results for LLMs and agents. Real benchmarks show we consistently outperform the competition on accuracy and speed while staying cost-effective. https://x.com/RichardSocher/status/1963277700711461241

Google’s on a roll. That’s a lot of performance for that tiny size! I just embedded 1.4 million documents in ~80 mins on my M2 Max for free. Would’ve been ~$200 with the text-embedding-3-large, with worse quality. https://x.com/rishdotblog/status/1963805087014502497

We raised $85M in Series B funding at a $700M valuation, led by Benchmark. Exa is a research lab building the search engine for AI. https://x.com/ExaAILabs/status/1963262700123000947

Did you know you can build a Browser Agent that can navigate Chromium with Gemini 2.5 Flash and @browser_use in under 10 lines of code? https://x.com/_philschmid/status/1963233076034650481

Judge rules in Google’s illegal search monopoly case: it can keep Chrome | The Verge https://www.theverge.com/policy/717087/google-search-remedies-ruling-chrome

Autonomous News Agent A LangGraph-powered AI agent that autonomously curates news briefings, extracts facts, and summarizes content with integrated human feedback and dynamic tool selection. https://x.com/LangChainAI/status/1962213801249710230

Exa is the market-leading search engine for AI.
https://x.com/ExaAILabs/status/1963262700123000947

NotebookLM is rolling out NEW audio overview formats:
(Default) Deep Dive: a thorough examination of your sources
Brief: 1-2 minute, bite-sized overviews
Critique: an expert review, offering constructive feedback on your material
Debate: a thoughtful debate between two hosts https://x.com/NotebookLM/status/1962949985546187120

Comet is coming soon to mobile and is now available for pre-orders on Android Play Store https://x.com/AravSrinivas/status/1963620578344276366

Another major Perplexity iOS app update. Team cooked. Answers are now streamed smooth as butter. Tables, markdown, intermediate steps. Update and enjoy! https://x.com/AravSrinivas/status/1963758210281882029

Pro users in South Korea, Brazil, and Spain can now download Comet. https://x.com/perplexity_ai/status/1963638853975040456

🚀 Select PayPal and @Venmo customers can skip the waitlist for early access to @perplexity_ai’s AI-powered Comet browser and receive a free 12-month Perplexity Pro trial. This offer is part of the new PayPal Subscriptions Hub, where you can: ✨ Manage subscriptions ✨ Update https://x.com/PayPal/status/1963229273071698199

We are rolling out Comet to all students worldwide. Ask Comet to manage your schedule, order textbooks, or prepare for exams with Study Mode. https://x.com/perplexity_ai/status/1963285255198314951

Framer Raises $100 Million Series D at a $2 Billion Valuation to Redefine How Businesses Build Websites https://www.businesswire.com/news/home/20250828901842/en/Framer-Raises-%24100-Million-Series-D-at-a-%242-Billion-Valuation-to-Redefine-How-Businesses-Build-Websites

Finally…an AI video editor that just works!! Edit any videos or cut the best moments directly from YouTube link from just a simple English prompt. This is insane! https://x.com/Saboo_Shubham_/status/1962891766232739919

A really useful prompt for writing: “review this for accuracy, look up any facts you may want to challenge or explore.” Even if not perfect, it is a good sanity check. Works well with Claude 4.1, GPT-5 Thinking, and Grok 4. Weirdly, Gemini 2.5 Pro often won’t do web searches. https://x.com/emollick/status/1961257429846691881

This is disappointing. Purposefully underselling what models can do is a really bad idea. It is possible to point out that AI is flawed without saying it can’t do math or count – it just isn’t true. People need to be realistic about capabilities of models to make good decisions. https://x.com/emollick/status/1963287621377167732

This chart is being horribly misinterpreted. This is not where the training data of AI comes from, it is a study done by a SEO firm that claims to show how often sites come up at least once in THE WEB SEARCH FUNCTION of certain AI agents when they do a web search for more info. https://x.com/emollick/status/1962678752887914918

Transforming human knowledge, sensors and actuators from human-first and human-legible to LLM-first and LLM-legible is a beautiful space with so much potential and so much can be done… One example I’m obsessed with recently – for every textbook pdf/epub, there is a perfect https://x.com/karpathy/status/1961128638725923119

Finally, MCP servers can now deliver UI-rich experiences!
MCP servers in Claude/Cursor don’t offer any UI experience yet, like charts. It’s just text/JSON.
mcp-ui lets you add interactive web components to its output that can be rendered by the MCP client. https://x.com/_avichawla/status/1961677831861395495

🌐 Our first open model has landed on the Search leaderboard! Diffbot-small-xl by @diffbot debuts at #9 (Apache 2.0) We look forward to more models with search capabilities contributing to ecosystem progress! https://x.com/lmarena_ai/status/1961526740754616545

Just posted my Harvard GSD talk on the future of creative AI – where we’re going and how to harness it: https://x.com/bilawalsidhu/status/1961481959110230129

“ROSE: Remove Objects with Side Effects in Videos” TL;DR: Diffusion transformer; five common cases: shadows, reflections, light, translucency and mirror as video side effects to remove https://x.com/Almorgand/status/1962846321372471755

More than a million people got Comet access this morning. The most widely deployed personal and agentic product in the world right now. https://x.com/AravSrinivas/status/1963633205351010795

Entire startups have raised more venture capital on the backs of Adobe video edits than actual products. Insane if you think about it. After Effects might be the most valuable VC fundraising tool ever invented. https://x.com/bilawalsidhu/status/1962915517326332086

AIcos: At long last, we have built almost literally exactly the AI That Tells Humans What They Want To Hear, from Isaac Asimov’s classic 1941 short story, “Don’t Build AI That Tells Humans What They Want To Hear” https://x.com/ESYudkowsky/status/1962574434231062861

Welcoming The Browser Company to Atlassian – Work Life by Atlassian https://www.atlassian.com/blog/announcements/atlassian-acquires-the-browser-company

Trusted news sites may benefit in an internet full of AI-generated fakes, a new study finds | Nieman Journalism Lab https://www.niemanlab.org/2025/08/trusted-news-sites-may-benefit-in-an-internet-full-of-ai-generated-fakes-a-new-study-finds/

The utter disrespect for CGI & VFX continues to baffle me. Do these Hollywood heavyweights make these remarks because it’s an easy PR win? Or do they genuinely think modern cinema and TV would be anywhere close to where it is without CGI assisted storytelling? https://x.com/bilawalsidhu/status/1962583444158062641

Netflix House Philadelphia Opens Nov. 12; the Dallas Location Arrives Dec. 11; How to Buy Tickets – Netflix Tudum https://www.netflix.com/tudum/articles/netflix-house

🚀Introducing Wan2.2-S2V — a 14B parameter model designed for film-grade, audio-driven human animation. 🎬Going beyond basic talking heads to deliver professional-level quality for film, TV, and digital content. And it’s open-source! ✨ Key features: 🔹 Long-video dynamic https://x.com/Alibaba_Wan/status/1960350593660367303