About This Week’s Covers
This week’s cover image is a photo I took of Google Headquarters in New York City when I went to meet with Google and the Local Media Consortium board of directors. I gave the photo to Google Gemini 2.5 and asked it to replace Google’s sign with my newsletter edition title. I then dropped it on top of the original image in Photoshop and masked it by hand to make it look like the trees covered the marquee.

The rest of the images were created with a Python script that lets me riff about a theme and then generates category cover images. Keeping with the NYC theme, I was inspired by John Levitt, the head of Elvex, who posted a single image of a Times Square takeover, and I wanted to see how well my Python script would work.
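For the curious, here’s a minimal sketch of what a script like that might look like. It is not my actual script: it assumes the OpenAI Python SDK and its image endpoint, and the theme, categories, and prompt template are just placeholders.

```python
# Hypothetical sketch of a themed cover-image generator (not the actual script).
# Assumes the OpenAI Python SDK and its image endpoint; swap in whichever API you use.
import base64
import pathlib
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

THEME = "New York City, a Times Square takeover"
CATEGORIES = ["Agents and Copilots", "Robotics", "Open Source", "Video"]

for category in CATEGORIES:
    prompt = (
        f"Newsletter cover art for a category called '{category}', "
        f"set in {THEME}, avoiding the stereotypical warm AI glow."
    )
    result = client.images.generate(model="gpt-image-1", prompt=prompt, size="1536x1024")
    # gpt-image-1 returns base64 image data; write one cover per category.
    pathlib.Path(f"{category}.png").write_bytes(base64.b64decode(result.data[0].b64_json))
```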
I would grade my results a C+. The batch came out looking too stereotypically AI, falling back on the warm, glowing colors, etc. However, some of the concepts, like DeepSeek (with little submarine cars) and ByteDance (a TikTok flash mob), were creative. I may try Google Gemini’s image-editing option next time and give it a real picture of Times Square to see how it performs as a counterpoint.
But I’m putting my top six favorites below:






This Week By The Numbers
Total Organized Headlines: 657
- AGI: 9 stories
- AI Inn of Court: 4 stories
- Accounting and Finance: 8 stories
- Agents and Copilots: 198 stories
- Alibaba: 31 stories
- Alignment: 32 stories
- Amazon: 3 stories
- Anthropic: 103 stories
- Apple: 4 stories
- Audio: 81 stories
- Augmented Reality (AR/VR): 72 stories
- Autonomous Vehicles: 3 stories
- Benchmarks: 33 stories
- Business and Enterprise: 77 stories
- Chips and Hardware: 33 stories
- DeepSeek: 19 stories
- Education: 11 stories
- Ethics/Legal/Security: 140 stories
- Figure: 4 stories
- Google: 70 stories
- HuggingFace: 13 stories
- Images: 82 stories
- Inflection: 1 story
- International: 78 stories
- Internet: 48 stories
- Law: 3 stories
- Llama: 3 stories
- Locally Run: 8 stories
- Meta: 12 stories
- Microsoft: 11 stories
- Mistral: 1 story
- Mobile: 1 story
- Moonshot: 2 stories
- Multimodal: 26 stories
- NVIDIA: 11 stories
- Open Source: 81 stories
- OpenAI: 110 stories
- Perplexity: 11 stories
- Podcasts/YouTube: 8 stories
- Publishing: 111 stories
- Qwen: 29 stories
- Robotics Embodiment: 57 stories
- Science and Medicine: 14 stories
- Security: 36 stories
- Technical and Dev: 170 stories
- Video: 103 stories
- X: 8 stories
This Week’s Executive Summaries
This week I organized 657 headlines with 65 executive summaries. The density of news each week makes even the executive summary almost impossible to read. I’ve always done this as an exercise for myself to learn, so I’m continuing to be thorough even though I understand it’s a lot to ask.
I’ve also noticed that my automated summaries (found at the bottom) are often better than my handwritten ones, for pure surgical efficiency. They lack connecting observations and trends, though, which are important.
I could switch today and have Claude handle all of my summaries using the Python scripts I’ve written. However, it’s important for me to process the information by writing these executive summaries by hand, and I’m going to continue to do them.
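For context, here’s a minimal sketch of what handing the summaries to Claude looks like in practice. The model ID and prompt are placeholders, and this is not my production script.

```python
# Hypothetical sketch: batch-summarizing organized headlines with the Anthropic SDK.
# The model ID and the prompt are placeholders; the real pipeline differs.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

headlines = [
    "OpenAI introduces Instant Checkout in ChatGPT",
    "Kling 2.5 Turbo takes the top spot in the video arena",
]

message = client.messages.create(
    model="claude-sonnet-4-5",  # placeholder model ID
    max_tokens=500,
    messages=[{
        "role": "user",
        "content": "Write a two-sentence executive summary for each headline:\n"
                   + "\n".join(f"- {h}" for h in headlines),
    }],
)
print(message.content[0].text)
```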
For the reader, the trend analysis and predictions hopefully give my writing an edge. But it’s humbling to see a computer do hours of mental work in two minutes.
Here’s what we’re going to look at this week!
We start with Agent News, with updates from OpenAI, Amazon, Google, Apple, Anthropic, and Microsoft.
Then we’ll move to Video News with updates from OpenAI, Kling, and Google.
Then we’ll move to the Future of the Internet and the Transformation from Browsing to Agent-Based Exploration. There are announcements from Google and Perplexity as well as Cloudflare.
Next comes Ethics, Training, and Alignment News (yes, there’s a lot to cover) with an interview with Richard Sutton, the father of reinforcement learning, and a response from Andrej Karpathy, one of the most intelligent modern leaders in AI.
There are also a few ethics and alignment updates from OpenAI, Anthropic, and Google, as well as Regulation News out of California.
We then shift to Business News with updates from Anthropic, a Wall Street finance hackathon, Meta, and OpenAI.
We hit Chips and Software with news from OpenAI and Microsoft.
There’s one Education story where Google is transforming the way textbooks work.
Next comes Imagery News with updates from Black Forest Labs and Google.
Take a break and come back fresh, because there’s an array of Model Updates from the frontier companies. We’ll see releases from Anthropic, Apple, ZhipuAI, and Qwen.
In Open Source News, NVIDIA has taken first place among American open-source models.
We jump over to Robotics with updates from NVIDIA, Amazon, Figure, Google, Meta, and Unitree.
We talk about a new company, Thinking Machines Labs, which announced their first product, Tinker.
We close out the news of the week with a really cool example of object segmentation.
As usual, I close it out with a humanities reading to remind us what it’s like to be human.
This week’s Humanity Reading is “The Archaic Torso of Apollo” by Rainer Maria Rilke, who is one of the favorite poets of my good friend Alexis. Alexis is an incredible quilter. She creates museum-quality works of art.
“The Archaic Torso of Apollo” pushes the reader to change from within, and I’m using the torso as a metaphor for legacy technology that is about to transform.
I hope you enjoy the newsletter. Even if it takes a while to read, I put a lot of effort into trying to filter it so you didn’t have to boil the entire ocean.
Agent News
OpenAI Announced Instant Checkout
The top headline this week is that OpenAI has introduced “instant checkout”. Starting this week, you can buy products directly from within a ChatGPT conversation.
OpenAI has started by partnering with Etsy, and they plan to expand to “over a million Shopify merchants”.
OpenAI also released an open protocol, the Agentic Commerce Protocol, built in partnership with Stripe, which lets developers build integrations and apply to connect any store’s backend to ChatGPT for instant checkout.
Google Launched Payments in Gemini
A reminder that just two weeks ago, Google launched the “Agent Payments Protocol (AP2), an open standard that lets LLM-based agents initiate, authorize, and settle online purchases whether using credit cards, bank transfers, digital wallets, and cryptocurrency”. https://cloud.google.com/blog/products/ai-machine-learning/announcing-agents-to-payments-ap2-protocol
OpenAI Pulse
Last week, OpenAI released a feature called Pulse, which pushes ideas and topics for discussion to you, as opposed to waiting for you to start a chat.
This week, Nathan on Twitter observed he had a conversation with ChatGPT that was an average dialog, but when he logged in the next morning, ChatGPT Pulse had fleshed it out into a significantly better answer to the same question he had been discussing previously.
I’ve noticed this as well with Pulse, and I think it’s going to be more powerful than people realize.
Bilawal observed that as interfaces are becoming more conversational, perhaps Pulse updates might become audio, and then video, and soon we’re interacting with it in real time, like a science fiction personal assistant. Even OpenAI posted they believe AI should do more than just answer your questions, “it should anticipate your needs and help you reach your goals.”
Amazon Releases New Alexa
Along those lines, Amazon launched its new version of Alexa, which includes an LLM integration. Amazon says this is a smarter, more conversational version. However, I prefer the original.
The new version leaves the microphone on shortly after you interact, and Alexa will pick up offhand comments and keep talking. The nature of how I use an Echo is to be transactional, not conversational, and the fine-tuning of whatever model Amazon is using is significantly more chatty than I would like.
My biggest beef is that my Alexa doesn’t even know when there’s a timer or alarm set. I say to cancel the timer, and it says there’s no timer set… even though it literally shows the countdown happening. The same thing happens with alarms: if you set an alarm for the morning and want to cancel it, the only way is to actually unplug the device.
Google Introduces Bonkers New Agent That Evolves
Google introduced a new system called AlphaEvolve that can automatically generate, test, and improve its own algorithms through an evolutionary process. The system has already produced new algorithms that are more efficient than known ones—for example, a breakthrough method for multiplying 4×4 complex matrices that beats a decades-old benchmark. I have to admit, I have no idea what a 4×4 complex matrix is, but we can all go Google that ourselves. Additionally, AlphaEvolve has generated provably correct solutions for a wide range of mathematical and computational problems and has started working on optimizing real-world systems like data center scheduling and hardware circuit design.
This idea of AI being able to discover new algorithms by itself is a lot different than the old put-down of calling AI a stochastic parrot. https://research.google/blog/ai-as-a-research-partner-advancing-theoretical-computer-science-with-alphaevolve/
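To make the evolutionary idea concrete, here’s a toy sketch of the loop systems like this are built around: propose candidates, score them with a fitness function, keep the best, and mutate them. This is a generic illustration, not DeepMind’s implementation; AlphaEvolve mutates actual code with an LLM, while this toy just nudges numbers.

```python
# Toy evolutionary search: evolve polynomial coefficients toward a hidden target.
# AlphaEvolve evolves programs with LLM-proposed edits; this sketch uses random tweaks.
import random

def fitness(candidate):
    # Negative squared error against a target polynomial on sample points.
    target = lambda x: 3 * x**2 + 2 * x + 1
    guess = lambda x: candidate[0] * x**2 + candidate[1] * x + candidate[2]
    return -sum((target(x) - guess(x)) ** 2 for x in range(-5, 6))

population = [[random.uniform(-5, 5) for _ in range(3)] for _ in range(20)]
for generation in range(200):
    population.sort(key=fitness, reverse=True)
    survivors = population[:5]  # keep the best candidates
    children = [
        [gene + random.gauss(0, 0.1) for gene in random.choice(survivors)]
        for _ in range(15)
    ]  # mutate survivors to refill the population
    population = survivors + children

print("best candidate:", [round(g, 2) for g in max(population, key=fitness)])
```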
Google released a paper where they trained an AI agent on 2,500 hours of Minecraft video. The agent can run on a single GPU and can mine diamonds offline, which takes an average of 24,000 clicks. Ethan Mollick makes the connection that this same approach may work for AI robots. https://arxiv.org/pdf/2509.24527
Google also published an epic 64-page guide for developers on how to build AI agents. https://services.google.com/fh/files/misc/startup_technical_guide_ai_agents_final.pdf
Apple Lays Foundation for Agents Within Apps
Apple is integrating Anthropic’s MCP framework within its system feature called App Intents. App Intents enable developers to save app actions into OS shortcuts, and users can invoke those shortcuts with Siri. The more we can chat with our apps, the less we’ll open them. Just as page views will go down… so will app opens.
Apple working on MCP support to enable agentic AI on Mac, iPhone, and iPad
For over a year, I’ve strongly believed Apple is purposefully holding back technology because they don’t want to end the App Store experience prematurely… but they could tomorrow, if they wanted. I wrote this article explaining it back in October 2024. https://ethanbholland.com/2024/10/25/apple-is-pulling-a-braveheart-and-can-change-the-way-we-use-phones-whenever-they-choose/
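For a sense of what an MCP integration looks like in code today, here’s a minimal sketch that exposes a single (made-up) app action as a tool, using the FastMCP helper from the official MCP Python SDK. The reminder action is hypothetical, and this is not Apple’s App Intents bridge.

```python
# Minimal MCP server exposing one hypothetical app action as a tool.
# Assumes the official MCP Python SDK (pip install "mcp[cli]"); the action is made up.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("reminders-demo")

@mcp.tool()
def add_reminder(title: str, due: str) -> str:
    """Create a reminder with a title and an ISO-8601 due date."""
    # A real App Intents bridge would hand this off to the OS; here we just echo it.
    return f"Reminder '{title}' scheduled for {due}"

if __name__ == "__main__":
    mcp.run()  # serves the tool over stdio so an agent client can call it
```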
Anthropic Can Create Spreadsheets and PowerPoints
Earlier in September, Anthropic announced that Claude can create and edit files. Users can describe what they need and Claude can create and edit Excel spreadsheets, documents, PowerPoint slide decks, and PDFs directly within the Claude chat in the browser or the desktop app.
Claude uses a proprietary computer environment to write code and run programs. You can download files once they’re completed or save them to your Google Drive. It’s a harbinger of the end of software installation, like opening up Microsoft Word or PowerPoint.
This is a trend in almost every update of the week so far: fewer page views, fewer app opens on mobile, and fewer app opens on desktop.
Keep an eye out, as this will be released to more and more users over the next few weeks. https://claude.com/blog/create-files
Ethan Mollick posted that “AI agents are now capable of doing real, if bounded, work. But that work can be very valuable. For example, the new Claude Sonnet 4.5 was able to replicate published economics research from data files & the paper.” https://www.oneusefulthing.org/p/real-ai-agents-and-real-work
Just as Google released a 64-page guide to building agents… Anthropic published a similar yet shorter post, “Building agents with the Claude Agent SDK” https://www.anthropic.com/engineering/building-agents-with-the-claude-agent-sdk
Anthropic Launches Autonomous Coding Agents
For developers, Anthropic released an integration for VS Code and terminal interfaces that allows automated code writing for extended sessions.
For people who use coding copilots, it’s a common idea that you can preview the code and then approve it with the tab button. In this case, Claude Code has an option to accept every suggestion and just let it run, which is both powerful and a little bit crazy.
You can watch as it approves each piece of code, and it breaks projects into chunks so you can undo or modify them. But in a way, it’s almost like hitting autoplay on a video game and seeing how far Claude can go before you have to stop it. With every release, Claude runs longer without needing to be interrupted.
“Claude Sonnet 4.5 ran autonomously for 30+ hours of coding?! The record for GPT-5-Codex was just 7 hours.” -Yuchen Jin https://x.com/Yuchenj_UW/status/1972708720527425966

Anthropic also released the Claude Agent SDK, formerly the Claude Code SDK, which gives that same type of power to building out agents.
Anthropic posted “Subagents in Claude Code work like a coordinated team: one debugs, another tests, another refines. Each becomes an expert at its task, working in sequence to solve the problem at hand.” https://x.com/claudeai/status/1971666134492696749
It’s all starting to blend together—whether you’re writing code or whether you’re coding agents. It’s all one big vibe. https://www.anthropic.com/news/enabling-claude-code-to-work-more-autonomously
Russell Kaplan, president of Cognition, posted: “Sonnet 4.5 is the most important coding model release in a while. From our early-access evals, we estimate it’s roughly the same jump in capabilities between Claude 3.5 and 4. As a result, Devin is >2x faster and 12% better on our internal benchmarks.”

Anthropic Tests Creating Dynamic User Interfaces
Anthropic launched a new agent interface “sandbox” that enables conversational user-interface generation. It’s called Imagine with Claude, and it’s a demo available only to limited users.
These beta testers will be given a traditional-looking desktop user interface, but everything is run by artificial intelligence. You can simply describe what you want to do, as if you had a magical app on your computer, and Claude generates it live. You can click around and actually use it.
This is a signal that in the future, instead of apps being predetermined for one universal user experience… like the way Microsoft Word is a fixed program with static buttons and features… artificial intelligence empowers programs to morph from one thing to the next on the fly as the user’s needs change. Here’s a great recap worth reading. https://www.testingcatalog.com/anthropic-experiments-with-an-agent-for-gereating-ui-on-the-fly/
On June 14, 2025, I wrote about this in a blog post after Google did something similar: Google started to roll out data visualization features in their AI search results that automatically create interactive graphs for stock and mutual fund queries. These visualizations are built on the fly using Gemini’s reasoning capabilities to build whatever charts or graphs “Google thinks” best fit the data question. This is bigger than it might initially seem, as user interfaces will simply diffuse in front of us rather than being built in advance as fixed experiences and navigation paths. Dynamic content -> dynamic interfaces
Google demonstrated a new ability by Gemini 2.5 to create custom user interfaces on the fly based on the user’s needs. In real time, the AI writes code for an entirely new interface as users click on the buttons. Everything is optimized for a fluid user experience as opposed to a predefined experience of choices and designs. There’s a fun demo video that’s worth watching. https://x.com/GoogleDeepMind/status/1935719933075177764
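Here’s a rough sketch of the pattern underneath both demos: ask a model for a small UI spec and render whatever comes back, instead of shipping a fixed screen. The call_llm helper is a stand-in for whichever model API you use, and the spec format is invented for illustration; nothing here is Anthropic’s or Google’s actual implementation.

```python
# Sketch of on-the-fly UI generation: the model returns a JSON spec, the app renders it.
# call_llm() is a placeholder for any chat-completion API; the spec format is invented.
import json

def call_llm(prompt: str) -> str:
    # Stand-in for a real model call; pretend the model returned this spec.
    return json.dumps({
        "title": "Trip budget",
        "widgets": [
            {"type": "number_input", "label": "Nightly hotel rate"},
            {"type": "slider", "label": "Number of nights", "min": 1, "max": 14},
            {"type": "button", "label": "Calculate total"},
        ],
    })

def render(spec: dict) -> None:
    # A real client would build native controls; here we just print the layout.
    print(f"== {spec['title']} ==")
    for widget in spec["widgets"]:
        print(f"[{widget['type']}] {widget['label']}")

user_request = "Help me budget a hotel stay"
spec = json.loads(call_llm(f"Return a JSON UI spec for: {user_request}"))
render(spec)
```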
Microsoft Integrates AI into Word and Excel
As if they could feel their ears burning, Microsoft launched a new concept called Vibe Working, which lets you talk and work with Word and Excel. Based on my experience and instinct, these features are often pathetic, and I would not use them. Early benchmarks agree with me: the Excel Agent scored 57% accuracy compared to the human baseline of 71%. Over time it will get better and better, and eventually I’m sure it’ll be fantastic. Just not at launch. https://www.microsoft.com/en-us/microsoft-365/blog/2025/09/29/vibe-working-introducing-agent-mode-and-office-agent-in-microsoft-365-copilot/
Generally though, when it comes to AI agents, Ethan Mollick wrote, “The jump from ‘agents are nowhere close to working’ to ‘okay, narrow agents for research and coding work pretty well’ to (very recently) ‘general purpose agents are actually useful for a range of tasks’ has been quick enough (less than a year) so that most people have missed it.” https://x.com/emollick/status/1972141975458796020
Video News
OpenAI Releases Sora 2
OpenAI released Sora 2, its latest video generation model. Sora 2 features synchronized audio, dialogue, and sound effects. It allows you to create a character of yourself. More than that, it allows you to create friend groups and insert yourself or your friends into videos, with permission. You can upload photos of yourself and then dynamically drop yourself into any scene you want.
As a companion to Sora 2, OpenAI released an iPhone app, also called Sora.
This mix of building things with your friends, connecting socially, and swapping each other in and out of videos is a new take on collaboration, creativity, and slop… However, beyond the novelties, the video quality is spectacular.
The physics is quite something, with waves, water, fire, and wind all behaving very accurately. You can combine multiple still images to influence the composition of a video, and merge the elements together. You can control the camera angles through prompting.
As you can imagine, you can pick almost any artistic style of video, just like you can with images. Sora 2 is easily one of the top video models (with Kling and Veo3).
Here are a ton of examples that you can click on if you want to see it in action.
https://openai.com/index/sora-2/ https://blog.samaltman.com/sora-2
How can Sora solve these questions, despite being a video model? One explanation: Sora users’ prompts might be rewritten by an LLM before video generation. In that case, the LLM layer might first solve the problem, then include the solution explicitly in the rewritten prompt. https://x.com/EpochAIResearch/status/1974172889676177682
Physics with Sora 2 …and some anime. https://x.com/OpenAI/status/1973143639200243959
so.. apparently sora 2 is also a browser it’s wild to see what types of capabilities emerge in the model this is sora 2 rendering pasted html (actual browser-rendered html on the right) https://x.com/jesperengelen/status/1973147038499086523
Sora 2 can solve questions from LLM benchmarks, despite being a video model. We tested Sora 2 on a small subset of GPQA questions, and it scored 55%, compared to GPT-5’s score of 72%. https://x.com/EpochAIResearch/status/1974172794012459296
Sora 2 delivers. It nails accents, aesthetics, and actually has comedic timing. My initial tests making AI videos across a variety of styles 👇 https://x.com/bilawalsidhu/status/1973151157137842416
Sound on. https://x.com/OpenAI/status/1973071069016641829
This is legitimately mind-blowing… How the F*&% does Sora 2 have such a perfect memory of this Cyberpunk side mission that it knows the map location, biome/terrain, vehicle design, voices, and even the name of the gang you’re fighting for, all without being prompted for any of https://x.com/elder_plinius/status/1973124528680345871
TODAY WE LAUNCH SORA 2, THE WORLDS BEST VIDEO GENERATION MODEL feature you and your friends with raw real world physics, putting an end to the uncanny ai vibes let me show you how insane our model is, featuring me & sam altman: https://x.com/GabrielPeterss4/status/1973071380842229781
I have been warning about this for a couple years (the post below is from February 2023), but you really cannot trust any image or video you see online. It isn’t just Sora 2, it is a host of tools (many open source) that make cloning voice & images easy. https://x.com/emollick/status/1973461311649718302
I tried this test with Sora 2. It fails and the output looks a lot more fake than Veo 3. But interestingly, the audio output that narrates the scene gets the explanation right. https://x.com/fofrAI/status/1973745038195830891
My favorite trend in the Sora app is these body cam footage videos This clip with Spongebob hit 1M+ TikTok views! 🤯 I built a workflow to remove the Sora 2 watermarks👇 https://x.com/angrypenguinPNG/status/1974144279955325191
My test of any new AI video model is whether it make an otter using wifi on an airplane Here is Sora 2 doing a nature documentary… 80s music video… a thriller… 50s low budget SciFi film… a safety video.. film noir… anime… 90s video game cutscene… French arthouse https://x.com/emollick/status/1973220923810652523
Obsessed with Rick and Morty explaining 3D Gaussian Splatting. Sora 2 nails it – and yes they really are training on everything by default. https://x.com/bilawalsidhu/status/1973451863442989155
The voice cloning quality in the Sora 2 app is REALLY impressive. Wonder if this is the same tech behind “Voice Engine” which OpenAI never released because they were worried about just how good it was. https://x.com/bilawalsidhu/status/1973229885742051465
Kling Takes Top Spot in Leaderboard
Need proof of how crowded this space is? This SAME WEEK.. “Kling 2.5 Turbo takes the top spot in both Text to Video and Image to Video in the Artificial Analysis Video Arena, surpassing Hailuo 02 Pro, Google’s Veo 3, and Luma Labs’ Ray 3!”
Kling is an outstanding video model for sure. If you remember last week, Google Veo3 was absolutely mind-blowing. https://video-zero-shot.github.io/
To beat Veo3 (in any benchmark) is remarkable.
What’s fun is you can vote on the benchmark yourself. https://artificialanalysis.ai/video/arena
Here are five examples showing side-by-side comparisons of four models. https://x.com/ArtificialAnlys/status/1973570501156151474
Google Video Gets Bonkers with JSON Prompting
This sounds nerdy, but watch the examples along with the prompts. I wish I could embed them, but I can’t; they are only on Twitter. I promise they are worth clicking. The text examples are incredible. A rough, hypothetical sketch of the prompt structure follows the examples below.
VEO 3 + JSON prompting is pretty wild 🤯 These AG1 product videos were created entirely with AI. And it’s all about the prompt. This JSON prompting technique will take any generic VEO3 prompt.. And turn it into surgical-precision video generation with brand consistency. https://x.com/mikefutia/status/1951282585235066933
Veo3 text animations are so satisfying All 7 examples created on @LeonardoAi_ in 1080p with prompts included. Bookmark & simply change the title word to reuse 1) Autumn Wind Whirl: 🧵 {"sequences":[ { "start_sec":0, "end_sec":3, "narrative":"Wind gusts,whirl 'VEO3' in https://x.com/Mr_AllenT/status/1950900824902631471
Google VEO 3 just announced 9:16 videos and it’s absolutely WILD 🤯 This AI system creates hyper-realistic UGC-style ads in vertical format. And generates multiple clips that look like real creator content. Perfect for e-comm operators & ad agencies running Facebook/TikTok https://x.com/mikefutia/status/1970899751613636610
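As promised above, here’s a hypothetical example of the structure these creators are using: instead of one long sentence, the prompt is a JSON object of timed sequences. The field names mirror the examples in the posts, not any official Veo schema, so treat it purely as illustrative.

```python
# Hypothetical JSON-style video prompt, built as a Python dict for readability.
# Field names mirror the creators' examples above; there is no official schema.
import json

prompt = {
    "style": "macro product shot, studio lighting, shallow depth of field",
    "aspect_ratio": "9:16",
    "sequences": [
        {"start_sec": 0, "end_sec": 3,
         "narrative": "Wind gusts whirl the word 'VEO3' out of autumn leaves"},
        {"start_sec": 3, "end_sec": 6,
         "narrative": "Camera pushes in as the letters settle and glow"},
    ],
}

print(json.dumps(prompt, indent=2))  # paste the result into the video tool's prompt box
```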
This is exactly what Andrej Karpathy predicted in June.
News re Future of the Internet Itself
Google AI Search
As if on cue, we transition to an announcement from Google:
“Starting today, you can ask a question conversationally and get a range of visual results in AI Mode, with the ability to continuously refine your search in the way that’s most natural for you.”
Google is toying with a new labs tool that abandons the concept of fixed search flows. The traditional path to purchase or predetermined search funnel is gone.
Google announced AI Mode, which helps you search both visually and through conversation.
You simply describe what you’re looking for like you’re talking to a friend. You don’t have to sort through anything. You don’t have to filter. Google simply starts showing you items, and you can refine the results by telling it what you want. So you may want darker clothing, lighter clothing, longer pant lengths…you just talk to Google, and Google keeps adjusting the results.
Google has named this type of query the “fan-out approach.” It’s worth skimming the announcement. https://blog.google/products/search/search-ai-updates-september-2025/
You can try this new type of search here: https://labs.google.com/search/experiment/43
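Conceptually, “fan-out” means the engine splits one conversational request into several narrower searches and merges the results. Here’s a toy sketch of that idea; the expand and search functions are stand-ins, and none of this reflects Google’s actual pipeline.

```python
# Toy "query fan-out": expand one request into sub-queries, search each, merge results.
# expand() would be an LLM call and search() a real index in practice; both are faked here.
def expand(query: str) -> list[str]:
    return [f"{query} dark colors", f"{query} long inseam", f"{query} under $80"]

def search(sub_query: str) -> list[str]:
    return [f"result for '{sub_query}' #{i}" for i in range(1, 3)]

def fan_out(query: str) -> list[str]:
    merged, seen = [], set()
    for sub_query in expand(query):
        for hit in search(sub_query):
            if hit not in seen:  # de-duplicate across sub-queries
                seen.add(hit)
                merged.append(hit)
    return merged

print(fan_out("men's hiking pants"))
```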
Perplexity Keeps Plugging Away
Perplexity announced a news and culture portal with a subscription plan designed for “both humans and AIs to consume news”. Launch partners include The Washington Post, CNN, The New Yorker, Wired, Vogue, Fortune, and others. The CEO of Perplexity suggests it’s the equivalent of Apple News+. It sounds like it’s a subscription plan; however, it is included for people already paying for Perplexity Pro or Perplexity Max.
There were several other announcements from Perplexity, including their browser, Comet, now being available to the general public.
At the same time, Perplexity announced that Comet can now browse the internet agentically on your behalf, thanks to an integration with Claude Sonnet 4.5’s browser tools.
Last week, Perplexity launched a search API that lets developers query its entire cached index of the internet.
This week, they announced a browsing API as well, which I assume is more real-time rather than relying on the cache.
It’s worth noting that Chrome has been working with Claude since late August (for limited beta testers). https://claude.com/blog/claude-for-chrome
Cloudflare Tries to Make Fetch Happen
While companies like Perplexity are building internet caches and opening them to developers through APIs, Cloudflare is taking almost the opposite approach: building a web index that makes its customers’ content discoverable and then charging for access.
Essentially, the internet works best when it’s cached (copied) all over the world so that every time you ask to see a website’s information, the site is served from a computer close to you and pre-built as much as possible. This speeds up performance.
Companies like Cloudflare provide a middle layer between the origin servers that host websites and the cached copies syndicated all over the world.
This puts Cloudflare in a very strong position to be a gatekeeper because they’re essentially between every single website being served and everyone asking to see that website’s information.
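You can actually see that middle layer from the outside. As a quick sketch, request any Cloudflare-fronted site and look at the response headers: the cf-cache-status header tells you whether the edge served a cached copy. The URL below is a placeholder.

```python
# Peek at CDN cache behavior by inspecting response headers.
# example.com is a placeholder; point this at a Cloudflare-fronted site you control.
import requests

response = requests.get("https://example.com/")
print("served by:", response.headers.get("server"))              # e.g. "cloudflare"
print("cache status:", response.headers.get("cf-cache-status"))  # HIT, MISS, DYNAMIC, ...
print("edge age (s):", response.headers.get("age"))              # how long the copy has been cached
```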
Cloudflare’s AI Index idea is in an interesting chicken-and-the-egg position… because unless companies enable it, there’s nothing to sell. And if there’s nothing to sell, it’s hard for AI companies to want to pay for this information.
There’s also a lot of tension in general with legacy content v. AI… because unless content is completely proprietary, an AI robot can find a product or information in more than one location.
For example, the results of a sporting event are available from multiple sources, the weather is abundant, there are millions of recipes online for any given idea, and a product for sale is usually available from more than one retailer.
The only thing that’s always truly sellable would be content that isn’t available anywhere else… possibly hyperlocal news or personal blogs… but is there enough long-tail volume there to drive a model for Cloudflare?
I’m not sure Cloudflare is going to pull this off. They have the size, scale, and the perfect position, so the inertia is in their favor from an infrastructure point of view. But information flows through the path of least resistance, and even if Cloudflare puts up everything they can as a wall, AI may just find a way around it.
I think this is a fascinating battle to watch, and we’ll look back on it in history and see it either as one of the biggest turning points in infrastructure or simply a failed attempt that was a neat idea.
Cloudflare reminds me a lot of what Akamai used to do, and I’m not exactly sure how Cloudflare took all of Akamai’s business. But man… they did and they know it. https://www.cloudflare.com/cloudflare-vs-akamai/
I used to work with Akamai all the time, and you’d think Akamai would be in this position. So often, you wonder how companies get leapfrogged. For example, why isn’t Blockbuster where Netflix is now? Or why isn’t Kodak where SanDisk is now? Why isn’t Garmin where Waze is now?
Cloudflare’s AI Index is a very brute-force effort to solve a problem, and it will be a good example of whether the sheer will of a business can overcome the flow of a technology’s innovation. https://blog.cloudflare.com/an-ai-index-for-all-our-customers/
Alignment, Training, and Ethics
This week, Richard Sutton, the father of reinforcement learning, was on the Dwarkesh Podcast to discuss and debate the future of language models. In 2019 Sutton wrote perhaps the most important op-ed in artificial intelligence: a short article called The Bitter Lesson. Reflecting on 70 years of AI research, he concluded that the less humans get in the way, the better computers can train.
Sutton argues for systems that learn by interacting with the world, not by imitating internet text. That means no huge pretraining stage and no supervised finetuning, neither of which exist in nature.
Essentially, if we simply give computers large amounts of information, they will find the patterns on their own. The more humans try to guide AI outside of the natural process of learning by scale, the worse the results. With large language models, The Bitter Lesson has largely held true.
In LLM research, people constantly ask whether an approach is “bitter-lesson-pilled”… meaning it scales with compute and avoids human-engineered tricks.
Sutton argues that modern LLMs aren’t bitter-lesson-pilled at all… because they’re trained on finite, human-generated data that will eventually run out… and that, worse, embeds human biases.
Andrej Karpathy Weighs In on the Sutton Interview
One of my favorite minds in artificial intelligence is Andrej Karpathy. Andrej wrote a very thoughtful blog post reflecting on the interview with Sutton and Dwarkesh.
https://karpathy.bearblog.dev/animals-vs-ghosts/
Karpathy pushes back that nature isn’t the best analogy for AI. For example, animals (as individuals) don’t necessarily learn from their environments strictly on their own. They are born with innate traits and instincts. A baby zebra can walk almost immediately after birth; clearly, it does not discover how to walk through random effort.
Karpathy argues we’re not really trying to build animals or replicate natural processes. Instead, we’re trying to build reflections that are more like ghosts.
He uses the example of an airplane versus a bird. There’s no need to make a plane act like a bird. He basically says we need to fine-tune the AI ghosts in the direction of nature…an imperfect replica that distills everything humanity’s ever done with some sprinkle on top.
I’d recommend reading both The Bitter Lesson and Karpathy’s essay, and watching the podcast if you can.
OpenAI Alignment and Ethics News
OpenAI announced “We’re rolling out parental controls and a new parent resource page to help families guide how ChatGPT works in their homes. Available to all ChatGPT users starting today, parental controls allow parents to link their account with their teen’s account and customize settings for a safe, age-appropriate experience.”
It’s pretty provincial with five main settings:
- Set quiet hours
- Turn off voice mode
- Turn off memory
- Remove image generation
- Opt out of model training
https://openai.com/index/introducing-parental-controls
Meanwhile, Dylan Patel notes: “My favorite thing about the new college grads that I’ve hired is that they don’t ask you how to do stuff They just put it in ChatGPT and f*&*ing try even if wrong So much better than new grads asking you how to do sh*t without trying Chat is breeding agency into kids” https://x.com/dylan522p/status/1971425552902082941
You can feel the tension! Reminds me of “Googling it” as a pejorative back in the day.
Anthropic Alignment News
Just as Google last week announced a renewed framework for AI security, Anthropic announced an effort to create safety benchmarks. Anthropic is perhaps the most consistently cautionary frontier lab. It’s been a bummer watching the accelerationist dogpile on Anthropic simply for being honest.
“Claude now outperforms human teams in some cybersecurity competitions, and helps teams discover and fix code vulnerabilities. At the same time, attackers are using AI to expand their operations.
We showed that models could reproduce one of the costliest cyberattacks in history—the 2017 Equifax breach—in simulation. We entered Claude into cybersecurity competitions, and it outperformed human teams in some cases. Claude has helped us discover vulnerabilities in our own code and fix them before release.”
https://www.anthropic.com/research/building-ai-cyber-defenders
Google Ethics News
Just as Meta introduced its slop-fest Vibes feed a few weeks ago, YouTube is hinting at the same direction.
“YouTube Labs is a new initiative dedicated to exploring the potential of AI on YouTube. The first experiment, AI music hosts designed to deepen your listening experience by sharing relevant stories, fan trivia, and fun commentary about your favorite music on the YouTube Music app, is now live.” https://blog.youtube/news-and-events/introducing-youtube-labs/
Laws and Regulation – Newsom Signs CA Bill Targeting Frontier Model Regulations
Around this time last year, there were a lot of regulatory debates and laws, vetoes, and drama. In particular, Gavin Newsom caved and did not sign a bill that would have added quite a bit of regulation:
Gov. Newsom vetoes California’s controversial AI bill, SB 1047 https://ethanbholland.com/2024/10/05/ai-news-53-week-ending-10-04-2024-with-executive-summary-top-42-links-and-helpful-visuals/
There hadn’t been much in the news recently until the announcement last week of a super PAC designed to advance AI-friendly candidates without ever talking about AI. The PAC focuses on issues that core voter groups care about, which would indirectly elect pro-AI candidates, no matter what party they’re in, and help accelerate the pace of AI.
However, this week Governor Newsom signed a bill that squarely targets frontier models and requires them to publicly publish frameworks for best practices and safety. It also introduces whistleblower protections.
Instinctively, it seems a bit of a token gesture. The frontier labs are doing much of this on their own. The law simply lets the cement settle where it has already formed. It makes Newsom look good to folks who are worried about AI, but it doesn’t do a whole lot that’s not already in place. https://www.gov.ca.gov/2025/09/29/governor-newsom-signs-sb-53-advancing-californias-world-leading-artificial-intelligence-industry/
Business News
Anthropic Hires Stripe CTO
TechCrunch reports, “Anthropic has a new CTO: Rahul Patil, the former CTO of Stripe. Patil started at the company earlier this week, taking over from co-founder Sam McCandlish, who will move to a new role as Chief Architect.” https://x.com/zeffmax/status/1973833211835974046
This is interesting given the surge in payments and commerce integration with frontier models and agents.
AI Fintech Hackathon
“Wall Street invited 300+ NYC engineers to builds LLMs for HFT and investing Trading is about to permanently change Here are the top 6 demos” https://x.com/AlexReibman/status/1969847901737422955
Additional business headlines:
CoreWeave stock climbs 12% after $14 billion deal with Meta https://www.cnbc.com/2025/09/30/coreweave-meta-deal-ai.html
Meta Poaches OpenAI Scientist to Help Lead AI Lab | WIRED https://www.wired.com/story/meta-poaches-openai-researcher-yang-song/
OpenAI Completes Share Sale at Record $500 Billion Valuation – Bloomberg https://www.bloomberg.com/news/articles/2025-10-02/openai-completes-share-sale-at-record-500-billion-valuation?srnd=phx-ai
OpenAI’s First Half Results: $4.3 Billion in Sales, $2.5 Billion Cash Burn — The Information https://www.theinformation.com/articles/openais-first-half-results-4-3-billion-sales-2-5-billion-cash-burn
OpenAI is looking for a head of ads for ChatGPT https://sources.news/p/openai-ads-leader-sam-altman-memo-stargate
Chips and Hardware
OpenAI’s Stargate project to consume up to 40% of global DRAM output — inks deal with Samsung and SK hynix to the tune of up to 900,000 wafers per month | Tom’s Hardware https://www.tomshardware.com/pc-components/dram/openais-stargate-project-to-consume-up-to-40-percent-of-global-dram-output-inks-deal-with-samsung-and-sk-hynix-to-the-tune-of-up-to-900-000-wafers-per-month
Microsoft inks $33 billion in deals with ‘neoclouds’ like Nebius, CoreWeave — Nebius deal alone secures 100,000 Nvidia GB300 chips for internal use | Tom’s Hardware https://www.tomshardware.com/tech-industry/artificial-intelligence/microsoft-inks-usd33-billion-in-deals-with-neoclouds-like-nebius-coreweave-nebius-deal-alone-secures-100-000-nvidia-gb300-chips-for-internal-use
Cerebras Raises $1.1 Billion at $8.1 Billion Valuation https://www.cerebras.ai/press-release/series-g
Education
Google publishes paper re-envisioning textbooks
“Textbooks are a cornerstone of education, but they have a fundamental limitation: they are a one-size-fits-all medium. Any new material or alternative representation requires arduous human effort, so that textbooks cannot be adapted in a scalable manner. We present an approach for transforming and augmenting textbooks using generative AI, adding layers of multiple representations and personalization while maintaining content integrity and quality.” https://arxiv.org/pdf/2509.13348
I’m often asked how I think AI will best be used in the future, and I think personalized education is a great place to start.
One of the best strengths of AI is transforming existing work into derivatives.
For example, I could take a PhD thesis about economics, give it to a language model, and ask it to explain it to someone in fifth grade…and the model would transform the PhD paper into an accessible fifth-grade level.
I could also ask it to incorporate a theme like soccer, music, or cooking. Language models are very good at making connections and analogies to help humans spark creativity and connect dots.
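As a concrete sketch of that kind of transformation, here’s roughly what the prompt looks like in code, using the Anthropic SDK as an example. The model ID is a placeholder and the thesis excerpt is imaginary.

```python
# Sketch: rewrite a dense source text at a fifth-grade level with a soccer theme.
# Uses the Anthropic SDK as an example; the model ID and excerpt are placeholders.
import anthropic

client = anthropic.Anthropic()
thesis_excerpt = "..."  # imagine a paragraph from an economics PhD thesis here

reply = client.messages.create(
    model="claude-sonnet-4-5",  # placeholder model ID
    max_tokens=400,
    messages=[{
        "role": "user",
        "content": (
            "Rewrite the following for a fifth grader, using soccer analogies, "
            "without changing any of the factual claims:\n\n" + thesis_excerpt
        ),
    }],
)
print(reply.content[0].text)
```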
Often one of the hardest parts of education is trying to get someone excited about a topic because they don’t feel it applies to them. That’s a common thing with calculus, where you say, “When am I ever going to use this in real life?”
The future of education could start with libraries of intelligence. A large corpus of information, let’s say everything about physics, is held in a repository, and then that repository is queried by a language model on demand.
A textbook is a repository to some extent, but the difference is that a textbook is rigid and “trapped” once it’s printed. A language model could refer to the complete corpus of information (the set that contains all textbooks) and display the information in dynamic ways.
Just as we’ve been talking about user interfaces shifting from predetermined to dynamic, this could be the same thing with education.
At the end of the day, we want a student to pass a quantitative test. However, how the person gets from point A to point B does not need to be universal.
I love that Google is thinking this way. It reminds me of something Andrej Karpathy said a few months ago.
In April, Karpathy posted that the primary audience of our things will no longer be humans but instead LLMs. Further, he said that in the future there will be one single document that explains what a service does, and the LLM will learn from it and take it from there.
If I were to guess what AI’s educational impact will be, it’s “grounded information retrieval” and dynamic outputs… to match the student’s interests and ability level.
“The primary audience of your thing (product, service, library, …) is now an LLM, not a human.” https://x.com/karpathy/status/1914494203696177444
What’s funny is that I’ve always written this newsletter for myself. I publish it in an effort to practice, but I’m not really writing it for anyone other than me. I know that it’s way too long and almost impossible for a person to keep up with. The irony is, maybe I’m actually writing it for AI models, so you could just ask them about my newsletter and they’ll tell you whatever you need to know. I’m very aware that this is a lot of information, but by forcing myself to write it down every week, I process my thoughts and they cement much better in my memory.
AI Imagery
AI image frontier lab Black Forest Labs is seeking a $4 billion valuation in a new funding round.
Black Forest Labs is best known for their image model, Flux, which is one of the top three image models in the world (IMO), and it’s open source.
In fact, a few weeks ago Adobe added Flux to Photoshop, along with Gemini 2.5. This is the first time Adobe has included a third-party AI model inside Photoshop. https://news.adobe.com/news/2025/09/adobe-announces-general-availability-ai-agents
Google announced that Nano Banana is now available to everyone in the API as well. https://x.com/GoogleAIStudio/status/1973836478989152700
Frontier Model Announcements
Anthropic
Along with the updates we covered in Claude Code and Claude for Chrome, Anthropic released Sonnet 4.5, calling it the best coding model in the world.
It’s important to remember that Anthropic manages three different models at once.
The small model is called Haiku, the medium model is called Sonnet, and the large model is called Opus.
When we hear Gemini or Claude or ChatGPT, we immediately think of them as being one model, and every time a new version is released, we assume it’s the frontier top performer.
This release by Anthropic is for their mid-level model, Sonnet, but it’s still being called the strongest model for building agents or coding (even beating the last version of Opus).
The primary software engineering AI benchmark is called SWE-Bench.
Claude Sonnet 4.5 scores an 82% on SWE-Bench, beating out Opus 4.1, which is a little bit counterintuitive: Opus 4.1 comes in at 79%. One can assume that the next version of Opus will beat Sonnet 4.5.
In third place is Sonnet 4. GPT-5 Codex from OpenAI comes in fourth at 74.5%. Gemini 2.5 Pro comes in at 67.2%!
This chart is a great example of how to look at things knowing that they’re each going to leapfrog each other with each new release. Claude Sonnet 4.5 tops the leaderboard in almost everything, including agentic coding, terminal coding (which means using the command line), tool use, computer use, and high school math — where it scored 100% in Python and 87% with no tools. In graduate-level reasoning, it scored 83%, coming in just behind GPT-5 at 89%. It also was the top performer in financial analysis.



You can read all about the announcement here. https://www.anthropic.com/news/claude-sonnet-4-5
Apple is getting ready to make its move!
Apple has built an app called Veritas that it is not releasing to the public, and is only available to its AI team.
Veritas is a new front-end interface for Siri. It’s reported to be integrated with another secret project called Linwood, which Apple is building as the back-end to Siri.
Two years ago, I became fascinated with a project at Apple called Ferret. Apple built what’s called a large action model, an AI tool that can, as you might guess, take actions in an interface by allowing you to talk to the interface instead of touching it.
For an iPhone, this would mean that you could simply talk to the iPhone. The phone system itself would then take all the actions you wanted across every single app on your phone without ever having to open an app.
This new Veritas and Linwood project news sounds like Apple is working on the two other legs of the stool.
Linwood is a back-end integration to access your personal data, and it ties more to the user’s specific content. Examples might be your calendar, your emails, your personal preferences, your shopping histories. That would all be what I would consider your personal information that Siri could use to give you a better experience.
Veritas is the front-end that powers the interaction you have with Siri. I would assume that Ferret would then take all these actions and execute them across all the different components that you have installed on your phone, like third-party applications.
It’s been reported that Apple is also working with third-party LLMs like Gemini in order to power some of this stuff. In the news, I think the partnership with Gemini comes across as if Apple is not doing anything or has given up. But I think the combination of Linwood, Veritas, and the Ferret action model would be absolutely incredible once Apple decides it’s time to destroy the entire app store and the user interface as we know it. Not easy.
I’ve wondered if that’s why Glass is such a big part of the latest operating system updates. It might be a slow transition to a new user interface.
https://mashable.com/article/apple-chatgpt-like-app-veritas-siri-ai-voice-assistant
Z AI Releases GLM-4.6
Z.ai first made my radar back in April with a model called GLM-4. It was an open-source model with 32 billion parameters, comparable at the time to GPT-4o and DeepSeek V3. There was also a deep-reasoning version and a deep-thought version, as well as a math + reasoning version.
I saw GLM come through and I tagged it and categorized it, but I didn’t really pay attention because it blended in with a lot of the frequent open-source news. https://x.com/reach_vb/status/1911823161185755154 https://ethanbholland.com/2025/04/18/ai-news-80-week-ending-april-11-2025-with-23-executive-summaries-top-48-links-and-6-helpful-visuals/ https://ethanbholland.com/category/ai/open-source/
However, GLM has continued to make the radar, and this week it finally makes the executive summary. Z.ai has now introduced GLM-4.6, which includes advanced agentic reasoning and coding capabilities and is Z.ai’s new flagship model. It can run on an M3 Ultra — which is a desktop computer from Apple — meaning you can run it locally on your own machine and have essentially frontier-model power. It has been optimized for real-world agent use. https://z.ai/blog/glm-4.6
Some of the benchmarks are very competitive against Anthropic’s just-released Sonnet 4.5. In the coming weeks, I will probably add a category for Z.ai, just like I did for Moonshot and their Kimi model. It’s interesting because months ago I added Mistral, but I almost never see anything about them anymore. I mostly see Qwen, GLM, and Kimi… all out of China.

Open Source
NVIDIA Leads US in Open Models
I’ve been following NVIDIA’s model releases for the past few years, and almost every model they release is open-sourced. At first, it caught me off guard, because so little has been open-sourced by other companies in the United States, and I figured NVIDIA would have something to lose. But then I realized that all these open-source programs can run on NVIDIA chips, of course. Whether it’s robotics or vision, there are a ton of really powerful models, all from NVIDIA. Almost everything else that is strong is from China, with the exception of perhaps Amazon Nova, which is a bit of a niche but still pretty strong and inexpensive.
Clement Delangue, CEO of HuggingFace posted: “few people know that NVIDIA is becoming the American open-source leader in AI, with over 300 contributions of models, datasets and apps on HuggingFace in the past year.” https://x.com/ClementDelangue/status/1971698860146999502
Robotics
Speaking of NVIDIA, they held the first public demo of their Isaac GR00T open-source foundation model. This is a locally hosted, open, customizable reasoning–vision–language model that turns vague instructions into step-by-step plans. In the example, the robot is asked to “bring me the healthiest snack,” and it can go into the kitchen, identify the healthiest choice, and bring it back to the person.
NVIDIA’s work with robotics is, to me, the most powerful and most promising AI engine for robotic brains. NVIDIA’s ability to train in simulation and execute zero-shot in the real world is absolutely mind-blowing. They taught a robot dog to walk on a ball using simulations. Dr. Jim Fan is one of my favorite people in artificial intelligence, and seeing this demonstration of GR00T is just spectacular. I hope everyone starts to see what’s coming our way.
Amazon
Speaking of zero-shot transfer to the real world after training in simulation:
“Amazon is training humanoids to move boxes. Makes sense! OmniRetarget is a data generation engine that enables complex loco-manipulation for humanoids. It uses offline retargeting from human MoCap datasets and augments data from single demos to produce 8 hours of trajectories that train RL policies in simulation, transferred zero-shot to real robots.”
https://x.com/TheHumanoidHub/status/1973489480813388240 https://omniretarget.github.io/
Figure Goes All In For Funding.
Brett Adcock, CEO of Figure, posted: “My companies have raised $4B the last 5 yrs. It’s now time to re-accelerate. Today, I am assembling a Capital Formation team. This team will raise tens of billions to bring sci-fi into the present, reporting directly to me. DMs open. No remote candidates.” https://x.com/adcock_brett/status/1973417191124160894
Google Demos Gemini Robotics 1.5
Last week, Google DeepMind released Gemini Robotics 1.5, a model that gives robots the ability to see, understand, plan, reason, and act in real physical environments. It’s an extension of the same multimodal foundation that DeepMind’s regular AI models use, except this new version extends the architecture so robots can interpret the real world and perform physical actions.
This past week, Google demonstrated Gemini Robotics 1.5’s abilities. It’s insane.
https://deepmind.google/blog/gemini-robotics-15-brings-ai-agents-into-the-physical-world
Meta Getting Into The Robot Game
The Verge reports that Meta is developing its own humanoid robot, dubbed “Metabot.” CTO Andrew Bosworth believes the bottleneck is the software and envisions licensing the software platform to other robot makers, provided the robots meet Meta’s specs. Former Scale AI CEO… https://x.com/TheHumanoidHub/status/1972223919831547985
Unitree Continues To Have The Hardware Pole Position, Globally
Unitree CEO Wang Xingxing expects R1 to be the world’s best-selling humanoid robot next year. Won’t shock anyone if it happens. The company announced a starting price of $5,900, but even at $12k this will sell like hot cakes. https://x.com/TheHumanoidHub/status/1973452915366044096
Thinking Machines
Tinker Launches
Mira Murati is well known as the former Chief Technology Officer at OpenAI. Before that, she was a Senior Product Manager at Tesla. After she left OpenAI, she founded the company Thinking Machines Lab in February of 2025. Thinking Machines ran under the radar for quite some time. However, before it even had a product, it was valued at $12 billion, thanks to investors from across Silicon Valley.
“Mira Murati (ex-OpenAI CTO) unveiled Thinking Machines Lab The startup focuses on making AI more adaptable, capable, and open (where OpenAI has struggled) They’ve already been hiring researchers from OpenAI, Mistral, DeepMind, and more” https://x.com/adcock_brett/status/1893708446177874325
This week, Thinking Machines launched its first product, called Tinker: an API for fine-tuning open-source language models. It allows researchers to experiment without having to build or host their own server clusters.
Before Tinker was announced to the public, the Gödel theorem-proving team at Princeton used Tinker to train mathematical theorem provers. A chemistry reasoning group at Stanford fine-tuned a model to handle chemistry tasks.
The value of Tinker is that it lowers the barrier to advanced AI research, because in the past only well-funded labs or large companies could afford to fine-tune big models. Tinker allows this to happen as an API.
https://thinkingmachines.ai/tinker
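To appreciate what Tinker abstracts away, here’s roughly what a do-it-yourself LoRA fine-tune of a small open model looks like with the Hugging Face transformers and peft libraries. This is a generic sketch with placeholder model and data choices, not Tinker’s API, which I haven’t used.

```python
# What Tinker-style services abstract away: a bare-bones LoRA setup with transformers + peft.
# The model name is a placeholder, and you would still need GPUs, data, and a training loop.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "Qwen/Qwen2.5-0.5B"  # placeholder small open model
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

lora = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)  # wrap the base model with small trainable adapters
model.print_trainable_parameters()   # only a tiny fraction of weights will be trained

# From here you would tokenize your task data, run a standard Trainer loop,
# and save just the adapter weights with model.save_pretrained("my-adapter").
```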
Vision
Back in August, Meta released a strong computer vision model called DINOv3. This week, a group of researchers customized it to look for concrete cracks in images. Identifying and labeling concrete cracks is a very tough task for vision models, and this one pulled it off with almost perfect accuracy after only one round of training.
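The recipe behind results like this is usually simple: freeze a strong pretrained vision backbone and train a tiny head on a handful of labeled crack/no-crack images. Here’s a generic PyTorch sketch of that pattern; the backbone below is a stand-in, not Meta’s DINOv3 code, and the feature dimension is an assumption.

```python
# Generic "frozen backbone + linear probe" pattern behind many quick vision fine-tunes.
# The backbone is a placeholder stand-in for DINOv3 features (dim=768 assumed).
import torch
import torch.nn as nn

class CrackProbe(nn.Module):
    def __init__(self, backbone: nn.Module, feature_dim: int = 768):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad = False           # keep the pretrained features frozen
        self.head = nn.Linear(feature_dim, 2)  # crack vs. no-crack

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            features = self.backbone(images)
        return self.head(features)

# Placeholder backbone; swap in real DINOv3 features in practice.
backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 768))
model = CrackProbe(backbone)
logits = model(torch.randn(4, 3, 224, 224))  # a tiny round of training would follow
print(logits.shape)                          # torch.Size([4, 2])
```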

Humanity Reading of the Week
This week’s Humanity Reading is “The Archaic Torso of Apollo” by Rainer Maria Rilke, who is one of the favorite poets of my good friend Alexis. Alexis is an incredible quilter. She creates museum-quality works of art.
“The Archaic Torso of Apollo” pushes the reader to change, and I’m using the torso image as a metaphor for legacy technology that is about to transform. The shape of the internet — a browser, chiseled user interfaces that are inflexible — is going to become the foundation for a dynamic, diffused relationship with content. If you read the poem with this context, it’s pretty fun.
Archaic Torso of Apollo
We cannot know his legendary head
with eyes like ripening fruit. And yet his torso
is still suffused with brilliance from inside,
like a lamp, in which his gaze, now turned to low,
gleams in all its power. Otherwise
the curved breast could not dazzle you so, nor could
a smile run through the placid hips and thighs
to that dark center where procreation flared.
Otherwise this stone would seem defaced
beneath the translucent cascade of the shoulders
and would not glisten like a wild beast’s fur:
would not, from all the borders of itself,
burst like a star: for here there is no place
that does not see you. You must change your life.
-Rainer Maria Rilke
From Alexis’s Instagram, describing one of her quilts: “This is a quilt about the transformative power of decay and how it speaks to us about permanence and impermanence, what remains when we are gone, and the preciousness of our fragile, short lives.” https://www.instagram.com/p/DR5LpOWCbOu/?img_index=1
https://www.instagram.com/alexisdeise/
Full Executive Summaries with Links, Generated by Claude Sonnet 4.5
ChatGPT now lets users buy products directly through conversations
OpenAI launched instant purchasing within ChatGPT, starting with Etsy and expanding to over a million Shopify merchants, powered by their new open-source Agentic Commerce Protocol built with Stripe. This marks a significant shift from ChatGPT being just a recommendation tool to becoming a complete shopping platform where AI agents can handle the entire purchase process. The move represents OpenAI’s strategy to monetize free users through transaction-based revenue rather than just subscription fees.
Buy it in ChatGPT: Instant Checkout and the Agentic Commerce Protocol | OpenAI https://openai.com/index/buy-it-in-chatgpt/
Cool thing launching today: you can now buy products directly from ChatGPT (starting with Etsy and expanding soon to over a million Shopify merchants). It’s powered by the Agentic Commerce Protocol, an open standard we built with Stripe. Merchants and developers who want Instant… https://x.com/fidjissimo/status/1972707487238467914
@stripe Developers can adopt the protocol and apply to accept orders directly through ChatGPT with Instant Checkout. → https://x.com/OpenAIDevs/status/1972713154598871541
ChatGPT already helps millions of people find what to buy. Now it can help them buy it too. We’re introducing Instant Checkout in ChatGPT with @Etsy and @Shopify, and open-sourcing the Agentic Commerce Protocol that powers it, built with @Stripe, so more merchants and developers https://x.com/OpenAI/status/1972708279043367238
I mostly buy stuff from ChatGPT now, so I am excited for this new feature! https://x.com/sama/status/1972993739074523239
Instant Checkout for merchants in ChatGPT https://chatgpt.com/merchants
We wrote about how Agentic purchasing would be the primary way that OpenAI would monetize free users on August 13th in “GPT-5 set the stage for monetization.” https://x.com/SemiAnalysis_/status/1972714269839163901
You can now buy products directly on ChatGPT. Launched with merchants like Etsy; Shopify and more coming soon. Merchants can integrate using the Agentic Commerce Protocol, a new open standard we built with Stripe: https://x.com/gdb/status/1972717815703683218
We’re introducing the Agentic Commerce Protocol, an open standard co-developed with @Stripe that enables programmatic commerce flows between users, AI agents, and businesses. https://x.com/OpenAIDevs/status/1972712933080920451
Agentic Commerce https://developers.openai.com/commerce/
ChatGPT launches Pulse, a proactive daily briefing feature for users
OpenAI rolled out ChatGPT Pulse to Pro subscribers, which analyzes users’ chat history, calendar data, and interests to automatically generate personalized daily updates overnight. Unlike traditional chatbots that only respond to queries, Pulse proactively anticipates user needs and delivers relevant information each morning. Early users report receiving significantly better answers to previous questions, suggesting the system learns and improves its responses over time.
Just had an interesting experience with ChatGPT Pulse Last night I asked ChatGPT a question and it gave me an answer that was just so-so This morning the first card in Pulse was a 3x better answer to the same question I’d be really curious how this works under the hood” / X https://x.com/nbashaw/status/1972335882058473984
Personalized just-in-time content is here. ChatGPT pulse is effectively a daily newsletter tuned to your interests. Cost of generation will only go down. NotebookLM style audio overview is an obvious next step. Eventually it’ll be a mf-ing YouTube video you can talk to. The https://x.com/bilawalsidhu/status/1971621232589324741
AI should do more than just answer questions; it should anticipate your needs and help you reach your goals. That’s what we’re beginning to build, starting with ChatGPT Pulse (rolling out now to Pro, with goal of making it available to everyone over time): https://x.com/fidjissimo/status/1971258542578663829
Now in preview: ChatGPT Pulse This is a new experience where ChatGPT can proactively deliver personalized daily updates from your chats, feedback, and connected apps like your calendar. Rolling out to Pro users on mobile today. https://x.com/OpenAI/status/1971259652684878019
Today we are launching my favorite feature of ChatGPT so far, called Pulse. It is initially available to Pro subscribers. Pulse works for you overnight, and keeps thinking about your interests, your connected data, your recent chats, and more. Every morning, you get a” / X https://x.com/sama/status/1971297661748953263
ChatGPT Pulse — a background agent which delivers updates to you every day on topics of interest:” / X https://x.com/gdb/status/1971267684609540583
Amazon launches Alexa+ with generative AI for $20 monthly
Amazon unveiled Alexa+, a generative AI-powered assistant that can autonomously navigate websites, make reservations, and control smart homes across tens of thousands of services—capabilities the company claims have never been achieved at this scale. The service costs $19.99 monthly but comes free for Prime members, positioning it as a major new benefit that could drive subscription growth. Early access begins in coming weeks for select Echo Show devices, with broader rollout following.
Introducing Alexa+, the next generation of Alexa https://www.aboutamazon.com/news/devices/new-alexa-generative-artificial-intelligence
Google releases comprehensive 64-page guide for building AI agents
Google published a detailed technical guide covering the complete development lifecycle of AI agents, from initial experimentation through production deployment using their Vertex AI platform. This represents Google’s most systematic effort to standardize agent development, potentially accelerating enterprise adoption by providing clear implementation pathways that competing cloud providers haven’t matched with similar depth.
🚨 Google just dropped an ace 64-page guide on building AI Agents From ADK to AgentOps, Vertex AI Agent Engine to Agentspace, this guide is the clearest path yet from experimentation to scalable production 🔥 Download link (free!) in 🧵 ↓ https://x.com/DataChaz/status/1969844882299859416
Google literally dropped an ace 64-page guide on building AI Agents: https://x.com/Meer_AIIT/status/1970889941417898384
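The ADK portion of the guide boils down to declaring an agent with a model, an instruction, and plain Python functions as tools. Here is a minimal sketch assuming the `google-adk` package the guide covers; the tool, the model id, and the instruction are placeholders I made up, so verify the exact API against the guide before copying this.

```python
# Minimal sketch, assuming the google-adk package covered in the guide.
# The tool, model id, and instruction are placeholders, not taken from the guide.
from google.adk.agents import Agent


def lookup_order(order_id: str) -> dict:
    """Toy tool: return the status of a (made-up) order."""
    return {"order_id": order_id, "status": "shipped"}


root_agent = Agent(
    name="support_agent",
    model="gemini-2.5-flash",  # assumed model id
    instruction="Answer order-status questions, calling lookup_order when needed.",
    tools=[lookup_order],
)
# The ADK CLI (adk run / adk web) can then serve this agent locally.
```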
Google trains Minecraft AI to mine diamonds using minimal data
Google’s AI learned complex Minecraft gameplay from just 2,541 hours of video, achieving tasks that typically require 24,000 player actions while running on a single GPU. This breakthrough in training AI agents from video demonstrations could accelerate development of robots that learn real-world tasks by watching humans, rather than requiring expensive trial-and-error training.
Fast progress in training AI agents to interact with the world. Training on just 2,541 hours of Minecraft video, Google built an AI that runs on a single GPU & was able to mine diamonds offline (which takes an average of 24,000 clicks). The same approach may work for AI robots. https://x.com/emollick/status/1973385878195044444
Google’s AlphaEvolve AI discovers new mathematical theorems in complexity theory
Google DeepMind’s AlphaEvolve system used AI to find complex mathematical structures that improved fundamental limits in computational complexity theory, including better bounds for the MAX-4-CUT problem and average-case hardness of graph problems. Unlike AI systems that generate potentially flawed proofs, AlphaEvolve discovered verified finite structures that automatically “lift” to universal mathematical theorems when plugged into existing proof frameworks. The system achieved a 10,000x speedup in verification and found structures far more complex than those previously discovered by humans, suggesting AI’s potential as a rigorous mathematical research partner.
AI as a research partner: Advancing theoretical computer science with AlphaEvolve https://research.google/blog/ai-as-a-research-partner-advancing-theoretical-computer-science-with-alphaevolve/
Apple developing AI agent support across all its devices
Apple is building Model Context Protocol (MCP) support to enable AI agents that can perform complex tasks across Mac, iPhone, and iPad, marking the company’s first major push into autonomous AI assistants. This would allow AI to independently manage files, apps, and system functions rather than just responding to prompts, potentially transforming how users interact with Apple devices and positioning the company to compete with Google and Microsoft’s agent strategies.
🚨 Apple working on MCP support to enable agentic AI on Mac, iPhone, and iPad https://9to5mac.com/2025/09/22/macos-tahoe-26-1-beta-1-mcp-integration/
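MCP itself is an open protocol that originated at Anthropic, and the official Python SDK makes the server side easy to picture. Below is a minimal sketch: the “notes” server and its single tool are invented for illustration, but `FastMCP` is, as I understand it, the real SDK entry point, and any MCP-capable host, whether Claude, an IDE, or eventually Apple’s system integration, could discover and call it.

```python
# Minimal sketch of an MCP server using the official Python SDK (`pip install mcp`).
# The "notes" server and its tool are invented for illustration.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("notes")


@mcp.tool()
def add_note(title: str, body: str) -> str:
    """Store a note and return a short confirmation."""
    # A real server would persist the note somewhere; this one just echoes it.
    return f"Saved note '{title}' ({len(body)} characters)"


if __name__ == "__main__":
    mcp.run()  # serves the tool over stdio so a host app can discover and call it
```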
Claude now creates actual spreadsheets, documents and presentations instead of just text
Anthropic’s AI assistant can now generate ready-to-use Excel files, PowerPoint decks, and PDFs by running code in a private computer environment, transforming complex data analysis and document creation from hours of manual work into simple conversations. This marks a shift from AI as advisor to active collaborator, though the feature requires internet access and comes with data security warnings for enterprise users.
Claude can now create and use files \ Anthropic https://www.anthropic.com/news/create-files
Claude AI successfully replicates published economics research from scratch
Anthropic’s latest model can now independently reproduce academic studies by analyzing raw data and papers, marking a shift from AI as a writing tool to AI as a research collaborator. This capability suggests AI agents are moving beyond content generation to performing substantive analytical work, though questions remain about verification, attribution, and the broader implications for academic research workflows.
AI agents are now capable of doing real, if bounded, work. But that work can be very valuable. For example, the new Claude Sonnet 4.5 was able to replicate published economics research from data files & the paper. We need to figure out what to do with it: https://x.com/emollick/status/1972737754363752557
Anthropic releases Claude Agent SDK for building computer-using AI agents
Anthropic renamed its Claude Code SDK to Claude Agent SDK, expanding beyond coding to enable agents that use computers like humans do—running commands, editing files, and accessing external services. The SDK powers agents for finance, research, and customer support by giving Claude the same tools programmers use daily, with built-in features for context management and self-verification. Early deployments show agents successfully handling complex workflows from portfolio analysis to deep document research.
Building agents with the Claude Agent SDK \ Anthropic https://www.anthropic.com/engineering/building-agents-with-the-claude-agent-sdk
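For a sense of what driving an agent looks like from the Python side, here is a minimal sketch assuming the `claude-agent-sdk` package and an ANTHROPIC_API_KEY in the environment. The prompt and option values are placeholders, and the exact option names should be checked against Anthropic’s current docs.

```python
# Minimal sketch, assuming the Python claude-agent-sdk package and an
# ANTHROPIC_API_KEY in the environment. Option names should be verified
# against Anthropic's current documentation.
import asyncio

from claude_agent_sdk import ClaudeAgentOptions, query


async def main() -> None:
    options = ClaudeAgentOptions(
        system_prompt="You are a careful research assistant.",
        max_turns=3,  # keep the agent loop bounded for a demo
    )
    # query() runs the agent loop (tool calls, file reads, edits) and streams
    # back messages as they are produced.
    async for message in query(prompt="Summarize the files in ./reports", options=options):
        print(message)


if __name__ == "__main__":
    asyncio.run(main())
```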
Anthropic launches checkpoints to let Claude Code run complex tasks autonomously
Anthropic added checkpoints that automatically save code states before changes, letting developers safely delegate complex, long-running programming tasks to Claude Code while maintaining rollback control. The feature works with new subagents and background processes that can handle parallel development workflows, marking a shift toward AI systems that can work independently on sophisticated coding projects for extended periods.
Enabling Claude Code to work more autonomously \ Anthropic https://www.anthropic.com/news/enabling-claude-code-to-work-more-autonomously
Claude Sonnet 4.5 runs autonomously for 30+ hours of coding?! The record for GPT-5-Codex was just 7 hours. What’s Anthropic’s secret sauce? https://x.com/Yuchenj_UW/status/1972708720527425966
Anthropic’s Sonnet 4.5 delivers major coding breakthrough, outperforms GPT-5
Early testing shows Sonnet 4.5 makes AI coding assistant Devin twice as fast and 12% more accurate, representing the biggest leap in AI programming capabilities since Claude 3.5’s release and potentially giving Anthropic a significant edge over OpenAI in the lucrative developer tools market.
Sonnet 4.5 crushing GPT-5 high on ARC-AGI 2 https://x.com/scaling01/status/1973081750189334587
Sonnet 4.5 is the most important coding model release in a while. From our early-access evals, we estimate it’s roughly the same jump in capabilities between Claude 3.5 and 4. As a result, Devin is >2x faster and 12% better on our internal benchmarks.” / X https://x.com/russelljkaplan/status/1972725070083838250
Claude’s new coding feature deploys specialized AI subagents for different tasks
Anthropic’s system assigns separate AI agents to debug, test, and refine code sequentially, creating a collaborative workflow that mimics how human development teams divide responsibilities. This represents a shift from single AI assistants to coordinated multi-agent systems that could improve software quality through specialized expertise.
Subagents in Claude Code work like a coordinated team: one debugs, another tests, another refines. Each becomes an expert at its task, working in sequence to solve the problem at hand. https://x.com/claudeai/status/1971666134492696749
Anthropic tests desktop-style interface where Claude controls entire UI
Anthropic is experimenting with “Imagine,” a demo feature that lets Claude generate and manipulate complete desktop-style interfaces with windows, icons, and apps in real-time. Unlike typical AI text generation, this system gives Claude direct control over the user interface itself, potentially representing a shift toward AI agents that can build and manage entire software environments. The feature uses system prompts to guide Claude in rendering UI elements and managing window-based applications, though it’s currently limited to certain user plans and may only be temporarily available.
Anthropic experiments with real-time UI generation on Claude https://www.testingcatalog.com/anthropic-experiments-with-an-agent-for-gereating-ui-on-the-fly/
Microsoft 365 Copilot launches autonomous agents to handle workplace tasks
Microsoft’s new Agent Mode and Office Agent can independently manage emails, schedule meetings, and complete routine office work without constant human oversight, marking a shift from AI assistants that merely respond to prompts toward systems that proactively handle entire workflows. This represents one of the first mainstream deployments of autonomous AI agents in corporate software, potentially reshaping how millions of office workers interact with productivity tools.
Vibe working: Introducing Agent Mode and Office Agent in Microsoft 365 Copilot | Microsoft 365 Blog https://www.microsoft.com/en-us/microsoft-365/blog/2025/09/29/vibe-working-introducing-agent-mode-and-office-agent-in-microsoft-365-copilot/
AI agents evolved from broken to broadly useful in under a year
The rapid progression from non-functional AI agents to capable general-purpose assistants has caught most observers off guard, with agents now successfully handling diverse tasks across research, coding, and everyday workflows. This represents a fundamental shift from narrow AI tools to more autonomous systems that can independently complete multi-step objectives. The speed of this transformation—compressed into less than 12 months—suggests AI capabilities are advancing faster than public awareness can track.
The jump from “agents are nowhere close to working” to “okay, narrow agents for research and coding work pretty well” to (very recently) “general purpose agents are actually useful for a range of tasks” has been quick enough (less than a year) so that most people have missed it.” / X https://x.com/emollick/status/1972141975458796020
OpenAI launches Sora 2 video app with social features
OpenAI released Sora 2, combining an advanced video generation model with a social app that lets users create, share and remix videos while inserting themselves through “cameo” features. The launch represents OpenAI’s “ChatGPT for creativity” moment, with the app hitting #1 in app stores and demonstrating unexpectedly strong capabilities like solving logic problems and rendering HTML despite being designed for video. Early tests show Sora 2 can generate highly realistic content across diverse styles while maintaining character consistency, though concerns remain about potential misuse and social media addiction.
Sora 2 – Sam Altman https://blog.samaltman.com/sora-2
Sora 2 is here | OpenAI https://openai.com/index/sora-2/
This is the Sora app, powered by Sora 2. Inside the app, you can create, remix, and bring yourself or your friends into the scene through cameos—all within a customizable feed designed just for Sora videos. See inside the Sora app👇 https://x.com/OpenAI/status/1973087446469406732
We are launching a new app called Sora. This is a combination of a new model called Sora 2, and a new product that makes it easy to create, share, and view videos. This feels to many of us like the “ChatGPT for creativity” moment, and it feels fun and new. There is something” / X https://x.com/sama/status/1973073987023352250
sora is number 1 in the app store! it’s been epic to see what the collective creativity of humanity is capable of so far. team is iterating fast and listening to feedback. feel free to drop us feature requests! (we’re sending more invite codes soon, i promise!) https://x.com/billpeeb/status/1974035563482116571
No wonder Snap’s stock dropped 7% yesterday. OpenAI did something with Sora 2 nobody else could by nailing three things at once: 1. Consumer first interface (like Meta Vibes attempted) 2. Production grade output (like Google Veo 3 has) 3. Social collaboration built in (made to” / X https://x.com/bilawalsidhu/status/1973406327058661815
How can Sora solve these questions, despite being a video model? One explanation: Sora users’ prompts might be rewritten by an LLM before video generation. In that case, the LLM layer might first solve the problem, then include the solution explicitly in the rewritten prompt.” / X https://x.com/EpochAIResearch/status/1974172889676177682
Physics with Sora 2 …and some anime. https://x.com/OpenAI/status/1973143639200243959
so.. apparently sora 2 is also a browser it’s wild to see what types of capabilities emerge in the model this is sora 2 rendering pasted html (actual browser-rendered html on the right) https://x.com/jesperengelen/status/1973147038499086523
Sora 2 can solve questions from LLM benchmarks, despite being a video model. We tested Sora 2 on a small subset of GPQA questions, and it scored 55%, compared to GPT-5’s score of 72%. https://x.com/EpochAIResearch/status/1974172794012459296
Sora 2 delivers. It nails accents, aesthetics, and actually has comedic timing. My initial tests making AI videos across a variety of styles 👇 https://x.com/bilawalsidhu/status/1973151157137842416
Sora 2 is here. https://x.com/OpenAI/status/1973075422058623274
Sound on. https://x.com/OpenAI/status/1973071069016641829
This is legitimately mind-blowing… How the FUCK does Sora 2 have such a perfect memory of this Cyberpunk side mission that it knows the map location, biome/terrain, vehicle design, voices, and even the name of the gang you’re fighting for, all without being prompted for any of https://x.com/elder_plinius/status/1973124528680345871
TODAY WE LAUNCH SORA 2, THE WORLDS BEST VIDEO GENERATION MODEL feature you and your friends with raw real world physics, putting an end to the uncanny ai vibes let me show you how insane our model is, featuring me & sam altman: https://x.com/GabrielPeterss4/status/1973071380842229781
I have been warning about this for a couple years (the post below is from February 2023), but you really cannot trust any image or video you see online. It isn’t just Sora 2, it is a host of tools (many open source) that make cloning voice & images easy. https://x.com/emollick/status/1973461311649718302
I tried this test with Sora 2. It fails and the output looks a lot more fake than Veo 3. But interestingly, the audio output that narrates the scene gets the explanation right. https://x.com/fofrAI/status/1973745038195830891
My favorite trend in the Sora app is these body cam footage videos This clip with Spongebob hit 1M+ TikTok views! 🤯 I built a workflow to remove the Sora 2 watermarks👇 https://x.com/angrypenguinPNG/status/1974144279955325191
My test of any new AI video model is whether it make an otter using wifi on an airplane Here is Sora 2 doing a nature documentary… 80s music video… a thriller… 50s low budget SciFi film… a safety video.. film noir… anime… 90s video game cutscene… French arthouse https://x.com/emollick/status/1973220923810652523
Obsessed with Rick and Morty explaining 3D Gaussian Splatting. Sora 2 nails it – and yes they really are training on everything by default. https://x.com/bilawalsidhu/status/1973451863442989155
The voice cloning quality in the Sora 2 app is REALLY impressive. Wonder if this is the same tech behind “Voice Engine” which OpenAI never released because they were worried about just how good it was. https://x.com/bilawalsidhu/status/1973229885742051465
Sam Altman’s playbook is clear: When a model becomes usable, OpenAI rushes to turn it into a blockbuster app: ChatGPT – chat app Codex – coding app Sora – video app Raw model power isn’t a strong moat in a crowded field (Google, xAI, Anthropic). But once an app embeds into” / X https://x.com/Yuchenj_UW/status/1973435314195800392
Sora 2 is out! I’m incredibly impressed that despite Sora 1 setting a ridiculous bar — as the largest jump in video capability basically ever — Sora 2 somehow matches up here. For people that played around a lot with Sora 1, it didn’t feel clear that Sora really ‘understood’” / X https://x.com/willdepue/status/1973089331284681110
Sora 2 Pro is now rolling out so here’s one more vid. 15 seconds ( max length ) at high quality. Very nice. https://x.com/apples_jimmy/status/1973979773354586379
Sora-2 counting the R’s https://x.com/scaling01/status/1973414141370179876
Sora2 from @openai released just 2 days ago, and people are creating and remixing incredible things. The top video with 2.2K likes is a selfie video of Jesus during the last supper, but it only starts there. Sora allows folks to remix any video by adding their own prompt. Here https://x.com/altryne/status/1973568567489798144
The labs learned from the Studio Ghibli thing that images & video could produce viral moments that turn into user gain. The Sora 2 launch is the ultimate implementation of this: gated invites, an app that selects for virality, reasons to share with friends, provocative content” / X https://x.com/emollick/status/1973424720549929054
OpenAI prepares TikTok-like app for AI-generated videos only
The standalone Sora 2 app lets users create 10-second AI videos with swipe-to-scroll feeds and remix features, marking the first major social platform built entirely around AI content. Internal testing shows such high employee engagement that managers worry about productivity impacts, suggesting AI video creation could become as transformative as ChatGPT was for text generation.
OpenAI Is Preparing to Launch a Social App for AI-Generated Videos | WIRED https://www.wired.com/story/openai-launches-sora-2-tiktok-like-app/
Sora 2 is cooming “OPENAI IS PREPARING to launch a stand-alone app for its video generation AI model Sora 2, WIRED has learned. The app, which features a vertical video feed with swipe-to-scroll navigation, appears to closely resemble TikTok except it’s AI-generated.” https://x.com/apples_jimmy/status/1972756684297978256
Chinese AI video generator Kling 2.5 Turbo tops independent benchmarks
Kuaishou’s latest model outranked Google’s Veo 3 and other major competitors in both text-to-video and image-to-video generation on Artificial Analysis rankings. The system demonstrates seamless frame matching that makes video splices virtually undetectable, marking a notable shift as Chinese AI companies increasingly challenge Western dominance in creative AI tools.
Kling 2.5 Turbo takes the top spot in both Text to Video and Image to Video in the Artificial Analysis Video Arena, surpassing Hailuo 02 Pro, Google’s Veo 3, and Luma Labs’ Ray 3! Kling 2.5 Turbo is the latest release from @Kling_ai , representing a significant leap from Kling https://x.com/ArtificialAnlys/status/1973570493753204953
Kling 2.5 has perfect frame matching‼️ Enigmatic E explains the process behind the chaotic snow man music video and even points out where the splices are (since you actually can’t tell!) You can make this too using our Infinite Kling 2.5 Agent! Link below 👇 https://x.com/heyglif/status/1974195300240957445
@ArtificialAnlys Excited to announce that our 2.5 Turbo (1080p) model takes the top spot in both Text to Video and Image to Video in the Artificial Analysis Video Arena!” / X https://x.com/Kling_ai/status/1973581864679121374
Google’s Veo 3 enables precise video ads using JSON prompts
Google’s latest AI video generator Veo 3 now supports vertical 9:16 format and structured JSON prompting, allowing marketers to create hyper-realistic user-generated content for social media ads. The JSON technique transforms generic prompts into “surgical-precision” video generation with consistent branding across multiple clips. Early users are creating everything from product commercials to cinematic sequences by specifying camera angles, transitions, and visual effects through code-like instructions.
VEO 3 + JSON prompting is pretty wild 🤯 These AG1 product videos were created entirely with AI. And it’s all about the prompt. This JSON prompting technique will take any generic VEO3 prompt.. And turn it into surgical-precision video generation with brand consistency. https://x.com/mikefutia/status/1951282585235066933
Veo3 text animations are so satisfying All 7 examples created on @LeonardoAi_ in 1080p with prompts included. Bookmark & simply change the title word to reuse 1) Autumn Wind Whirl: 🧵 {"sequences":[ { "start_sec":0, "end_sec":3, "narrative":"Wind gusts,whirl ‘VEO3’ in https://x.com/Mr_AllenT/status/1950900824902631471
Google VEO 3 just announced 9:16 videos and it’s absolutely WILD 🤯 This AI system creates hyper-realistic UGC-style ads in vertical format. And generates multiple clips that look like real creator content. Perfect for e-comm operators & ad agencies running Facebook/TikTok https://x.com/mikefutia/status/1970899751613636610
(3) Examples of Diagram-to-Vid with Veo3 It’s just too much fun. Examples: 1. Subj Action + Camera Motion 2. Motion Brush effect 3. Sequencing Prompt: High-intensity action scene. Motion 1: camera pulling back slightly as Car speeds towards us. Motion 2: Camera pushes in and https://x.com/Ror_Fly/status/1950352402416115788
🚨PromptShare🚨 Fun JSON prompt to try with Veo3 in #AdobeFirefly 🔥 { "shot": { "composition": "tight ground-level tracking shot behind a hamster sprinting through grand museum corridors", "lens": "14mm wide low-angle chase lens", "frame_rate": "120fps for foot https://x.com/CharaspowerAI/status/1950595855569813851
JSON Prompt share for Veo3 { "shot": { "composition": "starts in wide landscape with a centered runner, transitions to low-angle close tracking of feet, ends with high-angle logo reveal that morphs into product", "lens": "anamorphic for wide kinetic sweep, 50mm for https://x.com/azed_ai/status/1954218180593008836
The making of a Sandwich Tornado. With Weavy and Veo3. PROCESS: 01. Generate individual ingredients 02. Compile in Weavy 03. Build Diagram 04. Generate in Veo3 w. Flow (More details in thread) #veo3 #promptshare https://x.com/Ror_Fly/status/1951018602892845454
These Veo3 box explosion videos are going viral right now🤯 One second it’s just a box… the next it’s a full cinematic setup. Here’s how to make them using BasedLabs AI:👇🏻 https://x.com/BasedLabsAI/status/1949515562192613768
Veo3 Fast on @BasedLabsAI POV: Volcano eruption🌋 { "scene": "A first-person POV of someone hiking along a mountain ridge with a dormant volcano in view. The landscape is rugged with patches of grass, rocks, and distant peaks. Suddenly, the volcano rumbles and begins to https://x.com/TechieBySA/status/1951717201389756595
Tea ad Concept using Veo3 { "shot": { "composition": "starts in ultra-wide shot of scattered leaves in a field, transitions to mid-air swirl, ends in centered product close-up of the formed tea box suspended in air", "lens": "telephoto for distant field details, https://x.com/azed_ai/status/1952401988093988874
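Since the posts above are all truncated, here is a cleaned-up example of the structure people are using, written as a Python dict and serialized to JSON. The field names are conventions borrowed from these prompt shares, not an official Veo 3 schema; the serialized string is what gets pasted into whatever Veo 3 front end you use (Flow, the Gemini API, or a third-party tool) as the text prompt.

```python
# Illustrative JSON-style Veo 3 prompt, modeled on the prompt shares above.
# Field names are community conventions, not an official schema.
import json

prompt = {
    "shot": {
        "composition": "vertical 9:16 UGC-style product demo, handheld feel",
        "lens": "35mm, shallow depth of field",
        "frame_rate": "30fps",
    },
    "sequences": [
        {"start_sec": 0, "end_sec": 3,
         "narrative": "a hand places the product on a kitchen counter"},
        {"start_sec": 3, "end_sec": 8,
         "narrative": "slow push-in while on-screen text lists three benefits"},
    ],
    "branding": {"logo_visible": True, "color_palette": ["#0B3D2E", "#F4F1EA"]},
}

# Paste the serialized string into your Veo 3 tool of choice as the prompt.
print(json.dumps(prompt, indent=2))
```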
Google launches AI agents that book reservations and tickets automatically
Google’s new AI Mode can now search multiple websites simultaneously to find and book restaurant reservations, event tickets, and appointments based on conversational requests like “find dinner for 3 Friday after 6pm, craving ramen.” This represents a shift from AI providing information to actually taking actions across the web on users’ behalf. The feature is currently available to U.S. users in Google’s experimental Search Labs program.
Agentic capabilities in AI Mode – Search Labs https://labs.google.com/search/experiment/43
AI Mode in Google Search updates: Visual exploration and discovery https://blog.google/products/search/search-ai-updates-september-2025/
Excited to be expanding access for this. Agentic capabilities in AI Mode for finding restaurant reservations are now available to all users opted into Labs in the U.S. Try it for yourself today → https://labs.google.com/search/experiment/43
Comet AI assistant launches globally after 84-day waitlist period
Perplexity opened its Comet assistant to everyone after millions joined the waitlist during the 84-day gated period. The company pitches Comet as a powerful personal AI assistant and a new way to use the internet, though the announcement leaves its specific capabilities and differentiation from existing AI assistants unclear.
Comet is now available to everyone in the world. In the last 84 days, millions have joined the Comet waitlist looking for a powerful personal AI assistant and new ways to use the internet. The internet is better on Comet. https://x.com/perplexity_ai/status/1973795224960032857
Major news publishers partner with Perplexity for AI-readable subscriptions
The Washington Post, CNN, and other major outlets are launching “Comet Plus,” a subscription service designed for both humans and AI systems to access news content. This represents a significant shift toward publishers directly monetizing AI training and usage rather than fighting it, with Perplexity offering the service to its premium subscribers as a key distribution channel.
Update on Comet Plus (a new subscription plan meant for both humans and AIs to consume news): Washington Post, CNN, Conde Nast (New Yorker, Wired, Vogue), Fortune, Le Monde and others have come on board to be our launch partners. Perplexity Pro/Max users will get Comet Plus https://x.com/AravSrinivas/status/1973804332039786608
Perplexity launches search API to compete directly with Google’s infrastructure
The AI search company is positioning its new developer API as the strongest alternative to Google’s search dominance, while also planning a browsing API. Early developer feedback suggests smooth integration capabilities, marking Perplexity’s push beyond consumer search into the enterprise API market that powers countless applications.
We will be offering a browsing API too. Stay tuned. Perplexity infrastructure on search and browsing will be second to none other than Google for a while, and long term, the single best.” / X https://x.com/AravSrinivas/status/1971443978810896424
the perplexity search api looks very good we’re going to integrate it properly so it just works with opencode but check it out – can quickly implement it as a custom tool https://x.com/thdxr/status/1971510163501953436
this is a billion dollar lesson in design. i bet whoever led Comet has got an extraordinary product sense. It is such an incredible product and ux, not because it’s new—because it isn’t. User ease into it, knows how to navigate around, with some AI integration and not intrusive” / X https://x.com/felixleezd/status/1973942012278935631
Cursor AI coding assistant adds browser control capabilities
The popular AI programming tool can now take screenshots, modify user interfaces, and debug web applications directly in browsers, expanding beyond code editing to full web development workflow automation.
Cursor can now control your browser. Agent can take screenshots, improve UI, and debug client issues. Try our early preview with Sonnet 4.5. https://x.com/cursor_ai/status/1972778817854067188
Anthropic launches Claude browser extension with prompt injection defenses
Anthropic released Claude for Chrome to 1,000 beta testers, allowing the AI to click buttons and fill forms directly in browsers—a major leap toward AI agents handling everyday web tasks. The company reduced prompt injection attack success rates from 23.6% to 11.2% through new safety measures, after discovering vulnerabilities where malicious websites could trick Claude into deleting emails or accessing sensitive data without user consent. This controlled rollout addresses critical security gaps before browser-based AI becomes mainstream, as similar tools from other companies already emerge without comparable safety testing.
Piloting Claude for Chrome \ Anthropic https://www.anthropic.com/news/claude-for-chrome
Cloudflare launches AI Index to help websites monetize AI access
Cloudflare’s new AI Index automatically creates searchable databases of website content that owners can control and monetize when AI companies access their data. The service shifts from traditional web crawling to a subscription model where AI builders pay content creators directly for structured, real-time updates. This addresses the growing tension between websites losing traffic to AI chatbots and AI companies needing quality training data, potentially creating the first major alternative to free web scraping.
An AI Index for all our customers https://blog.cloudflare.com/an-ai-index-for-all-our-customers/
Reinforcement learning pioneer calls large language models a dead end
Richard Sutton, winner of the 2024 Turing Award and father of reinforcement learning, argues that LLMs fundamentally lack the ability to learn continuously from experience like humans and animals do. He believes current AI systems only mimic human responses rather than truly understanding the world, and predicts that future AI will require entirely new architectures capable of on-the-fly learning without separate training phases. This represents a sharp departure from the current scaling-focused approach dominating AI development.
.@RichardSSutton, father of reinforcement learning, doesn’t think LLMs are bitter-lesson-pilled. My steel man of Richard’s position: we need some new architecture to enable continual (on-the-job) learning. And if we have continual learning, we don’t need a special training https://x.com/dwarkesh_sp/status/1971606180553183379
Richard Sutton – Father of RL thinks LLMs are a dead end https://www.dwarkesh.com/p/richard-sutton
AI pioneer challenges current language model approach as fundamentally flawed
Richard Sutton, author of the influential “Bitter Lesson” essay, argues that today’s large language models violate his core principles by relying on human-generated training data rather than learning through direct world interaction like animals do. Andrej Karpathy responds that while current LLMs are “ghosts” – statistical distillations of human knowledge – rather than “animals” that learn from experience, this approach may be a practical necessity given computational constraints. The debate highlights a fundamental tension in AI development between mimicking biological learning versus leveraging available human data.
Animals vs Ghosts | karpathy https://karpathy.bearblog.dev/animals-vs-ghosts/
Finally had a chance to listen through this pod with Sutton, which was interesting and amusing. As background, Sutton’s “The Bitter Lesson” has become a bit of biblical text in frontier LLM circles. Researchers routinely talk about and ask whether this or that approach or idea” / X https://x.com/karpathy/status/1973435013875314729
OpenAI launches first AI safety alerts for teen self-harm risks
OpenAI is rolling out parental controls for ChatGPT that include unprecedented safety notifications to alert parents when their teenager may be at risk of self-harm. This marks the first time an AI company has built automated mental health monitoring into their consumer product, potentially setting a new standard for AI safety as these tools become more prevalent among young users.
Introducing parental controls | OpenAI https://openai.com/index/introducing-parental-controls/
We’re beginning to roll out parental controls in ChatGPT, including the first-of-its-kind safety notification system to alert parents if their teen may be at risk of self-harm. Read more here: https://x.com/fidjissimo/status/1972602249907146967
New college graduates use ChatGPT to solve problems independently instead of asking supervisors
This shift toward AI-assisted self-reliance represents a fundamental change in workplace learning culture, with managers reporting that recent hires attempt solutions using ChatGPT before seeking human help. The trend suggests AI tools are fostering greater initiative and problem-solving confidence among young workers entering the job market.
My favorite thing about the new college grads that I’ve hired is that they don’t ask you how to do stuff They just put it in ChatGPT and fucking try even if wrong So much better than new grads asking you how to do shit without trying Chat is breeding agency into kids” / X https://x.com/dylan522p/status/1971425552902082941
OpenAI launches social platform allowing users to create with others’ likenesses
The company’s new experiment lets people opt into having their digital identity used in others’ AI-generated content, with notification systems and anti-impersonation safeguards that distinguish it from typical deepfake concerns. This represents a novel approach to consensual digital identity sharing, potentially reshaping how we think about personal likeness rights in AI-generated social media.
OpenAI’s social creation & consumption experiment is here. Unique handling of identity – if you opt-in others can use your likeness in their creations. You’re notified even if it’s used in a draft post & “liveness checks” are done to prevent impersonation. https://x.com/bilawalsidhu/status/1973103500511871277
OpenAI launches TikTok-style social app amid internal researcher concerns
OpenAI released Sora, an AI-generated video social feed similar to TikTok, sparking worry among current and former researchers who question whether the addictive social media format aligns with the company’s nonprofit mission to benefit humanity. The launch represents OpenAI’s biggest expansion beyond ChatGPT into consumer social media, where engagement-driven algorithms have historically created harmful societal effects. Internal critics fear the company is prioritizing revenue over its stated safety mission as it transitions to for-profit status.
OpenAI is building a social network | The Verge https://www.theverge.com/openai/648130/openai-social-network-x-competitor
OpenAI staff grapples with the company’s social media push | TechCrunch https://techcrunch.com/2025/10/01/openai-staff-grapples-with-the-companys-social-media-push/
Anthropic releases Claude Sonnet 4.5 with enhanced cybersecurity capabilities
Anthropic deliberately trained Claude Sonnet 4.5 to excel at defensive cybersecurity tasks, achieving 76.5% success on complex security challenges and outperforming human teams in some competitions. The company warns that AI has reached an “inflection point” in cybersecurity, with attackers already using AI to scale operations like data extortion schemes that previously required entire teams, making defensive AI adoption urgent for organizations.
AI for Cyber Defenders \ red.anthropic.com https://red.anthropic.com/2025/ai-for-cyber-defenders/
We’re at an inflection point in AI’s impact on cybersecurity. Claude now outperforms human teams in some cybersecurity competitions, and helps teams discover and fix code vulnerabilities. At the same time, attackers are using AI to expand their operations. https://www.anthropic.com/research/building-ai-cyber-defenders
maybe the most impressive part from Sonnet 4.5 alignment information. Not only can it push back, but it has a sophisticated theory of user’s mind. Other models also can speculate about the user’s play (DS does that a lot) but aren’t trained to treat it as actionable info. https://x.com/teortaxesTex/status/1973264029599842380
YouTube launches Labs program to test AI experiments with users
YouTube introduced Labs, a new initiative letting US users test AI prototypes including music hosts that provide commentary and trivia, marking the platform’s biggest shift toward AI-generated content on its 20th anniversary. This matters because YouTube is positioning AI as its next major evolution, potentially transforming how videos are created from camera-based to prompt-based production. The company plans to integrate Google’s Veo 3 technology for instant video creation, though critics worry about authentic content being overwhelmed by AI-generated material.
Introducing YouTube Labs: Shape the future of AI on YouTube https://blog.youtube/news-and-events/introducing-youtube-labs/
YouTube Thinks AI Is Its Next Big Bang | WIRED https://www.wired.com/story/youtube-thinks-ai-is-its-next-big-bang/
California becomes first state to regulate frontier AI development with safety requirements
Governor Newsom signed SB 53, requiring large AI companies to publicly disclose safety frameworks, report critical incidents, and protect whistleblowers—marking the first state-level regulation of advanced AI systems. The law fills a federal policy gap while California hosts 32 of the world’s top 50 AI companies and received over half of global AI venture funding in 2024. The legislation balances innovation with accountability by creating transparency requirements without restricting AI development itself.
Governor Newsom signs SB 53, advancing California’s world-leading artificial intelligence industry | Governor of California https://www.gov.ca.gov/2025/09/29/governor-newsom-signs-sb-53-advancing-californias-world-leading-artificial-intelligence-industry/
Anthropic hires Stripe’s former CTO as new technology chief
Rahul Patil joins the Claude AI maker as CTO while co-founder Sam McCandlish shifts to Chief Architect, signaling the company’s push to scale operations beyond research. The move brings proven enterprise infrastructure experience to one of OpenAI’s main competitors as it faces growing demand for its AI assistant.
🚨 Anthropic has a new CTO: Rahul Patil, the former CTO of Stripe. Patil started at the company earlier this week, taking over from co-founder Sam McCandlish, who will move to a new role as Chief Architect. Read more in @TechCrunch from @russellbrandom https://x.com/zeffmax/status/1973833211835974046
Wall Street recruits hundreds of engineers to build AI trading systems
Major financial firms are rapidly developing large language models specifically for high-frequency trading and investment decisions, signaling a fundamental shift toward AI-driven market operations. This represents a departure from general-purpose AI tools toward specialized financial models that could reshape how markets function and who has competitive advantages in trading.
Wall Street invited 300+ NYC engineers to builds LLMs for HFT and investing Trading is about to permanently change Here are the top 6 demos from the @Cerebral_Valley AI Fintech Hackathon (🧵): https://x.com/AlexReibman/status/1969847901737422955
CoreWeave lands massive $14 billion cloud infrastructure deal with Meta
The AI infrastructure provider’s stock jumped 12% after securing the contract, adding to its recent $6.5 billion OpenAI expansion and highlighting how tech giants are spending enormous sums to secure computing power for AI development. CoreWeave specializes in renting out data centers packed with Nvidia chips essential for training AI models, positioning it as a critical middleman in the AI boom.
CoreWeave stock climbs 12% after $14 billion deal with Meta https://www.cnbc.com/2025/09/30/coreweave-meta-deal-ai.html
Meta hires OpenAI’s Yang Song to lead superintelligence research lab
Meta CEO Mark Zuckerberg recruited Yang Song, who led OpenAI’s strategic explorations team, as research principal of Meta Superintelligence Labs after a summer hiring spree that brought 11 top researchers from rival AI companies. Song’s expertise in processing complex datasets across different formats and his role developing techniques behind OpenAI’s DALL-E 2 image generator signals Meta’s aggressive push to compete in the AI arms race. The move follows Meta’s retention of another OpenAI veteran who threatened to return to his former employer.
Meta Poaches OpenAI Scientist to Help Lead AI Lab | WIRED https://www.wired.com/story/meta-poaches-openai-researcher-yang-song/
OpenAI raises funds at unprecedented $500 billion company valuation
The AI company behind ChatGPT completed a share sale that values it higher than most Fortune 500 companies, reflecting investor confidence that generative AI will transform multiple industries. This valuation represents a dramatic jump from OpenAI’s roughly $86 billion tender valuation in early 2024, signaling that private markets believe AI capabilities are accelerating faster than initially expected.
OpenAI Completes Share Sale at Record $500 Billion Valuation – Bloomberg https://www.bloomberg.com/news/articles/2025-10-02/openai-completes-share-sale-at-record-500-billion-valuation?srnd=phx-ai
OpenAI hires executive to bring advertising to ChatGPT platform
New CEO of Applications Fidji Simo is recruiting candidates to lead a monetization team that will introduce ads to ChatGPT, marking OpenAI’s first major push beyond subscription revenue. This signals the company’s shift toward traditional tech business models as it seeks billions in profit to fund its expensive AI operations. The hire will oversee all revenue streams and report directly to Simo, who joined from Instacart last month.
OpenAI is looking for a head of ads for ChatGPT https://sources.news/p/openai-ads-leader-sam-altman-memo-stargate
OpenAI burns $2.5 billion despite $4.3 billion in first-half revenue
The AI leader’s massive cash burn rate of nearly 60% of revenue highlights the enormous costs of training and running advanced AI models, raising questions about the sustainability of current AI business models even as demand soars.
OpenAI’s First Half Results: $4.3 Billion in Sales, $2.5 Billion Cash Burn — The Information https://www.theinformation.com/articles/openais-first-half-results-4-3-billion-sales-2-5-billion-cash-burn
OpenAI’s Stargate project will consume 40% of global memory chip production
Samsung and SK Hynix agreed to supply up to 900,000 memory wafers monthly to OpenAI’s massive Stargate data center initiative, representing as much as 40% of global DRAM output. This unprecedented demand highlights how AI infrastructure is reshaping entire semiconductor supply chains, with memory prices already surging up to 60% as companies scramble to secure chips for AI processing. The scale demonstrates that building advanced AI systems now requires industrial-level resource commitments that dwarf traditional computing needs.
OpenAI’s Stargate project to consume up to 40% of global DRAM output — inks deal with Samsung and SK hynix to the tune of up to 900,000 wafers per month | Tom’s Hardware https://www.tomshardware.com/pc-components/dram/openais-stargate-project-to-consume-up-to-40-percent-of-global-dram-output-inks-deal-with-samsung-and-sk-hynix-to-the-tune-of-up-to-900-000-wafers-per-month
Microsoft spends $33 billion renting external GPU farms for internal AI work
Microsoft is outsourcing its own AI development to third-party data centers while reserving its facilities for paying customers, securing 100,000 Nvidia GB300 chips through a $19.4 billion deal with Nebius alone. This unusual arrangement lets Microsoft monetize its infrastructure investments while still accessing cutting-edge hardware for internal teams building Copilot and language models. The strategy highlights how even tech giants with massive data centers are struggling to meet AI computing demands internally.
Microsoft inks $33 billion in deals with ‘neoclouds’ like Nebius, CoreWeave — Nebius deal alone secures 100,000 Nvidia GB300 chips for internal use | Tom’s Hardware https://www.tomshardware.com/tech-industry/artificial-intelligence/microsoft-inks-usd33-billion-in-deals-with-neoclouds-like-nebius-coreweave-nebius-deal-alone-secures-100-000-nvidia-gb300-chips-for-internal-use
Cerebras raises $1.1 billion to expand AI chip manufacturing capacity
The AI chip startup secured funding at an $8.1 billion valuation to scale production of processors it claims run 20 times faster than Nvidia GPUs for AI inference tasks. Major customers including AWS, Meta, and government agencies have adopted Cerebras’ technology, with the company now processing trillions of AI tokens monthly. This represents significant competition to Nvidia’s dominance in AI hardware, particularly for real-time applications like code generation where speed is critical.
Cerebras Raises $1.1 Billion at $8.1 Billion Valuation https://www.cerebras.ai/press-release/series-g
Google develops AI textbook that adapts to individual learning styles
Google’s LearnLM team created an AI-powered textbook that personalizes content and pacing for each student, moving beyond static digital texts to interactive, adaptive learning materials. This represents a significant shift from one-size-fits-all educational content toward truly individualized instruction at scale. The research involved over 30 specialists and focuses on making AI tutoring accessible through familiar textbook formats.
Towards an AI-Augmented Textbook https://arxiv.org/pdf/2509.13348
OpenAI creates benchmark measuring AI performance across 44 real-world occupations
OpenAI’s new GDPval benchmark tests AI models on actual work tasks from major economic sectors, finding current models complete these jobs 100 times faster and cheaper than human experts while achieving 77-95% of human-level performance. This represents the first systematic attempt to measure AI’s readiness to replace human workers across the economy, moving beyond academic tests to real workplace capabilities.
Tejal Patwardhan on X: “Understanding the capabilities of AI models is important to me. To forecast how AI models might affect labor, we need methods to measure their real-world work abilities. That’s why we created GDPval. https://t.co/YsQvmdGK94” / X https://x.com/tejalpatwardhan/status/1971249532588741058
We aim to build the most intelligent and useful AI. But “useful” is a fuzzy word. GDPval (consisting of tasks spanning 44 occupations across the top 9 sectors contributing to US GDP) makes “usefulness” more concrete. Would love to saturate this one!” / X https://x.com/markchen90/status/1971449404734439831
The gdpval dataset from @OpenAI is number one trending on @huggingface this week! https://x.com/ClementDelangue/status/1972640079559749632
The most important thing here: the models completed these tasks 100x faster and cheaper than the industry experts” / X https://x.com/scaling01/status/1971431825433374866
https://t.co/321zjlMTmp paper doesn’t mention “AGI” but if you consider that we used to define AGI as “outperform humans at most economically valuable work” then surely GDPVal is the most direct AGI benchmark we have ever had and we are between 77-95% of the way there and should https://x.com/swyx/status/1971427791770882463
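Since the task set is on Hugging Face, browsing it takes only a few lines. A minimal sketch, assuming the repo id is `openai/gdpval` and that tasks carry an occupation column; check the dataset page for the actual name and column layout.

```python
# Minimal sketch for browsing the GDPval tasks on Hugging Face.
# The repo id "openai/gdpval" and the "occupation" column are assumptions;
# confirm both on the dataset page before relying on them.
from datasets import load_dataset

tasks = load_dataset("openai/gdpval", split="train")
print(tasks)     # column names and number of tasks
print(tasks[0])  # one real-world work task

# Rough look at occupational coverage (column name assumed).
print(len(set(tasks["occupation"])), "distinct occupations")
```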
German AI startup Black Forest Labs seeks $4 billion valuation
The image generation company is raising $200-300 million just over a year after launch, highlighting Europe’s push to compete with US AI dominance alongside French rival Mistral, which recently secured a $14 billion valuation. Black Forest represents one of the few European firms developing proprietary AI models rather than relying on American technology.
AI Startup Black Forest Labs Shoots for $4 Billion Valuation https://www.pymnts.com/artificial-intelligence-2/2025/ai-startup-black-forest-labs-shoots-for-4-billion-valuation/
Google launches production-ready Gemini 2.5 Flash Image generation model
Google’s Gemini 2.5 Flash Image is now generally available for developers, offering 10 different aspect ratios, multi-image blending, and sub-10-second generation speeds at $0.039 per image. The model distinguishes itself by maintaining character consistency across different camera angles while preserving world knowledge, addressing key limitations of previous image generation systems. Early adopters are already integrating it into games, creative tools, and interactive applications through Google’s AI Studio and Vertex AI platforms.
🖼️ Nano Banana is generally available and ready for production. Learn how you can build dynamic user experiences with a wider range of aspect ratios, ability to specify image-only output, and more creative control in your app. ↓ https://x.com/googleaidevs/status/1973781293977735435
gemini 2.5 flash image 🍌 is now GA and ready for production updated with support for 10 aspect ratios, multi-image blending and image-only output start building in AI Studio and on the Gemini API https://x.com/GoogleAIStudio/status/1973836478989152700
Developers – the best image editing + generation model is now GA. Go 🍌🍌🍌 on @GoogleAIStudio + Vertex!” / X https://x.com/sundarpichai/status/1973788714758517147
Google launches Gemini 2.5 Flash Image model in GA https://www.testingcatalog.com/google-launches-gemini-2-5-flash-image-model-in-general-availability/
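A minimal sketch of calling the GA model from the google-genai Python SDK. The model id follows the announcement above, but verify it (and the per-image pricing) in AI Studio; the client expects a Gemini API key in the environment, and the response-parsing pattern here reflects my reading of the SDK rather than an official snippet.

```python
# Minimal sketch using the google-genai Python SDK (`pip install google-genai`).
# Model id taken from the GA announcement above; verify it in AI Studio.
from google import genai

client = genai.Client()  # reads the Gemini API key from the environment

response = client.models.generate_content(
    model="gemini-2.5-flash-image",
    contents="A product shot of a ceramic mug on a rainy Brooklyn windowsill",
)

# Generated images come back as inline-data parts alongside any text parts.
for part in response.candidates[0].content.parts:
    if part.inline_data:
        with open("mug.png", "wb") as f:
            f.write(part.inline_data.data)
    elif part.text:
        print(part.text)
```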
Adobe integrates AI image generators natively into Photoshop for first time
Adobe has built Nano Banana and FLUX Kontext directly into Photoshop, marking the first native integration of third-party AI image generators into the industry-standard design software. This eliminates the workflow friction of switching between applications and could accelerate AI adoption among millions of professional designers and photographers who rely on Photoshop daily.
Nano Banana and FLUX Kontext are now natively integrated into Photoshop. I don’t think people understand how big of a deal this is. https://x.com/bilawalsidhu/status/1971266138383450493
Anthropic’s new Claude 4.5 Sonnet is now the #4 most intelligent model, beats 4.1 Opus, and places Anthropic in the top 3 in the race for frontier intelligence Claude 4.5 Sonnet offers a clear upgrade for Claude 4.1 Opus and Claude 4 Sonnet users, with greater intelligence at https://x.com/ArtificialAnlys/status/1972854742167761204
Introducing Claude Sonnet 4.5 \ Anthropic https://www.anthropic.com/news/claude-sonnet-4-5
Claude Sonnet 4.5 knows when it’s being tested https://www.transformernews.ai/p/claude-sonnet-4-5-evaluation-situational-awareness
Anthropic supremacy in the coding category of lmarena Claude 4.5 Sonnet on tied first with Claude 4.1 Opus https://x.com/scaling01/status/1973836516205134135
Anthropic asked 7 researchers to opine on the productivity boost they get from Claude Sonnet 4.5. Results: one “qualitative answer” +15% +20% +20% +30% +40% +100%; respondent indicated that his workflow is “now mainly focused on managing multiple agents” https://x.com/deredleritt3r/status/1972770139297767720
Anthropic released a TON of updates today: • Sonnet 4.5 (with context awareness) • Claude Code 2.0 (+ new mascot 🦀) • Claude API: context editing + memory tool @mikeyk sat down for a special launch day chat about 4.5 and the @AnthropicAI developer roadmap! (and more: • VS https://x.com/latentspacepod/status/1973017487190139140
Claude https://claude.ai/new
Claude Sonnet 4.5 System Card https://assets.anthropic.com/m/12f214efcc2f457a/original/
Apple builds internal ChatGPT clone to test Siri’s AI overhaul
Apple created “Veritas,” an employee-only chatbot that mimics ChatGPT’s interface, to develop and test features for Siri’s long-delayed AI upgrade scheduled for March 2026. The move reveals Apple’s strategy of building internal AI capabilities while still planning to rely on Google’s Gemini for consumer-facing AI search, highlighting the company’s cautious approach compared to competitors who have already released public AI chatbots.
Apple built its own ChatGPT-like app to test out new Siri AI revamp | Mashable https://mashable.com/article/apple-chatgpt-like-app-veritas-siri-ai-voice-assistant
Apple’s ‘Veritas’ chatbot is reportedly an employee-only test of Siri’s AI upgrades | The Verge https://www.theverge.com/news/787046/apples-veritas-siri-ai-chatbot
Chinese AI lab releases GLM-4.6 with 200K token context window
Zhipu AI’s new flagship model extends context from 128K to 200K tokens while using 15% fewer tokens than its predecessor, achieving near-parity with Claude Sonnet 4 in coding tasks. The model prioritizes practical agent applications over benchmark performance, running efficiently on consumer hardware like Apple’s M3 Ultra. Early testing shows particularly strong frontend development capabilities, though some users report potential performance issues in complex scenarios.
Introducing GLM-4.6: Advanced Agentic, Reasoning and Coding Capabilities As our new flagship model, GLM-4.6 brings significant advancements across real-world coding, long-context processing (up to 200K tokens), reasoning, search, writing, and agentic applications. API: https://x.com/Zai_org/status/1973034639708344767
🔎 @Zai_org just released GLM-4.6, and according to Zhihu contributor toyama nao, it’s not chasing GPT-style fireworks — it’s optimizing for real-world agent use. ⚙️ Key Gains (Fig shown below): • Token efficiency = massive win → Reasoning model: 16K → 9K tokens 🔻 https://x.com/ZhihuFrontier/status/1973447038818762841
GLM 4.6 runs quite fast on an M3 Ultra with mlx-lm even at higher precision. Pretty remarkable that it benchmarks competitive to the just-released Sonnet 4.5. Hope those benchmarks hold-up in day-to-day use. Here’s a run using 5.5 bpw quantized model, generating 5.3k tokens at https://x.com/awnihannun/status/1973063906341114327
In real-world development scenarios, GLM-4.6 surpasses GLM-4.5 and reaches near-parity with Claude Sonnet 4, while clearly outperforming other open-source baselines. https://x.com/Zai_org/status/1973034644091392002
👋 The all new GLM 4.6 from @Zai_org is available on OpenRouter. – GLM-4.6 achieves comprehensive enhancements across multiple domains, including real-world coding, long-context processing, reasoning, searching, writing, and agentic applications. – Context length: increased https://x.com/OpenRouterAI/status/1973037695774384352
GLM-4.6 from @Zai_org is available Cline. 200K context (up from 131k), and completes tasks with 15% fewer tokens than GLM-4.5. >48.6% win rate against frontier models, making it one of the most capable open-source models. Available in Cline & the GLM subscription. https://x.com/cline/status/1973099598903386227
GLM-4.6 is out on Hugging Face Longer context window: The context window has been expanded from 128K to 200K tokens, enabling the model to handle more complex agentic tasks Superior coding performance: The model achieves higher scores on code benchmarks and demonstrates better https://x.com/_akhaliq/status/1973068098539593932
GLM-4.6 hands-on test! Is it the trend now to ship a big upgrade and only bump the version number by 0.1? Straight to the conclusion: the frontend ability is very strong, to the point of showing off, but the flip side is possible performance problems in complex scenarios, so watch that the model doesn’t over-engineer, or occasionally review the code design and architecture yourself. Python ability shows little change, though considering there is only one Python test, https://x.com/karminski3/status/1973353334796140716
The model releases continue… we’ve just released support for GLM-4.6 under the @Zai_org provider! This improves upon the popular GLM-4.5 model, extending the context window from 128k to 200k and achieving higher scores on benchmarks. Try it out today! https://x.com/roo_code/status/1973022454298837294
Nvidia quietly becomes America’s top AI open-source contributor with 300 releases
The chip giant published over 300 AI models, datasets and applications on Hugging Face in the past year, positioning itself as a major force in open-source AI development rather than just a hardware vendor. The strategy could help Nvidia keep its influence as the AI ecosystem evolves and reduce the industry’s dependence on competitors’ closed, proprietary systems.
As Jensen mentioned with @altcap @BG2Pod @bgurley, something that few people know is that @nvidia is becoming the American open-source leader in AI, with over 300 contributions of models, datasets and apps on @huggingface in the past year. And I have a feeling they’re just https://x.com/ClementDelangue/status/1971698860146999502
NVIDIA’s robot foundation model performs complex household tasks autonomously
NVIDIA demonstrated its Isaac GR00T N1.6 model successfully executing multi-step commands like fetching healthy snacks from kitchens without human intervention. This marks a significant leap from previous versions that only handled stationary two-handed tasks, showing AI robots can now navigate spaces and make contextual decisions about real-world requests.
"Bring me the healthiest snack." The robot goes to the kitchen and gets the snack, fully autonomously. This is the first public demo of NVIDIA’s Isaac GR00T N1.6 foundation model presented by Yuke Zhu at CoRL 2025. The previous versions focused only on bimanual stationary https://x.com/TheHumanoidHub/status/1972698708975440349
Amazon trains humanoid robots to handle warehouse boxes using motion capture
The company developed a system that converts human movement data into robot training material, generating 8 hours of robot trajectories from single demonstrations. This represents a practical step toward deploying humanoids in logistics, moving beyond research prototypes to address real warehouse automation needs.
Amazon is training humanoids to move boxes. Makes sense! OmniRetarget is a data generation engine that enables complex loco-manipulation for humanoids. It uses offline retargeting from human MoCap datasets and augments data from single demos to produce 8 hours of trajectories https://x.com/TheHumanoidHub/status/1973489480813388240
Figure founder Brett Adcock assembles team to raise tens of billions for AI and robotics
Adcock announced he is forming a dedicated Capital Formation team, reporting directly to him, to secure massive funding for bringing “sci-fi into the present,” building on the roughly $4 billion his companies have raised over the past five years. The move signals an intention to dramatically scale his AI and robotics ambitions, and the insistence on in-person candidates and a direct reporting line suggests he sees this fundraising push as central to that strategy.
My companies have raised $4B the last 5 yrs. It’s now time to re-accelerate. Today, I am assembling a Capital Formation team. This team will raise tens of billions to bring sci-fi into the present, reporting directly to me. DMs open. No remote candidates. https://x.com/adcock_brett/status/1973417191124160894
Gemini Robotics 1.5 launches with enhanced spatial reasoning capabilities
Google’s new robotics model excels at spatial and temporal reasoning, thinking through problems step by step to improve its responses. This represents a shift toward robots that can better navigate and manipulate physical environments by reasoning through complex spatial problems before acting.
Super excited to share Gemini Robotics 1.5!! Our high-level reasoning model Gemini Robotics-ER 1.5 is also publicly available now! The model is particularly strong at spatial and temporal reasoning, and can use thinking to improve its answers 🧠🤖 https://x.com/_anniexie/status/1971477645096517832
Google’s robot demonstrates advanced two-handed coordination using Gemini AI
DeepMind’s latest robot can perform complex tasks like opening suitcases by coordinating both arms simultaneously, marking a significant step toward general-purpose robots that could handle real-world manipulation tasks. This bimanual dexterity represents a key breakthrough beyond single-arm robotic systems that have dominated the field.
General-purpose robots with advanced manipulation ability are approaching quickly. This autonomous demo from Google DeepMind showcases Gemini Robotics 1.5. The bimanual coordination is impressive – one arm holds the suitcase in place while the other handles the zipper. https://x.com/TheHumanoidHub/status/1973448181867581648
Meta develops humanoid robot with plans to license software platform
Meta’s “Metabot” represents a shift toward creating an operating system for robots rather than just hardware, with CTO Andrew Bosworth planning to license the software to other manufacturers who meet Meta’s specifications. This mirrors Meta’s strategy with VR headsets and could establish the company as a major platform player in the robotics industry beyond its current social media dominance.
The Verge reports that Meta is developing its own humanoid robot, dubbed "Metabot." CTO Andrew Bosworth believes the bottleneck is the software and envisions licensing the software platform to other robot makers, provided the robots meet Meta’s specs. Former Scale AI CEO https://x.com/TheHumanoidHub/status/1972223919831547985
Chinese robotics firm Unitree targets mass market with $5,900 humanoid robot
CEO Wang Xingxing predicts their R1 model will become the world’s best-selling humanoid robot next year, priced at just $5,900 compared to competitors costing tens of thousands. This represents a potential breakthrough in making advanced robotics accessible to consumers and small businesses, backed by China’s domestic robot industry growing 50-100% in the first half of this year.
Unitree CEO Wang Xingxing at a Trade Fair in Hangzhou on Saturday: ⦿ Unitree R1 will become the world’s best-selling humanoid robot next year. ⦿ In the first half of this year, the domestic robot industry grew an average rate of 50% to 100% for Chinese intelligent https://x.com/TheHumanoidHub/status/1973158573317501243
Unitree CEO Wang Xingxing expects R1 to be the world’s best-selling humanoid robot next year. Won’t shock anyone if it happens. The company announced the starting price of $5,900 but even at $12k this will sell like hot cakes https://x.com/TheHumanoidHub/status/1973452915366044096
Former OpenAI and DeepMind researchers secure record $300M seed funding
Periodic Labs raised the massive round to build autonomous robot laboratories that run physical experiments to discover new materials such as superconductors, moving beyond internet-trained AI models. The startup, backed by tech giants including Nvidia and Bezos, represents a shift toward AI that generates new scientific data rather than just processing existing information, and its founders previously helped build ChatGPT and used AI to discover 2 million new crystals.
Former OpenAI and DeepMind researchers raise whopping $300M seed to automate science | TechCrunch https://techcrunch.com/2025/09/30/former-openai-and-deepmind-researchers-raise-whopping-300m-seed-to-automate-science/
Thinking Machines launches Tinker API for simplified model fine-tuning
Tinker lets researchers fine-tune large language models through simple Python code while the service handles distributed training infrastructure, supporting models from 1B to 235B parameters including mixture-of-experts architectures. The API uses LoRA adapters to reduce costs and provides low-level control over training algorithms, with early users at Princeton, Stanford, and Berkeley already achieving results matching full parameter training with significantly less compute. This addresses a key bottleneck where researchers previously spent substantial time on infrastructure rather than algorithms and data.
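Tinker’s own client calls aren’t reproduced here, so below is a minimal LoRA fine-tuning sketch using Hugging Face transformers and peft to illustrate the adapter approach the summary describes. The base model, toy dataset, and hyperparameters are placeholder assumptions, not anything Tinker itself uses.

```python
# Not Tinker's actual API -- a minimal LoRA fine-tune with transformers + peft to show
# the adapter idea it reportedly builds on. Model, dataset, and settings are assumptions.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base = "Qwen/Qwen2.5-0.5B"                      # small stand-in; Tinker targets 1B-235B models
tok = AutoTokenizer.from_pretrained(base)
if tok.pad_token is None:
    tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA trains small low-rank adapter matrices instead of the full base weights,
# which is what keeps memory and compute costs down.
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32,
                                         target_modules=["q_proj", "v_proj"]))

data = load_dataset("Abirate/english_quotes", split="train[:200]")   # toy corpus
data = data.map(lambda ex: tok(ex["quote"], truncation=True, max_length=128),
                batched=True, remove_columns=data.column_names)

Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora-out", per_device_train_batch_size=4,
                           num_train_epochs=1, logging_steps=10),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
).train()

model.save_pretrained("lora-out/adapter")        # only the small adapter weights are written
```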
Announcing Tinker – Thinking Machines Lab https://thinkingmachines.ai/blog/announcing-tinker/
Tinker – Thinking Machines Lab https://thinkingmachines.ai/tinker/
I had the chance to try @thinkymachines’ Tinker API for the past couple weeks. Some early impressions: Very hackable & lifts a lot of the LLM training burden, a great fit for researchers who want to focus on algs + data, not infra. My research is in RL, and many RL fine-tuning https://x.com/tyler_griggs_/status/1973450947218252224
A flexible API for fine-tuning LMs – Tinker by @thinkymachines Write a simple CPU-only script, and it runs your exact training loop on distributed GPUs. You can fine-tune open models like Llama and Qwen, up to large MoE (Qwen3-235B-A22B), switching them by changing only one https://x.com/TheTuringPost/status/1973827605448306883
thinking-machines-lab/tinker-cookbook: Post-training with Tinker https://github.com/thinking-machines-lab/tinker-cookbook
🚀With early access to Tinker, we matched full-parameter SFT performance as in Goedel-Prover V2 (32B) (on the same 20% data) using LoRA + 20% of the data. 📊MiniF2F Pass@32 ≈ 81 (20% SFT). Next: full-scale training + RL. This is something that previously took a lot more effort https://x.com/chijinML/status/1973451597393883451
I’ve been using Tinker at Redwood Research to RL-train long-context models like Qwen3-32B on difficult AI control tasks – specifically teaching models to write unsuspicious backdoors in code similar to the AI control paper. Early stages but seeing some interesting backdoors 👀 https://x.com/ejcgan/status/1973449963259699284
One interesting “”fundamental”” reason for Tinker today is the rise of MoE. Whereas hackers used to deploy llama3-70B efficiently on one node, modern deployments of MoE models require large multinode deployments for efficiency. The underlying reason? Arithmetic intensity. (1/5) https://x.com/cHHillee/status/1973469947889422539
[1 Oct 2025] Thinking Machines’ Tinker: LoRA based LLM fine-tuning API https://x.com/Smol_AI/status/1973622595124863044
Really excited and proud to see Qwen models are in the first batch of supported models for the tinker service! 🤩 we will continue to release great models to grow research in the community 😎 https://x.com/wzhao_nlp/status/1973603599616974970
Tinker is cool. If you’re a researcher/developer, tinker dramatically simplifies LLM post-training. You retain 90% of algorithmic creative control (usually related to data, loss function, the algorithm) while tinker handles the hard parts that you usually want to touch much less https://x.com/karpathy/status/1973468610917179630
Tinker provides an abstraction layer that is the right one for post-training R&D — it’s the infrastructure I’ve always wanted. I’m excited to see what people build with it. "Civilization advances by extending the number of important operations which we can perform without https://x.com/johnschulman2/status/1973450054238347314
Very excited to see the Tinker release by @thinkymachines! @robertnishihara and I had a chance to experiment with the API, see https://x.com/pcmoritz/status/1973456462346424641
Very excited to see the Tinker release! @pcmoritz and I had a chance to experiment with the API. It does a nice job of providing flexibility while abstracting away GPU handling. Here’s a simple example showing how to generate synthetic data and fine tune a text to SQL model. https://x.com/robertnishihara/status/1973455582603649430
DINOv3-powered model detects concrete cracks with near-perfect accuracy instantly
A new vision system called RF-DETR-Seg achieves exceptional precision in identifying structural damage after just one training cycle, potentially revolutionizing infrastructure inspection by automating a task that traditionally requires expert human assessment. This breakthrough matters because detecting concrete deterioration early prevents costly repairs and safety hazards in buildings and bridges.
segmenting concrete cracks is a difficult task for vision models; thin and long segments. RF-DETR-Seg reaches near-perfect accuracy after only one epoch of training. that DINOv3 backbone is pretty crazy. notebook: https://x.com/skalskip92/status/1974160484799590789
Tencent releases first open-source model for generating 3D object parts
The company’s Hunyuan3D-Part system can create detailed 3D shapes broken down into individual components, outperforming existing commercial tools. This addresses a key bottleneck in 3D content creation for gaming, manufacturing, and virtual worlds where objects need realistic part-by-part assembly. The release includes P3-SAM, the first model designed specifically for identifying 3D parts rather than adapting 2D image techniques.
We are introducing Hunyuan3D-Part: an open-source part-level 3D shape generation model that outperforms all existing open and close-source models. Highlights: 🔹P3-SAM: The industry’s first native 3D part segmentation model. 🔹X-Part: A part generation model that achieves https://x.com/TencentHunyuan/status/1971491034044694798
2 AI Visuals and Charts: Week Ending October 03, 2025
Anyone who sees this video can instantly grasp the (at least) potential for malicious use. And yet nobody with any power (either in the public or at the corporate level) has anything to say (let alone do) to address it, or even acknowledge it. https://x.com/TheStalwart/status/1973372434133950665
yep that’s it we’ve crossed the chasm ai video is now indistinguishable from real video https://x.com/mattshumer_/status/1973077933481677245
Top 20 Links of The Week – Organized by Category
Agents and Copilots
AI Agents get smarter with memory—learning, adapting and improving over time. Want to build self-improving agents? Our latest AI Agent for Beginners lesson breaks it all down. ⬇️ https://x.com/msdev/status/1970896017072566646
We’ve updated function calling to support files and images as tool call outputs. You can now call functions like `generate_chart` or `load_image` and return those files back to the model, rather than just JSON or text. 🌠 https://x.com/OpenAIDevs/status/1971618905941856495
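The function-calling update above is easier to picture with a sketch of the basic tool-call loop. This uses the standard Chat Completions tool-calling flow; `generate_chart` is a hypothetical tool borrowed from the tweet, the model name is a placeholder, and the exact wire format for returning files or images (the new part) should be checked against OpenAI’s docs rather than this sketch.

```python
# Minimal function-calling loop sketch; the new file/image tool outputs are not shown,
# only the classic text/JSON round-trip. generate_chart is a hypothetical local tool.
import json
from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "generate_chart",
        "description": "Render a chart for the given numbers",
        "parameters": {
            "type": "object",
            "properties": {"values": {"type": "array", "items": {"type": "number"}}},
            "required": ["values"],
        },
    },
}]

messages = [{"role": "user", "content": "Chart the numbers 3, 1, 4, 1, 5."}]
resp = client.chat.completions.create(model="gpt-4o-mini", messages=messages, tools=tools)

call = resp.choices[0].message.tool_calls[0]
values = json.loads(call.function.arguments)["values"]

# Run the tool locally, then hand its output back to the model. Classically this had to be
# text or JSON; per the announcement, file or image outputs are now also supported.
messages.append(resp.choices[0].message)
messages.append({"role": "tool", "tool_call_id": call.id,
                 "content": json.dumps({"chart": f"rendered {len(values)} points"})})

final = client.chat.completions.create(model="gpt-4o-mini", messages=messages, tools=tools)
print(final.choices[0].message.content)
```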
Amazon
Very excited to start sharing some of the work we have been doing at Amazon FAR. In this work we present OmniRetarget, which can generate high-quality interaction-preserving data from human motions for learning complex humanoid skills. High-quality re-targeting really helps https://x.com/pabbeel/status/1973426729952956652
Anthropic
sonnet 4.5 is noticeably better at compacting conversations than any other model i’ve used i’ve never felt like i wasn’t experiencing SOME task degradation after compacting context near the end of a context window tbh this is the only thing i’ve noticed so far that marks an https://x.com/nickbaumann_/status/1972838170493628847
Audio
New JSON prompt for Veo3 fast on @LeonardoAI_ Prompt: { "shot": { "composition": "dynamic product showcase with close-ups and wide reveals, ending in a full 360-degree orbit around the vehicle", "lens": "50mm", "frame_rate": "24fps", "camera_movement": https://x.com/azed_ai/status/1949139427156279335
VEO3 is wild. I made this on Google Flow with a single PNG image and a prompt (linked in the next tweet). @FintechNerdCon https://x.com/sytaylor/status/1952318936134885405
Business AI
The Gemini 2.5 Flash update is only slightly better, but a whopping 30% cheaper than its predecessor https://x.com/scaling01/status/1971578192512029045
A Research Agenda for the Economics of Transformative AI | NBER https://www.nber.org/papers/w34256
Mirage Studio (generate fake AI actors) https://mirage.app/landing?redirectTo=%2Fredeem%3Fcode%3DTLDRAI10032025
Chips and Hardware
Why did OpenAI train GPT-5 with less compute than GPT-4.5? Due to the higher returns to post-training, they scaled post-training as much as possible on a smaller model And since post-training started from a much lower base, this meant a decrease in total training FLOP 🧵 https://x.com/EpochAIResearch/status/1971675079219282422
2025 State of AI Infrastructure Report | Google Cloud https://cloud.google.com/resources/content/state-of-ai-infrastructure?e=48754805
I just interviewed the DeepMind scientists behind Google’s most viral AI sensation. Their 2am naming decision created "Nano Banana" – and they didn’t expect this reaction. They’ve turned insanely complex AI workflows into simple text prompts, taking Gemini to #1 on the App https://x.com/bilawalsidhu/status/1973508392397447673
Google has been shipping in September – Gemini Robotics 1.5 – Latest Gemini Live – EmbeddingGemma – Veo 3 GA + API updates – AI Edge gallery for on-device – Gemini Batch API embedding support – Gemini Flash and Flash Lite updates – Chrome DevTools MCP – VaultGemma and more 🚀 https://x.com/osanseviero/status/1971468195308712431
International AI
I get data sovereignty in some cases, but there is just no way for new countries to join the frontier model race as long as scaling (in any sense) matters. There is no sovereign model. You will be dependent on the production of Chinese (or US or French) open models as a base. https://x.com/emollick/status/1972018517919826099
Media
Do Humans Really Have World Models? | Daniel Miessler https://danielmiessler.com/blog/humans-dont-have-world-model
OpenAI
Terence Tao + AI for solving hard math problems🤯 In this example the insight comes from Terence, and the muscle—via an hourlong conversation with GPT-5 and some python code it wrote—comes from AI. "Here, the AI tool use was a significant time saver… Indeed I would have been https://x.com/kevinweil/status/1974161952260624459
Terence Tao: “I was able to use an extended …” – Mathstodon https://mathstodon.xyz/@tao/115306424727150237
Yet more evidence that a pretty major shift is happening, this time by Scott Aaronson https://x.com/SebastienBubeck/status/1972368891239375078
Open Source
🚀 Ring-1T-preview: Deep Thinking, No Waiting The first 1 trillion open-source thinking model -> Early results in natural language: AIME25/92.6, HMMT25/84.5, ARC-AGI-1/50.8, LCB/78.3, CF/94.7 -> Solved IMO25 Q3 in one shot, with partial solutions for Q1/Q2/Q4/Q5 Still evolving! https://x.com/AntLingAGI/status/1972711364876697612
Robotics
This job posting suggests that Meta is developing an egocentric AI system that will form the foundation for AI-enabled humanoid robots and AR devices. https://x.com/TheHumanoidHub/status/1972544881303417338