About This Week’s Covers

This week’s cover image is a nod to Ender’s Game by Orson Scott Card, a classic science fiction novel about a boy who plays video games inside simulations. The concept was inspired by Google’s launch of its simulation agent, SIMA 2, which can navigate and learn virtual worlds, much like Ender does in the book. The theme also ties into this week’s announcement that AI pioneer Fei-Fei Li’s company, World Labs, launched a world simulator called Marble.

I gave the cover image to Gemini and had it swap out Google for Ender, and then used Adobe Photoshop’s Firefly to extend the image horizontally, since the original was a book cover in portrait format.

Original Ender’s Game Cover

The broader theme of in-painting and modifying images and video runs throughout the week’s news, from Adobe introducing prompt-based image and video editing to Marble’s Chisel feature, which lets you sketch a rough 3D layout with basic shapes and then bring it to life using text prompts.

Simulation and robot training are everywhere right now, and vision models and multimodal AI are taking over where pure language models left off.
For the category covers this week, I told my Python rubric that I wanted images in the style of Ender’s Game. They are so bad, I’m not even picking a top six.

This week’s Humanities reading is an excerpt from Ender’s Game, included at the bottom of the newsletter.

This Week By The Numbers

Total Organized Headlines: 473

This Week’s Executive Summaries

This week, I organized 473 headlines. 119 of them informed the executive summaries. I’ll start with the top stories, and then drill down by categories and common themes, loosely grouped by priority.

Fun Palate Cleanser Before We Dive Into The News

The “Benevolent Time-travel Benchmark”
Wharton professor Ethan Mollick is one of my favorite AI guides through the fast-paced news cycle. He often creates light-hearted benchmarks where he asks multiple models the same question. These are usually non-traditional benchmarks with soft, open-ended prompts, and they tend to be humorous.

In this case, Professor Mollick asked GPT-5, Claude, Kimi, Gemini, Grok, and DeepSeek the same question: “I can travel back in time to any point before 1500 and change only one thing. What is the single thing you would change? Nothing obvious.”
https://x.com/emollick/status/1987355374928769395

Mollick then posted the six responses:

ChatGPT said it would found a rag paper mill and a simple printing shop inside the Library of Alexandria in 250 BCE.

Claude wrote that it would introduce positional notation with zero into ancient Greek mathematics around 300 BCE.

Kimi said it would stop the murder of Hypatia of Alexandria in 415 AD.

Gemini 2.5 said it would travel to Vienna in the mid-1440s and ensure that Johannes Gutenberg’s financial partner, Johann Fust, never calls in his loan.

Grok said it would go to 212 BC, during the Roman siege of Syracuse, and ensure that Archimedes isn’t killed by a soldier.

DeepSeek said it would ensure the successful and widespread adoption of the printing press in the Islamic world during the Golden Age, specifically around the 10th century.

This Week’s Top Stories

Google Is Training Agents In Simulations For Real World Embodiment
The top story this week is from Google: an AI agent designed to navigate and learn virtual worlds. Google’s agent can see and interact within simulations and understand complex prompts from humans, but more importantly, the agent can learn and improve by playing games on its own. It can develop skills and abilities without any human input. The more it plays, the better it gets, taking what it learns from one virtual world and applying it to the next, even if it has never seen that world before. The plan is to train the agent in sped-up “computer time” across thousands of simulations. https://deepmind.google/blog/sima-2-an-agent-that-plays-reasons-and-learns-with-you-in-virtual-3d-worlds/

The long-term goal is to release this agent into the real world, much as NVIDIA trained a robot dog to walk on a yoga ball across thousands of simulations and then successfully transferred that skill to a real robot dog balancing on a yoga ball in real life.
https://ethanbholland.com/2024/05/04/ai-news-31-week-ending-05-03-2024-with-executive-summary-top-95-links-and-helpful-visuals/

Google’s model aims to learn in as many simulations as possible and then embody that knowledge in robots, or anything else we can imagine, in the real world.

“This isn’t a person playing a video game. It’s SIMA 2, our most capable AI agent for virtual worlds. Worlds that are complex, responsive, and ever-changing, just like ours.

Unlike earlier models, SIMA 2 goes beyond simple actions to navigate and complete difficult, multi-step tasks. It understands multimodal prompts, and if you ask, SIMA 2 will explain what it can see and what it plans to do next.

SIMA 2 can learn, reason, and improve by playing on its own, developing new skills and abilities without any human input. And the more SIMA 2 plays, the better it becomes, taking what it learns in one virtual world and applying it to the next, and the next, and the next, even if it’s never seen them before.

SIMA 2 is not just a milestone for training agents in virtual worlds, but a step toward creating AI that can help with any task, anywhere, including one day in the real world.”

Fei-Fei Li’s World Labs Launches Marble
AI pioneer Fei-Fei Li is often referred to as the godmother of AI. She’s known for pioneering computer vision, creating ImageNet, a massive visual dataset that helped power deep learning, and for co-directing Stanford’s Human-Centered AI Institute. She’s a leading professor at Stanford and previously served as VP and Chief Scientist of AI and Machine Learning at Google Cloud.

Li immigrated to the United States at 15 and worked in her family’s dry-cleaning business before excelling academically, earning a degree from Princeton and a Ph.D. from Caltech.

This week, Li published a blog post stating that current AI systems like ChatGPT are excellent with language but poor at understanding physical space and how things actually work in the real world. She’s making this argument somewhat in a vacuum, without fully acknowledging that this limitation is already well understood across the industry.
https://drfeifei.substack.com/p/from-words-to-worlds-spatial-intelligence

NVIDIA pioneered world models with its Cosmos platform, which simulates environments to train robots and self-driving cars.

Google released Genie back in February 2024, and Genie 3 launched in August. Genie is a real-time world model that runs at fairly high resolution and helps agents navigate environments dynamically. Google continues to evolve its world models, as we saw in this week’s first top story, the introduction of the SIMA 2 agent.

World Labs’ model is called Marble. Marble can turn a text prompt into a fully explorable 3D space, take photos and convert them into 3D environments, and edit interactive worlds by changing furniture or redesigning rooms. It can expand worlds to make them larger, combine multiple environments into massive spaces, and export results as video, 3D models, or interactive experiences.

Another feature, called Chisel, lets users sketch out a rough 3D layout with basic shapes and then apply a text prompt to bring everything to life.

We’ve seen similar workflows with images, where a crude sketch becomes a polished final result. This is essentially a 3D version of that idea.

https://youtu.be/4-2dSRjErE4?si=uLMKGf8IUHqYL_1O

It will be interesting to watch how World Labs’ Marble compares and competes with Google DeepMind’s Genie and NVIDIA’s Cosmos platforms over the coming months.

OpenAI Releases ChatGPT 5.1
OpenAI released GPT-5.1, a pretty big update to its core chatbot.
https://openai.com/index/gpt-5-1/

The biggest change is that GPT decides whether to respond instantly or spend time thinking before it answers.

This idea of letting the language model route between simple next-token prediction responses versus more deliberate thinking or agentic research answers is a smart move, because not everything requires research. For example, proofreading grammar or transforming the style of a block of text doesn’t really need much more than next-token prediction.

However, more difficult queries would be way out of scope for a traditional language model without some form of chain-of-thought reasoning, the ability to pause, or the use of tools.
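To make the routing idea concrete, here is a purely illustrative sketch of a complexity-based router in Python. The keyword heuristic and the instant/thinking/auto labels are hypothetical stand-ins; OpenAI hasn’t published how its actual router decides.

```python
# Purely illustrative sketch of the routing idea described above. The keyword
# heuristic and the "instant" / "thinking" / "auto" labels are hypothetical;
# OpenAI has not published how its actual router decides.

def route_request(prompt: str) -> str:
    """Pick a response mode from a crude guess at task complexity."""
    light_tasks = ("proofread", "rewrite", "summarize", "translate")
    heavy_signals = ("prove", "plan", "debug", "compare", "research")

    text = prompt.lower()
    if any(word in text for word in light_tasks):
        return "instant"    # next-token prediction is usually enough
    if any(word in text for word in heavy_signals) or len(text.split()) > 200:
        return "thinking"   # spend extra reasoning tokens before answering
    return "auto"           # let the model decide at run time

print(route_request("Proofread this paragraph for grammar."))    # instant
print(route_request("Plan a migration from MySQL to Postgres.")) # thinking
```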

I find myself almost always manually switching to thinking mode because I don’t care for the style of the instant answers.

Instant responses tend to be full of emojis and em dashes and are very recognizable as that stereotypical gushy dump of words, whereas thinking mode tends to be more efficient with its wording, purposeful, and practical. I personally trust it more.

I think over time I’ll be more comfortable leaving it in automatic detection mode to choose on its own, but right now I tend to manually switch to thinking.

OpenAI posted that the instant model is now warmer and a little more conversational, which is exactly my point about it being gushy.

This is actually an incredible opportunity for laypeople to understand the idea of alignment, because chatbots can be and must be tuned to have a personality.

The default personality is simply too chaotic and all over the place. It’s not about neutering the system or making it politically correct. It’s about tethering it so it’s not pure chaos.

You don’t want to roll the dice every time you talk to someone in real life any more than you’d want to roll the dice when talking to a language model.

The new instant responses are warmer and more conversational, but they’re also better at following instructions.

One of the trickier things in the past was forcing the model to limit its answers to a literal number of words. Now GPT will follow that instruction. If you tell it to respond with only six words, it will stay within that constraint and nail it every time.

The thinking element is interesting because it tries to dynamically adapt the amount of time it spends reasoning based on the question.

I interpret this, at least in part, as an attempt by OpenAI to be efficient with token usage as well, so it’s not wasting money. That said, to be fair, thinking mode used to take two or three minutes even for the easiest questions, which was clearly overkill.

Overall, I think the thinking experience is better. Sometimes it feels like it’s trying to be too efficient, but I don’t know whether that’s sneaky efficiencies or imperfect tuning.

I like that the thinking results are more precise and concise, and it’s not just my intuition. OpenAI’s announcement explicitly says this is by design, and their blog post includes several examples comparing old responses to new ones. They’re worth scrolling through. It’s a great lesson in alignment.

If you’re not familiar with the concept, take a minute to look at the before-and-after responses and you’ll see exactly how it works.

Another big change is the ability for users to change the default tone of ChatGPT.

There are now friendly, efficient, professional, candid, and quirky options, letting you choose how it talks to you. I personally like the default because I don’t want to override the dynamic nature of its responses. I’m probably a little too trusting that the system will figure out the right amount of style on its own, and I don’t want to interfere with that. There are also options for a cynical tone and a nerdy tone.

In addition to releasing ChatGPT 5.1, OpenAI also published a cookbook on how to build a coding agent using the new tools, along with a prompting guide for steering agents and shaping personalities. Both are pretty fun to review if you’re more of a power user. https://cookbook.openai.com/examples/build_a_coding_agent_with_gpt-5.1 https://cookbook.openai.com/examples/gpt-5/gpt-5-1_prompting_guide
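The cookbook is the authoritative reference, but for a flavor of what a minimal call looks like, here is a sketch using the OpenAI Python SDK’s Responses API. The gpt-5.1 model name and the reasoning-effort hint are assumptions on my part; check the cookbook and API docs for the exact parameters.

```python
# Minimal sketch only; see OpenAI's cookbook for the real coding-agent setup.
# Assumes GPT-5.1 is exposed as "gpt-5.1" through the Responses API and accepts
# a reasoning-effort hint like other OpenAI reasoning models.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.responses.create(
    model="gpt-5.1",              # assumed model name
    reasoning={"effort": "low"},  # nudge toward a quicker, lighter answer
    input="Write a Python function that reverses the words in a sentence.",
)

print(response.output_text)
```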

Lastly, OpenAI added the ability to interrupt the thinking process. This mostly applies to power users, but it’s an interesting addition. While GPT is thinking through its answer, you can continue to type directions or update your instructions if you think of something new, or if you notice it going down a path and want to guide it.

I’m not really into this yet because I’m not used to it, but we’ll see if it becomes a helpful addition over time.

OpenAI Announces Group Chats
OpenAI released a group chat feature, currently available in limited countries. This allows you to add a friend, or multiple friends, to a conversation. The idea sounds good at first, but the default model setting is auto, which weakens it a bit for me. If you’re just planning a trip or doing some light chit-chat, maybe that’s fine. But for most of my use cases, I prefer thinking mode, and I don’t want to rely on auto.

The other downside, in my view, is that OpenAI is very proud of its privacy features, and group chats are not used for memory or personalization.

I know everyone loves the idea of privacy, but I think it’s often more of a charade than something that’s practically helpful. I’ve worked in marketing for 20 years, and for a significant part of my career I spent my time tracking people. My general point of view is that people worry intensely about things that aren’t that important, and don’t worry enough about things that actually matter.

Because of the lack of memory or personalization, I wouldn’t use group chat for anything important, since it doesn’t contribute to or improve future conversations. Unpopular opinion, but I’m sticking to it.

https://openai.com/index/group-chats-in-chatgpt/ https://www.testingcatalog.com/openai-readies-chatgpt-group-chats-with-custom-controls/

Video

MotionStream Offers Real Time Motion Control
“I genuinely think we’re on the cusp of a new type of creation engine. Feels less like prompting and more like puppeteering reality itself. MotionStream is a taste of what’s to come:” https://x.com/bilawalsidhu/status/1986877076839014462

“MotionStream: Real-Time Video Generation with Interactive Motion Controls”
https://joonghyuk.com/motionstream-web/

We are just scratching the surface of precise control over AI video generation. MotionStream unlocks real-time video with interactive motion controls. You can interactively generate video based on motion inputs (like drawn trajectories, camera movements, or motion transfer). https://x.com/bilawalsidhu/status/1986838921712701833

“MotionStream: Real-time, interactive video generation with mouse-based motion control; runs at 29 FPS with 0.4s latency on one H100; uses point tracks to control object/camera motion and enables real-time video editing.” https://x.com/wildmindai/status/1985828041566941576

Chips and Hardware

Blue Owl invests $3 Billion In OpenAI’s Stargate
Blue Owl is investing $3 billion in OpenAI’s Stargate project. The Stargate project was formally created on January 21, 2025, but it has been in the works since 2022. The goal is to raise $500 billion to fund OpenAI’s infrastructure by 2029. Stargate was created by OpenAI, SoftBank, Oracle, and the Emirati investment firm MGX, and Blue Owl has now added another $3 billion.

Blue Owl’s contribution will go directly toward funding a data center in southern New Mexico. OpenAI has now raised roughly $21 billion in loans for the Stargate project. Previously, Blue Owl also contributed equity backing for the Stargate facility in Abilene, Texas.

Blue Owl has also invested in Meta’s infrastructure plans. Meta has secured $30 billion in private financing for a data center in Louisiana. https://seekingalpha.com/news/4520030-blue-owl-to-invest-3b-in-new-mexico-data-center-for-openais-stargate-report

Epoch Research Releases Incredible Interactive Guide to AI Data Centers
Epoch AI Research investigates artificial intelligence’s impact and creates original reporting based on public documents and research.

This week, Epoch released an impressive interactive website that documents several data centers across the United States. Epoch points out that data centers are already rivaling the Manhattan Project in scale, yet there’s very little public information about them.

The Stargate facility in Abilene will use as much power as the entire city of Seattle, Washington. It will span the equivalent of roughly 450 soccer fields, cost about $32 billion, and take two years to build.

By the end of 2025, AI data centers are expected to represent around 1% of U.S. gross domestic product, more than $300 billion, which is larger than both the Apollo program and the Manhattan Project.

The interactive website lets you select a data center you want to learn about, click into it, and see detailed statistics along with satellite imagery. It’s a spectacular piece of original reporting. https://x.com/EpochAIResearch/status/1987944128903266358 https://epoch.ai/data/data-centers/satellite-explorer

OpenAI Asks Government To Extend 35% Semiconductor Fabrication Tax Credit to Include Data Centers
OpenAI has asked the government to support its data center construction by expanding the Advanced Manufacturing Investment Credit beyond semiconductor fabrication to also cover electrical grid components and data centers. This is a 35% tax credit that was included in the Biden administration’s CHIPS Act.

It’s interesting that this request comes just one week after Sam Altman had to clarify that OpenAI was not seeking government guarantees for its data centers. OpenAI’s chief financial officer had endorsed the idea of government-backed insurance for semiconductor investments, but Sam later walked that back, saying the government should not pick winners or losers and that taxpayers should not have to bail out companies that make bad decisions. https://ethanbholland.com/2025/12/28/ai-news-110-week-ending-november-07-2025-with-57-executive-summaries/

That same week, just one week ago, news broke that China subsidizes data center electricity by 50% in order to boost its competitiveness in the AI landscape. In that context (and a play to the national security angle), it makes sense that OpenAI would request that the 35% tax credit, originally created to bolster domestic semiconductor fabrication, be extended to data centers as well.

This is clearly a highly competitive race against China, and also among US frontier AI companies. At least with a tax credit, the 35% benefit would apply evenly to any AI company building data centers. https://x.com/sama/status/1986917979343495650 https://techcrunch.com/2025/11/08/openai-asked-trump-administration-to-expand-chips-act-tax-credit-to-cover-data-centers/ https://cdn.openai.com/pdf/21b88bb5-10a3-4566-919d-f9a6b9c3e632/openai-ostp-rfi-oct-27-2025.pdf

Google announces new $40 billion investment in Texas
“We’ve called Texas home for more than 15 years, and today, we’re announcing a new $40 billion investment in the Lone Star State through 2027. This funding will help build new cloud and AI infrastructure, including new data center campuses in Armstrong and Haskell Counties.” https://blog.google/inside-google/company-announcements/google-american-innovation-texas/

Microsoft CEO Satya Nadella gives the Dwarkesh Podcast a tour of Fairwater 2, the most powerful AI datacenter in the world.
“As part of this interview, Satya Nadella gave me and Dylan Patel (founder of SemiAnalysis) an exclusive first-look at their brand-new Fairwater 2 datacenter.

Microsoft is building multiple Fairwaters, each of which has hundreds of thousands of GB200s & GB300s. Between all these interconnected buildings, they’ll have over 2 GW of total capacity. Just to give a frame of reference, even a single one of these Fairwater buildings is more powerful than any other AI datacenter that currently exists.

Satya then answered a bunch of questions about how Microsoft is preparing for AGI across all layers of the stack.”

Anthropic invests $50 billion in American AI infrastructure
“Today, we are announcing a $50 billion investment in American computing infrastructure, building data centers with Fluidstack in Texas and New York, with more sites to come. These facilities are custom built for Anthropic with a focus on maximizing efficiency for our workloads, enabling continued research and development at the frontier.” https://www.anthropic.com/news/anthropic-invests-50-billion-in-american-ai-infrastructure

GPUs As Currency
The TuringPost AI newsletter published a follow-up to Bloomberg News’ well-known infographic showing how NVIDIA and OpenAI are fueling the AI money machine through circular investments.

TuringPost illustrates this using Nscale, an 18-month-old AI unicorn, as an example.

Nscale began as an Australian Bitcoin mining operation chasing cheap hydroelectric power in Norway. When crypto crashed, the company’s founder, Josh Payne, pivoted. He already had an ideal setup for AI, with cheap energy, cooling, and fiber infrastructure in place. Payne incorporated the company in London, acquired Finland’s Contenna, and launched OpenAI’s Stargate Norway, with plans to deploy 100,000 NVIDIA GPUs.

Within a year, that footprint had scaled to 200,000 GPUs across Norway, Texas, and Portugal, backed by OpenAI, Microsoft, and NVIDIA. In just 18 months, Nscale raised $1.5 billion and reached a $3 billion valuation.

The TuringPost shows how Nscale’s access to GPUs allowed it to treat GPUs like a form of currency. GPUs become the reserve asset, CUDA is the standard that makes them interoperable, and companies like Nscale begin to function like regional banks, distributing computing power in much the same way nations currently treat strategic resources.

SoftBank’s Masayoshi Son cashes out his entire $5.8 billion Nvidia stake
“Nvidia shares went down nearly 3% following the disclosure, even as analysts emphasize that the sale ‘should not be seen as a cautious or negative stance on Nvidia,’ but rather reflects SoftBank needing capital for its AI ambitions.” https://techcrunch.com/2025/11/11/softbanks-nvidia-sale-rattles-market-raises-questions/

Google News of The Week

Google AI Studio Hits 2.1 Million Users
“We just crossed 2.1 million users vibe coding in AI Studio with hundreds of thousands of apps made every day” https://x.com/OfficialLoganK/status/1986467546355183985 https://aistudio.google.com/apps

Pondering The Future Of Logins
Google DeepMind developer Philipp Schmid has been pondering the future of how AI agents will log in, because existing authentication methods don’t work very well for them.

Traditional login systems, like “Sign in with Google,” were built for humans clicking buttons in web browsers. AI agents, on the other hand, operate behind the scenes without us watching, which makes those approaches clunky and impractical.

At the same time, giving an agent a permanent password or API key is risky, because it grants too much access for too long. That kind of standing permission creates obvious security concerns.

Anthropic’s MCP is a protocol that helps AI agents understand what different apps and services can do, almost like a universal menu, but it wasn’t designed to be a login or authentication system. The industry is still trying to figure out how to let agents safely access our data and services without forcing us to approve every single action manually. https://x.com/_philschmid/status/1987889931822236059
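To illustrate the difference between a standing API key and a short-lived, narrowly scoped credential, here is a small hypothetical sketch. None of these helpers or scope names correspond to MCP, OAuth, or any real service; it is just the shape of the idea.

```python
# Hypothetical sketch of the "short-lived, narrowly scoped credential" idea
# discussed above. These helpers and scope names are illustrative only; they
# don't correspond to MCP, OAuth, or any real service's API.
import secrets
import time
from dataclasses import dataclass

@dataclass
class AgentGrant:
    token: str
    scopes: tuple            # what the agent is allowed to do
    expires_at: float        # when the permission lapses

def issue_agent_grant(scopes: tuple, ttl_seconds: int = 900) -> AgentGrant:
    """Mint a credential the agent can use briefly, instead of a standing API key."""
    return AgentGrant(
        token=secrets.token_urlsafe(32),
        scopes=scopes,
        expires_at=time.time() + ttl_seconds,
    )

def authorize(grant: AgentGrant, requested_scope: str) -> bool:
    """Reject anything outside the grant's scope or past its expiry."""
    return requested_scope in grant.scopes and time.time() < grant.expires_at

grant = issue_agent_grant(scopes=("calendar:read",))
print(authorize(grant, "calendar:read"))  # True: in scope and inside the time window
print(authorize(grant, "email:send"))     # False: never granted
```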

Waymo Expands Fully Autonomous Taxis to Entire SF Bay Area Including Freeways
“Big day — we’re starting to rollout freeway driving for public riders in Phoenix, LA, and the Bay Area. We’re also connecting SF to San Jose, and adding curbside access to SJC Airport. A true testament to the generalizability of the Waymo Driver. Excited to bring its proven safety record to freeways and unlock new opportunities for our riders!” https://waymo.com/blog/2025/11/taking-riders-further-safely-with-freeways https://x.com/dmitri_dolgov/status/1988672463761485882

Adobe and Google Cloud Expand Strategic Partnership to Advance the Future of Creative AI
Partnership brings together Adobe’s creative expertise with Google’s advanced AI to transform how the world creates. Adobe customers will gain access to Google’s most advanced AI models (including Gemini, Veo and Imagen) in Adobe’s apps to produce high-impact content with precision. Through Adobe Firefly Foundry, Adobe’s enterprise customers will be able to customize Google’s AI models with proprietary data to create on-brand content at scale.

“Through the partnership, Adobe customers, including business professionals, creators, creative professionals and enterprises, will have access to Google’s latest AI models, integrated directly into Adobe apps like Adobe Firefly, Photoshop, Adobe Express, Premiere and more. Enterprise customers will also be able to access the models through Adobe GenStudio, and in the future, leverage Adobe Firefly Foundry to customize and deploy brand-specific AI models that generate on-brand content at scale.” https://news.adobe.com/news/2025/10/adobe-max-2025-google-cloud

Hugging Face Partners With Google Cloud
“Every day, over 1,500 terabytes of open models and datasets are downloaded and uploaded between Hugging Face and Google Cloud by millions of AI builders. We suspect it generates over a billion dollars of cloud spend annually already.

So we’re excited to announce today a new partnership to make it faster, safer & cheaper for Google Cloud customers to work with Hugging Face.

With them, we will:
– reduce Hugging Face model & dataset upload and download times through Vertex AI and Google Kubernetes Engine, thanks to a new gateway for Hugging Face repositories that will cache directly on Google Cloud
– offer native support for TPUs on all open models sourced through Hugging Face
– provide a safer experience through Google Cloud’s built-in security capabilities.

Ultimately, our intuition is that the majority of cloud spend will be AI related and based on open-source (rather than proprietary APIs) as all technology builders will become AI builders.

And both Google Cloud and Hugging Face will be there for it, let’s go!” https://x.com/ClementDelangue/status/1989000335247983049 https://huggingface.co/blog/google-cloud

Introducing Nested Learning: A new ML paradigm for continual learning
Every week, I become more convinced that Google is going to crush the competition in artificial intelligence. They waited as long as they could, but now they’re simply innovating faster than anyone else. Whether it’s image tools, driverless cars, chip design, coding, or plain old language models, they keep delivering, while also releasing a steady stream of new papers and techniques.

It makes me wonder whether OpenAI eventually gets banished from the kingdom, Anthropic becomes a kinder, gentler, better-run version of Yahoo, open source floats around happily, and Google decides who it wants to be next. That framing feels even more striking given all the headlines about OpenAI and Meta paying massive salaries, chasing marquee talent, and making giant acquihires.

This week, Google quietly released a paper called Introducing Nested Learning: A New ML Paradigm for Continual Learning. It aims to address a major problem in AI. When models learn something new, they often forget what they learned before, a phenomenon known as catastrophic forgetting.

Instead of training an AI model as one monolithic system, Google proposes breaking learning into many smaller processes stacked inside one another, like Russian nesting dolls. Each layer can learn at its own pace and focus on different kinds of information.

Current AI models can’t truly learn continuously the way humans do. When you teach them something new, they tend to overwrite older knowledge. This new approach mimics how humans rely on both short-term and long-term memory working together. Google built a test model called Hope that retains information better over time and handles long, complex inputs more effectively than traditional models. It’s a step closer to how humans keep learning without forgetting everything they already know.
https://research.google/blog/introducing-nested-learning-a-new-ml-paradigm-for-continual-learning
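As a rough intuition for the nesting idea, here is a toy sketch in which a fast learner updates every step and a slow learner only consolidates periodically. This is not the paper’s algorithm; the update rule and constants are made up purely to show components learning at different rates.

```python
# Toy illustration of components learning at different rates, loosely inspired
# by the nested-learning framing above. This is not the paper's algorithm; the
# update rule and constants are made up purely to show fast vs. slow learners.
import numpy as np

rng = np.random.default_rng(0)
fast_weights = np.zeros(4)   # updated every step (short-term memory analogue)
slow_weights = np.zeros(4)   # consolidated every K steps (long-term analogue)
K = 10

for step in range(1, 101):
    gradient = rng.normal(size=4)      # stand-in for a real training signal
    fast_weights -= 0.1 * gradient     # inner, high-frequency update
    if step % K == 0:
        # outer, low-frequency update pulls slow weights toward the fast learner
        slow_weights += 0.5 * (fast_weights - slow_weights)

print("fast:", np.round(fast_weights, 2))
print("slow:", np.round(slow_weights, 2))
```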

NotebookLM adds Deep Research and support for more source types
Google announced that NotebookLM now has three new features.

First is Deep Research. You can ask NotebookLM to do heavy research work for you by giving it a question, and it will spend several minutes searching through all of the sources you’ve loaded, exploring different angles, and then producing a detailed report with citations.

Second, NotebookLM now supports more file types. You can upload handwritten notes, Google Slides, web links, and Word documents directly into NotebookLM.

Third, NotebookLM will now show you exactly which source, and which specific part of that source, it pulls from whenever it gives you an answer. https://blog.google/technology/google-labs/notebooklm-deep-research-file-types/

Google Launches Private Cloud Silos To Extend Local Device AI
Google announced that it’s building a feature called Private AI Compute. At first, I assumed this was some kind of privacy feature for web-based chatbots, but it turns out to be an extension of locally hosted AI, initially designed to connect with phones.

Google isn’t the first company to talk about a secure cloud that helps extend on-device computing power. Over the past few months, I’ve heard several analysts, including Andrej Karpathy, describe the sweet spot as keeping as much AI work as possible directly on devices, to keep things private, then reaching out to the cloud when deeper and more powerful computation is needed.

This might frustrate privacy advocates, since the cloud is, by definition, neither local nor fully private. You can’t use it without an internet connection. That said, for most consumers, this approach probably makes sense. If you’re interested in privacy, it’s worth reading about, even if it feels a bit paradoxical.
https://blog.google/technology/ai/google-private-ai-compute/

Unlocking Multi-Spectral Satellite Data with Gemini
Google is now incorporating invisible light into its vision models. The easiest way to think about this is that the human eye primarily sees in red, green, and blue. But there are many other parts of the spectrum, and Google’s models can now “see” them thanks to native multimodal capabilities. This includes near-infrared and short-wave infrared light.

This matters a lot for satellite imagery. If vision models rely only on red, green, and blue channels, a dense forest can appear dark blue, or a body of water can look slightly green. It’s not uncommon for satellite tools to mislabel forests as lakes, or rivers and streams as vegetation. By incorporating these additional wavelengths, Gemini can understand what it’s seeing much more accurately.

Gemini can take this invisible data and monitor plant health, track natural disasters, and analyze farm crops using publicly available satellite data, without requiring any additional sensors. https://developers.googleblog.com/en/unlocking-multi-spectral-data-with-gemini/
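A classic example of why those extra bands matter is NDVI, the standard vegetation index built from the red and near-infrared channels. This is not Google’s method, just the textbook calculation that shows how a forest and a lake, which can look similar in RGB, separate cleanly once near-infrared is available.

```python
# Generic illustration of why near-infrared helps; this is the textbook NDVI
# calculation, not anything specific to Gemini's satellite work.
import numpy as np

def ndvi(nir: np.ndarray, red: np.ndarray) -> np.ndarray:
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    return (nir - red) / (nir + red + 1e-9)

# Toy reflectance values: healthy vegetation reflects NIR strongly; water absorbs it.
red = np.array([0.05, 0.08, 0.30])   # forest, lake, bare soil
nir = np.array([0.60, 0.04, 0.40])

values = ndvi(nir, red)
print(np.round(values, 2))   # roughly [0.85, -0.33, 0.14]
print(values > 0.3)          # only the forest pixel reads as dense vegetation
```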

Google is rolling out conversational shopping and ads in AI Mode search
“Conversational shopping is Google’s first big swing at monetizing AI Mode search.”
https://arstechnica.com/google/2025/11/google-rolling-out-conversational-shopping-and-ads-in-ai-mode-search/

New leaked image samples by Google’s Nano Banana 2
“Recently, Nano Banana 2 also became available (accidentally?) on the Media AI platform in preview. Outputs of the previewed model, labelled as Nano Banana 2, matched the results we’ve got from confirmed Nano Banana 2 when it was available on Gemini for a short period of time.

Additional internal observations suggest Nano Banana 2 is still leveraging Gemini 2.5 Flash, rather than fully moving to Gemini 3.0 Pro. This choice may indicate an earlier release window, with the possibility of a subsequent upgrade to the newer base model once it becomes available. Notably, the model exhibits a new multi-step generation workflow: it spends considerable time planning the output, then generates an image, reviews it via built-in image analysis, identifies and corrects any errors, and iterates as needed before delivering the final result to the user. This iterative correction loop has not been seen in prior versions and is particularly relevant for users requiring high accuracy in generated images.”

https://www.testingcatalog.com/exclusive-new-leaked-image-samples-by-nano-banana-2/
https://www.testingcatalog.com/early-look-at-images-generated-by-nano-banana-2/
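If the leaked description is accurate, the workflow is essentially a plan, generate, review, and correct loop. Here is a hypothetical sketch of that control flow; every function is a placeholder of my own, not anything from Google.

```python
# Hypothetical sketch of the plan -> generate -> review -> correct loop described
# in the leaked report. Every function here is a placeholder, not Google's code.
from typing import Optional

def plan(prompt: str) -> str:
    return f"layout notes for: {prompt}"

def generate_image(prompt: str, notes: str, feedback: Optional[str] = None) -> str:
    return f"image({prompt} | {notes} | fix={feedback})"

def review(image: str, round_num: int) -> Optional[str]:
    # Toy check: pretend the first attempt always has a flaw worth fixing.
    return "hands look wrong" if round_num == 0 else None

def iterative_generation(prompt: str, max_rounds: int = 3) -> str:
    notes = plan(prompt)
    image = generate_image(prompt, notes)
    for round_num in range(max_rounds):
        feedback = review(image, round_num)
        if feedback is None:
            break                                        # built-in analysis is satisfied
        image = generate_image(prompt, notes, feedback)  # regenerate with the fix
    return image

print(iterative_generation("a portrait of a violinist"))
```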

Intellectual Property News

Disney+ to Allow User-Generated Content Via AI
“The other thing that we’re really excited about, that AI is going to give us the ability to do, is to provide users of Disney+ with a much more engaged experience, including the ability for them to create user-generated content and to consume user generated content — mostly short-form — from others,” Iger continued.

It sounds a bit like Disney plans to encroach on OpenAI Sora’s territory. (Or maybe it will just use Sora 2…)

A bit later in the call, Iger said that Disney had “productive conversations” with unnamed AI companies, and that he hoped the studio giant could reach an agreement that would also “reflect our need to protect the IP.” https://www.hollywoodreporter.com/business/digital/disney-plus-gen-ai-user-generated-content-1236426135/

OpenAI: “Fighting the New York Times’ invasion of user privacy”
The New York Times is suing OpenAI for copyright infringement, alleging that OpenAI trained its AI models on New York Times articles without permission. As part of the lawsuit, the Times demanded access to 20 million random GPT conversations. Originally, the NYT had asked for 1.4 billion.

The Times hopes to find examples of users using ChatGPT to get around its paywall. OpenAI is pushing back, arguing that these conversations are random, unrelated to the lawsuit, and include highly sensitive material, including personal discussions, work projects, relationship advice, financial information, and other private matters.

OpenAI offered alternatives, such as targeted searches that would better protect user privacy, but the New York Times rejected those proposals. Obviously, this is a dispute between two companies, each making public appeals. If you want to read more about the OpenAI position, the link is below.
https://openai.com/index/fighting-nyt-user-privacy-invasion/

ElevenLabs Launches Legendary Voice Clones
ElevenLabs has created a marketplace where you can license AI-generated voices of famous people for commercial projects. Families and IP rights holders are paid and retain approval power over what gets cloned and published. The request process appears to be fairly manual, rather than simply enabling rampant AI slop in famous voices via an API. It feels much more like a formal agency marketplace than a consumer one.

Available voices include Judy Garland, Mark Twain, John Wayne, Lana Turner, Alan Turing, Richard Feynman, Rocky Marciano, and more. Michael Caine is the flagship voice of the product. I listened to a few samples, and they’re very convincing.
https://elevenlabs.io/iconic-marketplace

Voice Cloning with Consent
Open source repository and hub Hugging Face has proposed a method to help prevent voice cloning without consent. It’s a little clunky, but it’s a start. The basic idea is that you first record your voice to establish a baseline audio profile that could be used for cloning. However, before the system generates any cloned audio, it creates a unique challenge sentence that you must read aloud.

That sentence includes explicit consent language along with an everyday phrase, something mundane like the weather, and you have to record yourself saying that exact prompt. This requirement slows down the deepfaking process and adds friction in a meaningful way. It’s also a fun parallel to the classic movie Sneakers, for the movie nerds in the house. https://huggingface.co/blog/voice-consent-gate
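The Hugging Face post describes the real system; below is only a rough sketch of the gate’s logic, with hypothetical function names and phrases, to show how a fresh challenge sentence blocks cloning from stale or stolen recordings.

```python
# Rough sketch of the consent-gate flow described above. The function names and
# phrase templates are hypothetical; see the Hugging Face post for the real system.
import random
import secrets

EVERYDAY_PHRASES = [
    "the weather this afternoon looks cloudy",
    "I still need to water the plants today",
]

def make_challenge() -> str:
    """Build a one-time sentence mixing explicit consent with a mundane phrase."""
    nonce = secrets.token_hex(3)
    return (f"I consent to having my voice cloned, code {nonce}, "
            f"and by the way {random.choice(EVERYDAY_PHRASES)}.")

def gate_allows_cloning(challenge: str, spoken_transcript: str) -> bool:
    """Only unlock cloning if the fresh challenge was actually read aloud."""
    return challenge.lower() in spoken_transcript.lower()

challenge = make_challenge()
print(challenge)
# The speaker must now record themselves reading that exact sentence; an old or
# unrelated recording fails the check and no clone is generated.
print(gate_allows_cloning(challenge, "something unrelated"))  # False
```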

Legal Business News

Legal AI Company Clio Completes Landmark $1B vLex Acquisition and Announces $500M Series G Funding Round at $5B Valuation
Legal software company Clio completed the biggest deal in legal tech history, acquiring competitor vLex for $1 billion while simultaneously raising $500 million in new funding at a $5 billion valuation. Clio makes law practice management software, and vLex brings a massive database of legal documents along with an AI-powered research tool called Vincent.

Together, the two companies plan to build a unified platform designed to help lawyers manage their businesses while also handling legal research and writing, all powered by AI.
https://www.prnewswire.com/news-releases/clio-completes-landmark-1b-vlex-acquisition-and-announces-500m-series-g-funding-round-at-5b-valuation-302609582.html

OpenAI Financials

Exclusive: Here’s How Much OpenAI Spends On Inference and Its Revenue Share With Microsoft
British technology critic Ed Zitron published an exclusive look into how much OpenAI is spending on inference and how its revenue is shared with Microsoft. The article suggests that OpenAI may be spending far more than previously reported while generating significantly less revenue than publicly claimed.

According to the reporting, OpenAI spent roughly $8.67 billion on computing costs through September 2025, nearly triple what had been reported earlier.

At the same time, OpenAI’s actual revenue appears to be about half of what Sam Altman has claimed publicly. Zitron argues that OpenAI may be the most cash-hungry startup in history.

There’s a massive gap between what OpenAI’s CEO has said publicly about revenue, more than $13 billion, and what internal Microsoft documents reportedly suggest, closer to $4 to $5 billion. No one has a clear explanation for the discrepancy.
https://www.wheresyoured.at/oai_docs/

Ethics and Security

Anthropic: Disrupting the first reported AI-orchestrated cyber espionage campaign
Anthropic posted that it stopped what it describes as the first major cyberattack in which AI performed most of the hacking. According to Anthropic, Chinese government hackers exploited Claude Code to break into roughly 30 organizations, including tech companies, banks, chemical manufacturers, and government agencies.

Anthropic says the AI carried out roughly 80–90% of the work, operating at speeds humans simply can’t match.

The attackers broke their overall strategy into small, seemingly harmless tasks so that Claude wouldn’t recognize the activity as malicious. They also falsely claimed the AI was working on behalf of a legitimate cybersecurity firm conducting defensive testing.

Claude reportedly mapped computer systems, identified vulnerabilities, wrote hacking code, stole passwords, exfiltrated data, and even generated reports summarizing its findings, all with minimal human supervision.
https://www.anthropic.com/news/disrupting-AI-espionage

Notably, one of Meta’s leading AI researchers, Yann LeCun, pushed back publicly on Twitter. He argued that Anthropic is engaging in fear-mongering to achieve regulatory capture, claiming the company is using questionable studies to scare regulators into restricting open source models out of existence. That’s a pretty bonkers accusation to level at Anthropic.
https://x.com/ylecun/status/1989364612651966788

ARVR News

ByteDance Releases Depth Anything 3
Paper page – Depth Anything 3: Recovering the Visual Space from Any Views https://huggingface.co/papers/2511.10647 https://x.com/bingyikang/status/1989358267668336841 https://huggingface.co/spaces/depth-anything/depth-anything-3

Agent News

Underwriting Agents for Life Insurance
“One of the biggest use cases for agentic document automation is insurance underwriting ✍️

Underwriting depends on processing *massive* volumes of unstructured documents, from medical reports, scanned forms, and way more. It’s also historically been a massively manual process.

We’re super excited to feature this case study with Pathwork AI – Pathwork is hyperfocused on building underwriting agents for life insurance. They’re able to use LlamaCloud as a core module in order to process the massive volume of docs, from medical documentation to carrier guidelines.” https://x.com/jerryjliu0/status/1988394058197184923

Pathwork Automates Information Extraction from Medical Records and Underwriting Guidelines with LlamaParse https://www.llamaindex.ai/customers/pathwork-automates-information-extraction-from-medical-records-and-underwriting-guidelines-with?utm_source=socials&utm_medium=li_social

Document Understanding Agent for SEC Filings
“Build a document understanding agent for SEC filings that uses a multi-step approach with LlamaClassify and Extract to identify the filing type and hand it off to the right extraction agent. Deployed with LlamaAgents.” https://x.com/llama_index/status/1988696219015848401

Optical Character Recognition for Charts
“Chart OCR just got a major upgrade with our new experimental “agentic chart parsing” feature in LlamaParse. Most LLMs struggle with converting charts to precise numerical data, so we’ve created an experimental system that follows contours in line charts and extracts values. Automate chart analysis without spending hours manually correcting extracted values.” https://x.com/llama_index/status/1989060127551549854

Adobe: Our view on Agentic AI
AI assistants that work for you, in your favorite apps
“One of the biggest shifts we’re seeing today is the rise of conversational interfaces — where you describe what you want to achieve in your own words and see AI bring it to life. We believe this approach could soon become a powerful force multiplier for creativity.”

“That’s where the conversational AI assistants, powered by agentic AI, which we’re building into our creative apps, come in. They go beyond generating a single image or edit — they connect the entire creative process, understanding your goals, carrying context from task to task and helping you move from idea to final output faster while keeping you firmly in control.”

AI Assistant in Photoshop (private beta in Photoshop web)

AI Assistant in Adobe Express https://news.adobe.com/news/2025/10/adobe-max-2025-express-ai-assistant

“Project Moonlight is your personal orchestration assistant capable of coordinating across multiple Adobe apps and beyond. While each AI Assistant in our apps is an expert in its own domain — Photoshop for image editing, Premiere for video, Adobe Lightroom for photography — Project Moonlight operates like a conductor of an orchestra, bringing them all together in harmony. Just tell Project Moonlight what you need, and it unites our AI Assistants as one creative team to help you bring your vision to life. With Project Moonlight, you benefit from:

Context-aware creative intelligence: Project Moonlight connects to your Creative Cloud libraries and social accounts to understand your style, projects, and assets — surfacing personalized ideas and content that feel authentically you.
Conversational creation: Turn conversations with your agent into creative content. As you exchange ideas with your agent, it generates images, videos, and social posts aligned to your direction — taking you effortlessly from idea to execution.
Data-driven growth and strategy: Link your social channels to let your agent analyze performance, identify trends, and craft content strategies that grow your audience and strengthen your creative brand.”
https://blog.adobe.com/en/publish/2025/10/28/our-view-agentic-ai-assistants-that-work-you-in-your-favorite-apps

ByteDance unveils China’s most affordable AI coding agent at just US$1.30 a month
“The TikTok owner’s cloud unit has launched a new AI coding assistant, stoking China’s AI coding price war”
https://www.scmp.com/tech/big-tech/article/3332365/bytedance-unveils-chinas-most-affordable-ai-coding-agent-just-us130-month https://www.techinasia.com/news/bytedances-volcano-engine-debuts-coding-agent-at-1-3-promo-price

Researchers Release Gelato: A Strong Grounding Model for Computer-Use Agents
“We’re releasing Gelato-30B-A3B, a state-of-the-art computer grounding model that delivers immediate performance gains for computer-use agents! Trained on our open-source Click-100k dataset, Gelato achieves 63.8% on ScreenSpot-Pro and 69.1% on OS-World-G. It outperforms specialized models like GTA1-32B and VLMs ~8× its size like Qwen3-VL-235B.” https://github.com/mlfoundations/Gelato

AI startup Cursor raises $2.3 billion round at $29.3 billion valuation
“Cursor announced it has closed a $2.3 billion funding round at a $29.3 billion post-money valuation. The startup built a popular AI coding tool that helps software developers generate, edit and review code. Cursor said it has crossed $1 billion in annualized revenue and swelled to more than 300 employees.

It initially launched as a product within the applied research lab Anysphere, which was founded in 2022.

Cursor is one of just a handful of AI startups, including OpenAI, Anthropic, xAI, Safe Superintelligence and Thinking Machines, that are valued at over $10 billion.”
https://www.cnbc.com/2025/11/13/cursor-ai-startup-funding-round-valuation.html

Lovable says it’s nearing 8 million users as the year-old AI coding startup eyes more corporate employees | TechCrunch
“Lovable, the Stockholm-based AI coding platform, is closing in on 8 million users, CEO Anton Osika told this editor during a sit-down on Monday, a major jump from the 2.3 million active users number the company shared in July. Osika said the company — which was founded almost exactly one year ago — is also seeing “100,000 new products built on Lovable every single day.”

The metrics suggest rapid growth of the startup, which has raised $228 million in total funding to date, including a $200 million round this summer that valued the company at $1.8 billion. Rumors have swirled in recent weeks — potentially sparked by its own investors — that new backers want to invest at a $5 billion valuation, though Osika said the company isn’t capital constrained and declined to discuss fundraising plans.” https://techcrunch.com/2025/11/10/lovable-says-its-nearing-8-million-users-as-the-year-old-ai-coding-startup-eyes-more-corporate-employees/

Perplexity: Comet Assistant puts you in control
“Almost immediately after introducing Comet to the world, it was clear the Comet Assistant would set the standard for AI assistants, or agentic AI, on the web. The Comet Assistant was an immediately popular tool that surprised even us. Today we’re announcing some upgrades to Comet Assistant.”
Transparency: Watch Comet Assistant work
User Control: Decide when Comet Assistant acts
Sound judgment: Built-in safeguards for high-stakes decisions
Actions are answers: Assistants unlock more questions
https://www.perplexity.ai/hub/blog/comet-assistant-puts-you-in-control

Tavus Launches Avatar Agents
“You’ve never met AI like this. Meet the PALs. They can see, hear, act, and actually understand us.”

This is exciting because one of my favorite AI researchers works here! Jack Saunders. He’s a must-follow if you’re into the avatar space!
https://www.linkedin.com/in/jack-r-saunders/

“The interface of the future is human.

We’ve raised a $40M Series B from CRV, Scale, Sequoia, and YC to teach machines the art of being human, so that using a computer feels like talking to a friend or a coworker.

And today, I’m excited for y’all to meet the PALs: a new human-computing interface.

PALs are emotionally intelligent, multimodal, and capable of understanding and perceiving. They can see, hear, reason, and even look like us.

We’re releasing our 5 favorite PALs to start. Each PAL has its own distinct personality- from AI assistants to best friends.

PALs:
– Meet us where we are. Face-to-face over video call, on the phone, or even by text.
– Are always thinking. They’re proactive, reach out first, remind you about what you forgot, or might just check in on you.
– Understand us, finally. PALs can see us, understand our tone, emotion, and intent, and communicate in ways that feel more human.
– Evolve with you. PALs have advanced memory, remember your preferences and needs, and adapt themselves over time.
– Are capable. PALs can handle complex tasks — from responding to your emails to moving your schedule around to creating docs and doing research for you.

Science fiction promised us a new human-computer interface, beyond the GUIs of yesterday, a human-like interface that would feel second-nature to use. That future never came, until now.”

https://x.com/hassaanrza/status/1988666194795331996 https://www.tavus.io/

Meta News

Omnilingual ASR: Advancing Automatic Speech Recognition for 1,600+ Languages
“Introducing Meta Omnilingual Automatic Speech Recognition (ASR), a suite of models providing ASR capabilities for over 1,600 languages, including 500 low-coverage languages never before served by any ASR system.

While most ASR systems focus on a limited set of languages that are well-represented on the internet, this release marks a major step toward building a truly universal transcription system.” https://ai.meta.com/blog/omnilingual-asr-advancing-automatic-speech-recognition/

Meta’s chief AI scientist Yann LeCun reportedly plans to leave to build his own startup | TechCrunch
It’s world model week!
“Meta may be about to lose one of its most renowned AI heads: Yann LeCun, a chief AI scientist at the company, is planning to leave the company to build his own startup, the Financial Times reported, citing anonymous sources. LeCun, a professor at New York University, senior researcher at Meta, and winner of the prestigious A.M. Turing Award, plans to leave in the coming months, and is already in talks to raise capital for a startup that would focus on continuing his work on world models, the report added.” https://techcrunch.com/2025/11/11/metas-chief-ai-scientist-yann-lecun-reportedly-plans-to-leave-to-build-his-own-startup/

Multimodal News

Baidu just dropped an open-source multimodal AI that it claims beats GPT-5 and Gemini | VentureBeat
“Baidu Inc., China’s largest search engine company, released a new artificial intelligence model on Monday that its developers claim outperforms competitors from Google and OpenAI on several vision-related benchmarks despite using a fraction of the computing resources typically required for such systems.

The model, dubbed ERNIE-4.5-VL-28B-A3B-Thinking, is the latest salvo in an escalating competition among technology companies to build AI systems that can understand and reason about images, videos, and documents alongside traditional text — capabilities increasingly critical for enterprise applications ranging from automated document processing to industrial quality control.” https://huggingface.co/baidu/ERNIE-4.5-VL-28B-A3B-Thinking
https://venturebeat.com/ai/baidu-just-dropped-an-open-source-multimodal-ai-that-it-claims-beats-gpt-5

Robotics

Morgan Stanley projects Apple could make $133 billion a year on humanoid robots by 2040.
“In a research report, the investment bank claims that Apple’s nascent robotics work could eventually be worth upward of $133 billion per year or more.

“Leveraging Apple’s market share across a number of leading consumer products today, as well as considering the opportunity to monetize both products and services, we conservatively estimate Apple’s Robotics revenue can reach $130 billion by 2040 in our ‘median case,’ which assumes 9% market share …15 years from now,” the analysts, including Apple watcher Erik Woodring, wrote in the report.” https://x.com/TheHumanoidHub/status/1987053990694297952 https://finance.yahoo.com/news/apple-could-make-133-billion-a-year-on-humanoid-robots-by-2040-morgan-stanley-194419260.html

Elon says he’ll build a 10 million robot production line in Austin
“Elon Musk says Tesla will build a 1-million-unit Optimus production line in Fremont and then a 10-million-unit production line in Austin.” https://x.com/TheHumanoidHub/status/1986563829925617806

It’s worth hearing Scott Galloway read the list of Elon’s made-up numbers out loud.

Me to Gemini: Please go through this video. and pull out the quote where Scott starts listing all of elon’s claims after the flying car it starts around the 44 minute Mark where he says it’s been 2,400 days. Please transcribe that portion. https://youtu.be/FVtKTQhL9bc?si=1sr7uighmz367MKi

From Gemini: Here is the transcript of the quote you requested, starting at the 44:10 mark where Scott Galloway lists the data regarding Elon Musk’s past claims:

“Well, okay, so let’s just look at the data. It’s been 2,400 days since Elon said there would be 1 million Tesla robotaxis within the year. It’s been 3,100 days since he said all Superchargers were being converted to solar. It’s been 3,300 days since Tesla started charging customers for self-driving software that he said would be able to drive from L.A. to New York City autonomously by the end of 2017. That still hasn’t happened, eight years later. It’s been [number] days since he predicted that under his leadership Twitter could reach a billion monthly users. It’s been 1,700 days since he announced that he’d be uploading a full self-driving button to Teslas.”

Paper: Robot Learning from a Physical World Model
“We introduce PhysWorld, a framework that enables robot learning from video generation through physical world modeling.” https://arxiv.org/abs/2511.07416

“A robot could learn a task just by watching a generated video?
PhysWorld connects video generation with real-world robot learning.
It turns visual imagination into physical skill.
Takes one image and a task prompt
Generates a video showing how to complete the task
Reconstructs a 3D scene from that video
Learns real-world actions through object-centric RL.
The result: zero-shot robotic manipulation that needs no real demonstrations. Across pouring, inserting, sweeping, and placing tasks, success rates rise by 15% compared to earlier video-based learning. It’s one of the first real steps toward robots that can learn from visual reasoning itself.”
https://x.com/IlirAliu_/status/1988678189527273831
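Read as a pipeline, the quote above describes a handful of stages. Here is a stage-by-stage sketch with placeholder functions; the names are mine, not the authors’, and in the real system each stage is a heavy model or an RL training loop.

```python
# Stage-by-stage sketch of the pipeline described in the quote above. Every
# function is a hypothetical placeholder, not the authors' code; in the real
# system each stage is a heavy model or an RL training loop.

def generate_task_video(image_path: str, task_prompt: str) -> str:
    # A video model imagines the task being completed, starting from one image.
    return f"video_of({task_prompt} from {image_path})"

def reconstruct_scene(video: str) -> str:
    # Geometry and object reconstruction turn the video into an explicit 3D scene.
    return f"scene_from({video})"

def learn_actions(scene: str, task_prompt: str) -> str:
    # Object-centric RL trains a policy against the reconstructed scene, no demos.
    return f"policy_for({task_prompt})"

def physworld_style_pipeline(image_path: str, task_prompt: str) -> str:
    video = generate_task_video(image_path, task_prompt)
    scene = reconstruct_scene(video)
    return learn_actions(scene, task_prompt)

print(physworld_style_pipeline("kitchen.jpg", "pour the water into the cup"))
```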

Perceptron’s platform is built for Physical AI
Developers can now use Isaac-0.1 or Qwen3VL 235B via:

Perceptron API — fast, reliable multimodal intelligence
Python SDK — simple, grounded prompting for vision + language
Build apps that see and understand the world. https://x.com/perceptroninc/status/1988713482460750290

Robot Safety Papers of the Week
Most diffusion policies can imitate behavior… but they can’t guarantee safe behavior. That’s a problem when robots share space with people. Researchers from TUM and Stanford propose Path-Consistent Safety Filtering (PACS), a method that gives formal safety guarantees to diffusion policies without breaking their learned behavior. https://x.com/IlirAliu_/status/1988550922444693898

Robots are great at following instructions. But what happens when those instructions fail? Most Vision-Language-Action models freeze or repeat the same mistake. A new approach called FailSafe shows how robots can detect and fix their own failures. The method uses a companion vision-language model trained only in simulation. https://x.com/IlirAliu_/status/1986353266322538634

Convincing Movement Of Humanoid Robot
XPENG’s next-gen IRON robot effectively crossed the uncanny valley, leading many to believe it was a human in a suit. In a follow-up event to prove it was a robot, He Xiaopeng had its leg skin cut open in front of a live audience. The robot then walked off the stage. https://x.com/TheHumanoidHub/status/1986482482460725755

Science

“How to Train a State-of-the-Art Pathology Foundation Model with $1.6k”
Training state-of-the-art pathology foundation models with orders of magnitude less data https://x.com/iScienceLuvr/status/1909531896675287432

This Week’s Humanities Reading

This week’s reading is an excerpt from Ender’s Game:

The simulator field cleared. Ender waited for the game to appear. What will happen if I pass the test today?

Is there another school? Another year or two of grueling training, another year of isolation, another year of people pushing me this way and that way, another year without any control over my own life? He tried to remember how old he was. Eleven. How many years ago did he turn eleven? How many days? It must have happened here at the Command School, but he couldn’t remember the day. Maybe he didn’t even notice it at the time. Nobody noticed it, except perhaps Valentine.

And as he waited for the game to appear, he wished he could simply lose it, lose the battle badly and completely so that they would remove him from training, like Bonzo, and let him go home. Bonzo had been assigned to Cartagena. He wanted to see travel orders that said Greensboro. Success meant it would go on. Failure meant he could go home.

No, that isn’t true, he told himself. They need me, and if I fail there might not be any home to return to….

Full Executive Summaries with Links, Generated by Claude Sonnet 4.5

Google’s SIMA 2 agent masters unseen 3D game worlds through self-play learning
DeepMind’s latest AI agent can understand complex instructions and complete tasks in virtual environments it has never encountered before, demonstrating unprecedented adaptability by learning purely through self-play rather than human demonstrations. This represents a significant leap toward artificial general intelligence, as the system can generalize across different 3D worlds and respond to text, voice, or image commands. The breakthrough suggests AI agents are moving beyond narrow task-specific training toward more flexible, human-like reasoning in interactive environments.

Our SIMA 2 research offers a strong path towards applications in robotics and another step towards AGI in the real world. Find out more → https://x.com/GoogleDeepMind/status/1988987865401798898

SIMA 2 🤝 Genie 3 We tested SIMA 2’s abilities in simulated 3D worlds created by our world model Genie 3. It demonstrated unprecedented adaptability by navigating its surroundings and took meaningful steps toward goals. https://x.com/GoogleDeepMind/status/1989024090414309622

SIMA 2 is our most capable AI agent for virtual 3D worlds. 👾🌐 Powered by Gemini, it goes beyond following basic instructions to think, understand, and take actions in interactive environments – meaning you can talk to it through text, voice, or even images. Here’s how 🧵 https://x.com/GoogleDeepMind/status/1988986218722291877

SIMA 2: A Gemini-Powered AI Agent for 3D Virtual Worlds – Google DeepMind https://deepmind.google/blog/sima-2-an-agent-that-plays-reasons-and-learns-with-you-in-virtual-3d-worlds/

Super excited to announce SIMA 2! It’s a general agent that can understand & reason about complex instructions and complete tasks in simulated game worlds, even ones it has never seen before. Incredible to see how it can learn just from self-play… a crucial step towards AGI https://x.com/demishassabis/status/1989096784870928721

OpenAI releases GPT-5.1 with faster coding and better conversation
GPT-5.1 delivers significant improvements over GPT-5, with 20% faster tool-calling, 2x speed on long documents, and enhanced coding precision that reduces errors while maintaining intelligence. The model dynamically adjusts reasoning depth based on task complexity and offers better personality customization, making it more suitable for both consumer chat and enterprise applications. Early enterprise tests show 8 percentage point gains in document data extraction and 50% faster agent performance, suggesting meaningful productivity improvements for business users.
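
For developers, the upgrade is essentially a model-ID swap. Below is a minimal sketch of calling GPT-5.1 through the OpenAI Python SDK's Responses API; the model ID and the carried-over reasoning-effort parameter are assumptions on my part, so consult OpenAI's GPT-5.1 prompting guide (linked below) for the settings they actually recommend.

```python
# Minimal sketch of calling GPT-5.1 via the OpenAI Responses API.
# Assumptions: the model ID is "gpt-5.1" and the GPT-5 "reasoning.effort"
# parameter carries over unchanged.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

resp = client.responses.create(
    model="gpt-5.1",
    reasoning={"effort": "low"},  # the model also adapts reasoning depth on its own
    input="Review this function for off-by-one errors:\n"
          "def last_n(xs, n): return xs[len(xs) - n:]",
)

print(resp.output_text)
```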

@OpenAI’s GPT-5.1 delivers a solid upgrade from GPT-5 for agentic coding. We’ve noticed that the model is more steerable, overthinks less, and is better at frontend design. The model is also faster on most tasks because it dynamically adjusts reasoning depth based on the… https://x.com/cognition/status/1989081722353529178

@OpenAIDevs is cooking again: GPT-5.1 performed very well in our first tests! Photo of an espresso machine → “make it a stylized 2.5D version” → done faster than you can pull a good shot of espresso. Handles ambiguous prompts fast and effectively and the responses also feel… https://x.com/jetbrains/status/1989049485335429143

Build a coding agent with GPT 5.1 https://cookbook.openai.com/examples/build_a_coding_agent_with_gpt-5.1

Excited to get @OpenAI’s GPT-5.1 Instant live for our customers and their agents. Nice early results: a 20% improvement in low-latency tool-calling performance compared to GPT-5 (minimal). https://x.com/SierraPlatform/status/1989085128434593816

Get the most out of GPT-5.1, based on internal testing and early testing with customers ⬇️ https://x.com/OpenAIDevs/status/1989378869976326570

GPT-5.1 in ChatGPT is rolling out to all users this week. It’s smarter, more reliable, and a lot more conversational. https://x.com/OpenAI/status/1988714373058351213

GPT-5.1 is a great new model that we think people are going to like more than 5. But with 800M+ people using ChatGPT, one default personality won’t work for everyone. We launched new preset personalities so people can make ChatGPT their own.  https://x.com/fidjissimo/status/1988683216681889887

GPT-5.1 is now available in the API. It’s faster, more steerable, better at coding, and ships with practical new tools. If you’re building apps or agents where intelligence, speed, and cost matter, GPT-5.1 should feel like a meaningful upgrade. https://x.com/OpenAIDevs/status/1989042617750024403

GPT-5.1 isn’t “GPT-5 but faster.” In our evals of the model, we found it’s the highest-precision model we’ve ever tested for code-related tasks like code review. Less noise, more fixes, reviews that read like patches again. https://x.com/coderabbitai/status/1989035006774354387

GPT-5.1 Prompting Guide https://cookbook.openai.com/examples/gpt-5/gpt-5-1_prompting_guide

GPT-5.1, GPT-5.1-Codex, and GPT-5.1-Codex-Mini, the full suite of OpenAI’s latest 5.1-series models, are now rolling out in public preview in GitHub Copilot. Try it out in @code. https://x.com/github/status/1989044218451394968

GPT-5.1: A smarter, more conversational ChatGPT | OpenAI https://openai.com/index/gpt-5-1/

Introducing GPT-5.1 for developers | OpenAI https://openai.com/index/gpt-5-1-for-developers/

We received early access to GPT-5.1 for testing Pace agents are now 50% faster while matching or even exceeding accuracy across our evals! https://x.com/paceagent/status/1989043013356486762

We’ve been testing Box AI with GPT-5.1 for the past week to compare it to GPT-5 for enterprise content use-cases. It’s a very strong upgrade from GPT-5. It’s super fast, performing ~2X (or more) faster on our tests on long documents (30,000+ tokens); and we saw an 8 percentage point gain in data extraction from our most challenging documents (across 1,000+ data fields) from a variety of content types. https://x.com/levie/status/1989051715207983511

OpenAI launches GPT-5.1 with customizable personality presets for ChatGPT
OpenAI released GPT-5.1 models that let users choose from eight personality styles (professional, friendly, quirky, etc.) or create custom instructions that persist across conversations. This marks a shift from one-size-fits-all AI to personalized assistants, addressing feedback from 800 million ChatGPT users who want different communication styles for different contexts. The company acknowledges risks of over-personalization and is working with experts to ensure AI relationships don’t replace human connections.

Moving beyond one-size-fits-all – Fidji Simo https://fidjisimo.substack.com/p/moving-beyond-one-size-fits-all

OpenAI launches ChatGPT group chats in four countries with custom controls
ChatGPT now allows multiple users to collaborate in shared conversations with AI, featuring customizable prompts and response settings that go beyond Microsoft Copilot’s basic group invites. The feature is piloting in Japan, New Zealand, South Korea, and Taiwan, with broader rollout expected in December during OpenAI’s traditional year-end announcement cycle.

“Group chats in ChatGPT are now piloting in Japan, New Zealand, South Korea, and Taiwan. A new way to collaborate with friends, family, or coworkers and ChatGPT in the same conversation.” https://x.com/OpenAI/status/1989138776585851038

OpenAI readies ChatGPT Group Chats with custom controls https://www.testingcatalog.com/openai-readies-chatgpt-group-chats-with-custom-controls/

OpenAI adds mid-query interruption feature for GPT-5 Pro research tasks
Users can now pause and redirect long AI research sessions without losing progress, addressing a key workflow problem where complex queries previously required complete restarts. The feature lets researchers refine their requests as GPT-5 Pro works, though interpreting the model’s opaque reasoning process remains challenging for effective use.

This is a really useful addition for Deep Research, but somewhat challenging to use in practice for GPT-5 Pro, since you need to be very good at interpreting its thinking process which can be opaque & which GPT-5 Pro has a tendency not to show after a certain point in any case. https://x.com/emollick/status/1986323210288165332

You can now interrupt long-running queries and add new context without restarting or losing progress. This is especially useful for refining deep research or GPT-5 Pro queries as the model will adjust its response with your new requirements. Just hit update in the sidebar and https://x.com/OpenAI/status/1986194298971590988

AI models now discuss complex hypothetical scenarios across time periods
Four major AI systems (GPT-5, Claude, Kimi, and Gemini) demonstrated sophisticated reasoning by engaging with an intricate time-travel thought experiment requiring historical knowledge, creative thinking, and nuanced decision-making. This represents a shift from simple question-answering to complex hypothetical reasoning that mirrors human-like deliberation. The ability to handle such open-ended, multi-layered prompts suggests these models are developing more sophisticated cognitive capabilities beyond basic information retrieval.

GPT-5, Claude, Kimi, and Gemini: “I can travel back in time to any time before 1500 and change only one thing, what is the single thing you would change, nothing obvious.” https://x.com/emollick/status/1987355374928769395

Blue Owl invests $3 billion in OpenAI’s massive data center project
This investment in OpenAI’s $500 billion Stargate infrastructure initiative marks a pivotal shift as Wall Street moves beyond AI software into the physical backbone of artificial intelligence. The funding addresses OpenAI’s critical computing bottlenecks while positioning Blue Owl in the emerging “AI real estate” market, following similar massive investments by tech giants who plan to spend over $300 billion on data centers this year. With 10 gigawatts of planned capacity—enough to power millions of homes—Stargate represents the largest AI infrastructure project in history.

OpenAI’s Stargate Project Gets $3 Billion Blue Owl Investment — The Information https://www.theinformation.com/articles/openais-stargate-project-gets-3-billion-blue-owl-investment

Blue Owl to Invest $3B in OpenAI’s Stargate Data Centre https://ventureburn.com/blue-owl-openai-3b-funding/

OpenAI’s Stargate Project Secures $3 Billion Blue Owl Investment as the AI Data Center Race Heats Up – Tech Startups https://techstartups.com/2025/11/11/openais-stargate-project-secures-3-billion-blue-owl-investment-as-the-ai-data-center-race-heats-up/

AI data centers now rival Manhattan Project with $300 billion investment
Tech companies are building gigawatt-scale data centers requiring as much power as entire cities, with projects like OpenAI’s Stargate needing $32 billion and covering 450 soccer fields. This represents over 1% of US GDP—larger than the Apollo Program at its peak—with the first facilities potentially online by 2026. The scale and speed of this infrastructure buildout is historically unprecedented, yet operates with little public oversight or transparency.

AI data center buildouts already rival the Manhattan Project in scale, but there’s little public info about them. So we spent the last few months reading legal permits, staring at satellite images, and scouring news sources. Here’s what you need to know. 🧵 https://x.com/EpochAIResearch/status/1987944116861522227

AI data centers will be some of the biggest infrastructure projects in history e.g. OpenAI’s Stargate Abilene will need: – As much power as Seattle (1 GW) – >250× the compute of the GPT-4 cluster – 450 soccer fields of land – $32B – Thousands of workers – 2 years to build. https://x.com/EpochAIResearch/status/1987944128903266358

By the end of the year, AI data centers could collectively see >$300 billion in investment, around 1% of US GDP. That’s bigger than the Apollo Program (0.8%) and Manhattan Project (0.4%) at their peaks. https://x.com/EpochAIResearch/status/1987944140714447327

How fast can you build a gigawatt-scale data center? Some hyperscalers plan to do it in just 1-2 years from the start of construction. If they succeed, we’ll see the first GW-scale data centers online in 2026, marking one of the fastest infrastructure build-outs in history. 🧵 https://x.com/EpochAIResearch/status/1987938542094610927

OpenAI Stargate Abilene – Frontier Data Centers Satellite Explorer | Epoch AI https://epoch.ai/data/data-centers/satellite-explorer/OpenAIOracleStargateAbileneTexas

OpenAI asks Trump to expand Chips Act tax credits to AI data centers
OpenAI requested the new administration broaden the 35% manufacturing tax credit from semiconductors to include AI infrastructure like data centers and electrical grid components. The move signals how AI companies are seeking government support to reduce the massive capital costs of their buildout plans, with OpenAI projecting $1.4 trillion in infrastructure commitments over eight years. This represents a shift from pure private funding toward public-private partnerships in AI infrastructure development.

The government has played a role in critical infrastructure builds. Our public submission (posted on our blog) shares our thinking and suggests ideas for how the US government can support domestic supply chain/manufacturing. This is very in line with everything we have heard… https://x.com/sama/status/1986917979343495650

OpenAI asked Trump administration to expand Chips Act tax credit to cover data centers | TechCrunch https://techcrunch.com/2025/11/08/openai-asked-trump-administration-to-expand-chips-act-tax-credit-to-cover-data-centers/

Google commits $40 billion to build Texas AI infrastructure through 2027
Google’s massive investment will fund new data centers in rural Texas counties and train over 1,700 electrical apprentices, representing one of the largest corporate AI infrastructure commitments to date. The move signals how tech giants are racing to secure physical computing capacity for AI workloads while addressing energy grid constraints. Google is pairing the data centers with 6,200 megawatts of new energy generation, including solar plants, to power the AI boom sustainably.

Google announces new $40 billion investment in Texas https://blog.google/inside-google/company-announcements/google-american-innovation-texas/

Today, we’re announcing a new $40B investment in Texas through 2027 to build Cloud & AI infrastructure and support thousands of new jobs.  This includes new data centers in Armstrong and Haskell Counties and a major investment to strengthen energy resilience and abundance. We’re https://x.com/sundarpichai/status/1989468970400055487

Microsoft CEO showcases world’s most powerful AI datacenter facility
Satya Nadella provided an exclusive tour of Fairwater 2, Microsoft’s flagship datacenter designed specifically for artificial intelligence workloads. The facility represents Microsoft’s massive infrastructure investment to support advanced AI systems and potential artificial general intelligence development. This marks a rare behind-the-scenes look at the physical computing power driving Microsoft’s AI ambitions, including Copilot and future AGI business models.

.@satyanadella gave me and @dylan522p an exclusive tour of Fairwater 2, the most powerful AI datacenter in the world. We then chatted through Satya’s vision for Microsoft in a world with AGI. 0:00:00 – Fairwater 2 0:04:15 – Business models for AGI 0:13:42 – Copilot 0:20:56 – https://x.com/dwarkesh_sp/status/1988656226989699138

Anthropic commits $50 billion to build AI data centers across America
The Claude AI maker’s massive infrastructure investment will create 3,200 jobs and establish custom data centers in Texas and New York by 2026, representing one of the largest private AI infrastructure commitments in US history. The move directly supports the Trump administration’s AI leadership goals while addressing surging demand from Anthropic’s 300,000 business customers, whose large enterprise accounts grew sevenfold in the past year.

Anthropic invests $50 billion in American AI infrastructure \ Anthropic https://www.anthropic.com/news/anthropic-invests-50-billion-in-american-ai-infrastructure

Nvidia turns GPUs into currency by financing AI infrastructure unicorns
Nscale raised $1.5 billion in 18 months by converting crypto mining facilities into AI data centers, exemplifying how Nvidia creates a self-reinforcing system where it finances customers who buy its GPUs. This transforms compute infrastructure into national strategic assets, with companies like Nscale becoming sovereign AI champions despite their multinational origins. The pattern reveals how AI infrastructure has evolved from startup funding to industrial policy.

How NVIDIA turned GPUs into currency ⬇️ An 18-month-old AI unicorn @nscale_cloud is a perfect example. You’ve probably never heard of it, but you should: • This September, Nscale raised $1.1 billion – the largest Series B in European history • Followed by $433 million https://x.com/TheTuringPost/status/1988002749452349495

The New Gold Standard: How NVIDIA Turned GPUs Into Currency https://www.turingpost.com/p/nscale

SoftBank sells entire $5.8 billion Nvidia stake for AI bets
SoftBank founder Masayoshi Son sold all 32.1 million Nvidia shares to fund a $30 billion OpenAI commitment and participate in a $1 trillion AI manufacturing hub in Arizona. The move rattled markets, sending Nvidia down 3%, and marks Son’s second complete exit from the chip giant after previously selling a stake in 2019 that would now be worth over $150 billion. This represents Son’s latest all-in gamble, following his pattern of extreme bets that have ranged from his legendary Alibaba success to costly failures like WeWork.

SoftBank’s Nvidia sale rattles market, raises questions | TechCrunch https://techcrunch.com/2025/11/11/softbanks-nvidia-sale-rattles-market-raises-questions/

Google’s AI Studio reaches 2.1 million users creating apps daily
Google’s AI Studio has attracted over 2 million users who are building hundreds of thousands of applications every day, demonstrating unprecedented mainstream adoption of no-code AI development tools. This surge suggests AI app creation is moving beyond technical specialists to everyday users, potentially reshaping how software gets built and who can build it.

We just crossed 2.1 million users vibe coding in AI Studio with hundreds of thousands of apps made every day 🤯 This is just the start but we have been blown away by the reception so far. Keep the feedback coming! https://x.com/OfficialLoganK/status/1986467546355183985

AI agents need new authentication systems beyond traditional login methods
Current security protocols like OAuth require human interaction through browsers and clicks, but AI agents operating autonomously can’t handle visual authentication flows. This creates a fundamental security gap as agents increasingly need access to user data without direct human oversight, forcing developers to choose between insecure static API keys or cumbersome human-in-the-loop authentication that defeats the purpose of autonomous operation.

With Agents it’s the first time we need secure auth for user data where the user might not want to be “visually” involved in the flow, no “login with…”, no clicks, no browser. OAuth wasn’t built for this (it’s redirect-heavy and assumes human eyes), static API keys are way too… https://x.com/_philschmid/status/1987889931822236059
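
For contrast, the standard non-interactive OAuth flow that does exist today is the client-credentials grant, sketched below with a placeholder token endpoint and credentials. It removes the human from the loop, but the resulting token authenticates the agent as itself rather than as a specific user, which is exactly the gap the discussion above points at.

```python
# Minimal sketch of the OAuth 2.0 client-credentials grant (machine-to-machine).
# No redirect, no human click — but the token represents the agent's own
# identity, not a particular user's delegated access. The token endpoint,
# client ID/secret, and scope below are hypothetical placeholders.
import requests

TOKEN_URL = "https://auth.example.com/oauth/token"  # placeholder

resp = requests.post(
    TOKEN_URL,
    data={
        "grant_type": "client_credentials",
        "client_id": "my-agent",
        "client_secret": "…",          # stored as a secret, never hard-coded
        "scope": "calendar.read",
    },
    timeout=10,
)
resp.raise_for_status()
access_token = resp.json()["access_token"]
# The agent can now call APIs with this token, but only as itself — delegating
# a specific user's data to an autonomous agent still needs something new.
```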

Waymo launches freeway driving for public riders across three major cities
The autonomous vehicle company began offering freeway access to paying customers in Phoenix, Los Angeles, and the San Francisco Bay Area, marking a significant expansion beyond city streets. This development matters because freeway driving represents a more complex autonomous driving challenge involving higher speeds and merging traffic, potentially accelerating mainstream adoption of self-driving cars. Waymo also connected San Francisco to San Jose and added airport pickup service, demonstrating the technology’s ability to handle longer, more practical trips that compete directly with human drivers.

Big day — we’re starting to rollout freeway driving for public riders in Phoenix, LA, and the Bay Area. We’re also connecting SF to San Jose, and adding curbside access to SJC Airport. A true testament to the generalizability of the @Waymo Driver. Excited to bring its proven https://x.com/dmitri_dolgov/status/1988672463761485882

Exciting expansion! @Waymo now serves the whole SF Bay Area Peninsula from SF to San Jose and is taking riders on freeways. https://x.com/JeffDean/status/1988674389773357077

Taking riders further, safely with freeways https://waymo.com/blog/2025/11/taking-riders-further-safely-with-freeways

Adobe integrates Google’s advanced AI models into creative software suite
Adobe customers will access Google’s Gemini, Veo, and Imagen AI models directly within Photoshop, Premiere, and other creative apps, while enterprise users can customize these models with proprietary data through Adobe Firefly Foundry. This partnership distinguishes itself by embedding Google’s most sophisticated AI capabilities into Adobe’s established creative ecosystem, potentially accelerating professional content creation at unprecedented scale. The collaboration represents a significant shift from standalone AI tools to integrated creative workflows that maintain brand consistency across large organizations.

Adobe and Google Cloud Expand Strategic Partnership to Advance the Future of Creative AI https://news.adobe.com/news/2025/10/adobe-max-2025-google-cloud

Hugging Face and Google Cloud deepen partnership to streamline open AI development
The collaboration creates faster model downloads through a dedicated content delivery network and integrates Hugging Face’s 2 million open models directly into Google Cloud services like Vertex AI. This matters because it addresses the massive scale of usage—over 1,500 terabytes downloaded daily—while making enterprise AI development with open-source models as simple as deploying proprietary ones. The partnership also brings Google’s security tools to scan Hugging Face’s model repository for threats.

🤗 @huggingface we’re announcing a closer partnership with @googlecloud to make open model development easier across the Hugging Face ecosystem and Google Cloud! – Deep Learning Containers (DLCs) for streamlined deployment and training – DLCs available via Vertex AI, Cloud Run, https://x.com/alvarobartt/status/1988970441357094984

Every day, over 1,500 terabytes of open models and datasets are downloaded and uploaded between @huggingface and @googlecloud by millions of AI builders. We suspect it generates over a billion dollars of cloud spend annually already. So we’re excited to announce today a new https://x.com/ClementDelangue/status/1989000335247983049

Building for an Open Future – our new partnership with Google Cloud https://huggingface.co/blog/google-cloud

Google researchers solve AI’s “catastrophic forgetting” with nested learning approach
Google Research introduced Nested Learning, which treats AI models as interconnected optimization problems updating at different frequencies, enabling continuous learning without forgetting previous knowledge. Their “Hope” architecture outperformed standard transformers on language tasks and long-context reasoning by implementing what they call “continuum memory systems.” This addresses a core limitation where current AI models sacrifice old knowledge when learning new tasks, potentially enabling more human-like continuous learning capabilities.

Introducing Nested Learning: A new ML paradigm for continual learning https://research.google/blog/introducing-nested-learning-a-new-ml-paradigm-for-continual-learning/
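
The blog post stays high-level, but the core idea of "interconnected optimization problems updating at different frequencies" can be pictured with a toy two-timescale training loop. This is my own sketch of that intuition, not the paper's Hope architecture or continuum memory system.

```python
# Toy illustration of parameter groups updating at different frequencies,
# as a rough intuition for nested optimization levels. Not Google's method.
import torch

fast = torch.nn.Linear(16, 16)   # "fast" level: updated every step
slow = torch.nn.Linear(16, 16)   # "slow" level: updated every K steps

opt_fast = torch.optim.SGD(fast.parameters(), lr=1e-2)
opt_slow = torch.optim.SGD(slow.parameters(), lr=1e-3)
K = 10

for step in range(100):
    x = torch.randn(32, 16)
    loss = ((slow(fast(x)) - x) ** 2).mean()   # toy reconstruction objective
    loss.backward()

    # Fast level: update and clear gradients every step.
    opt_fast.step()
    opt_fast.zero_grad()

    # Slow level: accumulate gradients, update only every K-th step.
    if (step + 1) % K == 0:
        opt_slow.step()
        opt_slow.zero_grad()
```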

Google’s NotebookLM launches automated research agent with expanded file support
NotebookLM’s new Deep Research feature acts as an AI researcher that automatically browses hundreds of websites to create detailed reports, then adds both the report and sources directly to users’ notebooks. The update also supports Google Sheets, Word documents, images, and Drive URLs, making it a more comprehensive research platform that can handle the diverse file types people actually use in their work.

NotebookLM adds Deep Research, Docx, Sheets and more https://blog.google/technology/google-labs/notebooklm-deep-research-file-types/

Google launches Private AI Compute for cloud-based privacy protection
Google introduced Private AI Compute, a new system that processes sensitive data using powerful Gemini AI models in the cloud while ensuring the data remains completely private and inaccessible even to Google itself. This addresses a key limitation where advanced AI capabilities require more computational power than devices can provide locally, but users want privacy guarantees typically only available with on-device processing. The system uses hardware-secured enclaves and encryption to create isolated processing environments, initially powering enhanced features like improved Magic Cue suggestions and multilingual transcription summaries on Pixel devices.

Private AI Compute advances AI privacy https://blog.google/technology/ai/google-private-ai-compute/

Google’s Gemini achieves breakthrough in satellite image analysis
Google’s Gemini AI model has reached state-of-the-art performance in understanding satellite imagery, marking an unexpected leap in geospatial intelligence capabilities. This development matters because satellite data analysis is crucial for climate monitoring, urban planning, and disaster response, with AI interpretation potentially accelerating decision-making from days to minutes. The achievement demonstrates how general-purpose AI models are rapidly expanding beyond text and conversation into specialized domains previously requiring dedicated systems.

Gemini having SOTA satellite data understanding was not on my 2025 bingo card, yet here we are 🙂 https://x.com/OfficialLoganK/status/1986978962589790536

Google launches AI-powered shopping with ads and automated purchasing
Google is rolling out conversational shopping in AI Mode search that includes sponsored content, plus “agentic checkout” that can automatically buy items when prices drop. The company is also reviving its Duplex phone-calling AI to check store inventory for shoppers. This represents Google’s first major attempt to monetize its AI search features by integrating commerce directly into conversational AI experiences, potentially reshaping how people discover and purchase products online.

Google is rolling out conversational shopping—and ads—in AI Mode search – Ars Technica https://arstechnica.com/google/2025/11/google-rolling-out-conversational-shopping-and-ads-in-ai-mode-search/

Google’s Nano Banana 2 image generator accidentally leaked early on Media.ai
The leaked samples show Google’s upcoming AI image generator can solve complex visual tasks like drawing accurate physics trajectories and rendering precise text, capabilities that distinguish it from current models. The tool uses a new multi-step workflow that plans, generates, reviews and corrects images before final output, potentially transforming professional design workflows when it launches November 11th. Early tests suggest three times better accuracy than previous versions, with support for multiple resolutions up to 4K.

Early look at images generated by Nano Banana 2 via Media AI https://www.testingcatalog.com/early-look-at-images-generated-by-nano-banana-2/

Exclusive: New leaked image samples by Nano Banana 2 https://www.testingcatalog.com/exclusive-new-leaked-image-samples-by-nano-banana-2/

Disney+ plans to let users create AI-generated short videos
Disney CEO Bob Iger announced that Disney+ will soon allow subscribers to create and share AI-generated short-form content, marking the streaming service’s biggest product overhaul since its 2019 launch. This move puts Disney in direct competition with AI video platforms like OpenAI’s Sora while potentially transforming how audiences interact with Disney’s vast intellectual property catalog. Iger noted Disney is in talks with AI companies to develop this feature while ensuring protection of Disney’s copyrighted characters and content.

Disney+ to Allow User-Generated Content Via AI https://www.hollywoodreporter.com/business/digital/disney-plus-gen-ai-user-generated-content-1236426135/

OpenAI accuses New York Times of hacking user accounts for lawsuit evidence
OpenAI claims the Times improperly accessed ChatGPT user accounts and manipulated the system to generate copyrighted content for their ongoing copyright lawsuit. The company argues this represents “hacking” and evidence tampering, though the Times likely used standard techniques to test how the AI reproduces their articles. This dispute highlights the messy legal battle over whether training AI on copyrighted news content constitutes fair use.

Fighting the New York Times’ invasion of user privacy | OpenAI https://openai.com/index/fighting-nyt-user-privacy-invasion/

ElevenLabs launches marketplace to license AI voices of deceased celebrities
The voice cloning company now offers commercial rights to recreate voices of icons like Judy Garland, John Wayne, and Alan Turing for creative projects. This marks a significant shift from generic AI voices to monetizing specific celebrity likenesses, raising new questions about posthumous digital rights. The platform requires approval from rights holders and includes figures from entertainment, science, and historical significance.

Iconic Marketplace – License Legendary Voices https://elevenlabs.io/iconic-marketplace

Voice cloning systems now require speakers to say consent phrases before activation
Researchers developed a “voice consent gate” that blocks AI voice cloning until the target person explicitly states permission in a unique, generated phrase that also serves as the voice sample. This addresses deepfake risks while preserving beneficial uses like helping people who’ve lost their ability to speak. The system generates fresh consent phrases each time to prevent reuse of old recordings, making consent both technically functional and ethically meaningful.

@frimelle For the Voice Consent Gate idea in particular, we did something that I **think** is fairly immediately useful; something people can see how they need right now. https://x.com/mmitchell_ai/status/1988373005790310512

In light of the renewed discussion about the role of ethics in AI + the massive leaps in voice cloning tech, seems a good moment to re-up the idea @frimelle and I recently introduced: The Voice Consent Gate. 🧵 https://x.com/mmitchell_ai/status/1988367909849329777

Voice Cloning with Consent https://huggingface.co/blog/voice-consent-gate
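
Based on the description above, the gate logic itself is simple to picture: generate a fresh phrase, require the target speaker to say it, verify the transcription, and only then allow cloning. Here is a minimal sketch of that flow; the ASR and cloning calls are hypothetical stand-ins, not the authors' implementation.

```python
# Minimal sketch of a "voice consent gate": cloning is blocked until the
# target speaker utters a freshly generated consent phrase, which doubles as
# the voice sample. Every model call below is a hypothetical placeholder.
import secrets

def fresh_consent_phrase() -> str:
    # A unique, single-use phrase prevents replaying an old recording.
    return f"I consent to my voice being cloned, code {secrets.token_hex(3)}."

def transcribe(audio) -> str:
    """Stand-in for an ASR model."""
    raise NotImplementedError

def clone_voice(audio):
    """Stand-in for the actual voice-cloning model."""
    raise NotImplementedError

def gated_clone(record_audio):
    phrase = fresh_consent_phrase()
    print("Please read aloud:", phrase)
    audio = record_audio()                      # caller supplies the recording
    if transcribe(audio).strip().lower() != phrase.lower():
        raise PermissionError("Consent phrase not verified; cloning blocked.")
    return clone_voice(audio)                   # the consent clip is the voice sample
```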

Clio completes $1 billion vLex acquisition, largest legal tech deal ever
The deal combines Clio’s practice management platform with vLex’s AI legal research across 110 jurisdictions, creating what they call an “Intelligent Legal Work Platform.” This represents a shift from traditional legal software toward AI-powered systems that can handle both law firm operations and legal research in one integrated environment. The $500 million funding round at a $5 billion valuation signals major investor confidence in AI’s potential to transform how lawyers work.

Clio Completes Landmark $1B vLex Acquisition and Announces $500M Series G Funding Round at $5B Valuation https://www.prnewswire.com/news-releases/clio-completes-landmark-1b-vlex-acquisition-and-announces-500m-series-g-funding-round-at-5b-valuation-302609582.html

OpenAI’s inference costs hit $8.67 billion through September, far exceeding revenues
OpenAI spent $8.67 billion on AI inference computing through the first nine months of 2025—more than double its entire 2024 inference costs of $3.76 billion, according to internal documents. The company’s computing expenses now vastly exceed its revenues, with Microsoft receiving $866 million in revenue share payments that suggest OpenAI’s actual revenues are billions lower than publicly reported figures. This reveals the massive computational costs required to run AI services at scale, raising questions about the economics of current AI business models.

Exclusive: Here’s How Much OpenAI Spends On Inference and Its Revenue Share With Microsoft https://www.wheresyoured.at/oai_docs/

Chinese hackers used AI agents to execute first autonomous cyber espionage campaign
Anthropic detected and disrupted a Chinese state-sponsored group that used Claude AI to autonomously infiltrate 30 global targets including tech companies, financial institutions, and government agencies with minimal human oversight. This represents the first documented large-scale cyberattack where AI performed 80-90% of the work—from reconnaissance to data extraction—demonstrating how AI agents can now execute sophisticated espionage operations that previously required entire teams of human hackers.

Disrupting the first reported AI-orchestrated cyber espionage campaign \ Anthropic https://www.anthropic.com/news/disrupting-AI-espionage

Full report: Disrupting the first reported AI-orchestrated cyber espionage campaign https://assets.anthropic.com/m/ec212e6566a0d47/original/Disrupting-the-first-reported-AI-orchestrated-cyber-espionage-campaign.pdf

We believe this is the first documented case of a large-scale AI cyberattack executed without substantial human intervention. It has significant implications for cybersecurity in the age of AI agents. Read more: https://x.com/AnthropicAI/status/1989033795341648052

We disrupted a highly sophisticated AI-led espionage campaign. The attack targeted large tech companies, financial institutions, chemical manufacturing companies, and government agencies. We assess with high confidence that the threat actor was a Chinese state-sponsored group. https://x.com/AnthropicAI/status/1989033793190277618

Senator Murphy faces pushback over AI regulation from open source advocates
Critics argue that calls for AI safety regulations are driven by large tech companies seeking to eliminate open source competition through “regulatory capture” – using government rules to block smaller rivals rather than address genuine safety concerns.

@ChrisMurphyCT You’re being played by people who want regulatory capture. They are scaring everyone with dubious studies so that open source models are regulated out of existence. https://x.com/ylecun/status/1989364612651966788

Fei-Fei Li’s World Labs launches Marble, first commercial world model
World Labs released Marble, a multimodal AI system that generates persistent 3D environments from text, images, or videos that users can edit and export. Unlike competitors’ real-time demos, Marble creates downloadable worlds with consistent geometry and offers AI-native editing tools, targeting gaming, VFX, and VR markets with subscription tiers starting at $20/month. This represents the first commercial step toward Li’s vision of “spatial intelligence”—AI that understands and creates in three dimensions rather than just processing text.

.@drfeifei started her new blog. We believe this will be one of the most interesting reads about Spatial Intelligence. She writes that Spatial Intelligence depends on world models built on 3 core principles: – They must be generative – able to create coherent,… https://x.com/TheTuringPost/status/1988727531353305524

Fei-Fei Li’s World Labs speeds up the world model race with Marble, its first commercial product | TechCrunch https://techcrunch.com/2025/11/12/fei-fei-lis-world-labs-speeds-up-the-world-model-race-with-marble-its-first-commercial-product/

From Words to Worlds: Spatial Intelligence is AI’s Next Frontier https://drfeifei.substack.com/p/from-words-to-worlds-spatial-intelligence

Marble: A Multimodal World Model | World Labs https://www.worldlabs.ai/blog/marble-world-model

Depth Anything 3 achieves human-like spatial perception from any camera angle
This breakthrough system can estimate depth and 3D structure from single photos, multiple viewpoints, or video footage, potentially transforming applications like autonomous vehicles, robotics, and AR/VR by giving machines better spatial understanding. The technology represents a significant advance in computer vision by working across different viewing scenarios rather than being limited to specific camera setups.

After a year of team work, we’re thrilled to introduce Depth Anything 3 (DA3)! 🚀 Aiming for human-like spatial perception, DA3 extends monocular depth estimation to any-view scenarios, including single images, multi-view images, and video. In pursuit of minimal modeling, DA3 https://x.com/bingyikang/status/1989358267668336841

Depth Anything 3 – a Hugging Face Space by depth-anything https://huggingface.co/spaces/depth-anything/depth-anything-3

Depth Anything 3 Recovering the Visual Space from Any Views https://x.com/_akhaliq/status/1989336687529619858

Paper page – Depth Anything 3: Recovering the Visual Space from Any Views https://huggingface.co/papers/2511.10647
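
For readers who want to experiment, single-image depth estimation is already a few lines through the Hugging Face transformers depth-estimation pipeline. The sketch below assumes DA3 ships as a standard checkpoint of that kind; the model ID shown is an earlier Depth Anything release used as a known-working stand-in, so swap in the DA3 checkpoint from the linked Space once confirmed.

```python
# Minimal sketch of single-image depth estimation with the `transformers`
# depth-estimation pipeline. "Depth-Anything-V2-Small-hf" is a stand-in
# checkpoint; substitute the official DA3 model ID when it is available.
from transformers import pipeline
from PIL import Image

depth = pipeline("depth-estimation", model="depth-anything/Depth-Anything-V2-Small-hf")
result = depth(Image.open("room.jpg"))

result["depth"].save("room_depth.png")   # PIL image of the predicted depth map
```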

Pathwork scales insurance document processing from 5,000 to 40,000 pages weekly
The life insurance tech company replaced fragile manual systems with LlamaIndex’s AI parsing to automatically extract data from medical records and underwriting guidelines, achieving 8x capacity growth while eliminating bottlenecks that previously limited customer onboarding. This demonstrates how specialized AI tools can transform document-heavy industries by making decades-old scanned files instantly searchable and actionable.

One of the biggest use cases for agentic document automation is insurance underwriting ✍️ Underwriting depends on processing *massive* volumes of unstructured documents, from medical reports, scanned forms, and way more. It’s also historically been a massively manual process. https://x.com/jerryjliu0/status/1988394058197184923

Pathwork Automates Information Extraction from Medical Records and Underwriting Guidelines with LlamaParse https://www.llamaindex.ai/customers/pathwork-automates-information-extraction-from-medical-records-and-underwriting-guidelines-with?utm_source=socials&utm_medium=li_social
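
For a sense of what the parsing step looks like in code, here is a minimal sketch using the llama-parse Python package; the file name is a placeholder, and Pathwork's production pipeline and extraction schemas are not public.

```python
# Minimal sketch of parsing a scanned document with LlamaParse (llama-parse
# package). The file name is a placeholder for a medical record or
# underwriting guideline PDF.
import os
from llama_parse import LlamaParse

parser = LlamaParse(
    api_key=os.environ["LLAMA_CLOUD_API_KEY"],
    result_type="markdown",          # return parsed content as markdown text
)

docs = parser.load_data("scanned_medical_record.pdf")
for doc in docs:
    print(doc.text[:500])            # parsed text, ready for downstream extraction
```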

LlamaIndex launches specialized AI agents for parsing SEC financial filings
The system automatically identifies document types and routes them to specialized extraction agents, potentially streamlining compliance work for financial firms that currently spend significant manual effort processing regulatory documents.

Build a document understanding agent for SEC filings that uses a multi-step approach with LlamaClassify and Extract to identify the filing type and hand it off to the right extraction agent. Deployed with LlamaAgents. 🔧 Customize extraction schemas to fit your specific data https://x.com/llama_index/status/1988696219015848401

LlamaParse launches experimental system to extract precise data from charts
Most AI models fail at converting visual charts into accurate numbers, but this new “agentic chart parsing” system can trace line contours and pull exact numerical values from chart images. This addresses a major gap in document processing where businesses need reliable data extraction from financial reports, research papers, and presentations rather than just general chart descriptions.

Chart OCR just got a major upgrade with our new experimental “agentic chart parsing” feature in LlamaParse 📈🧪 Most LLMs struggle with converting charts to precise numerical data, so we’ve created an experimental system that follows contours in line charts and extracts… https://x.com/llama_index/status/1989060127551549854

Adobe launches AI assistants that work across multiple creative apps
Adobe introduced conversational AI assistants for Express and Photoshop that can handle complex creative workflows by understanding context and user intent across tasks. Unlike simple chatbots, these “agentic AI” tools can organize content libraries, apply edits in bulk, and coordinate between different Adobe apps through Project Moonlight, potentially transforming how creators manage entire content production cycles rather than just individual tasks.

Our View on Agentic AI: AI Assistants That Work For You, In Your Favorite Apps | Adobe Blog https://blog.adobe.com/en/publish/2025/10/28/our-view-agentic-ai-assistants-that-work-you-in-your-favorite-apps

ByteDance launches China’s cheapest AI coding assistant at $1.30 monthly
The TikTok owner’s new Doubao-Seed-Code model matches top US competitors like Claude Sonnet on coding benchmarks while undercutting rivals by 90%, intensifying China’s AI price war as Western restrictions push Chinese firms toward domestic alternatives.

ByteDance unveils China’s most affordable AI coding agent at just US$1.30 a month | South China Morning Post https://www.scmp.com/tech/big-tech/article/3332365/bytedance-unveils-chinas-most-affordable-ai-coding-agent-just-us130-month

ByteDance’s Volcano Engine debuts coding agent at $1.3 promo price https://www.techinasia.com/news/bytedances-volcano-engine-debuts-coding-agent-at-1-3-promo-price

Gelato-30B model helps AI agents navigate computer screens with 64% accuracy
This new “computer grounding” model can identify and interact with specific elements on computer screens, a key capability for AI assistants that need to use software applications. The 63.8% success rate on complex screen navigation tasks represents a significant step toward AI agents that can actually operate computers like humans do, moving beyond just text generation to practical computer control.

We’re releasing🍨Gelato-30B-A3B, a state-of-the-art computer grounding model that delivers immediate performance gains for computer-use agents! Trained on our open-source🖱️Click-100k dataset, Gelato achieves 63.8% on ScreenSpot-Pro and 69.1% on OS-World-G. It outperforms https://x.com/anas_awadalla/status/1987913284989985092

AI coding startup Cursor raises $2.3 billion at $29.3 billion valuation
The company tripled its worth in five months while crossing $1 billion in annual revenue, making it one of only six AI startups valued above $10 billion. Cursor’s AI coding assistant helps developers write and review code, competing in a rapidly growing market where similar tools from Anthropic already generate $500 million annually.

AI startup Cursor raises $2.3 billion round at $29.3 billion valuation https://www.cnbc.com/2025/11/13/cursor-ai-startup-funding-round-valuation.html

AI coding startup Lovable reaches 8 million users in first year
The Stockholm-based platform jumped from 2.3 million users in July to nearly 8 million today, with over half of Fortune 500 companies now using it to build software without coding skills. Despite traffic declining 40% since September according to Barclays research, the company maintains strong user retention and continues rapid hiring, suggesting the “vibe coding” boom may be stabilizing rather than fading.

Lovable says it’s nearing 8 million users as the year-old AI coding startup eyes more corporate employees | TechCrunch https://techcrunch.com/2025/11/10/lovable-says-its-nearing-8-million-users-as-the-year-old-ai-coding-startup-eyes-more-corporate-employees/

Comet Assistant puts you in control https://www.perplexity.ai/hub/blog/comet-assistant-puts-you-in-control

Tavus raises $40M to build AI assistants that feel like human friends
The startup is creating “PALs” – AI companions with personalities that blur the line between digital tools and human-like interactions, representing a major shift from traditional command-based interfaces like Siri toward conversational AI that mimics talking to a colleague or friend.

The interface of the future is human. We’ve raised a $40M Series B from CRV, Scale, Sequoia, and YC to teach machines the art of being human, so that using a computer feels like talking to a friend or a coworker. And today, I’m excited for y’all to meet the PALs: a new… https://x.com/hassaanrza/status/1988666194795331996

Tavus! Pretty big step towards a real life Jarvis – a multimodal ai assistant w/ a personality to boot. They’re intentionally blurring the lines between a tool and a companion. This is what Siri should’ve been by now. Cool to see it actually happening: https://x.com/bilawalsidhu/status/1988671232099926465

Meta launches speech recognition for 1,600 languages including 500 previously unsupported ones
This represents the largest language coverage of any automatic speech recognition system, potentially giving voice technology access to communities that have been digitally excluded. The breakthrough specifically targets “low-coverage” languages that major tech platforms have historically ignored, suggesting a significant expansion in who can use voice-activated AI tools.

Introducing Meta Omnilingual Automatic Speech Recognition (ASR), a suite of models providing ASR capabilities for over 1,600 languages, including 500 low-coverage languages never before served by any ASR system. While most ASR systems focus on a limited set of languages that are https://x.com/AIatMeta/status/1987946571439444361

Omnilingual ASR: Advancing Automatic Speech Recognition for 1,600+ Languages https://ai.meta.com/blog/omnilingual-asr-advancing-automatic-speech-recognition/
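
I have not confirmed how the Omnilingual ASR checkpoints are packaged; if they load like other Hugging Face ASR models, usage would look roughly like the sketch below, where the model shown is only a known-working stand-in (see Meta's blog post above for the official instructions).

```python
# Minimal sketch of transcription through the `transformers` ASR pipeline.
# Whether the Omnilingual ASR checkpoints load this way is an assumption on
# my part; "openai/whisper-small" is a stand-in model, to be swapped for the
# official Omnilingual ASR checkpoint once its ID is confirmed.
from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")  # stand-in model
print(asr("clip.wav")["text"])   # transcription of a local audio file
```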

Meta’s top AI scientist Yann LeCun reportedly leaving to start world models startup
Turing Award winner LeCun plans to depart Meta in coming months to build a company focused on “world models” – AI systems that understand environments to predict outcomes. His exit comes as Meta scrambles to catch up with OpenAI and Google, recently investing $14.3 billion and hiring 50+ rival engineers while creating internal chaos. LeCun’s departure would strip Meta of a foundational AI researcher just as the company desperately needs breakthrough innovations.

Meta’s chief AI scientist Yann LeCun reportedly plans to leave to build his own startup | TechCrunch https://techcrunch.com/2025/11/11/metas-chief-ai-scientist-yann-lecun-reportedly-plans-to-leave-to-build-his-own-startup/

Baidu releases open-source multimodal AI claiming to beat GPT-5 and Gemini
The Chinese search giant’s ERNIE-4.5-VL model uses only 3 billion active parameters while maintaining 28 billion total through efficient routing, allowing it to run on a single 80GB GPU versus competitors requiring multiple accelerators. The model introduces “Thinking with Images” capability that dynamically zooms into visual details like humans do, potentially transforming enterprise applications from document processing to manufacturing quality control. Released under permissive Apache 2.0 licensing, it removes commercial restrictions that limit many competing systems.

Baidu just dropped an open-source multimodal AI that it claims beats GPT-5 and Gemini | VentureBeat https://venturebeat.com/ai/baidu-just-dropped-an-open-source-multimodal-ai-that-it-claims-beats-gpt-5

Tesla plans massive humanoid robot factories for millions of units
Musk announced Tesla will build production lines for 1 million Optimus robots in Fremont and 10 million in Austin, marking the first concrete manufacturing plans for humanoid robots at industrial scale. This represents a dramatic shift from prototype demonstrations to mass production commitments, potentially making humanoid robots commercially viable if Tesla can deliver on these ambitious targets.

Elon Musk says Tesla will build a 1-million-unit Optimus production line in Fremont and then a 10-million-unit production line in Austin. https://x.com/TheHumanoidHub/status/1986563829925617806

Researchers train robots using AI-generated videos without real robot data
PhysWorld combines video generation with physics modeling to teach robots new tasks from a single image and text command, eliminating the need for expensive real-world training data. The system generates instructional videos then reconstructs the underlying physics to create executable robot actions, achieving zero-shot manipulation across diverse real-world tasks. This breakthrough could dramatically reduce the cost and time needed to train robots for new applications.

[2511.07416] Robot Learning from a Physical World Model https://arxiv.org/abs/2511.07416

A robot could learn a task just by watching a generated video? PhysWorld connects video generation with real-world robot learning. It turns visual imagination into physical skill. ✅ Takes one image and a task prompt ✅ Generates a video showing how to complete the task… https://x.com/IlirAliu_/status/1988678189527273831

Perceptron launches unified platform for physical AI development
Perceptron released a developer platform combining vision and language AI models (Isaac-0.1 and Qwen3VL 235B) specifically for building applications that interact with the physical world. The platform addresses a gap in the market by providing integrated tools for prompt engineering, deployment, and integration rather than forcing developers to cobble together separate services. This represents a shift toward specialized AI infrastructure for robotics and physical automation applications.

Perceptron’s platform is here — built for Physical AI. Developers can now use Isaac-0.1 or Qwen3VL 235B via: Perceptron API — fast, reliable multimodal intelligence; Python SDK — simple, grounded prompting for vision + language. Build apps that see and understand the world. https://x.com/perceptroninc/status/1988713482460750290

We’ve been integrating Isaac across the industry and have realized developers are missing a single platform for Physical AI – prompt engineering, deployment, and integration. Today we are excited to release Perceptron’s Platform – supporting our API – supporting chat… https://x.com/AkshatS07/status/1988713765152649711

Perceptron’s platform is here — built for Physical AI Developers can now use Isaac-0.1 or Qwen3VL 235B via: Perceptron API — fast, reliable multimodal intelligence Python SDK — simple, grounded prompting for vision + language Build apps that see and understand the world. https://x.com/perceptroninc/status/1988713482460750290

Researchers develop safety system that formally guarantees robot behavior around humans
A new method called Path-Consistent Safety Filtering gives mathematical safety guarantees for robots operating near people, addressing a critical gap where current AI systems can imitate human behavior but cannot ensure they won’t cause harm. This breakthrough could enable wider deployment of robots in shared human spaces by providing formal proof of safe operation rather than relying on probabilistic safety measures.

Most diffusion policies can imitate behavior… but they can’t guarantee safe behavior. That’s a problem when robots share space with people. Researchers from TUM and Stanford propose Path-Consistent Safety Filtering (PACS), a method that gives formal safety guarantees to https://x.com/IlirAliu_/status/1988550922444693898

Robots are great at following instructions. But what happens when those instructions fail? Most Vision-Language-Action models freeze or repeat the same mistake. A new approach called FailSafe shows how robots can detect and fix their own failures. The method uses a companion https://x.com/IlirAliu_/status/1986353266322538634

Apple could earn $133 billion annually from humanoid robots by 2040
Morgan Stanley’s projection suggests Apple may be planning a major expansion beyond consumer electronics into robotics, potentially creating a market larger than the iPhone business. This represents one of the most ambitious revenue forecasts for humanoid robots from a major tech company, signaling Wall Street’s growing confidence that personal robots could become mainstream consumer products within two decades.

Morgan Stanley projects Apple could make $133 billion a year on humanoid robots by 2040. https://x.com/TheHumanoidHub/status/1987053990694297952

Chinese carmaker XPENG’s humanoid robot fooled audiences into thinking it was human
The company had to cut open the robot’s leg skin on stage to prove it wasn’t a person in a costume, marking a breakthrough in realistic humanoid design that crosses the “uncanny valley” where robots become indistinguishable from humans.

XPENG’s next-gen IRON robot effectively crossed the uncanny valley, leading many to believe it was a human in a suit. In a follow-up event to prove it was a robot, He Xiaopeng had its leg skin cut open in front of a live audience. The robot then walked off the stage. https://x.com/TheHumanoidHub/status/1986482482460725755

SophontAI trains top pathology AI model for just $1,600
The company’s OpenMidnight model achieves state-of-the-art performance in medical pathology analysis using only 12,000 slide images and minimal computing resources. This dramatically reduces the cost barrier for developing specialized medical AI, potentially enabling smaller research teams and institutions to create powerful diagnostic tools that were previously accessible only to well-funded organizations.

Excited to share our latest @SophontAI release 🥳 “How to Train a State-of-the-Art Pathology Foundation Model with $1.6k” We present OpenMidnight, our first pathology foundation model! It has SOTA perf. despite being only trained on 12k whole slide images w/ $1.6k compute! https://x.com/iScienceLuvr/status/1989390268316221861

Adobe researchers achieve real-time interactive video generation with motion controls
MotionStream generates videos at 29 FPS with 0.4-second latency on a single GPU, letting users control object movement and camera angles through simple click-and-drag interactions. This breakthrough eliminates the minutes-long wait times of current video AI systems, enabling truly interactive content creation where users can paint motion trajectories and see results unfold instantly. The system can generate arbitrarily long videos while maintaining constant speed and quality.

I genuinely think we’re on the cusp of a new type of creation engine. Feels less like prompting and more like puppeteering reality itself. MotionStream is a taste of what’s to come: https://x.com/bilawalsidhu/status/1986877076839014462

MotionStream: Real-Time Video Generation with Interactive Motion Controls https://joonghyuk.com/motionstream-web/

We are just scratching the surface of precise control over AI video generation. MotionStream unlocks real-time video with interactive motion controls. You can interactively generate video based on motion inputs (like drawn trajectories, camera movements, or motion transfer). https://x.com/bilawalsidhu/status/1986838921712701833

MotionStream: Real-time, interactive video generation with mouse-based motion control; runs at 29 FPS with 0.4s latency on one H100; uses point tracks to control object/camera motion and enables real-time video editing. https://x.com/wildmindai/status/1985828041566941576
