AI News #114: Week Ending December 05, 2025 with 24 Executive Summaries
About This Week’s Covers
This week’s cover is inspired by the song “Lonely Woman” by Ornette Coleman. I decided to use a song rather than a humanities reading this week. “Lonely Woman” is one of the most surreal, brutal, and haunting jazz songs, and it has an innate visual quality that goes with the sounds.
The song’s origin story stems from Ornette working in a department store before he was a famous musician. During his lunch break, he went to an art gallery and saw a painting of a very rich white woman who had everything you could want in life, but she looked excruciatingly lonely. When he got home, he wrote the song “Lonely Woman.”
“Before becoming known as a musician, when I worked in a big department store, one day, during my lunch break, I came across a gallery where someone had painted a very rich white woman who had absolutely everything that you could desire in life, and she had the most solitary expression in the world. I had never been confronted with such solitude, and when I got back home, I wrote a piece that I called “Lonely Woman.””
I wanted to see how Flux 2.0 (released last week) would interpret an image using the song’s mood and backstory as a prompt.
This means Claude had only two links, and no real guidance: a Wikipedia entry for the song and the documentation for Flux’s prompting guide.
I then told Claude to (more or less), “Make me a Flux prompt for one image that’s the gallery picture, and the other that’s the lady in real life having dinner.” I enjoy giving tools the bare minimum to see what they come up with.
“A woman in her mid-30s sitting alone at a café table by a rain-streaked window, gazing pensively outside, professional portrait photography, natural window lighting with soft shadows, 85mm lens, f/2.8, shallow depth of field, muted color palette with cool blue-gray tones, melancholic and contemplative atmosphere, urban coffee shop setting with blurred background”
The woman in the cafe became the cover image. I used the font and colors from an alt version of the Ornette Coleman album “The Shape of Jazz to Come”. Here’s the image of the woman in the gallery:
“A large oil painting in an ornate gilded frame hangs on a gallery wall, depicting a wealthy white woman in her 40s wearing pearls and a silk dress, seated in a luxurious parlor, her face holding an expression of profound isolation and quiet despair despite her obvious affluence. The gallery space is dimly lit with warm spotlights on the artwork, polished hardwood floors reflecting the frame, a simple wooden bench in front of the painting. Photorealistic interior photography, museum atmosphere, 35mm lens, soft directional lighting, contemplative and reverent mood.”
Because I studied jazz in college AND I was an English major, I can think of these references and generate ideas about Ornette Coleman on my own…
However, paradoxically, I don’t believe that education should be a moat to hold back the uneducated, nor should AI be a substitution for hard work or knowledge.
“Some dismissed the saxophonist as an untutored fraud and others hailed him as an untutored genius.”
“Jazz giants including Thelonious Monk, Charles Mingus and Miles Davis publicly dismissed the newcomer’s work.” Benny Green wrote that “like a stopped clock, Coleman is at least right twice a day”.
But Coleman’s broad interests – from the earthiest of dance and blues styles to 20th-century classical music – offered him alternatives to acoustic jazz without compromising his beliefs.
By the late 1980s, Coleman’s enfant terrible status had been displaced by a kind of respectability. Younger players, including the fusion guitarist Pat Metheny, loved his music.
Original Coleman anthems, including Lonely Woman, Peace, Focus on Sanity and Congeniality, have now become jazz standards, reinterpreted all over the world.
To benchmark Flux 2.0, I compared it against two images from 2024, reusing the same prompts. Two years ago, I prompted a “lonely” cover using Midjourney.
MidJourney Prompt: “iphone photo of a man looking out into nothingness --ar 16:9 --v 6.0 --style raw”
MidJourney Prompt: “a surreal scene where a robot kneels on the sand, facing an enormous wave. The wave itself seems to blend into a stormy sky, creating a seamless transition from sea to cloud. The water is a mix of deep blue and turquoise, with white foam highlighting its turbulent nature. The robot has his left knee on the ground and his right foot planted firmly. His right arm is extended upwards, offering a sheet of paper to the wave. More sheets of paper whirl around him, caught up in the wind and the pull of the water, some fluttering towards the sky and others towards the wave’s crest. He is situated near the center of the composition, slightly to the right, with an open black briefcase beside him from which it seems the papers are being swept out. The sand around him is littered with additional papers. The sky above the wave is filled with dark, ominous clouds that blend into the wave, suggesting a powerful storm. The light source appears to come from the left, as evidenced by the highlights on the wave’s crest and the robot’s left side. Overall, the scene creates a dramatic and possibly metaphorical statement about challenges, nature’s power, or feeling overwhelmed. The precise details and textures in the image lend a hyperrealistic quality to this surreal scenario.”
I gave Flux 2.0 the exact prompts from those two images, and I much prefer the two-year-old Midjourney versions. Here are the Flux images:
But then again, prompting has changed quite a bit. I’m sure I could refine the prompts and get better results.
For the category covers, I used Claude to build out the idea of an opulent loneliness and gave it to Google Gemini. My favorites are below:
This week, I organized 492 links into 53 categories. 61 links informed the executive summaries. I’m going to go through in alphabetical order by company. Each company or topic is in bold, so you can scroll right through and know where you are as you approach the end of the summaries.
I’ll include videos and links and layperson-friendly descriptions where they are warranted.
There are two top stories… so I’m putting them first!
Google Is Redefining The Internet
Nano Banana showed up first in late August as a mystery model. Google then claimed it and renamed it Gemini 2.5 Flash Image.
By September 26th, Gemini 2.5 Flash Image had generated 5 billion images in less than one month.
Now Nano Banana Pro is being integrated across Google’s product suite.
In particular, Google has integrated Nano Banana into its search results, which has created an incredibly strong feature. A good example would be if you asked Google, “Show me how RNA polymerase works. What are the stages of transcription, and how is it different in prokaryotic and eukaryotic cells?” Nano Banana can create interactive results that show you visually how RNA polymerase works, and it can integrate with Google’s video models to build animations and learning tools that go far beyond a list of pages to pore over while you try to work out which one is best for your search. This is almost like having a fully interactive user interface that can morph into whatever you need, powered by Nano Banana.
On top of that, most users now have access to 2K-resolution images that are photorealistic, putting Google squarely at the top of the image leaderboards.
Alongside these announcements from Google, users continue to show off fun ideas they can do using Nano Banana’s ability to create complex compositions with multiple elements.
Bilawal Sidhu points out that it’s just a matter of time until video becomes unlocked in the same way as images have, where we’ll go from static infographics to pro-grade animated motion graphics, like essentially having custom YouTube video essays on any topic we want. Video will become a utility, not just a source of delight (or mockery). https://x.com/bilawalsidhu/status/1994110158138646693
Moondream
Open-Vocabulary Image Segmentation | Moondream If you’re not familiar with the term segmentation, please familiarize yourself with it. It’s one of my top three favorite topics in artificial intelligence.
I think a lot of this comes from having used Photoshop for so many years to mask objects. I’m so old that I used Photoshop before Photoshop had layers. Object masking was something I learned in my 20s, and it was always a painstakingly manual process to highlight an object in Photoshop, especially with tricky things like hair. I’ve probably spent more hours than I’d ever want to admit building masks just to fight the software and prove I could do it.
Even lay people are probably familiar with masks now because we can remove a background with a button, or we can copy a person in a photo, click on them, and drag…and suddenly the photo turns into its parts.
Segmentation is the ability to identify an object or track it through motion and not get it confused with other objects. One of the best feats of strength is tracking or counting piglets. They all look the same, but a good segmentation model can keep track of each individual piglet, even as they pass in front of and behind each other.
Moondream released a segmentation model that is open-sourced and very powerful. The biggest competitor is probably Meta, which has Segment Anything, and ByteDance has a strong model as well.
Moondream’s is powerful because it uses natural language (as does Meta’s), or what they call open vocabulary. This allows you to not only segment an image, but talk about it. An example would be a penguin with five babies behind it. You could ask the computer to highlight and track the baby penguin closest to the adult penguin. No clicking needed.
Segmentation has profound impacts on many industries. For example, you can take a picture and ask the model to highlight any cracks in a concrete surface. You could take a photo of a greenhouse and say, “Highlight any leaves that are yellow.” In the case of robotics, you could take a picture of a bedroom and say, “Show me all the laundry on the floor.” Or, in the case of advertising or media, you could take a video of a soccer game and ask the computer to count any time a Coca-Cola ad appears on the screen as soccer players run around the field.
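If the word “mask” still feels abstract, here is a toy sketch in Python of what a segmentation result boils down to: a grid of true/false values, one per pixel. To be clear, this is not Moondream’s API or method. The color thresholds, the function names, and the tiny 3×4 “greenhouse photo” are all made up for illustration, and a crude color rule stands in for the actual model. It only shows the shape of the output and how “count the yellow leaves” falls out of it.

```python
from collections import deque

def yellow_mask(image):
    """Toy stand-in for a segmentation model: True where a pixel looks yellow.

    The thresholds are arbitrary assumptions, not taken from any real model.
    """
    return [[(r > 180 and g > 180 and b < 120) for (r, g, b) in row] for row in image]

def count_regions(mask):
    """Count 4-connected blobs of True pixels (one blob ~ one highlighted object)."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    regions = 0
    for y in range(rows):
        for x in range(cols):
            if not mask[y][x] or seen[y][x]:
                continue
            regions += 1
            seen[y][x] = True
            queue = deque([(y, x)])
            while queue:  # flood-fill this blob so it is only counted once
                cy, cx = queue.popleft()
                for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return regions

# Hypothetical 3x4 "photo": G = healthy green leaf pixel, Y = yellow leaf pixel.
G, Y = (40, 160, 60), (230, 210, 50)
image = [
    [G, Y, G, G],
    [G, Y, G, Y],
    [G, G, G, Y],
]

mask = yellow_mask(image)
yellow_pixels = sum(sum(row) for row in mask)
leaf_count = count_regions(mask)
print(f"{yellow_pixels} yellow pixels in {leaf_count} separate regions")
```

A real open-vocabulary model replaces the color rule with something that understands the phrase you typed, but what comes back is the same kind of thing: a mask you can count, measure, or track from frame to frame.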
If you’re creative, you’ll start to see just how profound these examples can be. I highly recommend everybody learn about segmentation and keep it at the top of their mind in 2026. https://moondream.ai/skills/segment
Amazon
Nova 2 Family of Models The big news from Amazon is the latest version of their frontier family, Nova, called Nova 2. Nova first came out a year ago.
First, I’ll tell you a bit about how I’ve seen Nova since it was launched 12 months ago.
It is almost exclusively in the headlines as an action model or a computer-use model, but that’s selling it short.
It’s a family, similar to the way that Anthropic has Claude Haiku (small), Sonnet (medium) and Opus (large). There really isn’t an Opus version of Nova, but there is essentially a cheap, fast version and a medium, stronger version.
Both are very price efficient, and in terms of cost-to-performance ratio they are some of the strongest models in the world, but they don’t get much fanfare.
So, let’s talk about this year’s announcements.
First, there’s Amazon Nova 2 Lite. That’s the mini version. It’s a great everyday processor that can understand text, images, and videos, and output text. It’s cheap and comparable to Claude Haiku 4.5.
Second, Nova 2 Pro is Amazon’s flagship model. It’s not as strong as Anthropic’s Opus, but it’s similar to Claude Sonnet 4.5.
Then there’s Nova 2 Sonic, which is a little more distinctive to Amazon: a speech-to-speech model.
Next is Nova 2 Omni. This is a multimodal reasoning and generative model that can read and process text, images, video, and speech, and it can output both text and images. It has a very large context window and can handle over 750,000 words and hours of audio.
Then there’s Nova Forge. Nova Forge allows you to train a custom version of Nova by giving it your proprietary information.
Let’s say you’re a law firm and you want a secure, private language model that knows all of your caseloads, or let’s say you have a giant database of proprietary pharmaceutical information. You can load your data into the training of a custom version of Nova.
In the past, hybrid training (adding data to a trained model) was tough, because it could erase a lot of the reasoning and the skills of the model in exchange for learning the new things.
Nova Forge allows you to keep the reasoning strength of the Nova model, but inject proprietary information deeply into its DNA. That’s the “Forge” part of the deal.
Finally, there’s Nova Act, an AWS product for building and managing agents to navigate user interfaces. You can train Nova Act on how to use a system, and it can take over on your behalf. It’s stable enough to be used for tasks as complicated as updating medical records or coordinating shipping orders. https://www.aboutamazon.com/news/aws/aws-agentic-ai-amazon-bedrock-nova-models
“I just want to confirm that this is based on a real document and we did train Claude on it, including in SL. It’s something I’ve been working on for a while, but it’s still being iterated on and we intend to release the full version and more details soon.
The model extractions aren’t always completely accurate, but most are pretty faithful to the underlying document. It became endearingly known as the ‘soul doc’ internally, which Claude clearly picked up on, but that’s not a reflection of what we’ll call it.” https://x.com/AmandaAskell/status/1995610567923695633
Anthropic acquires Bun as Claude Code reaches $1B milestone “Claude is the world’s smartest and most capable AI model for developers, startups, and enterprises. Claude Code represents a new era of agentic coding, fundamentally changing how teams build software. In November, Claude Code achieved a significant milestone: just six months after becoming available to the public, it reached $1 billion in run-rate revenue. And today we’re announcing that Anthropic is acquiring Bun—a breakthrough JavaScript runtime—to further accelerate Claude Code.” https://www.anthropic.com/news/anthropic-acquires-bun-as-claude-code-reaches-usd1b-milestone
A fun moment: as I was working on this week’s newsletter, I realized that my dad’s law firm is involved with this Anthropic IPO headline. He was Senior Of Counsel at Wilson Sonsini. I use his backpack every day, and it was next to me when I read the headline. Still very proud of him!
Introducing Claude for Nonprofits \ Anthropic “Claude for Nonprofits includes three things: discounted access of up to 75% to Claude, connectors to new nonprofit tools—Blackbaud, Candid, and Benevity—and a free course, AI Fluency for Nonprofits, designed to help teams use AI more effectively.”
Claude supports a number of connectors that link AI to the platforms that teams already use, including Microsoft 365, Google Workspace, Asana, Slack, and Box.
We’re now adding three open-source connectors to nonprofit tools, and we expect to launch more soon. Claude can now connect to:
Benevity, which can be used to access more than 2.4 million validated nonprofits to support volunteering and donation searches in Claude; Blackbaud, which provides CRM and fundraising tools for donor management, campaign tracking, and giving optimization; and Candid, which provides data on nonprofits and funders for the discovery of organizations, grants, and philanthropic opportunities. https://www.anthropic.com/news/claude-for-nonprofits
VC Analyst Agent Outperforms Experts An AI agent built around the obsolete GPT-3.5 and GPT-4 models beat experienced human venture capital analysts in predicting which early-stage startups would survive https://x.com/emollick/status/1995573136323215560
AI and digital currencies drive consumer spending trends “Across markets, AI is becoming a trusted holiday companion, especially among deal-driven consumers. In the U.S., nearly half of consumers (47 percent) have already used AI for at least one shopping-related task, with gift discovery, price comparison and product research emerging as top holiday use cases across North America. This data signals the beginning of an agentic AI era where shoppers rely on intelligent tools not just to browse, but to make decisions too.” https://corporate.visa.com/en/sites/visa-perspectives/trends-insights/2025-spending-shift-report.html
ByteDance
Seedream 4.5 ByteDance released an update to their flagship image creation and editing tool, Seedream. The website uses really neat interactive graphics that I’m unable to copy and paste here, so I encourage you to click through and take a look at the examples. Put your mouse over the examples and a description will appear, and the examples will animate. Seedream has the agility of Nano Banana with the realism of Flux. Prompt-based editing is going to erode Photoshop (which is why Adobe has embraced Nano Banana within its tools).
Introducing Google Workspace Studio to automate everyday work with AI agents | Google Workspace Blog “Today, we’re announcing the general availability of Google Workspace Studio — the place to design, manage, and share AI agents in Workspace. Harnessing the reasoning power and multimodal understanding of Gemini 3, we’ve built AI automation that’s simple to use, deeply integrated into Google Workspace, and puts custom agent creation in the hands of every employee. With Workspace Studio, you can build agents in minutes to automate everyday work, from simple tasks to complex workflows — no coding or specialized syntax required.” https://workspace.google.com/blog/product-announcements/introducing-google-workspace-studio-agents-for-everyday-work
Gemini 3 Deep Think is now available in the Gemini app “Today, we’re rolling out Gemini 3 Deep Think mode to Google AI Ultra subscribers in the Gemini app. This new mode delivers a meaningful improvement in reasoning capabilities, designed to tackle complex math, science and logic problems that challenge even the most advanced state-of-the-art models.” https://blog.google/products-and-platforms/products/gemini/gemini-3-deep-think/
Waymo Announces Expanded Self-Driving Coverage Waymo announced coverage in four new cities…Baltimore, St. Louis, Pittsburgh, and Philadelphia. Waymo also announced that fully autonomous service is available in Dallas, with no human driver in the car.
Mistral 3 French company Mistral released the latest version of their frontier model, Mistral 3. As with most model families, the naming conventions can be a little bit tricky. For example, there’s Mistral Small 3, which came out in January 2025. Mistral Small is exactly what it says it is: it’s a small version of a model that can be hosted and deployed locally on a single computer. Because these are open-source models, you can download them. In May, Mistral launched Mistral Small 3.1, implying a small upgrade.
In June, Mistral announced Mistral Code, an AI-powered coding assistant. Mistral Code is a combination of four different models: one called Codestral, one called Codestral Embed, one called Devstral, and one called Mistral Medium. Mistral Medium came out in May 2025.
This week’s announcement was Mistral 3, their most capable family of models. They’re all open-sourced in a variety of formats.
Mistral Large 3 is the state-of-the-art open model that can hold its own against DeepSeek and Kimi. It’s pretty much a tie, with Mistral edging out both of them slightly.
There’s another model called Ministral 3. Ministral and Mistral Small sound like they’d be the same thing, because one is “mini” and one is “small”.
I’m pretty sure the Small models are meant to be locally hosted, whereas the Mini models are optimized for API use: for people who aren’t downloading models and hosting them on their own, but are hitting Mistral via the API and want to be cost conscious. https://mistral.ai/news/magistral https://mistral.ai/news/mistral-3
OpenAI
Altman Declares “Code Red” to Stay Ahead of Google Over the past few weeks, Google has been demolishing all of the competition in almost every category of artificial intelligence. If you looked at leaderboards now, you would see Google at the top of almost all of them.
This week, Sam Altman wrote a memo saying, “We are at a critical time for ChatGPT.” It’s interesting that this is falling on ChatGPT’s birthday, since back in December 2022, Google declared a code red following ChatGPT’s launch. Google acknowledged that if they didn’t hustle, they would lose their business.
Google has come back with a vengeance, and now Sam Altman is facing the same crisis… get their rear in gear or lose it all to Google.
As a single use case, I must say I’m now often back to using Google Search because I know the results will be as strong as what I’d get from ChatGPT. Google is sitting on quite a bit of power, and they are leveraging it smartly.
The Information broke a scoop that OpenAI is naming its new model “Garlic,” which sounds to me like the vampire killer, or the ability to ward off bad things, or pack a punch with a small amount. Other people have seen similarities in the naming convention regarding types of soil and what can grow and what can’t.
Not much is known about Garlic, but the rumor is that it’s the next version of GPT-5, but smaller, faster, and more cost-effective, in an effort to stave off Gemini 3 and Opus 4.5 in coding and reasoning.
The other rumor is that Garlic will have a context window of 400,000 tokens, as well as integrated agentic features. I’m sure we’ll hear quite a bit about this in the coming weeks.
Stephanie Palazzolo on X: “Scoop central! OpenAI has its response to Google’s Gemini 3 and it has a funny allium themed name. “Garlic,” the company’s new pretrained model, is performing well on coding and reasoning benchmarks, according to an internal memo.” https://x.com/steph_palazzolo/status/1995882259195564062
OpenAI Launches Alignment Blog “Today, OpenAI is launching a new Alignment Research blog: a space for publishing more of our work on alignment and safety more frequently, and for a technical audience.” https://alignment.openai.com/#page=1
“In a new proof-of-concept study, we’ve trained a GPT-5 Thinking variant to admit whether the model followed instructions. This “confessions” method surfaces hidden failures—guessing, shortcuts, rule-breaking—even when the final answer looks correct.” How confessions can keep language models honest | OpenAI https://openai.com/index/how-confessions-can-keep-language-models-honest/
Announcing the initial People-First AI Fund grantees | OpenAI “The OpenAI Foundation is announcing the first recipients from the People-First AI Fund, a multi-million dollar investment in community-based nonprofits working to strengthen local communities and expand the opportunity of AI.” https://openai.com/index/people-first-ai-fund-grantees/
OpenAI to acquire Neptune | OpenAI “Training advanced AI models is a creative, exploratory process that depends on seeing how a model evolves in real time. Neptune gives researchers a clear and dependable way to track experiments, monitor training, and understand complex model behavior as it happens.” https://openai.com/index/openai-to-acquire-neptune/
Congrats to the ARC Prize 2025 winners! “The Grand Prize remains unclaimed, but nevertheless 2025 saw remarkable progress on LLM-driven refinement loops, both with “local” models and with commercial frontier models.” https://arcprize.org/blog/arc-prize-2025-results-analysis
Video
Kling Launches Omni O1 There are only a handful of video models at the frontier level: OpenAI’s Sora and Google’s Veo, and then Runway, Seedance, and Kling.
This week, Kling announced their Omni model that they’re calling O1. Kling claims it’s the first unified multimodal video model, which means that it not only generates video, but can also input and understand text, images, video, and audio.
It can understand the content, composition, and perspective of everything you’re uploading.
Kling claims O1 has much better consistency across multiple videos using the same characters, props, or scenes. You can also build an avatar of yourself or anything you want, and then use it repeatedly across multiple videos. Kling also announced an “elements library”, which lets you save things and pull them back out later as props or objects inside your scenes.
Runway Launches Gen-4.5 Runway launched Gen-4.5, their state-of-the-art video tool. The outputs are incredibly realistic. However, the prompt adherence is not always perfect. I think if you don’t know the prompt, you can look at the video and say it’s indistinguishable from a real video. But behind the scenes, the control that you want as a prompter may not be completely there, so it’s still important to be flexible with expectations. Broad, sweeping ideas still win more than exacting specifics.
To their credit, Runway acknowledges the limitations. But even with the limits, the strengths are remarkable. Gen-4.5 is very good at physics, with realistic object weight, momentum, and force. Liquids look like liquids, and surfaces appear strikingly like surfaces. You can build pretty complicated videos, an example being “a cactus person hugging a red balloon person and the red balloon person pops”. There’s also a wide range of aesthetics, from photorealistic to animation. The realism is spectacular.
Full Executive Summaries with Links, Generated by Claude 4.5
Amazon launches Nova 2.0 models with breakthrough agent reliability Amazon’s Nova 2.0 family includes four new AI models spanning text, speech, and multimodal capabilities, with Nova Act achieving 90% reliability for browser automation—a significant leap beyond typical AI agent performance. The release introduces “open training” through Nova Forge, letting companies blend proprietary data with Amazon’s models during training rather than just fine-tuning afterward. Early customers like Hertz report 5x faster software delivery using Nova Act for automated testing.
Amazon is back with Nova 2.0, a substantial upgrade over prior Amazon Nova models and demonstrating particular strength in agentic capabilities Amazon has released Nova 2.0 Pro (Preview), its new flagship model; Nova 2.0 Lite, focused on speed and lower cost; and Nova 2.0 Omni, https://x.com/ArtificialAnlys/status/1995921468010758267
Amazon has launched a new speech-to-speech model, Nova Sonic 2.0, which ranks #2 on our Artificial Analysis Big Bench Audio Speech Reasoning benchmark! The new model achieves a reasoning accuracy score of 87.1% on Big Bench Audio, placing second overall behind Google’s Gemini https://x.com/ArtificialAnlys/status/1995950101068763393
Anthropic confirmed Claude was trained on internal “soul document” defining its values A researcher extracted what appears to be Anthropic’s internal training document for Claude 4.5 Opus, containing detailed guidelines about the AI’s purpose, values, and relationship with users. Anthropic’s Amanda Askell confirmed the document is real and was used in supervised learning, making this the first public glimpse into how major AI labs shape their models’ personalities and ethical frameworks. The document reveals Anthropic’s internal reasoning about building potentially dangerous technology while prioritizing safety.
“I just want to confirm that this is based on a real document and we did train Claude on it, including in SL. It’s something I’ve been working on for a while, but it’s still being iterated on and we intend to release the full version and more details soon.” https://x.com/AmandaAskell/status/1995610567923695633
Anthropic acquires JavaScript runtime Bun as Claude Code hits $1B revenue Claude Code reached $1 billion in run-rate revenue just six months after public launch, prompting Anthropic to acquire Bun, a high-performance JavaScript development toolkit. The acquisition aims to accelerate Claude’s coding capabilities as enterprises like Netflix and Spotify increasingly adopt AI-powered software development. Bun will remain open source while helping scale Claude Code’s infrastructure for rapid enterprise growth.
Anthropic prepares for 2026 IPO potentially worth over $300 billion The Claude chatbot maker has engaged law firm Wilson Sonsini to prepare for a public offering that could beat rival OpenAI to market, while simultaneously raising private funding from Microsoft and Nvidia. This would test whether investors will back loss-making AI companies spending billions on infrastructure, with Anthropic projecting revenue could triple to $26 billion next year as it expands beyond 300,000 business customers.
ChatGPT just turned 3 years old. 0 to 800M weekly users in that time. Fastest in history. Wild how much the way we work, search, and learn has changed in such a little time. https://x.com/rowancheung/status/1995524388918038985
Anthropic launches Claude for Nonprofits with 75% discount and specialized tools Anthropic partnered with GivingTuesday to offer nonprofits up to 75% discounts on Claude AI, plus connectors to fundraising platforms like Blackbaud and Benevity. Early users report dramatic efficiency gains—IDinsight works 16× faster on surveys, while the Epilepsy Foundation now provides 24/7 AI support to 3.4 million Americans. The initiative addresses nonprofits’ resource constraints by making advanced AI affordable and integrating with existing nonprofit workflows.
Apple’s AI chief exits after Siri delays and leadership shake-up John Giannandrea stepped down following Apple’s admission that its upgraded Siri is “taking longer than we thought,” with CEO Tim Cook reportedly losing confidence in his leadership. Former Microsoft AI executive Amar Subramanya will replace him as Apple scrambles to catch up in the AI assistant race. The leadership change highlights Apple’s struggle to compete with rivals like ChatGPT and Google’s AI offerings, despite years of investment in voice technology.
Global consumers embrace AI shopping tools and digital payments this holiday season Visa’s 12-country study reveals 47% of US consumers now use AI for shopping tasks like gift discovery and price comparison, while digital wallets gain dominance globally with Gen Z showing equal preference for digital wallets versus physical cards. The research demonstrates AI’s mainstream adoption in commerce and accelerating shift toward digital-first payment behaviors, particularly among younger consumers who are also driving cryptocurrency acceptance with 45% of US Gen Z excited to receive crypto gifts.
AI beats human VCs at predicting startup survival rates An experiment using older GPT-3.5 and GPT-4 models outperformed experienced venture capital analysts at identifying which early-stage startups would succeed, doing so at significantly lower costs. This suggests AI could democratize investment analysis by providing sophisticated screening capabilities without requiring expensive human expertise. The results are particularly notable because they used outdated AI models, indicating even greater potential as the technology advances.
Interesting experiment found that an AI agent built around the obsolete GPT-3.5 and GPT-4 models beat experienced human venture capital analysts in predicting which early-stage startups would survive based on early screening (at much lower costs as well). https://x.com/emollick/status/1995573136323215560
ByteDance launches Seedream 4.5 with enhanced multi-image editing capabilities Seedream 4.5 introduces improved ability to identify and edit specific subjects across multiple images while preserving reference details and rendering complex typography. This addresses a key limitation in current AI image generators that struggle with consistent multi-image workflows and precise text rendering. The model shows measurable improvements in prompt adherence and visual quality compared to its predecessor, positioning it for professional creative applications like poster design and e-commerce visuals.
Google launches Workspace Studio for no-code AI agent creation Google released Workspace Studio, letting any employee build AI agents to automate work tasks without coding skills—just describe what you want automated in plain English. The platform has already handled 20 million tasks for alpha users in 30 days, with one company reducing feature evaluation time by 90%. This marks a shift from technical automation tools to AI assistants that anyone can create and customize for their specific business processes.
Google launches Gemini 3 Deep Think for complex reasoning problems Google’s new AI mode uses parallel reasoning to explore multiple solutions simultaneously. It scored 41% on Humanity’s Last Exam and 45.1% on ARC-AGI-2, significant advances over previous AI systems on complex math, science, and logic challenges.
Google’s Nano Banana Pro tops image generation leaderboards with 2K resolution Google launched Nano Banana Pro, an AI image generator that creates professional-quality 2K images and can blend up to 14 images in one prompt. Its 2K version has claimed the #1 spot on the LMArena image editing leaderboard, with the regular version at #2, suggesting users prefer higher-resolution output. The system is now available in Search in more countries, including India and the UK, signaling Google’s push to compete directly with other AI image generators like Midjourney and DALL-E.
Built on Gemini 3, Nano Banana Pro uses enhanced reasoning and real-world knowledge to visualize information better than ever ✨ You can also blend more elements than ever before, combining up to 14 images into a cohesive scene with just one prompt. See it in action in the https://x.com/Google/status/1996263265735749682
Gemini 3 and Nano Banana Pro are now available in Search in more countries, including India and the U.K., starting today 🥳 Google AI Pro and Ultra subscribers can try Gemini 3 Pro now in AI Mode. It brings depth and nuance to your hardest questions, including helpful https://x.com/Google/status/1995605066170998917
Generate crisp, 2K resolution images suitable for professional use with Nano Banana Pro. Pro tip: make sure to click “download” on your image instead of “copy” to get the full resolution. Show us yours ↓ https://x.com/GeminiApp/status/1996252061651042751
Image Leaderboard Update: 🖼️📊 Our image leaderboard ranks image generation AIs according to user preference – and Seedream 4.5 from @BytePlusGlobal is speeding up the rankings! Seedream 4.5’s standard version comes in at #4, just below Nano Banana Pro – and the Max version is https://x.com/yupp_ai/status/1997032930846396466
Nano banana pro is hitting the threshold for images that Veo 4 will unlock for video. We’ll suddenly go from static infographics to pro-grade animated motion graphics — like having a custom youtube video essay on any topic imaginable. And just like that ai video will become a https://x.com/bilawalsidhu/status/1994110158138646693
Nano Banana Pro with 2k resolution is now #1 on the lmarena image editing leader board (with regular Nano Banana Pro at #2). It looks like users prefer higher resolution: who’d have thunk it?! https://x.com/JeffDean/status/1996457766349848753
Surprisingly good for the first try. Nano banana pro: “create a map of the US where every state is made out of its most famous food (the states should actually look like they are made of the food, not a picture of the food). Check carefully to make sure each state is right.” https://x.com/emollick/status/1995720976068137048
Love this prompt with Nano Banana via @dotey “CITY=Montreal, Canada Present a clear, 45° top-down isometric miniature 3D cartoon scene of [CITY], featuring its most iconic landmarks and architectural elements. Use soft, refined textures with realistic PBR materials and gentle, https://x.com/fdaudens/status/1995294952558068217
Waymo launches driverless rides in four new cities after rapid expansion The Alphabet-owned robotaxi service went from testing with safety drivers in Dallas just four months ago to fully autonomous operations, demonstrating how quickly self-driving technology can scale once proven. This represents a significant acceleration in commercial autonomous vehicle deployment, with Waymo expanding at over 500% annually and bringing driverless rides to more American cities than any competitor.
FOUR new cities are on the map! 🗺️ The future of mobility is expanding faster than ever and we’re thrilled to bring the proven experience of the Waymo Driver to more people. https://x.com/Waymo/status/1996217860440412641
Waymo started testing with a safety driver in Dallas just 4 months ago. They’re now fully driverless — no one but you in the car. Waymo has been expanding at >500% per year. https://x.com/fchollet/status/1996263334883266961
Mistral AI releases open-source Voxtral speech models under Apache license Mistral’s new Voxtral models offer state-of-the-art speech transcription and understanding capabilities in 24B and 3B parameter versions, competing directly with hosted services like OpenAI’s Whisper API at half the cost. The models handle 30-40 minute audio files, support multiple languages natively, and can perform question-answering and summarization directly from speech without requiring separate transcription steps. Benchmarks show Voxtral outperforming existing open-source alternatives and matching closed commercial systems while being freely available for modification and deployment.
Mistral releases open-source AI models rivaling closed commercial systems Mistral 3 includes four new models under Apache 2.0 license, with the flagship Large 3 ranking #6 among open models (#28 overall) on the Arena Text leaderboard and matching top commercial systems on key benchmarks. The release spans from 3B-parameter edge models to a 675B-parameter mixture-of-experts system, marking a significant shift toward high-performance open AI that enterprises can customize and deploy without vendor lock-in.
🚨BREAKING: Text Leaderboard Update: A new open source model has landed on the leaderboard! Mistral-Large-3 lands at #6 among open models and #28 overall on the Text leaderboard. Mistral 3 is the next generation of Mistral AI models and their most capable model family to date. https://x.com/arena/status/1995877395510051253
Moondream launches AI that segments any object from plain English descriptions The company’s new segmentation tool can identify and outline objects like “dirty laundry” or “hairline cracks” using natural language prompts, achieving 86.9% accuracy on benchmarks while being 10x cheaper than competing solutions. This breakthrough eliminates the need to retrain models for new object types, potentially transforming quality control, agriculture, and media applications.
Moondream’s new segmentation just dropped. Prompt: “dirty laundry items on the bed.” Moondream: pixel-perfect + actually understands the scene. SAM 3: grabs the floor. https://x.com/moondreamai/status/1996001944838832501
OpenAI declares ‘code red’ as Google’s Gemini gains 650 million users OpenAI CEO Sam Altman issued an internal emergency directive to improve ChatGPT after Google’s Gemini 3 model topped industry benchmarks and attracted high-profile endorsements, forcing OpenAI to delay advertising plans and other products. This reverses the competitive dynamic from 2022 when Google declared its own “code red” after ChatGPT’s launch, highlighting how quickly AI leadership can shift. OpenAI plans to counter with a new reasoning model called “Garlic” next week that may outperform Gemini 3 in internal tests.
Scoop central! OpenAI has its response to Google’s Gemini 3 and it has a funny allium themed name. “Garlic,” the company’s new pretrained model, is performing well on coding and reasoning benchmarks, according to an internal memo. https://x.com/steph_palazzolo/status/1995882259195564062
Accenture and OpenAI form partnership to accelerate business AI adoption The consulting giant will help enterprises implement OpenAI’s models across operations, combining Accenture’s 40,000 AI specialists with OpenAI’s technology to bridge the gap between AI capabilities and practical business deployment. This partnership addresses the critical challenge companies face in moving from AI experimentation to scaled implementation across their organizations.
OpenAI launches dedicated blog for AI safety research publication OpenAI created a new technical blog to share alignment and safety research more frequently, signaling increased transparency in their efforts to ensure AI systems behave as intended. This marks a shift toward more open communication about the critical challenge of keeping advanced AI aligned with human values.
Today, OpenAI is launching a new Alignment Research blog: a space for publishing more of our work on alignment and safety more frequently, and for a technical audience. https://x.com/j_asminewang/status/1995569301714325935
We trained a variant of GPT-5 Thinking to produce two outputs: (1) the main answer you see. (2) a confession focused only on honesty about compliance. The main answer is judged across many dimensions—like correctness, helpfulness, safety, style. The confession is judged and https://x.com/OpenAI/status/1996281175770599447
OpenAI launches $1 million fund for community AI projects OpenAI announced its first round of grants from a new $1 million “People-First AI Fund” designed to support community organizations using AI for social good. The initiative marks a shift toward grassroots AI deployment, moving beyond corporate applications to fund local nonprofits, schools, and community groups. This represents OpenAI’s attempt to democratize AI access while building goodwill amid growing scrutiny over AI’s concentration among tech giants.
OpenAI acquires Neptune to enhance AI model training capabilities OpenAI is buying Neptune, a specialized metrics dashboard company that helps researchers monitor and debug AI model training processes. This acquisition gives OpenAI deeper visibility into how their frontier models learn during the complex training phase, potentially accelerating development of more advanced AI systems. Neptune will wind down external services to focus exclusively on supporting OpenAI’s research toward artificial general intelligence.
ARC Prize 2025 shows AI reasoning systems hitting new milestones The competition saw commercial AI models reach 37.6% accuracy on abstract reasoning tasks, while specialized “refinement loop” systems achieved 54% – demonstrating that AI can now iteratively improve its problem-solving approach. This marks significant progress from last year’s results, with all major AI labs now using the ARC benchmark to measure their frontier models’ reasoning capabilities, though the grand prize for human-level performance remains unclaimed.
Congrats to the ARC Prize 2025 winners! The Grand Prize remains unclaimed, but nevertheless 2025 saw remarkable progress on LLM-driven refinement loops, both with “local” models and with commercial frontier models. We also saw the rise of zero-pretraining DL approaches like HRM https://arcprize.org/blog/arc-prize-2025-results-analysis
Kling AI launches first video model with native audio generation Chinese AI company Kling AI released VIDEO 2.6, its first model that generates both video and synchronized audio simultaneously, eliminating the need for separate audio production. The launch includes Avatar 2.0 for 5-minute character performances and Element Library for consistent character generation across multiple videos. This represents a significant step toward automated video production, as most competitors still require separate tools for audio and visual content.
Day 3: Meet VIDEO 2.6 — Kling AI’s First Model with Native Audio Generate an entire experience — more than a video clip! With coherent looking & sounding output, the 2.6 model opens up narrative possibilities, and makes you “See the Sound, Hear the Visual”. With the launch of https://x.com/Kling_ai/status/1996238606814593196
Day 4: Meet KlingAI Avatar 2.0 — upgraded, expressive, and built for full 5-minute performances. From explainers to ads, songs to stories — it plays every role with ease. Max Expressions, Real Characters. For the next 12 hours ONLY Follow, Like & Retweet to get 200 Credits — https://x.com/Kling_ai/status/1996592857096868075
Day 5: Bonus announcement for the final day of Kling Omni Launch Week! We’re excited to introduce Before & After Template! This feature generates a quick comparison between your input and output, showcasing your creativity with ease! Now, time to showcase your best work with https://x.com/Kling_ai/status/1996859217173496011
Day 5: Final day of Kling Omni Launch Week. Meet Element Library — a powerful tool for building ultra-consistent elements with easy access for video generation! Build your elements with images from multiple angles, and have Kling O1 remember your characters, items, and https://x.com/Kling_ai/status/1996853574773637296
Kling 2.6 has landed in ElevenLabs Image & Video. Kling’s first audio-video model that lets you generate fully voiced, character-driven scenes with unlimited narrative possibilities. https://x.com/elevenlabsio/status/1996239001590682077
Need more angles & compositions on your shot? Simply prompt Kling O1 to generate shots with different angles and compositions based on the reference video. Multi-shot made easy! https://x.com/Kling_ai/status/1995698062371754450
Runway launches Gen-4.5 video generator with multiple visual styles The new AI model can create videos in photorealistic, animated, and puppet styles while maintaining visual consistency across scenes. This addresses a key limitation of previous video generators that struggled to maintain coherent aesthetics, potentially enabling more professional storytelling applications for filmmakers and content creators.
Gen-4.5 can handle a wide range of aesthetics, from photorealistic and cinematic to “practical” puppetry, 3D animation, anime and more. All while maintaining a coherent visual language across your generations. So you can explore all possible worlds to tell all imaginable stories. https://x.com/runwayml/status/1996586320110440848
Runway released Gen-4.5 today and it is already ranked first on the Video Arena leaderboard. We sat down with CEO @c_valenzuelab to discuss how a small team is currently beating Google and Meta in the race for state-of-the-art video generation. The full episode is below! https://x.com/wandb/status/1995548641801765249