About This Week’s Covers

This week’s humanities reading is a short quote from a poem by Louise Glück: “We look at the world once, in childhood. The rest is memory.” I thought that was pretty spectacular because it lines up with the way large language models are trained all at once and then released. After that, with the exception of tool use, they basically have to rely on their training and “memory.” So there’s a really cool parallel with this quote.

For the main cover image, I decided to have a child who is essentially training on the world through way too much media consumption. Each TV has a famous AI thought-leader.

The cover image was first generated by Flux 1.1 Ultra. Flux outperformed Gemini and ChatGPT on image quality. I liked the hues and color tones, and the composition of the televisions.

I took Flux’s image and created masks on the TVs using Photoshop, then swapped what was on the screens with four prominent computer scientists: Andrej Karpathy, Mira Murati, Geoffrey Hinton, and Dr. Gladys West, a pioneering Black American mathematician who was critical to the development of GPS.

Original Flux Image

I’ve consistently noticed that if you prompt any of these image tools for a vague young person, they all generate Caucasian children with no diversity whatsoever. The genders seemed fairly evenly distributed: across eight test images, I got about half boys and half girls.

Here is an image from running the same prompt through Gemini.


I gave the Flux prompt to Claude and ran it through my Python script that generates the categories for my 53 covers. The covers came out pretty well. Here are a few of my favorites:

This Week By The Numbers

Total Organized Headlines: 576

This Week’s Executive Summaries

This week, I organized 576 links, and 115 of them informed the executive summaries. I’ve crossed the 45,000-link mark! 45,200 manually organized AI links over the past 116 weeks.

I’m organizing this week’s headlines in alphabetical order by company, with the exception of a few that don’t fit. You’ll be able to skim and look at the bold headlines, and as we approach Z, you’ll know you’re near the end.

But first, my favorite story of the week is Segment Anything Model Audio (SAM Audio).

Meta’s Segment Anything Model Audio (SAM Audio)

“With SAM Audio, you can use simple text prompts to accurately separate any sound from any audio or audio-visual source.”
https://ai.meta.com/samaudio/

My favorite sub-genre of artificial intelligence is a category of skills called segmentation and depthing. Segmentation is the ability to identify an object and select it. Depthing is knowing how close or far away anything is from the camera.

People who use Photoshop will think of segmentation as masking, and anyone who’s watched sports might think of tracking an athlete or the puck in a hockey game. Depthing is a little more nuanced, and it often looks like a heat map or thermal imagery. However, in this case, instead of heat, it’s distance from the camera.
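To make the two ideas concrete, here is a minimal sketch in plain Python (the depth values are made up): segmentation is just a boolean mask over pixels, and depthing is a per-pixel distance map you normalize before rendering it as a heat map.

```python
# Hypothetical 3x3 depth map: each value is distance from the camera in meters.
depth = [
    [2.0, 2.1, 8.0],
    [2.2, 5.0, 9.5],
    [2.3, 5.2, 10.0],
]

flat = [d for row in depth for d in row]
lo, hi = min(flat), max(flat)

# Depthing: normalize distances to 0..1 so a color ramp can render
# near vs. far -- the "heat map" look, but for distance instead of heat.
normalized = [[(d - lo) / (hi - lo) for d in row] for row in depth]

# Segmentation as masking: select every pixel closer than 3 m (the foreground subject).
foreground_mask = [[d < 3.0 for d in row] for row in depth]
```

Real systems compute both maps with neural networks, of course, but the outputs have exactly this shape: one distance array and one mask array per image.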

One of the recent breakthroughs in segmentation and depthing is the ability to talk to a video system in order to identify or track an object. For example, segmentation would allow a firefighting drone to be given instructions in plain English rather than using a joystick or coordinates. You could say, “Fly up to that tree on the right with the bare branch and hover above it,” or “Track the surfer wearing the red swimsuit.” Or you could say, “Fly into the upper right-hand window on the second floor of that building.” Clearly, being able to talk to a computer and interact with images or video is pretty powerful.

Meta is one of the leaders in this space, with a lot of open-source segmentation and depthing tools.

This week, Meta came out with the ability to segment audio, which is an incredible pivot from just using visuals.

Now you can take multi-layered audio and extract individual elements using plain language. The example Meta uses is a woman outside a nightclub talking on her phone while a train goes by. You could tell the system to isolate just the train noise, only the music, or, more practically, her voice.
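SAM Audio learns the separation end to end from the text prompt, but what it produces is conceptually a mask over the audio. As a toy illustration only (synthetic tones, nothing to do with Meta’s actual model), here is frequency-domain masking that “isolates” one layer of a two-layer mixture:

```python
import numpy as np

sr = 8000                       # sample rate in Hz
t = np.arange(sr) / sr          # one second of audio

# Toy two-layer mixture: a low rumble (stand-in for the train)
# plus a mid-frequency tone (stand-in for the voice).
train = np.sin(2 * np.pi * 80 * t)
voice = np.sin(2 * np.pi * 440 * t)
mix = train + voice

# "Isolate the voice": zero out every frequency outside its band.
spectrum = np.fft.rfft(mix)
freqs = np.fft.rfftfreq(len(mix), d=1 / sr)
band_mask = (freqs > 300) & (freqs < 600)
isolated = np.fft.irfft(spectrum * band_mask, n=len(mix))
```

The hard part SAM Audio solves is mapping a free-text prompt to the right mask for real-world sounds that overlap in frequency, which simple band filtering like this cannot do.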

This is by far my favorite update of the week, and it’s worth browsing Meta’s announcement. Technically speaking, Google’s Gemini 3 Flash is a bigger deal, but just as a nerd, I love this one the most.

https://ai.meta.com/sam3/
https://ai.meta.com/sam3d/
https://ai.meta.com/research/publications/pushing-the-frontier-of-audiovisual-perception-with-large-scale-multimodal-correspondence-learning/

Allen Institute for AI

Molmo 2: State-of-the-art video understanding, pointing, and tracking
What a segue from the audio segmentation!

Molmo 2 is spectacular, and it dovetails right into segmentation and depthing, but takes it a step further. Molmo can follow and track objects in videos, but it can also understand the context of a video or image well beyond segmentation and depthing.
https://allenai.org/blog/molmo2

One of the best examples from the Molmo release blog is that you could have Molmo watch a video of someone cooking, and Molmo will extract recipe instructions from the video. Molmo is so good at looking at video and understanding it that it can even sniff out artifacts from AI deepfakes. There are some incredible demonstrations on YouTube of how Molmo can count objects, track videos, and answer complex questions. I encourage you to watch them all.

The Allen Institute for AI is a nonprofit scientific research institute founded by Microsoft co-founder Paul Allen in 2014. All of this is open source and free to download and use.
https://en.wikipedia.org/wiki/Allen_Institute_for_AI

Amazon

Amazon announced that Peter DeSantis is going to run the AI team. Peter’s been at Amazon for over 27 years. He has a degree in economics and computer science from Dartmouth. I have mixed feelings about Amazon. However, I appreciate that they’ve continued to promote from within. This is a huge contrast to a company like Meta, where young hotshot talent is being paid monopoly-money through acquihires.

If you’ve been somewhere for 27 years, that means you’ve never been anywhere else, which limits your experience culturally. However, it also gives you unequaled knowledge of the workings of a corporation and its holistic nuances. I think it’s neat that Peter has been promoted, and underneath him, of course, he can have all the new hotshots. I think this continuity at Amazon, including Andy Jassy, who has been there forever, is a really neat testament to Amazon, and unique in today’s day and age.
https://www.theregister.com/2025/12/17/jassy_taps_peter_desantis_to_run_agi/
https://www.aboutamazon.com/news/company-news/andy-jassy-peter-desantis-amazon-leadership-update

Anthropic

Anthropic preparing new Agentic Tasks Mode for Claude
“Anthropic testing Claude’s Agent mode with a new interface for tasks, to introduce new modes for research, analysis, writing, and building.”
https://www.testingcatalog.com/anthropic-testing-new-agentic-tasks-mode-for-claude/

Sonnet 4.5 was underestimated on METR
Benchmarks are one of the most important measurements of AI’s capabilities. It’s critical to have a variety of benchmarks because some AI models can train for the test or have jagged performance across benchmarks.

The creativity of benchmarks never ceases to amaze me. One of my favorites is from a nonprofit called METR. They evaluate frontier AI models to understand capabilities and risks. METR assesses the extent to which an AI system can autonomously carry out a task, as measured by length of time: given a general-purpose task like researching a topic or developing an application, how long can a model run without falling apart?

For example, teaching an AI model how to categorize data (e.g., telling the difference between a cat and a dog in a photo) would take an AI agent about 45 minutes. Fixing bugs in a small Python library would take just over an hour. Figuring out how to exploit a fairly detailed, technical weakness in a security system might take almost two and a half hours. And building a full image model might take four hours.

These sorts of tests are important to see how much an AI agent can handle without getting lost in the sauce or losing its place in a long linear series of tasks, refinements, and reattempts.

My favorite benchmark is the “time horizon of software engineering tasks that an AI model can complete 50% of the time”. It’s a benchmark for just comparing agents against each other. The speed of improvement is something else.

At the beginning of this year, OpenAI’s o1 model could make it 41 minutes. In February, Claude 3.7 Sonnet made it 56 minutes. In July, Grok 4 made it 1 hour and 49 minutes. In August, GPT-5 lasted 2 hours and 18 minutes. In November, GPT-5.1 Codex Max was able to stay on task for 2 hours and 53 minutes. And by the end of November, Claude Opus 4.5 was able to focus on a task that was 4 hours and 49 minutes long.

From the beginning of 2024 to the beginning of 2025, the improvement was much smaller: 9 minutes in January, ending at 22 minutes in December. From 2022 to 2023, models improved from 36 seconds to 5 minutes.

So the amount of improvement in one year is fairly steep. This week, Claude jumped 20 minutes.
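For a back-of-the-envelope check on that steepness, the year’s endpoints quoted above (41 minutes in January, 4 hours and 49 minutes by late November) imply a doubling time of roughly three and a half months. A quick sketch of the arithmetic (mine, not METR’s curve-fitting methodology, which fits trends across many models):

```python
import math

# Two-point estimate from the figures quoted above.
start_minutes = 41            # o1, January
end_minutes = 4 * 60 + 49     # Claude Opus 4.5, late November
months_elapsed = 10           # roughly January through November

doublings = math.log2(end_minutes / start_minutes)
months_per_doubling = months_elapsed / doublings
# doublings ~= 2.8, months_per_doubling ~= 3.5
```

Contrast that with 2024’s 9-to-22-minute crawl, which works out to a doubling time closer to nine months.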

Benchmarks

The 2025 Physicians AI Report
“Surprisingly rapid & high Ai adoption by doctors: 67% use it daily, 84% says it makes them better doctors, 42% says it makes them want to stay in medicine more (10% said less). A lot of the use cases appear to be administrative and research assistance.” https://x.com/emollick/status/2001061282485547116

“What we found lines up with why we started Offcall in the first place: to restore physician autonomy and help doctors take back control of their profession.”

“While healthcare organizations struggle with AI adoption, physicians are already using it daily, and they’re desperate for more. But there’s a catch: they’re terrified of who’s controlling it. And if they don’t get more empowered to affect how it’s deployed, it runs the risk of burning out the workforce even further.”
https://2025-physicians-ai-report.offcall.com/

Top app with doctors: https://www.openevidence.com/

Reasoning Models Ace the CFA Exams
“All the frontier AIs now pass all levels of the very challenging Chartered Financial Analyst (CFA) exam”

“The paper used paywalled, new mock exams to reduce the risk of leakage but AI grading for the essays. Interestingly, prompting strategy doesn’t matter for most question types”
https://x.com/emollick/status/2000605774695837711
https://arxiv.org/pdf/2512.08270

Business

BBVA and OpenAI collaborate to transform global banking
“BBVA and OpenAI are expanding their collaboration with a multi-year strategic AI transformation program that will see ChatGPT Enterprise rolled out to all 120,000 global employees—a 10x increase on their current deployment. Under this new agreement, OpenAI will help advance BBVA’s AI strategy, which aims to transform the customer experience, enable new ways of working, and optimize internal operations.” https://openai.com/index/bbva-collaboration-expansion/

Databricks raises capital at $134 billion valuation in latest funding round
“The valuation is a 34% jump from the funding round announced in August, which valued the company at $100 billion. At the time, Databricks became one of a handful of private companies to surpass a $100 billion valuation, after SpaceX, ByteDance and OpenAI.”

“Databricks said it plans to use the capital to support customer app building as artificial intelligence accelerates development. It wants to be the go-to company for organizations looking to build and run AI agents that can carry out work, Ali Ghodsi, Databricks’ co-founder and CEO, told CNBC in an interview.”

“It’s kind of a land grab, with do-it-yourself winning right now. So that’s a big opportunity,” he said.
https://www.cnbc.com/2025/12/16/databricks-funding-valuation.html

Meta’s Yann LeCun targets $3.5 billion valuation for new AI startup
“The startup plans to build AI systems using ⁠world models that can ​understand the physical world. The systems ​could support applications including robotics and transport.” https://finance.yahoo.com/news/metas-yann-lecun-targets-3-110641727.html

Fastweb + Vodafone Creating Customer Service Agents
“Fastweb + Vodafone (Swisscom Group), one of Europe’s leading telecom providers, is building Super TOBi, which brings agentic customer service to massive scale. Using LangSmith, they are:

Achieving 90% response correctness and 82% resolution rates across ~9.5M customers
Running daily automated evals with human oversight to continuously improve agent behavior
Getting end-to-end observability into how agents reason, route, and act in real customer interactions”
https://www.blog.langchain.com/customers-vodafone-italy/

Flux (Images)

FLUX.2 [max] “Our highest quality model to date. Grounded generation – searches the web for real-time context. Up to 10 reference images. Products, characters, styles stay consistent. #2 on ArtificialAnlys in text-to-image and image editing.”
https://bfl.ai/models/flux-2-max

This week’s cover image was created (largely) using Flux 1.1 Ultra. Here is the same prompt run through Flux.2 Max. I ran four iterations, and these are my favorite two.

This week’s cover prompt but run through the strongest Flux model. One result.
This week’s cover prompt but run through the strongest Flux model. Another result.

The real strength is the natural lighting and soft colors. https://bfl.ai/models/flux-2-max

Google

Antigravity
Google released an agent development app called Antigravity. It’s an actual software application you download. It’s meant to power-boost AI agent development using a combination of traditional tools (like a source code editor) but more importantly natural-language commands.

WYSIWYG stands for “What you say is what you get” now?

Essentially, Antigravity bundles app development for small projects into a soup-to-nuts container that even non-coders could potentially use to build little agents.

The example Google uses is a flight tracker, where you ask Antigravity to build you a flight-tracking tool, and Antigravity uses Google’s NanoBanana to design the mock up for the interface. You then give feedback on the interface the same way you’d tell a designer or product manager. Google Antigravity then iterates on the UI and builds the flight-tracking agent. Along the way, it tells you what it’s doing and takes screenshots of its work. So it’s basically a co-pilot on steroids that you have to download and, in theory, it will build apps for you as a partner/developer.

The sales pitch from Google is that you can focus on your solutions. You dream up the architecture of what you want to do as components, and then Google Antigravity builds all the components and puts them together as the codebase.

Much like HTML editors or WordPress, where you can switch back and forth between a WYSIWYG editor and the code, Antigravity has the ability to switch views. You can use the natural-language command view, or you can go back to a more traditional co-pilot environment where you can look at the code and use tab autocompletion and things like that. https://antigravity.google/

This YouTube video is a great introduction.

Google Opal
As a contrast to Google Antigravity, or more like a complement/kid sister, Google has also released a web-based app builder. It’s not as robust, but it shows just how much third parties are going to need to worry about building moats with API-driven software-as-a-service, because Google can just release a wrapper-killer any day they want.
https://blog.google/innovation-and-ai/models-and-research/google-labs/mini-apps-opal-gemini-app-experiment/

In this case, Opal is a WYSIWYG interactive app builder that reminds me a lot of n8n, or maybe Zapier. You drag-and-drop elements to create an app workflow. You wireframe your app, then hit build, and it generates it for you.

Google has a great how-to article that demonstrates a recipe-generation tool for meal planning. The thing that really stands out to me is that you can pick which Google model you want to use from all of their various AI options. This feels strikingly similar to a lot of third-party tools for enterprise use.

There is literally no protection anymore for people who are building apps using APIs. To me, it’s like Garmin GPS getting eaten by Google Maps. Just like software ate the hardware world, AI can pretty much eat any of the software world(s). I think it’s going to be a very interesting next three years. https://opal.google/

Google Launches Full Integration of Gmail, Calendar, and Drive for Daily Briefings
It sounds like Google’s version of ChatGPT Pulse (but grounded in Google products v. broad topics). “Say hello to CC, a new AI productivity agent that connects your Gmail, Calendar and Drive to deliver a personalized briefing every morning. Need more help? Just email CC.”
https://labs.google/cc

Gemini 3 Flash: The Most Important Model Update of the Year?
Google released Gemini 3 Flash, which gives very strong frontier-level intelligence at a much cheaper cost. This is important: you don’t always need the greatest model in the world for every task. This is true especially for folks using APIs where you pay per token. Sometimes you just need a smaller model that can get the job done.

What’s amazing about Google Flash (Gemini 3 Flash) is that it actually beats many of the frontier models on benchmarks while coming in cheaper than them.

Gemini 3 Flash is a quarter of the price of Gemini 3 Pro and goes neck and neck with it on almost every available benchmark.

This is the Pareto frontier, a fancy term for the best available trade-offs between quality, cost, and responsiveness. A model sits on the frontier when nothing else beats it on one of those dimensions without giving up another: for its price, it’s as fast and as good as you can get.
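Here is a tiny sketch of the idea, with made-up model names, prices, and scores: a model is on the Pareto frontier if no other model is at least as cheap and at least as good, and strictly better on one of the two axes.

```python
# Hypothetical models: (name, cost in $ per million tokens, benchmark score).
models = [
    ("big-pro",    10.0, 92),
    ("fast-flash",  2.5, 90),
    ("old-mini",    3.0, 70),
    ("tiny",        0.5, 60),
]

def dominates(a, b):
    """True if model a is cheaper-or-equal AND scores higher-or-equal than b,
    and strictly better on at least one of the two axes."""
    return a[1] <= b[1] and a[2] >= b[2] and (a[1] < b[1] or a[2] > b[2])

frontier = [m for m in models if not any(dominates(o, m) for o in models if o is not m)]
print([name for name, _, _ in frontier])  # → ['big-pro', 'fast-flash', 'tiny']
```

Note how “old-mini” falls off the frontier because “fast-flash” is both cheaper and better, which is exactly the squeeze a strong, cheap model puts on the mid-tier.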

Gemini 3 Flash has advanced coding skills with very low latency for anyone needing to build an interactive app. It actually outperformed the frontier models on many tasks, which is never something you see with “flash” or “turbo” or “mini” models. Google has taken the gloves off.

The previous small version, Gemini 2.5 Flash, was bad at tool calling, but Gemini 3 Flash is very good at it.

It’s also natively multimodal, which means it can understand images and videos and pull context from them using queries.

It’s really worth looking into the Gemini 3 product release page and watching the demo videos in particular.
https://blog.google/products-and-platforms/products/gemini/gemini-3-flash/

Gemini Integration Throughout Google
Google is starting to integrate Gemini 3 into all of its products, so that Gemini can leverage those products as tools when you have a query. Google has so many products…Search, Google Maps, Google Docs, Gmail, Calendar, etc…that are already built, owned, and operated by Google, and it can connect all of those things to Gemini.

Gemini can embed deeply into the core experience of most people’s search patterns, like looking for a restaurant, booking a flight, finding a place on a map, or getting directions. Google’s example of the integration is a user querying a hike that’s stroller-friendly in San Francisco, with easy access to restaurants. Google can come back with not only the hiking recommendation, but also the maps, pictures, and a lot of context, all using Google’s owned products.

I’m pretty sure Google is leveraging Gemini 3 Flash, which is a nice segue from the story above this one. Google is absolutely fighting for their life, and in my opinion, coming back very strong. If you asked me who the frontrunner was at the end of 2025, I’d say Google.

Commentary About Google’s Laser Focus
“Got Sergey-pilled recently. Google invented Transformer, OpenAI ran with it. Sergey owned the mistake, slammed the gas on Gemini, cut through big-corp BS with his super voting power, and forced Google back into startup mode. Founder mode matters.” https://x.com/Yuchenj_UW/status/2000435232089207179

Gemini Image + AI Testing
Wharton professor Ethan Mollick consistently comes up with clever examples of what I call “soft benchmarks,” creative tests to see if an AI can handle nuance. Gemini 3 came out a couple weeks ago, and people have had a chance to test it. In particular, the integration with Gemini’s image tool is very powerful for graphics.

Ethan created two clever tests:

One was a Venn diagram test, where he asked: “Do a very novel and clever and funny Venn diagram. Think hard. Do not do research.” And I have to say, it’s a pretty spectacular Venn diagram—very creative—with an overlap of TikTok influencers, Victorian child ghosts, and cats. https://x.com/emollick/status/2000805347590856822

His second test was for Gemini 3 to create a subway map for Middle-earth from The Lord of the Rings. https://x.com/emollick/status/1999930443001737700

Gemini For Visual Reporting
“Bring your research to life with integrated visual reports from Gemini Deep Research.”

Google Gemini can build presentations to visualize complex information and reporting. Gemini’s Deep Research tool can now integrate spreadsheets and graphs, pie charts, and interactive diagrams. It’s sort of the serious side of that tool integration we talked about earlier in this week’s newsletter.
https://blog.google/products-and-platforms/products/gemini/visual-reports/

Google expands Gemini with NotebookLM integration
“NotebookLM integration is rolling out to Gemini web, letting users attach notebooks as live data sources for conversations, supporting both free and paid accounts.”
https://www.testingcatalog.com/google-expands-gemini-with-notebooklm-integration/

Gemini Agent
“Gemini Agent can help tackle all sorts of tasks. Even renting a car. Tell Gemini Agent your budget and it’ll get to work comparing prices, gathering info from your inbox, and booking the car. Now available for Google AI Ultra users in the US on desktop and mobile.” https://x.com/GeminiApp/status/2000616120106221781

Video As World Simulation Tools for Robotics
Evaluating Gemini Robotics Policies in a Veo World Simulator
https://veo-robotics.github.io/

Meta

NYT Falsely Claims Alexandr Wang Is Not an Engineer: Perception v. Reality
Alexandr Wang has gotten quite a bit of press as one of the highest-paid AI executives. His compensation package with Meta exceeded $100 million in bonuses after Meta invested a little under $15 billion in his company Scale AI. He is 28 years old and worth about $2 billion.

Alexandr’s in the news as an entrepreneur, and his move to Meta was widely seen as an acquihire when Meta invested in Scale AI. The New York Times wrote that Alexandr was not an engineer, but rather a “well-connected AI entrepreneur”.

In fact, Alexandr was a coding prodigy at 17 years old and was ranked as the 10th-best competitive programmer in the United States before he turned 18.
https://x.com/justindross/status/2001148079341174875
https://x.com/alexandr_wang/status/2001217783497945140

Nvidia

NVIDIA Debuts Nemotron 3 Family of Open Models
“The Nemotron 3 family of open models — in Nano, Super and Ultra sizes — introduces the most efficient family of open models with leading accuracy for building agentic AI applications.”

“As organizations shift from single-model chatbots to collaborative multi-agent AI systems, developers face mounting challenges, including communication overhead, context drift and high inference costs. In addition, developers require transparency to trust the models that will automate their complex workflows. Nemotron 3 directly addresses these challenges, delivering the performance and openness customers need to build specialized, agentic AI.”
https://nvidianews.nvidia.com/news/nvidia-debuts-nemotron-3-family-of-open-models

OpenAI

GPT-5.2 Feedback
Last week, OpenAI released GPT-5.2, which was specialized (especially for API users) to assist professional knowledge work. I covered it at length last week. However, there’s some additional feedback coming in…
https://openai.com/index/introducing-gpt-5-2/

On the first day of its release, GPT-5.2 exceeded a trillion tokens in the API. ARC Prize compared GPT-5.2 Pro against last year’s o3 model and found a ~390x efficiency improvement over the last 12 months!

The GDPval benchmark for AI v. humans at knowledge work shows that GPT-5.2 is at 71% on tasks that would take a human four to eight hours. In other words, GPT-5.2 beats or ties human experts 71% of the time! The previous model, GPT-5.1, only beat a human 38% of the time.

Ethan Mollick demonstrated 5.2 Thinking as a second-opinion and fact-checking assistant. He gave it a very dense paragraph with a few errors that required research to confirm, and GPT-5.2 was able to identify and correct all of the minor problems.

Fidji Simo on X: “GPT-5.2 is here and it’s the best model out there for everyday professional work. On GDPval, the thinking model beats or ties human experts on 70.9% of common professional tasks like spreadsheets, presentations, and document creation. It’s also better at general intelligence,”
https://x.com/fidjissimo/status/1999183159356006450

Sam Altman on X: “GPT-5.2 exceeded a trillion tokens in the API on its first day of availability and is growing fast!”
https://x.com/sama/status/1999624463013544024

ARC Prize on X: “A year ago, we verified a preview of an unreleased version of @OpenAI o3 (High) that scored 88% on ARC-AGI-1 at est. $4.5k/task Today, we’ve verified a new GPT-5.2 Pro (X-High) SOTA score of 90.5% at $11.64/task This represents a ~390X efficiency improvement in one year”
https://x.com/arcprize/status/1999182732845547795

Ethan Mollick on X: “Whoa. This new GDPval score is a very big deal. Probably the most economically relevant measure of AI ability suggesting that in head-to-head competition with human experts on tasks that require 4-8 hours for a human to do, GPT-5.2 wins 71% of the time as judged by other humans”
https://x.com/emollick/status/1999189828756263359

Ethan Mollick on X: “I have found GPT-5.2 Thinking to be a surprisingly deep second-opinion/fact checker. I gave it a dense paragraph with a few correct claims, a couple errors that required research to find, and some things that needed interpretation It found and gently corrected all the problems”
https://x.com/emollick/status/2000666007010971787

OpenAI Rolls Back ChatGPT’s Model Router System for Most Users
“As OpenAI scrambles to improve ChatGPT, it’s ditching a feature in its free tier that contributed to last summer’s user revolt.”

One of the big features of GPT-5.2 was an automatic routing system that tried to optimize the balance of the “Pareto frontier”: getting the best response in the shortest amount of time.

Since an AI model can handle a wide variety of tasks, there are certain prompts that take a lot of thinking and others that can be more off-the-cuff. To balance this, OpenAI introduced a router that essentially reads your query and tries to assign it to the right “level of thinking” to match your intention. So “2 + 2 = 4” should be really fast, but if you want to solve a math question like Gödel’s incompleteness theorem, maybe that will take a few days (I kid).
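As a toy illustration of the routing idea (OpenAI’s real router is a learned classifier inside the product; these rules and tier names are entirely made up):

```python
def route(query: str) -> str:
    """Assign a query to a 'level of thinking' with crude, illustrative heuristics."""
    hard_markers = ("prove", "derive", "step by step", "theorem", "debug")
    if any(marker in query.lower() for marker in hard_markers):
        return "deep-thinking"
    if len(query.split()) > 40:        # long queries usually need more context work
        return "standard-thinking"
    return "fast"

print(route("2 + 2"))                          # fast
print(route("Prove the triangle inequality"))  # deep-thinking
```

The calibration problem described below is visible even in this sketch: every threshold and keyword is a judgment call about how much waiting a user will tolerate for a better answer.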

The router erred on the side of too much thinking, and people got sick of waiting. Occasionally, it would be too quick and give a bubblegum answer to a serious question. I think the router was a great idea, but out of the gate it just wasn’t calibrated well.

I don’t use the router at all. I use 5.2 Thinking for everything, with the exception of really deep stuff where I’ll use Pro. https://www.wired.com/story/openai-router-relaunch-gpt-5-sam-altman/

OpenAI in talks with Amazon about investment that could exceed $10 billion
“OpenAI is in discussions with Amazon about a potential investment and an agreement to use its artificial intelligence chips, CNBC confirmed on Tuesday.”

“Microsoft has invested more than $13 billion in OpenAI and backed the company since 2019, but it no longer has a right of first refusal to be OpenAI’s compute provider, according to an October release. OpenAI can now also develop some products with third parties.”

“Amazon has invested at least $8 billion into OpenAI rival Anthropic, but the e-commerce giant could be looking to expand its exposure to the booming generative AI market. Microsoft has taken a similar step and announced last month that it will invest up to $5 billion into Anthropic, while Nvidia will invest up to $10 billion in the startup.”

“Amazon Web Services has been designing its own AI chips since around 2015, and the hardware has become crucial for AI companies that are trying to train models and meet growing demand for compute. AWS announced its Inferentia chips in 2018, and the latest generation of its Trainium chips earlier this month.”

“OpenAI has made more than $1.4 trillion of infrastructure commitments in recent months, including agreements with chipmakers Nvidia, Advanced Micro Devices and Broadcom. Last month, OpenAI signed a deal to buy $38 billion worth of capacity from AWS, its first contract with the cloud infrastructure leader.”
https://www.cnbc.com/2025/12/16/openai-in-talks-with-amazon-about-investment-could-top-10-billion.html

Edit with Photoshop in ChatGPT
“Today, we’re empowering anyone to edit with Photoshop directly in ChatGPT, making this iconic creativity tool accessible for everyone. If you have an idea, you can use Photoshop for ChatGPT to make edits to photos, just by describing what you want in your own words.” https://blog.adobe.com/en/publish/2025/12/10/edit-photoshop-chatgpt

Apple Music is coming to ChatGPT, OpenAI announces
“If Spotify’s ChatGPT app is anything to go by, this is great news for Apple Music subscribers. In a Substack post published earlier today, Fidji Simo, OpenAI’s CEO of applications, said that Apple Music is among the upcoming partners that will integrate with ChatGPT.”

“Last October, OpenAI introduced apps in ChatGPT, with the first round of partnerships and integrations including Spotify, Booking.com, Canva, Coursera, Figma, Expedia, and Zillow. Back then, OpenAI also released a preview of the Apps SDK, which would soon let developers integrate their own apps into ChatGPT.”

“Soon, according to Simo, ‘even more apps will be available in a new directory, including Adobe, Airtable, Apple Music, Clay, Lovable, OpenTable, Replit, and Salesforce,’ and other developers will be able to submit their apps for review.”
https://9to5mac.com/2025/12/16/apple-music-is-coming-to-chatgpt-openai-announces/

Developers can now submit apps to ChatGPT
“Earlier this year at DevDay, we introduced apps in ChatGPT. Starting today, developers can submit apps for review and publication in ChatGPT by following our app submission guidelines. Apps extend ChatGPT conversations by bringing in new context and letting users take actions like order groceries, turn an outline into a slide deck, or search for an apartment. We’ve published resources to help developers build high-quality apps that users will love—based on what we’ve learned since DevDay—like best practices on what makes a great ChatGPT app, open-source example apps, an open-sourced UI library for chat-native interfaces, and a step-by-step quickstart guide.

We’re also introducing an app directory right inside ChatGPT, where users can browse featured apps or search for any published app. The app directory is discoverable from the tools menu or directly from chatgpt.com/apps. Developers can also use deep links on other platforms to send users right to their app page in the directory.”
https://openai.com/index/developers-can-now-submit-apps-to-chatgpt/

Interview with OpenAI’s Atlas Lead, Ben Goodger
“If AGI is going to take meaningful action in the world, it needs a body – and Ben Goodger, head of engineering for ChatGPT Atlas at OpenAI, argues that the browser is the first real one.” https://www.youtube.com/watch?v=Oko9NFHGw3k

Introducing GPT-5.2-Codex
“Today we’re releasing GPT‑5.2-Codex, the most advanced agentic coding model yet for complex, real-world software engineering. GPT‑5.2-Codex is a version of GPT‑5.2⁠ further optimized for agentic coding in Codex, including improvements on long-horizon work through context compaction, stronger performance on large code changes like refactors and migrations, improved performance in Windows environments, and significantly stronger cybersecurity capabilities.” https://openai.com/index/introducing-gpt-5-2-codex/

OpenAI: The shift from text to more dynamic AI experiences
“Over the past few months, I’ve talked about how ChatGPT is evolving from a reactive, text-based product into something more intuitive and connected to any of the tasks you want to accomplish. The shift from text to multimedia and dynamic UI is an important part of that transformation, and I’m excited about the progress we’re making.”

“There are many other use cases that can benefit from interfaces that go beyond text. For example, when you’re researching products or restaurants, you don’t just want a report describing options; you want to see photos and side-by-side specs that help you decide. When you’re learning about new topics, you want to be able to go deeper without losing your place in a thread. We’re improving answers to bring in more visuals with clear sources and adding new ways to get additional context. Coming soon, answers will start to highlight important people, places and products, which you can tap to instantly pull up more information without asking a follow up question. You’ll be able to highlight any word or phrase in an answer, and ChatGPT will tell you more about it.”

“The same idea applies to other everyday tasks. For things like converting measurements or getting sports scores, you want a fast visual answer that you can absorb at a glance. (This will be great for my husband, who is often doing both in the kitchen.) We’re rolling out a number of these types of utilities in ChatGPT and will continue to add more over time.”

For the past few weeks, I’ve been talking incessantly about how Google is starting to blur the lines between predetermined user interfaces and dynamically generated user experiences. Last week, I talked a lot about how the internet introduced dynamic content and then shifted to personalization, and now we’re shifting to personalized interfaces.

Right on cue, Fidji Simo from OpenAI posted a Substack article where she talks directly about how ChatGPT is going to evolve from a reactive, text-based product into a multimedia, dynamic user interface—and that this will be part of the transformation.
https://fidjisimo.substack.com/p/more-dynamic-ai-experiences

Enterprise Will Be a Top OpenAI Priority In 2026, Sam Altman Tells Editors at NYC Lunch
“Altman’s plans for OpenAI’s enterprise push was the biggest revelation from the lunch, details of which multiple people with knowledge of the discussion relayed to me. Under Altman, OpenAI has excelled at building consumer products, with ChatGPT approaching 900 million weekly users. But the company has faced fierce competition when selling its AI models to businesses, primarily from Anthropic, which is leading the enterprise AI market. At the lunch, Altman made clear that selling to enterprises was a massive OpenAI priority, and mentioned that it was an application problem, not a training problem, that the company needed to solve.” https://www.bigtechnology.com/p/enterprise-will-be-a-top-openai-priority

GPT Image 1.5: The new ChatGPT Images is here
I feel bad because I’m about to blow off a pretty big product announcement. OpenAI released their latest image generation tool, GPT Image 1.5.

The only reason I’m going to blow it off is that it feels like a second-place version of Gemini NanoBanana. It does a lot of the same things NanoBanana does. That’s fantastic. But it’s not pioneering, and it’s not as good (sorry).

GPT Image 1.5 allows for precision editing: you can use plain language to add, subtract, combine, blend, or transpose things very specifically in an image while keeping the rest of the image consistent. Back in the day, that was considered inpainting, but now it’s officially moved on to “editing” (likely powered in part by my favorite topic, segmentation).
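To make that concrete, here is a minimal sketch of mask-based editing, assuming a segmentation mask is already available. This is my own illustration of the general technique, not OpenAI's implementation: the edit is applied only inside the mask, and every pixel outside it stays byte-for-byte identical.

```python
import numpy as np

def masked_edit(image: np.ndarray, mask: np.ndarray, edit: np.ndarray) -> np.ndarray:
    """Apply `edit` only where `mask` is True; everything else is untouched.

    image, edit: HxWx3 uint8 arrays; mask: HxW boolean array
    (in practice, the mask would come from a segmentation model).
    """
    out = image.copy()
    out[mask] = edit[mask]
    return out

# Toy example: repaint a 2x2 patch of a 4x4 gray image red.
img = np.full((4, 4, 3), 128, dtype=np.uint8)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
red = np.zeros_like(img)
red[..., 0] = 255

result = masked_edit(img, mask, red)
print(result[1, 1].tolist())  # edited pixel: [255, 0, 0]
print(result[0, 0].tolist())  # untouched pixel: [128, 128, 128]
```

The point of the boolean mask is exactly the "consistency" claim in the announcement: the untouched region is not regenerated at all, just copied through.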

Like NanoBanana, it now supports complex compositions, including ones that require math or many repeated elements, such as 6×6 grids or a specific number of columns. And much like NanoBanana, text accuracy is way higher, so paragraphs’ worth of text can be maintained with no artifacting.

The color palettes are better too, with more realistic colors as opposed to the old sepia or blue tones. But again: while it’s a really great update, to me it just feels like OpenAI finally caught up to NanoBanana.

Via the API, it’s 20% cheaper than the previous version. I’ll share some examples, but I recommend looking at the product release page. https://openai.com/index/new-chatgpt-images-is-here/

GPT Image 1.5 achieves both #1 in Text to Image and Image Editing in the Artificial Analysis Image Arena, surpassing Nano Banana Pro GPT Image 1.5 is OpenAI’s newest flagship image generation model, demonstrating improved image quality and prompt fidelity relative to earlier https://x.com/ArtificialAnlys/status/2001016199094948185

https://twitter.com/grx_xce/status/2000993261914350070/photo/1

This is the biggest jump in Image Arena that we’ve seen since Nano Banana GPT-Image-1.5 has taken #1 on Image Arena with a significant lead Huge congratulations to the team at @OpenAI for this achievement! https://x.com/grx_xce/status/2000993261914350070

Evaluating AI’s ability to perform scientific research tasks | OpenAI
“We introduce FrontierScience, a new benchmark that evaluates AI capabilities for expert-level scientific reasoning across physics, chemistry, and biology.”

“The full FrontierScience evaluation spans over 700 textual questions (with 160 in the gold set) covering subfields across physics, chemistry, and biology. The benchmark is composed of an Olympiad and a Research split. FrontierScience-Olympiad contains 100 questions designed by international olympiad medalists to assess scientific reasoning in a constrained, short answer format. The Olympiad set was designed to contain theoretical questions at least as difficult as problems at international olympiad competitions. FrontierScience-Research consists of 60 original research subtasks designed by PhD scientists (doctoral candidates, professors, or postdoctoral researchers) that are graded using a 10-point rubric. The Research set was created to contain self-contained, multi-step subtasks at the level of difficulty that a PhD scientist might encounter during their research.” https://openai.com/index/frontierscience/

Qwen

Qwen-Image-Layered: Layered Decomposition for Inherent Editability
This is one of the coolest demos I’ve seen in a while. Segmentation on steroids. “Today, we are excited to introduce Qwen-Image-Layered, a model capable of decomposing an image into multiple RGBA layers. This layered representation unlocks inherent editability: each layer can be independently manipulated without affecting other content. Meanwhile, such a layered representation naturally supports high-fidelity elementary operations-such as resizing, reposition, and recoloring. By physically isolating semantic or structural components into distinct layers, our approach enables high-fidelity and consistent editing.” https://qwen.ai/blog?id=qwen-image-layered
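To illustrate why a layered representation makes editing easy, here is a minimal sketch of standard Porter-Duff "over" compositing on RGBA layers. This is my own illustration of the general technique, not Qwen's code, and the layer contents are invented: one layer can be recolored independently and the image re-flattened without touching any other layer.

```python
import numpy as np

def over(top: np.ndarray, bottom: np.ndarray) -> np.ndarray:
    """Porter-Duff 'over': composite one float RGBA layer onto another."""
    a_t = top[..., 3:4]
    a_b = bottom[..., 3:4]
    a_out = a_t + a_b * (1 - a_t)
    rgb = top[..., :3] * a_t + bottom[..., :3] * a_b * (1 - a_t)
    # Un-premultiply, guarding against fully transparent pixels.
    rgb = np.divide(rgb, a_out, out=np.zeros_like(rgb), where=a_out > 0)
    return np.concatenate([rgb, a_out], axis=-1)

# Two 2x2 layers: an opaque blue background, a half-transparent red foreground.
bg = np.zeros((2, 2, 4)); bg[..., 2] = 1.0; bg[..., 3] = 1.0
fg = np.zeros((2, 2, 4)); fg[..., 0] = 1.0; fg[..., 3] = 0.5

flat = over(fg, bg)                  # blended purple-ish result

# Recolor ONLY the foreground layer (red -> green), then re-flatten.
# The background layer is never touched, which is the whole point.
fg[..., 0], fg[..., 1] = 0.0, 1.0
flat_recolored = over(fg, bg)
```

Decomposing a flat image back into such layers is the hard part, and that is what the model does; once the layers exist, resizing, repositioning, and recoloring are just this kind of per-layer arithmetic.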

Terms to Know – 2025 End of Year Edition

Concepts and Methods you HAVE to Know About -> AI 101 Recap
“AI concepts you HAVE to know about at the end of 2025 – Reinforcement learning – RLHF variations: DPO, RRHF, RLAIF – Continual learning – Test-time scaling – Neuro-Symbolic AI – Hardware that powers AI: GPU, CPU, TPU, ASICs, APU, NPUs and others”
https://www.turingpost.com/p/2025-concept-method-recap

Thinking Machines

Tinker: General Availability and Vision Input
Mira Murati is well known as the former Chief Technology Officer at OpenAI. Before that, she was a Senior Product Manager at Tesla. After she left OpenAI, she founded Thinking Machines Lab in February 2025. Thinking Machines stayed under the radar for quite some time. However, before it even had a product, it was valued at $12 billion, thanks to investors from across Silicon Valley.

“Mira Murati (ex-OpenAI CTO) unveiled Thinking Machines Lab The startup focuses on making AI more adaptable, capable, and open (where OpenAI has struggled) They’ve already been hiring researchers from OpenAI, Mistral, DeepMind, and more” https://x.com/adcock_brett/status/1893708446177874325

Back in October 2025, Thinking Machines launched its first product, Tinker, an API for fine-tuning open-source language models. It lets researchers experiment without having to build or host their own server clusters. https://thinkingmachines.ai/tinker/ https://github.com/thinking-machines-lab/tinker-cookbook

Before Tinker was announced to the public, a theorem-proving team at Princeton (the Gödel prover group) used Tinker to train mathematical theorem-provers, and a chemistry reasoning group at Stanford fine-tuned a model to handle chemistry tasks.

The value of Tinker is that it lowers the barrier to advanced AI research: in the past, only well-funded labs or large companies could afford to fine-tune big models. Tinker makes that capability available through an API.

“Today we are announcing four updates to Tinker:

  • No more waitlist
  • New reasoning model: Kimi K2 Thinking
  • New inference interface that is compatible with the OpenAI API
  • Vision input support with Qwen3-VL”

Twitter/X/Grok

Grok Voice Agent API
This was a big week for Grok. They launched their voice agent API, a speech-to-speech model that lets developers build voice agents that can use different tools and search real-time data. It’s very fast, responding in under a second, and it can handle dozens of languages, even switching mid-sentence with accents and tone intact.
https://x.ai/news/grok-voice-agent-api

It’s priced at 5 cents per minute—or $3 per hour—which seems pretty cheap. It’s the top scorer on the benchmark BigBench Audio at 92.3%. It can connect to telecom systems, and it includes some nice pseudocode for scripting emotions—whispering, sighing, laughter—using audio cues in the script.
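As a quick sanity check on that pricing, assuming simple flat per-minute billing (my assumption; xAI may meter differently):

```python
def voice_cost_usd(minutes: float, rate_per_min: float = 0.05) -> float:
    """Estimated cost at the stated flat rate of $0.05 per minute."""
    return minutes * rate_per_min

print(voice_cost_usd(60))      # one hour of audio -> 3.0
print(voice_cost_usd(8 * 60))  # an 8-hour support shift -> 24.0
```

At roughly $24 for a full workday of conversation, the economics of replacing phone-tree customer service start to look plausible.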

It also has domain-specific training that allows the agent to assume personas that match an industry, like customer support, finance, healthcare, or legal. There’s a sandbox you can play in that looks very straightforward.

I have to say, this is the first Grok product that I’m actually excited about—and I may even trust that it would work as promised.

https://x.com/MarioNawfal/status/2001472484869329288 https://x.com/ArtificialAnlys/status/2001388724987527353 https://console.x.ai/team/7675e953-4b20-4d88-bd86-d375a3a6e191

Video

Kling 2.6 Motion Control
“Newly upgraded Motion Control is now live in Kling VIDEO 2.6!
  • Full-Body Motions — Body movements captured in stunning detail
  • Fast & Complex Actions — From martial arts to dances, nothing moves too fast
  • Flawless Hand Moves — Precise gestures, zero blur
  • Expressive Faces — Expressions and lip sync, perfectly preserved and naturally alive
You can even upload 3–30s motion references for uninterrupted sequences, and fine-tune scene details via text prompts.”
https://app.klingai.com/global/quickstart/motion-control-user-guide

Full Executive Summaries with Links, Generated by Claude 4.5 Sonnet

Amazon elevates 27-year veteran to lead unified AI division
CEO Andy Jassy appointed Peter DeSantis to head a new organization combining AI models, custom silicon, and quantum computing, signaling Amazon’s bet on vertical integration to compete with Microsoft and Google. This marks a strategic shift from treating AI as solely an AWS service to an Amazon-wide priority, with DeSantis bringing together chip development, frontier AI research, and long-term quantum bets under unified leadership. The move positions Amazon to optimize across hardware and software as Apple does, while leveraging its massive robotics deployment as a testing ground for embodied AI applications.

Amazon’s big AGI reorg decoded by Corey Quinn • The Register https://www.theregister.com/2025/12/17/jassy_taps_peter_desantis_to_run_agi/

Peter DeSantis to lead Amazon’s AI models, silicon, and quantum computing efforts https://www.aboutamazon.com/news/company-news/andy-jassy-peter-desantis-amazon-leadership-update

Anthropic launches task-focused agent mode with five specialized workflows
The company is testing a new interface that transforms Claude from a chatbot into a structured work assistant, offering dedicated modes for research, analysis, writing, building, and general tasks with granular controls and progress tracking—marking a shift toward AI agents that can manage complete workflows rather than just answer questions.

Anthropic preparing new Agentic Tasks Mode for Claude https://www.testingcatalog.com/anthropic-testing-new-agentic-tasks-mode-for-claude/

Claude’s time horizon planning abilities were underestimated due to task flaws
Anthropic discovered two issues in their time horizon benchmark that artificially lowered Claude’s performance, with one bug specifically affecting Claude models more than others. After fixing these problems, Claude Sonnet 4.5’s planning abilities improved by roughly 20 minutes, suggesting the model’s capacity for long-term reasoning tasks was previously undervalued due to measurement errors rather than actual capability limitations.

“We’re working on updating and improving our time horizon task suite. Recently, we found two issues with our tasks, one of which was differentially lowering the performance of Claude models. We think these also illustrate some interesting model behavior.” https://x.com/METR_Evals/status/2001473506442375645

Sonnet 4.5 was underestimated on METR; its time horizon improves by around 20 minutes https://x.com/scaling01/status/2001476927362605354

Doctors embrace AI with 67% using it daily for better patient care
A new survey reveals physicians are rapidly adopting AI tools, with 84% reporting improved performance and 42% saying AI makes them want to stay in medicine longer. The technology is primarily helping with administrative tasks and research assistance, suggesting AI could address physician burnout while enhancing medical practice quality.

2025 Physicians AI report https://2025-physicians-ai-report.offcall.com/

Surprisingly rapid & high AI adoption by doctors: 67% use it daily, 84% say it makes them better doctors, 42% say it makes them want to stay in medicine more (10% said less). A lot of the use cases appear to be administrative and research assistance. https://x.com/emollick/status/2001061282485547116

AI models now pass all levels of the challenging CFA exam
All major AI systems achieved passing scores on new, paywalled CFA mock exams designed to prevent data contamination, with AI graders evaluating essay portions. This marks a significant milestone in AI’s financial reasoning capabilities, as the CFA is considered one of the most rigorous professional finance certifications. Notably, different prompting techniques made little difference in performance across question types.

All the frontier AIs now pass all levels of the very challenging Chartered Financial Analyst (CFA) exam The paper used paywalled, new mock exams to reduce the risk of leakage but AI grading for the essays. Interestingly, prompting strategy doesn’t matter for most question types https://x.com/emollick/status/2000605774695837711

Reasoning Models Ace the CFA Exams https://arxiv.org/pdf/2512.08270

BBVA partners with OpenAI to deploy AI across global banking operations
Spanish banking giant BBVA announced a strategic partnership with OpenAI to integrate AI tools across its worldwide operations, from customer service to risk management. This marks one of the first comprehensive AI transformations by a major global bank, potentially setting a template for how traditional financial institutions adopt generative AI at scale. The collaboration could accelerate AI adoption across the conservative banking sector, where regulatory concerns have previously slowed implementation.

BBVA and OpenAI collaborate to transform global banking | OpenAI https://openai.com/index/bbva-collaboration-expansion/

Databricks raises $4 billion at $134 billion valuation for AI development
The data analytics company’s valuation jumped 34% in four months as it positions itself as the platform for building AI agents that can perform work tasks. With $4.8 billion in annual revenue and 55% growth, Databricks exemplifies how AI infrastructure companies are commanding massive private market premiums while delaying public offerings. The funding reflects investor appetite for picks-and-shovels AI plays rather than just model developers.

Databricks raises capital at $134 billion valuation https://www.cnbc.com/2025/12/16/databricks-funding-valuation.html

Meta’s AI chief Yann LeCun seeks $3.5 billion startup valuation
The “godfather of deep learning” is raising $586 million for a new company focused on building AI systems that understand the physical world, potentially fueling concerns about an AI investment bubble. LeCun’s departure from Meta and the startup’s pre-launch valuation highlight how top AI talent is commanding unprecedented investor interest, with applications planned for robotics and transportation.

Meta’s Yann LeCun targets $3.5 billion valuation for new AI startup, FT reports https://finance.yahoo.com/news/metas-yann-lecun-targets-3-110641727.html

Fastweb builds AI agent handling 9.5 million telecom customers with 90% accuracy
The Italian telecom provider’s “Super TOBi” system achieves 82% problem resolution rates, demonstrating that AI agents can now handle complex customer service at massive European scale. This marks a significant leap from basic chatbots to AI that actually solves problems rather than just answering questions.

Fastweb + Vodafone (Swisscom Group), one of Europe’s leading telecom providers, is building Super TOBi, which brings agentic customer service to massive scale. Using LangSmith, they are: 🔹Achieving 90% response correctness and 82% resolution rates across ~9.5M customers https://x.com/LangChain/status/2001321491703443877

Black Forest Labs launches FLUX.2 max with web-connected image generation
The new AI model searches the web in real-time to create images with current information, while maintaining consistency across up to 10 reference photos for characters and products. This “grounded generation” capability distinguishes it from typical AI image tools that rely only on training data, enabling marketers to visualize trending products or current events without manually gathering reference materials.

FLUX.2 [max] – Top-Tier Quality Image Generation | Black Forest Labs https://bfl.ai/models/flux-2-max

FLUX.2 [max] is here Our highest quality model to date. * Grounded generation – searches the web for real-time context. * Up to 10 reference images. Products, characters, styles stay consistent. * #2 on @ArtificialAnlys in text-to-image and image editing. https://x.com/bfl_ml/status/2000945755125899427


Google Antigravity https://antigravity.google/

Google upgrades Gemini’s voice AI with real-time translation capabilities
Google enhanced its Gemini 2.5 Flash Native Audio model with sharper function calling (71.5% accuracy on complex tasks) and 90% instruction adherence, up from 84%. The upgrade enables live speech-to-speech translation across 70+ languages while preserving speaker intonation and pitch, now rolling out in Google Translate’s beta. Companies like Shopify and United Wholesale Mortgage report users often forget they’re talking to AI, with UWM generating over 14,000 loans using the technology.

Gemini 2.5 Native Audio upgrade, plus text-to-speech model updates https://blog.google/products-and-platforms/products/gemini/gemini-audio-model-updates/

Listen up 🔊 We’ve made some updates to our Gemini Audio models and capabilities: — Gemini’s live speech-to-speech translation capability is rolling out in a beta experience to the Google Translate app, bringing you real-time audio translation that captures the nuance of human https://x.com/GoogleAI/status/1999560839679082507

Today we are rolling out an updated Gemini Native Audio model, built with 🎙️: – higher precision function calling – better realtime instruction following – smoother and more cohesive conversational abilities Available to developers in the Gemini API right now! https://x.com/OfficialLoganK/status/1999586764382523521

Google launches CC, an AI assistant that reads your email and calendar daily
CC represents Google’s push into personalized AI productivity by automatically scanning Gmail, Calendar, and Drive to deliver morning briefings and respond to user requests via email. This marks a significant step beyond chatbots toward AI agents that proactively manage personal workflows, though it raises privacy questions about Google’s access to sensitive business communications. The tool is part of Google Labs’ broader experimental suite including AI-powered marketing tools and creative applications.

Say hello to CC, a new AI productivity agent that connects your Gmail, Calendar and Drive to deliver a personalized briefing every morning. Need more help? Just email CC https://labs.google/cc

Google launches Gemini 3 Flash with frontier intelligence at breakthrough speed
Google’s new Gemini 3 Flash model delivers Pro-level reasoning at 3x faster speeds and significantly lower costs than previous models, making advanced AI accessible to millions globally. The model outperforms Gemini 2.5 Pro on coding benchmarks while using 30% fewer tokens, and uniquely beats even Gemini 3 Pro on software engineering tasks despite being designed for speed. This represents a major shift in the speed-intelligence tradeoff that has historically limited AI deployment in real-time applications.

“After a day of gemini 3 flash in antigravity, I think I’m convinced. It’s really good to have a lightning fast and smart model for daily work. I’ve been pretty adamant that slower is ok if the model is smarter, but the models have produced just slightly too much cruft and I” https://x.com/andrew_n_carr/status/2001487412749570549

For a fast model, Gemini 3 Flash offers incredible performance, allowing us to provide frontier intelligence to everyone globally. Try the ‘fast’ mode from the model picker in the @GeminiApp – it’s shockingly speedy AND smart. Best pound-for-pound model out there ⚡️⚡️⚡️ https://x.com/demishassabis/status/2001325072343306345

For developers, it combines advanced coding skills with the low latency needed for building interactive apps. On SWE-bench Verified – a benchmark for evaluating coding agents – it outperforms not only the 2.5 series, but also Gemini 3 Pro. Watch 3 Flash give near real-time AI https://x.com/GoogleDeepMind/status/2001321765503377546

Gemini 3 Flash gives you frontier intelligence at a fraction of the cost. ⚡ Here’s how it’s built for speed and scale 🧵 https://x.com/GoogleDeepMind/status/2001321759702663544

“Gemini 3 flash is a bigger deal than Gemini 3 pro. While 2.5 flash is the most used model this year, it struggled with tool calling. But Gemini 3 flash gets it. – tool calling feels natural to the model – it’s faster than turbo models + way smarter too (best for real time” https://x.com/0xdevshah/status/2001330346961604732

Gemini 3 Flash is beating 3 Pro on SWE bench verified Hmm what https://x.com/MS_BASE44/status/2001698991801798927

Gemini 3 Flash is starting to roll out in the @GeminiApp and across Google products. Learn more ↓ https://x.com/Google/status/2001746491275083925

Gemini 3 Flash punches way above its weight class, surpassing 2.5 Pro on many benchmarks, while being much cheaper, faster, and more token efficient. https://x.com/OfficialLoganK/status/2001323840459456715

Gemini 3.0 Flash is an absolutely fantastic release. Consider this: It costs a quarter (1/4) of what Gemini 3.0 Pro costs and achieves similar results to the Pro model in almost all benchmarks, such as HLE and ARC-AGI 2. In other benchmarks, it even outperforms the more https://x.com/kimmonismus/status/2001326181875154983

Google has released Gemini 3 Flash Preview – 2x cheaper than Gemini 3 Pro Preview, with only a 2-point drop in our Intelligence Index, making it the most intelligent model for its price range @GoogleDeepMind gave us pre-release access to Gemini 3 Flash Preview. The model scores https://x.com/ArtificialAnlys/status/2001335953290670301

Introducing Gemini 3 Flash: Benchmarks, global availability https://blog.google/products-and-platforms/products/gemini/gemini-3-flash/

Google integrates Maps data directly into Gemini chat responses
Google’s Gemini AI assistant now displays local business information with photos and ratings inline during conversations, eliminating the need to switch between apps. This marks a shift toward AI assistants becoming comprehensive information hubs rather than simple chatbots, potentially reducing direct visits to specialized services like Maps and review sites.

“Starting today, Gemini can serve up local results in a rich, visual format. See photos, ratings, and real-world info from @GoogleMaps, right where you need them.” https://x.com/GeminiApp/status/1999631529379791121

Google launches Opal mini app builder inside Gemini web interface
Users can now create custom AI-powered mini apps directly within Gemini’s web platform, moving beyond basic chatbot interactions to build reusable tools with visual editing capabilities. This represents Google’s push to transform Gemini from a simple AI assistant into a customizable app development platform, potentially competing with no-code tools by making AI app creation accessible to non-programmers.

Build mini apps with Opal in the Gemini web app https://blog.google/innovation-and-ai/models-and-research/google-labs/mini-apps-opal-gemini-app-experiment/

Gemini 3 creates surprisingly coherent creative content from unusual prompts
Google’s latest model successfully generated a Middle Earth subway map with contextual details like “service suspended – Balrog” at Moria, and attempted complex Venn diagrams, showing improved creative reasoning beyond typical AI text generation. While not perfect, these examples suggest large language models are getting better at synthesizing fictional worlds and visual concepts through text descriptions.

“‘Gemini 3, create a really novel and clever and funny Venn diagram. think hard. do not do research.’ So close to coming together (I am not sure the center works for all three, illustrations are odd), but also better than I expected.” https://x.com/emollick/status/2000805347590856822

“‘Gemini 3, please provide the rail/subway map for Middle Earth in the third age, with accurate stops and taking into account natural barriers, alliances, and so on.’ Not bad. I do like the ‘service suspended – Balrog’ note at Moria.” https://x.com/emollick/status/1999930443001737700

Google’s Gemini now creates visual reports with charts and interactive simulations
Gemini Deep Research, available to Ultra subscribers, automatically generates custom images, charts, and dynamic models alongside text analysis. This moves beyond typical AI text generation to create comprehensive visual reports that users can interact with to test different scenarios and outcomes. The feature transforms complex data analysis into accessible visual formats, marking a shift toward more multimedia AI assistance tools.

Gemini can now illustrate a visual report https://blog.google/products-and-platforms/products/gemini/visual-reports/

Google connects Gemini chatbot directly to NotebookLM research tool
Google is rolling out integration that lets users attach NotebookLM notebooks as live data sources within Gemini conversations, available to both free and paid accounts. This marks a strategic shift toward a unified AI ecosystem where Google’s tools work together seamlessly, positioning Gemini as a central hub for knowledge work. The feature includes citation linking that takes users directly to specific notebook entries, making research and reference workflows more fluid.

Google expands Gemini with NotebookLM integration https://www.testingcatalog.com/google-expands-gemini-with-notebooklm-integration/

Google launches AI agent that books rental cars automatically
Google’s new Gemini Agent can search prices, check your email for details, and complete car rental bookings on your behalf, marking a shift from chatbots that just answer questions to AI systems that take actions in the real world. The service is currently limited to Google’s premium AI Ultra subscribers in the US, suggesting this autonomous task completion represents a new tier of AI capability that companies view as premium features.

“Gemini Agent can help tackle all sorts of tasks. Even renting a car. Tell Gemini Agent your budget and it’ll get to work comparing prices, gathering info from your inbox, and booking the car. Now available for Google AI Ultra users in the US on desktop and mobile.” https://x.com/GeminiApp/status/2000616120106221781

Google DeepMind tests robot safety using AI-generated virtual worlds
The company created a new system that uses its Veo video generator to simulate realistic environments where robots can be safely tested before real-world deployment. This addresses a critical challenge in robotics development: how to evaluate potentially dangerous robot behaviors without risking actual damage to property or people.

Generalist robots need a generalist evaluator. But how do you test safety without breaking things? 💥 🌎 Introducing our new work from @GoogleDeepMind: Evaluating Gemini Robotics Policies in a Veo World Simulator https://x.com/Majumdar_Ani/status/1999525259276423569

Google co-founder Sergey Brin pushes company back into AI race
Brin reportedly used his voting control to accelerate Google’s Gemini development after acknowledging the company lost early AI leadership to OpenAI, despite Google inventing the underlying transformer technology. This represents a rare case of a tech founder leveraging special voting rights to override corporate inertia and force rapid strategic pivots in response to competitive threats.

“Got Sergey-pilled recently. Google invented Transformer, OpenAI ran with it. Sergey owned the mistake, slammed the gas on Gemini, cut through big-corp BS with his super voting power, and forced Google back into startup mode. Founder mode matters.” https://x.com/Yuchenj_UW/status/2000435232089207179

OpenAI CEO admits New York Times didn’t verify his coding claims
Sam Altman acknowledged on social media that the New York Times failed to fact-check his assertion that he can code, highlighting potential gaps in media verification of tech leader credentials. This admission raises questions about journalistic rigor when covering AI executives and their technical backgrounds, particularly as these leaders shape public understanding of artificial intelligence capabilities.

it’s true, i can code nyt didn’t fact check that one 🤷‍♂️ https://x.com/alexandr_wang/status/2001217783497945140

Meta releases SAM Audio to isolate any sound from audio mixtures
Meta’s SAM Audio represents the first unified model that can extract specific sounds from complex audio using text descriptions, visual cues, or time-based prompts. This breakthrough enables applications like removing background noise from recordings or isolating individual instruments from music tracks. The company is open-sourcing the technology along with benchmarks and research papers, potentially accelerating development across audio editing, accessibility tools, and content creation industries.

🔉 Introducing SAM Audio, the first unified model that isolates any sound from complex audio mixtures using text, visual, or span prompts. We’re sharing SAM Audio with the community, along with a perception encoder model, benchmarks and research papers, to empower others to https://x.com/AIatMeta/status/2000980784425931067

SAM Audio https://ai.meta.com/samaudio/

sam-audio – a facebook Collection https://huggingface.co/collections/facebook/sam-audio

Molmo 2 delivers state-of-the-art video understanding with precise pointing and tracking
AI2 released Molmo 2, an open-source model that can analyze videos and pinpoint specific objects, actions, and events with timestamps and coordinates. The 8-billion-parameter model outperforms much larger competitors, including Gemini 3 Pro, on video tracking while training on just one-eighth the video data used by Meta’s comparable system. This makes sophisticated video analysis accessible to researchers and developers who previously needed massive proprietary systems.

Molmo 2 | Complex video question answering – YouTube https://www.youtube.com/watch?v=Ej3Hb3kRiac

Molmo 2 | Counting objects and actions – YouTube https://www.youtube.com/watch?v=fvYfPTTTZ_w

Molmo 2 | Video Tracking – YouTube https://www.youtube.com/watch?v=uot140v_h08

Molmo 2: State-of-the-art video understanding, pointing, and tracking | Ai2 https://allenai.org/blog/molmo2

NVIDIA releases open-source Nemotron 3 language models for developers
NVIDIA launched its Nemotron 3 family of open-source language models, a direct move into territory held by OpenAI and Google. The release signals NVIDIA’s strategy to expand beyond selling hardware into offering complete AI solutions, giving developers free alternatives to proprietary models. It is a notable shift: the company that powers most AI training now also provides the models themselves.

NVIDIA Debuts Nemotron 3 Family of Open Models | NVIDIA Newsroom https://nvidianews.nvidia.com/news/nvidia-debuts-nemotron-3-family-of-open-models

OpenAI’s GPT-5.2 beats human experts on 71% of professional tasks
The new model outperforms humans on common workplace activities like creating spreadsheets and presentations, while becoming 390 times more cost-efficient than previous versions at complex reasoning tasks. This marks a potential inflection point where AI systems can reliably handle the bulk of knowledge work that typically requires 4-8 hours of human effort.

A year ago, we verified a preview of an unreleased version of @OpenAI o3 (High) that scored 88% on ARC-AGI-1 at est. $4.5k/task Today, we’ve verified a new GPT-5.2 Pro (X-High) SOTA score of 90.5% at $11.64/task This represents a ~390X efficiency improvement in one year https://x.com/arcprize/status/1999182732845547795
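The ~390x figure follows directly from the per-task costs quoted in the ARC Prize post; a quick sanity check of the arithmetic (note the comparison is cost only, and also comes alongside a score improvement from 88% to 90.5%):

```python
# Sanity-check the ~390x cost-efficiency claim from the ARC Prize post.
# o3 (High) preview: 88% on ARC-AGI-1 at an estimated $4,500 per task.
# GPT-5.2 Pro (X-High): 90.5% at $11.64 per task.
old_cost_per_task = 4500.00
new_cost_per_task = 11.64

ratio = old_cost_per_task / new_cost_per_task
print(f"Cost ratio: {ratio:.0f}x")  # roughly 387x, i.e. the "~390X" quoted
```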

GPT-5.2 exceeded a trillion tokens in the API on its first day of availability and is growing fast! https://x.com/sama/status/1999624463013544024

GPT-5.2 is here and it’s the best model out there for everyday professional work. On GDPval, the thinking model beats or ties human experts on 70.9% of common professional tasks like spreadsheets, presentations, and document creation. It’s also better at general intelligence, https://x.com/fidjissimo/status/1999183159356006450

I have found GPT-5.2 Thinking to be a surprisingly deep second-opinion/fact checker. I gave it a dense paragraph with a few correct claims, a couple errors that required research to find, and some things that needed interpretation It found and gently corrected all the problems https://x.com/emollick/status/2000666007010971787

Whoa. This new GDPval score is a very big deal. Probably the most economically relevant measure of AI ability suggesting that in head-to-head competition with human experts on tasks that require 4-8 hours for a human to do, GPT-5.2 wins 71% of the time as judged by other humans https://x.com/emollick/status/1999189828756263359

OpenAI seeks $10 billion Amazon investment tied to chip usage deal
OpenAI is negotiating a potential $10+ billion investment from Amazon that would include using Amazon’s AI chips, marking a significant diversification from its Microsoft partnership after October restructuring gave it more freedom to raise capital and work with other cloud providers.

OpenAI in Talks to Raise At Least $10 Billion From Amazon and Use Its AI Chips — The Information https://www.theinformation.com/articles/openai-talks-raise-least-10-billion-amazon-use-ai-chips

OpenAI in talks with Amazon about investment could top $10 billion https://www.cnbc.com/2025/12/16/openai-in-talks-with-amazon-about-investment-could-top-10-billion.html

Apple Music integration arrives in ChatGPT for music discovery
OpenAI partnered with Apple to let ChatGPT users search and play Apple Music directly through the chatbot, making Apple Music the first major streaming service integrated with ChatGPT. The deal expands ChatGPT beyond text-based assistance into entertainment services, potentially setting a precedent for how AI assistants will integrate with consumer media platforms.

Apple Music is coming to ChatGPT, OpenAI announces https://x.com/9to5mac/status/2001014465689469051

Apple Music is coming to ChatGPT, OpenAI announces – 9to5Mac https://9to5mac.com/2025/12/16/apple-music-is-coming-to-chatgpt-openai-announces/

OpenAI opens ChatGPT app store to all developers worldwide
OpenAI launched a public app store for ChatGPT, allowing any developer to create and distribute custom AI applications built on the platform. This marks a shift from invitation-only access to an open marketplace model similar to Apple’s App Store, potentially accelerating AI integration across industries. The move positions OpenAI to capture revenue from third-party developers while expanding ChatGPT’s functionality beyond its core chat interface.

Developers can now submit apps to ChatGPT | OpenAI https://openai.com/index/developers-can-now-submit-apps-to-chatgpt/

OpenAI’s engineering head calls web browsers AGI’s first body
Ben Goodger, who helped build Firefox and Chrome, argues that browsers give AI systems their first way to meaningfully interact with the digital world. This represents a shift from viewing AI as purely conversational to seeing it as an agent capable of taking actions across websites and applications, potentially transforming how we think about AI’s practical capabilities.

If AGI is going to take meaningful action in the world, it needs a body – and @bengoodger, head of engineering for ChatGPT Atlas at @OpenAI, argues that the browser is the first real one. In this interview, we discuss: 0:23 – Ben Goodger: From Firefox & Chrome to OpenAI 1:08 – https://x.com/TheTuringPost/status/2000735294211965338

OpenAI releases GPT-5.2-Codex designed specifically for autonomous software development
The new model is trained specifically for autonomous, agentic coding work and can run complex programming tasks for hours without human intervention. Early users report multi-hour software projects completing with full test coverage and no broken code, a significant step beyond coding assistants that require constant human guidance.

GPT-5.2-Codex launches today. It is trained specifically for agentic coding and terminal use, and people at OpenAI have been having great success with it. https://x.com/sama/status/2001724019188408352

Introducing GPT-5.2-Codex | OpenAI https://openai.com/index/introducing-gpt-5-2-codex/

Meet GPT-5.2-Codex, the best agentic coding model yet for complex, real-world software engineering. With native compaction, better long-context understanding, and improved tool-calling, it is a more dependable partner for your hardest tasks. Available in Codex starting today. https://x.com/OpenAIDevs/status/2001723687373017313

Today I ran two complex tasks through Codex with GPT 5.2 Extra High The first ran for 2 hours 30 minutes The second ran for 1 hours 45 minutes Both resulted in: – all acceptance criteria resolved – all test coverage complete – zero broken or non-working code Amazing https://x.com/nummanali/status/2000228337030152347

ChatGPT launches visual creative studio to move beyond text conversations
OpenAI is transforming ChatGPT from a text-based chatbot into a multimedia platform with dedicated image creation tools, interactive apps, and visual interfaces for tasks like research and learning. This shift recognizes that humans think in images, sounds, and patterns—not just words—and aims to match how people naturally process information. The company is adding apps from Adobe, Spotify, and others while developing “generative UI” that automatically surfaces the right visual tools based on user needs.

The shift from text to more dynamic AI experiences https://fidjisimo.substack.com/p/more-dynamic-ai-experiences

OpenAI CEO signals major enterprise push for 2026 amid competition
Sam Altman told media executives that selling AI to businesses will be a top priority next year, acknowledging OpenAI trails competitor Anthropic in the fast-growing enterprise market. The shift could help OpenAI tap into the $37.5 billion enterprise AI market and justify its massive infrastructure investments, while leveraging greater independence from Microsoft gained through recent agreements.

Enterprise Will Be a Top OpenAI Priority In 2026, Sam Altman Tells Editors at NYC Lunch https://www.bigtechnology.com/p/enterprise-will-be-a-top-openai-priority

OpenAI’s GPT-Image-1.5 claims top spot on image generation leaderboards
OpenAI launched GPT-Image-1.5, the model powering the new ChatGPT Images feature, which generates and edits pictures four times faster than before. The model immediately captured first place on the Image Arena benchmark, beating Google’s previous leader with what observers call the biggest jump since Nano Banana. The system offers improved text rendering, better prompt following, and precise editing that preserves details like logos and faces.

BREAKING: OpenAI releases “GPT-Image-1.5” (ChatGPT Images) & It instantly takes the #1 Spot on LMArena, beating Google’s Nano Banana Pro. : r/singularity https://www.reddit.com/r/singularity/comments/1po98xo/breaking_openai_releases_gptimage15_chatgpt/

GPT Image 1.5 achieves both #1 in Text to Image and Image Editing in the Artificial Analysis Image Arena, surpassing Nano Banana Pro GPT Image 1.5 is OpenAI’s newest flagship image generation model, demonstrating improved image quality and prompt fidelity relative to earlier https://x.com/ArtificialAnlys/status/2001016199094948185

GPT Image 1.5 is now available in the API: ✏️ More precise image editing and preservation of logos & faces 🎯 Better instruction following and adherence to prompts 🔤 Improved text rendering, particularly for denser and smaller text Learn more in docs: https://x.com/OpenAIDevs/status/2000992413402456485

Images 1.5 launches today in ChatGPT and the API! Much better images in tons of ways, faster, and new editing capability. https://x.com/sama/status/2000997906078388332

Introducing ChatGPT Images, powered by our flagship new image generation model. – Stronger instruction following – Precise editing – Detail preservation – 4x faster than before Rolling out today in ChatGPT for all users, and in the API as GPT Image 1.5. https://x.com/OpenAI/status/2000990989629161873

The Image Arena is buzzing 👀 @OpenAI’s GPT-image-1.5 is live and already shaking up the leaderboard. Watch it in action below, then try your own prompt and share what you create 👇🎨 https://x.com/arena/status/2001014708254773549

The new ChatGPT Images is here | OpenAI https://openai.com/index/new-chatgpt-images-is-here/

This is the biggest jump in Image Arena that we’ve seen since Nano Banana GPT-Image-1.5 has taken #1 on Image Arena with a significant lead Huge congratulations to the team at @OpenAI for this achievement! https://x.com/grx_xce/status/2000993261914350070

Adobe integrates Photoshop, Acrobat and Express directly into ChatGPT for free
Adobe made its flagship creative tools accessible to ChatGPT’s 800 million weekly users through conversational prompts, eliminating the traditional learning curve for complex software. Users can now edit photos, create designs, and modify PDFs by simply describing what they want in plain English, with full slider-level control available within the chat interface. This represents a significant shift from Adobe’s traditional desktop-first approach to making professional creative tools broadly accessible through AI conversation.

Adobe launches free ChatGPT-integrated apps for Photoshop, Acrobat, and Express on desktop, the web, and iOS, after OpenAI added app integrations in October (@zombie_wretch / The Verge) https://x.com/Techmeme/status/1998741032091996348

Edit with Photoshop in ChatGPT | Adobe Blog https://blog.adobe.com/en/publish/2025/12/10/edit-photoshop-chatgpt

Photoshop is now inside ChatGPT. Just prompt what you want and get slider-level control to dial in the perfect look. Intelligently select content and apply effects — without opening Photoshop. You’re the conductor. Photoshop is the orchestra. For me, this one’s personal — I’ve https://x.com/bilawalsidhu/status/1999594990868267227

OpenAI quietly removes automatic model routing for most ChatGPT users
OpenAI rolled back its model router system that automatically directed user questions to advanced reasoning models, after finding it hurt user engagement despite improving answer quality. The company discovered that free users preferred faster responses over waiting 20+ seconds for better answers, as daily active users declined when the router sent 7% of queries to slower, more expensive reasoning models.

OpenAI Rolls Back ChatGPT’s Model Router System for Most Users | WIRED https://www.wired.com/story/openai-router-relaunch-gpt-5-sam-altman/

OpenAI launches sophisticated benchmark to test AI’s scientific research abilities
OpenAI released FrontierScience, a 160-question benchmark designed to evaluate how well AI models can handle complex scientific research tasks across multiple domains. This represents the most advanced scientific evaluation tool for AI systems to date, moving beyond basic knowledge tests to assess actual research capabilities that could determine AI’s role in accelerating scientific discovery.

How good is AI for science? Yesterday, OpenAI released a benchmark, FrontierScience, to measure frontier model performance on scientific tasks. This is the most sophisticated benchmark for science I’ve seen. FrontierScience has 160 questions across various subdomains, https://x.com/jungofthewon/status/2001302379527114798

Evaluating AI’s ability to perform scientific research tasks | OpenAI https://openai.com/index/frontierscience/

Alibaba releases open-source AI that decomposes images into editable layers
Qwen-Image-Layered automatically separates photos into 3-10 distinct RGBA layers that can be edited like Photoshop files, marking the first AI tool to create truly native layered image formats rather than just generating flat pictures.

🎨 Qwen-Image-Layered is LIVE — native image decomposition, fully open-sourced! ✨ Why it stands out ✅ Photoshop-grade layering Physically isolated RGBA layers with true native editability ✅ Prompt-controlled structure Explicitly specify 3–10 layers — from coarse layouts to https://x.com/Alibaba_Qwen/status/2002034611229229388

🚨 Qwen Image Layered is live on fal! ✨ Photoshop-grade layering – Native Decomposition 👑 Physically isolated RGBA layers with true native editability 🎨 Explicitly specify layers, from coarse layouts to fine-grained details https://x.com/fal/status/2002055913390195137

AI education platform maps essential concepts defining 2025’s breakthrough year
Turing Post compiled a comprehensive guide covering reinforcement learning’s dominance, test-time scaling methods, and specialized AI hardware beyond GPUs. The recap highlights how 2025 became the “Year of Inference-Time Search,” with models shifting from training-focused to reasoning-focused approaches, while new alignment techniques like DPO and RLAIF emerged as alternatives to standard human feedback methods. This educational synthesis provides the foundational knowledge needed as AI transitions from pattern recognition to genuine reasoning capabilities.

AI concepts you HAVE to know about at the end of 2025 – Reinforcement learning – RLHF variations: DPO, RRHF, RLAIF – Continual learning – Test-time scaling – Neuro-Symbolic AI – Hardware that powers AI: GPU, CPU, TPU, ASICs, APU, NPUs and others – Robotics Find everything from https://x.com/TheTuringPost/status/2001441981780890063

Concepts and Methods you HAVE to Know About -> AI 101 Recap https://www.turingpost.com/p/2025-concept-method-recap

Thinking Machines Lab opens Tinker AI training platform to all users
The company removed its waitlist and added vision capabilities through Qwen3-VL, plus a trillion-parameter reasoning model called Kimi K2 Thinking. This democratizes access to custom AI model training with advanced features like image processing and complex reasoning, previously available only to select users.

Tinker is now generally available. We also added support for advanced vision input models, Kimi K2 Thinking, and a simpler way to sample from models. https://x.com/thinkymachines/status/1999543421631946888

Tinker: General Availability and Vision Input – Thinking Machines Lab https://thinkingmachines.ai/blog/tinker-general-availability/

Today we are releasing Tinker to everyone, and now with vision input! You can now finetune a frontier Qwen3-VL-235B on your own image+text data, bringing your own algorithm (sft, RL, something else?). We’ll take care of the GPU infra. Full update: https://x.com/rown/status/1999544121984245872

xAI launches Grok Voice API with proven Tesla deployment track record
xAI opened its Grok Voice Agent API to developers, leveraging the same voice system already running in millions of Tesla vehicles. The API scored 92.3% on Big Bench Audio benchmarks, outperforming Google’s Gemini 2.5 Flash and OpenAI’s GPT Realtime. Developers demonstrated the technology’s accessibility by porting it to a robot in under an hour, suggesting rapid real-world adoption potential.

GROK JUST TURNED VOICE AI INTO A REAL PRODUCT, FAST, AND EVERYWHERE xAI just opened Grok Voice to developers, and this isn’t some early experiment dressed up as a launch. It’s the same system already running inside millions of Teslas, now exposed through an API that actually https://x.com/MarioNawfal/status/2001472484869329288

Grok Voice Agent API | xAI https://x.ai/news/grok-voice-agent-api

Today, we’re excited to launch the Grok Voice Agent API, empowering developers to build voice agents that speak dozens of languages, call tools, and search realtime data. https://x.com/xai/status/2001385958147752255

Took less than an hour for Grok Voice Agent by @xai to be ported to Reachy Mini thanks to @atariorbit! https://x.com/ClementDelangue/status/2001410494528213481

xAI’s new Grok Voice Agent is the new leading Speech to Speech model, surpassing Gemini 2.5 Flash Native Audio and GPT Realtime in our Big Bench Audio benchmark The new model achieves a score of 92.3% on Big Bench Audio, just ahead of the previous leader, Google’s Gemini 2.5 https://x.com/ArtificialAnlys/status/2001388724987527353

Kling 2.6 launches advanced motion and voice control for AI video generation
Chinese AI video company Kling released version 2.6 with precise motion capture capabilities that let users control character movements, facial expressions, and lip-sync in generated videos. The upgrade represents a significant leap in user control over AI video creation, moving beyond text prompts to direct motion input. Early tests show the system can accurately replicate dance moves, martial arts, and complex actions while maintaining natural hair physics and body dynamics.

🎥 Kling 2.6 Motion Control Feature Is Now Live! To celebrate the launch of Kling 2.6 Motion Control Feature, we’re kicking off a new contest – and the prizes are one post away from you! 🔥 Show us your creative power with Kling 2.6 Motion Control Feature – The Kling 2.6 Motion https://x.com/Kling_ai/status/2001891240359632965

🎥 Kling 2.6 Voice Control Feature Is Now Live! To celebrate the launch of Kling 2.6 Voice Control Feature, we’re kicking off a new contest – and the prizes are one post away from you! 🔥 Show us your creative power with Kling 2.6 Voice Control Feature – Use your signature voices https://x.com/Kling_ai/status/2001198609115628029

🚀 Motion Control, Leveled Up Newly upgraded Motion Control is now live in Kling VIDEO 2.6! Experience precise, full control over every action & expression ✅ Full-Body Motions — Body movements captured in stunning detail ✅ Fast & Complex Actions — From martial arts to https://x.com/Kling_ai/status/2001306445262823431

🚨 Kling O1 Video Standard is here on fal! 🎬 Same powerful editing model, 720P mode ✨ Start & end frame control for precision 🎯 3-10 second range for flexible videos 💰 Faster generation, lower cost https://x.com/fal/status/2000590369545744599

🚨Video Leaderboard Updates Kling 2.6 Pro by @kling_AI and the new Kandinsky 5.0 open models by @kandinskylab have now landed on the Video Arena leaderboard. Kling 2.6 Pro delivers a major 16-point jump over Kling-2.5-turbo-1080p. While Kandinsky 5.0 enters strong, taking the https://x.com/arena/status/1999530939886768205

A new prompt unlock? Multiple gliding rack focus through a cyberpunk nightclub, yes the characters in close up are prompted, prompt share in later post. Not keyframes. Created in @Kling_ai 2.6 Image to video. 🔊🔊🎧 https://x.com/StevieMac03/status/2002001196383391813

Do you want to create ultra-dynamic action animations with @Kling_ai 2.6? 🎬⚡️ After testing many prompts, I’ve noticed what works best. And here’s the key. 👉 What usually gives the best results is starting the prompt with “High-speed anime battle.” Other combinations that https://x.com/Artedeingenio/status/2001960379610767835

Kling 2.6 “Motion Control” tested with dance videos: full-body steps and weight shifts look natural, and the hair tracking is excellent. Dance and action footage like this seems to be where the feature is strongest and shows off its advantages ✨ https://x.com/genel_ai/status/2001532885673873677

Kling cooked so hard with this new Motion Control. Don’t like the way your character moves? Take control of it yourself. With this, motion capture from home is activated. It’s fast, it’s cheap, it’s easy, it works! @Kling_ai https://x.com/WuxiaRocks/status/2001517467852771467

Oh my… Kling just dropped the next era of motion control. Kling VIDEO 2.6 can copy any action with perfect lip-sync, lifelike motion and expressive gesture. It outperforms Wan 2.2-Animate, Act-Two and DreamActor 1.5 across all metrics. More examples below. https://x.com/AngryTomtweets/status/2001569619375698199

Quick test of Kling 2.6 Motion Control Shall I keep going? 😭 https://x.com/blizaine/status/2001849003819098168

Your frames. Your timing. Kling VIDEO O1 now supports Start & End Frames generation with freely selectable durations from 3- 10s, giving you smoother transitions and more control over pacing. From fast, high-impact moments to fully immersive cinematic shots—your story moves the https://x.com/Kling_ai/status/2000581619556421673

AI Visuals and Charts: Week Ending December 19, 2025

No entries found.

Top 5 Links of The Week – Organized by Category

Agents & Copilots

⚖️ Pairwise Annotations: Scores are hard, preferences are easy. Agents handle tasks that are tough to score but easy to compare: support responses where tone matters, code refactors where both work but one feels cleaner, product specs where “good” is subjective. In practice, https://x.com/LangChain/status/2001361753851203724

When Machines Pay Machines: The Economics of Agentic AI | Matt Suiche https://www.msuiche.com/posts/when-machines-pay-machines-the-economics-of-agentic-ai/

Google

Sergey Brin on the genius of Jeff Dean. He credits Jeff’s early obsession with neural networks, back when they were “telling cats from dogs”, as the spark for everything. TPU was Jeff’s idea. He calculated that if users spoke to Google for just three minutes a day, Google would https://x.com/Yuchenj_UW/status/2000627610561458682

Robotics

AI concepts you HAVE to know about at the end of 2025 – Reinforcement learning – RLHF variations: DPO, RRHF, RLAIF – Continual learning – Test-time scaling – Neuro-Symbolic AI – Hardware that powers AI: GPU, CPU, TPU, ASICs, APU, NPUs and others – Robotics Find everything from https://x.com/TheTuringPost/status/2001441981780890063

Concepts and Methods you HAVE to Know About -> AI 101 Recap https://www.turingpost.com/p/2025-concept-method-recap

Discover more from Ethan B. Holland
