OpenAI: AI News Week Ending 11/21/2025

November 21, 2025

Image created with gemini-2.5-flash-image with claude-sonnet-4-5. Image prompt: Cinematic wide shot of the Emerald City throne room from Wicked with towering green pillars and ornate throne, every architectural element outlined with glowing neon green and gold object segmentation boundaries and translucent colored masks floating in mid-air, moody dramatic lighting with rays of light through stained glass, the text OpenAI overlaid as a large bold movie title across the bottom third

Small-but-happy win: If you tell ChatGPT not to use em-dashes in your custom instructions, it finally does what it’s supposed to do! https://x.com/sama/status/1989193813043069219

Group chats in ChatGPT are now rolling out globally. After a successful pilot with early testers, group chats will now be available to all logged-in users on ChatGPT Free, Go, Plus and Pro plans. https://x.com/OpenAI/status/1991556363420594270

A new way to collaborate in ChatGPT – Fidji Simo https://fidjisimo.substack.com/p/a-new-way-to-collaborate-in-chatgpt

Oracle has lost $315 billion in market value since announcing its $300 billion deal with OpenAI https://www.msn.com/en-us/money/savingandinvesting/oracle-has-lost-315-billion-in-market-value-since-announcing-its-300-billion-deal-with-openai/ar-AA1QH6et?ocid=finance-verthp-feeds

OpenAI can’t beat Google in consumer AI – by John Hwang https://nextword.substack.com/p/openai-cant-beat-google-in-consumer

The most crushing defeat for OpenAI I did not expect Gemini 3 Pro to be SOTA on WeirdML WeirdML has been an OpenAI stronghold for quite some time. https://x.com/scaling01/status/1991154001283358992

Gemini 3 Pro takes the crown on LisanBench – it scores 2.2x higher than GPT-5 while using 2.4x fewer reasoning tokens – it has the highest score on 23 out of 50 words – Grok-4 is the only model that can keep up https://x.com/scaling01/status/1990845163652993166

The Artificial Analysis leaderboard shows Gemini 3 at 73%, GPT-5.1 at 70%, and Kimi at 67% – minor differences. On our leaderboard, Gemini is 47%, GPT-5.1 is 38%, and Kimi is 27% – Gemini 3 is substantially more capable on hard benchmarks. https://x.com/hendrycks/status/1991188104804208736
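
To put the two sets of numbers side by side, here is a minimal sketch that computes each model's gap to the leader in points and as a share of the leader's score. The scores are the ones quoted in the post above; the dictionary names and the `gaps_to_leader` helper are made up for illustration and do not come from either leaderboard.

```python
# Scores (percent) as quoted in the post above.
artificial_analysis = {"Gemini 3": 73, "GPT-5.1": 70, "Kimi": 67}
harder_leaderboard = {"Gemini 3": 47, "GPT-5.1": 38, "Kimi": 27}


def gaps_to_leader(scores):
    """Return each model's gap to the top score as (points, relative %)."""
    top = max(scores.values())
    return {
        name: (top - score, round(100 * (top - score) / top, 1))
        for name, score in scores.items()
    }


aa_gaps = gaps_to_leader(artificial_analysis)
hard_gaps = gaps_to_leader(harder_leaderboard)

for label, gaps in (("Artificial Analysis", aa_gaps), ("Harder leaderboard", hard_gaps)):
    for name, (points, relative) in gaps.items():
        print(f"{label}: {name} trails by {points} pts ({relative}% of leader)")
```

On these figures GPT-5.1 trails by 3 points on the first leaderboard and 9 on the second, and Kimi by 6 and 20, which is the spread the post is pointing at.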

Build a coding agent with GPT 5.1 https://cookbook.openai.com/examples/build_a_coding_agent_with_gpt-5.1

gpt-5.1-codex is genuinely cracked – the strongest agentic coding model available right now. what’s becoming clear is the increasing importance of the model + the harness + the tools. https://x.com/shyamalanadkat/status/1989184364727632348

OpenAI just published an ace 28-page guide on context engineering for AI agents. Instead of throwing more memory at LLMs, it shows how to engineer context: when to trim, summarize, prevent drift, and defend against context poisoning. 100% free. Link to the guide in 🧵↓ https://x.com/DataChaz/status/1988581390452249022

GPT-5.1 Pro is rolling out today to all Pro users. It delivers clearer, more capable answers for complex work, with strong gains in writing help, data science, and business tasks. https://x.com/OpenAI/status/1991266192905179613?s=20

Today we at @OpenAI are releasing GPT-5.1-Codex-Max, which can work autonomously for more than a day over millions of tokens. Pretraining hasn’t hit a wall, and neither has test-time compute. Congrats to my teammates @kevinleestone & @mikegmalek for helping to make it possible! https://x.com/polynoamial/status/1991212955250327768

Building more with GPT-5.1-Codex-Max | OpenAI https://openai.com/index/gpt-5-1-codex-max/

New Codex model is a significant improvement! https://x.com/sama/status/1991258606168338444

GPT-5.1 (High) coming in on par with GPT-5 Pro on ARC-AGI but nearly an OOM cheaper https://x.com/GregKamradt/status/1990501297095909486

GPT-5.1-Thinking-high finally beats Grok-4 on ARC-AGI-2 https://x.com/scaling01/status/1990506507125895444

GPT-5.1-Codex-Max beats GPT-5.1 by 8% on OpenAI internal Pull-Requests https://x.com/scaling01/status/1991219951932489738

My GPT-5.1 Pro Review — matt shumer https://shumer.dev/gpt51proreview

GPT-5.1-Codex-Max is out (API coming soon)! • Outperforms GPT-5.1-Codex and more efficient • Natively trained with compaction to handle long-running tasks • New “Extra High” reasoning effort for your hardest problems $ npm install -g @openai/codex@latest https://x.com/dkundel/status/1991224903031210453

GPT-5.1-Codex-Max is new SOTA on METR https://x.com/scaling01/status/1991220418535936302

GPT-5.1-Codex was released six days ago, now we have GPT-5.1-Codex-Max. (The use of every naming scheme piled on top of each other, from version numbers to qualifiers like Max, makes it hard to see how big a deal each release is, but this looks like a big jump in ability) https://x.com/emollick/status/1991220527550157282

GPT-5.1-Codex-Max shows big improvements in CTF https://x.com/scaling01/status/1991218908833939818

New model is out in Codex. Gets to same quality of solution faster and raises the ceiling for how complex of a task is achievable. $ codex -m gpt-5.1-codex-max Best experienced in the latest CLI version 0.59, which also packs a lot of other fixes and improvements. https://x.com/thsottiaux/status/1991210545253609875

Introducing group chats in ChatGPT | OpenAI https://openai.com/index/group-chats-in-chatgpt/

GPT-5.1-Codex-Max improves over GPT-5.1's PaperBench score (replicate state-of-the-art AI research) https://x.com/scaling01/status/1991219458426433729

GPT-5.1-Codex-Max shows progress on MLE-bench https://x.com/scaling01/status/1991219683450843145

GPT-5.1 is now available in the API. Pricing is the same as GPT-5. We are also releasing gpt-5.1-codex and gpt-5.1-codex-mini in the API, specialized for long-running coding tasks. Prompt caching now lasts up to 24 hours! Updated evals in our blog post. https://x.com/sama/status/1989048466967032153

These examples of different personalities from ChatGPT 5.1 seem to give fundamentally different types of advice, including, weirdly, completely different breathing patterns and roles for the presenter. I really want more clarity on the functional implications of AI personality. https://x.com/emollick/status/1988829651368575282

💥 Today we say “hello world” from OpenAI for Science. We’re releasing a paper showing 13 examples of GPT-5 accelerating scientific research across math, physics, biology, and materials science. In 4 of these examples, GPT-5 helped find proofs of previously unsolved problems. https://x.com/kevinweil/status/1991567552640872806

GPT-5 Pro is an incredibly useful tool for social science. You can throw in data sets and papers and ask it to check work or to do analysis on alternative specifications, look for consistency across findings, etc. It provides code & statistical results so findings are verifiable https://x.com/emollick/status/1989204496556384627

[2511.16072] Early science acceleration experiments with GPT-5 https://arxiv.org/abs/2511.16072

We’re also releasing new research on how GPT-5 is accelerating scientific discovery. Our new paper, Early science acceleration experiments with GPT-5, presents case studies where GPT-5 accelerated key steps in real research workflows and, in a few cases, contributed novel https://x.com/OpenAI/status/1991570422148788612

Early experiments in accelerating science with GPT-5 | OpenAI https://openai.com/index/accelerating-science-gpt-5/

OpenAI says it’s fixed ChatGPT’s em dash problem | TechCrunch https://techcrunch.com/2025/11/14/openai-says-its-fixed-chatgpts-em-dash-problem/

OpenAI’s dominance is unlike anything Silicon Valley has ever seen https://www.cnbc.com/2025/10/11/open-ai-silicon-valley-tech-startup.html

OpenAI is finally letting employees donate their equity to charity | The Verge https://www.theverge.com/ai-artificial-intelligence/822496/openai-employee-equity-donation-charity-rounds-share-valuation

Intuit will spend more than $100 million on a multiyear contract with OpenAI to further weave the ChatGPT maker’s artificial intelligence models into financial apps like TurboTax https://x.com/business/status/1990787090024436085

Intuit Inks Deal to Spend Over $100 Million on OpenAI Models – Bloomberg https://www.bloomberg.com/news/articles/2025-11-18/intuit-to-spend-over-100-million-on-openai-models-in-new-deal?taid=691c809375694200019ff88f

Introducing ChatGPT for Teachers–a secure ChatGPT workspace built for educators, with admin controls and compliance support for school and district leaders. Free for verified U.S. K-12 educators through June 2027. https://x.com/OpenAI/status/1991218197530378431

Epstein emails: Larry Summers roles at Harvard, OpenAI affected https://www.cnbc.com/2025/11/19/larry-summers-epstein-openai.html

Crisis Helpline Support in ChatGPT | OpenAI Help Center https://help.openai.com/en/articles/12677603-crisis-helpline-support-in-chatgpt

We’ve expanded access to localized crisis helplines in ChatGPT. When our systems detect potential signs that someone may be experiencing distress, our models now offer an easy way to reach real people directly via @ThroughlineCare. Learn more here: https://x.com/OpenAI/status/1991634046624116784

OpenAI backs startup aiming to block AI-enabled bioweapons | Reuters https://www.reuters.com/technology/openai-backs-startup-aiming-block-ai-enabled-bioweapons-2025-11-13/

Open-source research agents have been lagging behind proprietary systems like OpenAI’s Deep Research. The gap has been frustrating for developers who want powerful, deep research agents without vendor lock-in. I’ve been building my own called Kimi Deep Researcher. Similarly, https://x.com/omarsar0/status/1990794651608219727

Instant Checkout is now rolling out for @Shopify merchants starting with Glossier, SKIMS, and Spanx. Available for Plus, Pro, and Free users in the US. https://x.com/OpenAI/status/1991646997322035520

Don’t underestimate the importance of a good harness that fits the model. In terminal-bench2, GPT-5.1-Codex goes from 16th place (36%) using Terminus 2 to 1st place (57%) using Codex CLI. Gemini 3 Pro enters at #2 with Terminus 2. https://x.com/tristanzajonc/status/1990879703935103256

the future is bright https://x.com/gdb/status/1991003743408583110

From seeing how a lot of people use ChatGPT, 95% of all practical problems folks encounter can be solved by turning on Extended Thinking. https://x.com/emollick/status/1989738355199017249

Our goal with “gpt5.1 thinking” was making thinking models usable as *daily drivers* for productive usecases. That’s why we focused on improving the model’s efficiency with its thinking (~60% less thinking on easy prod queries) while retaining accuracy! If you use ChatGPT for https://x.com/yanndubs/status/1990470488573858144

OpenAI is shifting the center of gravity from “prompt your agent better” to “train your agent inside your world.” The vibe was “okay, we know prompting isn’t enough anymore, here are the knobs you actually need.” Their tech staff – Will and Cathy- were talking about Agent RFT https://x.com/TheTuringPost/status/1991920970555162956

Science underpins medicine, energy, and national security, yet progress remains slow. Early experiments with university and national-lab partners show GPT-5 helping researchers explore ideas and reach insight faster. Hear directly from OpenAI researchers behind the work: https://x.com/OpenAI/status/1991569987933458814

early-science-acceleration-experiments-with-gpt-5.pdf https://cdn.openai.com/pdf/4a25f921-e4e0-479a-9b38-5367b47e8fd0/early-science-acceleration-experiments-with-gpt-5.pdf

Imagine how unhobbled Codex will be once we rollout a proper fix to this. In the meantime, believe me, there are … technical reasons for why this is. Not that I like them. https://x.com/thsottiaux/status/1989940347494084683

inference is perhaps the most valuable emerging software category. as models get smarter and more economically valuable, compute will increasingly be spent drawing samples from the models. if you’d like to work on inference at openai, reach out — gdb@openai.com. include a https://x.com/gdb/status/1990507010769760394

Matches the vibes pretty closely – 5.1 is incrementally but meaningfully better (and I believe it means that OpenAI retakes the lead from Grok in GPQA Diamond) https://x.com/emollick/status/1989056567690326258

OpenAI’s Fidji Simo Plans to Make ChatGPT Way More Useful–and Have You Pay For It | WIRED https://www.wired.com/story/fidji-simo-is-openais-other-ceo-and-she-swears-shell-make-chatgpt-profitable/

Sam Altman suggested that OpenAI could reach $100B in revenue by 2027. Anthropic reportedly forecasted $70 billion in revenue by 2028. Satya reacts to these projections. https://x.com/dwarkesh_sp/status/1989111299637440720

ChatGPT 5.1 is a MAJOR improvement, at least from a writing/tone point of view. Check out these rap lyrics about quantum mechanics it spit out. (and enjoy the @suno song I made with them). Equally impressively, look at the linguistic and physics analysis of the lyrics that https://x.com/kyleshannon/status/1989052882495422616

Today, we are officially open-sourcing a set of high-quality speculator models on the @huggingface Hub. Our first release includes Llamas, Qwens, and gpt-oss. In practice, you can expect 1.5-2.5× speedups on average, with some workloads seeing more than 4× improvements! https://x.com/_EldarKurtic/status/1991160711838359895