Image created with Flux Pro v1.1 Ultra. Image prompt: Giant “100” as pure white negative‑space cutout dominating the frame; minimalist poster style; spiral tessellation and neat chat window elements arranged inside the zeros; black‑soft teal backdrop; high contrast, crisp edges, soft studio light, no other text, no logos
Transforming human knowledge, sensors, and actuators from human-first and human-legible to LLM-first and LLM-legible is a beautiful space with so much potential, and so much can be done…
One example I’m obsessed with recently – for every textbook pdf/epub, there is a perfect “LLMification” of it, intended not for humans but for an LLM (though it is a non-trivial transformation that would need human-in-the-loop involvement).
– All of the exposition is extracted into a markdown document, including all latex, styling (bold/italic), tables, lists, etc. All of the figures are extracted as images.
– All worked problems get extracted into SFT examples. Any references made to previous figures/tables/etc. are parsed and included.
– All practice problems are extracted into environment examples for RL. The correct answers are located in the answer key and attached. Any additional information is added as “answer key” for a potential LLM judge.
– Synthetic data expansion. For every specific problem, you can create an infinite problem generator, which emits problems of that type. For example, if a problem is “What is the angle between the hour and minute hands at 9am?”, you can imagine generalizing that to any arbitrary time and calculating answers using Python code, and possibly generating synthetic variations of the prompt text.
– All of the data above could be nicely indexed and embedded into a RAG database for later reference, or maybe MCP servers that make it available.
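The clock-angle generator described above can be sketched in a few lines; all function names here are illustrative, not part of any existing pipeline:

```python
import random

def clock_angle(hour: int, minute: int) -> float:
    """Smaller angle in degrees between the hour and minute hands."""
    hour_angle = (hour % 12) * 30 + minute * 0.5  # hour hand moves 0.5 deg/min
    minute_angle = minute * 6                     # minute hand moves 6 deg/min
    diff = abs(hour_angle - minute_angle) % 360
    return min(diff, 360 - diff)

def generate_problem(rng: random.Random) -> tuple[str, float]:
    """Emit one (prompt, answer) pair of the clock-angle problem type."""
    h, m = rng.randrange(1, 13), rng.randrange(0, 60)
    prompt = f"What is the angle between the hour and minute hands at {h}:{m:02d}?"
    return prompt, clock_angle(h, m)

# The original textbook problem becomes one point in an infinite family:
assert clock_angle(9, 0) == 90.0
```

Pairing a generator like this with a programmatic answer-checker is exactly what makes the practice problems usable as RL environments rather than static SFT examples.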
Then just as a (human) student could take a high school physics course, an LLM could take it in the exact same way. This would be a significantly richer source of legible, workable information for an LLM than just something like pdf-to-text (current prevailing practice), which simply asks the LLM to predict the textbook content top to bottom token by token (umm – lame). https://x.com/karpathy/status/1961128638725923119
Build rich experiences with connectors to: – Read emails – Fetch calendar events – Search files and chats. Gmail, Google Calendar, Drive, Dropbox, Teams, Outlook Calendar + Email, and SharePoint are available now. They work with deep research, too! https://x.com/OpenAIDevs/status/1958660214057791853
💥 We launched a host of great features in Codex today: * A new extension for Cursor, VSCode, Windsurf, and the like * A much improved Codex CLI running in your local environment * Ability to manage both local and cloud Codex tasks seamlessly, including… * …Codex-driven… https://x.com/kevinweil/status/1960854500278985189
📣 We shipped major improvements to the Codex CLI today: GPT-5, with usage included in your ChatGPT plan (no API key needed); upgraded prompt, harness, approvals & sandboxing logic… you name it. Get the latest: 1. `npm install -g @openai/codex` -> v0.16+ 2. `codex login` https://x.com/embirico/status/1953526045573059056
BTW, I’ve basically stopped using Opus entirely and I now have several Codex tabs with GPT-5-high working on different tasks across the 3 codebases (HVM, Bend, Kolmo). Progress has never been so intense. My job now is basically passing well-specified tasks to Codex, and reviewing… https://x.com/VictorTaelin/status/1958543021324029980
codex cli with gpt-5 is getting pretty good https://x.com/gdb/status/1959209931267297586
Codex https://developers.openai.com/codex
Codex is becoming much more integrated into the full stack of development, including code review and integrating between local and remote: https://x.com/gdb/status/1960900413785563593
Finally got around to trying Codex CLI with my OpenAI Plus subscription – and I was not prepared for how good it is!! 🔥🔥 `codex -m gpt-5 -c model_reasoning_effort="high"` Blew away Gemini CLI on the same tasks 💥 Try it – feels way smarter and more capable. https://x.com/TendiesOfWisdom/status/1958938621311955249
I confirm the same feeling on Claude Code; not cancelling yet, but surely downgrading and moving to: `codex -m gpt-5 -c model_reasoning_effort="high"` https://x.com/ivanfioravanti/status/1959277577920536740
I really like MLX, it truly enables quick experimentation. It took just a few minutes to get Codex to work with a local GLM 4.5 Air via the mlx_lm server. It works beautifully. https://x.com/LiMzba/status/1960277996172149103
Image inputs landed in codex cli. Update to 0.24 to try it along with many other improvements. https://x.com/thsottiaux/status/1960579534257820024
Meanwhile, I’ve been having a blast pair-programming with gpt-5 (medium+high) in codex-cli. I can really bounce API-design ideas off it, ask for pros/cons, alternative ideas, and it’s been spot-on. It doesn’t mind pushing back on bad ideas, it makes me aware of pitfalls I’ve… https://x.com/giffmana/status/1959362175648084124
new features in codex cli! try it out: `npm install -g @openai/codex` https://x.com/gdb/status/1960759142089658798
Seems like people are really coming around to codex cli w/ gpt5-high!! What is codex cli still missing? How can we make it even better?? https://x.com/ericmitchellai/status/1959236423124492769
Using Codex with your ChatGPT plan | OpenAI Help Center https://help.openai.com/en/articles/11369540-using-codex-with-your-chatgpt-plan
We’re releasing new Codex features to make it a more effective coding collaborator: – A new IDE extension – Easily move tasks between the cloud and your local environment – Code reviews in GitHub – Revamped Codex CLI. Powered by GPT-5 and available through your ChatGPT plan. https://x.com/OpenAIDevs/status/1960809814596182163
With these updates, Codex works as one agent across your IDE, terminal, cloud, GitHub, and even on your phone — all connected by your ChatGPT account. It’s all included in Plus, Pro, Team, Edu, and Enterprise plans. Check out the new Codex developer hub to get started. https://x.com/OpenAIDevs/status/1960809823387443479
yeah so OpenAI’s Codex CLI slaps. Crank that up to High reasoning on the $20/month plan and let it cook. I needed to mock up some complex interactions in a graph model for an engineer. I fed it a list of specs; 15m later, hit 90% coverage. Claude never got past 10% on Opus. https://x.com/frantzfries/status/1959700004781847017
My wife Anna and I are supporting @LeadingFutureAI because we believe that AI can massively improve quality of life for every person (and every animal!). We believe the goal of AI policy should be to unlock this outcome. That means taking a balanced view, which we think of as… https://x.com/gdb/status/1960022650228793440
Accelerating life sciences research | OpenAI https://openai.com/index/accelerating-life-sciences-research-with-retro-biosciences/
At @OpenAI, we believe that AI can accelerate science and drug discovery. An exciting example is our work with @RetroBiosciences, where a custom model designed improved variants of the Nobel-prize winning Yamanaka proteins. Today we published a closer look at the breakthrough. ⬇️ https://x.com/BorisMPower/status/1958915868693602475
Early this summer, OpenAI and Anthropic agreed to try some of our best existing tests for misalignment on each other’s models. After discussing our results privately, we’re now sharing them with the world. 🧵 https://x.com/sleepinyourhat/status/1960749648110395467
Findings from a Pilot Anthropic – OpenAI Alignment Evaluation Exercise https://alignment.anthropic.com/2025/openai-findings/
We recently had OpenAI and Anthropic each evaluate the other’s models for safety issues. Excited for us to find more ways to help support safety practices across the whole field! https://x.com/EthanJPerez/status/1960808655642882228
My favorite demo of the new gpt-realtime model from @matthieulc — Shoggoth Mini using Realtime API with image input https://x.com/pbbakkum/status/1961120041799487654
Continuing the journey of optimal LLM-assisted coding experience. In particular, I find that instead of narrowing in on a perfect one thing my usage is increasingly diversifying across a few workflows that I “stitch up” the pros/cons of: Personally the bread & butter (~75%?) of… https://x.com/karpathy/status/1959703967694545296
Claim: gpt-5-pro can prove new interesting mathematics. Proof: I took a convex optimization paper with a clean open problem in it and asked gpt-5-pro to work on it. It proved a better bound than what is in the paper, and I checked the proof it’s correct. Details below. https://x.com/SebastienBubeck/status/1958198661139009862
A Teen Was Suicidal. ChatGPT Was the Friend He Confided In. – The New York Times https://www.nytimes.com/2025/08/26/technology/chatgpt-openai-suicide.html
Our custom LLM, gpt-4b micro, has helped achieve an advance in biology. It designed novel variants of the Nobel-winning Yamanaka factors that achieve a 50x increase in reprogramming efficiency in vitro compared to standard OSKM proteins.”” / X https://x.com/gdb/status/1958928877415510134
OpenAI just released HealthBench on Hugging Face. This new dataset is designed for rigorously evaluating large language models’ capabilities in improving human health. A vital step for AI in medicine! https://x.com/HuggingPapers/status/1960749923218895332
“Hey AI, give me a clever, moving one-paragraph story about a paradox, in any genre you desire. Make it good.” These are the first attempts. A bit of the obvious time-travel tales from Gemini and Grok. Claude loves to pull on your emotions. GPT-5 Pro goes in a stranger direction. https://x.com/emollick/status/1959817825729781837
Introducing gpt-realtime and Realtime API updates for production voice agents | OpenAI https://openai.com/index/introducing-gpt-realtime/#image-input
🫡 Assistants We’re winding down the Assistants API beta. It will sunset one year from now, August 26, 2026. We’ve put together a guide to help you migrate to the Responses API: https://x.com/OpenAIDevs/status/1960409187122602172
Huge Realtime API release today! Details below, but TLDR: – GA (out of beta) – better instruction following, naturalness, audio – MCP support – new voices – SIP (telephony) support – new WebRTC APIs and video support. Demos: https://x.com/juberti/status/1961116594211364942
Introducing gpt-realtime and Realtime API updates for production voice agents | OpenAI https://openai.com/index/introducing-gpt-realtime/#additional-capabilities
Introducing gpt-realtime and Realtime API updates for production voice agents | OpenAI https://openai.com/index/introducing-gpt-realtime/#remote-mcp-server-support
The Realtime API is officially out of beta and ready for your production voice agents! We’re also introducing gpt-realtime—our most advanced speech-to-speech model yet—plus new voices and API capabilities: 🔌 Remote MCPs 🖼️ Image input 📞 SIP phone calling ♻️ Reusable prompts https://x.com/OpenAIDevs/status/1961124915719053589
Voice is the OG modality. So excited for image inputs, function calling & MCP support in the Realtime API GA! `gpt-realtime` is a lot more natural and expressive, and every time a SOTA voice model is released, you know what I gotta do… Here is the new voice Marin, on… https://x.com/swyx/status/1961124194789499233
DeepSeek-V3.1 Is 2x Cheaper than GPT-5 https://analyticsindiamag.com/ai-news-updates/deepseek-v3-1-is-2x-cheaper-than-gpt-5/
An interesting discussion from a scholar of literature on some of the odd weak points of GPT-5’s figurative writing ability, and what this might tell us about the problems of AI-driven evaluation of AI writing (and other uses of LLMs as a judge) when training models. https://x.com/emollick/status/1960445234875392090
gpt-5 pro utterly crushes this question in a way no non-reasoning LM ever would There are books on this subject and I’m not totally sure they beat 5 pro’s answer in terms of clarity and density of signal https://x.com/deanwball/status/1959643458718589316
gpt-realtime pricing > $32 / 1M audio input tokens ($0.40 for cached input tokens) > $64 / 1M audio output tokens https://x.com/omarsar0/status/1961117107417928047
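A back-of-the-envelope helper for the audio rates quoted above; the helper name and the example token counts are made up for illustration, and the cached rate is assumed to also be per 1M tokens:

```python
# gpt-realtime audio pricing from the post above, in USD per 1M tokens
INPUT_PER_M = 32.00    # fresh audio input tokens
CACHED_PER_M = 0.40    # cached audio input tokens (assumed per-1M rate)
OUTPUT_PER_M = 64.00   # audio output tokens

def session_cost(input_toks: int, output_toks: int, cached_toks: int = 0) -> float:
    """Estimated USD cost of one voice session (illustrative sketch)."""
    fresh = input_toks - cached_toks
    return (fresh * INPUT_PER_M
            + cached_toks * CACHED_PER_M
            + output_toks * OUTPUT_PER_M) / 1_000_000

# e.g. 50k audio tokens in, 20k out, nothing cached:
# 50_000 * 32 / 1e6 + 20_000 * 64 / 1e6 = 1.60 + 1.28 = 2.88 USD
```

Note how heavily the math rewards prompt caching: a fully cached input costs 80x less than a fresh one at these rates.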
if you are a power user, please send us feature requests! (i asked in reply to this message and they were interesting, so would like more) https://x.com/sama/status/1958922435249754382
New WebRTC APIs: https://x.com/juberti/status/1961118374345241016
We’ve made a bunch of improvements to web search in the Responses API: 🌐 Domain filtering to focus on specific sources 📑 Source reporting 💸 Pricing: $10 / 1K calls (down from $25) https://x.com/OpenAIDevs/status/1960425260576334274
Introducing gpt-realtime and Realtime API updates for production voice agents | OpenAI https://openai.com/index/introducing-gpt-realtime/
A useful thing that GPT-5 can do, which wasn’t possible before powerful AI, is to monitor complex topics by asking it to give you scheduled reports. Example: I have a weekly report on “reproducible, benchmarked evidence of autonomous or recursive self‑improvement in AI” https://x.com/emollick/status/1959424313502961824
people seem to really like the new codex features! https://x.com/sama/status/1961096744533647501
1/8 🧵 GPT-5’s storytelling problems reveal a deeper AI safety issue. I’ve been testing its creative writing capabilities, and the results are concerning – not just for literature, but for AI development more broadly. 🚨 https://x.com/ChristophHeilig/status/1960358655745724438
GPT-5 says ‘I don’t know’. Love this, thank you. https://x.com/koltregaskes/status/1957474061153436094
It’s rare for competitors to collaborate. Yet that’s exactly what OpenAI and @AnthropicAI just did — by testing each other’s models with our respective internal safety and alignment evaluations. Today, we’re publishing the results. Frontier AI companies will inevitably compete on… https://x.com/woj_zaremba/status/1960757419245818343
Introducing gpt-realtime — our best speech-to-speech model for developers, and updates to the Realtime API https://x.com/OpenAI/status/1961110295486808394
Some notes on the gpt-realtime release: it replaces chained STT→LLM→TTS with a single speech-in/speech-out model (lower latency, richer nuance) – huge imo 🔥 On benchmarks (vs GPT-4o-realtime): > scores 82.8% vs 65.6% on BigBench (reasoning) > 30.5% vs 20.6% on MultiChallenge https://x.com/reach_vb/status/1961140618295394579
Breaking: GPT-5 ranked 🥇 on Humanity’s Last Exam and 🥈 on MultiChallenge SEAL Leaderboards. https://x.com/scale_AI/status/1953591873031090505
Musk Tried to Enlist Zuckerberg to Help Finance OpenAI Bid https://www.msn.com/en-us/money/companies/musk-tried-to-enlist-zuckerberg-to-help-finance-bid-for-openai/ar-AA1KZ6y5
OpenAI plans a new build with Oracle that would add 4.5 gigawatts of data-center capacity, an outgrowth of their “Stargate” program. The Wall Street Journal reported OpenAI will pay Oracle $30 billion annually. The plan follows a 1.2-gigawatt site in Abilene, Texas. Selection… https://x.com/DeepLearningAI/status/1960900145421177053
Devs, tune in, in Realtime. Livestream at 10am PT 🗣️ https://x.com/OpenAI/status/1961081377174212979
Ezra Klein is impressed with GPT-5, and wrote about his experience in the NYT this morning. https://x.com/AndrewCurran_/status/1959690765933920284
so cool! https://x.com/sama/status/1958920060116078791
Musk v. OpenAI just got messier https://tech.therundown.ai/p/musk-brings-zuck-into-openai-drama
OpenAI lawyers question Meta’s role in Elon Musk’s $97B takeover bid | TechCrunch https://techcrunch.com/2025/08/21/openai-lawyers-question-metas-role-in-elon-musks-97b-takeover-bid/
All the big labs pretrain on the ~same internet, but each has a secret sauce: OpenAI: reddit xAI: twitter Google: youtube Meta: instagram & facebook In the end, models mirror the cultures of the social media they grew up on. https://x.com/Yuchenj_UW/status/1961121746670817404
vLLM + open-webui running gpt-oss-120b on a tinybox green v2 Just `vllm serve openai/gpt-oss-120b –tensor-parallel-size 4 –async-scheduling` and you have a local OpenAI API you can trust. https://x.com/__tinygrad__/status/1959862336501715430
first vision language model built off @OpenAI gpt-oss just dropped! 🔥 InternVL3.5 comes with 32 models 🤯 pre-trained, fine-tuned, aligned in various sizes comes with gpt-oss or Qwen3 for LLM part ⤵️ https://x.com/mervenoyann/status/1960298636610326564
xai-vs-apple-and-openai.pdf https://s3.documentcloud.org/documents/26073662/xai-vs-apple-and-openai.pdf