Image created with OpenAI GPT-Image-1. Image prompt: over-the-top 1990s pro-wrestling promo poster, Tesla-coil stage featuring “Science Savage” wielding bubbling beaker nunchucks; electric arcs, grainy print texture, vivid neon titles

AI as the greatest source of empowerment for all | OpenAI
https://openai.com/index/ai-as-the-greatest-source-of-empowerment-for-all/

I will officially start at OpenAI as CEO of Applications on August 18. I am sharing this essay on why I believe AI can be the greatest source of empowerment for all.
https://x.com/fidjissimo/status/1947341053209501716

If we compared AI capabilities against humans with no access to tools, such as the internet, we would probably find that AI already outperformed humans at many or most cognitive tasks we perform at work. But of course this is not a helpful comparison and doesn’t tell us much https://x.com/random_walker/status/1946180439045018046

Imagine if every pattern shaped by nature – like a protein’s fold or cosmic phenomena – is inherently learnable by AI. @DemisHassabis shares with @lexfridman that if AI can learn these natural patterns, we could open doors to new eras of scientific discovery. Listen now. ↓ https://x.com/GoogleDeepMind/status/1948098855053979930

Thanks @lexfridman for another super fun & wide-ranging conversation. We talked about the future of video games, the nature of reality, advancing science with AI, the path to AGI… and quite a bit more as usual! Always a blast, already looking forward to next time! 😀 https://x.com/demishassabis/status/1948234351205855458

@OriolVinyalsML Impressive result, but let’s be clear, the Gemini model got heavy IMO-specific prep, curated solutions, hints, and strategy guides. That’s not general reasoning. OpenAI’s model hit IMO gold with zero task-specific tuning. One is coached, the other is capable. https://x.com/VraserX/status/1947368827253076001

@pli_cachete For OpenAI at least for this IMO competition: – No tool use, no calculators, internet, formal proof software, algebra packages – same time limits – the same input to the question as for students; no rewriting it to another more suitable format – only one submission https://x.com/BorisMPower/status/1946859525270859955

🤖 From this week’s issue: Gemini with Deep Think officially achieved gold-medal standard at the International Mathematical Olympiad (IMO) by solving five out of the six IMO problems. https://x.com/dl_weekly/status/1948105084480397503

1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO). https://x.com/alexwei_/status/1946477742855532918

10. My career as a mathematician certainly isn’t threatened by AI; in fact, I hope to leverage AI to accelerate my work. However, I’m unsure whether “mathematician” will remain a career path for my son’s generation. (10/10) https://x.com/ErnestRyu/status/1946700798001574202

4. OpenAI surely knew GDM was working on the IMO, so they beat GDM to the punch with their Saturday morning announcement, generating hype. GDM’s slow-science scholarship cost them the PR battle. (4/10) https://x.com/ErnestRyu/status/1946699212307259659

5. In my experience using LLMs for math research, Gemini outperforms ChatGPT. We will see if the next-gen models (which seem to be what OpenAI and GDM are using for IMO) perform at research-level math. (5/10) https://x.com/ErnestRyu/status/1946699302308635130

Advanced version of Gemini Deep Think (announced at #GoogleIO) using parallel inference time computation achieved gold-medal performance at IMO, solving 5/6 problems with rigorous proofs as verified by official IMO judges! Congrats to all involved! https://x.com/koraykv/status/1947335096740049112

Advanced version of Gemini with Deep Think officially achieves gold-medal standard at the International Mathematical Olympiad – Google DeepMind https://deepmind.google/discover/blog/advanced-version-of-gemini-with-deep-think-officially-achieves-gold-medal-standard-at-the-international-mathematical-olympiad/

An advanced version of Gemini with Deep Think has officially achieved gold medal-level performance at the International Mathematical Olympiad. 🥇 It solved 5️⃣ out of 6️⃣ exceptionally difficult problems, involving algebra, combinatorics, geometry and number theory. Here’s how 🧵 https://x.com/GoogleDeepMind/status/1947333836594946337

As confirmed by the new IMO rankings, Grok 4’s eye-popping benchmarks were driving by the following innovations: – train on test – train on test – train on test https://x.com/nsaphra/status/1946804513114882227

DeepMind has the best research on using AI to solve hard Math: AlphaEvolve AlphaProof AlphaGeometry FunSearch AlphaDev AlphaTensor AlphaCode Despite making IMO Silver 28/42 in ’24, OpenAI announced Gold in ’25 35/42 before them Here’s DeepMind’s 10 best research papers on https://x.com/deedydas/status/1946987560875766212

Drastic progress on maths with Gemini 2.5! As a math undergrad, I am impressed 🤯 🥈 -> 🥇 ✅ Formal -> Informal ✅ Specialized model -> General model ✅ Available soon ✅ Huge thanks to IMO and congrats to all participants! Blog: https://x.com/OriolVinyalsML/status/1947341047547199802

Gary Marcus strikes again: “No pure LLM is anywhere near getting a silver medal in a math olympiad” “Pure deep learning had a good run, but it’s time to move on” 😂😂😂 https://x.com/scaling01/status/1946530148813025544

Gemini solved the math problems end-to-end in natural language (English). https://x.com/denny_zhou/status/1947360696590839976

Gold medal-level performance on the 2025 International Math Olympiad from our latest experimental reasoning LLM. Model operated in natural language (i.e. outputs natural language proofs) under the same rules as humans (e.g. 4.5 hours per session, no tools). Amazing milestone! https://x.com/gdb/status/1946479692485431465

Had a super fun time training this model. A big yolo run that resulted in a super strong model. Most important thing is to trust your model and give it morale support. 🦾 Was also a big eye opener to see how prep for IMO is done. Before this I knew absolutely zero about this https://x.com/YiTayML/status/1948464752545726886

hippo at IMO: 0/42 model trained by hippo: 35/42 🥇 😂😂😂 https://x.com/agihippo/status/1947348097144611123

IMO 2025 Solutions https://storage.googleapis.com/deepmind-media/gemini/IMO_2025.pdf

It wasn’t just OpenAI. Google also used a general purpose model to solve the very hard math problems of the International Math Olympiad in plain language. Last year they used specialized tool use Increasing evidence of the ability of LLMs to generalize to novel problem solving https://x.com/emollick/status/1947356382581137867

It’s hard to overstate the significance of this. It may end up looking like a “moon‑landing moment” for AI.
Just to spell it out as clearly as possible: a next-word prediction machine (because that’s really what it is here, no tools no nothing) just produced genuinely creative proofs for hard, novel math problems at a level reached only by an elite handful of pre‑college prodigies. https://x.com/SebastienBubeck/status/1946577650405056722

MathArena – IMO Blogpost https://matharena.ai/imo/

maybe a better headline would be that oai and gdm ranked 27 at the IMO. some talented kids here! https://x.com/damekdavis/status/1947357679040569520

Not Even Bronze: Evaluating LLMs on 2025 International Math Olympiad 🥉 https://x.com/hardmaru/status/1946942279807308210

Officially validated IMO gold medal, purely via search in token space, achieved in 4.5 hrs (unclear at what compute cost). The solutions read nicely as well https://x.com/fchollet/status/1947337944215523567

On IMO P6 (without going into too much detail about our setup), the model “knew” it didn’t have a correct solution. The model knowing when it didn’t know was one of the early signs of life that made us excited about the underlying research direction! https://x.com/alexwei_/status/1947461238512095718

One piece of info that seems important to me in terms of forecasting usefulness of new AI models for mathematics: did the gold-medal-winning models, which did not solve IMO problem 6, submit incorrect answers for it? https://x.com/littmath/status/1947398065209462981

Other AI models seem to have made big leaps in the International Math Olympiad, not just OpenAI. Not all announcements seem to be out yet. https://x.com/emollick/status/1947053944192082170

Our IMO gold model is not just an “experimental reasoning” model. It is way more general purpose than anyone would have expected. This general deep think model is going to be shipped so stay tuned! 🔥 https://x.com/YiTayML/status/1947350087941951596

P6 was definitely the hardest and most interesting problem. Most people can understand it, but very few can solve it. All models scored 0/7. https://x.com/deedydas/status/1946250774960537927

Right before #imo2025, together with colleagues from Mountain View, NYC, Singapore, etc, we all gathered at @GoogleDeepMind headquarter in London for our final push for IMO. I believe that week was when all magic happened! We put all individual recipes (that we figured out https://x.com/lmthang/status/1948458590492393834

RT @demishassabis: Btw as an aside, we didn’t announce on Friday because we respected the IMO Board’s original request that all AI labs sha… https://x.com/TheZachMueller/status/1947419062423982583

RT @demishassabis: Official results are in – Gemini achieved gold-medal level in the International Mathematical Olympiad! 🏆 An advanced ver… https://x.com/AndrewLampinen/status/1947370582393425931

RT @Mihonarium: 🚨 According to a friend, the IMO asked AI companies not to steal the spotlight from kids and to wait a week after the closi… https://x.com/AndrewLampinen/status/1947072974621982839

RT @ns123abc: Bruh… people already reproduced Google’s IMO results without RL with just prompting openai researchoors think they have the… https://x.com/_philschmid/status/1948304855837085717

RT @polynoamial: Today, we at @OpenAI achieved a milestone that many considered years away: gold medal-level performance on the 2025 IMO wi… https://x.com/kchonyc/status/1946526143433015349

The hardest high school math exam in the world, the 6 problem 9 hour IMO 2025, was this week. AI models performed poorly. Gemini 2.5 Pro scored the highest, just 13/42, costing $431.97, in a best of 32 eval. Bronze cutoff was 19. Long way to go for AI to solve hard Math. https://x.com/deedydas/status/1946244012278722616

The two cents: 1. The OpenAI IMO solutions to P1-P5 seem to be correct. 2. P6 is a significantly novel and more difficult problem. P1-P5 are arguably within reach of “standard” IMO problem-solving techniques, but P6 requires creativity. (2/10) https://x.com/ErnestRyu/status/1946698896375492746

There are always a flood of posts about what AI can or cannot do, so it is worth pausing and paying attention to this one. It is a very hard test, done without tools. It was also viewed as an unlikely goal. Prediction markets had the chance of this happening this year as 20% https://x.com/emollick/status/1946563737604743386

This wins my respect. https://x.com/Yuchenj_UW/status/1947339774257402217

Tough look for OpenAI They’ve pissed off the international math community by jumping the gun, meanwhile @GoogleDeepMind has an officially-confirmed result that will be available commercially months earlier https://x.com/mathemagic1an/status/1947352370037305643

Two cents on AI getting International Math Olympiad (IMO) Gold, from a mathematician. Background: Last year, Google DeepMind (GDM) got Silver in IMO 2024. This year, OpenAI solved problems P1-P5 for IMO 2025 (but not P6), and this performance corresponds to Gold. (1/10) https://x.com/ErnestRyu/status/1946698766305968446

we achieved gold medal level performance on the 2025 IMO competition with a general-purpose reasoning system! to emphasize, this is an LLM doing math and not a specific formal math system; it is part of our main push towards general intelligence. when we first started openai, https://x.com/sama/status/1946569252296929727

We might be heading into a plot twist in the OpenAI vs. DeepMind IMO saga. Just saw a post from Joseph Myers (involved in the Math Olympiad since 1992): the IMO committee reportedly asked AI labs not to publish results until 7 days after the closing ceremony — out of respect for https://x.com/zjasper666/status/1947013036382068971

Why am I excited about IMO results we just published: – we did very little IMO-specific work, we just keep training general models – all natural language proofs – no evaluation harness We needed a new research breakthrough and @alexwei_ and team delivered https://x.com/millionint/status/1946551400365994077

RT @Jack_W_Lindsey: We’re launching an “AI psychiatry” team as part of interpretability efforts at Anthropic!  We’ll be researching phenome… https://x.com/EthanJPerez/status/1948612180007612901

Seems like a really cool opportunity! I’m glad to see Anthropic interpretability moving in this kind of direction https://x.com/NeelNanda5/status/1948194800228069520

Subliminal Learning: Language Models Transmit Behavioral Traits via Hidden Signals in Data https://alignment.anthropic.com/2025/subliminal-learning/

In 2022, @GoogleDeepMind launched Ithaca to help restore, place and date ancient texts. Now, they’re working with collaborators to introduce Aeneas, a new AI model that contextualizes ancient Latin inscriptions. 📜 Learn more ⬇️ https://x.com/Google/status/1948039522194718799

Introducing the first model for contextualizing ancient inscriptions, designed to help historians better interpret, attribute and restore fragmentary texts. – Google DeepMind https://deepmind.google/discover/blog/aeneas-transforms-how-historians-connect-the-past/

Neat example of AI in the humanities. A Google model trained on Latin text fills in lost parts of Latin inscriptions & identifies related texts Historians increased their accuracy by 44% when working with the AI (Though AI alone beats historians, historian + AI was usually best) https://x.com/emollick/status/1948063719042498587

Our new state-of-the-art AI model Aeneas transforms how historians connect the past. 📜 Ancient inscriptions often lack context – it’s like solving a puzzle with 90% of the pieces lost to time. It helps researchers interpret and situate inscriptions in their past context. 🧵 https://x.com/GoogleDeepMind/status/1948037924882133390

The Open Proof Corpus (OPC) bundles 5,062 human‑checked proofs for 1,010 mathematical competition problems, giving researchers a big public yard‑stick for real reasoning rather than guess‑the‑answer tasks. GEMINI‑2.5‑PRO already judges proofs with 88.1% accuracy, and a simple https://x.com/rohanpaul_ai/status/1948012725122052335

Perplexity Comet vs ChatGPT Agent https://x.com/AravSrinivas/status/1946076236683624616

now AI can write novel proofs at the level of a world-class competitive mathematician but it still can’t reliably book me a weekend trip to boston so strange https://x.com/jxmnop/status/1946675650686746879

This past week, Harmonic had the opportunity to represent our advanced mathematical reasoning model, Aristotle, at the International Mathematics Olympiad – the most prestigious mathematics competition in the world. To uphold the sanctity of the student competition, the IMO Board https://x.com/HarmonicMath/status/1947023450578763991

Yes, there is an official marking guideline from the IMO organizers which is not available externally. Without the evaluation based on that guideline, no medal claim can be made. With one point deducted, it is a Silver, not Gold. https://x.com/lmthang/status/1946960256439058844

🧬 Further to my previous post, last month’s huge medical AI innovation, Microsoft’s AI Diagnostic Orchestrator (MAI-DxO) must be mentioned. 📉 Till now, drug research has followed Eroom’s law, where the cost to bring one therapy to market roughly doubles every 9 years and the https://x.com/rohanpaul_ai/status/1946448157652762955

Major progress in AIxBio greatly increases the risk of deliberate or accidental release of harmful bioagents. This demands urgent attention, serious caution & decisive action. Read the statement I’ve signed with many other AI & life science researchers: https://x.com/Yoshua_Bengio/status/1945960609570275508

De novo-designed pMHC binders facilitate T cell–mediated cytotoxicity toward cancer cells | Science https://www.science.org/doi/10.1126/science.adv0422

Aside from everything else interesting about this paper, I appreciate that more scientific papers (aided by LLM help?) are now including little demos and experiments to help non-specialists get the points they are making. (And no, you cannot identify the hidden signals) https://x.com/emollick/status/1948058454830063782

official results from @atcoder World Tour Finals are in — great results for both humans (#1 and #3 onwards) and AI (#2 in the world!). a milestone for AI for solving hard problems. https://x.com/gdb/status/1945989983569129632

Elon Musk’s Neuralink says owned by ‘disadvantaged’ persons in filing https://www.cnbc.com/2025/07/17/elon-musks-neuralink-says-owned-by-disadvantaged-persons-in-filing.html

This week we shared an open source AI tool that will help accelerate the discovery of high-performance, low-carbon concrete. See the full story in the thread below. You can also find research artifacts for this project here: 1️⃣ Technical report with details of the model and https://x.com/AIatMeta/status/1946227081970393116

We’re thrilled to see our advanced ML models and EMG hardware — that transform neural signals controlling muscles at the wrist into commands that seamlessly drive computer interactions — appearing in the latest edition of @Nature. Read the story: https://x.com/AIatMeta/status/1948042281107538352

What a tangled robot we weave https://seas.harvard.edu/news/2025/07/what-tangled-robot-we-weave

Mitochondrial Donation treatment – Press Office – Newcastle University https://www.ncl.ac.uk/press/articles/latest/2025/07/mitochondrialdonationtreatment/

RT @DSPyOSS: 🤯 New research deploys DSPy-optimized system in real-world medical settings. Finds 70% increase in positive patient feedback.… https://x.com/lateinteraction/status/1946328354740691103

Adrian Cosma on X: “🚨 New paper! We present Dr.Copilot – a multi-agent LLM system deployed in the real world to improve doctor-patient communication in Romanian 🇷🇴. One of the first production deployments of LLMs in Romanian telemedicine. 👇 📄 https://t.co/ghzoQJtpuo https://t.co/hLb9apgUMV”

