Image created with gemini-2.5-flash-image with claude-sonnet-4-5. Image prompt: Photorealistic wide shot of six Ionic limestone columns with classical entablature carved with EDUCATION in Roman serif, colorful doctoral robes draped over columns, open textbooks at bases, late afternoon golden light on university quad with red brick buildings behind, sharp focus on carved stone text and fabric texture, natural shadows on green grass.

A year ago, I would not have expected the first academic field to seem to reach a consensus that AIs will accelerate research (which is not the same thing as autonomous research) would be math. But that appears to be happening based on math professors in my feed and elsewhere. https://x.com/emollick/status/1984388281061282081

I have been writing for years about the fact that we are not ready for the destruction of costly signalling mechanisms. Writing used to be a way of measuring effort, ability and diligence. We still have no easy substitute. https://x.com/emollick/status/1985854486317822204

I firmly believe we are at a watershed moment in the history of mathematics. In the coming years, using LLMs for math research will become mainstream, and so will Lean formalization, made easier by LLMs. (1/4) https://x.com/ErnestRyu/status/1984033423586160889

vibe coding should be taught in every school and university around the world, first day on campus, welcome to the future https://x.com/OfficialLoganK/status/1984645815361814672

We’re announcing a partnership with Iceland’s Ministry of Education and Children to bring Claude to teachers across the nation. It’s one of the world’s first comprehensive national AI education pilots: https://x.com/AnthropicAI/status/1985612560255893693

Google DeepMind release: Towards Robust Mathematical Reasoning Introduces IMO-Bench, a suite of advanced reasoning benchmarks that played a crucial role in GDM’s IMO-gold journey. Vetted by a panel of IMO medalists and mathematicians. IMO-AnswerBench – a large-scale test on https://x.com/iScienceLuvr/status/1985685404276965481

While human expert evaluation remains the gold standard for mathematical proofs, its cost and time intensity limit scalable research. To address this, we built #ProofAutoGrader, an automatic grader for IMO-ProofBench. The autograder leverages Gemini 2.5 Pro, providing it with a https://x.com/lmthang/status/1985772094085595570

Continuing our IMO-gold journey, I’m delighted to share our #EMNLP2025 paper “Towards Robust Mathematical Reasoning”, which tells some of the key stories behind the success of our advanced Gemini #DeepThink at this year IMO. Finding the right north-star metrics was highly https://x.com/lmthang/status/1985760224612057092

