Image created with gemini-2.5-flash-image, prompted by claude-sonnet-4-5. Image prompt: Photorealistic 35mm cinema shot of child aged 6-8 from side angle sitting on plush rug in warm-lit bedroom, surrounded by panoramic arc of TV screens displaying molecular structures and neural networks in cool blue-white glow, vintage brass microscope and DNA helix model beside scattered scientific journals and newspapers, small terrarium with bioluminescent plants on shelf, bold text ‘SCIENCE’ at top, shallow depth of field, tender uncanny atmosphere, warm pastels contrasted with screen glow.

A Message from AI Research Leaders: Join Us in Supporting OpenReview https://x.com/openreviewnet/status/2001835887244501221

2025 Physicians AI report https://2025-physicians-ai-report.offcall.com/

Surprisingly rapid and high AI adoption by doctors: 67% use it daily, 84% say it makes them better doctors, and 42% say it makes them want to stay in medicine more (10% said less). Many of the use cases appear to be administrative and research assistance. https://x.com/emollick/status/2001061282485547116

How good is AI for science? Yesterday, OpenAI released a benchmark, FrontierScience, to measure frontier model performance on scientific tasks. This is the most sophisticated benchmark for science I’ve seen. FrontierScience has 160 questions across various subdomains. https://x.com/jungofthewon/status/2001302379527114798

Chain of Unit-Physics builds physics knowledge directly into the code-generation process. Researchers from @UMich propose an inverse approach to scientific code generation: – They encode human expert knowledge as unit-physics tests that the code must pass. – In a multi-agent… https://x.com/TheTuringPost/status/2000177305981944308
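To make the idea concrete, here is a minimal sketch (my own illustration, not the authors' code) of what a "unit-physics" test could look like: the generated solver must pass a check encoding expert physical knowledge, here conservation of mechanical energy in a toy free-fall simulation.

```python
# Hypothetical sketch of a unit-physics test. The simulator plays the role
# of model-generated code; the test encodes an expert physics constraint
# (energy conservation) that any acceptable generation must satisfy.

def simulate_free_fall(h0, dt=1e-4, g=9.81, steps=10_000):
    """Toy 'generated' code under test: semi-implicit Euler free fall."""
    h, v = h0, 0.0
    for _ in range(steps):
        v -= g * dt   # update velocity first (symplectic, so energy drift stays small)
        h += v * dt
    return h, v

def test_energy_conserved():
    h0 = 100.0
    h, v = simulate_free_fall(h0)
    e0 = 9.81 * h0                   # initial potential energy per unit mass
    e1 = 9.81 * h + 0.5 * v * v      # potential + kinetic energy at the end
    # The physics constraint the generated code must satisfy:
    assert abs(e1 - e0) / e0 < 1e-3

test_energy_conserved()
```

A code generator that produced a dimensionally inconsistent or unstable integrator would fail this test, which is the kind of signal the multi-agent loop can use to reject or repair a candidate.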

InternGeometry: An LLM agent tackles Olympiad-level geometry. This novel agent solves 44 of 50 International Math Olympiad problems, beating gold medalists with only 13K training examples. It uses iterative reasoning and Complexity-Boosting RL. https://x.com/HuggingPapers/status/1999572332906438987

.@AIatMeta clarified a concept we strongly support – human and AI co-improvement. When building AI systems that work with human researchers at every step – from ideas to experiments – we can create safer intelligence and tech. Here is how to train AI specifically for research… https://x.com/TheTuringPost/status/1999294766664831253

No verifiers? No problem. 🤝 The Together Research team is excited to introduce RARO — a new paradigm that unlocks scalable reasoning. By teaching LLMs to reason through adversarial games, we’re seeing promising results where standard RL fails. Check it out now and let us know… https://x.com/togethercompute/status/2000631170909057390

Overall I’m very excited to see this! I’ve been wanting more transparency into how models are improving at science – we expect models to achieve the same kinds of breakthroughs in science over the next year or so as they have shown in coding to date. Big things are coming. https://x.com/jungofthewon/status/2001302387949236510

Please join me, Doina Precup @kchonyc @AndrewYNg @Yoshua_Bengio @rshaveddinov @earnmyturns in providing financial support for OpenReview. It is one of the most important open platforms for quality AI research. We must ensure that it is well funded and can fulfill its mission. https://x.com/jpineau1/status/2001843615598092414

For the first time, an AI model (GPT-5) autonomously solved an open math problem submitted to our benchmarking project IMProofBench, with a complete, correct proof, without human hints or intervention. A small but novel contribution to enumerative geometry. Some background: https://x.com/JohSch314/status/2001300666917208222

NEW: Google releases FunctionGemma, a lightweight (270M), open foundation model built for creating specialized function calling models! 🤯 To test it out, I built a small game: use natural language to solve fun physics simulation puzzles, running 100% locally in your browser! 🕹️ https://x.com/xenovacom/status/2001703932968452365
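For readers unfamiliar with the pattern FunctionGemma targets, here is a minimal sketch of the function-calling loop (my own illustration; the schema shape and tool names are invented, not FunctionGemma's actual API): the app declares tools as JSON schemas, the model emits a structured call, and the app parses and dispatches it.

```python
import json

# Hypothetical tool declaration, in the JSON Schema style commonly used
# for function calling. A tool like this could back a physics-puzzle game.
TOOLS = [{
    "name": "set_gravity",
    "description": "Set gravity strength in the physics puzzle.",
    "parameters": {
        "type": "object",
        "properties": {"g": {"type": "number", "description": "m/s^2"}},
        "required": ["g"],
    },
}]

def set_gravity(g: float) -> str:
    return f"gravity set to {g} m/s^2"

REGISTRY = {"set_gravity": set_gravity}

def dispatch(model_output: str) -> str:
    """Parse a structured call like {"name": ..., "arguments": {...}}
    emitted by the model and invoke the matching local function."""
    call = json.loads(model_output)
    return REGISTRY[call["name"]](**call["arguments"])

# Simulated model output for the user request "make gravity like the Moon's":
print(dispatch('{"name": "set_gravity", "arguments": {"g": 1.62}}'))
# prints: gravity set to 1.62 m/s^2
```

The appeal of a small model like FunctionGemma is that this parse-and-dispatch loop can run entirely on-device, since the model only needs to map natural language to one of the declared schemas rather than generate free-form text.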

Evaluating AI’s ability to perform scientific research tasks | OpenAI https://openai.com/index/frontierscience/

Science 🤝 GPT-5. Our new FrontierScience benchmark will be a valuable way to measure the performance of AI models on hard chemistry, biology, physics, and more. Plus, GPT-5 operating in a wet lab environment suggested experiments to increase a molecular cloning protocol’s… https://x.com/kevinweil/status/2000982202067165253

We’re releasing a new eval to measure expert-level scientific reasoning: FrontierScience. This benchmark measures PhD-level scientific reasoning across physics, chemistry, and biology. It contains hard, expert-written questions (both olympiad-style problems and longer… https://x.com/OpenAI/status/2000975293448905038

Energy Department Announces Collaboration Agreements with 24 Organizations to Advance the Genesis Mission | Department of Energy https://www.energy.gov/articles/energy-department-announces-collaboration-agreements-24-organizations-advance-genesis

Discover more from Ethan B. Holland
