“Exclusive first look at Orbit. 

It’s a new kind of brain/computer interface. 

AI-powered tool may offer quick, no-contact blood pressure and diabetes screening | American Heart Association 

[2411.04632v1] Improved Multi-Task Brain Tumour Segmentation with Synthetic Data Augmentation

[2411.04872] FrontierMath: A Benchmark for Evaluating Advanced Mathematical Reasoning in AI

“1/10 Today we’re launching FrontierMath, a benchmark for evaluating advanced mathematical reasoning in AI. We collaborated with 60+ leading mathematicians to create hundreds of original, exceptionally challenging math problems, of which current AI systems solve less than 2%. 

FrontierMath: Evaluating Advanced Mathematical Reasoning in AI | Epoch AI

“7/10 What do experts think? We interviewed Fields Medalists Terence Tao (2006), Timothy Gowers (1998), Richard Borcherds (1998), and IMO coach Evan Chen. They unanimously described our research problems as exceptionally challenging, requiring deep domain expertise. 

[2411.06427v1] UniGAD: Unifying Multi-level Graph Anomaly Detection

“Moravec’s paradox in LLM evals I was reacting to this new benchmark of frontier math where LLMs only solve 2%. It was introduced because LLMs are increasingly crushing existing math benchmarks. The interesting issue is that even though by many accounts (/evals), LLMs are inching” / X

AI protein-prediction tool AlphaFold3 is now more open

[2411.05316v1] Exploring the Alignment Landscape: LLMs and Geometric Deep Models in Protein Representation

How Google helps others with AI flood forecasting

“@_jasonwei Do you have any intuition on why o1 significantly underperforms Gemini 1.5 as well as Sonnet 3.5 on FrontierMath? This was very shocking to me.” / X

“Newly published in this issue of Science Robotics today from Meta FAIR: NeuralFeels with neural fields — Visuotactile perception for in-hand manipulation ➡️ 

“It’s not actually clear to me that the human inductive bias generalizes algebraic structures OOD on its own. Humans mostly do it through tool use, ‘neurosymbolic’ is embodiment in disguise. LLMs still need polishing to reach parity with the human bias but maybe not much?” / X

Robot that watched surgery videos performs with skill of human doctor | Hub

New secret math benchmark stumps AI models and PhDs alike – Ars Technica

“Meet Muse, our latest AI innovation for drug development—a tool designed to optimize patient recruitment built together w. @sanofi and @OpenAI. Muse is an example of how AI systems are becoming capable of performing tasks that once required entire teams or organizations. 

Artificial intelligence is helping improve climate models https://www.economist.com/science-and-technology/2024/11/13/artificial-intelligence-is-helping-improve-climate-models
