Technical and Dev: AI News Week Ending 03/14/2025

“Meet the recipients of the 2024 ACM A.M. Turing Award, Andrew G. Barto and Richard S. Sutton! They are recognized for developing the conceptual and algorithmic foundations of reinforcement learning. Please join us in congratulating the two recipients! https://x.com/TheOfficialACM/status/1897225672935735579

“Main critiques: 1. Attention lacks a way to do “nothing”, and this may be why attention sinks arise. 2. Temperature scaling should be a function of sequence length because of how denominator of normalization naturally grows. https://x.com/torchcompiled/status/1899901965053944148
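
To make the two critiques above concrete, here is a minimal sketch in plain Python. It is not the author's code: `softmax_with_null` (one extra logit fixed at 0) and the log(n) logit scaling in `peak_weight` are assumptions chosen for illustration, and the linked post may propose different fixes.

```python
import math

def softmax(scores):
    """Standard softmax: weights always sum to 1, even if no key is relevant."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_with_null(scores):
    """Softmax with one extra implicit logit fixed at 0 (hypothetical 'null' slot).
    The null slot absorbs mass, so the returned weights can sum to less than 1."""
    m = max(max(scores), 0.0)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps) + math.exp(0.0 - m)
    return [e / total for e in exps]

def peak_weight(n, gap=2.0, length_scaled=False):
    """Weight on one key whose logit is `gap` above n-1 equal competitors.
    With length_scaled=True, logits are multiplied by log(n) (an assumed rule)."""
    scale = math.log(n) if length_scaled else 1.0
    return softmax([gap * scale] + [0.0] * (n - 1))[0]

irrelevant = [-10.0, -10.0, -10.0]
print(sum(softmax(irrelevant)))            # forced to attend somewhere
print(sum(softmax_with_null(irrelevant)))  # nearly all mass goes to the null slot
print(peak_weight(8), peak_weight(512))    # same logit gap, diluted by length
print(peak_weight(8, length_scaled=True), peak_weight(512, length_scaled=True))
```

With plain softmax the same logit gap buys less attention as the sequence grows, because the denominator gains one term per extra key. Scaling logits with length is one way to counteract that.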

“For 25% of the Winter 2025 batch, 95% of lines of code are LLM generated. That’s not a typo. The age of vibe coding is here.” https://x.com/garrytan/status/1897303270311489931

“Chain-of-thoughts are the “dark knowledge” of LLMs, like logits in a classifier neural network. By changing prompting methods (e.g. assert a wrong answer and then ask for explanation), you get to learn much more about the model. It’s like getting to know another person. https://x.com/shaneguML/status/1899477905132138577

The US Army Is Using ‘CamoGPT’ to Purge DEI From Training Materials | WIRED https://www.wired.com/story/the-us-army-is-using-camogpt-to-purge-dei-from-training-materials/

“Forgetting Transformer Softmax Attention with a Forget Gate https://x.com/_akhaliq/status/1898946992484602155
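
A rough sketch of what a forget gate on softmax attention can look like, assuming the gate acts as a cumulative log-decay added to the causal attention logits. The function name, argument layout, and that exact formulation are assumptions for illustration; the paper is the authority on the actual method.

```python
import math

def forgetting_attention_weights(scores, forget):
    """Causal attention weights with a per-step forget gate (illustrative).

    scores[i][j] is the raw logit of query i against key j (only j <= i is used).
    forget[t] is a gate value in (0, 1] for step t.
    Key j, seen from query i, is down-weighted by the product of the gates
    at steps j+1..i, implemented as a sum of logs added to the logit.
    """
    n = len(scores)
    prefix = [0.0]  # prefix[t] = sum of log(forget[0..t-1])
    for f in forget:
        prefix.append(prefix[-1] + math.log(f))
    weights = []
    for i in range(n):
        logits = [scores[i][j] + prefix[i + 1] - prefix[j + 1] for j in range(i + 1)]
        m = max(logits)
        exps = [math.exp(x - m) for x in logits]
        z = sum(exps)
        weights.append([e / z for e in exps])
    return weights

flat = [[0.0] * 3 for _ in range(3)]
print(forgetting_attention_weights(flat, [1.0, 1.0, 1.0])[2])  # no forgetting: uniform
print(forgetting_attention_weights(flat, [0.5, 0.5, 0.5])[2])  # older keys decay
```

With all gates at 1 this reduces to ordinary causal softmax attention. Gates below 1 shift weight toward recent keys.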

“Learning from Failures in Multi-Attempt Reinforcement Learning https://x.com/_akhaliq/status/1898939546718314976

“New Trends for Modern Machine Translation with Large Reasoning Models https://x.com/_akhaliq/status/1900402426886115362

“R1-Searcher Incentivizing the Search Capability in LLMs via Reinforcement Learning https://x.com/_akhaliq/status/1898942888307745100

“TinyR1-32B-Preview Boosting Accuracy with Branch-Merge Distillation https://x.com/_akhaliq/status/1898941150922158225

“It’s starting – just kicked off AI Dev 25, the AI developer conference, in San Francisco! Happy Pi day! https://x.com/AndrewYNg/status/1900594063516254299

“MLX LM has a new home! https://x.com/awnihannun/status/1900311865026372032

“Made a new guide to writing faster MLX and avoiding some common perf cliffs. https://x.com/awnihannun/status/1899861832774668399

“AI compilers bring new tech into the AI performance world – and promise to relieve us from having to write CUDA kernels directly. Here we look at TVM and XLA to see what worked well, what didn’t, and why so much of GenAI is still all written in CUDA directly… 🧐” https://x.com/clattner_llvm/status/1899913688158798055

“Leaderboard lore: origin story by @_lewtun, and life in thread! April 2023: @nathanhabib1011 and I were working on an internal evaluation suite at the time – when the lb went public, @Thom_Wolf contacted us to see if we could grow it and industrialize the code” https://x.com/clefourrier/status/1900572125238378939

Deriving Muon https://jeremybernste.in/writing/deriving-muon

“AI struggles with messy, conflicting, ever-changing data. Today’s AI ranking methods can’t prioritize clearly, because they lack human guidance. Introducing the world’s first instruction-following, SOTA reranker! Give our reranker instructions to control exactly how it ranks: • https://x.com/douwekiela/status/1899490844572577958

“In this week’s Gradient Updates issue, @EgeErdil2 argues that extrapolating past and current AI capabilities to the future yields excessively conservative estimates of the future impact of AI, and it’s often better to rely on first principles reasoning to make predictions.🧵 https://x.com/EpochAIResearch/status/1898117611105161604

“Coders are over-represented on this site, so discussion on AI access is assumed to be about APIs, but, for most people, AI means using a chatbot, and usually the ChatGPT interface. It is through those interfaces that AI will prove to be useful, or not. At least for now. https://x.com/emollick/status/1897812093576917244

“New post! I’d like to very boldly claim that the origins of using softmax in attention are a bit arbitrary, the probability lens is restrictive framework in how we might improve it, and that there’s even something I’d consider a bug in attention affecting LLMs. https://x.com/torchcompiled/status/1899894436802506976

Fine-tune AI models at scale https://nebius.com/services/studio-fine-tuning

“📄✨ Useful PDF-to-Markdown comparison tool: Compare 8+ parsing methods side-by-side, visualize results instantly, and download as ZIP. Test it yourself https://x.com/fdaudens/status/1898815305448665335

“Had a great time at @Khipu_AI in Santiago, Chile. My talk on Sequential decision making using online variational bayes is here, in case you are interested. (Lots of other cool talks online too) https://x.com/sirbayes/status/1900294930599121068

“Now someone needs to make a follow-up on this work that adds Global Uncertainty Distillation, then they can call it GIDD-GUD lol” https://x.com/giffmana/status/1899167614359806012

“I fully agree with @AndrewYNg on this. There’s never been a better time to learn to code. Coding just became a lot more accessible thanks to AI copilots. Anyone with an idea can now code an app to either make money or do something good. I think we’re at the onset of an” https://x.com/NandoDF/status/1900548832733069638

“Andrew G. Barto and Richard S. Sutton won the 2024 Turing Award for developing the foundations of reinforcement learning in the 1980s. After winning, they both warned against the rapid and unsafe deployment of advanced AI models https://x.com/rowancheung/status/1897554432289616255

“Just uploaded my “Coding Attention Mechanisms” tutorial. A 2h15m session on coding attention mechanisms to understand how the engine of LLMs works: self-attention → parameterized self-attention → causal self-attention → multi-head self-attention https://x.com/rasbt/status/1899493072972415154

“Mixture of Experts Made Intrinsically Interpretable “We introduce MoE-X, a redesigned MoE layer that functions as a wide, sparse, and more interpretable MLP within large language models.” https://x.com/iScienceLuvr/status/1899718810363642242

“YuE: Scaling Open Foundation Models for Long-Form Music Generation “We tackle the task of long-form music generation—particularly the challenging lyrics-to-song problem—by introducing YuE (乐), a family of open foundation models based on the LLaMA2 architecture. Specifically, https://x.com/iScienceLuvr/status/1899717628912157104

Teaching Language Models to Solve Sudoku Through Reinforcement Learning – Hrishbh Dalal https://hrishbh.com/teaching-language-models-to-solve-sudoku-through-reinforcement-learning/

“I am starting to think that «overcapacity» in general is a myth that only a terminally econ-brained culture could entertain. Maybe you can overproduce bicycles. But in most consequential domains – housing, energy, chips, raw materials, and yes cars – More Stuff Better.” https://x.com/teortaxesTex/status/1899540640351862954

“All Roads Lead to Likelihood: The Value of Reinforcement Learning in Fine-Tuning Author’s Explanation: https://x.com/TheAITimeline/status/1898892461226996173

“Q-Filters: Leveraging QK Geometry for Efficient KV Cache Compression Author’s Explanation: https://x.com/TheAITimeline/status/1898892513274180085

“Transformers without normalization: Is it possible? @AIatMeta says “Yes!” and proposes Dynamic Tanh (DyT) – a super simple and efficient function that mimics how normalization works. • DyT works just as well as normalization layers (or even better!) • Doesn’t needing extra https://x.com/TheTuringPost/status/1900528108140372411
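
As a sketch of the idea, a Dynamic Tanh layer can be written as an elementwise `gamma * tanh(alpha * x) + beta`, with a learnable scalar `alpha` and per-channel `gamma` and `beta`. The class below is a plain-Python illustration, not the released implementation, and the default `alpha=0.5` is an assumed starting value.

```python
import math

class DynamicTanh:
    """Elementwise stand-in for a normalization layer (illustrative only)."""

    def __init__(self, dim, alpha=0.5):
        self.alpha = alpha          # scalar that controls how quickly tanh saturates
        self.gamma = [1.0] * dim    # per-channel scale, as in LayerNorm's affine part
        self.beta = [0.0] * dim     # per-channel shift

    def __call__(self, x):
        # No mean or variance is computed: each activation is squashed on its own.
        return [g * math.tanh(self.alpha * v) + b
                for v, g, b in zip(x, self.gamma, self.beta)]

layer = DynamicTanh(dim=3)
print(layer([0.0, 100.0, -100.0]))  # extreme activations saturate near +1 and -1
```

Unlike LayerNorm, nothing here depends on statistics of the other channels or tokens, which is where the "no extra computation" claim comes from.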

[on-demand] How to Operationalize AI with Process Orchestration | Camunda https://page.camunda.com/wb-how-to-operationalize-ai-with-process-orchestration

“Introducing our latest blog on WebDev Arena: A Live LLM Leaderboard for Web App Development! How does WebDev Arena work? Submit a prompt → Two LLMs battle it out → You vote on the better web app. Since launching in Dec 2024, we’ve gathered 100,000+ community votes evaluating https://x.com/lmarena_ai/status/1899181467252711593
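
The "two LLMs battle, you vote" loop can be turned into a ranking in several ways. The sketch below uses a simple Elo update purely as an illustration. The quoted post does not say which rating method WebDev Arena uses, and the function name, the K-factor of 32, and the starting rating of 1000 are all assumptions.

```python
def elo_update(rating_a, rating_b, a_won, k=32.0):
    """Return updated (rating_a, rating_b) after one pairwise vote."""
    expected_a = 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))
    score_a = 1.0 if a_won else 0.0
    delta = k * (score_a - expected_a)
    return rating_a + delta, rating_b - delta

# Hypothetical models and votes; each vote is (winner, loser).
ratings = {"model_x": 1000.0, "model_y": 1000.0}
votes = [("model_x", "model_y"), ("model_x", "model_y"), ("model_y", "model_x")]
for winner, loser in votes:
    ratings[winner], ratings[loser] = elo_update(ratings[winner], ratings[loser], True)
print(ratings)
```

Elo depends on vote order. Leaderboards with many votes often fit an order-independent model such as Bradley-Terry instead.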

“Good tip from Replit’s @mattppal at AI Dev 25 on debugging while vibe coding: Large part of it is looking at outputs to figure out what context you have that LLM does not, so that you can give it that context and help it get unstuck. Sometimes pasting in the error messages is https://x.com/AndrewYNg/status/1900617330906067136

“So it looks like the First Scaling Law (the bigger the model the “smarter”) still holds: order of magnitude increases in compute lead to linear improvements in ability. GPT-3.5 Turbo scored 30% on GPQA, GPT-4 Turbo got 47%, now GPT-4.5 got 70%. And Reasoners add a new Scaling Law https://x.com/emollick/status/1897457930833731979

“Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning “We formalize the problem of optimizing test-time compute as a meta-reinforcement learning (RL) problem, which provides a principled perspective on spending test-time compute. This perspective enables us to view the https://x.com/iScienceLuvr/status/1899392429893042485

“Looking at this, you can see some of answer to the “what _____ saw” memes about why insiders at the AI labs got so excited a year ago. Reasoning models (“test-time compute”) really do seem to represent a breakthrough in AI capability, at least in fields like math and coding.” https://x.com/emollick/status/1898174702478082279

“Love to see the talent density and creativity of research work at @cartesia_ai. Now with more GPUs” https://x.com/saranormous/status/1899484191256981779

[2503.04715] Predictable Scale: Part I — Optimal Hyperparameter Scaling Law in Large Language Model Pretraining https://arxiv.org/abs/2503.04715

“There is a huge cultural rift in engineering right now between “big team” and “small team” people. It’s one thing for coding to get more efficient. But many people had full-time jobs operating the organizational machinery that allowed large teams to collaborate. If large teams” https://x.com/scottastevenson/status/1900357184191390184

HRAvatar – Project page https://eastbeanzhang.github.io/HRAvatar/

“I really wanted a better arxiv explorer for myself, so just vibe-coded https://x.com/amazedsaint/status/1896065978359755176

Paper page – Boosting Jailbreak Attack with Momentum https://huggingface.co/papers/2405.01229

Paper page – D2PO: Discriminator-Guided DPO with Response Evaluation Models https://huggingface.co/papers/2405.01511

Ai2 Case Study – Allen Institute for Artificial Intelligence https://www.cirrascale.com/ai2-case-study

GitHub – wang-zidu/SRM-Hair: The official implementation of SRM-Hair. https://github.com/wang-zidu/SRM-Hair

Paper page – Reasoning About Group Polarization: From Semantic Games to Sequent Systems https://huggingface.co/papers/2405.01322

Paper page – Using Waste Factor to Optimize Energy Efficiency in Multiple-Input Single-Output (MISO) and Multiple-Input Multiple-Output (MIMO) Systems https://huggingface.co/papers/2405.01352

“LLMs in Text-to-SQL are vulnerable to backdoor attacks. Attackers can manipulate models to generate malicious SQL queries when triggered, while maintaining performance on normal inputs. This paper introduces ToxicSQL, a novel backdoor attack framework. ToxicSQL uses stealthy https://x.com/rohanpaul_ai/status/1900151122633121879

VACE https://ali-vilab.github.io/VACE-Page/

Paper page – The continuous extension of the logarithmic double layer potential to the Ahlfors-regular boundary https://huggingface.co/papers/2405.01482

[2503.01785] Visual-RFT: Visual Reinforcement Fine-Tuning https://arxiv.org/abs/2503.01785

“Trained using RLOO. Very exciting to see labs exploring algorithms beyond PPO. Paper: https://x.com/finbarrtimbers/status/1899491349881352628
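
For readers unfamiliar with RLOO (REINFORCE Leave-One-Out): for each prompt, several completions are sampled, and each completion's baseline is the mean reward of the other samples, so no learned value network is needed. A minimal sketch of the advantage computation follows. The function name and reward values are made up for illustration, and this is not the training code of the model mentioned above.

```python
def rloo_advantages(rewards):
    """Leave-one-out advantages for k sampled completions of one prompt."""
    k = len(rewards)
    if k < 2:
        raise ValueError("RLOO needs at least two samples per prompt")
    total = sum(rewards)
    # Baseline for sample i is the average reward of the other k-1 samples.
    return [r - (total - r) / (k - 1) for r in rewards]

print(rloo_advantages([1.0, 0.0, 0.0, 1.0]))
```

Each advantage then weights the log-probability gradient of its own completion, as in plain REINFORCE.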

Motion Anything https://steve-zeyu-zhang.github.io/MotionAnything/

[2405.01304] Misclassification bounds for PAC-Bayesian sparse deep learning https://arxiv.org/abs/2405.01304

LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities https://forjadeforest.github.io/LMM-R1-ProjectPage/

MovieAgent https://weijiawu.github.io/MovieAxgent/

“Folks, we have set up a github repo for QwQ, specifically providing evaluation scripts for you to easily test the benchmark performance of reasoning models, and also reproduce our reported results. We provide step-by-step guidance for you to run the evaluation, and we hope this” https://x.com/Alibaba_Qwen/status/1900595120053047452

[2503.02836v1] SeqFusion: Sequential Fusion of Pre-Trained Models for Zero-Shot Time-Series Forecasting https://arxiv.org/abs/2503.02836v1

“Really in-depth paper on AI hallucinations in medicine, with lots of discussion and analysis about addressing them & what is appropriate for medical use. But I found this bit on how much more accurate the latest models have gotten to be interesting (though more study is needed) https://x.com/emollick/status/1899562684405670394

“Announcing Cartesia’s Series A towards our mission of building real-time intelligence. I’m cooking up some new models in the back – looking for researchers who want to develop the next generation of architectures 👀” https://x.com/_albertgu/status/1899499128389877764

“We challenged Reachy 2 to sort healthy & unhealthy foods—no AI training, just real-time object detection! Built using pollen_vision & our self-development kit. simple version : https://x.com/pollenrobotics/status/1897286630249259310

“New paper: turns out you can train deep nets without normalization layers by replacing them with a parameterized tanh()” https://x.com/ylecun/status/1900610590315249833

[2405.01437] Two competing populations with a common environmental resource https://arxiv.org/abs/2405.01437

“LLMs struggle to generate unit tests effectively for complex code. Existing methods either lack usage examples or use large code context, overwhelming the LLM. This paper addresses these limitations. It proposes augmenting LLM prompts with precise context from static program https://x.com/rohanpaul_ai/status/1900183062459146365

“🚀Alpha Test New LangGraph Platform Dataplane Deployment We are looking for ~5-10 small, fast moving startups to alpha test a new deployment option for LangGraph Platform on a Kubernetes cluster living in your own cloud. This deployment option is a hybrid data plane/control” https://x.com/hwchase17/status/1899150042172379476