product photo. a can of car oil with the brand “Synthetic Data” emblazoned on the front. --ar 16:9 --style raw

“Here are 300 hours of curated courses focused on Machine Learning Engineering. 15 courses. From beginner to advanced. Google published these for free.” https://twitter.com/svpino/status/1780657510518788593

“In Quantization Fundamentals with @HuggingFace, you’ll learn how to quantize open source models, to make them more accessible and efficient. You’ll get hands-on and practice by quantizing open source multimodal and language models. Start now! ➡️” https://twitter.com/DeepLearningAI/status/1780612212765200599
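
The core idea the course covers can be illustrated in a few lines. This is a generic sketch of symmetric per-tensor int8 quantization, not code from the course: pick one scale so the largest-magnitude weight maps to ±127, round everything else, and dequantize to see how little precision is lost.

```python
import numpy as np

def quantize_int8(w):
    """Quantize a float array to int8 with a per-tensor symmetric scale."""
    scale = np.abs(w).max() / 127.0  # largest magnitude maps to +/-127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 representation."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.2, 0.03, 1.27], dtype=np.float32)
q, scale = quantize_int8(w)
print(q)                                   # int8 codes
print(np.abs(w - dequantize(q, scale)).max())  # rounding error
```

Production schemes (per-channel scales, activation quantization, 4-bit formats) refine this, but the map-scale-round-clip loop is the common core.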

“Highly amusing update, ~18 hours later: llm.c is now down to 26.2ms/iteration, exactly matching PyTorch (tf32 forward pass). We discovered a bug where we incorrectly called cuBLAS in fp32 mathmode 🤦‍♂️. And ademeure contributed a more optimized softmax kernel for very long rows…” https://twitter.com/karpathy/status/1779272336186978707
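
For context on why softmax over long rows needs a dedicated kernel: the operation is mathematically trivial but numerically delicate. The sketch below is the standard max-subtraction formulation in NumPy, not the contributed CUDA kernel; the CUDA version does the same math with careful reductions and memory access patterns.

```python
import numpy as np

def softmax_rows(x):
    # Subtract the per-row max before exponentiating so exp() never
    # overflows, even for very long rows containing large logits.
    m = x.max(axis=-1, keepdims=True)
    e = np.exp(x - m)
    return e / e.sum(axis=-1, keepdims=True)

logits = np.array([[1000.0, 1001.0, 1002.0]])  # naive exp() would overflow here
print(softmax_rows(logits))
```

Without the max subtraction, `exp(1000)` overflows to infinity and the row becomes NaN, which is exactly the failure mode a fused kernel must avoid at scale.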

“As the natural world’s human data becomes increasingly exhausted through LLM training, we believe that: the data carefully created by AI and the model step-by-step supervised by AI will be the sole path towards more powerful AI. Thus, we built a Fully AI powered Synthetic…” https://twitter.com/WizardLM_AI/status/1779899333678387318

“Excited to announce the Compass Beta, a very powerful multi-aspect data search system powered by a new embedding model, Compass. We’re looking for help stress-testing the model’s capabilities and finding where it breaks. Sign up here:” https://twitter.com/aidangomez/status/1779882113573044625
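
Embedding-based search systems like this one ultimately reduce to nearest-neighbor lookup in vector space. A minimal sketch of that retrieval step (generic cosine-similarity top-k, assuming nothing about Compass itself):

```python
import numpy as np

def top_k(query, corpus, k=2):
    # Normalize both sides so a plain dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    scores = c @ q
    idx = np.argsort(-scores)[:k]  # indices of the k highest-scoring rows
    return idx, scores[idx]

corpus = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]])  # toy document embeddings
idx, scores = top_k(np.array([1.0, 0.0]), corpus, k=2)
print(idx, scores)
```

Real systems swap the brute-force `argsort` for an approximate index (e.g. HNSW) once the corpus grows, but the scoring math is the same.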

[2404.11536v1] FedPFT: Federated Proxy Fine-Tuning of Foundation Models – https://arxiv.org/abs/2404.11536v1 
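
FedPFT’s proxy fine-tuning is considerably more involved, but for readers new to the area, the aggregation step underlying most federated learning is the classic FedAvg weighted average of client parameters. A generic sketch (not the paper’s method):

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    # Weighted average of client model parameters, where each client's
    # contribution is proportional to its local dataset size.
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Two clients with one scalar parameter each; the client with 3x the data
# pulls the average 3x as hard.
avg = fedavg([np.array([0.0]), np.array([1.0])], [1, 3])
print(avg)
```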

[2404.11225v1] In-Context Learning State Vector with Inner and Momentum Optimization – https://arxiv.org/abs/2404.11225v1

[2404.11335v1] SoccerNet Game State Reconstruction: End-to-End Athlete Tracking and Identification on a Minimap – https://arxiv.org/abs/2404.11335v1 

Diffusion Models for Video Generation | Lil’Log – https://lilianweng.github.io/posts/2024-04-12-diffusion-video/ 

“🔥llm.c update: Our single file of 2,000 ~clean lines of C/CUDA code now trains GPT-2 (124M) on GPU at speeds ~matching PyTorch (fp32, no flash attention)” https://twitter.com/karpathy/status/1781387674978533427

[2404.09937] Compression Represents Intelligence Linearly – https://arxiv.org/abs/2404.09937 
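
The paper’s framing treats a language model as a compressor and scores it by how few bits it needs per byte of held-out text. A crude but runnable analogue of that metric, using gzip instead of an LLM (purely illustrative, not the paper’s setup):

```python
import gzip

def bits_per_byte(text):
    # Bits of compressed output per byte of input: lower means the
    # compressor "understands" (predicts) the text better.
    raw = text.encode("utf-8")
    return 8 * len(gzip.compress(raw)) / len(raw)

print(bits_per_byte("abab" * 1000))  # highly repetitive text compresses well
```

Swap gzip for an LLM’s per-token log-loss and you get the quantity the paper correlates with benchmark performance.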
