product photo. a can of car oil with the brand “Synthetic Data” emblazoned on the front. --ar 16:9 --style raw

“Here are 300 hours of curated courses focused on Machine Learning Engineering. 15 courses. From beginner to advanced. Google published these for free.” https://twitter.com/svpino/status/1780657510518788593

“In Quantization Fundamentals with @HuggingFace, you’ll learn how to quantize open source models, to make them more accessible and efficient. You’ll get hands-on and practice by quantizing open source multimodal and language models. Start now! ➡️” https://twitter.com/DeepLearningAI/status/1780612212765200599
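
The core idea the course covers can be illustrated in a few lines. This is a generic sketch of symmetric per-tensor int8 quantization, not code from the course: pick one scale so the largest-magnitude weight maps to ±127, round everything else, and dequantize to see how little precision is lost.

```python
import numpy as np

def quantize_int8(w):
    """Quantize a float array to int8 with a per-tensor symmetric scale."""
    scale = np.abs(w).max() / 127.0  # largest magnitude maps to +/-127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 representation."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.2, 0.03, 1.27], dtype=np.float32)
q, scale = quantize_int8(w)
print(q)                                   # int8 codes
print(np.abs(w - dequantize(q, scale)).max())  # rounding error
```

Production schemes (per-channel scales, activation quantization, 4-bit formats) refine this, but the map-scale-round-clip loop is the common core.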

“Highly amusing update, ~18 hours later: llm.c is now down to 26.2ms/iteration, exactly matching PyTorch (tf32 forward pass). We discovered a bug where we incorrectly called cuBLAS in fp32 mathmode 🤦‍♂️. And ademeure contributed a more optimized softmax kernel for very long rows…” https://twitter.com/karpathy/status/1779272336186978707
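
For context on why softmax over long rows needs a dedicated kernel: the operation is mathematically trivial but numerically delicate. The sketch below is the standard max-subtraction formulation in NumPy, not the contributed CUDA kernel; the CUDA version does the same math with careful reductions and memory access patterns.

```python
import numpy as np

def softmax_rows(x):
    # Subtract the per-row max before exponentiating so exp() never
    # overflows, even for very long rows containing large logits.
    m = x.max(axis=-1, keepdims=True)
    e = np.exp(x - m)
    return e / e.sum(axis=-1, keepdims=True)

logits = np.array([[1000.0, 1001.0, 1002.0]])  # naive exp() would overflow here
print(softmax_rows(logits))
```

Without the max subtraction, `exp(1000)` overflows to infinity and the row becomes NaN, which is exactly the failure mode a fused kernel must avoid at scale.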

“As the natural world’s human data becomes increasingly exhausted through LLM training, we believe that: the data carefully created by AI and the model step-by-step supervised by AI will be the sole path towards more powerful AI. Thus, we built a Fully AI powered Synthetic…” https://twitter.com/WizardLM_AI/status/1779899333678387318

“Excited to announce the Compass Beta, a very powerful multi-aspect data search system powered by a new embedding model, Compass. We’re looking for help stress-testing the model’s capabilities and finding where it breaks. Sign up here:” https://twitter.com/aidangomez/status/1779882113573044625
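
Embedding-based search systems like this one ultimately reduce to nearest-neighbor lookup in vector space. A minimal sketch of that retrieval step (generic cosine-similarity top-k, assuming nothing about Compass itself):

```python
import numpy as np

def top_k(query, corpus, k=2):
    # Normalize both sides so a plain dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    c = corpus / np.linalg.norm(corpus, axis=1, keepdims=True)
    scores = c @ q
    idx = np.argsort(-scores)[:k]  # indices of the k highest-scoring rows
    return idx, scores[idx]

corpus = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]])  # toy document embeddings
idx, scores = top_k(np.array([1.0, 0.0]), corpus, k=2)
print(idx, scores)
```

Real systems swap the brute-force `argsort` for an approximate index (e.g. HNSW) once the corpus grows, but the scoring math is the same.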

[2404.11536v1] FedPFT: Federated Proxy Fine-Tuning of Foundation Models – https://arxiv.org/abs/2404.11536v1 
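
FedPFT’s proxy fine-tuning is considerably more involved, but for readers new to the area, the aggregation step underlying most federated learning is the classic FedAvg weighted average of client parameters. A generic sketch (not the paper’s method):

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    # Weighted average of client model parameters, where each client's
    # contribution is proportional to its local dataset size.
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Two clients with one scalar parameter each; the client with 3x the data
# pulls the average 3x as hard.
avg = fedavg([np.array([0.0]), np.array([1.0])], [1, 3])
print(avg)
```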

[2404.11225v1] In-Context Learning State Vector with Inner and Momentum Optimization – https://arxiv.org/abs/2404.11225v1

[2404.11335v1] SoccerNet Game State Reconstruction: End-to-End Athlete Tracking and Identification on a Minimap – https://arxiv.org/abs/2404.11335v1 

Diffusion Models for Video Generation | Lil’Log – https://lilianweng.github.io/posts/2024-04-12-diffusion-video/ 

“🔥llm.c update: Our single file of 2,000 ~clean lines of C/CUDA code now trains GPT-2 (124M) on GPU at speeds ~matching PyTorch (fp32, no flash attention)” https://twitter.com/karpathy/status/1781387674978533427

[2404.09937] Compression Represents Intelligence Linearly – https://arxiv.org/abs/2404.09937 
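
The paper’s framing treats a language model as a compressor and scores it by how few bits it needs per byte of held-out text. A crude but runnable analogue of that metric, using gzip instead of an LLM (purely illustrative, not the paper’s setup):

```python
import gzip

def bits_per_byte(text):
    # Bits of compressed output per byte of input: lower means the
    # compressor "understands" (predicts) the text better.
    raw = text.encode("utf-8")
    return 8 * len(gzip.compress(raw)) / len(raw)

print(bits_per_byte("abab" * 1000))  # highly repetitive text compresses well
```

Swap gzip for an LLM’s per-token log-loss and you get the quantity the paper correlates with benchmark performance.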
