A fashion photoshoot of a runway look inspired by old rusty toolboxes. A large screen displays the word “Tech” --ar 4:3 --style raw
“Creating a Pipeline for Generating Synthetic Data for Fine-Tuning Custom Embedding Models. 👀 Step 1 Create a Knowledge Base: Start with preparing your domain specific knowledge base, such as PDFs or other documents containing information. Convert the content of these documents
“At current growth rates, AI runs out of easy-to-access high quality data by 2028, depending on how aggressive training is. There are techniques that may extend this (eg synthetic data) and possibilities for using less-easily-accessed data. (Google & Meta are sitting on a lot).”
“Given all this, when will we exhaust the web’s text? Training a compute-optimal dense model on ~100T tokens for 4 epochs would take ~5e28 FLOP (around 3 OOMs above GPT-4). At historical growth rates, we’ll reach this level by 2028. 7/12
“So this paper found you can cut the API token costs of using Chain of Thought prompting by over 20% with no decrease in accuracy for GPT-4 (though a decrease in math accuracy in GPT-3.5) by just adding the words “be concise.” That’s all. LLMs are weird.
[2405.09032] ICAL: Implicit Character-Aided Learning for Enhanced Handwritten Mathematical Expression Recognition
[2405.14831] HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models
[2405.21018v1] Improved Techniques for Optimization-Based Jailbreaking on Large Language Models
[2406.02350v1] LlamaCare: A Large Medical Language Model for Enhancing Healthcare Knowledge Sharing
[AINews] FineWeb: 15T Tokens, 12 years of CommonCrawl (deduped and filtered, you’re welcome) • Buttondown
“Towards Scalable Automated Alignment of LLMs Great overview of methods used for automated alignment of LLMs. The four main directions explored in the paper are the following: – Aligning through inductive bias – Aligning through behavior imitation – Aligning through model
“xLSTM: Extended Long Short-Term Memory “performs favorably when compared to state-of-the-art Transformers and State Space Models, both in performance and scaling.” LSTM is not dead! Looking forward to see the comeback of RNNs🔥
zeux.io – LLM inference speed of light
Mesop: Quickly build web UIs in Python
Used at Google for rapid internal app development
DreamMat: High-quality PBR Material Generation with Geometry- and Light-aware Diffusion Models
“Qdrant is now fully integrated with @neo4j’s APOC procedures, bringing advanced vector search capabilities to your graph database applications! 🚀 📖 Read the documentation:
“⚡️ FastEmbed 0.3.0 is here! Now featuring Image embeddings (ResNet50), multimodal embeddings (CLIP), late interaction embeddings (ColBERT), and an innovative type of sparse embeddings. 🙌 GitHub:
LLM Merging Competition: Building LLMs Efficiently through Merging | NeurIPS 2024 Challenge
“How to organize and generate high quality data is the secret sauce of fine tuning. Daniel is going to provide a masterclass on this topic
“Emmanuel elaborates why he’s increasingly bearish on fine-tuning in this talk: Why Fine-Tuning is Dead I am not as bearish, which is why I think the talk is interesting!
BrightEdge Releases Post Google I/O Data on The Impact of
“Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality Presents Mamba-2, which outperforms Mamba and Transformer++ in both perplexity and wall-clock time
“Are We Done with MMLU? Creates MMLU-Redux, which is a subset of 3,000 manually re-annotated questions across 30 MMLU subjects data:
“Awesome and highly useful: FineWeb-Edu 📚👏 High quality LLM dataset filtering the original 15 trillion FineWeb tokens to 1.3 trillion of the highest (educational) quality, as judged by a Llama 3 70B. +A highly detailed paper. Turns out that LLMs learn a lot better and faster
“Transformers are SSMs Generalized Models and Efficient Algorithms Through Structured State Space Duality While Transformers have been the main architecture behind deep learning’s success in language modeling, state-space models (SSMs) such as Mamba have recently been shown
“The Geometry of Concepts in LLMs Studies the geometry of categorical concepts and how the hierarchical relations between them are encoded in LLMs. Finding from the paper: “Simple categorical concepts are represented as simplices, hierarchically related concepts are orthogonal
“Thought-Augmented Reasoning with LLMs Presents a thought-augmented reasoning approach, Buffer of Thoughts, to enhance the accuracy, efficiency, and robustness of LLM-based reasoning. It leverages a meta-buffer containing high-level thoughts (thought templates) distilled from
“llm.c by Hand✍️ C programming + matrix multiplication by hand This combination is perhaps as low as we can get to explain how the Transformer works. Special thanks to @karpathy for encouraging early feedback and @7etsuo for helping me understand the pragma magic. I hope
“This might be one of the most important 45-mn read you could indulge in today if you want to understand the secret behind high performance large language models like Llama3, GPT-4 or Mixtral Inspired by the @distillpub interactive graphics papers, we settled to write the most”
“This is really a ‘WOW’ paper. 🤯 Claims that MatMul operations can be completely eliminated from LLMs while maintaining strong performance at billion-parameter scales and by utilizing an optimized kernel during inference, their model’s memory consumption can be reduced by more
“This new LoRA technique Orthonormal Low-Rank Adaptation (OLoRA) significantly accelerates the convergence of LLM training while preserving the efficiency benefits of LoRA, such as the number of trainable parameters and GPU memory footprint. 🔥 📌 OLoRA not only converges faster
“Teach LLMs to internalize chain-of-thought (CoT) reasoning, without generating explicit intermediate steps, enabling implicit CoT reasoning during inference. 📌 Stepwise Internalization that successfully teaches LLMs to reason implicitly achieves high accuracy while maintaining
“I am finding Infinity quite awesome ✨ . It’s a high-throughput, low-latency REST API for serving vector embeddings, supporting a wide range of text-embedding models and frameworks. — – Deploy any model from MTEB: deploy the model you know from SentenceTransformers – Fast
“Nice overview in this Paper – “Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey” 📌 Parameter-Efficient Fine-Tuning (PEFT): The core concept revolves around adapting pre-trained large models to specific tasks by modifying only a small subset of
“Prompting correctly is power. Keep navigating the deep, dark depth of the latent space, and then you have a real advantage (jailbreaking, adhering to JSON schemas, grounding and much more)
“been learning a lot about LLMs etc over the past year, organized some of my favorite explainers into a “textbook-shaped” resource guide wish i’d had this at the start, maybe it can useful to others on a similar journey
“It’s also fairly clear to me rn that Memory databases are to 2024 what Vector databases were to 2023 @hwchase17 has a huge hit brewing on his hands with langmem
“Excited to share what I’ve been working on as part of the former Superalignment team! We introduce a SOTA training stack for SAEs. To demonstrate that our methods scale, we train a 16M latent SAE on GPT-4. Because MSE/L0 is not the final goal, we also introduce new SAE metrics.”
“We introduce a new SAE training stack based on a TopK activation function. This eliminates feature shrinking and lets us set L0 directly. We find that our method performs well on the MSE/L0 frontier. Our method has very few dead latents, even at 16M scale.
“Built a simple but seemingly quite difficult benchmark for analyzing malicious solidity contract code. So far only the top closed models are capable of occasionally identifying code that is malicious, gpt-4o and claude-opus every open model I tried fails > 95% of the time”
blog.alexalemi.com KL is All You Need
https://blog.alexalemi.com/kl-is-all-you-need.htm
This week’s executive overview and top links are here:
AI News #36: Week Ending 06/07/2024 with Executive Summary and Top 40 Links
The post you just read is a deep-dive extension of my weekly newsletter, This Week In AI, an executive summary of the top things to know in AI. Each week, I create an accessible overview so that laypeople can feel confident they are conversant with the week’s AI developments. I include a curated list of the week’s must-click links, giving everyone a hands-on opportunity to explore the most intriguing updates in artificial intelligence across categories including robotics, imagery, video, AR/VR, science, ethics, and more. Beyond the overview, I post these topic-based deeper dives (below). If you haven’t read this week’s overview, I recommend starting there.
- Agents/Copilots
- Amazon
- Apple
- Artificial General Intelligence (AGI)
- Augmented and Virtual Reality (AR/VR)
- Autonomous Vehicles
- AI Audio
- Business and Enterprise AI
- Chips and Hardware
- Consumer Products
- Education
- Ethics/Legal/Security
- Images/Photos
- International AI News
- Locally Run AI Models
- Mobile
- Meta
- Microsoft
- OpenAI
- Open Source
- Podcasts/YouTube
- Publishing and News
- Retrieval-Augmented Generation (RAG) News
- Robots and Embodiment
- Science and Medicine
- Video
- Vision/Multimodality
- X/Twitter/Grok
- Tech and Development
Credits/Sources

Most of these weekly links come from just a few prolific oversharing sources. Please follow them, as they work hard to find the news each week and they make it a lot easier for me to compile.
- Robert Scoble: https://x.com/Scobleizer
- Ethan Mollick: https://www.linkedin.com/in/emollick/
- Alan Thompson: https://lifearchitect.ai/
- Theoretically Media: https://www.youtube.com/@TheoreticallyMedia
- The Rundown: https://www.therundown.ai/
- Bilawal Sidhu: https://twitter.com/bilawalsidhu/
- TLDR: https://tldr.tech/ai
- Jeremiah Owyang: https://twitter.com/jowyang
- Nick St. Pierre: https://twitter.com/nickfloats
- Dr. Jim Fan: https://twitter.com/DrJimFan
- All About AI: https://www.youtube.com/@AllAboutAI
- Marshall Kirkpatrick: https://aitimetoimpact.com/
- AI News (Smol Talk): https://buttondown.email/ainews/archive/
For previous issues, please visit the archives!

Thanks for reading!




