Header image prompt: sports photography. a llama stands in the UFC octagon. –ar 5:3 –style raw

Apple

“Apple presents OpenELM – An efficient LM family with open-source training and inference framework – Performs on par with OLMo while requiring 2x fewer pre-training tokens. repo: …” – https://twitter.com/arankomatsuzaki/status/1782948858005454997

apple/OpenELM · Hugging Face – https://huggingface.co/apple/OpenELM 
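
For anyone who wants to poke at the checkpoints, here is a minimal loading sketch assuming the Hugging Face transformers library and the model card's convention of pairing OpenELM with the (gated) Llama 2 tokenizer; verify the IDs against the page above before relying on it.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "apple/OpenELM-3B-Instruct",  # sizes range from 270M to 3B
    trust_remote_code=True,       # OpenELM ships custom modeling code
)
# Per the model card, OpenELM reuses the Llama 2 tokenizer (gated on HF).
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

inputs = tokenizer("Once upon a time there was", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```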

“Just read Apple’s ‘OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework’. Similar to OLMo, it’s refreshing to see an LLM paper that shares details discussing the architecture, training methods, and training data. Let’s start with the…” – https://twitter.com/rasbt/status/1783480053847736713

“Apple Has Open-Sourced Their On-Device Language Models And They Aren’t Very Good! Apple has uncharacteristically been open-sourcing its work around language models! Kudos to them for that 🙏 However, their models are really bad. Compare the 3B model’s MMLU, which is 24.8, to…” – https://twitter.com/bindureddy/status/1783635037365436462

Paper page – OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework – https://huggingface.co/papers/2404.14619

Cohere

“Yesterday, we open sourced the Cohere Toolkit. We think this will be a major accelerant for getting LLMs into production within enterprise.” – https://twitter.com/aidangomez/status/1783533461401227563

DeepSeek

“🌟 Meet #DeepSeekMoE: The Next Gen of Large Language Models! Performance Highlights: 📈 DeepSeekMoE 2B matches its 2B dense counterpart with 17.5% computation. 🚀 DeepSeekMoE 16B rivals LLaMA2 7B with 40% computation. 🛠 DeepSeekMoE 145B significantly outperforms GShard,…” – https://twitter.com/deepseek_ai/status/1745304852211839163
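
Since those computation claims hinge on mixture-of-experts routing, here is an illustrative PyTorch sketch of generic top-k expert routing. This is not DeepSeek's implementation (DeepSeekMoE layers fine-grained expert segmentation and shared experts on top of this idea), but it shows why each token only pays for the k experts it is routed to.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Generic top-k MoE layer: a router picks k experts per token."""

    def __init__(self, dim: int, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(dim, n_experts, bias=False)  # the router
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                           nn.Linear(4 * dim, dim)) for _ in range(n_experts)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (tokens, dim)
        scores = self.gate(x)                              # (tokens, n_experts)
        weights, idx = torch.topk(F.softmax(scores, dim=-1), self.k, dim=-1)
        out = torch.zeros_like(x)
        # Each token is processed by only its k chosen experts, which is why
        # an MoE can match a dense model at a fraction of the computation.
        for slot in range(self.k):
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
        return out

x = torch.randn(16, 64)
print(TopKMoE(64)(x).shape)  # torch.Size([16, 64])
```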

DeepSpeed-MoE (arXiv:2201.05596) – https://arxiv.org/pdf/2201.05596

Meta/Llama

Llama-3 may have just killed proprietary AI models · Kadoa · AI Web Scraper – https://www.kadoa.com/blog/llama3-killed-proprietary-models

“Llama 3 70B is, within the margin of error, tied at 1st place for English queries at the LMSYS leaderboard.” – https://twitter.com/rohanpaul_ai/status/1783570318230978783

“Meta’s Llama 3 has ascended the AI model rankings. On English prompts, the open model topped GPT-4 on the LMSYS LLM leaderboard. Major milestone for open-source LLMs.” – https://twitter.com/rowancheung/status/1782251056119885961

“Llama-3 is closing the gap with GPT-4, but multimodal models gotta catch up. Vision capabilities of open models like LLaVA are far, far behind GPT-4V. Video models are even worse. They hallucinate all the time and fail to give detailed descriptions of complex scenes and actions…” – https://twitter.com/DrJimFan/status/1782072699705250204

“llama 3 is an english-first model. filter the lmsys leaderboard for english prompts only and 70B instruct beats opus, sonnet, and online gemini pro.” – https://twitter.com/morqon/status/1781289685358088671

“As predicted, Meta starts opening their VR/AR OS to other hardware companies.” – https://twitter.com/BenBajarin/status/1782443180371349780

A New Era for Mixed Reality | Meta Quest Blog | Meta Store – https://www.meta.com/blog/quest/meta-horizon-os-open-hardware-ecosystem-asus-republic-gamers-lenovo-xbox/

“Best LLM value proposition right now is @GroqInc with @Meta Llama-3-70b-instruct. Especially if you value speed!💨” – https://twitter.com/benfleming__/status/1781609335170215983

“It’s been exactly one week since we released Meta Llama 3. In that time, the models have been downloaded over 1.2M times, we’ve seen 600+ derivative models on @HuggingFace, and much more. More on the exciting impact we’re already seeing with Llama 3 ➡️” – https://twitter.com/AIatMeta/status/1783602908845748685

“AI Showdown 🤯🚀 @AIatMeta’s Llama 3 70B on @GroqInc blows Claude Opus and GPT-4 Turbo out of the water on a combined Speed/Price/Quality dimension, at an average of 200 tokens per second (real-life measurements). Read more here 👇” – https://twitter.com/avysotsky/status/1781458937981829426

“‘Quantizing Llama 3 8B seems more harmful compared to other models.’ The 8B model is packed so full of information its tensors can no longer be as robustly mathematically/structurally encoded compared to the older 7Bs. Similar thoughts were explored in the paper ‘How Good…’” – https://twitter.com/rohanpaul_ai/status/1783613703759274363

“Meta’s new AI—Llama3—beats an old version of GPT-4, Claude 2.1, and GPT-3.5! You can run it: – 100% free – with 100% privacy (no data leaves your machine) See the comments for an easy way to install 👇” – https://twitter.com/JeremyNguyenPhD/status/1782011010804817978

“Llama 3 8B running 1.89 tokens/s on a Raspberry Pi 5 is pretty CRAZY” – https://twitter.com/adamcohenhillel/status/1781490719997526210

“llama 3 70b beamed to my phone from my M1 Max ~7.6 tok/s with mlx. your own little gpt-4 at home” – https://twitter.com/localghost/status/1781847388879220742

“3 ways to run Llama-3 locally (100% free and without internet):” – https://twitter.com/Saboo_Shubham_/status/1781748135473131740
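
As one concrete example of the local-install routes these threads describe, here is a minimal sketch using the Ollama Python client; it assumes Ollama is installed and `ollama pull llama3` has already been run.

```python
import ollama  # pip install ollama; talks to a locally running Ollama server

response = ollama.chat(
    model="llama3",  # the 8B instruct model; "llama3:70b" for the 70B
    messages=[{"role": "user", "content": "Explain RoPE in one sentence."}],
)
print(response["message"]["content"])
```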

Try Meta’s new Llama 3 model via the OctoAI API today | OctoAI – https://octo.ai/blog/try-meta-llama-3-via-the-octoai-api/ 

“Meta releasing near gpt-4 level models is really driving the price of tokens down because anyone can take the weights and optimize the runtime, e.g. Groq, Together API, Fireworks, etc. Definitely not good for OpenAI.” – https://twitter.com/abacaj/status/1781443464246559180

“Here’s one little thing you might have missed about what makes L3 a big deal. It has a 128k vocabulary. So, first of all, those 15T tokens? Probably 22-25% more text in Llama2/Mistral-speak. It’ll be faster as well, and better at multilingual (esp. after finetuning). Good day.” – https://twitter.com/teortaxesTex/status/1781001629174575126

Maxime Labonne – Fine-tune Llama 3 with ORPO – https://mlabonne.github.io/blog/posts/2024-04-19_Fine_tune_Llama_3_with_ORPO.html
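
For orientation, here is a condensed sketch of the ORPO setup that post walks through, using TRL's ORPOTrainer, which fine-tunes directly on chosen/rejected preference pairs without a separate SFT stage. The dataset ID and hyperparameters below are placeholders; follow the post for the real recipe.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

model_id = "meta-llama/Meta-Llama-3-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# ORPO expects "prompt", "chosen", "rejected" columns (placeholder dataset id).
dataset = load_dataset("some-org/preference-pairs", split="train")

args = ORPOConfig(
    output_dir="llama3-orpo",
    beta=0.1,                        # weight of the odds-ratio preference term
    max_length=1024,
    per_device_train_batch_size=2,
    num_train_epochs=1,
)
trainer = ORPOTrainer(model=model, args=args, train_dataset=dataset,
                      tokenizer=tokenizer)
trainer.train()
```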

Llama 3 is not very censored · Ollama Blog – https://ollama.com/blog/llama-3-is-not-very-censored

Sharing Llama-3-8B-Web, an action model designed for browsing the web by following instructions and talking to the user, and WebLlama, a new project for pushing development in Llama-based agents : r/LocalLLaMA – https://www.reddit.com/r/LocalLLaMA/comments/1caw3ad/sharing_llama38bweb_an_action_model_designed_for/

Groq hosted Llama-3-70B is not smart, probably quantized too much : r/LocalLLaMA – https://www.reddit.com/r/LocalLLaMA/comments/1casosh/groq_hosted_llama370b_is_not_smart_probably/

“Moreover, we observe even stronger performance in the English category, where Llama 3’s ranking jumps to ~1st place with GPT-4-Turbo! It consistently performs strongly against top models (see win-rate matrix) by human preference. It’s been optimized for dialogue scenarios with large…” – https://twitter.com/lmsysorg/status/1782483701710061675

Access the Most Advanced Open-Source LLM (Llama 3) on Your Phone | The Rundown University – https://university.therundown.ai/c/daily-tutorials/access-the-most-advanced-open-source-llm-llama-3-on-your-phone-3bc18501-071a-4438-aa52-36bd1bcb9eb0

Llama 3 70B is REALLY good with creative writing with just a little bit of prompting effort. I’m very impressed. : r/LocalLLaMA – https://www.reddit.com/r/LocalLLaMA/comments/1cbrt5l/llama_3_70b_is_really_good_with_creative_writing/

“I’ve doubled LLaMA 3’s context window to 16K tokens. Fully open-source. Link in thread:” – https://twitter.com/mattshumer_/status/1782576964118675565

“Colossal-Inference now supports Llama 3 inference acceleration. They report a ~20% enhancement in training efficiency for Llama 3 8B and 70B and outperforming alternative inference solutions such as vLLM. This is why open-source AI matters. There are all kinds of innovations…” – https://twitter.com/omarsar0/status/1783895931043111088

“Here’s the 256k (262k) version built on OSS tools so that anyone can reproduce on their own. Trained using PoSE, further extending our previous 64k version at the original RoPE theta. Per our previous experiments, I expect this should handle passkey retrieval up to 512k. 🤗 Model:…” – https://twitter.com/winglian/status/1783842736833016289

“18% faster Llama 3 70B training. 20% faster Llama 3 8B training. We. Are. Moving. Fast!” – https://twitter.com/svpino/status/1783888989025431933

Llama 3 8B f16 vs Llama 3 70B Q2 : r/LocalLLaMA – https://www.reddit.com/r/LocalLLaMA/comments/1cda0fv/llama_3_8b_f16_vs_llama_3_70b_q2/

LLama-3-8B-Instruct with a 262k context length landed on HuggingFace : r/LocalLLaMA – https://www.reddit.com/r/LocalLLaMA/comments/1cd4yim/llama38binstruct_with_a_262k_context_length/

“Build a UX for your LLM chatbot/agent that not only includes streaming, but also showcases sources as expandable UI elements 📖🔎, similar to @perplexity_ai! Now possible in one line of code through create-llama. Amazing work by @MarcusSchiesser 💫 Check it out:…” – https://twitter.com/llama_index/status/1783297521386934351

“I’m up to 96k context for Llama 3 8B. Using PoSE, we did continued pre-training of the base model w/ 300M tokens to extend the context length to 64k. From there we increased the RoPE theta to further attempt to extend the context length. 🧵” – https://twitter.com/winglian/status/1783456379199484367
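
To make the RoPE-theta lever concrete, here is a minimal sketch of overriding rope_theta at load time with transformers. This shows only the configuration knob; without the PoSE-style continued pre-training described in the thread, long-range quality is not guaranteed, and the values below are illustrative.

```python
from transformers import AutoConfig, AutoModelForCausalLM

model_id = "meta-llama/Meta-Llama-3-8B"
config = AutoConfig.from_pretrained(model_id)
config.rope_theta = 4_000_000           # raise from Llama 3's default of 500k
config.max_position_embeddings = 65536  # advertise a 64k context window

# Weights are unchanged; only the positional-encoding geometry is stretched.
model = AutoModelForCausalLM.from_pretrained(model_id, config=config)
```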

“Dolphin-2.9-Llama3-70b is released – created by myself, @FernandoNetoAi, @LucasAtkins7, and Cognitive Computations under the Llama 3 license. Much gratitude to my compute sponsor @CrusoeEnergy and personal thanks to @3thanPetersen for quantizing it! And much thanks to the dataset…” – https://twitter.com/erhartford/status/1783273948022755770

“Llama-3 70b QLoRA finetuning is 1.83x faster & uses 63% less VRAM than HF+FA2. 1. Llama-3 70b + Unsloth can fit 48K context lengths at bsz=1 on an A100 80GB (6x longer than FA2) with +1.9% overhead. 2. Llama-3 8b QLoRA fits in an 8GB card & is 2x faster, uses 68% less VRAM. Can fit…” – https://twitter.com/danielhanchen/status/1783214287567347719
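
Here is a sketch of what that Unsloth QLoRA setup roughly looks like in code; the model ID and LoRA hyperparameters are typical values from Unsloth's examples, not the exact benchmarked run.

```python
from unsloth import FastLanguageModel  # pip install unsloth

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",  # pre-quantized 4-bit weights
    max_seq_length=8192,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                                  # LoRA rank
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
    use_gradient_checkpointing="unsloth",  # key to the long-context VRAM savings
)
# `model` now trains with a standard HF/TRL trainer loop.
```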

Snowflake

Snowflake Arctic – LLM for Enterprise AI – https://www.snowflake.com/blog/arctic-open-efficient-foundation-language-models-snowflake/

“.@SnowflakeDB is thrilled to announce #SnowflakeArctic: A state-of-the-art large language model uniquely designed to be the most open, enterprise-grade LLM on the market. This is a big step forward for open source LLMs. And it’s a big moment for Snowflake in our #AI journey as…” – https://twitter.com/RamaswmySridhar/status/1783123091104936060

“Snowflake casually releases Arctic, an open-source LLM (Apache 2.0 license) that uses a unique Dense-MoE hybrid transformer architecture. Arctic performs on par with Llama 3 70B on enterprise metrics like coding (HumanEval+ & MBPP+), SQL (Spider), and instruction following…” – https://twitter.com/omarsar0/status/1783176059694821632

Snowflake Arctic Cookbook Series: Mixture of Experts (MoE) | Snowflake Builders Blog – https://medium.com/snowflake/snowflake-arctic-cookbook-series-exploring-mixture-of-experts-moe-c7d6b8f14d16

Heads up! You’ve scrolled to the end of this category. There may have been just one or two links above, so scroll back up and double-check that you didn’t skip past them.

Be Sure To Read This Week’s Main Post:

This week’s executive overview and top links are here:

AI News #30: Week Ending 04/26/2024 with Executive Summary and Top 39 Links

The post you just read is a deep-dive extension of my weekly newsletter, This Week In AI, an executive summary of the top things to know in AI. Each week, I create an accessible overview so laypeople can feel confident they are conversant with the week’s AI developments. I include a curated list of must-click links of the week to offer everyone a hands-on opportunity to explore the most intriguing updates in artificial intelligence across various categories, including robotics, imagery, video, AR/VR, science, ethics, and more. Beyond the overview, I post these topic-based deeper dives. If you haven’t read this week’s overview, I recommend starting there.

Credits/Sources

Most of these weekly links come from just a few prolific oversharing sources. Please follow them, as they work hard to find the news each week and they make it a lot easier for me to compile.

For previous issues, please visit the archives!

Thanks for reading!
