Ethan B. Holland

Over 56,100 manually organized AI links and counting

a vintage filing cabinet in a dusty basement. It glows from within. Something magical is inside its drawers. --ar 5:3 --v 6.0 --style raw

Open Source AI News: Week Ending 04/12/2024

April 12, 2024


Cohere

“Exciting news – the latest Arena result are out! @cohere’s Command R+ has climbed to the 6th spot, matching GPT-4-0314 level by 13K+ human votes! It’s undoubtedly the **best** open model on the leaderboard now🔥 Big congrats to @cohere’s incredible work & valuable contribution https://twitter.com/lmsysorg/status/1777630133798772766

“@steph_palazzolo @erinkwoo @AndrewYNg @herbertong Cohere released Rerank 3, a new foundation model designed to enhance enterprise search and RAG systems. Quick facts: – 4k context length -Ability to search over multi-aspect and semi-structured data -Multilingual coverage of 100+ languages -Improved latency https://twitter.com/rowancheung/status/1778602208566603853

“Cohere beating Meta and Mistral to GPT-4 performance with an open weights model was not in my LLM bingo card – huge congrats to the team for making this tech widely accessible 🔥!” / X – https://twitter.com/_lewtun/status/1777679834799345809

“Quite a historic moment for Open-sourced LLM Command R+ from @cohere becomes the first open-weights model outperforming GPT-4. — And to note, Elo rating is probably the one as of now, that is more trustable. It is based on the Elo rating system, commonly used in competitive https://twitter.com/rohanpaul_ai/status/1777771141886623840
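For readers unfamiliar with the Elo system mentioned in the tweet above, here is a minimal sketch of the textbook Elo update in Python. It is not necessarily how the Arena leaderboard computes its scores; the function names and the K-factor of 32 are assumptions chosen for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return both ratings after one head-to-head comparison.

    score_a is 1.0 if A won, 0.5 for a tie, 0.0 if A lost.
    k (assumed 32 here) controls how far a single result moves the ratings.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Two equally rated models; A wins one human vote.
    print(update(1000.0, 1000.0, 1.0))  # (1016.0, 984.0)
```

An upset against a higher-rated model moves the ratings more than an expected win does, which is why a ranking built from many thousands of votes tends to settle.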

“Cmd R+ beats Sonnet at financial RAG I initially assumed these models were equivalent due to pricing. However, command r+ was both faster and 5% more correct than Claude Sonnet on financial RAG evals. Financial RAG pipeline: • openai embeddings • cosine similarity retrieval https://twitter.com/virattt/status/1777676354596618474
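The retrieval step named in the tweet above (cosine similarity over embeddings) can be sketched roughly as follows. This is an illustration only, not the tweet author's pipeline: the three-dimensional vectors and document names are made up, and a real system would get its vectors from an embedding model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_u * norm_v)


def top_k(query_vec, doc_vecs, k=2):
    """Return the ids of the k documents most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


if __name__ == "__main__":
    # Hypothetical toy "embeddings"; real ones have hundreds of dimensions.
    docs = {
        "revenue": [1.0, 0.0, 0.0],
        "costs": [0.0, 1.0, 0.0],
        "mixed": [1.0, 1.0, 0.0],
    }
    print(top_k([1.0, 0.1, 0.0], docs))  # ['revenue', 'mixed']
```

The retrieved passages would then be placed into the prompt of whichever model is being evaluated.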

“🤖 Excited that Command R+ is the top open-weights model on Chatbot Arena! 🛠️🗺️ This doesn’t even assess RAG, tool use, and multilingual capabilities where ⌘R+ does well. 🛝You can try out ⌘R+ on the playground ( https://twitter.com/seb_ruder/status/1777671882205962471

Mistral

“French AI startup Mistral released Mixtral 8×22B, a powerful new frontier LLM, dropped quietly via a 281GB file on X available for download. The LLM features a 65,000-token context window, 176B parameters and is expected to surpass the previous Mixtral. https://twitter.com/rowancheung/status/1778271328983822607

“New Mixtral 8x22B runs nicely in MLX on an M2 Ultra. 4-bit quantized model in the 🤗 MLX Community: https://twitter.com/awnihannun/status/1778054275152937130

“New open model from @MistralAI! 🧠 Yesterday night, Mistral released Mixtral 8x22B a 176B MoE via magnet link. 🔗🤯 What we know so far: 🧮 176B MoE with ~40B active 📜 context length of 65k tokens. 🪨 Base model can be fine-tuned 👀 ~260GB VRAM in fp16, 73GB in int4 📜 Apache https://twitter.com/_philschmid/status/1778051363554934874

“Apparently the new Mistral model beats Claude Sonnet and is a tad bit worse than GPT-4 In a couple of months, the open source community will fine tune it to beat GPT-4 This is a fully open weights model with an Apache 2 license! I can’t believe how quickly the OSS community” / X – https://twitter.com/bindureddy/status/1778016678154211410

Grok

“Playing around with Grok-1 on a Macbook Pro M3 MAX with incredible results! Time to first token: 0.88s Speed: 2.75tok/s https://twitter.com/cstanley/status/1778651443336982535

Other Open Source News

“MyShell is making decentralized AI reality — We train LLaMA2-level LLMs cheaper than @Meta. Introducing JetMoE, our open-source research with MIT, Princeton, and Lepton AI. @MIT_CSAIL @Princeton @LeptonAI No more mega budgets needed, JetMoE to achieve top LLMs with $0.1M ⏩ https://twitter.com/myshell_ai/status/1777878773716984317

“🏠 Welcome to the Qwen1.5 family, the new dense model member, Qwen1.5-32B! This model has shown competitive performance comparable to the 72B model, especially impressing in language understanding, multilingual support, coding and mathematical abilities. But beyond that, https://twitter.com/huybery/status/1776255803282088056

“We just release CodeGemma, a new version of the Gemma line of models fine-tuned on code generation and completion, that achieves state-of-the-art results. Available in sizes 2B and 7B. https://twitter.com/fchollet/status/1777715491550994732

“Researchers from Princeton NLP developed SWE-agent, an open-source project that turns GPT-4 into an AI software engineering agent. The agent achieved accuracy similar to Devin and resolved 12.29% of issues autonomously. https://twitter.com/adcock_brett/status/1777004183847145973

Qwen1.5-32B: Fitting the Capstone of the Qwen1.5 Language Model Series | Qwen – https://qwenlm.github.io/blog/qwen1.5-32b/

“New open LLM from @Alibaba_Qwen! Qwen1.5 32B is a new multilingual dense LLM with a context of 32k, outperforming Mixtral on the open LLM Leaderboard! 🌍🚀 TL;DR 🧮 32B with 32k context size 💬 Chat model used DPO for preference training 📜 Custom License, commercially useable https://twitter.com/_philschmid/status/1776257496547561805

“Introducing: Zephyr 141B-A35B 🥁 🔥Mixtral-8x22B fine-tune 🤯 Using ORPO: new alignment algorithm (no SFT, open ) 🚀 With 7k instances of (open) data Very strong IFEval, BBH, AGIEval… Enjoy! 🤗 https://twitter.com/osanseviero/status/1778430866718421198


Be Sure To Read This Week’s Main Post:

This week’s executive overview and top links are here:

AI News #28: Week Ending 04/12/2024 with Executive Summary and Top 48 Links

The post you just read is a deep-dive extension of my weekly newsletter, This Week In AI, an executive summary of the top things to know in AI. Each week, I write an accessible overview so that laypeople can feel conversant with the week’s AI developments. I include a curated list of the week’s must-click links, giving everyone a hands-on way to explore the most intriguing updates in artificial intelligence across categories including robotics, imagery, video, AR/VR, science, ethics, and more. Beyond the overview, I post these topic-based deeper dives (below). If you haven’t read this week’s overview, I recommend starting there.

Credits/Sources

Most of these weekly links come from just a few prolific oversharing sources. Please follow them, as they work hard to find the news each week and they make it a lot easier for me to compile.