Databricks

Introducing DBRX: A New State-of-the-Art Open LLM | Databricks – https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm

“Introducing DBRX: A New Standard for Open LLM 🔔” – https://twitter.com/vitaliychiley/status/1772958872891752868

“At Databricks, we’ve built an awesome model training and tuning stack. We now used it to release DBRX, the best open source LLM on standard benchmarks to date, exceeding GPT-3.5 while running 2x faster than Llama-70B.” – https://twitter.com/matei_zaharia/status/1772972271721763199

“Meet DBRX, a new sota open llm from @databricks. It’s a 132B MoE with 36B active params trained from scratch on 12T tokens. It sets a new bar on all the standard benchmarks, and – as an MoE – inference is blazingly fast. Simply put, it’s the model your data has been waiting for.” – https://twitter.com/jefrankle/status/1772961586497425683

“Took a look at @databricks’s new open source 132 billion model called DBRX! 1) Merged attention QKV clamped betw (-8, 8) 2) Not RMS Layernorm – now has mean removal unlike Llama 3) 4 active experts / 16. Mixtral 2/8 experts. 4) @OpenAI’s TikToken tokenizer 100K. Llama splits…” – https://twitter.com/danielhanchen/status/1772981050530316467?s=46&t=6FDPaNxZcbSsELal6Sv7Ug

“Just $10M and two months to train from scratch a GPT3.5 – Llama2 level model. For context, it probably cost 10-20x more to OAI just a year ago! The more we improve as a field thanks to open-source, the cheaper & more efficient it gets! All companies should now train their own…” – https://twitter.com/ClementDelangue/status/1773019321511313467?s=20

Why the AI Hyperrealists at Databricks Spent $10 Million to Beat Meta’s LLM — The Information – https://www.theinformation.com/articles/why-the-ai-hyperrealists-at-databricks-spent-10-million-to-beat-metas-llm 

Mistral

“Mistral just announced at @SHACK15sf that they will release a new model today: Mistral 7B v0.2 Base Model – 32k instead of 8k context window – Rope Theta = 1e6 – No sliding window” – https://twitter.com/marvinvonhagen/status/1771609042542039421

“Mistral AI is what OpenAI would be if it were actually open. And they just threw the largest OSS LLM hackathon to date. Over 2000 hackers applied to compete for $10k in prizes. Here’s what we saw at the @MistralAI x @cerebral_valley hackathon (🧵):” – https://twitter.com/AlexReibman/status/1772164601666314316

“We designed a more challenging task to test the models’ in-context recall capability. It turns out that such a simple task for any human is still giving LLMs a hard time. Mistral 7B (0.2, 32k ctx) and Mixtral completely failed at only 2500 or 5000 tokens. Github Code coming soon.” – https://twitter.com/hu_yifei/status/1772610997166952720?s=20

Other Open Source News

“Today I am launching pre-orders for Compass – a $99 open-source guide. – 30 hours battery life – learns by transcribing your conversations – revisit important moments in your life – shipping first orders next week – iOS and Android Pre-order link below 👇” – https://twitter.com/ItsMartynasK/status/1771890769865187648

Modular: The Next Big Step in Mojo🔥 Open Source – https://www.modular.com/blog/the-next-big-step-in-mojo-open-source

Introducing Jamba – https://www.ai21.com/jamba 

“Introducing Jamba, our groundbreaking SSM-Transformer open model! As the first production-grade model based on Mamba architecture, Jamba achieves an unprecedented 3X throughput and fits 140K context on a single GPU. 🥂Meet Jamba” – https://twitter.com/AI21Labs/status/1773350888427438424?s=20

Qwen1.5-MoE: Matching 7B Model Performance with 1/3 Activated Parameters | Qwen – https://qwenlm.github.io/blog/qwen-moe/ 

Introducing Jamba: AI21’s Groundbreaking SSM-Transformer Model – https://www.ai21.com/blog/announcing-jamba 

Be Sure To Read This Week’s Main Post:

This week’s executive overview and top links are here:

AI News #25: Week Ending 03/22/2024 with Executive Summary and Top 55 Links

The post you just read is a deep-dive extension of my weekly newsletter, This Week In AI, an executive summary of the top things to know in AI. Each week, I create an accessible overview so that laypeople can feel confident they are conversant with the week’s AI developments. I include a curated list of must-click links of the week, offering everyone a hands-on opportunity to explore the most intriguing updates in artificial intelligence across various categories, including robotics, imagery, video, AR/VR, science, ethics, and more. Beyond the overview, I post these topic-based deeper dives (below). If you haven’t read this week’s overview, I recommend starting there.

Credits/Sources

Most of these weekly links come from just a few prolific oversharing sources. Please follow them, as they work hard to find the news each week and they make it a lot easier for me to compile.

For previous issues, please visit the archives!

Thanks for reading!
