A fashion photoshoot of a runway look inspired by blueprints. A large screen displays the word “Open Source” –ar 4:3 –style raw

Cohere

Nvidia and Salesforce double down on AI startup Cohere in $450 million round, source says | Reuters

https://www.reuters.com/technology/nvidia-salesforce-double-down-ai-startup-cohere-450-million-round-source-says-2024-06-04

“We’re excited to launch our startup program to empower early-stage AI innovation solving real world business challenges! Apply here:

We’re excited to launch our startup program to empower early-stage AI innovation solving real world business challenges!

Apply here: https://t.co/kGSKAicWWC https://t.co/pJBm2tWyUa
— cohere (@cohere) June 6, 2024

“We’ve released a library of useful cookbooks to help you get started building powerful AI applications, like agents, with RAG and semantic search, with Cohere’s frontier models to tackle enterprise-grade workloads! Find them here:

We’ve released a library of useful cookbooks to help you get started building powerful AI applications, like agents, with RAG and semantic search, with Cohere’s frontier models to tackle enterprise-grade workloads!

Find them here: https://t.co/OPQTjq8GlN pic.twitter.com/Y37sgGJcuJ
— cohere (@cohere) June 5, 2024

Grok

“xAI’s Grok is rumored to be receiving two new modes, Socrates and D.E.I. This adds to the current selections of Normal and Fun to augment the AI chatbot’s responses

xAI's Grok is rumored to be receiving two new modes, Socrates and D.E.I.

This adds to the current selections of Normal and Fun to augment the AI chatbot’s responseshttps://t.co/I7OppidrJu
— Rowan Cheung (@rowancheung) June 3, 2024

“NEWS: xAI might be working on ‘Socrates’ mode for Grok.

NEWS: xAI might be working on ‘Socrates’ mode for Grok. pic.twitter.com/9hNlGh91hC
— X Daily News (@xDaily) June 2, 2024

Hugging Face

“Thanks Jensen. Now you can use @nvidia NIM directly from the @huggingface hub for Llama3. Very excited to see that Hugging Face is becoming the gateway for AI compute!

Thanks Jensen. Now you can use @nvidia NIM directly from the @huggingface hub for Llama3.

Very excited to see that Hugging Face is becoming the gateway for AI compute! pic.twitter.com/C78FBxtoBM
— clem 🤗 (@ClementDelangue) June 4, 2024

“Optimum-NVIDIA from @nvidia – By changing just a single line of code, you can unlock up to 28x faster inference and 1,200 tokens/second on the NVIDIA platform. 🔥 📌 Optimum-NVIDIA is the first Hugging Face inference library to benefit from the new float8 format supported on

Optimum-NVIDIA from @nvidia – By changing just a single line of code, you can unlock up to 28x faster inference and 1,200 tokens/second on the NVIDIA platform. 🔥

📌 Optimum-NVIDIA is the first Hugging Face inference library to benefit from the new float8 format supported on… pic.twitter.com/CLuGbgkhxC
— Rohan Paul (@rohanpaul_ai) June 4, 2024

“Robotics + open source = <3 Hugging Face and Pollen Robotics show off first project: an open source robot that does chores

Robotics + open source = <3

Hugging Face and Pollen Robotics show off first project: an open source robot that does chores https://t.co/NQJGSPJzZs by @carlfranzen
— Florent Daudens (@fdaudens) June 7, 2024

“MMUPD is now hosted on @huggingface Hub as a leaderboard 🏆 amazing to see LLaVA-1.6 is outperforming proprietary models on many subtasks 🤩 link in the next one!

MMUPD is now hosted on @huggingface Hub as a leaderboard 🏆
amazing to see LLaVA-1.6 is outperforming proprietary models on many subtasks 🤩

link in the next one! https://t.co/zTYIM0OTlz pic.twitter.com/gvBXKcxHS1
— merve (@mervenoyann) June 5, 2024

“Summer blues? Cheer yourself (or a friend) up with the ComplimentBot 💖 I bootstrapped this app with the @Gradio javascript client and @huggingface ZeroGPU. If you’re building with AI, keep Gradio and Hugging Face Hub/spaces close by in your toolbox!

Summer blues? Cheer yourself (or a friend) up with the ComplimentBot 💖

I bootstrapped this app with the @Gradio javascript client and @huggingface ZeroGPU. If you're building with AI, keep Gradio and Hugging Face Hub/spaces close by in your toolbox!https://t.co/EB37omc6Hd pic.twitter.com/2FZK3OcJMi
— Freddy A Boulton (@freddy_alfonso_) June 27, 2024

FineWeb: decanting the web for the finest text data at scale – a Hugging Face Space by HuggingFaceFW

https://huggingface.co/spaces/HuggingFaceFW/blogpost-fineweb-v1

“Hugging Face Embedding Container for @awscloud SageMaker is now generally available! Improve and simplify your embedding creation for Retrieval-Augmented Generation (RAG) applications. 🚀 What’s new 🆕: 🔥 Supports popular architectures: BERT, RoBERTa, XLM-RoBERTa, @nomic_ai

Hugging Face Embedding Container for @awscloud SageMaker is now generally available! Improve and simplify your embedding creation for Retrieval-Augmented Generation (RAG) applications. 🚀

What's new 🆕:
🔥 Supports popular architectures: BERT, RoBERTa, XLM-RoBERTa, @nomic_ai… pic.twitter.com/JW7hffiQCx
— Philipp Schmid (@_philschmid) June 7, 2024

“Qwen2 released! 5 Sizes and multilingual in 29 languages. Check out on @huggingface! Blog:

Qwen2 released! 5 Sizes and multilingual in 29 languages. Check out on @huggingface!

Blog: https://t.co/r26LXnHtdL
72B Chat-Demo: https://t.co/EUjHztV12V
Models: https://t.co/VWSoVtLd2L

Kudos to the @Alibaba_Qwen for adopting Apache 2.0!
— Philipp Schmid (@_philschmid) June 6, 2024

Meta/Llama

“Thanks Jensen. Now you can use @nvidia NIM directly from the @huggingface hub for Llama3. Very excited to see that Hugging Face is becoming the gateway for AI compute!

Thanks Jensen. Now you can use @nvidia NIM directly from the @huggingface hub for Llama3.

Very excited to see that Hugging Face is becoming the gateway for AI compute! pic.twitter.com/C78FBxtoBM
— clem 🤗 (@ClementDelangue) June 4, 2024

“Last Week: Groq exceeded 30,000 Tokens / second input rate on Llama3 8B❗️ This Week: Llama3 70B at 40,792 Tokens/s input rate‼️ – FP16 Multiply, FP32 Accumulate – 7989 tokens in – full Llama context length Next Week: …? 😮

Last Week: Groq exceeded 30,000 Tokens / second input rate on Llama3 8B❗️

This Week: Llama3 70B at 40,792 Tokens/s input rate‼️
– FP16 Multiply, FP32 Accumulate
– 7989 tokens in – full Llama context length

Next Week: …? 😮 pic.twitter.com/rIijD2Is76
— Jonathan Ross (@JonathanRoss321) June 6, 2024

“A GPT-4 level chatbot, available to use completely free, running at over 800 tokens per second on Groq. I’m genuinely mindblown by LlaMA 3. Try it with the link in the next tweet.

A GPT-4 level chatbot, available to use completely free, running at over 800 tokens per second on Groq.

I'm genuinely mindblown by LlaMA 3.

Try it with the link in the next tweet. pic.twitter.com/kSnXqY9HFk
— Rowan Cheung (@rowancheung) April 20, 2024

“Put this mind bending achievement in perspective: @GroqInc runs Llama 70b in lossless precision on ~4 Wikipedia articles in quite literally the blink of an eye. – A 70B model in 16-bit precision with 32-bit accumulation (loss-less). – Processing ~8000 tokens in 0.2 seconds (or”

Put this mind bending achievement in perspective:@GroqInc runs Llama 70b in lossless precision on ~4 Wikipedia articles in quite literally the blink of an eye.

– A 70B model in 16-bit precision with 32-bit accumulation (loss-less).
– Processing ~8000 tokens in 0.2 seconds (or… https://t.co/N1arORIsf5
— Awni Hannun (@awnihannun) June 6, 2024

Mistral

“Run Mixtral with TensorRT-LLM with Parallelism Modes ✨ 📌 Mixture of Experts supports two parallelism modes, these are Expert Parallelism (EP) and Tensor Parallelism (TP). 📌 In TP mode (default) expert weight matrices are sliced evenly between all GPUs, so that all GPUs work

Run Mixtral with TensorRT-LLM with Parallelism Modes ✨

📌 Mixture of Experts supports two parallelism modes, these are Expert Parallelism (EP) and Tensor Parallelism (TP).

📌 In TP mode (default) expert weight matrices are sliced evenly between all GPUs, so that all GPUs work… pic.twitter.com/hAAbWZk95g
— Rohan Paul (@rohanpaul_ai) June 3, 2024

“Mistral fine-tuning API is out ! You can now fine-tune your own Mistral models and deploy them efficiently on La Plateforme :

Mistral fine-tuning API is out !

You can now fine-tune your own Mistral models and deploy them efficiently on La Plateforme : https://t.co/YvDL8jZRQb

In many cases, fine-tuning allows small models to match (and sometimes surpass) the performance of much larger models, but with…
— Guillaume Lample @ ICLR 2024 (@GuillaumeLample) June 5, 2024

“Breaking: First live demo of the @MistralAI fine-tuning API (released a few hours ago) is here: @sophiamyang walks through: – How to prep & validate your data – Hyper params – The fine-tuning API – Integrations (W&B, etc) – A treasure trove of collab notebooks & docs

Breaking: First live demo of the @MistralAI fine-tuning API (released a few hours ago) is here:@sophiamyang walks through:

– How to prep & validate your data
– Hyper params
– The fine-tuning API
– Integrations (W&B, etc)
– A treasure trove of collab notebooks & docs pic.twitter.com/ctzW3YHxdB
— Hamel Husain (@HamelHusain) June 5, 2024

My Tailor is Mistral | Mistral AI | Frontier AI in your hands

https://mistral.ai/news/customization

Phi

“Introducing Phi-3 WebGPU, a private and powerful AI chatbot that runs locally in your browser, powered by 🤗 Transformers.js and onnxruntime-web! 🔒 On-device inference: no data sent to a server ⚡️ WebGPU-accelerated (> 20 t/s) 📥 Model downloaded once and cached Try it out! 👇

Introducing Phi-3 WebGPU, a private and powerful AI chatbot that runs locally in your browser, powered by 🤗 Transformers.js and onnxruntime-web!

🔒 On-device inference: no data sent to a server
⚡️ WebGPU-accelerated (> 20 t/s)
📥 Model downloaded once and cached

Try it out! 👇 pic.twitter.com/Y79fTIghv7
— Xenova (@xenovacom) May 8, 2024

“Phi-3 Medium (14B) and Small (7B) models are on the @lmsysorg leaderboard! 😍 Medium ranks near GPT-3.5-Turbo-0613, but behind Llama 3 8B. Phi-3 Small is close to Llama-2-70B, and Mistral fine-tunes. This proves that we cannot purely optimize for academic benchmarks. We need to

Phi-3 Medium (14B) and Small (7B) models are on the @lmsysorg leaderboard! 😍 Medium ranks near GPT-3.5-Turbo-0613, but behind Llama 3 8B. Phi-3 Small is close to Llama-2-70B, and Mistral fine-tunes.

This proves that we cannot purely optimize for academic benchmarks. We need to… pic.twitter.com/f6EDuqW3cI
— Philipp Schmid (@_philschmid) June 3, 2024

“Truly impressed by this: – real-time speech recognition – running locally in your browser – which means total confidentiality — 0 data sent to anyone – 100 different languages

Truly impressed by this:
– real-time speech recognition
– running locally in your browser
– which means total confidentiality — 0 data sent to anyone
– 100 different languageshttps://t.co/gy20qUe512 https://t.co/2e2JXkGKJL pic.twitter.com/DGS7w4f7Fs
— Florent Daudens (@fdaudens) June 7, 2024

Qwen

“Qwen2 released! 5 Sizes and multilingual in 29 languages. Check out on @huggingface! Blog:

Qwen2 released! 5 Sizes and multilingual in 29 languages. Check out on @huggingface!

Blog: https://t.co/r26LXnHtdL
72B Chat-Demo: https://t.co/EUjHztV12V
Models: https://t.co/VWSoVtLd2L

Kudos to the @Alibaba_Qwen for adopting Apache 2.0!
— Philipp Schmid (@_philschmid) June 6, 2024

“Qwen2 is the most impactful open LLM release since Llama 3! @Alibaba_Qwen just released their new multilingual model family, outperforming @AIatMeta Llama 3 🤯 Qwen2 comes in 5 sizes and is trained in 29 languages, achieving state-of-the-art performance across academic and chat

Qwen2 is the most impactful open LLM release since Llama 3! @Alibaba_Qwen just released their new multilingual model family, outperforming @AIatMeta Llama 3 🤯 Qwen2 comes in 5 sizes and is trained in 29 languages, achieving state-of-the-art performance across academic and chat… pic.twitter.com/A68BTx8ley
— Philipp Schmid (@_philschmid) June 6, 2024

“💗Hello Qwen2! Happy to share the Qwen2 models to you all! 📖 BLOG:

💗Hello Qwen2!

Happy to share the Qwen2 models to you all!

📖 BLOG: https://t.co/0UNwRo1Iea
🤗 HF collection: https://t.co/z6oWkw7Kzb
🤖 https://t.co/Bp56AqQpQJ
💻 GitHub: https://t.co/sEIRe4IDBJ

We have base and Instruct models of 5 sizes, Qwen2-0.5B, Qwen2-1.5B, Qwen2-7B,… pic.twitter.com/y5HAu8HcTH
— Junyang Lin (@JustinLin610) June 6, 2024

Hello Qwen2 | Qwen

https://qwenlm.github.io/blog/qwen2

Generalizing an LLM from 8k to 1M Context using Qwen-Agent | Qwen

https://qwenlm.github.io/blog/qwen-agent-2405

“After months of efforts, we are pleased to announce the evolution from Qwen1.5 to Qwen2. This time, we bring to you: ⭐ Base and Instruct models of 5 sizes, including Qwen2-0.5B, Qwen2-1.5B, Qwen2-7B, Qwen2-57B-A14B, and Qwen2-72B. Having been trained on data in 27 additional

After months of efforts, we are pleased to announce the evolution from Qwen1.5 to Qwen2. This time, we bring to you:

⭐ Base and Instruct models of 5 sizes, including Qwen2-0.5B, Qwen2-1.5B, Qwen2-7B, Qwen2-57B-A14B, and Qwen2-72B. Having been trained on data in 27 additional… pic.twitter.com/SVVxmwwUJ8
— Binyuan Hui (@huybery) June 6, 2024

“🔩Small but Strong: Qwen2-7B-Instruct

🔩Small but Strong: Qwen2-7B-Instruct pic.twitter.com/6Nf0UYdo4r
— Binyuan Hui (@huybery) June 6, 2024

“We just got a new #1 open LLM!

We just got a new #1 open LLM! https://t.co/0fffZ8mO1H pic.twitter.com/PZ6xUlzNFK
— clem 🤗 (@ClementDelangue) June 6, 2024

Other Open Source News

“The Big AI Debate explained by Forbes. What is more beneficial or more dangerous: open source AI or proprietary controlled by 3 or 4 big players? The people who worry most about AI safety also tend to be he ones who overestimate the power of AI.”

The Big AI Debate explained by Forbes.
What is more beneficial or more dangerous: open source AI or proprietary controlled by 3 or 4 big players?
The people who worry most about AI safety also tend to be he ones who overestimate the power of AI. https://t.co/OZRWQVxR8k
— Yann LeCun (@ylecun) June 4, 2024

The future of foundation models is closed-source
if the centralizing forces of data and compute hold, open and closed-source AI cannot both dominate long-term

https://blog.johnluttig.com/p/the-future-of-foundation-models-is

“At COMPUTEX 2024, @nvidia CEO Jensen showing how Pandas code is now 50x faster on @GoogleColab after integration with RAPIDS cuDF over standard pandas. This is with zero code changes, and is available in the default runtime environment. Just add in %load-ext cudf.pandas over

At COMPUTEX 2024, @nvidia CEO Jensen showing how Pandas code is now 50x faster on @GoogleColab after integration with RAPIDS cuDF over standard pandas.

This is with zero code changes, and is available in the default runtime environment. Just add in %load-ext cudf.pandas over… https://t.co/o1ZtLlu9M5 pic.twitter.com/58IBOm5vAt
— Rohan Paul (@rohanpaul_ai) June 2, 2024

“Open-source robotics is the way 🦾”

Open-source robotics is the way 🦾 https://t.co/ANbkaV4ybZ
— hardmaru (@hardmaru) June 7, 2024

“The US is going to lose its leadership in AI if it doesn’t support more open research and open-source AI!”

The US is going to lose its leadership in AI if it doesn't support more open research and open-source AI!
— clem 🤗 (@ClementDelangue) June 7, 2024

“We have released GLM-4-520 and have the open-sourced version GLM-4-9B with superior performance beyond Llama-3-8B.

We have released GLM-4-520 and have the open-sourced version GLM-4-9B with superior performance beyond Llama-3-8B.https://t.co/MRaXbjxZGd pic.twitter.com/cMoNOG5nQ5
— jietang (@jietang) June 8, 2024

“Is Fineweb-edu the best open text dataset ever released? A big step in empowering all companies to train their own GPT5!

Is Fineweb-edu the best open text dataset ever released? A big step in empowering all companies to train their own GPT5! https://t.co/fSEngn3Eou pic.twitter.com/Z8YJRQzB7N
— clem 🤗 (@ClementDelangue) June 3, 2024

” 🤖Introducing 📺𝗢𝗽𝗲𝗻-𝗧𝗲𝗹𝗲𝗩𝗶𝘀𝗶𝗼𝗻: a web-based teleoperation software! 🌐Open source, cross-platform (VisionPro & Quest) with real-time stereo vision feedback. 🕹️Easy-to-use hand, wrist, head pose streaming. Code:

🤖Introducing 📺𝗢𝗽𝗲𝗻-𝗧𝗲𝗹𝗲𝗩𝗶𝘀𝗶𝗼𝗻: a web-based teleoperation software!

🌐Open source, cross-platform (VisionPro & Quest) with real-time stereo vision feedback.

🕹️Easy-to-use hand, wrist, head pose streaming.

Code: https://t.co/3lu5ZTMNfA pic.twitter.com/GVJTb8Fxyl
— Xuxin Cheng (@xuxin_cheng) April 26, 2024

“The “weight” is nearly over! Today, at @ComputexTaipei, our Co-CEO, @chrlaf, officially announced the open release date of Stable Diffusion 3 Medium for June 12th. 🔗Sign up to the waitlist to be the first to know when the model releases:

The “weight” is nearly over! Today, at @ComputexTaipei, our Co-CEO, @chrlaf, officially announced the open release date of Stable Diffusion 3 Medium for June 12th.

🔗Sign up to the waitlist to be the first to know when the model releases: https://t.co/NmplCeKuQB pic.twitter.com/Xe5VxBI1ET
— Stability AI (@StabilityAI) June 3, 2024

Stable Audio Open — Stability AI

https://stability.ai/news/introducing-stable-audio-open

What I learned from looking at 900 most popular open source AI tools

https://huyenchip.com/2024/03/14/ai-oss.html

An entirely open-source AI code assistant inside your editor · Ollama Blog

https://ollama.com/blog/continue-code-assistant

One clear, sad conclusion that we have to draw from our AIW study (https://arxiv.org/abs/2406.02061) is that all current SOTA open-weight LLMs that claim strong performance (eg Command-R+, Mistral, Dbrx-Instruct, Llama 3, Qwen, etc) are in fact seriously flawed in simple basic reasoning.

One clear, sad conclusion that we have to draw from our AIW study (https://t.co/nHlKJFQuRn) is that all current SOTA open-weight LLMs that claim strong performance (eg Command-R+, Mistral, Dbrx-Instruct, Llama 3, Qwen, etc) are in fact seriously flawed in simple basic reasoning.
— Jenia Jitsev 🏳️‍🌈 🇺🇦 (@JJitsev) June 7, 2024

“New open benchmark released with 96% correlation to @lmsysorg Chatbot Arena with < $1 to run. 🤯 MixEval & MixEval-Hard combines existing benchmarks with real-world user queries from the web to close the gap between academic and real-world use! 👀 MixEval-Hard is a challenging

New open benchmark released with 96% correlation to @lmsysorg Chatbot Arena with < $1 to run. 🤯
MixEval & MixEval-Hard combines existing benchmarks with real-world user queries from the web to close the gap between academic and real-world use! 👀

MixEval-Hard is a challenging… pic.twitter.com/M9FqGtbSp4
— Philipp Schmid (@_philschmid) June 7, 2024

“The future of AI glasses is normal looking, light weight and affordable – meet Frame, AI Glasses by @brilliantlabsAR It is shipping to hackers and creators already. Frame is open-source platform with mic, camera, AR display. It leverages your phone (connectivity & audio) and

The future of AI glasses is normal looking, light weight and affordable – meet Frame, AI Glasses by @brilliantlabsAR

It is shipping to hackers and creators already. Frame is open-source platform with mic, camera, AR display. It leverages your phone (connectivity & audio) and… pic.twitter.com/mRcCZBglHX
— Sander Saar (@sandersaar) June 7, 2024

“there we go. please forgo your naive belief that Meta (or any company) will open source powerful AI. we are inevitably headed towards AGI monarchy.”

there we go.
please forgo your naive belief that Meta (or any company) will open source powerful AI.
we are inevitably headed towards AGI monarchy. https://t.co/RFXGhkqESh
— Far El (@far__el) June 5, 2024

This week’s executive overview and top links are here:

AI News #36: Week Ending 06/07/2024 with Executive Summary and Top 40 Links

The post you just read is an deep dive extension of my weekly newsletter, This Week In AI, an executive summary of the top things to know in AI. Each week, I create an accessible overview for laypeople to feel confident they are conversant with the week’s AI developments. I include a curated list of must-click links of the week, to offer everyone a hands-on opportunity to explore the most intriguing updates in artificial intelligence across various categories, including robotics, imagery, video, AR/VR, science, ethics, and more. Beyond the overview, I post these topic-based deeper dives (below). If you haven’t read this week’s overview, I recommend starting there.

Credits/Sources

Most of these weekly links come from just a few prolific oversharing sources. Please follow them, as they work hard to find the news each week and they make it a lot easier for me to compile.