Image created with OpenAI GPT-Image-1. Image prompt: vintage Sly & the Family Stone album-cover style, electric-blue monochrome photo with overlay of daisies and the word LIFE featuring letter-Q circuit ring emitting light; grainy retro print texture, vibrant 60s funk color palette, high-resolution
Qwen2.5-VL is such a great and versatile model that every frontier lab is building on it these days; new agentic models, GUI models, and more are all based on it. @Alibaba_Qwen you’re the best 💗 https://x.com/i/web/status/1929488866748092881
The latest mlx-lm has a new dynamic quantization method (made with @angeloskath). It consistently results in better model quality with no increase in size. Some perplexity results (lower is better) for a few Qwen3 base models: https://x.com/i/web/status/1929633379504493048
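For reference, perplexity (the metric quoted above) is just the exponential of the average per-token negative log-likelihood, so lower means the model assigns higher probability to the text. A minimal sketch of the computation; the log-probabilities below are illustrative, not the quoted Qwen3 results:

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(mean negative log-likelihood per token)."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Illustrative natural-log probabilities for a 4-token sequence.
logprobs = [math.log(0.5), math.log(0.25), math.log(0.5), math.log(0.25)]
print(perplexity(logprobs))  # ≈ 2.828, i.e. sqrt(8)
```

A quantization method that keeps perplexity close to the full-precision model at the same size, as the dynamic quants claim, is a straightforward win.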
Pretty impressive 7B VLM coming out of Xiaomi 🤓 ViT encoder w/ MLP projector, powered by their 7B text backbone. Compatible w/ the Qwen VL arch, so it works across vLLM, Transformers, SGLang, and llama.cpp. Bonus: it can reason and is MIT licensed 🔥 https://x.com/reach_vb/status/1928360066467439012
DeepSeek-R1 Qwen3 8B knows it’s overthinking it 😂 https://x.com/awnihannun/status/1928119439737729482
A full set of new and improved Qwen3 4-bit DWQ quants are on Hugging Face MLX Community: https://x.com/i/web/status/1929601108210835931
Why are almost all RL experiments done on Qwen models? Kind of interesting, right… https://x.com/abacaj/status/1927948317931000277
The 4-bit DWQ of DeepSeek-R1 Qwen3 8B is up on HF. Use the command below or use it in @lmstudio: https://x.com/awnihannun/status/1928125690173383098
⚠️⚠️⚠️ The Qwen team has worked on training pivot tokens ⚠️⚠️⚠️ @_xjdr @doomslide Amusingly, they *do* test it on Llama 3.1 as well, but find it performs so poorly that no conclusive results can be had without a cold start on Qwen data. https://x.com/i/web/status/1929755590404055358
Seems like no one saw this either; scraping arXiv manually seems to be the way. Pretty cool paper on RL for creative writing with Qwen3 32B base, and most interestingly it’s from one author on the Star Writing Team (haven’t heard of them). They seem to have access to the 32B base, though. https://x.com/i/web/status/1929996614883783170