Image created with gemini-3.1-flash-image-preview, with the prompt written by claude-sonnet-4-5. Image prompt: Wide shot of solitary young Chinese person at cluttered desk with old monitors in dim internet café, weathered concrete walls, flat fluorescent light, muted desaturated colors, fire horse visible through dusty window in background courtyard, observational realism style, large white Chinese cinema poster text reading QWEN overlaid on upper portion, documentary stillness, Jia Zhangke aesthetic, river-gray and concrete tones

You can now run Qwen3.5 locally! 💜 Qwen3.5-397B-A17B is an open MoE vision reasoning LLM for agentic coding & chat. It performs on par with Gemini 3 Pro, Claude Opus 4.5 & GPT-5.2. Runs in 4-bit on a 256GB Mac or in 256GB of RAM. Guide: https://t.co/wjS1lMnbNp GGUF: https://x.com/UnslothAI/status/2023338222601064463

🎉 Congrats to @Alibaba_Qwen on releasing Qwen3.5 on Chinese New Year’s Eve — day-0 support is ready in vLLM! Qwen3.5 is a multimodal MoE with Gated Delta Networks architecture — 397B total params, only 17B active. What makes it interesting for inference: 🧠 Gated Delta… https://x.com/vllm_project/status/2023341059343061138

🔥 Alibaba’s Qwen 3.5 just dropped — and Zhihu is dissecting it. Here’s a sharp breakdown from Zhihu contributor toyama nao 👇 🏆 Verdict: “The spearhead of the open-source elite.” 📊 Big picture Tongyi Lab’s pattern: new mid-size model leapfrogs old giant. • Last cycle: 80B… https://x.com/ZhihuFrontier/status/2024176484232155236

Qwen https://qwen.ai/blog?id=qwen3.5

Qwen3.5 is Live! Today we open-weight the first model, Qwen3.5-397B-A17B, which is a native multimodal model supporting both thinking and non-thinking modes. We have strengthened its coding and agentic capabilities to foster productivity for developers and enterprises. Hope you… https://x.com/JustinLin610/status/2023332446713070039

Alibaba Yunqi: 7 models released in 4 days (Qwen3-Max, Qwen3-Omni, Qwen3-VL) and $52B roadmap | AINews https://news.smol.ai/issues/25-09-23-alibaba-yunqi

Alibaba’s new Qwen3.5-397B-A17B is the #3 open weights model in the Artificial Analysis Intelligence Index — a significant upgrade from Qwen3-235B-A22B-2507, and achieved with fewer active parameters than leading peers. Qwen3.5-397B-A17B is the first model released by Alibaba… https://x.com/ArtificialAnlys/status/2023794497055060262

Qwen https://qwen.ai/blog?id=qwen3.5#spatial-intelligence

Qwen3.5’s thinking is downright excessive. https://x.com/QuixiAI/status/2023995215690781143

Oof, SWE-rebench is brutal for recent Chinese releases. M2.5 reported 80.2% on SWE-bench Verified against 80.8% for Opus 4.6, but it seriously underperforms here. Qwen3-Coder-Next looks good with 40% and only 80B total / 3B active parameters. https://x.com/maximelabonne/status/2022401174549512576

🚀 Qwen Coding Plan is now live on Alibaba Cloud Model Studio! ✨ What you get: • 🔥 Latest Qwen3.5-Plus models • 💡 Fixed monthly subscription: from ~$10/mo (Lite) or ~$50/mo (Pro) • 📦 Up to 90K requests/month for AI-powered coding • 🔌 Works with Claude Code, Qwen Code, … https://x.com/Alibaba_Qwen/status/2024136381308805564

Ouch, the pricing on Alibaba just hurts. You can get the larger Kimi-K2.5 and GLM-5 for less. https://x.com/scaling01/status/2023346718377406840

Qwen3.5-397B-A17B SVG results: I have seen better. DeepSeek-V3.2 and GLM-5 both beat it. https://x.com/scaling01/status/2023364296277721300

🚀 Qwen3.5-397B-A17B is here: The first open-weight model in the Qwen3.5 series. 🖼️Native multimodal. Trained for real-world agents. ✨Powered by hybrid linear attention + sparse MoE and large-scale RL environment scaling. ⚡8.6x-19.0x decoding throughput vs Qwen3-Max 🌍201… https://x.com/Alibaba_Qwen/status/2023331062433153103

Happy Chinese New Year!! What a week for open-source LLMs: > Qwen-3.5 > GLM-5 > MiniMax-M2.5 Are we just waiting on DeepSeek-V4 now? Also I’m hoping a US lab steps up with a true frontier open-source model. https://x.com/Yuchenj_UW/status/2023453819938763092

Qwen 3.5 goes bankrupt on Vending-Bench 2. https://x.com/andonlabs/status/2023450768406364238

So, a new repo full of MLX-LM-LoRA examples to train your own LLM on Apple Silicon, fast and efficient at ultra-long context lengths: Fine-tune Qwen3 4B Instruct on 32K context: https://t.co/yGZlR59fHD Train @IBMResearch Granite 350M model on RL-GRPO Reasoning: … https://x.com/ActuallyIsaak/status/2022414004623479014

🚀 Qwen3.5-397B-A17B-FP8 weights are now open! It took some time to adapt the inference frameworks, but here we are: ✅ SGLang support is merged 🔄 vLLM PR submitted → https://t.co/rJkuitOBWs Check the model card for example code. vLLM support landing in the next couple of days! https://x.com/Alibaba_Qwen/status/2024161147537232110

🚩Cerebras’s MiniMax-M2 GGUF 2-bit model: https://t.co/udlviJQZqQ Qwen3-Coder-Next INT4 model: … https://x.com/HaihaoShen/status/2022293472796180676

A clarification of Qwen3.5 Plus and 397B: 1. For open source, we follow the tradition of making parameters apparent, so we use names with the number of total and active params. 2. Qwen3.5-Plus is a hosted API version of 397B. As the model natively supports 256K tokens, … https://x.com/JustinLin610/status/2023340126479569140

It’s Qwen 3.5 day today! 🥳 State of the art 800 GB model. Runs _locally_ with MLX using Q4, taking 225 GB of RAM. https://x.com/pcuenq/status/2023369902011121869
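The "800 GB model in 225 GB of RAM" claim checks out with simple arithmetic. A quick Python sketch, assuming ~4.5 effective bits per weight for a 4-bit group-quantized format (a typical figure for Q4-style quants including scales, not taken from the MLX release itself):

```python
# Back-of-the-envelope memory footprint for Qwen3.5-397B-A17B.
total_params = 397e9  # total parameters (MoE: all experts stored, only 17B active)

bf16_gb = total_params * 2 / 1e9      # BF16: 2 bytes per weight -> ~794 GB ("800 GB")
q4_gb = total_params * 4.5 / 8 / 1e9  # ~4.5 bits/weight incl. quant scales (assumption)

print(f"BF16: ~{bf16_gb:.0f} GB, Q4: ~{q4_gb:.0f} GB")  # Q4 lands near the 225 GB quoted
```

Note that an MoE model needs RAM for all 397B parameters even though only 17B are active per token, which is why a 256GB machine is the floor here.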

Let’s do the KV cache math for Qwen3.5:
– KV heads: 2
– Head dimension: 256
– Gated attention layers: 15
– Bytes per element (BF16): 2
2 × 256 × 15 × 2 = 15,360 bytes. This is the same for K and V, so we multiply by 2: 30,720 bytes — roughly 31 kB per token of context. Meaning at max … https://x.com/bnjmn_marie/status/2023424404504342608
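The arithmetic in that thread can be reproduced in a few lines of Python. This is a sketch using only the figures quoted in the tweet (2 KV heads, head dim 256, 15 full-attention layers); those numbers are taken from the thread, not verified against the released model config:

```python
# KV cache size per token for Qwen3.5, per the figures quoted above.
kv_heads = 2        # grouped-query KV heads
head_dim = 256      # dimension of each KV head
attn_layers = 15    # layers with standard attention; the hybrid linear-attention
                    # layers keep no per-token KV cache, hence the small total
bytes_per_elem = 2  # BF16

k_bytes = kv_heads * head_dim * attn_layers * bytes_per_elem  # 15,360 bytes for K
kv_bytes_per_token = 2 * k_bytes                              # K and V: 30,720 bytes

print(kv_bytes_per_token)                     # ~31 kB per token of context
print(kv_bytes_per_token * 262_144 / 2**30)   # ~7.5 GiB at the full 256K context
```

That tiny per-token cost (most layers cache nothing) is why the mlx-lm reports below see memory use barely change as context grows.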

ollama run qwen3.5:cloud Qwen3.5-397B-A17B is the first open-weight model in the series. It’s available on Ollama’s cloud right now! Give it a try. Let’s go! 🚀🚀🚀 https://x.com/ollama/status/2023334181804069099

Qwen 3.5 Plus is now available on AI Gateway. Thanks @vercel_dev team. 🤝 Use model: ‘alibaba/qwen3.5-plus’ Try it now! https://x.com/Alibaba_Qwen/status/2024029499541909920

Qwen3.5 runs quite well in mlx-lm. Awesome that we have a frontier-level hybrid model. The context gets longer but the inference speed and memory use barely change. Here’s the Q4 generating a space invaders game on an M3 Ultra. Generated 4,120 tokens at 37.6 tok/s. https://x.com/awnihannun/status/2023462412092059679

So speaking of benchmarks, what can be said of the new open Qwen? First, it completely destroys Qwen3-VL-235B ofc, but more surprisingly it outscores Qwen3-Max-thinking. All the while it’s the same model as “Plus”. Plus just has 1M context and some more bells and whistles. https://x.com/teortaxesTex/status/2023331885402009779

The new chonky Qwen 3.5 looks pretty solid, beating their own Qwen3-Max model everywhere, and it is much better at vision benchmarks than Qwen3-235B-A22B-VL. Now what I sadly haven’t seen is anything on reasoning efficiency. https://x.com/scaling01/status/2023343368399704506

Kimi K2‑0905 and Qwen3‑Max preview: two 1T open weights models launched | AINews https://news.smol.ai/issues/25-09-05-1t-models

Discover more from Ethan B. Holland