Image created with gemini-3.1-flash-image-preview; prompt drafted with claude-sonnet-4-5. Image prompt: Wide static shot of a half-empty Chinese logistics warehouse with concrete floors and metal shelving, a fire horse standing naturally near an open loading bay door with overcast daylight streaming in, a solitary warehouse worker in orange vest walking past with tablet, desaturated colors with river-gray and rust tones, large white Chinese-style text overlay reading ALIBABA, observational realism, documentary stillness, Jia Zhangke aesthetic.
You can now run Qwen3.5 locally! 💜 Qwen3.5-397B-A17B is an open MoE vision-reasoning LLM for agentic coding & chat. It performs on par with Gemini 3 Pro, Claude Opus 4.5 & GPT-5.2. Run the 4-bit quant on a 256GB Mac (or 256GB of RAM). Guide: https://t.co/wjS1lMnbNp GGUF: https://x.com/UnslothAI/status/2023338222601064463
🎉 Congrats to @Alibaba_Qwen on releasing Qwen3.5 on Chinese New Year’s Eve — day-0 support is ready in vLLM! Qwen3.5 is a multimodal MoE with Gated Delta Networks architecture — 397B total params, only 17B active. What makes it interesting for inference: 🧠 Gated Delta… https://x.com/vllm_project/status/2023341059343061138
🔥 Alibaba’s Qwen 3.5 just dropped — and Zhihu is dissecting it. Here’s a sharp breakdown from Zhihu contributor toyama nao 👇 🏆 Verdict: “The spearhead of the open-source elite.” 📊 Big picture Tongyi Lab’s pattern: new mid-size model leapfrogs old giant. • Last cycle: 80B… https://x.com/ZhihuFrontier/status/2024176484232155236
Qwen https://qwen.ai/blog?id=qwen3.5
Qwen3.5 is Live! Today we open the weights of the first model, Qwen3.5-397B-A17B, which is a native multimodal model supporting both thinking and non-thinking modes. We have strengthened its coding and agentic capabilities to foster productivity for developers and enterprises. Hope you… https://x.com/JustinLin610/status/2023332446713070039
Alibaba Yunqi: 7 models released in 4 days (Qwen3-Max, Qwen3-Omni, Qwen3-VL) and $52B roadmap | AINews https://news.smol.ai/issues/25-09-23-alibaba-yunqi
Alibaba’s new Qwen3.5-397B-A17B is the #3 open weights model in the Artificial Analysis Intelligence Index – a significant upgrade from Qwen3-235B-A22B-2507, and achieved with fewer active parameters than leading peers. Qwen3.5-397B-A17B is the first model released by Alibaba… https://x.com/ArtificialAnlys/status/2023794497055060262
Qwen https://qwen.ai/blog?id=qwen3.5#spatial-intelligence
Qwen3.5’s thinking is downright excessive. https://x.com/QuixiAI/status/2023995215690781143
Oof, SWE-rebench is brutal for recent Chinese releases. M2.5 reported 80.2% on SWE-bench Verified against 80.8% for Opus 4.6, but it seriously underperforms here. Qwen3-Coder-Next looks good with 40% and only 80B-A3B parameters. https://x.com/maximelabonne/status/2022401174549512576
🚀 Qwen Coding Plan is now live on Alibaba Cloud Model Studio! ✨ What you get: • 🔥 Latest Qwen3.5-Plus models • 💡 Fixed monthly subscription: from ~$10/mo (Lite) or ~$50/mo (Pro) • 📦 Up to 90K requests/month for AI-powered coding • 🔌 Works with Claude Code, Qwen Code… https://x.com/Alibaba_Qwen/status/2024136381308805564
Ouch, the pricing on Alibaba just hurts. You can get the larger Kimi-K2.5 and GLM-5 for less. https://x.com/scaling01/status/2023346718377406840
Qwen3.5-397B-A17B SVG results: I have seen better. DeepSeek-V3.2 and GLM-5 both beat it. https://x.com/scaling01/status/2023364296277721300
🚀 Qwen3.5-397B-A17B is here: The first open-weight model in the Qwen3.5 series. 🖼️ Native multimodal. Trained for real-world agents. ✨ Powered by hybrid linear attention + sparse MoE and large-scale RL environment scaling. ⚡ 8.6x–19.0x decoding throughput vs Qwen3-Max 🌍 201… https://x.com/Alibaba_Qwen/status/2023331062433153103
Happy Chinese New Year!! What a week for open-source LLMs: > Qwen-3.5 > GLM-5 > MiniMax-M2.5 Are we just waiting on DeepSeek-V4 now? Also I’m hoping a US lab steps up with a true frontier open-source model. https://x.com/Yuchenj_UW/status/2023453819938763092
Qwen 3.5 goes bankrupt on Vending-Bench 2. https://x.com/andonlabs/status/2023450768406364238
A new repo full of MLX-LM-LoRA examples to train your own LLM on Apple Silicon, fast and efficient at ultra-long context lengths: Fine-tune Qwen3 4B Instruct on 32K context: https://t.co/yGZlR59fHD Train @IBMResearch Granite 350M model with RL-GRPO reasoning… https://x.com/ActuallyIsaak/status/2022414004623479014
🚀 Qwen3.5-397B-A17B-FP8 weights are now open! It took some time to adapt the inference frameworks, but here we are: ✅ SGLang support is merged 🔄 vLLM PR submitted → https://t.co/rJkuitOBWs Check the model card for example code. vLLM support landing in the next couple of days! https://x.com/Alibaba_Qwen/status/2024161147537232110
🚩 Cerebras’s MiniMax-M2 GGUF 2-bit model: https://t.co/udlviJQZqQ Qwen3-Coder-Next INT4 model: … https://x.com/HaihaoShen/status/2022293472796180676
A clarification of Qwen3.5 Plus and 397B: 1. For open source, we follow the tradition of making parameters apparent, so the name carries the total and active parameter counts. 2. Qwen3.5-Plus is a hosted API version of the 397B. As the model natively supports 256K tokens… https://x.com/JustinLin610/status/2023340126479569140
It’s Qwen 3.5 day today! 🥳 State-of-the-art 800 GB model. Runs _locally_ with MLX using Q4, taking 225 GB of RAM. https://x.com/pcuenq/status/2023369902011121869
Let’s do the KV cache math for Qwen3.5: KV heads: 2, head dimension: 256, gated attention layers: 15, bytes per element (BF16): 2. That gives 2 × 256 × 15 × 2 = 15,360 bytes. This is the same for K and V, so we multiply by 2: 30,720 bytes, roughly 31 KB per token of context. Meaning at max… https://x.com/bnjmn_marie/status/2023424404504342608
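The arithmetic in the tweet above is easy to sanity-check. The sketch below assumes the figures quoted there (2 KV heads, head dimension 256, 15 full-attention layers, BF16 cache) plus the 256K native context mentioned elsewhere in this roundup; none of these numbers come from an official spec sheet.

```python
# KV cache size per token, using the figures quoted in the tweet above.
# All constants are assumptions taken from the thread, not official specs.
kv_heads = 2            # grouped-query KV heads
head_dim = 256          # dimension per head
full_attn_layers = 15   # only the gated (full) attention layers keep a KV cache
bytes_per_elem = 2      # BF16

per_token_k = kv_heads * head_dim * full_attn_layers * bytes_per_elem  # 15,360 bytes
per_token_kv = 2 * per_token_k                                         # K and V: 30,720 bytes
print(f"per-token KV cache: {per_token_kv} bytes (~{per_token_kv / 1000:.0f} KB)")

# At the 256K-token native context mentioned in the clarification tweet:
context = 256 * 1024
total_gb = per_token_kv * context / 1e9
print(f"KV cache at 256K context: {total_gb:.1f} GB")  # ~8.1 GB
```

The reason the per-token cost is so small is the hybrid architecture: the linear-attention (Gated Delta) layers carry constant-size state, so only the minority of full-attention layers pay a per-token KV cache, which also explains the tweet above about memory barely changing as context grows.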
ollama run qwen3.5:cloud Qwen3.5-397B-A17B is the first open-weight model in the series. It’s available on Ollama’s cloud right now! Give it a try. Let’s go! 🚀🚀🚀 https://x.com/ollama/status/2023334181804069099
Qwen 3.5 Plus is now available on AI Gateway. Thanks @vercel_dev team. 🤝 Use model: ‘alibaba/qwen3.5-plus’ Try it now!”” https://x.com/Alibaba_Qwen/status/2024029499541909920
Qwen3.5 runs quite well in mlx-lm. Awesome that we have a frontier-level hybrid model. The context gets longer but the inference speed and memory use barely change. Here’s the Q4 generating a space invaders game on an M3 Ultra. Generated 4,120 tokens at 37.6 tok/s. https://x.com/awnihannun/status/2023462412092059679
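The "800 GB model" and "225 GB of RAM" figures quoted above line up with simple back-of-the-envelope math. This sketch assumes plain per-weight sizes plus roughly 0.5 bit per weight of quantization overhead for scales and zero points; the exact overhead depends on the Q4 format, so treat these as estimates.

```python
# Memory footprint estimates for a 397B-parameter model.
# The 0.5 bit/weight overhead figure is an assumption typical of
# group-wise 4-bit schemes, not a number from any official source.
params = 397e9  # total parameters; MoE means all experts must be resident

bf16_gb = params * 2 / 1e9               # 16 bits/weight -> ~794 GB ("800 GB model")
q4_gb = params * 4 / 8 / 1e9             # exactly 4 bits/weight -> ~198.5 GB
q4_overhead_gb = params * 4.5 / 8 / 1e9  # + ~0.5 bit/weight of scales -> ~223 GB

print(f"BF16: {bf16_gb:.0f} GB, Q4: {q4_gb:.1f} GB, Q4+overhead: {q4_overhead_gb:.0f} GB")
```

Note that although only 17B parameters are active per token (which is what keeps decode fast), local inference still has to hold the full 397B expert set in memory, which is why the quoted 225 GB matches the Q4-plus-overhead estimate rather than anything proportional to the active parameter count.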
So speaking of benchmarks, what can be said of the new open Qwen? First, it completely destroys Qwen3-VL-235B ofc, but more surprisingly it outscores Qwen3-Max-thinking. All the while it’s the same model as “Plus”. Plus just has 1M context and some more bells and whistles. https://x.com/teortaxesTex/status/2023331885402009779
The new chonky Qwen 3.5 looks pretty solid, beating their own Qwen3-Max model everywhere, and it’s much better at vision benchmarks than Qwen3-235B-A22B-VL. Now what I sadly haven’t seen is anything on reasoning efficiency. https://x.com/scaling01/status/2023343368399704506
Kimi K2‑0905 and Qwen3‑Max preview: two 1T open weights models launched | AINews https://news.smol.ai/issues/25-09-05-1t-models