Ethan B. Holland

Over 53,700 manually organized AI links and counting

Qwen: AI News Week Ending 10/24/2025

October 24, 2025

Image created with gemini-2.5-flash-image with claude-sonnet-4-5. Image prompt: A Chinese Xiangqi chess set on dark wood, red and black carved jade pieces mid-game, the General piece glowing subtly with golden circuit patterns etched into translucent jade, soft paper lantern light, traditional East Asian study setting, shallow depth of field focusing on the illuminated General piece.

Airbnb CEO Brian Chesky: “We’re relying a lot on Alibaba’s Qwen model. It’s very good. It’s also fast and cheap… We use OpenAI’s latest models, but we typically don’t use them that much in production because there are faster and cheaper models.” The valley is built on Qwen? https://x.com/natolambert/status/1980657338726887662

It makes perfect sense to let agents understand, imitate, and learn how humans use computers from videos! We present VideoAgentTrek, which builds strong computer-use agents through video pretraining and agentic tuning. This approach has already proven effective in the training of https://x.com/huybery/status/1981728838024560669

This AI trading benchmark is interesting. Each model got $10,000 to invest. ~3 days in: ranking atm: – DeepSeek V3.1: +$2,658 – Grok 4: +$2,236 – Claude 4.5 Sonnet: +$1,911 – Qwen 3 Max: −$211 – GPT-5: −$3,139 – Gemini 2.5 Pro: −$3,719 DeepSeek beats all the other models https://x.com/Yuchenj_UW/status/1980318499185823760
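For a sense of scale, the dollar figures in the post above can be converted into percentage returns on the $10,000 stake. A minimal Python sketch, assuming each figure is profit or loss relative to that starting balance; the names `START_BALANCE`, `pnl_usd`, and `pct_return` are illustrative and not part of the benchmark:

```python
# Starting balance per model, as stated in the post.
START_BALANCE = 10_000

# Profit or loss in USD about 3 days in, copied from the post.
pnl_usd = {
    "DeepSeek V3.1": 2658,
    "Grok 4": 2236,
    "Claude 4.5 Sonnet": 1911,
    "Qwen 3 Max": -211,
    "GPT-5": -3139,
    "Gemini 2.5 Pro": -3719,
}


def pct_return(pnl: float, start: float = START_BALANCE) -> float:
    """Return profit or loss as a percentage of the starting balance."""
    return 100 * pnl / start


# Percentage return per model, rounded to two decimals.
returns = {model: round(pct_return(pnl), 2) for model, pnl in pnl_usd.items()}

# Print the ranking from best to worst.
for model, pct in sorted(returns.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{model}: {pct:+.2f}%")
```

On these numbers the gap between first and last place is roughly 64 percentage points after about three days, which is worth keeping in mind when reading a snapshot this short.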

Qwen Deep Research just got a major upgrade. ⚡️ It now creates not only the report, but also a live webpage 🌐 and a podcast 🎙️ – Powered by Qwen3-Coder, Qwen-Image, and Qwen3-TTS. Your insights, now visual and audible. ✨ 👉 https://x.com/Alibaba_Qwen/status/1980609551486624237

Introducing Qwen3-VL-2B and Qwen3-VL-32B! From edge to cloud, these dense powerhouses deliver ultimate performance per GPU memory, packing the full capabilities of Qwen3-VL into compact and scalable forms. 🔥 Qwen3-VL-32B outperforms GPT-5 mini & Claude 4 Sonnet across STEM, https://x.com/Alibaba_Qwen/status/1980665932625383868

🚨 WebDev Arena: Top 15 Disrupted! 4 new models have been added to the WebDev leaderboard: 🔸 #4 Claude Sonnet 4.5 Thinking 32k by @AnthropicAI 🔸 #4 GLM 4.6 (the new #1 open model) by @Zai_org 🔸 #11 Qwen3 235B A22B Instruct (and #7 open model) by @Alibaba_Qwen 🔸 #14 Claude https://x.com/arena/status/1980367208300835328

I expect GLM-4.6-Air to make an improvement similar to Qwen-3 to Q3-2507 update, or maybe even the latest Qwen round. Will be the default model between 30B and 200B. https://x.com/teortaxesTex/status/1981702360981557624

Choose the “:exacto” version of open-source models in Cline to automatically route to the best inference provider for models like GLM-4.6, Qwen3-Coder, and Kimi-K2. Provider quality varies wildly, meaning the same model can yield completely different results at different endpoints. https://x.com/cline/status/1981370535176286355

Over the last 24 hours, I have finetuned three Qwen3-VL models (2B, 4B, and 8B) on the CATmuS dataset on @huggingface. The first versions of the models are now available on the Small Models for GLAM organization with @vanstriendaniel! (Link below). These are designed to work https://x.com/wjb_mattingly/status/1981736776076026044

Qwen3-VL-2B-Instruct app is out on Hugging Face https://x.com/_akhaliq/status/1980690335220351063

we just updated the model comparison on our blog for you 🫡 added Chandra, OlmOCR-2, Qwen3-VL and their averaged OlmOCR score! https://x.com/mervenoyann/status/1981396054634615280

Qwen just released Qwen3-VL on Hugging Face. The most powerful vision-language model in the Qwen series, with comprehensive upgrades across text understanding, visual reasoning, and long context video analysis. From GUI operations to 1M context. https://x.com/HuggingPapers/status/1980809413045940553