Ethan B. Holland

Over 56,100 manually organized AI links and counting

International: AI News Week Ending 10/10/2025

October 10, 2025

Image created with gemini-2.5-flash-image with claude-sonnet-4-5. Image prompt: Create a 16:9 cinematic split-screen poster. LEFT SIDE (40% width): – A large world map pinned to the wall above a desk, with colored strings connecting different cities, and sticky notes in multiple languages (e.g., English, Chinese, Spanish) referencing AI collaborations. – The background is a turquoise / teal abstract field made of stylized blue rods or data fibers behind the map, suggesting global networks. – Use soft, natural lighting. No glowing map, no neon. RIGHT SIDE (60% width): – A green-toned abstract aerial forest canopy texture, representing the shared planet. – Two clean rounded rectangles stacked vertically near the center-right. – The TOP rectangle contains the text: “International”. – The BOTTOM rectangle contains the text: “2025/10/10”. – Clean sans-serif font, dark green or charcoal. OVERALL STYLE: – Global, cooperative, and grounded. – No flags or corporate logos. – Preserve the turquoise/forest split-screen.

5 things: Nvidia’s Huang on the state of the AI race with China https://www.cnbc.com/2025/10/08/nvidia-huang-ai-race-china-us-trump.html

Z.ai Chat – Free AI powered by GLM-4.6 & GLM-4.5 https://chat.z.ai/

My infant year as an AI researcher — Moving from physics to AI https://alfredyao.github.io/posts/2025-10-06.html

Self-Forcing++ for minute-scale video generation ByteDance’s new method generates high-quality videos up to 4 min 15 sec! It scales diffusion models without long-video teachers or retraining, preserving fidelity and consistency. https://x.com/HuggingPapers/status/1974688371340648857

🤔 Ever thought about how models like DeepSeek-V3.2 with Dynamic Sparse Attention might fundamentally reshape inference systems? Zhihu contributor & Huawei researcher 左鹏飞 and team tackled this with SparseServe — a system tailored for sparse-attention LLMs. 💡 Their key https://x.com/ZhihuFrontier/status/1976544233700929614

The results: 4x faster vs. baseline; 500 TPS on DeepSeek-V3.1; outperforms specialized hardware. The more you use it, the better it performs. https://x.com/togethercompute/status/1976655649120612525

🚨 New Open Models in the Text Arena! The most competitive Arena has new contenders on the board from 🐳 @deepseek_ai. DeepSeek-V3.2-Exp’s thinking variant has broken into the Top 10. Highlights: 🔹 #2 open model: DeepSeek-V3.2-Exp-Thinking, tied #9 overall 🔹 #6 open https://x.com/arena/status/1976000265271873851

🚨 New Top Open Model Update! A relative newcomer to the Arena, @zai_org’s GLM-4.6 takes the clear, undisputed #1 spot for Top Open Model. 🏆 It also ranks #4 overall, which is not an easy feat! The next top open model, DeepSeek R1 0528, has been the standing champion for https://x.com/arena/status/1975220164703752351

Interesting new results on the latest agentic benchmark (GAIA2): deepseek v3.1 terminus is a very strong model if you want OSS for your agents! Interestingly, more reasoning can be cost effective! (e.g., if your agent is smarter it’s better at selecting the correct tools faster) https://x.com/clefourrier/status/1975469097174634854

Europe doesn’t have a Silicon Valley. So they’re building a bridge. 🌉 The Bridge (@jointhebridge) is an 8-week founder residency in the Bay Area. For Europeans who want to start massive companies now. ✅ One house in San Francisco. ✅ A cohort to find your cofounder and your https://x.com/IlirAliu_/status/1973657694478241807

We’re excited to announce that we have joined forces with @JinaAI_, a leader in frontier models for multimodal and multilingual search. This acquisition deepens Elastic’s capabilities in retrieval, embeddings, and context engineering to power agentic AI: https://x.com/elastic/status/1976278980018765886

Moondream 3 understands UIs, not just pixels. Identify buttons, prices, and labels with a prompt. Perfect for agentic workflows. Open, tiny, blazingly fast. https://x.com/moondreamai/status/1976624914070401142

👀 Vision Leaderboard Shakeup New model, Hunyuan Vision 1.5 Thinking by @tencenthunyuan, has entered to tie for #3 in the Vision Arena. Evaluating AI models with vision adds new complexities when compared to text. To perform well a model must extract information from images, https://x.com/arena/status/1975257734053503260

🥇 Just one week after the release, HunyuanImage 3.0 has taken the #1 spot in LMArena, ranked as both the top overall and top open-source Text-to-Image model. We are so grateful for the support from the community! 🙌 https://x.com/TencentHunyuan/status/1974522542858911935

🚨 Text-to-Image Leaderboard Shakeup! Hunyuan Image 3.0 by @TencentHunyuan just stormed into the #1 spot in the Arena 🏆 – ranked as both the top overall and top open-source Text-to-Image model. 🖼️ This image generation model has leapfrogged over Seedream 4, and the famous https://x.com/arena/status/1974502371721162982

Strategic collaboration with Japan’s Digital Agency to bring OpenAI-powered tools to Japanese government employees: https://x.com/gdb/status/1973619271239700631

Recent open weights releases are reducing the gap to proprietary frontier models on agentic workflows On the Terminal-Bench Hard evaluation for agentic coding and terminal use, open-weights models such as DeepSeek V3.2 Exp, Kimi K2 0905, and GLM-4.6 have made large strides, with https://x.com/ArtificialAnlys/status/1975468544973545810

With the US falling behind on open source models, one startup has a bold idea for democratizing AI: let anyone run reinforcement learning. https://x.com/WIRED/status/1975993813995774448

There’s still a way to go for omni models to match human-level responsiveness and reasoning, but we won’t stop improving. Hope you love Qwen3-Omni – natively end-to-end multilingual omni model. https://x.com/Alibaba_Qwen/status/1976267690785505440

🚀 Qwen3-VL-30B-A3B-Instruct & Thinking are here! Smaller size, same powerhouse performance 💪—packed with all the capabilities of Qwen3-VL! 🔧 With just 3B active params, it’s rivaling GPT-5-Mini & Claude4-Sonnet — and often beating them across STEM, VQA, OCR, Video, Agent https://x.com/Alibaba_Qwen/status/1974289216113947039

LFM2-8B-A1B just dropped on @huggingface! 8.3B params with only 1.5B active/token 🚀 > Quality ≈ 3–4B dense, yet faster than Qwen3-1.7B > MoE designed to run on phones/laptops (llama.cpp / vLLM) > Pre-trained on 12T tokens → strong math/code/IF https://x.com/maximelabonne/status/1975561460798628199

Qwen Image Edit 2509 is the new leading open weights image editing model, ranking #3 overall in the Artificial Analysis Image Editing Arena and introducing multi-image editing capabilities! The latest release from Alibaba Qwen trails only Gemini 2.5 Flash (Nano-Banana) and https://x.com/ArtificialAnlys/status/1975993986314813889

Researchers introduced GAIN-RL, a method that fine-tunes language models by training on the most useful examples first. It ranks data using a simple internal signal from the model. On Qwen 2.5 and Llama 3.2, this method matched baseline accuracy in 70 to 80 epochs instead of https://x.com/DeepLearningAI/status/1974640684528243151

Introducing Qwen3-VL Cookbooks! 🧑‍🍳 A curated collection of notebooks showcasing the power of Qwen3-VL—via both local deployment and API—across diverse multimodal use cases: ✅ Thinking with Images ✅ Computer-Use Agent ✅ Multimodal Coding ✅ Omni Recognition ✅ Advanced https://x.com/Alibaba_Qwen/status/1976479304814145877

🚀 Day 0 Support — Qwen3-VL-30B-A3B-Instruct on NexaSDK We’re excited to announce Day 0 support for Qwen3-VL-30B-A3B-Instruct, a breakthrough in multimodal intelligence, now running natively on NexaSDK. We’ve added full support for the MLX Engine on @Apple Silicon GPUs, https://x.com/nexa_ai/status/1974562612164886659

4/5 The same efficiency gains apply on mobile. Running at 16K context lengths on an iPhone 16 Pro, Jamba outputs nearly 16 tokens/second, outpacing token outputs from Llama 3.2 3B, Qwen 3 1.7B, and Phi-4 Mini. Jamba is the only one that can handle up to 64K. https://x.com/AI21Labs/status/1975917063278567919

Alibaba has released Qwen3 Omni and Qwen3 Omni Realtime – two natively end-to-end “omni”-modal models that process text, images, audio, and video in a single unified architecture. Artificial Analysis benchmarking shows competitive Speech to Speech performance, as well as https://x.com/ArtificialAnlys/status/1975904190061834602

Most popular local models in Cline are qwen3-coder & GLM-4.5-Air (guide on how to use them is linked below) https://x.com/cline/status/1976101061753700400

Qwen3-VL secured 2nd place in the vision leaderboard and became the first open-source model to rank first in both the pure text and visual leaderboards. https://x.com/Alibaba_Qwen/status/1975360868092420345

More generally: if all of your experiments are “RL on math with Qwen”, I’m not interested in any outlandish claims you want to make. Qwen’s base models have been (appropriately) aggressively mid-trained for math for a long time. Stop drawing conclusions purely from this. https://x.com/lateinteraction/status/1976761442842849598

Qwen3-30B-A3B-Instruct-2507-4bit generation on MLX: 473 tokens per sec on M3 Ultra! 🚀 https://x.com/ivanfioravanti/status/1976153645658898453

Thank you @ArtificialAnlys ! 🙏 Qwen Image Edit 2509 ranks #3 overall and leads all open-weight models — enabling multi-image editing with precise control. Try it now: https://x.com/Alibaba_Qwen/status/1976119224339955803

Intelligence performance: The Qwen3 Omni 30B reasoning variant achieves an Artificial Analysis Intelligence Index score of 40, surpassing similarly-sized models like Qwen3 30B, but still trailing Alibaba’s flagship LLM, Qwen3 235B 2507, which scored 57. The Qwen3 Omni 30B https://x.com/ArtificialAnlys/status/1975904195426537596

Z ai’s updated GLM 4.6 (Reasoning) is one of the most intelligent open weights models, with near DeepSeek V3.1 (Reasoning) and Qwen3 235B 2507 (Reasoning) level intelligence 🧠 Key intelligence benchmarking takeaways: ➤ Reasoning Model Performance: GLM 4.6 (Reasoning) scores 56 https://x.com/ArtificialAnlys/status/1975425594679496979

HF demo: https://x.com/Alibaba_Qwen/status/1974290412602040532

🚀 Moondream 3 Preview is now live on fal! 🧠 9B params (2B active): faster + smarter 🖼️ Real-world vision: drones, robotics, med-imaging, retail ⚙️ 64-expert MoE + 32K context for structured reasoning 🔍 Native pointing, improved OCR, fine-tuning ready https://x.com/fal/status/1976682702167228919