Ethan B. Holland

Over 56,100 manually organized AI links and counting

International: AI News Week Ending 09/12/2025

September 12, 2025

Image created with Flux Pro v1.1 Ultra. Image prompt: International, airport‑style resort lounge, passport peeking from travel bag with subtle circuit‑globe emboss, ambient morning light, photorealistic, editorial, minimal, landscape, vacation, no text overlays

ByteDance Seed presents AgentGym-RL • First unified RL framework for multi-turn agent training (no SFT) • Modular, extensible design across web, search, games, embodied & science tasks • Agents rival/surpass commercial models on 27 task https://x.com/arankomatsuzaki/status/1965979980971782414

📊 @Kimi_Moonshot’s K2-0905 on @GroqInc scored 7th overall at 94% on Roo Code evals, the 1st open-source model to break the 90+ barrier. It’s also the fastest and cheapest in the top 10, while holding its own on accuracy. View the full leaderboard: https://x.com/roo_code/status/1965098976677658630

It feels the coding agent frontier is now open-weights: • GLM 4.5 costs only $3/month and is on par with Sonnet • Kimi K2.1 Turbo is 3x speed, 7x cheaper vs Opus 4.1, but as good • Kimi K2.1 feels clean. The best model for me. • GPT-5 is only good for complicated specs, too slow. https://x.com/Tim_Dettmers/status/1965021602267217972

Kimi K2 0905 upgrade: Substantial improvement in agentic capabilities, modest change in overall intelligence Key takeaways: ➤ Intelligence increased +2 pts in our Artificial Analysis Intelligence Index ➤ Agentic capabilities substantially improved as shown by our two new https://x.com/ArtificialAnlys/status/1965010554499788841

🚨 Leaderboard Disrupted! Two new models have entered the Top 10 Text leaderboard: 🔸#6 Qwen3-max-preview (Proprietary) by @Alibaba_Qwen 🔸#8 Kimi-K2-0905-preview (Modified MIT) by @Kimi_Moonshot tied with 7 others. Note that this puts Kimi-K2-0905-preview in a tight race for https://x.com/arena/status/1965115050273976703

Bytedance’s answer to Nano Banana is really good at making high density infographics. “Draw the following system of binary linear equations and the corresponding solution steps on the blackboard: 5x + 2y = 26; 2x -y = 5.” “Create an infographic showing the causes of https://x.com/bilawalsidhu/status/1965838191019307476

Seedream 4.0 is the new leading image model across both the Artificial Analysis Text to Image and Image Editing Arena, surpassing Google’s Gemini 2.5 Flash (Nano-Banana), across both! Seedream 4.0 is the latest release from Bytedance Seed, and is a substantial improvement on https://x.com/ArtificialAnlys/status/1966167814512980210

MBZUAI and G42 Launch K2 Think: A Leading Open-Source System for Advanced AI Reasoning https://www.prnewswire.com/news-releases/mbzuai-and-g42-launch-k2-think-a-leading-open-source-system-for-advanced-ai-reasoning-302551074.html

⚡️ Efficient weight updates for RL at trillion-parameter scale 💡 Best practice from Kimi @Kimi_Moonshot vLLM is proud to collaborate with checkpoint-engine: • Broadcast weight sync for 1T params in ~20s across 1000s of GPUs • Dynamic P2P updates for elastic clusters • https://x.com/vllm_project/status/1965824120920342916

Introducing checkpoint-engine: our open-source, lightweight middleware for efficient, in-place weight updates in LLM inference engines, especially effective for RL. ✅ Update a 1T model on thousands of GPUs in ~20s ✅ Supports both broadcast (sync) & P2P (dynamic) updates ✅ https://x.com/Kimi_Moonshot/status/1965785427530629243

Updated & turned my Big LLM Architecture Comparison article into a narrated video lecture. The 11 LLM architectures covered in this video: 1. DeepSeek V3/R1 2. OLMo 2 3. Gemma 3 4. Mistral Small 3.1 5. Llama 4 6. Qwen3 7. SmolLM3 8. Kimi 2 9. GPT-OSS 10. Grok 2.5 11. GLM-4.5 https://x.com/rasbt/status/1965798055141429523

@Alibaba_Qwen (Gated) Attention is all you need. Excited to offer both Qwen3-Next models on dedicated deployments backed by 4xH100 GPUs. https://x.com/basetenco/status/1966224960223158768

Uber and Momenta to test autonomous vehicles in Germany in 2026 | TechCrunch https://techcrunch.com/2025/09/07/uber-and-momenta-to-test-autonomous-vehicles-in-germany-in-2026/

Alibaba’s New AI Tool Just Changed Ecommerce Forever https://x.com/ariesnotebook/status/1964014875807559904

📢 New Model Drop: Seedream 4.0 is live on Yupp! This image model from ByteDance offers text-to-image generation as well as image editing. We dove in with some prompts: https://x.com/yupp_ai/status/1965827081826422990

🚨 ByteDance just released Seedream 4.0 — how does its AI image generation perform? Zhihu contributors share their feedbacks. Let’s have a quick view👇 🎨 Trisimo 崔思莫: ➤ Seed 4.0 vs Nano Banana: Different tech paths. • Nano Banana: multimodal, stronger understanding & https://x.com/ZhihuFrontier/status/1965681077231727069

🚨 New Model Alert! ByteDance’s latest Seedream 4 is ready in the Arena! 🖼️Seedream 4 merges the capabilities of Seedream 3 (Text-to-Image) with SeedEdit 3 (Image Edit). Come and test out your hardest Text-to-Image and Image Edit prompts! https://x.com/arena/status/1965929099370889432

DeepSeek V3.1 dynamic @UnslothAI quants on Aider Polyglot benchmarks are here! 1. 3-bit thinking gets 75.6% vs 76.1% un-quantized 2. Leaving attn_k_b in 8-bit gets +2% accuracy vs 4-bit 3. Dynamic quants beat other similar imatrix quants 4. AMA r/LocalLlama today 10AM PST! https://x.com/danielhanchen/status/1965800675105017980

Albania appoints world’s first AI-made minister – POLITICO https://www.politico.eu/article/albania-apppoints-worlds-first-virtual-minister-edi-rama-diella/

Mixtral of experts | Mistral AI https://mistral.ai/news/mixtral-of-experts

ASML, Mistral AI enter strategic partnership https://www.asml.com/en/news/press-releases/2025/asml-mistral-ai-enter-strategic-partnership

Mistral raises 1.7B€, partners with ASML | Hacker News https://news.ycombinator.com/item?id=45178041

4B OCR with Apache-2.0 license outperforming Mistral OCR 🔥 Tencent released Points-Reader, a new model first trained on Qwen2.5VL annotations and then self-trained on real data. In many benchmarks, it performs better than Qwen2.5VL and MistralOCR! https://x.com/mervenoyann/status/1966176133894098944

@reach_vb @Alibaba_Qwen ❤️ We ship as fast as we can. We optimized the models’ speed, serve in bf16, should be fast! https://x.com/Yuchenj_UW/status/1966201249721888800

🚀 Introducing Qwen3-Next-80B-A3B — the FUTURE of efficient LLMs is here! 🔹 80B params, but only 3B activated per token → 10x cheaper training, 10x faster inference than Qwen3-32B.(esp. @ 32K+ context!) 🔹Hybrid Architecture: Gated DeltaNet + Gated Attention → best of speed & https://x.com/Alibaba_Qwen/status/1966197643904000262

From research paper to live website in minutes 🚀 Upload your paper, let Qwen Chat turn it into a webpage, and deploy instantly. Try it now: https://x.com/Alibaba_Qwen/status/1964870508421480524

I vibe coded a visual PDF search app with ColQwen2. This is how it works: – Store PDF files as images in a @weaviate_io vector database – Embed images and text with a multimodal late-interaction model (ColQwen2) – Generate token-wise (and summed) similarity maps to highlight https://x.com/helloiamleonie/status/1964997028875743637
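For readers unfamiliar with late-interaction scoring, here is a minimal sketch of the idea the post describes: a token-wise similarity map between query tokens and page patches, a summed map for highlighting, and a MaxSim-style page score. It is not the author's code and does not call ColQwen2 or Weaviate. The function names and the tiny hand-written vectors are hypothetical stand-ins for real model embeddings.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def similarity_map(query_tokens: List[Vector], page_patches: List[Vector]) -> List[List[float]]:
    """Token-wise similarity map: one row per query token, one column per image patch."""
    return [[dot(q, p) for p in page_patches] for q in query_tokens]


def summed_map(query_tokens: List[Vector], page_patches: List[Vector]) -> List[float]:
    """Sum the token-wise maps over query tokens to get one value per patch."""
    sim = similarity_map(query_tokens, page_patches)
    return [sum(column) for column in zip(*sim)]


def late_interaction_score(query_tokens: List[Vector], page_patches: List[Vector]) -> float:
    """MaxSim-style score: best-matching patch per query token, summed over tokens."""
    sim = similarity_map(query_tokens, page_patches)
    return sum(max(row) for row in sim)


# Toy 2-dimensional embeddings (assumed values, not real model output).
query = [[1.0, 0.0], [0.0, 1.0]]
patches = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]

print(similarity_map(query, patches))
print(summed_map(query, patches))
print(late_interaction_score(query, patches))
```

In a real pipeline the per-patch values would be reshaped to the page's patch grid and overlaid on the PDF image as a heatmap.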

Learn more about Qwen3-max-preview here: https://x.com/arena/status/1965124408097517853

Qwen3-Next (thinking & non-thinking) are now live in BF16 at Hyperbolic! Qwen3-Next is a huge efficiency leap: – 80B MoE with just 3B active params – 10x cheaper to train vs Qwen3-32B – 10x inference throughput for >32K tokens Proud to be a launch partner with @Alibaba_Qwen – https://x.com/Yuchenj_UW/status/1966199037973200955

Qwen3-Next, or to say, a preview of our next generation (3.5?) is out! This time we try to be bold, but actually we have been doing experiments on hybrid models and linear attention for about a year. We believe that our solution should be at least a stable and solid solution to https://x.com/JustinLin610/status/1966199996728156167

Qwen3-Next: Towards Ultimate Training & Inference Efficiency https://qwen.ai/blog?id=4074cca80393150c248e508aa62983f9cb7d27cd&from=research.latest-advancements-list

Welcome Qwen3-Next! You can run it efficiently on vLLM with accelerated kernels and native memory management for hybrid models. https://x.com/vllm_project/status/1966224816777928960

🎙️ Meet Qwen3-ASR — the all-in-one speech recognition model! ✅ High-accuracy EN/CN + 9 more languages: ar, de, en, es, fr, it, ja, ko, pt, ru, zh ✅ Auto language detection ✅ Songs? Raps? Voice with BGM? No problem. <8% WER ✅ Works in noise, low quality, far-field ✅ Custom https://x.com/Alibaba_Qwen/status/1965068737297707261

inpainting is not dead! @instantx_ai brought it back to life! 🪔🕯 Qwen Image Inpainting ControlNet allows for the most precise, targeted & high quality edits that ever happened to inpainting Official Model & Demo on @huggingface 🤗 https://x.com/multimodalart/status/1966190381340692748

Alibaba leads $100m round in Chinese robotics startup X Square https://www.techinasia.com/news/alibaba-leads-100m-round-in-chinese-robotics-startup-x-square

Chinese humanoid startup X Square Robot has raised $100M in a funding round led by Alibaba Cloud, with HongShan (formerly Sequoia Capital China) and others, bringing total funding to $280M since the company launched Dec 2023. The firm recently launched Wall-OSS, an open-source https://x.com/TheHumanoidHub/status/1965475160804462768