Image created with Flux Pro v1.1 Ultra. Image prompt: O’Hare concourse window and global flight board; the word “International” displayed on a rolling departures banner in wayfinding sans; newsroom cart carries multilingual headlines; efficient, ultramarine palette, airy

American companies are losing market share to Chinese open-source companies! Anthropic’s coding market share on OpenRouter fell from 46% in July to 32% in a month. The reason? Qwen3-Coder https://x.com/scaling01/status/1956858471682617553

New DeepSeek V3.1 beats Opus and R1 for a dollar https://x.com/scaling01/status/1957892601098432619

DeepSeek V3.1 is already 4th trending on HF after a silent release with no model card 😅😅😅 The power of 80,000 followers on @huggingface (first org with 100k when?)! https://x.com/ClementDelangue/status/1957897020741402751

bangs successfully removed with 8-step Qwen Image Edit [Fast] too 💨 using Qwen Image Lightning LoRA, now on Spaces👇 https://x.com/linoy_tsaban/status/1957762030393544847

🚀 Qwen Chat Desktop for Windows is here! 💻 All the power of Qwen Chat — now with MCP support for smarter, faster agents. ⚡ Run MCP servers, supercharge your productivity, and stay in control. 📥 Download now → https://x.com/Alibaba_Qwen/status/1956399490698735950

There’s been a lot of Discourse about Qwen’s rejection of the hybrid paradigm. “Did DeepSeek fall for the hybrid meme?” But hybrids make *so much sense* if you’re building a fast, economical SWE agent, which is exactly what 3.1 is for. It’s all been for Aider, Claude Code, MCPs. https://x.com/teortaxesTex/status/1958437173948023127

Introducing DeepSeek-V3.1: our first step toward the agent era! 🚀 🧠 Hybrid inference: Think & Non-Think — one model, two modes ⚡️ Faster thinking: DeepSeek-V3.1-Think reaches answers in less time vs. DeepSeek-R1-0528 🛠️ Stronger agent skills: Post-training boosts tool use and… https://x.com/deepseek_ai/status/1958417062008918312

Excited to release: Jupyter Agent 2 The agent can load data, execute code, plot results inside Jupyter faster than you can scroll! 🤖 Powered by Qwen3-Coder ⚡️ Running on Cerebras ⚙️ Executed in E2B ↕️ Upload your files All videos are in *real time*! https://x.com/lvwerra/status/1957832240416580024

I tried @Alibaba_Qwen Qwen3-Coder today inside @cline . Very impressed. It helped me solve a tricky deployment: putting a Dockerized vibe-coded project onto https://x.com/chunhualiao/status/1956957519315956074

Ovis is one of the best, most creative, and most overlooked VLM series. Yet another Alibaba division. https://x.com/teortaxesTex/status/1956306172576690610

>V3.1-Base I guess this confirms they’ve moved on to hybrid models, Anthropic-style (and contra Qwen). I am not amused with how it works. But I was also disappointed with V2.5 (original), their merge of chat and code; ultimately, it worked. Another reason to expect V4, not R2. https://x.com/teortaxesTex/status/1957818879205351851

Well, this happened sooner than I expected… Tencent has dropped their version of Genie 3. https://x.com/bilawalsidhu/status/1955968609940873624

Wow! Chinese lab Tencent Hunyuan has released an open source alternative to Genie 3 🔥 You can generate realistic videos that you can control in real time. – Long-term consistency – No need for expensive rendering – Trained on 1M+ gameplay recordings Already available ↓ https://x.com/itsPaulAi/status/1957182570309013714

• DeepSeek V3.1 Reasoner improves on DeepSeek R1 on the Extended NYT Connections Benchmark: 48.6% → 57.7%.
• DeepSeek V3.1 Non-Think improves on DeepSeek V3-0324: 16.8% → 21.6%.
• Mistral Medium 3.1 improves on Mistral Medium 3: 11.5% → 15.2%.
• GPT-5 (low… https://x.com/LechMazur/status/1958970478712037548

🚨 Top 10 Leaderboard Disrupted! A new model provider has landed in the Arena Top 10: 💠Mistral-Medium-2508 ranks at #8! 💠it also ranks top 3 in the Coding & Longer Query categories The Text Arena is neck and neck—just a few points can shift the rankings and change who’s on https://x.com/lmarena_ai/status/1958954094867226954

ByteDance has beaten DeepSeek …in how half-assed the release is https://x.com/teortaxesTex/status/1958173309410939299

ByteDance just released the Seed-OSS 36B LLM on Hugging Face. It’s an open-source model with powerful long-context, reasoning, and agentic capabilities. https://x.com/HuggingPapers/status/1958207114876228111

ByteDance-Seed/Seed-OSS-36B-Instruct · Hugging Face https://huggingface.co/ByteDance-Seed/Seed-OSS-36B-Instruct

New ByteDance Seed reasoning RL paper, relating RL to self-supervised learning. The paper is pretty dense with all the dual-task derivation so this is basically my notes. https://x.com/nrehiew_/status/1958882481488146644

@deepseek_ai Now Available and default model in anycoder: https://x.com/_akhaliq/status/1958488877024362966

@scaling01 Just to clarify, it’s “trained using the UE8M0 FP8.” DeepSeek stated this is designed for the upcoming generation of chips. https://x.com/Anonyous_FPS/status/1958437047359995914

@teortaxesTex Maybe I missed something, but I could only find the Base model, and no model card. Where did they upload the Thinking/Reasoning model? https://x.com/rasbt/status/1957982932594778596

📢 New Model(s) Drop: DeepSeek v3.1 Thinking & Chat are now on Yupp! The latest edition from @deepseek_ai offers hybrid thinking built in, for quicker answers and stronger, tool-savvy agents. We checked them out with some prompts on Yupp: https://x.com/yupp_ai/status/1958935061677711451

🥇DeepSeek v3.1 INT4 model: https://x.com/HaihaoShen/status/1958507863749325197

9:15 AM in China, I predict we’ll see the second item soon (logically, V3.1–Instruct) and hopefully a model card/tweets. My biggest wish is to also see «With the release of DeepSeek-V3.1, the V3 series comes to an end… the DeepSeek V4 series will be released in the future» https://x.com/teortaxesTex/status/1957975224768430179

BIG LAUNCH: @deepseek_ai’s V3.1 is now live on W&B Inference! One model, two modes: toggle between high-speed ‘Non-Think’ & deep ‘Think’. Priced at just $0.55/$1.65 per 1M tokens, it’s a game-changer for building intelligent agents. Want $50 in free credits? Details below. https://x.com/weave_wb/status/1958681269484880026

DeepSeek had been using UE8M0 FP8 for a long time, you can see it in DeepGEMM. But maybe? https://x.com/teortaxesTex/status/1958437815710089697

DeepSeek is doubling down on their open source commitments with an MIT license for -Base. This is not only their first permissively licensed base model, it is the first large* permissively licensed base model in the industry. * unless you count dots.llm1 from RedNote @ 140B. https://x.com/georgejrjrjr/status/1957867653764379073

DeepSeek just released a new model! https://x.com/ClementDelangue/status/1957823652298166340

DeepSeek launches V3.1, unifying V3 and R1 into a hybrid reasoning model with an incremental increase in intelligence Incremental intelligence increase: Initial benchmarking results for DeepSeek V3.1 show Artificial Analysis Intelligence Index of 60 in reasoning mode, up from https://x.com/ArtificialAnlys/status/1958432118562041983

DeepSeek V3.1 beats Claude 4 Opus on Aider Polyglot This makes it the best non-TTC coding model and all of that for ~$1 https://x.com/scaling01/status/1957890953026392212

DeepSeek V3.1 dropped and the Cline community is testing it out. Early sentiment: “Makes 10,000 assumptions even when told to clarify” for planning tasks. What’s your experience been? (early data: 13.3% diff edit failure rate) https://x.com/cline/status/1959032407828602886

DeepSeek v3.1 is live on our Model APIs! https://x.com/basetenco/status/1958716181256577347

DeepSeek V3.1 Now Available on Chutes, with hybrid inference (one-model, two-modes) $0.1999 USD / M Input $0.8001 USD / M Output Available now: https://x.com/chutes_ai/status/1958507978476106196
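
As a sanity check on what those per-token prices mean in practice, here is a quick cost sketch; the token counts below are made-up illustration, and only the $/M prices come from the Chutes post:

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     in_price_per_m: float, out_price_per_m: float) -> float:
    """Cost of one request given per-million-token input/output prices."""
    return (input_tokens * in_price_per_m + output_tokens * out_price_per_m) / 1e6

# Chutes pricing quoted above: $0.1999/M input, $0.8001/M output.
# A hypothetical 50k-token-in / 5k-token-out agentic coding turn:
print(round(request_cost_usd(50_000, 5_000, 0.1999, 0.8001), 4))  # ~$0.014
```

At those rates, even long agentic sessions stay in the cents range, which is the point of the "$1 DeepSeek" comparisons elsewhere in this issue.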

DeepSeek-V3.1 Release | DeepSeek API Docs https://api-docs.deepseek.com/news/news250821

DeepSeek-V3.1-4bit running with MLX on M3 Ultra 512GB at 21 toks/sec! 🔥 Using only 380GB! 👀 <think> or </think> that is the question. https://x.com/ivanfioravanti/status/1958778366229655971

Linear scaling achieved with multiple DeepSeek v3.1 instances. 4x macs = 4x throughput. 2x M3 Ultra Mac Studios = 1x DeepSeek @ 14 tok/sec 4x M3 Ultra Mac Studios = 2x DeepSeek @ 28 tok/sec DeepSeek V3.1 is a 671B parameter model – so at its native 8-bit quantization, it https://x.com/MattBeton/status/1958946396062851484
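
The arithmetic behind those memory figures is easy to check. A rough weight-only estimate (ignoring KV cache, activations, and runtime overhead, which is why the observed 380 GB runs higher than the raw 4-bit number):

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Weight-only memory: params * (bits / 8) bytes, reported in GB (1e9 bytes).
    Ignores KV cache, activations, and runtime overhead."""
    return params_billion * bits_per_weight / 8

# DeepSeek V3.1 at 671B parameters:
print(weight_footprint_gb(671, 8))  # native FP8: 671.0 GB of weights
print(weight_footprint_gb(671, 4))  # 4-bit quant: 335.5 GB of weights
```

That 335.5 GB of weights plus cache and overhead is consistent with the ~380 GB reported on the M3 Ultra above.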

Looking into V3 vs V3.1 a bit – the modelling and config for the latest DeepSeek models are exactly the same? What’s the difference then? Purely data? If purely data, then why release the base model too, and not just a refresh for instruct? https://x.com/reach_vb/status/1957824849633485249

looks like @deepseek_ai is still on track to ship DeepSeek V4! https://x.com/swyx/status/1957902542136045608

Now on MLX 🚀 > pip install mlx-lm https://x.com/Prince_Canuma/status/1958791001301987628

Reminder that there’s 15 hours difference between SF and Hangzhou/Beijing. DeepSeek release cycle is as follows: do tests, push the model to prod at ≈ 7 PM local time, go home/out for drinks/whatever, next day maybe leisurely add a model card. They sleep through the release. https://x.com/teortaxesTex/status/1957954702781686094

some highlights from the release:
> optional thinking mode achieves same/competitive results as R1-0528
> benchmarks (MMLU, GPQA): 80.1 on GPQA (pretty strong)
> LiveCodeBench: scores 74.8 > R1
> AIME 2024: scores 93.1 > R1
> support for tool use (non-thinking mode only)
> new search… https://x.com/reach_vb/status/1958430639595864378

@deepseek_ai 3.1 reasons to get hyped about DeepSeek v3.1 1: Hybrid reasoning 2: Agentic tool use 3: Improved coding 3.1: Best-in-class latency on Baseten https://x.com/basetenco/status/1958515897972232526

@nrehiew_ That’s not why. It’s because reasoning uses up context length too fast to get to the end of an agentic coding loop. https://x.com/Teknium1/status/1958898159326765075

DeepSeek trained its agentic coder as a non-reasoner. There is a reason Anthropic evaluated Opus 4.1 without thinking on SWE-bench, Claude Code has thinking off by default, and Qwen released Qwen Coder for Qwen Code as a non-reasoner. We do not need reasoning for agentic coding. https://x.com/nrehiew_/status/1958838487895117956

DeepSeek-V3.1 officially released! Key highlights of the update: – hybrid thinking model – more efficient reasoning – improved reasoning for search – better tool calling and agentic capabilities – improvements on many benchmarks: SWE-Bench: 44.6% -> 66%, Aider Polyglot https://x.com/scaling01/status/1958438863279681824

DeepSeek-V3.1 on par with o3, Opus 4 and Gemini 2.5 Pro Preview on coding It achieves a 76.3% score on Aider Polyglot with Thinking https://x.com/scaling01/status/1958438007104549243

just a minor version bump. booooring https://x.com/willccbb/status/1958420877537849801

🚀 Exciting news: DeepSeek-V3.1 from @deepseek_ai now runs on vLLM! 🧠 Seamlessly toggle Think / Non-Think mode per request ⚡ Powered by vLLM’s efficient serving — scale to multi-GPU with ease 🛠️ Perfect for agents, tools, and fast reasoning workloads 👉 Guide & examples: https://x.com/vllm_project/status/1958580047658491947

DeepSeek-V3.1 is fully ready on Hugging Face Inference Providers! https://x.com/ben_burtenshaw/status/1958449429511352549

China’s DeepSeek Releases V3.1, Boosting AI Model’s Capabilities – Bloomberg https://www.bloomberg.com/news/articles/2025-08-19/china-s-deepseek-release-v3-1-boosting-ai-model-s-capabilities

GSA Launches USAi to Advance White House “America’s AI Action Plan” | GSA https://www.gsa.gov/about-us/newsroom/news-releases/gsa-launches-usai-to-advance-white-house-americas-ai-action-plan-08142025

WE ARE SO BACK!!! https://x.com/reach_vb/status/1957821171249934486

🎨✨ From simple sketches to stunning 3D interiors — powered by Qwen-Image-Edit! All designs are community contributions, showcasing how AI transforms architectural visions into realistic, stylish, and precise creations. Try it now: https://x.com/Alibaba_Qwen/status/1958744976772198825

📸 Just showed Qwen Chat Vision Understanding how to “see” and understand a meal — and it didn’t just identify the food, it analyzed what, where, weight and even how many calories! From a simple photo, we extracted detailed insights: ✅ Object detection ✅ Weight estimation ✅ https://x.com/Alibaba_Qwen/status/1956618027769971070

🖼️ 🚨 Image Edit Leaderboard Update: Qwen-Image-Edit is now the #1 open model for Image Edit in the Arena (Apache 2.0). The model by @alibaba_qwen debuts at #6 overall on the Image Edit leaderboard tied with Gemini 2.0 Flash Preview. https://x.com/lmarena_ai/status/1958206842657743270

🖼️ Image Edit Model Update Qwen-Image-Edit, developed by @Alibaba_Qwen, is now available in the Arena. This model brings image editing capabilities, and we encourage you to test it with your most complex prompts. https://x.com/lmarena_ai/status/1957878222986821711

🚀 Excited to introduce Qwen-Image-Edit! Built on 20B Qwen-Image, it brings precise bilingual text editing (Chinese & English) while preserving style, and supports both semantic and appearance-level editing. ✨ Key Features ✅ Accurate text editing with bilingual support ✅ https://x.com/Alibaba_Qwen/status/1957500569029079083

🚀 Small but mighty update to Vision Understanding in Qwen Chat — now with native 128K context and stronger performance across vision, video, and 3D tasks! 🔥 Key Upgrades: ✅ Significant boost in math & reasoning ✅ More accurate object recognition ✅ OCR support for 30+ https://x.com/Alibaba_Qwen/status/1956289523421470855

NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale “Autoregressive models—generating content step-by-step like reading a sentence—excel in language but struggle with images. Traditionally, they either depend on costly diffusion models or… https://x.com/iScienceLuvr/status/1956321483183329436

Qwen Image Edit works too well with lightx2v LoRA to run with just 8 and 4 steps, wtf? in my experience, 8 steps keeps the quality of the edits at the same level as the original model, at a 12x speedup 💨 (ofc i built a demo for it) https://x.com/multimodalart/status/1958217824629092568

Qwen-Image Edit in ComfyUI https://x.com/Alibaba_Qwen/status/1957991583649001555

Qwen-Image-Edit is out in anycoder for image editing in your vibe coded apps Built on 20B Qwen-Image, it brings precise bilingual text editing (Chinese & English) while preserving style, and supports both semantic and appearance-level editing. https://x.com/_akhaliq/status/1957519569016238268

Qwen-Image-Edit is the new open weights leader in Image Editing, with quality comparable to GPT-4o and FLUX.1 Kontext [max] Qwen-Image-Edit is the image editing variant of the recent Qwen-Image release from Alibaba, also released under the Apache 2.0 license with weights https://x.com/ArtificialAnlys/status/1958712568731902241

Qwen-Image-Edit: Image Editing with Higher Quality and Efficiency | Qwen https://qwenlm.github.io/blog/qwen-image-edit/

Relighting images with Qwen Edit: impressive directional control and color temperature manipulation without additional finetuning. Crazy how we needed a dedicated model for this not long ago. https://x.com/linoy_tsaban/status/1958176756185325931

Thank you! Qwen-Image-Edit is now available in anycoder! https://x.com/Alibaba_Qwen/status/1957709912202682588

👀🚨 Vision Leaderboard update! Two new models have entered the Vision Top 20 this week: 🔸Qwen-vl-max-2025 by @alibaba_qwen lands at #10 (tied with gemini-1.5-pro & gpt-5-nano-high) 🔸Step 3 by @StepFun_ai ranks at #19 (tied with step-lo-turbo) Congrats to both 🎉 this is https://x.com/lmarena_ai/status/1958957107946168470

Wow — Qwen-Image-Edit just debuted at #2 in the Image Editing Arena 🏆 ELO 1098, with performance on par with GPT-4o — and all at open weights under Apache 2.0. Thanks to @ArtificialAnlys Try it now: https://x.com/Alibaba_Qwen/status/1958725835818770748

@YouJiacheng Just added! K2 scored *lowest* on sycophancy. 👀 https://x.com/sam_paech/status/1956612862379721057

Mistral Medium 3.1 is 2nd on LMArena without style control. Very proud of the @MistralAI team! https://x.com/GuillaumeLample/status/1959015551172583602

Mistral Medium 3.1 just landed on @lmarena_ai leaderboard—punching way above its weight! 🏆 #1 in English (no Style Control) 🏆 2nd overall (no Style Control) 🏆 Top 3 in Coding & Long Queries 🏆 8th overall Small model. Big impact. Try it now on Le Chat and the API! https://x.com/MistralAI/status/1959015454359585230

Nvidia dropping a model that rivals Qwen 3 8B, with data, with base model, and not that bad of a license (could be better, to be clear). A big win, love to see it. Hopefully it is well integrated into open tools and “easy to finetune” etc., which is hard to measure. https://x.com/natolambert/status/1957517030929887284

💥 We just launched ChatGPT Go in India, a special subscription tier just for Indian users. For Rs 399, you get 10x higher message limits, 10x more image generations, 10x more file uploads, and 2x more memory compared to the free tier. Give it a try—you can even pay with UPI! 🇮🇳 https://x.com/kevinweil/status/1957646363212087650

ChatGPT Go — a new low-cost subscription plan initially launching in India at ₹399/month (~$4.55 USD). 🇮🇳 https://x.com/gdb/status/1957650320923979996

ChatGPT Go launches in India! Looking forward to making ChatGPT more affordable in India first, and then learning from feedback to expand to other countries. https://x.com/sama/status/1957849495733166587

we are opening our first office in india later this year! and i’m looking forward to visiting next month. ai adoption in india has been amazing to watch–chatgpt users grew 4x in the past year–and we are excited to invest much more in india! https://x.com/sama/status/1958922390731464805

We just launched ChatGPT Go in India, a new subscription tier that gives users in India more access to our most popular features: 10x higher message limits, 10x more image generations, 10x more file uploads, and 2x longer memory compared with our free tier. All for Rs. 399. 🇮🇳 https://x.com/nickaturley/status/1957613818902892985

We just launched ChatGPT Go, a new low-cost subscription plan in India at ₹399/month. 🇮🇳 With this plan, users get everything in Free, and 10x more messages with GPT-5 auto, 10x more image generations, 10x more file uploads and 2x longer memory for more personalized responses. https://x.com/snsf/status/1957640122171896099

GPT-5 is behind Chinese models like Kimi-K2 and Qwen3-235B on coding https://x.com/scaling01/status/1956404452442681829

GPT-5-mini high shows no improvement over o4-mini and is behind top Chinese models like Kimi-K2, GLM-4.5, Qwen3-235B, and DeepSeek-R1 https://x.com/scaling01/status/1956405559978029061

AI everywhere — love seeing Qwen3 powering cars & robots on-device with Qualcomm NPU! 🚀 Thanks to NEXA AI 🙌 https://x.com/Alibaba_Qwen/status/1958800193970954657

Qwen 3 instruct is now on Baseten Model APIs. Our model performance team has worked quite a bit of magic to reach ~95tps for Qwen 3 Instruct. This gives you blazing fast responses for a state of the art reasoning model. https://x.com/basetenco/status/1956475210582090030

The @Alibaba_Qwen team landed two fixes after we released, so we did a patch release for that. Please update to the latest: 0.35.1. Notes: https://x.com/RisingSayak/status/1958057896731897940

Knobs that matter: α tunes performance vs. efficiency; accuracy rises fast until ~0.6, while cost stays low until ~0.4, then climbs. The implementation uses k‑means with k=60, Qwen3‑embedding‑8B (4096‑d), and top‑p=4 nearest clusters at inference. https://x.com/omarsar0/status/1958897532890943884
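
For illustration, the inference-time step described there (embed the query, pick the top-p = 4 nearest of the k = 60 centroids) can be sketched as below; the cosine-similarity metric and the toy 2-D data are my assumptions, not from the paper:

```python
import math

def nearest_clusters(query, centroids, k=4):
    """Return indices of the k centroids closest to `query` by cosine similarity.
    A minimal sketch of 'top-p=4 nearest clusters at inference'."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb)
    sims = [(cos(query, c), i) for i, c in enumerate(centroids)]
    return [i for _, i in sorted(sims, reverse=True)[:k]]

# Toy example: 6 centroids in 2-D, query pointing near the first two.
centroids = [(1, 0), (0.9, 0.1), (0, 1), (-1, 0), (0, -1), (0.5, 0.5)]
print(nearest_clusters((1, 0.05), centroids, k=4))  # → [0, 1, 5, 2]
```

In the real system the vectors would be 4096-d Qwen3-embedding-8B outputs and the centroids would come from k-means with k=60.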

Quick hacks for tool calling and thinking flag support for DeepSeek V3.1 in SGLang: https://t.co/EoUWKu4MEE Then run with: --tool-call-parser deepseekv31 --reasoning-parser qwen3 And in the request body: "chat_template_kwargs": {"thinking": true} This is up on @chutes_ai now, but… https://x.com/jon_durbin/status/1958488353478758599
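
For reference, assembling that request body from Python looks something like this; the "chat_template_kwargs" key is from the tweet, while the model name and message content are illustrative placeholders:

```python
import json

# OpenAI-style chat payload with the per-request thinking toggle quoted above.
# Only "chat_template_kwargs" is sourced from the tweet; the rest is a sketch.
payload = {
    "model": "deepseek-v3.1",  # hypothetical served-model name
    "messages": [{"role": "user", "content": "Summarize this diff."}],
    "chat_template_kwargs": {"thinking": True},  # False => Non-Think mode
}
body = json.dumps(payload)
print(body)
```

Setting `"thinking": False` in the same payload selects Non-Think mode, so a single deployment serves both behaviors.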

🐞 We hit a bug in the inference code for Qwen-Image-Edit on Diffusers, which caused some odd cases. ✅ Fixed now and thanks to Diffusers for the quick merge — give it another try! 🔗 Try it now: https://x.com/Alibaba_Qwen/status/1957840853277290703

AI Toolkit now supports fine tuning Qwen Image Edit and supports caching the text embeddings with the control images. I already trained a 3 bit ARA for it, which will allow you to train a LoRA at 1024 on a 5090 when caching the text embeddings. More in 🧵 https://x.com/ostrisai/status/1958932936620900666

It’s out, friends! Really great to see the state of things in image edits, video fidelity being pushed further and further, thanks to the community! This release also features new fine-tuning scripts for Qwen-Image and Flux Kontext (with support for image inputs). So, get busy https://x.com/RisingSayak/status/1957668389935096115

nano-banana, qwen-image-edit, what else? Try @StepFun_ai NextStep-1-Large-Edit – 14B AR model – Apache 2 license – Demo available on @huggingface – Pretrain model also made available Link below https://x.com/Xianbao_QIAN/status/1957749693485838448

qwen image edit is back at #1 trending model at @huggingface 👑 https://x.com/multimodalart/status/1958229738398634171

Qwen-Image pruning experiment. Going from 60 to 30 blocks, 20B params to 10B params. Removed block idx 2, 3, 4, 5, 7, 8, 10, 11, 12, 13, 14, 15, 16, 21, 23, 24, 40, 41, 42, 43, 44, 45, 49, 50, 51, 52, 53, 54, 55, 56 https://x.com/ostrisai/status/1957748358451503166

China reportedly discouraged purchase of NVIDIA AI chips due to ‘insulting’ Lutnick statements https://www.engadget.com/ai/china-reportedly-discouraged-purchase-of-nvidia-ai-chips-due-to-insulting-lutnick-statements-123055120.html

Discover more from Ethan B. Holland
