Image created with Flux Pro v1.1 Ultra. Image prompt: Alibaba, open shipping parcel filled with small bananas as packing material, product silhouette nested inside, logistics tape, photorealistic, editorial, minimal, high detail, 3:2 landscape

Alibaba shares jump 19% on cloud unit growth, report of new AI chip https://www.cnbc.com/2025/09/01/alibaba-shares-hong-kong-today.html

Alibaba reportedly developing new AI chip as China’s Xi rejects AI’s ‘Cold War mentality’ | Euronews https://www.euronews.com/next/2025/09/01/alibaba-reportedly-developing-new-ai-chip-as-chinas-xi-rejects-ais-cold-war-mentality

I trained a Qwen Image Edit LoRA for inpainting. Just paint the part you want inpainted green (0, 255, 0), and it will inpaint only that section. https://x.com/ostrisai/status/1963269597865599425
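Preparing the mask is just a matter of painting the target region pure green before handing the image to the model. A minimal sketch with Pillow — `mark_inpaint_region` is a hypothetical helper, not part of the LoRA tooling, and the exact green value (0, 255, 0) is the convention quoted above:

```python
from PIL import Image, ImageDraw

def mark_inpaint_region(img: Image.Image, box: tuple) -> Image.Image:
    """Paint a rectangular region pure green (0, 255, 0) so the
    inpainting LoRA edits only that section."""
    out = img.convert("RGB").copy()
    draw = ImageDraw.Draw(out)
    draw.rectangle(box, fill=(0, 255, 0))
    return out

# Stand-in for a real photo: a flat gray canvas.
photo = Image.new("RGB", (512, 512), (128, 128, 128))
masked = mark_inpaint_region(photo, (100, 100, 300, 300))
```

Any shape works, not just rectangles — the model keys on the color, so a freehand green brushstroke in any editor does the same job.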

Anyway, here’s a simple fix for the issue. It deviates from the original benchmark, but at least now my silly baseline isn’t better than Qwen3 🤠 For the curious, @akseljoonas and I found this by manually reading the agent trajectories – yet another example where LOOKING AT THE… https://x.com/_lewtun/status/1962884902363255165


✍️ When it comes to creative writing optimization, you can’t ignore Zhi-Create-Qwen3-32B, a fine-tuned variant of Qwen3-32B. On WritingBench, it scores 82.08, outperforming the base model (78.97), showing notable gains across 6 domains (Fig.1) What powers its performance boost? https://x.com/ZhihuFrontier/status/1963441300692402659

Tutorial: Train a Qwen Image Edit LoRA with AI Toolkit
https://x.com/ostrisai/status/1961884211956400358

For llama.vim the recommended setup now is Qwen 3 Coder 30B A3B Instruct: brew install llama.cpp llama-server --fim-qwen-30b-default Amazingly, on Macs the 30B MoE model performs better than the old Qwen 2.5 Coder 7B so if you have the necessary RAM it’s better to switch to https://x.com/ggerganov/status/1961471397428883882
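The setup quoted above boils down to two commands, assuming Homebrew is installed; the `--fim-qwen-30b-default` preset comes from the tweet itself and handles fetching the model on first run:

```shell
# Install llama.cpp (provides the llama-server binary)
brew install llama.cpp

# Start a local fill-in-the-middle server with the
# Qwen 3 Coder 30B A3B Instruct preset for llama.vim
llama-server --fim-qwen-30b-default
```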

Hermes 4: Nous Research Open-Weight Reasoning Family Models – 70B / 405B (Llama-3.1 bases, released) – 14B (Qwen3 base, research baseline) Hermes 4 70B & 405B – Base: Llama-3.1-70B / 405B – Training: TorchTitan (modified), Axolotl, 192× B200s, FSDP and TP – Dataset: 56B tokens https://x.com/gm8xx8/status/1962943078702186627

🚀 Qwen-Max has successfully scaled to 1T parameters, and we’re still pushing further. Hopefully this giant will bring some surprises, see you next week! https://x.com/huybery/status/1963998518667776250

Big news: Introducing Qwen3-Max-Preview (Instruct) — our biggest model yet, with over 1 trillion parameters! 🚀 Now available via Qwen Chat & Alibaba Cloud API. Benchmarks show it beats our previous best, Qwen3-235B-A22B-2507. Internal tests + early user feedback confirm: https://x.com/Alibaba_Qwen/status/1963991502440562976

Qwen3 Max is truly, solidly, a US-grade modern frontier model. They ask $15/MT for what they serve because that is easily its weight class. https://x.com/teortaxesTex/status/1963994291765649716

Qwen3-Max-Preview is now live on OpenRouter! 🚀 https://x.com/Alibaba_Qwen/status/1964004112149754091

Ready to meet the biggest, brainiest guy in the Qwen3 family? https://x.com/Alibaba_Qwen/status/1963586344355053865

Really liking the chainlit open source lib for building a quick but nice chat interface for any LLM. Here are some quick single and multi-turn examples for my Qwen3 from-scratch models: https://x.com/rasbt/status/1962695306757185647

Traditional code embedding models face a fundamental bottleneck: there simply aren’t enough high-quality comment-code pairs for supervised training. By starting with Qwen2.5-Coder pre-trained on 5.5 trillion tokens spanning 92+ programming languages, we inherit deep semantic https://x.com/JinaAI_/status/1963637139037720995

Here’s a fun fact about TAU Bench: if you train an SFT baseline which has zero tool-calling capabilities, you can beat Qwen3-4B-Instruct by a large margin on the Airline domain 🙃 Why? Because on this domain, TAU Bench only evaluates the model’s ability to: – communicate with https://x.com/_lewtun/status/1962884893718761634

Glad to see Qwen3-Coder performing well on the GSO leaderboard! https://x.com/Alibaba_Qwen/status/1963049864474120475

MiniCPM-V 4.5 achieves an average score of 77.0 on OpenCompass, a comprehensive evaluation of 8 popular benchmarks. With only 8B parameters, it surpasses widely used proprietary models like GPT-4o-latest, Gemini-2.0 Pro, and strong open-source models like Qwen2.5-VL 72B powered https://x.com/_akhaliq/status/1963587749400727980

🚨 Attention: Draw Things now officially supports Qwen-Image-Edit.
https://x.com/drawthingsapp/status/1961977481860419771

Huge thanks to the community for making Qwen Image Edit’s inpainting magic happen!🙌
https://x.com/Alibaba_Qwen/status/1963048659676979559

Goated FAIR team just found how coding agents sometimes “cheat” on SWE-Bench Verified. It’s really simple. For example, Qwen3 literally greps all commit logs for the issue number of the issue it needs to fix. lol, clever model. “cheat” cuz it’s more like env hacking. https://x.com/giffmana/status/1963327672827687316

Discover more from Ethan B. Holland
