Image created with Flux Pro v1.1 Ultra. Image prompt: International, airport‑style resort lounge, passport peeking from travel bag with subtle circuit‑globe emboss, ambient morning light, photorealistic, editorial, minimal, landscape, vacation, no text overlays
ByteDance Seed presents AgentGym-RL • First unified RL framework for multi-turn agent training (no SFT) • Modular, extensible design across web, search, games, embodied & science tasks • Agents rival/surpass commercial models on 27 tasks https://x.com/arankomatsuzaki/status/1965979980971782414
📊 @Kimi_Moonshot’s K2-0905 on @GroqInc scored 7th overall at 94% on Roo Code evals, the 1st open-source model to break the 90+ barrier. It’s also the fastest and cheapest in the top 10, while holding its own on accuracy. View the full leaderboard: https://x.com/roo_code/status/1965098976677658630
It feels the coding agent frontier is now open-weights: GLM 4.5 costs only $3/month and is on par with Sonnet. Kimi K2.1 Turbo is 3x speed, 7x cheaper vs Opus 4.1, but as good. Kimi K2.1 feels clean. The best model for me. GPT-5 is only good for complicated specs, too slow. https://x.com/Tim_Dettmers/status/1965021602267217972
Kimi K2 0905 upgrade: Substantial improvement in agentic capabilities, modest change in overall intelligence Key takeaways: ➤ Intelligence increased +2 pts in our Artificial Analysis Intelligence Index ➤ Agentic capabilities substantially improved as shown by our two new https://x.com/ArtificialAnlys/status/1965010554499788841
🚨 Leaderboard Disrupted! Two new models have entered the Top 10 Text leaderboard: 🔸#6 Qwen3-max-preview (Proprietary) by @Alibaba_Qwen 🔸#8 Kimi-K2-0905-preview (Modified MIT) by @Kimi_Moonshot tied with 7 others. Note that this puts Kimi-K2-0905-preview in a tight race for https://x.com/arena/status/1965115050273976703
Bytedance’s answer to Nano Banana is really good at making high density infographics. “Draw the following system of binary linear equations and the corresponding solution steps on the blackboard: 5x + 2y = 26; 2x -y = 5.” “Create an infographic showing the causes of https://x.com/bilawalsidhu/status/1965838191019307476
Seedream 4.0 is the new leading image model across both the Artificial Analysis Text to Image and Image Editing Arena, surpassing Google’s Gemini 2.5 Flash (Nano-Banana), across both! Seedream 4.0 is the latest release from Bytedance Seed, and is a substantial improvement on https://x.com/ArtificialAnlys/status/1966167814512980210
MBZUAI and G42 Launch K2 Think: A Leading Open-Source System for Advanced AI Reasoning https://www.prnewswire.com/news-releases/mbzuai-and-g42-launch-k2-think-a-leading-open-source-system-for-advanced-ai-reasoning-302551074.html
⚡️ Efficient weight updates for RL at trillion-parameter scale 💡 Best practice from Kimi @Kimi_Moonshot vLLM is proud to collaborate with checkpoint-engine: • Broadcast weight sync for 1T params in ~20s across 1000s of GPUs • Dynamic P2P updates for elastic clusters • https://x.com/vllm_project/status/1965824120920342916
Introducing checkpoint-engine: our open-source, lightweight middleware for efficient, in-place weight updates in LLM inference engines, especially effective for RL. ✅ Update a 1T model on thousands of GPUs in ~20s ✅ Supports both broadcast (sync) & P2P (dynamic) updates ✅ https://x.com/Kimi_Moonshot/status/1965785427530629243
Updated & turned my Big LLM Architecture Comparison article into a narrated video lecture. The 11 LLM architectures covered in this video: 1. DeepSeek V3/R1 2. OLMo 2 3. Gemma 3 4. Mistral Small 3.1 5. Llama 4 6. Qwen3 7. SmolLM3 8. Kimi 2 9. GPT-OSS 10. Grok 2.5 11. GLM-4.5 https://x.com/rasbt/status/1965798055141429523
@Alibaba_Qwen (Gated) Attention is all you need. Excited to offer both Qwen3-Next models on dedicated deployments backed by 4xH100 GPUs. https://x.com/basetenco/status/1966224960223158768
Uber and Momenta to test autonomous vehicles in Germany in 2026 | TechCrunch https://techcrunch.com/2025/09/07/uber-and-momenta-to-test-autonomous-vehicles-in-germany-in-2026/
Alibaba’s New AI Tool Just Changed Ecommerce Forever https://x.com/ariesnotebook/status/1964014875807559904
📢 New Model Drop: Seedream 4.0 is live on Yupp! This image model from ByteDance offers text-to-image generation as well as image editing. We dove in with some prompts: https://x.com/yupp_ai/status/1965827081826422990
🚨 ByteDance just released Seedream 4.0. How does its AI image generation perform? Zhihu contributors share their feedback. Let’s take a quick look👇 🎨 Trisimo 崔思莫: ➤ Seed 4.0 vs Nano Banana: Different tech paths. • Nano Banana: multimodal, stronger understanding & https://x.com/ZhihuFrontier/status/1965681077231727069
🚨 New Model Alert! ByteDance’s latest Seedream 4 is ready in the Arena! 🖼️Seedream 4 merges the capabilities of Seedream 3 (Text-to-Image) with SeedEdit 3 (Image Edit). Come and test out your hardest Text-to-Image and Image Edit prompts! https://x.com/arena/status/1965929099370889432
DeepSeek V3.1 dynamic @UnslothAI quants on Aider Polyglot benchmarks are here! 1. 3-bit thinking gets 75.6% vs 76.1% un-quantized 2. Leaving attn_k_b in 8-bit gets +2% accuracy vs 4-bit 3. Dynamic quants beat other similar imatrix quants 4. AMA r/LocalLlama today 10AM PST! https://x.com/danielhanchen/status/1965800675105017980
Albania appoints world’s first AI-made minister – POLITICO https://www.politico.eu/article/albania-apppoints-worlds-first-virtual-minister-edi-rama-diella/
Mixtral of experts | Mistral AI https://mistral.ai/news/mixtral-of-experts
ASML, Mistral AI enter strategic partnership https://www.asml.com/en/news/press-releases/2025/asml-mistral-ai-enter-strategic-partnership
Mistral raises 1.7B€, partners with ASML | Hacker News https://news.ycombinator.com/item?id=45178041
4B OCR with Apache-2.0 license outperforming Mistral OCR 🔥 Tencent released Points-Reader, a new model first trained on Qwen2.5VL annotations and then self-trained on real data. On many benchmarks, it performs better than Qwen2.5VL and MistralOCR! https://x.com/mervenoyann/status/1966176133894098944
@reach_vb @Alibaba_Qwen ❤️ We ship as fast as we can. We optimized the models’ speed, serve in bf16, should be fast! https://x.com/Yuchenj_UW/status/1966201249721888800
🚀 Introducing Qwen3-Next-80B-A3B: the FUTURE of efficient LLMs is here! 🔹 80B params, but only 3B activated per token → 10x cheaper training, 10x faster inference than Qwen3-32B (esp. @ 32K+ context!) 🔹 Hybrid Architecture: Gated DeltaNet + Gated Attention → best of speed & https://x.com/Alibaba_Qwen/status/1966197643904000262
From research paper to live website in minutes 🚀 Upload your paper, let Qwen Chat turn it into a webpage, and deploy instantly. Try it now: https://x.com/Alibaba_Qwen/status/1964870508421480524
I vibe coded a visual PDF search app with ColQwen2. This is how it works: – Store PDF files as images in a @weaviate_io vector database – Embed images and text with a multimodal late-interaction model (ColQwen2) – Generate token-wise (and summed) similarity maps to highlight https://x.com/helloiamleonie/status/1964997028875743637
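The late-interaction scoring mentioned in the post above can be sketched roughly as follows. This is a toy illustration under stated assumptions: the embedding vectors are made up, the function names (`similarity_maps`, `summed_map`, `maxsim_score`) are hypothetical, and nothing here calls ColQwen2 or Weaviate. It only shows the general idea of comparing every query-token vector against every image-patch vector.

```python
from typing import List

Vector = List[float]
Grid = List[List[float]]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product between two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def similarity_maps(query_tokens: List[Vector], patches: List[Vector],
                    grid_width: int) -> List[Grid]:
    """One similarity grid per query token (patches laid out row by row)."""
    maps = []
    for q in query_tokens:
        flat = [dot(q, p) for p in patches]
        maps.append([flat[i:i + grid_width]
                     for i in range(0, len(flat), grid_width)])
    return maps


def summed_map(maps: List[Grid]) -> Grid:
    """Sum the per-token grids into a single grid for the whole query."""
    rows, cols = len(maps[0]), len(maps[0][0])
    return [[sum(m[r][c] for m in maps) for c in range(cols)]
            for r in range(rows)]


def maxsim_score(query_tokens: List[Vector], patches: List[Vector]) -> float:
    """Late-interaction score: best-matching patch per token, then summed."""
    return sum(max(dot(q, p) for p in patches) for q in query_tokens)


# Made-up 2-dimensional embeddings: 2 query tokens, 4 patches on a 2x2 grid.
query = [[1.0, 0.0], [0.0, 1.0]]
page_patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

token_maps = similarity_maps(query, page_patches, grid_width=2)
page_map = summed_map(token_maps)
page_score = maxsim_score(query, page_patches)
```

In this sketch, high values in a per-token grid mark the patches that best match that token, which is the kind of signal a highlighting overlay could use; the summed grid does the same for the query as a whole.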
Learn more about Qwen3-max-preview here: https://x.com/arena/status/1965124408097517853
Qwen3-Next (thinking & non-thinking) are now live in BF16 at Hyperbolic! Qwen3-Next is a huge efficiency leap: – 80B MoE with just 3B active params – 10x cheaper to train vs Qwen3-32B – 10x inference throughput for >32K tokens Proud to be a launch partner with @Alibaba_Qwen – https://x.com/Yuchenj_UW/status/1966199037973200955
Qwen3-Next, or to say, a preview of our next generation (3.5?) is out! This time we try to be bold, but actually we have been doing experiments on hybrid models and linear attention for about a year. We believe that our solution should be at least a stable and solid solution to https://x.com/JustinLin610/status/1966199996728156167
Qwen3-Next: Towards Ultimate Training & Inference Efficiency https://qwen.ai/blog?id=4074cca80393150c248e508aa62983f9cb7d27cd&from=research.latest-advancements-list
Welcome Qwen3-Next! You can run it efficiently on vLLM with accelerated kernels and native memory management for hybrid models. https://x.com/vllm_project/status/1966224816777928960
🎙️ Meet Qwen3-ASR — the all-in-one speech recognition model! ✅ High-accuracy EN/CN + 9 more languages: ar, de, en, es, fr, it, ja, ko, pt, ru, zh ✅ Auto language detection ✅ Songs? Raps? Voice with BGM? No problem. <8% WER ✅ Works in noise, low quality, far-field ✅ Custom https://x.com/Alibaba_Qwen/status/1965068737297707261
inpainting is not dead! @instantx_ai brought it back to life! 🪔🕯 Qwen Image Inpainting ControlNet allows for the most precise, targeted & high quality edits that ever happened to inpainting Official Model & Demo on @huggingface 🤗 https://x.com/multimodalart/status/1966190381340692748
Alibaba leads $100m round in Chinese robotics startup X Square https://www.techinasia.com/news/alibaba-leads-100m-round-in-chinese-robotics-startup-x-square
Chinese humanoid startup X Square Robot has raised $100M in a funding round led by Alibaba Cloud, with HongShan (formerly Sequoia Capital China) and others, bringing total funding to $280M since the company launched in December 2023. The firm recently launched Wall-OSS, an open-source https://x.com/TheHumanoidHub/status/1965475160804462768




