Image created with gemini-3.1-flash-image-preview and claude-sonnet-4-5. Image prompt: Using the provided reference image, preserve the deep midnight navy car hood, shallow depth-of-field sky background, chrome pedestal base, dramatic upward camera angle, and automotive advertisement lighting exactly as shown. Replace only the Mercedes star with a single chrome globe hood ornament featuring etched latitude/longitude meridian lines, mounted on the same pedestal at realistic ornament scale, polished and photorealistic. Add bold white sans-serif display text reading INTERNATIONAL across the upper portion of the image as a headline.

BREAKING 🚨: MiniMax released MiniMax M2.7, a new self-evolving model, achieving a score of 56.22% on SWE-Bench Pro. M2.7 was used for building complex agent harnesses during its own development. Users can now access MiniMax M2.7 via APIs and MiniMax Agent.
https://x.com/testingcatalog/status/2034250919345377604#m

During the iteration process, we also realized that the model’s ability to recursively evolve its harness is equally critical. Our internal harness autonomously collects feedback, builds evaluation sets for internal tasks, and based on this continuously iterates on its own
https://x.com/MiniMax_AI/status/2034315323109953605#m

Introducing MiniMax-M2.7, our first model which deeply participated in its own evolution, with an 88% win-rate vs M2.5 – Production-Ready SWE: With SOTA performance in SWE-Pro (56.22%) and Terminal Bench 2 (57.0%), M2.7 reduced intervention-to-recovery time for online incidents
https://x.com/MiniMax_AI/status/2034315320337522881#m

MiniMax Global Announces Full Year 2025 Financial Results – MiniMax News | MiniMax https://www.minimax.io/news/minimax-global-announces-full-year-2025-financial-results

Minimax M2.7 released! And it's a big one. Highlights: Self-evolving – first model that helped build itself, running 100+ autonomous optimization loops during its own RL training (30% internal improvement). Strong coder – 56.2% on SWE-Pro (near Opus 4.6), 55.6% on VIBE-Pro,
https://x.com/kimmonismus/status/2034269026353082422#m

MiniMax M2.7: Early Echoes of Self-Evolution – MiniMax News | MiniMax https://www.minimax.io/news/minimax-m27-en

ByteDance reportedly pauses global launch of its Seedance 2.0 video generator | TechCrunch https://techcrunch.com/2026/03/15/bytedance-reportedly-pauses-global-launch-of-its-seedance-2-0-video-generator/

ByteDance Suspends Launch of Video AI Model After Copyright Disputes With Hollywood — The Information https://www.theinformation.com/articles/bytedance-suspends-launch-video-ai-model-copyright-disputes-hollywood

Here is SeedProteo, our latest diffusion-based model for de novo all-atom protein design from ByteDance Seed! Our server is now live — feel free to give it a try! https://x.com/SeedFold/status/2033515503839514771

Introducing Forge | Mistral AI https://mistral.ai/news/forge

Introducing Mistral Small 4 | Mistral AI https://mistral.ai/news/mistral-small-4

Leanstral: Open-Source foundation for trustworthy vibe-coding | Mistral AI https://mistral.ai/news/leanstral

@_avichawla Impressive work from Kimi
https://x.com/elonmusk/status/2033528245464047805

🔥 @Kimi_Moonshot’s new Attention Residual paper is sparking discussions. Zhihu contributor OpenLLMAI shares a deep dive: “From Kimi’s Attention Residual to ‘Vertical Attention’ — an idea I’ve been thinking about for half a year.” Some interesting thoughts on attention mechanisms
https://x.com/ZhihuFrontier/status/2033751367198949865

Avi Chawla on X: “Big release from Kimi! They just released a new way to handle residual connections in Transformers. In a standard Transformer, every sub-layer (attention or MLP) computes an output and adds it back to the input via a residual connection. If you consider this across 40+ layers, https://t.co/5i5AN9tzIm” / X
https://x.com/_avichawla/status/2033472650836914495

https://chatgpt.com/share/69cda240-9324-832a-89b6-a43d4a22f437

https://claude.ai/share/7239e73e-9e9d-469a-bbdb-e5c7da75a4e9

Introducing 𝑨𝒕𝒕𝒆𝒏𝒕𝒊𝒐𝒏 𝑹𝒆𝒔𝒊𝒅𝒖𝒂𝒍𝒔: Rethinking depth-wise aggregation. Residual connections have long relied on fixed, uniform accumulation. Inspired by the duality of time and depth, we introduce Attention Residuals, replacing standard depth-wise recurrence with
https://x.com/Kimi_Moonshot/status/2033378587878072424
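
As a rough picture of the contrast being described: a standard residual stream accumulates every sub-layer output with fixed, uniform weight, while an attention residual, on my reading of the announcement (not necessarily the paper's exact formulation), lets each layer attend over the earlier sub-layer outputs and weight them adaptively.

```latex
% Standard residual stream: fixed, uniform depth-wise accumulation
h_{\ell} = h_{\ell-1} + F_{\ell}(h_{\ell-1}) = h_{0} + \sum_{k=1}^{\ell} F_{k}(h_{k-1})

% Illustrative "attention residual" reading: weight the earlier
% sub-layer outputs u_{0}, \dots, u_{\ell} with depth-wise attention
h_{\ell} = \sum_{j=0}^{\ell} \alpha_{\ell,j}\, u_{j},
\qquad \alpha_{\ell,\cdot} = \operatorname{softmax}_{j}\!\left(\frac{q_{\ell} \cdot k_{j}}{\sqrt{d}}\right)
```

One intuition for why this could help: the uniform sum tends to make the hidden-state norm grow with depth, whereas a softmax-weighted (convex) combination keeps it bounded, which lines up with the training-dynamics observation quoted further down.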

visual summary of attention residuals by kimi, beautiful paper
https://x.com/eliebakouch/status/2033488233854620007

i am actually still not over how Qwen as we knew it, one of the S tier Tigers, is over.
https://x.com/swyx/status/2033030744352993296

Announcing Copilot leadership update – The Official Microsoft Blog https://blogs.microsoft.com/blog/2026/03/17/announcing-copilot-leadership-update/

“a large jump in agentic” – we agree 🙌 M2.7 is a big step forward in agentic workflows, from tool use to real-world, multi-step execution. Now live on @OpenRouter 🚀
https://x.com/MiniMax_AI/status/2034356786413867182#m

🔍Follow Zhihu contributor toyama nao, a top large model reviewer, to evaluate @MiniMax_AI MiniMax-M2.7’s capabilities in detail!✨ 📌 Basic Info: MiniMax iterates monthly in the Agent-driven model track. As a minor version upgrade, M2.7 carries its new understanding of the
https://x.com/ZhihuFrontier/status/2034543142234628318

DEFAULT and FREE M2.7 on @zocomputer
https://x.com/MiniMax_AI/status/2034348503347171625#m

Early testers are saying that M2.7 has big improvements in emotional intelligence and character consistency 👀
https://x.com/MiniMax_AI/status/2034528945962696948

Great to see M2.7 live on @vercel_dev 🙌 We’re seeing a real shift from simple tool use → multi-step agentic workflows running in production. M2.7 is built for exactly that.
https://x.com/MiniMax_AI/status/2034357583797178841#m

Live Stream Alert with @OpenClaw Thursday 9PM ET We will share an in-depth look at MiniMax M2.7, including early developments in self-evolution and efficient solutions designed to support 100,000 OpenClaw running clusters. 🎁 MiniMax vouchers will also be distributed during
https://x.com/MiniMax_AI/status/2034520321466978488

M2.7 is already up😎 Try it on @kilocode.
https://x.com/MiniMax_AI/status/2034339731660759097#m

M2.7 now live on @yupp_ai 🌸 Feels like a good time to build something new.
https://x.com/MiniMax_AI/status/2034328337527783857#m

M2.7 now on @opencode ⚙️ give it a plan → it runs with it add the loop (check → fix → retry) and things start to feel very agentic
https://x.com/MiniMax_AI/status/2034361282527461473#m
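
The "check → fix → retry" loop mentioned above is a generic agent pattern rather than anything specific to opencode or M2.7. A minimal sketch of the idea, where `ask_model_to_fix` is a hypothetical placeholder rather than a real API:

```python
# Generic check -> fix -> retry loop around a coding model.
# Illustrative only; ask_model_to_fix is a hypothetical placeholder,
# not opencode's or MiniMax's actual API.
import subprocess


def run_checks() -> tuple[bool, str]:
    """Run the project's test suite; return (passed, combined output)."""
    proc = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
    return proc.returncode == 0, proc.stdout + proc.stderr


def ask_model_to_fix(failure_log: str) -> None:
    """Hypothetical: send the failure log to the model and apply its patch."""
    raise NotImplementedError


def agentic_loop(max_retries: int = 3) -> bool:
    for _attempt in range(max_retries):
        passed, log = run_checks()
        if passed:
            return True            # plan executed, checks are green
        ask_model_to_fix(log)      # feed failures back to the model, then retry
    return False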

Minimax 2.7 incoming!
https://x.com/kimmonismus/status/2033531736647463151

Minimax 2.7 is available in Hermes Agent through the Minimax Provider, try it today!
https://x.com/Teknium/status/2034658808870621274

MiniMax doubles in Hong Kong debut, marking yet another Chinese AI listing https://www.cnbc.com/2026/01/09/minimax-hong-kong-ipo-ai-tigers-zhipu.html

MiniMax has released MiniMax-M2.7, delivering GLM-5-level intelligence for less than one third of the cost. MiniMax-M2.7 from @MiniMax_AI scores 50 on the Artificial Analysis Intelligence Index, an 8-point improvement over MiniMax-M2.5, which was released one month ago. This is
https://x.com/ArtificialAnlys/status/2034313314420019462#m

MiniMax launches M2.7 model on MiniMax Agent and APIs https://www.testingcatalog.com/minimax-launches-m2-7-model-on-minimax-agent-and-apis/

MiniMax M2.7 now live on @Trae_ai Excited to see what you ship. 🙌
https://x.com/MiniMax_AI/status/2034327432124350924#m

MiniMax M2.7: Early Echoes of Self-Evolution
https://x.com/MiniMax_AI/status/2034335605145182659

MiniMax M2.7 🆚 MiniMax M2.5 – The release of M2.7 should be close. MiniMax M2.5 was released two days after it appeared on the Arena
https://x.com/AiBattle_/status/2033503838284447758

MiniMax-M2.7 is now available on Ollama’s cloud. Made for coding and agentic tasks 🖥️ Try it inside Claude Code: ollama launch claude --model minimax-m2.7:cloud 🦞 Use it with OpenClaw: ollama launch openclaw --model minimax-m2.7:cloud If you already have OpenClaw
https://x.com/ollama/status/2034351916097106424#m
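
Beyond the CLI integrations quoted above, the same model name can be used from the official ollama Python client. A minimal sketch, assuming your Ollama installation has access to the minimax-m2.7:cloud model:

```python
# Minimal sketch: chat with MiniMax-M2.7 through Ollama's Python client.
# Assumes `pip install ollama` and that minimax-m2.7:cloud is available
# to your Ollama installation (cloud access configured).
import ollama

response = ollama.chat(
    model="minimax-m2.7:cloud",
    messages=[
        {"role": "user", "content": "Write a failing pytest for an off-by-one bug, then fix it."}
    ],
)
print(response["message"]["content"])
```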

ByteDance also implemented attention over depth. They literally combined it with sequence attention.
https://x.com/rosinality/status/2033810580604158323

Europe moves slow. Unless…you design it differently. Killing bureaucracy: 14 days from deadline to signed contract! That’s how @SPRIND runs the €125M Frontier AI Challenge. • no 100-page proposals • fast selection focused on builders, not writers This is the mindset behind
https://x.com/IlirAliu_/status/2033830196642717910

Introducing My Computer: When Manus Meets Your Desktop https://manus.im/blog/manus-my-computer-desktop

Leanstral is part of the Mistral Small 4 family
https://x.com/scaling01/status/2033625927268126969

Analysis of training dynamics demonstrates how AttnRes naturally mitigates hidden-state magnitude growth and yields a more uniform gradient distribution across depth.
https://x.com/Kimi_Moonshot/status/2033378596438556853

I wrote something on Moonshot’s latest research release – Attention Residuals. Intuition, notes and how you can understand standard residuals vs mHC vs attention residuals.
https://x.com/tokenbender/status/2033437211371454915

Moonshot AI targets $1b raise, eyes $18b valuation https://www.techinasia.com/news/moonshot-ai-targets-1b-raise-eyes-18b-valuation

This is so damn cool! Transformers do attention across tokens, now imagine doing attention across layers too. This delivers a 1.25x compute efficiency, <4% training overhead on the 48B Kimi model, +7.5 on GPQA-Diamond. Kimi is quietly becoming the new DeepSeek for the coolest
https://x.com/Yuchenj_UW/status/2033404695880896804
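
To make "attention across layers" concrete, here is a toy sketch in the spirit of the depth-wise reading above: the current sub-layer output forms a query and attends over the stack of earlier hidden states instead of being added to a single running residual. This is an illustration of the general idea, not Kimi's actual Attention Residual module; all shapes and module names are my own assumptions.

```python
# Toy sketch of depth-wise ("vertical") attention over earlier layer outputs.
# Illustrative only -- not Moonshot/Kimi's actual Attention Residual design.
import torch
import torch.nn as nn


class DepthAttentionResidual(nn.Module):
    def __init__(self, d_model: int):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.scale = d_model ** -0.5

    def forward(self, history: torch.Tensor, current: torch.Tensor) -> torch.Tensor:
        # history: (L, batch, seq, d) stack of earlier sub-layer outputs
        # current: (batch, seq, d) output of the current sub-layer
        q = self.q_proj(current)                              # (batch, seq, d)
        k = self.k_proj(history)                              # (L, batch, seq, d)
        logits = torch.einsum("bsd,lbsd->bsl", q, k) * self.scale
        weights = logits.softmax(dim=-1)                      # attention over depth, per token
        mixed = torch.einsum("bsl,lbsd->bsd", weights, history)
        return mixed + current                                # adaptive mix replaces the uniform sum


# Usage: keep a running stack of sub-layer outputs across depth.
d_model = 64
block = DepthAttentionResidual(d_model)
history = torch.randn(3, 2, 5, d_model)   # 3 earlier layers, batch 2, seq 5
current = torch.randn(2, 5, d_model)
out = block(history, current)             # -> (2, 5, 64)
```

In this toy version a per-token softmax over depth replaces the uniform accumulation of a standard residual stream; as the infra-focused thread below notes, a real design also has to contend with the cost of keeping earlier layer outputs around during training.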

Oh wow, Mamba-3 is here! For me, the most interesting use case of Mamba and Mamba-likes are the recent transformer attention hybrid architectures (Qwen3.5, Kimi Linear, etc.) Would be interesting to swap Gated DeltaNet with Mamba-3 (which now also has RoPE) in next gen hybrids.
https://x.com/rasbt/status/2034088726997893168#m

📎We’ve uploaded it to arXiv, enjoy! https://x.com/Kimi_Moonshot/status/2033796781327454686

🔥 An insider take on @Kimi_Moonshot’s Attention Residual — From Kimi AI infra team member & Zhihu contributor Reku. A rare look at how attention ideas collide with real-world training systems 👇 🧠 Attention Residual isn’t just modeling — it’s an infra challenge I mainly worked
https://x.com/ZhihuFrontier/status/2034269774281400798#m

As a member of the Kimi team, I wrote the linked blog to share how our team tackles truly innovative work together–not just as individuals, but as a coordinated group. 💎I fully agree: “you can always trust the Kimi solidness.” For us, solidness means making ideas actually work
https://x.com/YyWangCS17122/status/2034273847164473820#m

For more details, check out our paper here:
https://x.com/Kimi_Moonshot/status/2033378599450079581

Thread by @Kimi_Moonshot on Thread Reader App – Thread Reader App https://threadreaderapp.com/thread/2033378587878072424.html

Xiaomi has released MiMo-V2-Pro, which scores 49 on the Artificial Analysis Intelligence Index, placing it between Kimi K2.5 and GLM-5. @Xiaomi’s MiMo-V2-Pro is a new reasoning model and a significant upgrade over their prior open weights release, MiMo-V2-Flash (309B total / 15B
https://x.com/ArtificialAnlys/status/2034239267052896516#m

The frontier has increasingly shifted to hybrid models – from Qwen to Kimi-Linear and now with NVIDIA’s Nemotron-3 Super – that rely on a strong linear sequence model. Today we release Mamba-3, the most powerful linear model to date.
https://x.com/tri_dao/status/2033948569502413245

I reverse engineered Qwen 3.5’s FP8 format, and provide a script to recreate it.
https://x.com/QuixiAI/status/2033419073401287156
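
For readers unfamiliar with FP8 weight formats, here is a generic per-tensor e4m3 round-trip in PyTorch. It illustrates the kind of conversion such a script performs; it is not QuixiAI's reverse-engineered Qwen 3.5 format, which may use block-wise scales or a different layout.

```python
# Generic per-tensor FP8 (e4m3) quantize/dequantize round-trip in PyTorch.
# Illustrative only -- not the actual Qwen 3.5 FP8 layout.
import torch

w = torch.randn(4096, 4096, dtype=torch.float32)
scale = w.abs().max() / 448.0                  # 448 is the largest finite e4m3 value
w_fp8 = (w / scale).to(torch.float8_e4m3fn)    # quantize (requires PyTorch >= 2.1)
w_back = w_fp8.to(torch.float32) * scale       # dequantize
print(f"max abs error: {(w - w_back).abs().max().item():.4f}")
```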

Pretty proud of this one! 😎 Qwen 3.5 Max Preview just hit #3 in Math, Top 10 in Arena Expert, and Top 15 overall! We’re already back in the lab optimizing the preview experience. Even sharper performance coming soon–stay tuned! 🚀
https://x.com/Alibaba_Qwen/status/2034658901321560549

Qwen 3.5 Max Preview has landed in top 10 for Arena Expert and top 15 for Text Arena. It shows particular strength in Math. Highlights: – #3 Math – #10 Expert – #15 Text Arena – Top 20 for Writing, Literature & Language, Life, Physical, & Social Science, Entertainment, Sports,
https://x.com/arena/status/2034653740465336407

With the preview of Qwen 3.5 Max Preview by @Alibaba_Qwen, we’re looking back at past Qwen Max variants to see how far it has progressed. Where Qwen 3.5 Max sees the largest gains vs. Qwen 3 Max: – Text Overall (+45pts) – Creative Writing (+57pts) – Math (+49pts) –
https://x.com/arena/status/2034658045113065603

ScreenSpot-Pro, the GUI computer-use benchmark, is now on @huggingface 🏆 Just added Qwen3.5: it takes 5th place, while the specialist Holo2 family takes the top ranks. Whoever builds the next GUI model based on Qwen3.5 can top the leaderboard? 🔥
https://x.com/mervenoyann/status/2034265145158119642#m

Good news: I got Qwen3.5-397B-FP8 running on my 8x mi210 server. Bad news: at 6 tokens per second.
https://x.com/QuixiAI/status/2033342155414982952
