Image created with gemini-3.1-flash-image-preview and claude-sonnet-4-5. Image prompt: Animation cel style illustration of a muscular blue-skinned genie emerging from a golden oil lamp, using magical teal wisps to levitate and launch a sleek silver rocket ship toward a glowing crescent moon against a deep purple starry night sky, Disney Pixar quality hand-drawn aesthetic with clean outlines, jewel tones, and cinematic lighting, horizontal composition with space for title text across the middle.

Moonshot’s Kimi K2.5 is the new leading open weights model, now closer than ever to the frontier – with only OpenAI, Anthropic and Google models ahead. Key takeaways: ➤ Impressive performance on agentic tasks: @Kimi_Moonshot’s Kimi K2.5 achieves an Elo of 1309 on our GDPval-AA… https://x.com/ArtificialAnlys/status/2016250137115557953

very nice release by the kimi team, benchmarks are on par with opus 4.5, gpt 5.2 xhigh, and gemini 3.0 pro. there are also some nice details on the parallel RL part in the tech blog explaining how they built the K2.5 agent swarm. https://x.com/eliebakouch/status/2016025747144483060?s=20

Running Kimi K2.5 on my desk. Runs at 24 tok/sec with 2 x 512GB M3 Ultra Mac Studios connected with Thunderbolt 5 (RDMA) using @exolabs / MLX backend. Yes, it can run clawdbot. https://x.com/alexocheema/status/2016404573917683754

Kimi K2.5: Now Top 1 on the OSWorld leaderboard. 🏆 With its Computer Use capabilities, you can now build powerful agents that navigate and operate computer interfaces just like a human. https://x.com/Kimi_Moonshot/status/2017292360099762378

[AINews] Moonshot Kimi K2.5 – Beats Sonnet 4.5 at half the cost, SOTA Open Model, first Native Image+Video, 100 parallel Agent Swarm manager https://www.latent.space/p/ainews-moonshot-kimi-k25-beats-sonnet

🚨BREAKING: Kimi K2.5 Thinking by @Kimi_Moonshot debuts in Text Arena as the #1 open model, surpassing GLM-4.7 and ranking #15 overall. Highlights: – #1 Open model (+5pts vs GLM-4.7) – #7 Coding – #7 Instruction Following – #14 Hard Prompts One of only two open models to break… https://x.com/arena/status/2016294722445443470

One-shot “Video to code” result from Kimi K2.5. It not only clones a website, but also all the visual interactions and UX designs. No need to describe it in detail, all you need to do is take a screen recording and ask Kimi: “Clone this website with all the UX designs.” https://x.com/KimiProduct/status/2016081756206846255

🚨Leaderboard update: Tencent’s Hunyuan-Image-3.0-Instruct now ranks #7 in the Image Edit Arena! A new lab breaks into the top-10, closely matching Nano-Banana and Seedream-4.5. Congrats to @TencentHunyuan on the huge milestone! 👏 https://x.com/arena/status/2015846799446311337

🚨MAJOR DROP: Kimi K2.5 just landed on Together AI 🚀 Introducing Kimi K2.5 from @kimi_moonshot, a 1T parameter native multimodal thinking agent with Agent Swarm orchestration and vision-grounded coding. AI natives can now use Kimi K2.5 on Together AI and benefit from reliable… https://x.com/togethercompute/status/2016306907015938510

Introducing the Kimi Product account 🥳 Kimi Product will share features, use cases, and prompts to help you master Kimi products like Kimi Agent, Kimi Slides and Kimi Code. https://x.com/Kimi_Moonshot/status/2016082808834531825

Kimi K2.5 Tech Blog: Visual Agentic Intelligence https://www.kimi.com/blog/kimi-k2-5.html

> built through continual pretraining on approximately 15 trillion mixed visual and text tokens atop Kimi-K2-Base … It’s essentially a totally new model with new abilities. 30T tokens @ Muon. «Kimi K2.5 represents a meaningful step toward AGI for the open-source community» wow ok https://x.com/teortaxesTex/status/2016027034653164004

Kimi K2.5 API: Pro Performance, Accessible Pricing. 🔹 No more choosing between latency and cost > K2.5 delivers Turbo-level speed (60-100 tok/s) as the default. > Input pricing is 50% lower than the K2 Turbo, and only 20% of the cost of Claude 4.5 Sonnet. 🔹 Optimized for Drop… https://x.com/Kimi_Moonshot/status/2016114773407236471
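
For developers who want to try the hosted model directly, Moonshot’s platform exposes an OpenAI-compatible API. Here is a minimal sketch of a chat call; the base URL and the model id “kimi-k2.5” are illustrative assumptions, so check the platform docs for the exact endpoint and model name.

```python
# Minimal sketch: calling Kimi K2.5 through Moonshot's OpenAI-compatible API.
# Assumptions: the base_url and the model id "kimi-k2.5" are placeholders --
# verify both against the current Moonshot platform documentation.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["MOONSHOT_API_KEY"],  # your platform key
    base_url="https://api.moonshot.ai/v1",   # assumed OpenAI-compatible endpoint
)

resp = client.chat.completions.create(
    model="kimi-k2.5",                       # assumed model id
    messages=[{"role": "user", "content": "Summarize the Kimi K2.5 release in two sentences."}],
    temperature=0.6,
)
print(resp.choices[0].message.content)
```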

🎉🎉🎉 Kimi K2.5 is on Ollama’s cloud: ollama run kimi-k2.5:cloud You can connect it to Claude Code, Codex, OpenCode, Clawdbot, and Droid via ollama launch! ollama launch claude --model kimi-k2.5:cloud https://x.com/ollama/status/2016086374005538932
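
Once the model is available through Ollama, it can also be reached programmatically via Ollama’s local HTTP API. A minimal sketch, assuming Ollama is running on its default port and that the kimi-k2.5:cloud tag resolves for your account:

```python
# Minimal sketch: chatting with the Ollama-hosted model over the local HTTP API.
# Assumes Ollama is running on the default localhost:11434 and that the
# "kimi-k2.5:cloud" tag is available for your account; adjust if needed.
import json
import urllib.request

payload = {
    "model": "kimi-k2.5:cloud",
    "messages": [{"role": "user", "content": "Write a haiku about agent swarms."}],
    "stream": False,
}
req = urllib.request.Request(
    "http://localhost:11434/api/chat",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as r:
    print(json.loads(r.read())["message"]["content"])
```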

Hey @Kimi_Moonshot, this one sentence is the reason thousands of teams aren’t looking up from Qwen. Modified licenses are a scourge for enterprise teams. If A-teams use your model, people _will find out_. Insisting on a prominent logo limits your audience; it doesn’t grow it. https://x.com/dbreunig/status/2016531878795256286

Kimi K2.5, a new state-of-the-art open source reasoning model from Moonshot AI, is now available for Perplexity Pro and Max subscribers. We host Kimi K2.5 on Perplexity’s own inference stack in the US, giving us tighter control over latency, reliability, and security for users. https://x.com/perplexity_ai/status/2017333346611958179

I like Kimi K2.5, but I threw a few OOD images at it and got an absolute slop hallucination in response, guided by text alone. Kimi’s natural propensity to confidently hallucinate + “zero vision SFT” = not remotely in Gemini’s perceptual tier. Maybe in K3. https://x.com/teortaxesTex/status/2017302633048879369

Can Kimi K2.5 actually compete with closed-source models on real tasks? That’s what I wanted to find out. I set up a simple test last night. Took a UI mockup image, dropped it into Cline, and gave it the same prompt: build this website, frameworks are fine. Then I ran the exact… https://x.com/JuanPa/status/2016634998988865571

K2.5 went through a long post-training process to really unleash the potential of the base model. Using SFT on text alone to bootstrap vision RL, and seeing vision RL improve text performance, made me rethink how generalization really works. https://x.com/zxytim/status/2017252738229494067

We hope TRACE enables more robust reward function design and better detection in RL training pipelines! 🤖 Dataset: https://t.co/ILAdvS4i9R Paper: https://t.co/TdcMfPhmR6 Work done @PatronusAI ❤️ Models used: @AnthropicAI @OpenAI @GeminiApp @Kimi_Moonshot @Zai_org @deepseek_ai https://x.com/getdarshan/status/2017054380630167804

We put the #1 open source model: Kimi-K2.5 to the test. Our AI Capabilities Lead @petergostev shares first impressions of @Kimi_Moonshot’s latest model, probing its reasoning, data visualization, and performance on complex prompts, and how it compares on Arena’s leaderboards. https://x.com/arena/status/2016915717539713236

Kimi K2.5 1T runs on 2 M3 Ultras with mlx-lm in its native precision. It’s actually quite usable. Here it’s making a space invaders game. Generated 3856 tokens at 21.9 tok/sec using 350GB per machine. Thanks to @kernelpool for the port. https://x.com/awnihannun/status/2016221496084205965

Kimi-K2.5/tech_report.pdf at master · MoonshotAI/Kimi-K2.5 https://github.com/MoonshotAI/Kimi-K2.5/blob/master/tech_report.pdf

Here’s a short video from our founder, Zhilin Yang. (It’s his first time speaking on camera like this, and he really wanted to share Kimi K2.5 with you!) https://x.com/Kimi_Moonshot/status/2016065333694771276

Here’s the command to run it and the game it made (which seems quite good): mlx.launch --verbose --backend jaccl --hostfile m3-ultra-jaccl.json --env MLX_METAL_FAST_SYNCH=1 -- /Users/awni/mlx-lm/mlx_lm/examples/sharded_generate.py --model moonshotai/Kimi-K2.5 --prompt "Write… https://x.com/awnihannun/status/2016223103081443342
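
The mlx.launch invocation above shards the ~1T-parameter model across two machines over RDMA. For a sense of the underlying mlx-lm API on a single machine, here is a minimal sketch; the small placeholder model id is an assumption for illustration only, since Kimi K2.5 itself requires the sharded multi-machine setup shown above.

```python
# Single-machine mlx-lm sketch (illustrative only -- Kimi K2.5 needs the sharded,
# two-machine setup above). The small model id below is a placeholder chosen to
# demonstrate the load/generate API, not part of the original post.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Qwen2.5-0.5B-Instruct-4bit")  # placeholder model
text = generate(
    model,
    tokenizer,
    prompt="Write a tiny space-invaders game loop in Python.",
    max_tokens=256,
    verbose=True,
)
print(text)
```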

Kimi K2.5 tech report is beautiful. https://x.com/eliebakouch/status/2017257476538724819

Kimi K2.5 is free for a week on Kilo Code. This model beats Opus 4.5 on several coding benchmarks. https://x.com/kilocode/status/2016449095511007535

Kimi K2.5 AMA on r/LocalLLaMA, don’t miss out! https://x.com/Kimi_Moonshot/status/2016443435553890419

Kimi K2.5 became the #1 most-used model on Kilo Code via OpenRouter. 🏆 https://x.com/Kimi_Moonshot/status/2017105810242011285

Kimi ranks Top 3 on OpenRouter’s total usage chart 🚀 and keeps climbing! https://x.com/Kimi_Moonshot/status/2017105020274233358

Making your quota go further. No More Waste: In the old system, a simple “Hello World” cost the same quota as refactoring hundreds of lines of code. That’s history. Precision Billing: With Token-based billing, usage is calculated by actual length. Quick queries cost tiny amounts… https://x.com/Kimi_Moonshot/status/2016918450992812443
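
The point of token-based billing is simply that cost scales with actual usage. A toy calculation, using purely hypothetical per-token prices (the announcement does not publish exact figures), shows why a quick “Hello World” query no longer costs the same as a large refactor:

```python
# Toy illustration of token-based billing. The per-million-token prices below are
# hypothetical placeholders -- the announcement does not publish exact figures.
PRICE_PER_M_INPUT = 0.60    # hypothetical $ per 1M input tokens
PRICE_PER_M_OUTPUT = 2.50   # hypothetical $ per 1M output tokens

def cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of a single request under simple per-token pricing."""
    return (input_tokens * PRICE_PER_M_INPUT + output_tokens * PRICE_PER_M_OUTPUT) / 1_000_000

print(f"'Hello World' query: ${cost(20, 50):.6f}")        # a few dozen tokens
print(f"Large refactor task: ${cost(40_000, 12_000):.4f}")  # tens of thousands of tokens
```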

🤗 Fireworks AI is our launch partner for Kimi K2.5. Thanks for the incredibly fast support! https://x.com/Kimi_Moonshot/status/2016057073000448234

Kimi K2.5 released this morning and I dug into what it’s about and what seems interesting (to me): – Continual pretraining with ~15T mixed visual and text tokens (probably on top of K2 think, @eliebakouch is that what you think too?) – Max context doubled from 128k to 256k using… https://x.com/TheZachMueller/status/2016183468430860587

@petergostev @Kimi_Moonshot Test Kimi-K2.5 for yourself in the Code Arena and see how it does with agentic tasks. Get your votes in… score release coming soon: https://x.com/arena/status/2016923733513105705

One more thing: you can customize your own agent using Kimi Agent SDK. Check out: https://x.com/Kimi_Moonshot/status/2016034272998809678

After watching the video about Kimi-K2.5, it became even clearer to me how much ambition, energy, and will Chinese AI companies are pouring into putting pressure on US AI companies. The agent swarm is fascinating – I love it! https://x.com/kimmonismus/status/2016100119100145995

Introducing Kimi Code, an open-source coding agent under the Apache 2.0 License. 🔹 Python-based, easy to extend. 🔹 Fully transparent — clear, safe, reliable. 🔹 Seamlessly integrates with VS Code, Cursor, JetBrains, Zed, and more. 🔹 Fully-featured & out-of-the-box ready. https://x.com/Kimi_Moonshot/status/2016034259350520226

You share, we care. Kimi Code is now powered by our best open coding model, Kimi K2.5 🔹 Permanent Update: Token-Based Billing We’re saying goodbye to request limits. Starting today, we are permanently switching to a Token-Based Billing system. All usage quotas have been reset… https://x.com/Kimi_Moonshot/status/2016918447951925300

Kimi K2.5 is #1 on Design Arena 🏆 https://x.com/Kimi_Moonshot/status/2017158490930999424

Kimi K2.5 is #1 Open Model for Coding 🏆 https://x.com/Kimi_Moonshot/status/2016521406906028533

Kimi K2.5 is #1 Open Model in VoxelBench 🏆 https://x.com/Kimi_Moonshot/status/2016732248800997727

Kimi K2.5 now on Eigent 🤗 https://x.com/Kimi_Moonshot/status/2016473945957155252

Kimi K2.5 tech report just dropped! Quick hits: – Joint text-vision training: pretrained with 15T vision-text tokens, zero-vision SFT (text-only) to activate visual reasoning – Agent Swarm + PARL: dynamically orchestrated parallel sub-agents, up to 4.5× lower latency, 78.4% on… https://x.com/Kimi_Moonshot/status/2017249233775260021

Kimi K2.5 having fully multimodal understanding including video was not on my bingo card. I love it! https://x.com/kimmonismus/status/2016120251717714273

🧠👀 @Kimi_Moonshot just shipped Kimi-K2.5 with multimodality. Behind this big step lies a deeper question: what kind of multimodal model actually matters? Zhihu contributor & Moonshot AI researcher Lechatelia: ✨K2.5 is not “just another VLM.” I came from CV → VL → VLM, and… https://x.com/ZhihuFrontier/status/2016438778030850059

🚨BREAKING: Kimi K2.5 Thinking by @Kimi_Moonshot is the #1 open model for Vision Arena! Highlights: – #1 open model in Vision (+40pt over the next open model) – #6 overall (Qwen3-vl-235b-a22b-instruct is next open model at #18) This is the only open model in the Top 15. https://x.com/arena/status/2016984335380001268

Kimi K2.5 Technical Report: “early fusion with a lower vision ratio yields better results given a fixed total vision-text token budget” – “Visual RL Improves Text Performance” – “joint multimodal RL paradigm during Kimi K2.5’s post-training. Departing from conventional…” https://x.com/scaling01/status/2017255763400364049

Any guess why the Kimi team calls Kimi K2.5 ‘Native Multimodal’ and how it differs from Kimi VL? In response to this question on HF, the Kimi team’s response was “It is an upgraded version compared to Kimi-VL, especially featuring video understanding. Will release more details…” https://x.com/thefirehacker/status/2016223118738764081

K2.5 technical report suggests that early fusion of vision tokens is best, but they start from the K2 checkpoint and then train for 15T more tokens. Did I miss something, or does this mean they’re still kind of doing late fusion anyway? https://x.com/andrew_n_carr/status/2017304411345981518

K2.5 is a V3 generation model, explicitly built on V3 architecture. It’s not frontier within Moonshot’s own portfolio. They just pushed continued training further than anyone. V4 is all but guaranteed to do vastly better. Its competition will come from K3, GLM-5. Next gen. https://x.com/teortaxesTex/status/2016956019239272717

Kimi K2.5 widens the gap between the US and China in open weights model intelligence. The leading US open weights model remains OpenAI’s gpt-oss-120b, which has now been eclipsed by an ever-growing list of open weights releases from China. https://x.com/ArtificialAnlys/status/2016250140219343163

Next up: Kimi @Kimi_Moonshot just released Kimi K2.5 — and Zhihu is taking it seriously 👀 💬 Zhihu contributor toyama nao: Short verdict: Kimi is back on the world stage. Last year’s K2 kicked off China’s Agent capability race. Models like GLM 4.6/4.7 and MiniMax M2/M2.1 kept… https://x.com/ZhihuFrontier/status/2016363957876097089

Kimi just shipped Kimi-K2.5, introducing “Agent Swarm.” Behind it is a long process of trial, failure, and rethinking how agents should actually work 🤖✨ Zhihu contributor & @Kimi_Moonshot engineer Lidong shares his deep thinking: I worked on K2.5’s agent mode. Since launch,… https://x.com/ZhihuFrontier/status/2016811037274886377
