🚨 BREAKING: @Kimi_Moonshot’s Kimi-K2 is now the #1 open model in the Arena! With over 3K community votes, it ranks #5 overall, overtaking DeepSeek as the top open model. Huge congrats to the Moonshot team on this impressive milestone! The leaderboard now features 7 different https://x.com/lmarena_ai/status/1945866381880373490
5 Things You Need to Know About Moonshot AI and Kimi K2, the New #1 model on the Hub https://huggingface.co/blog/fdaudens/moonshot-ai-kimi-k2-explained
Every ML Engineer’s dream loss curve: “Kimi K2 was pre-trained on 15.5T tokens using MuonClip with zero training spike, demonstrating MuonClip as a robust solution for stable, large-scale LLM training.” https://x.com/hardmaru/status/1943976259236901315
For those unfamiliar with Kimi K2: – Surpasses models like GPT-4.1 and Claude 4 Opus on coding benchmarks – Scores new highs on math and STEM tests among non-reasoning systems – Doesn’t even have multimodal or reasoning capabilities yet kimi [dot] com https://x.com/rowancheung/status/1944647747027558636
I think I will spend the rest of the day letting Kimi generate these reports. They are so nice to look at compared to what OpenAI, Anthropic and others give you https://x.com/scaling01/status/1944850575470027243
It’s so beautiful to see the @Kimi_Moonshot team participating in every single community discussion and pull request on @huggingface (the little blue bubbles on the right). In my opinion, every serious AI organization should dedicate meaningful time and resources to this https://x.com/ClementDelangue/status/1946208120385999328
It’s undeniable: with Kimi-K2, China has reached the frontier and will surpass the US next year. https://x.com/scaling01/status/1944045857340359044
Kimi has a distinct writing style that is free of most of the patterns we now associate with AI generated text. Both Kimi and DeepSeek’s prose is apparently even more impressive in Chinese. Both of these models have a unique ‘voice’, quite different from Western AI. https://x.com/AndrewCurran_/status/1944434569899290839
Kimi is 200 people, very few of them with “frontier experience”, a platform (but you can buy such data) and a modest GPU budget. In theory there are many dozens of business entities that could make K2 in the West. It’s telling that none did. Not sure what it’s telling, though. https://x.com/teortaxesTex/status/1944856509734961596
Kimi is a really weird model, and it needs a lot more testing to figure out For example, I gave it an altered version of Great Gatsby and it found the two alterations (as does Claude) but then made up a ton of hallucinated nonsense that sounded plausible but was just plain wrong https://x.com/emollick/status/1944974487369158864
Kimi K2 is an incredible model. https://x.com/skirano/status/1944123290525831317
Kimi K2 is now available on https://x.com/togethercompute/status/1944952034840732138
Kimi K2 is number one trending on HF, congrats! https://x.com/huggingface/status/1944155602583691492
Kimi K2 is so good at tool calling and agentic loops, can call multiple tools in parallel and reliably, and knows “when to stop”, which is another important property. It’s the first model I feel comfortable using in production since Claude 3.5 Sonnet. https://x.com/skirano/status/1944475540951621890
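The agentic-loop behavior praised above (parallel tool dispatch, stopping when no more tools are requested) can be sketched as a plain dispatch loop. This is an illustrative skeleton with a scripted stub in place of the model; `fake_model`, `TOOLS`, and `agent_loop` are hypothetical names, not Moonshot’s API.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical local tools; a real deployment would call a chat endpoint
# instead of the scripted stub below.
TOOLS = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
}

# Scripted stand-in for the model: the first turn requests two tool calls
# in parallel, the second turn sees the results and stops (no tool_calls).
def fake_model(messages):
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_calls": [
            {"id": "1", "name": "add", "args": {"a": 2, "b": 3}},
            {"id": "2", "name": "mul", "args": {"a": 4, "b": 5}},
        ]}
    results = [m["content"] for m in messages if m["role"] == "tool"]
    return {"content": f"Results: {', '.join(results)}"}

def agent_loop(user_msg, model=fake_model, max_turns=8):
    messages = [{"role": "user", "content": user_msg}]
    for _ in range(max_turns):  # hard cap in case the model never stops
        reply = model(messages)
        calls = reply.get("tool_calls")
        if not calls:  # the model "knows when to stop"
            return reply["content"]
        with ThreadPoolExecutor() as pool:  # parallel tool dispatch
            outs = list(pool.map(
                lambda c: str(TOOLS[c["name"]](**c["args"])), calls))
        for call, out in zip(calls, outs):
            messages.append({"role": "tool", "tool_call_id": call["id"],
                             "content": out})
    raise RuntimeError("agent did not terminate")

print(agent_loop("compute 2+3 and 4*5"))  # → Results: 5, 20
```

The two properties the tweet highlights map to the two branches: the executor dispatches all requested calls concurrently, and the loop exits as soon as the model replies without tool calls.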
Kimi K2 just hit #1 on @huggingface trending models in <24 hours! This MoE powerhouse packs 1T params with 32B active – crushing coding challenges and autonomous agent tasks. https://x.com/fdaudens/status/1943996876778614948
Kimi K2 now on https://x.com/togethercompute/status/1945143838911128019
Kimi K2, the latest from @Kimi_Moonshot is now live in the Arena! https://x.com/lmarena_ai/status/1944827675597791456
Kimi K2: Open Agentic Intelligence https://moonshotai.github.io/Kimi-K2/
Kimi team is more American than most American labs lol https://x.com/Teknium1/status/1944430651278537098
Kimi team just trained a state of the art open source model 32B active parameter/1T total with 0 training instabilities, thanks to MuonClip, this is amazing https://x.com/eliebakouch/status/1943687750563004801
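The QK-clip idea behind MuonClip can be sketched as: monitor the maximum pre-softmax attention logit, and when it exceeds a threshold, rescale the query and key projections so the logits shrink back under the cap. A minimal single-head NumPy sketch, assuming a simplified setup; the function name, threshold, and shapes are illustrative, not K2’s actual configuration.

```python
import numpy as np

def qk_clip(W_q, W_k, X, tau=100.0):
    """Rescale query/key weights so the max attention logit is <= tau.

    Illustrative single-head sketch of the QK-clip idea: the max
    pre-softmax logit S_ij = q_i . k_j / sqrt(d) is measured, and if it
    exceeds tau, both W_q and W_k are scaled by sqrt(tau / max_logit),
    which shrinks every logit by exactly tau / max_logit.
    """
    d = W_q.shape[1]
    Q, K = X @ W_q, X @ W_k
    logits = Q @ K.T / np.sqrt(d)
    m = logits.max()
    if m > tau:
        gamma = np.sqrt(tau / m)
        W_q, W_k = W_q * gamma, W_k * gamma
    return W_q, W_k

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 16))
W_q = rng.normal(size=(16, 16)) * 10  # deliberately large weights
W_k = rng.normal(size=(16, 16)) * 10
W_q, W_k = qk_clip(W_q, W_k, X, tau=30.0)
Q, K = X @ W_q, X @ W_k
print((Q @ K.T / np.sqrt(16)).max())  # now <= 30, up to rounding
```

Because the logits are bilinear in W_q and W_k, scaling both by sqrt(tau/m) caps the max logit at exactly tau, which is what keeps the attention scores (and hence the loss) from spiking.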
Kimi-k2 seems to be a very good (and giant & odd) open weights model that may be the new leader in open LLMs. It is not beating the frontier closed models on my weird tests, but it doesn’t have a reasoner yet. More testing needed but Chinese open weights models are impressive. https://x.com/emollick/status/1943901440453259374
past week had huuuge releases, here are our picks 🔥 > moonshot released Kimi K2, sota LLM with 1T total / 32B active parameters 🤯 > @huggingface released SmolLM3-3B, best LM for its size, offers thinking mode 💭 as well as the dataset, smoltalk2 > Alibaba released WebSailor-3B, https://x.com/mervenoyann/status/1944757807191888080
Pretty wild that @Kimi_Moonshot dropped a 1T parameter (32B active) MoE trained on 15.5 trillion tokens – MIT licensed 🔥 Beats all other open weights models across coding, agentic and reasoning benchmarks. Of course it’s live on Hugging Face! 🤗 https://x.com/reach_vb/status/1943703030026641801
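The “1T total / 32B active” split comes from top-k expert routing: each token is scored against all experts but only sent through the few best-scoring ones, so only that fraction of expert parameters runs per token. A toy NumPy sketch of the gating step; the expert counts here are illustrative, not K2’s actual configuration.

```python
import numpy as np

def topk_route(x, gate_W, k=2):
    """Toy top-k MoE gate: score every expert, keep the k best per token."""
    scores = x @ gate_W                        # (tokens, n_experts)
    return np.argsort(scores, axis=-1)[:, -k:]  # indices of chosen experts

rng = np.random.default_rng(0)
n_experts, d, k = 8, 16, 2
x = rng.normal(size=(4, d))                    # 4 tokens
chosen = topk_route(x, rng.normal(size=(d, n_experts)), k)
print(chosen.shape)  # (4, 2): each token touches only 2 of the 8 experts
# Active fraction of expert params per token: k / n_experts = 25% here;
# at K2's scale the same mechanism leaves ~32B of 1T params active.
```

The gate itself is tiny; the savings come from never running the unchosen experts’ weight matrices for that token.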
RT @ArtificialAnlys: While Moonshot AI’s Kimi k2 is the leading open weights non-reasoning model in the Artificial Analysis Intelligence In… https://x.com/zacharynado/status/1944945039647629548
RT @DeepInfra: Moonshot AI’s Kimi 2 is now live on DeepInfra, as always at the best price of $0.55/$2.20, full tool call and context suppor… https://x.com/jeremyphoward/status/1944939322735780260
RT @htihle: Results from kimi-k2 on WeirdML! It does very well for a non-reasoning model. Like a scaled up deepseek-v3, beating out gpt-4.1… https://x.com/bigeagle_xd/status/1944325829657554962
RT @huggingface: Kimi K2 is number one trending on HF, congrats! https://x.com/_akhaliq/status/1944159007456784512
RT @ivanfioravanti: Kimi-Dev-72B-4bit-DWQ is on mlx-community! It took 9 hours to create 😅 Quick performance test on M3 Ultra: Prompt: 56… https://x.com/awnihannun/status/1944108947411284374
RT @Kimi_Moonshot: 🚀 Hello, Kimi K2! Open-Source Agentic Model! 🔹 1T total / 32B active MoE model 🔹 SOTA on SWE Bench Verified, Tau2 & Ace… https://x.com/stanfordnlp/status/1944114320226263165
RT @koltregaskes: Kimi-K2 tops EQ-Bench, the benchmark that measures emotional intelligence. https://x.com/jeremyphoward/status/1944326479246147899
RT @lmarena_ai: 🚨 BREAKING: @Kimi_Moonshot’s Kimi-K2 is now the #1 open model in the Arena! With over 3K community votes, it ranks #5 over… https://x.com/Kimi_Moonshot/status/1945897926796185841
RT @lmarena_ai: Kimi K2, the latest from @Kimi_Moonshot is now live in the Arena! https://x.com/Kimi_Moonshot/status/1945462820147249523
RT @masondrxy: New K2 model from @Kimi_Moonshot is officially supported by @LangChainAI on @GroqInc! See 👇 https://x.com/Hacubu/status/1945144499228811676
RT @OpenRouterAI: Kimi K2 is now passing 200 tokens per second on OpenRouter. Props to @GroqInc! https://x.com/JonathanRoss321/status/1945779694256722025
RT @reach_vb: LOVE ITT! You can run Kimi K2 (1T token MoE) on a single M4 Max 128GB VRAM (w/ offloading) or a single M3 Ultra (512GB) 🔥 Th… https://x.com/reach_vb/status/1944997786329460978
RT @sam_paech: Kimi-K2 just took top spot on both EQ-Bench3 and Creative Writing! Another win for open models. Incredible job @Kimi_Moonsh… https://x.com/Teknium1/status/1944285648825069759
RT @sdrzn: Seriously blown away by Moonshot’s new Kimi K2 model in @cline. It beats Claude Opus 4 on coding benchmarks and is up to 90% che… https://x.com/ClementDelangue/status/1946316382313869778
RT @weights_biases: NEW: Kimi K2 is now live on W&B Inference by @CoreWeave! It’s the first truly open challenger, ready for production wi… https://x.com/l2k/status/1945225318928634149
Seen many people mention how Kimi K2, for example, has no CoT or thinking, which isn’t true; it’s more an issue with terminology. The main difference with reasoning models (in terms of actual functionality) is that the thinking is hidden during general non-verifiable RL, so the model can https://x.com/Grad62304977/status/1944050338551484702
Some thoughts on the decisions behind Kimi K2’s architecture – from our infra staff https://x.com/Kimi_Moonshot/status/1944589115510734931
Thank you to @Kimi_Moonshot for quickly addressing my queries on the correct system prompt for Kimi K2! We’ll be re-uploading all BF16 + dynamic @unslothai GGUFs with fixed tool calling & the new sys prompt! Sys prompt = “You are Kimi, an AI assistant created by Moonshot AI.” https://x.com/danielhanchen/status/1946163064665260486
That’s from the Kimi K2 blog post. In case someone says «wow and it’s not RL-trained»: it very much is, don’t get misled by the absence of long CoT. Looks like DeepResearch, but it’s probably similar to what’s been happening since Sonnet 3.5, giving it uncanny «pre-reasoner» powers. https://x.com/teortaxesTex/status/1944416704253018372
The success of Kimi K2 is no accident. The unfortunate reality in AI is that user experiences haven’t yet fully caught up to raw model capabilities. Experiences have plateaued. There are only so many coding assistants, research tools, or agents you can realistically offer, and https://x.com/skirano/status/1945505132323766430
TheZvi’s answer to “why isn’t there an American Kimi” is basically: incentives. I *partially* buy it. But given the concern about the dominance of Chinese open models, expressed by numerous patriotic think tanks, I think we could expect *someone* rising to the task. https://x.com/teortaxesTex/status/1945624983985639487
This is what 200 tokens/second looks like with Kimi K2 on @GroqInc For reference, Claude Sonnet-4 is usually delivered at ~60 TPS https://x.com/cline/status/1945354314844922172
True, the first ever application of Muon was to break the 3-second barrier in the CIFAR-10 speedrun. For perspective on scale that was a 3e14 flop training; @Kimi_Moonshot’s K2 is 3e24 flops, 10 orders of magnitude larger. https://x.com/kellerjordan0/status/1945701578645938194
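The 3e24 figure is consistent with the standard rule-of-thumb compute estimate FLOPs ≈ 6·N·D, taking N as K2’s 32B active parameters and D as its 15.5T pre-training tokens:

```python
# Rule-of-thumb training compute: FLOPs ≈ 6 * N * D, where N is the number
# of (active) parameters touched per token and D the training token count.
n_active = 32e9    # K2's reported active parameters
tokens = 15.5e12   # K2's reported pre-training tokens
flops = 6 * n_active * tokens
print(f"{flops:.2e}")           # ≈ 2.98e+24, i.e. ~3e24
# Versus the 3e14-flop CIFAR-10 speedrun:
print(f"{flops / 3e14:.1e}")    # ≈ 1e10, ten orders of magnitude
```

Using the active rather than total parameter count is what makes the estimate line up: the 1T total would overstate compute by roughly 30x.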
We’ve just fixed 2 bugs in Kimi-K2-Instruct huggingface repo. Please update the following files to apply the fix: – tokenizer_config.json: update chat-template so that it works for multi-turn tool calls. – tokenization_kimi.py: update encode method to enable encoding special https://x.com/Kimi_Moonshot/status/1945050874067476962
We’ve submitted Kimi K2 to @lmarena_ai. Waiting to be added to the match pool: https://x.com/Kimi_Moonshot/status/1944754256059453823
You might not have heard of Moonshot AI, but within 24 hours, their Kimi K2 model shot to the top of the Hugging Face trending models. So… who are they, and why does this matter? 🧵 Here are a few standout facts: https://x.com/fdaudens/status/1945128932040208867
Kimi K2 at 185 t/s (or even higher, nearly 220 in my short tests) is probably the best use of Groq to date, and can make K2 immediately more compelling than Sonnet 4. Impressive that they’ve managed to fit this 1T monster on their chips. https://x.com/teortaxesTex/status/1944950183051321542
Quick start project for Claude Code on Kimi: https://x.com/jeremyphoward/status/1944326308210921652
Very interesting – you can use Kimi with the Anthropic API. This means, perhaps most importantly, that you can now use Kimi with Claude Code! 🤯 https://x.com/jeremyphoward/status/1944322841866125597
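In practice this usually comes down to pointing Claude Code’s standard environment variables at an Anthropic-compatible endpoint. A hedged config sketch; the exact Moonshot base URL shown here is an assumption, so confirm it against Moonshot’s platform docs before use.

```shell
# Hedged sketch: Claude Code reads these standard env vars, so pointing
# them at an Anthropic-compatible endpoint swaps the backing model.
# The Moonshot base URL below is an assumption -- check their docs.
export ANTHROPIC_BASE_URL="https://api.moonshot.ai/anthropic"
export ANTHROPIC_API_KEY="sk-..."   # your Moonshot platform key
claude                              # then use Claude Code as usual
```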
RT @allhands_ai: Kimi-K2 is definitely the first strong open-weight competitor to Claude Sonnet. 65.4% on SWE-Bench Verified in OpenHands,… https://x.com/TheZachMueller/status/1945545349352829439
The DeepSeek moment was supercharged by pent-up consumer demand for a good free AI for those who wouldn’t pay (especially students using it for homework). A reason Kimi K2 has not had the immediate public impact of DeepSeek may be that, for most consumers/students, DeepSeek is good enough. https://x.com/emollick/status/1944764085741957153
RT @yawnxyz: Kimi K2 is **INCREDIBLE** at using tools. I built a chrome extension to chat with Google Maps, but I never posted it. All th… https://x.com/bigeagle_xd/status/1945087963408351728
I’ve been a bit quiet on X recently. The past year has been a transformational experience. Grok-4 and Kimi K2 are awesome, but the world of robotics is a wondrous wild west. It feels like NLP in 2018 when GPT-1 was published, along with BERT and a thousand other flowers that https://x.com/DrJimFan/status/1944443447953498285
I doubt that Sama’s delay of open model is about Kimi. But I don’t find the logic here compelling either. «Only nerds noticed Kimi». Well, Sama is loathed. The point of his model is, above all things, PR. If it’s not open SOTA, reports will notice *that*. I think he wants SOTA. https://x.com/teortaxesTex/status/1944263611398180954
Rumors that OpenAI delayed their open-source model because of Kimi are fun, but from what I hear: – the model is much smaller than Kimi K2 (<< 1T parameters) – super powerful – but due to some (frankly absurd) reason I can’t say, they realized a big issue just before release, so https://x.com/Yuchenj_UW/status/1944235634811379844
Super excited to see Kimi K2 land on Perplexity. If you’re fine-tuning, quick reminder: using the Muon optimizer during both fine-tuning and RL phases gives the best results (details are in our Moonlight paper). https://x.com/Kimi_Moonshot/status/1944224975428497549
Grok 4 suggests that scaling still works (with the diminishing returns predicted by the scaling law), and that tool use can unlock performance gains. Kimi suggests there continue to be big opportunities from improvements in methods (Muon, etc.). Lots of paths for AI right now. https://x.com/emollick/status/1944306918631018856
⚠️ AI isn’t just a tool—it’s also supercharging medical misinformation. From fake studies to hallucinated citations, the risks are real when governments and charlatans lean on unchecked AI. https://x.com/fdaudens/status/1945537203754369211
Big news in healthcare AI! I’m thrilled to announce the launch of OpenMed on HuggingFace, releasing 380+ state-of-the-art medical NER models for free under Apache 2.0. https://x.com/ClementDelangue/status/1945622980475691364
Maybe we should buy Cline 😅😅😅 https://x.com/ClementDelangue/status/1946288857814610057
Open ASR Leaderboard – a Hugging Face Space by hf-audio https://huggingface.co/spaces/hf-audio/open_asr_leaderboard
RT @calebfahlgren: NEW 🔥!! You can now view JSON for List cells on @huggingface datasets. Now there’s no excuse for looking at your d… https://x.com/_lewtun/status/1944384554795462891
GLM-4.1V-9B-Thinking is the BEST thinking vision LM out there 😍 it’s now served in @huggingface Inference Providers through @novita_labs 🤝 https://x.com/mervenoyann/status/1945432520339734647
📢 A new 32B model, EXAONE 4.0, just dropped on @huggingface from LG AI Research. 🤏 Outcompetes Qwen 235B on coding and exceeds DeepSeek R1 V3 671B on instruction tasks. – toggleable reasoning, 131K context, and a non-commercial license. – It solves more edge cases than Qwen
keypoint detection & matching, now completely open-source! 🔥 @stevenbucaille recently shipped DISK with LightGlue in @huggingface transformers, allowing for commercial use 😍 https://x.com/mervenoyann/status/1945398984144548320
Thrilled to finally share what we’ve been working on for months at @huggingface 🤝 @pollenrobotics Our first robot: Reachy Mini A dream come true: cute and low priced, hackable yet easy to use, powered by open-source and the infinite community. Tiny price, small size, huge https://x.com/Thom_Wolf/status/1942887160983466096
13 new types of LoRA ▪️ T-LoRA ▪️ SingLoRA ▪️ LiON-LoRA ▪️ LoRA-Mixer ▪️ QR-LoRA ▪️ FreeLoRA ▪️ LoRA-Augmented Generation (LAG) ▪️ ARD-LoRA (Adaptive Rank Dynamic LoRA) ▪️ WaRA ▪️ BayesLoRA ▪️ Dual LoRA Learning (DLoRAL) ▪️ Safe Pruning LoRA (SPLoRA) ▪️ PLoP (Precise LoRA https://x.com/TheTuringPost/status/1944374993309069818