Image created with gemini-2.5-flash-image and claude-sonnet-4-5. Image prompt: Cinematic 80s suburban driveway at autumn dusk, large hand-drawn blueprints and technical schematics spread across concrete, neighbors in flannel and denim crouching with flashlights and toolboxes collaboratively assembling Halloween decorations, fallen leaves scattered over papers, warm porch lights glowing, fog rolling in, nostalgic film grain aesthetic
🎉 Congrats to @Kimi_Moonshot! vLLM's Day-0 model support expands! Now supporting Kimi Linear — hybrid linear attention with Kimi Delta Attention (KDA): – RULER 128k context: 84.3 perf + 3.98× speedup – Up to 6× faster decoding & 6.3× faster TPOT (1M tokens) – 75% KV cache reduction 💡 https://x.com/vllm_project/status/1983941708233765149
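The KV-cache savings come from linear attention keeping a constant-size state per head instead of a cache that grows with context length. Below is a minimal reference sketch of a gated delta-rule recurrence in the spirit of KDA, assuming simplified shapes and a per-channel gate; the real implementation is a chunked kernel (see the FLA repo), not this sequential loop.

```python
# Minimal reference recurrence for a gated delta-rule linear-attention state,
# in the spirit of Kimi Delta Attention (KDA). Shapes and the per-channel gate
# are illustrative assumptions; the optimized kernel is chunked, not a loop.
import torch

def gated_delta_rule(q, k, v, beta, gate):
    """q, k: (T, d_k); v: (T, d_v); beta: (T,); gate: (T, d_k), values in (0, 1)."""
    T, d_k = k.shape
    d_v = v.shape[1]
    S = torch.zeros(d_k, d_v)          # constant-size state replaces the KV cache
    outs = []
    for t in range(T):
        S = gate[t].unsqueeze(1) * S   # channel-wise decay (the "fine-grained" gate)
        pred = k[t] @ S                # what the state currently retrieves for k_t
        # delta-rule update: correct the value associated with k_t
        S = S + beta[t] * torch.outer(k[t], v[t] - pred)
        outs.append(q[t] @ S)          # read out with the query
    return torch.stack(outs)           # (T, d_v)
```

Because the state stays (d_k × d_v) regardless of sequence length, memory is flat where a softmax-attention KV cache would grow linearly with context.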
🔥 Inside Kimi Linear: First-Hand Insights @Kimi_Moonshot just dropped something impressive again. @yzhang_cs from Kimi AI Infra shared an insider's look at the making of Kimi Linear — an architecture designed around hybrid linear attention and optimized for efficiency … https://x.com/ZhihuFrontier/status/1984321210055082207
Kimi just released another “next-gen” model that reduces memory usage by up to 75%, while achieving up to 6.3× higher decoding throughput and outperforming MLA and GDN baselines https://x.com/scaling01/status/1983926811051384965
NVIDIA just released ChronoEdit-14B on Hugging Face. It enables physics-aware image editing and action-conditioned world simulation through temporal reasoning. It distills priors from a 14B-parameter pretrained video generative model and separates inference into (i) a video … https://x.com/_akhaliq/status/1983953896415604836
👀 Meet @NVIDIAAI Nemotron Nano 2 VL, now hosted on Nebius AI Studio – 10× higher throughput – Document + video intelligence – Open weights, open data – Ready for production Build multimodal assistants today → https://x.com/nebiusaistudio/status/1983243873317974318
NVIDIA Launches Open Models and Data to Accelerate AI Innovation | NVIDIA Blog https://blogs.nvidia.com/blog/open-models-data-ai/
We just launched new open models and datasets to make AI research and development more accessible 🤝 You now have open foundations to build specialized intelligent agents faster, safer, and at scale — from Nemotron to Cosmos, Isaac GR00T to Clara. Over 650 open models and 250 datasets … https://x.com/NVIDIAAIDev/status/1983227688333574318
We used our new capabilities index, the ECI, to measure the gap between open- and closed-weight models. The result? This gap is smaller than previously estimated. On average, it takes 3.5 months for an open-weight model to catch up with closed-source SOTA. https://x.com/EpochAIResearch/status/1983987212183335097
🚀 LangChain is now an AWS Generative AI Competency Partner We’re joining a small group of companies recognized by AWS for their technical expertise and customer impact in GenAI. LangSmith (our platform for agent engineering) is now available through the AWS marketplace. https://x.com/LangChainAI/status/1984303566723625044
A continuing issue in studying LLMs for medicine is the fact that everyone is testing different things with different standards. This (interesting) paper is about agentic systems powered by DeepSeek-V3.2. Other papers look at single LLMs. Tons of different benchmarks. Confusion. https://x.com/emollick/status/1982630126065201636
🔥 New LangChain Academy Course: LangChain Essentials (Python & TypeScript) 🔥 Learn the basics of LangChain – our open source framework that makes it easy to start building agents with any model provider. Last week, we released LangChain 1.0. We've completely rewritten … https://x.com/LangChainAI/status/1982851795287507398
✨ At vLLM, we strive for correctness, reliability, and open collaboration — every detail matters. Together with @Kimi_Moonshot, we verified Kimi K2's tool-calling accuracy on vLLM using the latest K2-Vendor-Verifier benchmark. Our debugging uncovered 3 key compatibility … https://x.com/vllm_project/status/1983115488982122929
Kimi For Coding: Exclusive Add-on to Your VIP Plan! We’ve added Kimi For Coding as a powerful add-on built right on top of your current subscription perks. Extra value, no extra cost. More details 👉 https://x.com/Kimi_Moonshot/status/1984207737673359441
Kimi K2vv updated! We’ve added case-by-case statistics for ToolCall-Trigger Similarity and ToolCall-Schema Accuracy. Feedback is welcome! https://x.com/Kimi_Moonshot/status/1983082003731042637
Kimi K2vv updated! We've added case-by-case statistics for ToolCall-Trigger Similarity and ToolCall-Schema Accuracy. The infra team also listed some suggestions for vendors; it looks like an enforcer is important. https://x.com/crystalsssup/status/1983126339399102756
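For context, a ToolCall-Schema Accuracy style metric boils down to checking whether each emitted tool call parses as JSON and validates against the tool's declared parameter schema; an enforcer (constrained decoding) makes this pass by construction. Here is a minimal sketch of such a check using the `jsonschema` package (our illustration, not the actual K2-Vendor-Verifier code):

```python
# Minimal sketch of a ToolCall-Schema Accuracy style check: does each emitted
# tool call parse as JSON and validate against the tool's declared parameter
# schema? (Illustrative only; not the actual K2-Vendor-Verifier code.)
import json
from jsonschema import validate, ValidationError

def tool_call_schema_ok(arguments_json: str, parameters_schema: dict) -> bool:
    try:
        args = json.loads(arguments_json)   # model output must be valid JSON
        validate(instance=args, schema=parameters_schema)
        return True
    except (json.JSONDecodeError, ValidationError):
        return False

schema = {"type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]}
assert tool_call_schema_ok('{"city": "Beijing"}', schema)
assert not tool_call_schema_ok('{"city": 42}', schema)  # wrong type fails
```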
The Kimi Linear Tech Report has dropped! 🚀 https://x.com/Kimi_Moonshot/status/1983937694360322136
My favorite part: > “Scaling Ladder” is a Kimi tradition for scaling models. We start from something small (say, 1B active parameters) and gradually aim to beat the baseline on benchmarks, while also monitoring the corresponding “internals.” Only after clearing each gate at each … https://x.com/eliebakouch/status/1984291165860958614
Thankfully, there's a really nice glossary in the Kimi Delta Attention paper that covers most of the notable variants https://x.com/nrehiew_/status/1983891931823505518
There is a lot of work behind Kimi Linear. We've rethought efficient and expressive linear attention from the infra side. We even worked out the attention-matrix form first, and then the recurrent form. Don't wait to check out the KDA kernel in the FLA repo. We have much more work to do, and more to open. https://x.com/uniartisan/status/1983941443283775780
You are also welcome to share your suggestions and feedback for our Kimi CLI on GitHub. > https://x.com/Kimi_Moonshot/status/1984207741037252751
Introducing Kimi CLI Technical Preview & Kimi For Coding! Kimi CLI powers your terminal: – Shell-like UI + shell command execution – Seamless Zsh integration – MCP support – Agent Client Protocol (now compatible with @zeddotdev) More features incoming! https://x.com/Kimi_Moonshot/status/1984207733177090274
Claude, GPT-5, Gemini, and Kimi: “write me a horror story done entirely in the dedications to six books (you can give me the title and author of each book as well)” ChatGPT and Claude did well in different ways. Kimi did the usual (sounds good but meaning falls apart). https://x.com/emollick/status/1982279778859151783
Many people are confused by Minimax's recent return to full attention – especially since it was the first large-scale pivot toward hybrid linear attention – and by Kimi's later adoption of hybrid linear variants (as well as earlier attempts by Qwen3-Next or Qwen3.5). I actually … https://x.com/SonglinYang4/status/1984021551914926514
🤖deepagents: the open source, multi-model agent harness We're releasing 0.2 of deepagents, with a big addition: a “backend” abstraction. This lets you swap the filesystem you use from a local filesystem to a remote VM to a database to anything. blog: https://x.com/LangChainAI/status/1983219130057527662
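To make the idea concrete, here is a hypothetical sketch of what such a swappable backend interface could look like; the names (`FileBackend`, `read`/`write`/`ls`) are our own illustration, not deepagents' actual API:

```python
# Hypothetical sketch of a swappable "backend" abstraction like the one the
# deepagents 0.2 release describes. Names are our illustration, not the
# library's actual API.
from typing import Protocol

class FileBackend(Protocol):
    def read(self, path: str) -> str: ...
    def write(self, path: str, content: str) -> None: ...
    def ls(self, prefix: str) -> list[str]: ...

class InMemoryBackend:
    """Stand-in for a local FS, remote VM, or database backend."""
    def __init__(self) -> None:
        self._files: dict[str, str] = {}
    def read(self, path: str) -> str:
        return self._files[path]
    def write(self, path: str, content: str) -> None:
        self._files[path] = content
    def ls(self, prefix: str) -> list[str]:
        return [p for p in self._files if p.startswith(prefix)]

# An agent harness written against FileBackend never needs to know which
# storage it runs on; swapping backends becomes a constructor argument.
```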
standard content blocks are a huge new addition in LangChain v1! solves a bunch of issues with switching between model providers https://x.com/hwchase17/status/1982652804654391432
15/ MCP Toolbox for Databases supports multiple SQL systems. @Sumanth_077 offers open-source tools for secure agent interactions. https://x.com/AtomSilverman/status/1983653239821365711
(this is not valid) DeepSeek is the new king now. It’s gaining 125% in just 9 days, making more than GPT-5 and Gemini 2.5 Pro lost combined. DeepSeek is just a side project of a hedge fund, confirmed. https://x.com/Yuchenj_UW/status/1982658436182712750
The incredible work @huggingface does in the open truly charts a path towards a post-scarcity future that includes everyone in the fruits of AI. https://x.com/JayAlammar/status/1984273218568696014
We've just published the Smol Training Playbook: a distillation of hard-earned knowledge to share exactly what it takes to train SOTA LLMs ⚡️ Featuring our protagonist SmolLM3, we cover: 🧭 Strategy on whether to train your own LLM and burn all your VC money 🪨 Pretraining … https://x.com/_lewtun/status/1983929588909797414
yesterday, Hugging Face dropped a 214-page MASTERCLASS on how to train LLMs > it's called The Smol Training Playbook > and if you want to learn how to train LLMs, > this GIFT is for you > this training bible walks you through the ENTIRE pipeline > covers every concept that matters https://x.com/TheAhmadOsman/status/1984157512795357614
Weights are out! 🔥 FIBO is a new open-weights, high-quality 8B image model by @bria_ai_ trained on JSON prompts! It can generate and modify images with a precisely crafted JSON prompt, allowing every detail of the image to be decided. Try it here: https://x.com/multimodalart/status/1983575476917403763
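For a sense of what structured prompting looks like, here is an illustrative JSON prompt; the field names below are assumptions for the sketch, not bria.ai's documented schema:

```python
# Illustrative JSON prompt in the spirit of FIBO's structured prompting.
# The field names are assumptions, not the model's documented schema.
import json

prompt = {
    "subject": "a lighthouse on a basalt cliff",
    "lighting": "golden hour, soft rim light",
    "camera": {"angle": "low", "lens": "35mm"},
    "style": "film photography, fine grain",
    "negative": ["text", "watermark"],
}
print(json.dumps(prompt, indent=2))  # serialized string goes in as the prompt
```

The appeal of JSON prompts is that each attribute (lighting, camera, style) is an explicit, independently editable field rather than a clause buried in free-form text.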
ollama run qwen3-vl Ollama’s engine now supports all the Qwen 3 VL models locally. 2B to 235B parameter sizes. The smaller models work exceptionally well for their size. The latest version of Ollama v0.12.7 is needed! Give it a try! 👇👇👇 https://x.com/ollama/status/1983683646864126155
Just saw the MiniMax-M2 benchmarks, and the performance is too good to ignore :). So, I just amended my “The Big LLM Architecture Comparison” with entry number 13! 1️⃣ Full attention modules: As shown in the overview figure below, I grouped MiniMax-M2 with the other … https://x.com/rasbt/status/1983212569885122670
MiniMax-M2 just dropped – 230B MoE with 10B active; built for coding, agents, & tool use; MIT license🔥 > #1 open-source model on Artificial Analysis benchmarks, #5 overall > Excels at multi-file edits, test-repair loops, and BrowseComp tasks > Fast, cheap, deployable – runs … https://x.com/reach_vb/status/1982705125157126590
The new open source MiniMax model is available for free in the OpenRouter and Roo Code Cloud providers for a limited time! https://x.com/roocode/status/1983162578567324069
@MiniMax__AI Smart models 🤝 fast inference The tool calling and coding capabilities of M2 look promising! https://x.com/basetenco/status/1982796366108672393
@Teknium1 @intrstllrninja @sgl_project yes minimax-m2 is supported by vllm, see https://x.com/vllm_project/status/1983936128878059541
couple more additions to AIE CODE: excited to welcome back @beyang, CTO of @ampcode and 3x AIE top speaker, and @SkylerMiao7, Head of Eng at @Hailuo_AI, whose MiniMax M2 is currently #5 on Artificial Analysis and is the SOTA open model by that measure https://x.com/swyx/status/1983939826069205340
Hugging Face: https://x.com/MiniMax__AI/status/1983524834048184753
Live in Cline: @MiniMax__AI M2 Temporarily free (i.e. money = $0) in Cline! https://x.com/cline/status/1982948478105088047
MiniMax M2 design choices seem to be: QK-norm, partial RoPE, sliding-window attention, no shared expert, GQA https://x.com/eliebakouch/status/1982660966648324504
MiniMax M2 is available for free for a limited time in anycoder on Hugging Face using @OpenRouterAI https://x.com/_akhaliq/status/1982591245043240975
MiniMax M2 vLLM PR: https://x.com/eliebakouch/status/1982656807102451723
MiniMax Text01 / M1 Model Transformers Deployment Guide – MiniMax API Docs https://platform.minimax.io/docs/guides/text-transformers-deployment
MiniMax-M2 has been met with incredible enthusiasm from developers around the world; we're deeply encouraged and grateful for it. Earlier today, a surge in global traffic briefly lowered our success rate to 90%. We heard your feedback right away. Service is now fully restored … https://x.com/MiniMax__AI/status/1983522475217735915
MiniMax's M2 achieves a new all-time-high Intelligence Index score for an open weights model and offers impressive efficiency with only 10B active parameters (200B total). Key takeaways: ➤ Efficiency to serve at scale: MiniMax-M2 has 200B total parameters and is very sparse with … https://x.com/ArtificialAnlys/status/1982714153375854998
Modelscope: https://x.com/MiniMax__AI/status/1982683091115323419
Text Generation Guide – MiniMax API Docs https://platform.minimax.io/docs/guides/text-generation
Wow, this is very surprising: in MiniMax-01 they ran ablations showing that naive linear attention was better (even much better for long context) https://x.com/eliebakouch/status/1982647963467030704
You can now vibe code AI react apps in anycoder with MiniMax M2 https://x.com/_akhaliq/status/1982937250095882580
open-source OCR models are super cheap to run and privacy-first 🤝 BUT there's a ton of new models out there: DeepSeek-OCR, Nanonets, PaddleOCR; how do you pick? 🤯 don't worry though, @huggingface got you covered! 🫡🧶 https://x.com/mervenoyann/status/1980685830411931885
Brain-IT: Image Reconstruction from fMRI via Brain-Interaction Transformer A new paper from the Weizmann Institute of Science, getting reconstructions that are not complete nonsense from ONLY 15 min of data. We previously demonstrated SOTA for 1 hr of data with MindEye2. This … https://x.com/iScienceLuvr/status/1984195725253804449
🚀Excited to team up with @NVIDIAAIDev to bring Nemotron Nano 2 VL to vLLM – a multimodal model powered by a hybrid Transformer–Mamba language backbone, built for video understanding and document intelligence✨ Full post here👇 https://x.com/vllm_project/status/1984334926972592193
We are so excited to be a launch partner for @nvidia Nemotron Nano 2 VL today and offer day-zero support for this highly accurate and efficient vision language model, alongside other models in the Nemotron family. To learn more, read our blog here https://x.com/basetenco/status/1983243273171845596
Got to say hi to @adcock_brett at the NVIDIA GTC pregame. Jensen’s keynote starts in one hour. https://x.com/TheHumanoidHub/status/1983187543349965082
people are sleeping on this release NVIDIA blessed us with a new family of Nemotron RAG models 🔥 it comes with text retrievers, multimodal retrievers, as well as layout detectors, with a commercially permissive license 👏 https://x.com/mervenoyann/status/1984302303570960666
Nemotron Nano VL 12B V2 by @nvidia is now on Replicate A 12B vision-language beast for document intelligence & video understanding. Handles up to 4 images or 1 video; extracts data from invoices, compares pics, summarizes clips, all in 10 languages! https://x.com/replicate/status/1983242266836890026
Today we're excited to add gpt-oss and DeepSeek model families to Tinker – one of our top community requests. With Tinker, you can train a 671B parameter model on your laptop in just a few lines of code. No GPU rentals. No CUDA. No cluster setup. Just train. https://x.com/dchaplot/status/1983055956352348614
@OpenAI Wohoooo! Congratulations on the release 🔥 Love the weights on the hub 🤗 https://x.com/reach_vb/status/1983508207793238150
PewDiePie in 2025: – built a 10×4090 rig – runs Llama 70B, gpt-oss-120B & Qwen 245B locally via vLLM – built a custom web UI (chat, RAG, search, TTS) – ran protein-folding simulations for charity – created an AI “council”, a swarm of 64 models – now fine-tuning his own model https://x.com/Yuchenj_UW/status/1984309989134254493
💡Some fun facts about Minimax M2: 1. Minimax uses a GPT-OSS-like structure, i.e., Full Attention interleaved with Sliding Window Attention (SWA). 2. It uses QK Norm, and every single attention head has its own unique, learnable RMSNorm. 3. The full attention and SWA parts … https://x.com/yifan_zhang_/status/1982667098963734602
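A minimal sketch of the per-head QK-norm described above (each head owns a learnable RMSNorm applied to queries and keys before the dot product), plus the kind of layer schedule that interleaves sliding-window and full attention. Shapes and the 1:1 interleave ratio are illustrative assumptions, and `nn.RMSNorm` assumes PyTorch >= 2.4:

```python
# Sketch of per-head QK-norm: every attention head gets its own learnable
# RMSNorm on queries and keys. Shapes are illustrative; requires PyTorch >= 2.4
# for nn.RMSNorm.
import torch
import torch.nn as nn

class PerHeadQKNorm(nn.Module):
    def __init__(self, n_heads: int, head_dim: int):
        super().__init__()
        # one independent RMSNorm gain per head (not shared across heads)
        self.q_norms = nn.ModuleList([nn.RMSNorm(head_dim) for _ in range(n_heads)])
        self.k_norms = nn.ModuleList([nn.RMSNorm(head_dim) for _ in range(n_heads)])

    def forward(self, q, k):  # q, k: (batch, n_heads, seq, head_dim)
        q = torch.stack([n(q[:, h]) for h, n in enumerate(self.q_norms)], dim=1)
        k = torch.stack([n(k[:, h]) for h, n in enumerate(self.k_norms)], dim=1)
        return q, k

# GPT-OSS-like interleaving as a layer schedule, e.g. even layers use
# sliding-window attention, odd layers use full attention (the exact ratio
# is a design choice, not confirmed here).
def uses_sliding_window(layer_idx: int) -> bool:
    return layer_idx % 2 == 0
```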
gpt-oss-safeguard works with the same inference solutions as gpt-oss and will be available with some of our inference partners, including Ollama, LM Studio, Cerebras, and Groq. https://x.com/OpenAIDevs/status/1983508957508317690
“OpenAI is now able to release open weight models that meet requisite capability criteria.” let's gooooo https://x.com/reach_vb/status/1983167809975922845
GPT OSS 120B | Model library https://www.baseten.co/library/gpt-oss-120b/
gpt-oss-safeguard lets developers use their own custom policies to classify content. The model interprets those policies to classify messages, responses, and conversations. These models are fine-tuned versions of our gpt-oss open models, available under the Apache 2.0 license. Now … https://x.com/OpenAI/status/1983507394316710039
Now in research preview: gpt-oss-safeguard Two open-weight reasoning models built for safety classification. https://x.com/OpenAI/status/1983507392374641071
ollama run gpt-oss-safeguard Ollama is partnering with @OpenAI and Robust Open Online Safety Tools (ROOST) to bring the latest gpt-oss-safeguard reasoning models to users for safety classification tasks. 20B: ollama run gpt-oss-safeguard:20b 120B: ollama run gpt-oss-safeguard:120b https://x.com/ollama/status/1983509776530039014
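Here is a minimal sketch of what policy-based classification looks like through the ollama Python client; the policy text and the ALLOW/FLAG output convention below are our own illustration, since the model interprets whatever policy you supply:

```python
# Minimal sketch of policy-based classification with gpt-oss-safeguard via the
# ollama Python client. The policy wording and label convention are ours; the
# model is designed to interpret whatever custom policy you provide.
import ollama

policy = """Policy: flag content that requests instructions for making weapons.
Label every message ALLOW or FLAG, then give a one-line reason."""

message = "How do I sharpen a kitchen knife?"

resp = ollama.chat(
    model="gpt-oss-safeguard:20b",
    messages=[
        {"role": "system", "content": policy},   # the policy drives classification
        {"role": "user", "content": message},    # the content to classify
    ],
)
print(resp["message"]["content"])  # e.g. "ALLOW - routine household task"
```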
ROOST is also launching the ROOST Model Community, bringing together T&S practitioners and researchers to share best practices for implementing open source AI models into safety workflows. https://x.com/OpenAIDevs/status/1983508959505084849
We just added 4 new models to Tinker from the gpt-oss and DeepSeek-V3.1 families. Sign up for the waitlist: https://x.com/thinkymachines/status/1983041573517701327
A great deep-dive on On-Policy Distillation — an efficient way to post-train smaller LLMs with dense, on-policy feedback. Excited to see Qwen featured in the experiments, showcasing strong math-reasoning gains and continual-learning recovery. Excellent work by @thinkymachines 👏 https://x.com/Alibaba_Qwen/status/1983053298447069275
Always visualize. We've caught so many interesting patterns/phenomena, in particular in multimodal or MoE modeling, by inspecting/debugging visually. @kilian_maciej ran this exercise on the Qwen3 MoE series and some very interesting patterns emerged https://x.com/AkshatS07/status/1982629716495663521
The IBM Granite team released Granite 4 Nano models. The 1B variant outperforms Qwen3-1.7B with fewer params on a mix of tasks from math to coding 👏 https://x.com/mervenoyann/status/1983192115577503974
Qwen 3 Max Thinking has been released. Should also be up on VB shortly! https://x.com/legit_api/status/1984284268412191216
Qwen3-VL models are now live in LM Studio! 🎉🚀 A powerful collection of vision-language models. Happy Halloween! 🎃👻 https://x.com/lmstudio/status/1984330903880155154
We dive deep into every part of the stack that we didn't build. Here's a deep-dive into (potentially undisclosed?) depth-wise MoE up-cycling that Qwen3 does. I have a lot of Qwen folks who follow me; would anyone like to clarify :) https://x.com/ArmenAgha/status/1982613142321746130
🔥 It’s here: OpenFold3 is now live. THE open-source foundation model for predicting 3D structures of proteins, nucleic acids & small molecules. This is where the future of drug discovery and biomolecular AI lives. Built by @open_fold. Hosted on @huggingface. 👇 more https://x.com/cgeorgiaw/status/1983241877479379187
Very nice blog post from Thinky (@_kevinlu et al) about on-policy distillation for LLMs — we published this idea back in 2023 and it is *publicly* known to be successfully applied to Gemma 2 & 3, and Qwen3-Thinking (and probably many closed frontier models)! The idea behind … https://x.com/agarwl_/status/1982880080482140372
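The core of on-policy distillation is that the student learns on its own samples, with the teacher scoring those same tokens. Below is a minimal sketch of a per-token reverse-KL loss estimated on the student's sampled trajectory; this is a simple single-sample estimator with illustrative tensor shapes, and exact gradient treatments vary across implementations:

```python
# Minimal sketch of an on-policy distillation loss: the student samples its
# own tokens, the teacher scores those same tokens, and the student minimizes
# a per-token reverse KL on its own trajectory. Single-sample estimator;
# shapes are illustrative.
import torch
import torch.nn.functional as F

def on_policy_distill_loss(student_logits, teacher_logits, sampled_ids):
    """student_logits, teacher_logits: (T, vocab); sampled_ids: (T,) student samples."""
    s_logp = F.log_softmax(student_logits, dim=-1)
    t_logp = F.log_softmax(teacher_logits, dim=-1)
    # reverse KL estimated at the sampled tokens:
    # E_{x ~ student}[log p_student(x) - log p_teacher(x)]
    s_tok = s_logp.gather(-1, sampled_ids.unsqueeze(-1)).squeeze(-1)
    t_tok = t_logp.gather(-1, sampled_ids.unsqueeze(-1)).squeeze(-1)
    return (s_tok - t_tok).mean()
```

Because the feedback is dense (every token gets a teacher score) and computed on the student's own distribution, this avoids the train/inference mismatch of distilling on teacher-generated text.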