Ethan B. Holland

Over 54,900 manually organized AI links and counting

Open Source: AI News Week Ending 05/15/2026

May 15, 2026

Kimi K2.6 is now open-weight #1 on Finance Agent Benchmark V2.
https://x.com/Kimi_Moonshot/status/2054803169994272819

Meet Kimi Web Bridge – Kimi’s browser extension. Agent can now interact with websites like a human: search, scroll, click, type and complete tasks. Supports Kimi Code CLI, Claude Code, Cursor, Codex, Hermes, and more. Available now on
https://t.co/sUqDpi0HQr and the Chrome Web
https://x.com/Kimi_Moonshot/status/2054918374837322140

Give our early preview of Computer Use (with ANY model) a try today! Built into the latest Hermes Agent and powered by @trycua – opens the door to any model, not just the frontier models in special modes – to control your actual computer. Best part, it doesn’t take over your PC
https://x.com/Teknium/status/2053961675985113404

OpenSquilla launches open-source AI agent to cut token costs
https://www.testingcatalog.com/opensquilla-launches-open-source-ai-agent-to-cut-token-costs/

DeepSeek V4 Flash is ~90% cheaper than GPT 5.4 Mini and ~70% cheaper than Gemini 3.1 Flash Lite. For devs pushing ~500M tok/month, this is the difference between: GPT 5.4 Mini: ~$394/mo; Gemini 3.1 Flash Lite: ~$131/mo; DeepSeek V4 Flash: ~$71/mo … roughly ~$3,900/dev/yr back
https://x.com/masondrxy/status/2053855842076942555
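
The arithmetic behind the yearly figure can be checked with a short sketch. It uses only the three monthly totals quoted above; the function names are illustrative and not from the original post. Note that the monthly totals by themselves work out to roughly 82% and 46% cheaper, so the ~90% and ~70% headline figures presumably rest on a different basis (for example per-token list prices), which is an assumption here and not something the post states.

```python
# Monthly cost figures exactly as quoted in the post (USD/month at ~500M tokens).
# Everything else in this sketch (names, helpers) is illustrative only.
monthly_cost = {
    "GPT 5.4 Mini": 394,
    "Gemini 3.1 Flash Lite": 131,
    "DeepSeek V4 Flash": 71,
}


def annual_savings(baseline: str, candidate: str) -> int:
    """Yearly USD saved per developer by switching from baseline to candidate."""
    return (monthly_cost[baseline] - monthly_cost[candidate]) * 12


def percent_cheaper(baseline: str, candidate: str) -> float:
    """How much cheaper the candidate is, as a percentage of the baseline cost."""
    return 100 * (1 - monthly_cost[candidate] / monthly_cost[baseline])


if __name__ == "__main__":
    for baseline in ("GPT 5.4 Mini", "Gemini 3.1 Flash Lite"):
        saved = annual_savings(baseline, "DeepSeek V4 Flash")
        pct = percent_cheaper(baseline, "DeepSeek V4 Flash")
        print(f"vs {baseline}: ${saved}/yr saved, {pct:.0f}% cheaper")
```

Against GPT 5.4 Mini the yearly saving comes to $3,876, which matches the "roughly ~$3,900/dev/yr" in the quoted post.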

We Tested DeepSeek V4 Pro and Flash Against Claude Opus 4.7 and Kimi K2.6
https://blog.kilo.ai/p/we-tested-deepseek-v4-pro-and-flash

Are scaling laws finally working for time series foundation models? Today, @datadoghq is releasing Toto 2.0 weights in Apache 2.0 on @huggingface. It’s a family of open-weights TSFMs from 4M to 2.5B parameters, where every size beats the last from a single hyperparameter config.
https://x.com/ClementDelangue/status/2054991352295731619

Unlocking asynchronicity in continuous batching
https://huggingface.co/blog/continuous_async

Building Blocks for Foundation Model Training and Inference on AWS
https://huggingface.co/blog/amazon/foundation-model-building-blocks

🆕 Hugging Face 🤝 Hermes Agent 🔥 > we added Hermes Agent to local apps: run it locally with any compatible GGUF/MLX model > shipped native traces support for Hermes Agent: visualize your Hermes traces directly on the Hub. Very soon most agents will run locally and we want to
https://x.com/mervenoyann/status/2053857347429151163

[2605.10730] Qwen-Image-2.0 Technical Report
https://arxiv.org/abs/2605.10730

Exciting: local ML is (finally) going mainstream 🔥 – new GGUF uploads on HF nearly doubled in 2 months – smaller models (like gemma 4 and qwen 27b) starting to be really good and run on a lot of hardware – people forking and vibing over llama.cpp (MTP / DS4 / turboquant…)
https://x.com/victormustar/status/2053780086596288781

Codex powered Hermes Agent? Whaaat??
https://x.com/Teknium/status/2054958835547443553

JUST IN 🔥 Hermes Agent can now route OpenAI turns through the Codex CLI app-server! Your ChatGPT subscription becomes the engine. No API key. No metered tokens. Codex’s sandbox, plugins, and shell tools all run inside a Hermes session. They wrapped OpenAI’s agent runtime as
https://x.com/HermesAgentTips/status/2054963533800992962

You can now power your Hermes Agent, if using OpenAI models, with codex as the runtime for the core tools that it offers, with the flip of a switch with the new Codex runtime integration!
https://x.com/NousResearch/status/2054958564951912714

I have a new job! Excited to announce that I will be working with Hugging Face to make local models work great in OpenClaw and other open agent harnesses! I will be building in public and documenting everything along the way, stay tuned!
https://x.com/onusoz/status/2053812410730037256

“It would be a mistake for any country to try to slow down open source. A country that leads open source is a country that can lead AI in general.” @ClementDelangue, co-founder & CEO @HuggingFace. Watch the full interview on YouTube:
https://x.com/TheTuringPost/status/2053626741302993030

We leveraged two amazing open source projects when building SmithDB. One is @ApacheDataFusio: an extensible Rust based query engine. We built custom execution plans specifically tuned for our workloads and storage backend, and DataFusion made it straightforward to plumb
https://x.com/ankush_gola11/status/2054681251513254260

GB200s change how one does the prefill and decode disaggregation when serving large MoEs like Qwen. We’ve published details of our stack quantifying the throughput benefits compared to serving on Hoppers.
https://x.com/AravSrinivas/status/2054206802133504234

kyutai: open-science AI lab
https://kyutai.org/

Cool idea from Nous Research. What if you could speed up long-context pretraining with a subquadratic wrapper that you remove before deployment? That is the idea behind Lighthouse Attention. The method wraps ordinary SDPA with a hierarchical, gradient-free selection layer that
https://x.com/omarsar0/status/2054224130103554359