Image created with gemini-3.1-flash-image-preview with claude-opus-4.7. Image prompt: Using the provided reference image, preserve every detail exactly — the marigold-orange backdrop, the seated woman’s closed-eyes smile and purple windbreaker, the tattooed singer’s red beanie and layered red vest, the lighting and intimate two-shot framing — but replace only the black handheld microphone with a transparent acrylic modular cube-microphone built from interlocking snap-together open-hardware blocks with visible screws and exposed circuitry inside, held to his mouth in the exact same grip and position, photographed with the same realism and lighting integration. After generating the image, overlay the text “Open Source” in the upper-left corner of the frame in large, bold, all-caps ITC Avant Garde Gothic Pro Medium (or a near-identical geometric sans-serif if unavailable), pure white (#FFFFFF), with no date, subtitle, drop shadow, or outline. The text should be substantial in scale — taking up a meaningful portion of the upper-left area — with comfortable margin from the top and left edges, set against the negative space of the orange backdrop so it does not overlap or obscure the singer, the seated woman, or the replaced object.
Sub-32B open-weights models now offer GPT-5-level intelligence: Qwen3.5 27B (Reasoning) matches GPT-5 (medium) at 42, and Gemma 4 31B (Reasoning) matches GPT-5 (low) at 39, on the Artificial Analysis Intelligence Index. @Alibaba_Qwen's Qwen3.5 and @GoogleDeepMind's Gemma 4
https://x.com/ArtificialAnlys/status/2043929874537296026
⚡ Meet Qwen3.6-35B-A3B: Now Open-Source! 🚀🚀 A sparse MoE model, 35B total params, 3B active. Apache 2.0 license. 🔥 Agentic coding on par with models 10x its active size 📷 Strong multimodal perception and reasoning ability 🧠 Multimodal thinking + non-thinking modes
https://x.com/Alibaba_Qwen/status/2044768734234243427
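The "35B total, 3B active" framing above comes from sparse MoE routing: each token is dispatched to only a few experts, so most weights sit idle per forward pass. A toy numpy sketch of top-k expert routing (illustrative only, not Qwen's actual implementation; all names here are made up):

```python
import numpy as np

def topk_moe_forward(x, expert_weights, router_weights, k=2):
    """Route one token through only its top-k experts (toy sketch)."""
    logits = x @ router_weights                 # router score per expert
    topk = np.argsort(logits)[-k:]              # indices of the k best experts
    gates = np.exp(logits[topk])
    gates /= gates.sum()                        # softmax over selected experts only
    # Only k expert matmuls execute; the other experts' weights are never read.
    return sum(g * (x @ expert_weights[i]) for g, i in zip(gates, topk))

rng = np.random.default_rng(0)
d, n_experts, k = 16, 8, 2
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
router = rng.standard_normal((d, n_experts))
y = topk_moe_forward(rng.standard_normal(d), experts, router, k)
# With k=2 of 8 experts active, only a quarter of the expert parameters
# run per token; that ratio is how 35B-total can mean ~3B active.
```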
LM Performance: Qwen3.6-35B-A3B outperforms the dense 27B-param Qwen3.5-27B on several key coding benchmarks and dramatically surpasses its direct predecessor, Qwen3.5-35B-A3B, especially on agentic coding and reasoning tasks.
https://x.com/Alibaba_Qwen/status/2044768738294268199
VLM Performance: Qwen3.6 is natively multimodal, and Qwen3.6-35B-A3B showcases perception and multimodal reasoning capabilities that far exceed what its size would suggest, with only around 3 billion activated parameters. Across most vision-language benchmarks, its performance
https://x.com/Alibaba_Qwen/status/2044768742761189762
Alibaba released Qwen3.6-35B-A3B today. A big jump compared to the Qwen3.5-35B model. It's a sparse MoE, 35B total params, only 3B active. Natively multimodal, thinking and non-thinking modes. Hard facts: SWE-bench Verified: 73.4, near dense Qwen3.5-27B (75.0), way ahead of
https://x.com/kimmonismus/status/2044780695361290347
Love seeing this open-sourced. Had a great chat with @nicoalbanese10 some weeks ago where he hinted to something like this. Great reference architecture for cloud coding agents. Open Agents gives you the full stack: UI, auth, workflows, sandbox. #DeepAgent from @LangChain takes
https://x.com/bromann/status/2043886229650067729
Just shipped **artifact-preview** for Hermes 🔥 Like Claude Artifacts, build dashboards, games, UIs, get a full interactive preview that instantly opens in a live browser. Real clickable code, smooth refreshes on prompt edits. cc @Teknium
https://x.com/ChuckSRQ/status/2044504539978465658
Qwen 3.6 is here, and open-source! Run it locally with improved agentic coding capabilities. Try it with Claude Code: ollama launch claude --model qwen3.6 Try it with OpenClaw: ollama launch openclaw --model qwen3.6 Run it: ollama run qwen3.6
https://x.com/ollama/status/2044779844672852465
Shocking result on my pelican benchmark this morning, I got a better pelican from a 21GB local Qwen3.6-35B-A3B running on my laptop than I did from the new Opus 4.7! Qwen on the left, Opus on the right
https://x.com/simonw/status/2044830134885306701
New insane model from Jackrong on @huggingface 🤯 Qwen3.5-9B-GLM5.1-Distill-v1 🧠 Distilled on GLM-5.1 reasoning ⚙️ Deeper thinking than base model 🧪 Benchmarks coming soon ✅ Fits on 8GB VRAM ✍️ New model after Qwopus/Gemopus After distilling Claude Opus 4.6, he’s now back
https://x.com/leftcurvedev_/status/2044700338817564814
The PR you would have opened yourself
https://huggingface.co/blog/transformers-to-mlx
Ollama
https://ollama.com/
r/localLlama + r/localLLM + r/sillytavernAI preferred models list – April 2026
| Model | Size/Class | Format | Hosted Provider | Best Local Path | Notes |
|---|---|---|---|---|---|
| Huihui Gemma 4 E2B Abliterated v2 | E2B | GGUF | No | Ollama / llama.cpp | Gemma 4 MoE with ~2B active params. Multimodal (image+text in, text out). Abliterated for reduced refusal. Lightweight enough to run fast, but MoE active-param sizing means quality punches above its weight class. |
| Huihui Gemma 4 E4B Abliterated | E4B | GGUF | No | Ollama / llama.cpp | Same Gemma 4 MoE family as E2B but with ~4B active params. Multimodal. Better quality ceiling than E2B at the cost of more compute per token. |
| SultrySilicon V2 | 7B | GGUF | No | Ollama / llama.cpp | Roleplay-focused 7B model. Smallest in the set. Good for quick creative/RP sanity checks, not for reasoning or instruction-following benchmarks. |
| Huihui-GLM-4.6V-Flash-Abliterated | 9B | GGUF | No | Ollama / llama.cpp | Based on Z.ai GLM-4.6V-Flash. Vision-language model (image+text). Abliterated. Bilingual Chinese/English. Fast inference variant of the GLM-4.6V family. |
| Gemma-2-Ataraxy-9B | 9B | GGUF | No | Ollama / llama.cpp | Merge of Gemma-2-9B-SimPO and Gemma-2-Gutenberg-9B. Creative writing and roleplay oriented. Scored well on EQ-Bench. Good balance of instruction-following and literary quality at 9B. |
| MythoMax-L2-13B | 13B | GGUF | No | Ollama / llama.cpp | By Gryphe. Llama 2 merge of MythoLogic-L2 and Huginn using experimental per-tensor gradient merging. One of the most downloaded RP/creative models ever (~59k GGUF downloads). Strong at both roleplay and storywriting. Alpaca format. The OG. |
| Dan’s PersonalityEngine V1.3.0 | 24B | GGUF | No | Ollama / llama.cpp | Fine-tuned from Mistral Small 3.1 24B Base. Trained on a massive mix: roleplay, storywriting, tool use, math, reasoning, code, medical, legal, and survival topics. Multilingual (EN, AR, DE, FR, ES, HI, PT, JA, KO). A genuine generalist with personality. |
| SuperGemma4 26B Abliterated Multimodal | 26B multimodal | GGUF | No | custom multimodal stack | Based on Gemma 4 26B-A4B. Multimodal (image-text-to-text). Abliterated with low refusal. Optimized for Apple Silicon (MLX). Supports Korean + English. Tool use and coding tags. |
| Gemma 3 27B Abliterated | 27B | GGUF | No | Ollama / llama.cpp | Abliterated version of Google’s Gemma 3 27B instruct. Multimodal (image-text-to-text). Reduced refusal behavior while preserving instruction-following quality. |
| Huihui Gemma 4 31B Abliterated | 31B | GGUF | No | Ollama / llama.cpp | Abliterated Gemma 4 31B instruct. Multimodal (any-to-any pipeline tag). Dense 31B, not MoE. Strongest Gemma 4 dense abliterated option. |
| Gemma 4 31B Abliterated | 31B | GGUF + safetensors | No | Ollama / llama.cpp | Same base as above (Gemma 4 31B-it) but different abliteration method using mlabonne’s harmful_behaviors + harmless_alpaca datasets. Both formats in one repo. |
| Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-Abliterated | 35B A3B | GGUF | No | Ollama / llama.cpp | Qwen 3.5 MoE (35B total, ~3B active). Distilled from Claude 4.6 Opus reasoning. Chain-of-thought and reasoning-focused. Abliterated. Multimodal. Punches well above its active param count on reasoning tasks. |
| Midnight Rose 70B v2.0.3 | 70B | GGUF | No | Ollama / llama.cpp | By sophosympatheia. Complex multi-stage SLERP/DARE-TIES merge of WizardLM, Tulu-2-DPO, Dolphin, and earlier Midnight Rose versions. Uncensored. Designed for roleplay and storytelling. Scored surprisingly high on EQ-Bench even at low quants. ~6k context sweet spot. |
| Midnight Miqu 70B v1.5 | 70B | GGUF | No | Ollama / llama.cpp | Llama-family merge of Midnight-Miqu v1.0 and Tess-70B. Creative writing and roleplay focused. 32k context. Known for strong prose quality and character consistency at 70B scale. |
| Midnight Rose 103B v2.0.3 | 103B | GGUF | No | heavy self-host | Same lineage as the 70B but scaled up. Importance-matrix GGUF by mradermacher. Firmly in the “need real hardware” category. |
| DeepSeek V3 | 671B A37B | safetensors | Yes: DeepInfra, Novita | Hosted preferred | Massive MoE. 671B total, 37B active. Strong on code, math, and instruction-following. Pre-trained on ~15T tokens. Use via OpenRouter, not locally. |
| DeepSeek V3.2 | 685B A37B | safetensors | No confirmed provider yet | Hosted preferred | Successor to V3. Same general architecture class. Not a local play. |
| Behemoth-123B-v1 | 123B | GGUF | No | heavy self-host | Mistral-family 123B. Creative/RP community model. Massive parameter count makes it impractical for casual local use but prized for output quality in the r/LocalLLM community. |
| Monstral-123B | 123B | GGUF | No | heavy self-host | Mistral-family 123B. Text generation and chat focused. Same weight class as Behemoth, different training mix and community lineage. |
| BlackSheep-Large | ~27B | GGUF | No | Ollama / llama.cpp | By TroyDoesAI. Canonical repo is gated. Q8_0 is ~29.5 GB, placing it in the 27B-class. Community RP/creative model. |
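Several entries in the table above (MythoMax, Midnight Rose, Midnight Miqu) are weight merges built on SLERP-style interpolation. A minimal numpy sketch of spherical linear interpolation between two flattened weight tensors (a toy illustration of the idea, not the actual code of any merge tool):

```python
import numpy as np

def slerp(w_a, w_b, t):
    """Spherical linear interpolation between two flattened weight tensors."""
    a = w_a / np.linalg.norm(w_a)
    b = w_b / np.linalg.norm(w_b)
    omega = np.arccos(np.clip(np.dot(a, b), -1.0, 1.0))  # angle between them
    if omega < 1e-8:                      # nearly parallel: plain lerp is fine
        return (1 - t) * w_a + t * w_b
    return (np.sin((1 - t) * omega) * w_a + np.sin(t * omega) * w_b) / np.sin(omega)

rng = np.random.default_rng(0)
w1, w2 = rng.standard_normal(64), rng.standard_normal(64)
merged = slerp(w1, w2, 0.5)   # halfway blend of two "models"
```

Unlike plain averaging, SLERP follows the arc between the two weight vectors, which is why merge recipes often preserve more of each parent's behavior; real tools apply it per tensor or per layer with varying `t`.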
So much in this release but the one many have been waiting for above the rest, the GUI dashboard! Manage and monitor your Hermes Agent with a GUI Local Web Dashboard with `hermes dashboard` command to start it!
https://x.com/Teknium/status/2043771509123232230
Is there somewhere a collection of the best agent/coding harnesses for each models, especially open-source and local ones? In my opinion, the biggest reason why people are struggling with open/local models these days is that the agent/coding harnesses in most open agent are not
https://x.com/ClementDelangue/status/2044139560355901911
"Bookmark this! 10 practical Hermes Agent tutorials that save beginners 10 hours of detours." With Hermes Agent blowing up, we are witnessing a paradigm shift from "passive tool" to "active living entity." What makes Hermes fascinating is not how much work it can deliver right away, but its compounding "self-growth" effect: every line of code you feed it, every conversation, every Profile
https://x.com/biteye_sister/status/2043630704798679545
Added official support to Hermes Agent for: QQBot – hugely popular messaging platform in China AWS Bedrock Model Provider Run `hermes update` in your terminal to access early!
https://x.com/Teknium/status/2044557360962871711
Capable agents are the result of co-evolution between models and harnesses. We’ve been working with @NousResearch to ensure that M2.7 x Hermes Agent provides a top-tier experience for users. Hermes’s self-improving loop brings out the best in M2.7 through real usage. We are
https://x.com/MiniMax_AI/status/2044745282785886469
Finally had the chance to get up and running with @NousResearch Hermes Agent and my impression is great. The thing that has stood out so far: it’s fast, at least twice as fast as OpenClaw (I set up a new instance to test it against) Generally the UX also just feels a lot better
https://x.com/dabit3/status/2043808914312212568
For anyone running @NousResearch Hermes Agent locally and wishing it just stayed online: there’s now a one-click deployment template on Tencent Lighthouse. Cloud-hosted, sandboxed from your local env, online around the clock, reach it through WhatsApp, Telegram, WeCom, QQ, or
https://x.com/TencentAI_News/status/2044007400282436006
hermes agent @NousResearch is fucking insane i know literally NOTHING about coding. ZERO. and i just built a fully functioning web app in minutes
http://localhost:3000/ check it out @Teknium
https://x.com/friesmakesfries/status/2044751296641802481
hermes is so much better than openclaw hype is crazy
https://x.com/theCTO/status/2044559179151773933
Hermes is just too good. I installed it on Windows too, and the process is dead simple. I recommend installing it manually yourself instead of going through Claude Code. 1. Install WSL2: wsl --install 2. Restart your machine and launch Ubuntu: wsl 3. Run the official install command: curl -fsSL
https://t.co/voDBXKw7Py | bash
https://x.com/aiqiang888/status/2043920187959992609
hermes-lcm v0.3.0 is out — biggest release yet!🚀 What’s new: – Smart search with sort modes (recency / relevance / hybrid) + full CJK & emoji support – Adaptive compaction that scales with backlog pressure and auto-retries on model limits – SQLite hardening: FTS auto-repair,
https://x.com/SteveSchoettler/status/2044536537434755493
I put 2 separate instances of Hermes agents into a chat, holy sh!t this is fun >1 agent is builder, 1 is strategist >each on separate models >gave them some shared context >enabled bot2bot and added each bot to the other's TG allowlist >put 3 of us in a gc >started with a simple
https://x.com/KSimback/status/2044736703370309706
Introducing Mirra Workspaces Workspaces give your local agents access to a shared multi-tenant environment. Our customers are already using Cloud Workspaces to automatically share context between their team member’s agents. Workspaces work best with @NousResearch Hermes, which
https://x.com/mirra/status/2044762744998519282
Introducing the Nous Portal Tool Gateway, one login to access over 400 LLMs and power all core tools in Hermes Agent. Check it out below!
https://x.com/Teknium/status/2044879261564375326
M2.7 w/ hermes cli is replacing ~75% of my claude code / opus usage now, but we need clarity for using it as a coding agent @ work. We’re truly blessed to have the weights of this one, looking forward to seeing the license change. Definitely a model worth checking out.
https://x.com/Sentdex/status/2044108342147060067
Pliny used Hermes Agent to do the abliteration! Very Cool!
https://x.com/Teknium/status/2044482769536045194
The Hermes Agent dashboard is here! Run ‘hermes dashboard’
https://x.com/NousResearch/status/2043791876835156362
The update V0.9.0 changes everything for Hermes Agent! You have now: – Web UI – Model switching – iMessage & WeChat integration – Backup & Restore, no more debugging for hours – Android via Tmux, yes, your Android can host Hermes Great work @NousResearch and the +20
https://x.com/AntoineRSX/status/2043884430901850271
This Hermes update is going to be the thing that gives @NousResearch their openclaw moment. Hermes just dropped a UI dashboard And I truly believe that this is what is going to give Hermes their openclaw moment. The team has spent months dialling everything in so that the
https://x.com/Shaun__Furman/status/2043820083114545416
This is the Hermes Agent article you need! New or experienced, most users end up with messy sessions or use them suboptimally. One of the biggest upgrades is learning how to manage sessions properly: >resume by title >rename threads >branch conversations >export history
https://x.com/NeoAIForecast/status/2044521045013762389
This skill is now built in to Hermes! Use /architecture-diagram <prompt> after updating hermes, and you’re good to go! Thanks to the author of the skill making it MIT we were able to port it over directly into Hermes Agent as a built in skill!
https://x.com/Teknium/status/2044190761609244986
today’s @NousResearch Hermes Agent prompt: i want you to pick one skill every 8 hours to evolve and do it. do whatever you need to do to and use whatever you need to get it done. Nora’s response: Let me build proper tracking. The tracker is already picking up data (the
https://x.com/chooseliberty/status/2044425487141781660
Tool Gateway is now live in Nous Portal. No separate accounts, no API key juggling. All you need is one subscription, and everything works. A paid Nous Portal subscription now includes access to 300+ models and a growing set of third-party tools. Launching with: → Web
https://x.com/NousResearch/status/2044878344592699744
tried hermes yesterday light years ahead of openclaw UX is just so much better, it's wild feels like it's made by someone that actually cares about architecture and user experience still not sure why anyone would use this over something fully hosted like Poke, but if you
https://x.com/robinebers/status/2043835216670929005
Using Hermes after OpenClaw is like having an ice-cold glass of water in hell. @NousResearch 🫡
https://x.com/vrloom/status/2044506378103099816
Was able to get a slick native swift desktop app v1.0 up and running for Hermes agent today (credit to redsparklabs) Can I get a few people to alpha test it with me? Works great for me so far! 🚀 DM me! @Teknium @NousResearch Check out this beauty!
https://x.com/nesquena/status/2044516572983923021
Can a cloud server run Hermes browser automation? Yesterday I recorded a short video: I used Hermes's /browser connect to hook directly into my local Chrome and had it go like tweets on its own. I was just testing it out, but the video got a surprising number of views; maybe it made AI capabilities feel concrete for a lot of people. Thanks also to Hermes developer @Teknium for the retweet, haha.
https://x.com/0xme66/status/2044755328391319757
Hermes can now take the /browser connect command to drive the browser. I tried it by liking my own posts on X and it felt great. It ships with some default execution strategies; give them a try~
https://x.com/0xme66/status/2044410470770331913
10 open projects for AI security (lightweight alternatives while Mythos is closed) ▪️ NVIDIA NeMo Guardrails ▪️ Promptfoo ▪️ LLM Guard ▪️ NVIDIA garak LLM vulnerability scanner ▪️ DeepTeam ▪️ Llama Prompt Guard 2-86M ▪️ ShieldGemma 2 ▪️ OpenGuardrails ▪️ Cupcake ▪️ CyberSecEval
https://x.com/TheTuringPost/status/2043332388785426498
The inevitable need for an open model consortium
https://www.interconnects.ai/p/the-inevitable-need-for-an-open-model
The open-source AI community just got a new home for their data workflows. 🤗 @huggingface is now available in Adaptive Data. Pull datasets directly into a platform that evolves with the problems you’re solving.
https://x.com/adaption_ai/status/2044717435127742693
new open-source Bonsai models are out 🔥 > ternary weights in 8B (1.75 GB), 4B (0.86 GB), and 1.7B (0.37 GB) > comes in MLX and ONNX weights plus a WebGPU browser demo 😍 > Apache 2.0 licensed 👏
https://x.com/mervenoyann/status/2044841709075411047
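Ternary weights constrain every weight to {-1, 0, +1} times a shared scale, which is why an 8B model can fit in ~1.75 GB (about 1.75 bits per weight packed; the information-theoretic floor is log2(3) ≈ 1.58 bits). A toy sketch of the quantization step and the size arithmetic (my own illustration, not Bonsai's actual scheme; `threshold` is a made-up hyperparameter):

```python
def ternary_quantize(w, threshold=0.05):
    """Map each float weight to {-1, 0, +1} with one shared scale (toy sketch)."""
    scale = max(abs(x) for x in w) or 1.0
    q = [0 if abs(x) / scale < threshold else (1 if x > 0 else -1) for x in w]
    return q, scale

def ternary_size_gb(n_params, bits_per_weight):
    """Packed storage estimate: ~1.58 bits/weight ideal, ~1.75-2 in practice."""
    return n_params * bits_per_weight / 8 / 1e9

q, s = ternary_quantize([0.9, -0.4, 0.01, 0.0])   # -> [1, -1, 0, 0], scale 0.9
size = ternary_size_gb(8e9, 1.75)                  # ≈ 1.75 GB for an 8B model
```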
🎉 Congrats @Alibaba_Qwen on the first open-weight Qwen3.6! Stronger agentic coding and a new thinking preservation option to retain reasoning context across turns. Same architecture as Qwen3.5, so serving teams can upgrade in place. Day-0 support in vLLM v0.19+. Thinking, tool
https://x.com/vllm_project/status/2044787721538060784
Introducing Nucleus-Image: the first sparse Mixture-of-Experts diffusion model 17B parameters. Only 2B active. 10x more parameter-efficient than leading diffusion models. Toe-to-toe with GPT Image 1, Imagen 4, and Qwen-Image: from pure pre-training alone. No DPO. No RL. No
https://x.com/withnucleusai/status/2044412335473713284
Qwen/Qwen3-Coder-Next · Hugging Face
https://huggingface.co/Qwen/Qwen3-Coder-Next
We built FrogsGame as a new task for evaluating AI’s posttraining skills! It’s a tool-using RL environment built around a blind-start interaction loop. Frontier agents get a container with the Qwen3-8B tokenizer, board-generating scaffolding, and @tinkerapi for remote training
https://x.com/karinanguyen/status/2044885375085339023
2-bit Qwen3.6-35B-A3B did a complete repo bug hunt with evidence, repro, fixes, tests and a PR writeup. 🔥 Run it locally in Unsloth Studio with just 13GB RAM. 2-bit Qwen3.6 GGUF made 30+ tool calls, searched 20 sites and executed Python code. GitHub:
https://x.com/UnslothAI/status/2044858346948464743
Qwen3.6-35B-A3B can now be run locally!💜 The model is the strongest mid-sized LLM on nearly all benchmarks. Run on 23GB RAM via Unsloth Dynamic GGUFs. GGUFs to run:
https://t.co/VlyW8UwDjw Guide:
https://x.com/UnslothAI/status/2044786492451778988
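The RAM figures quoted in the Unsloth posts fall out of simple bits-per-weight arithmetic. A back-of-envelope sketch (the function name is mine; real GGUF files add metadata, and runtime use adds KV cache and buffers on top of the weights):

```python
def quantized_weight_gb(total_params, bits_per_weight):
    """Rough weight-only size of a quantized model in GB (decimal)."""
    return total_params * bits_per_weight / 8 / 1e9

# 35B total params at ~2 bits/weight is under 9 GB of raw weights, so a
# quoted ~13 GB RAM figure plausibly covers weights plus runtime overhead.
two_bit = quantized_weight_gb(35e9, 2)     # 8.75
```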
Medical AI models now run on iPhone. No cloud. No API. OpenMed 1.0.0 just shipped. MLX backend for Apple Silicon. Swift package for macOS and iOS. 200+ PII detection models across 8 languages. pip install openmed Open source. Apache 2.0.
https://x.com/MaziyarPanahi/status/2044037968659103806
Introducing Kernels on the Hugging Face Hub ✨ What if shipping a GPU kernel was as easy as pushing a model? – Pre-compiled for your exact GPU, PyTorch & OS – Multiple kernel versions coexist in one process – torch.compile compatible – 1.7x-2.5x speedups over PyTorch baselines
https://x.com/ClementDelangue/status/2044053580504584349
We’re open-sourcing webAI-ColVec1. #1 on ViDoRe V3. Two of the top three spots. For a long time, multimodal RAG looked like a scaling problem. It’s not. Frontier-level retrieval. No OCR. No preprocessing. Built for real documents.
https://x.com/thewebAI/status/2044435998508240926