Image created with Flux Pro v1.1 Ultra. Image prompt: Giant “100” as pure white negative‑space cutout dominating the frame; minimalist poster style; branch‑and‑merge diagram weaving through the zeros; forest‑white backdrop; high contrast, crisp edges, soft studio light, no other text, no logos
Connectors and persistent conversations in the Responses API: https://x.com/gdb/status/1958691151139283454
ByteDance just open-sourced a desktop automation AI agent. This agent can use any desktop app, open files, and browse websites using vision models running locally. 100% free, open source, and local. https://x.com/unwind_ai_/status/1956538069311500514
Grok 2 from @xai has just been released on @huggingface: https://x.com/ClementDelangue/status/1959356467959439464
Grok-2 has been “open sourced” but has one of the worst licenses of any recent major open-weights release. Given that it’s already quite outdated by the time they got around to releasing it, combined with the license, it will see little use. It’s dead on arrival. https://x.com/xlr8harder/status/1959490601264533539
Pretty cool that they open sourced the actual full-sized production model. Here’s the Grok 2.5 architecture overview next to a roughly similarly sized Qwen3 model. The MoE residual is quite interesting. Kind of like a shared expert. I don’t think I’ve seen this setup before. https://x.com/rasbt/status/1959643038268920231
The @xAI Grok 2.5 model, which was our best model last year, is now open source. Grok 3 will be made open source in about 6 months. https://x.com/elonmusk/status/1959379349322313920
xAI just released Grok 2 on Hugging Face. This massive 500GB model, a core part of xAI’s 2024 work, is now openly available to push the boundaries of AI research. https://x.com/HuggingPapers/status/1959345658361475564
xai-org/grok-2 · Hugging Face https://huggingface.co/xai-org/grok-2
Grok now has a model card – which is a big step forward! But it is light on details, with unexplained results. Some examples: if the MASK measurement is the same as in the source paper, .43 would be a fairly high level of deception; also, the sycophancy score is hard to interpret https://x.com/emollick/status/1959116132096336066
OpenAI just released HealthBench on Hugging Face. This new dataset is designed for rigorously evaluating large language models’ capabilities in improving human health. A vital step for AI in medicine! https://x.com/HuggingPapers/status/1960749923218895332
Some of the quotes from ChatGPT in this piece are quite eye-opening. OpenAI’s safeguards clearly failed here. https://x.com/lefthanddraft/status/1960340188145787005
Qwen-Code weekly release (v0.0.8): ✨ Deep VS Code Integration: Get context-aware suggestions & inline diffs directly in your editor! Initialize with /ide and supercharge your workflow. 🔌 Enhanced MCP Support: Add, remove, and list MCP servers via the CLI (qwen mcp add|remove|list)… https://x.com/Alibaba_Qwen/status/1959170659583476026
microsoft is dropping (still uploading) VibeVoice-1.5B model on @huggingface! i love the multi-speaker conversational audio feature for podcasts! https://x.com/MaziyarPanahi/status/1959994276198351145
microsoft/VibeVoice-1.5B · Hugging Face https://huggingface.co/microsoft/VibeVoice-1.5B
VibeVoice A Frontier Open-Source Text-to-Speech Model https://x.com/_akhaliq/status/1960106923191140373
VibeVoice is a framework from @MSFTResearch for generating expressive, long-form, multi-speaker audio conversations. Create podcasts from text. MIT licensed🔥 Synthesize speech up to 90 minutes long with 4 distinct speakers 🤯 https://x.com/Gradio/status/1960023019239133503
weekly tokens processed went from ~111B to 3.21T in 1 year *on OpenRouter https://x.com/scaling01/status/1960113882607067569
Didn’t see anyone talk about this. The new ByteDance OSS model has special tokens in its CoT where it automatically checks how many tokens of its thinking budget it has used and how many remain. https://x.com/nrehiew_/status/1959437761188163872
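The post doesn’t show the exact token format, but the mechanism is easy to sketch: the decoder periodically emits a reflection span comparing tokens spent against the thinking budget. The tag name and field names below are invented stand-ins for illustration, not Seed-OSS’s actual special-token vocabulary:

```python
# Hypothetical sketch of a thinking-budget check inside a CoT stream.
# The <budget_reflect> tag and its fields are made up for illustration;
# the real Seed-OSS special tokens are not shown in the post.
def budget_reflection(tokens_used: int, budget: int) -> str:
    remaining = max(budget - tokens_used, 0)
    return f"<budget_reflect>used={tokens_used} remaining={remaining}</budget_reflect>"

print(budget_reflection(512, 2048))
# prints: <budget_reflect>used=512 remaining=1536</budget_reflect>
```

In the real model these would be dedicated vocabulary entries emitted during decoding, not formatted strings, but the bookkeeping they express is the same.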
TikTok parent company ByteDance releases new open source Seed-OSS-36B model with 512K token context https://venturebeat.com/ai/tiktok-parent-company-bytedance-releases-new-open-source-seed-oss-36b-model-with-512k-token-context
Command A Translate: Secure translation for global enterprises https://cohere.com/blog/command-a-translate
Introducing Command A Reasoning, our most advanced model for enterprise reasoning tasks. https://x.com/cohere/status/1958542682890047511
Try the model on our platform, deploy it privately, or for research use on @huggingface. Find more details in our blog: https://x.com/cohere/status/1961081787674763525
🤖 DeepSeek-V3.1 just landed on Together AI 671B hybrid model with: ⚡ Fast mode for routine tasks 🧠 Thinking mode for complex problems Our infrastructure is built for massive MoE models like this. 99.9% uptime means your reasoning workflows actually work in production. https://x.com/togethercompute/status/1960835568574578736
Introducing DeepSeek-V3.1: our first step toward the agent era! 🚀 🧠 Hybrid inference: Think & Non-Think — one model, two modes ⚡️ Faster thinking: DeepSeek-V3.1-Think reaches answers in less time vs. DeepSeek-R1-0528 🛠️ Stronger agent skills: Post-training boosts tool use and… https://x.com/deepseek_ai/status/1958417062008918312
DeepSeek-V3.1 Is 2x Cheaper than GPT-5 https://analyticsindiamag.com/ai-news-updates/deepseek-v3-1-is-2x-cheaper-than-gpt-5/
DeepSeek V3.1 dropped last week and is showing a 9.9% diff edit failure rate in real-world usage. That’s higher than Qwen 3 Coder (6.1%) but still solid for an open-source model for an early release. What do you think? How’s it been in Cline so far? https://x.com/cline/status/1960565950442578418
open source nano banana? bytedance just dropped USO, an open source editing model that… just works https://x.com/multimodalart/status/1961147988258295893
🔔 Two months ago, we released #IneqMath, which revealed the Soundness Gap: LLMs can guess answers to Olympiad-level inequalities problems, but still struggle to make rigorous proof steps. Since then, it’s been downloaded 4K+ times on HuggingFace! ➡️ https://x.com/lupantech/status/1960384184842879444
A big milestone for Hermes. We did a lot of work to make a frontier-level open model that does not dictate what expression you can elicit from it. Super strong at math, coding, STEM, and creativity. Model weights: https://x.com/Teknium1/status/1960420619620901135
Hermes 4 – Nous Research https://hermes4.nousresearch.com/
Hermes 4 technical breakdown: ▫️ Open Source LLM ▫️ Fine-tune of Llama 3.1 ▫️ 405B & 70B params ▫️ Hybrid reasoning ▫️ Trained on 3.5 million reasoning samples ▫️ Trained using 192 NVIDIA B200 GPUs ▫️ Uncensored ▫️ Steerable, aligned to the user ▫️ Creativity enhanced (like… https://x.com/vectro/status/1960734604601569560
Nous Research presents Hermes 4, our latest line of hybrid reasoning models. https://x.com/NousResearch/status/1960416954457710982
Fourth model launch of the day 🔥 – introducing Hermes 4, from @NousResearch Hermes 4 is trained for steerability and lower refusal rates, topping RefusalBench and beating Grok 4 https://x.com/OpenRouterAI/status/1960436262923592065
Ollama v0.11.7 is available with DeepSeek v3.1 support. You can run it locally with all its features like hybrid thinking. This works across Ollama’s new app, CLI, API, and SDKs. Ollama’s Turbo mode that’s in preview has also been updated to support the model! https://x.com/ollama/status/1960463433515852144
microsoft (Microsoft) https://huggingface.co/microsoft
🤖 From this week’s issue: The NVIDIA team released the NVIDIA Nemotron Nano 2 family of accurate and efficient hybrid Mamba-Transformer reasoning models. https://x.com/dl_weekly/status/1960321337248944130
Efficient Language Model with PostNAS NVIDIA’s recent research on LLMs has been fantastic. Jet-Nemotron is the latest in efficient language models, which significantly improves generation throughput. Here are my notes: https://x.com/omarsar0/status/1960724749790929009
NVIDIA has released Nemotron Nano 9B V2, a small 9B reasoning model that scores 43 on the Artificial Analysis Intelligence Index, the highest yet for <10B models. Nemotron 9B V2 is the first Nemotron model pre-trained by @NVIDIA. Previous Nemotron models have been developed by… https://x.com/ArtificialAnlys/status/1960504310309249045
NVIDIA release announcement with all the technical details: https://x.com/ArtificialAnlys/status/1960504316550373657
We just released Nemotron-CC-Math 🚀 Equations on the web aren’t just LaTeX: they’re in MathML, <pre> tags, inline, even images. Code shows up in just as many ways. Most parsers drop it. Nemotron-CC-Math (133B tokens) reprocesses CommonCrawl math pages to capture math equations + code reliably. https://x.com/KarimiRabeeh/status/1960682448867426706
Results: Jet-Nemotron-2B outperforms or matches small full-attention models on MMLU, MMLU-Pro, BBH, math, commonsense, retrieval, coding, and long-context tasks, all while delivering up to 47x decoding throughput at 64K context, and as high as 53.6x decoding and 6.14x prefilling speedups https://x.com/omarsar0/status/1960724855709688053
🔄 Revision Rollbacks just landed in LangGraph Platform! ✨Redeploy any previous revision, making it easy to revert changes if you notice unwanted behavior in your deployment🚀 https://x.com/LangChainAI/status/1960082101065388138
🔥 Revision Queueing in LangGraph Platform! Any new revisions will be queued up and proceed after the first completes. Developer workflows on LangGraph Platform just got another boost 🏃 https://x.com/LangChainAI/status/1960118072984911948
we are opening our first office in india later this year! and i’m looking forward to visiting next month. ai adoption in india has been amazing to watch–chatgpt users grew 4x in the past year–and we are excited to invest much more in india! https://x.com/sama/status/1958922390731464805
vLLM + open-webui running gpt-oss-120b on a tinybox green v2. Just `vllm serve openai/gpt-oss-120b --tensor-parallel-size 4 --async-scheduling` and you have a local OpenAI API you can trust. https://x.com/__tinygrad__/status/1959862336501715430
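Once the server is up, any OpenAI-compatible client can talk to it. A minimal stdlib sketch of the request, assuming vLLM’s default address of `localhost:8000` (the prompt text is just an example):

```python
import json
import urllib.request

# vLLM's `serve` exposes an OpenAI-compatible endpoint; port 8000 is the default.
BASE_URL = "http://localhost:8000/v1/chat/completions"

payload = {
    "model": "openai/gpt-oss-120b",
    "messages": [{"role": "user", "content": "Say hello in one word."}],
    "max_tokens": 16,
}

req = urllib.request.Request(
    BASE_URL,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# Uncomment once the server is running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
print(json.dumps(payload, indent=2))
```

The same payload works with the official `openai` Python client by pointing `base_url` at the local server.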
…so update. Article soon. I just got a time-to-first-token for Qwen3 480B of 25.66s. And I might be able to go faster with faster drives. https://x.com/TheZachMueller/status/1959643512695054638
Since everyone is talking about RL Environments and GRPO now but no one knows how it works we thought it would be cool to make an explainer video + code you can run: This is an example of using GRPO to train Qwen 2.5 to play 2048 (code in thread) 🧵: https://x.com/jayendra_ram/status/1960157842620498107
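The core of GRPO is simple to state even without the full training loop: sample a group of rollouts for the same prompt, score each with a reward, and normalize each reward against the group’s mean and standard deviation to get a relative advantage. A minimal sketch of that group-relative step (the reward values are illustrative, e.g. scores from games of 2048):

```python
import statistics

def grpo_advantages(rewards):
    """Group-relative advantages as used in GRPO: each rollout's reward
    is normalized by the mean and std of its own sampled group, so no
    learned value function is needed."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against uniform groups
    return [(r - mean) / std for r in rewards]

# Example: 4 rollouts of the same prompt, each scored by a reward function
print(grpo_advantages([10.0, 2.0, 6.0, 6.0]))
```

Rollouts that beat their group average get positive advantage and are reinforced; below-average rollouts are pushed down. This is the piece GRPO swaps in for PPO’s critic.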
🧸Turn your favorite memes into real-world collectibles — one click with Qwen-Image-Edit. 🎁Try it now: https://x.com/Alibaba_Qwen/status/1959507306774999389
🪂 Qwen-Image-Edit is now live on chutes. Absolutely epic model. The playground needs a couple of updates to work, but you can use it in the API right now; it takes minimally “prompt” and “image_b64” fields as inputs, other args optional. https://x.com/jon_durbin/status/1959230037036519724
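Building that request body is straightforward: base64-encode the image bytes and pair them with the prompt. The endpoint URL and auth headers aren’t shown in the post, so this sketch only constructs the payload; the image bytes and prompt text are placeholders:

```python
import base64
import json

# Placeholder bytes; in practice, read your image file with open(path, "rb").
image_bytes = b"\x89PNG\r\n\x1a\n"

# The post names only these two required fields; other args are optional.
payload = {
    "prompt": "turn this meme into a collectible figurine",
    "image_b64": base64.b64encode(image_bytes).decode("ascii"),
}
print(json.dumps(payload))
```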
🚀 Excited to introduce Qwen-Image-Edit! Built on 20B Qwen-Image, it brings precise bilingual text editing (Chinese & English) while preserving style, and supports both semantic and appearance-level editing. ✨ Key Features ✅ Accurate text editing with bilingual support… https://x.com/Alibaba_Qwen/status/1957500569029079083
Qwen Edit is so good at outpainting https://x.com/linoy_tsaban/status/1959989758475780523
first vision language model built off @OpenAI gpt-oss just dropped! 🔥 InternVL3.5 comes with 32 models 🤯 pre-trained, fine-tuned, aligned in various sizes comes with gpt-oss or Qwen3 for LLM part ⤵️ https://x.com/mervenoyann/status/1960298636610326564
In collaboration with @NASA, we’re open-sourcing Surya—the first foundation model for solar physics: https://x.com/IBMResearch/status/1958275705952731475
Also first time I hear about this South Korean company; it didn’t get the attention it deserves imo 👀 paper: https://x.com/eliebakouch/status/1959598956540755984
Wow, pretty cool that they also open sourced a FSDP2 compatible Muon and PolyNorm working with @huggingface kernels! https://x.com/eliebakouch/status/1959652478422536611
🚀 LLM Compressor v0.7.0 is here! This release brings powerful new features for quantizing large language models, including transform support (QuIP, SpinQuant), mixed precision compression, improved MoE handling with Llama4 support, and more. Full blog: https://x.com/vllm_project/status/1960432740672921934
Three ways to code for free just landed in Cline v3.26.6. Cloud speed with @grok Code Fast 1, local privacy via @LMStudio, or generous daily limits via the Qwen Code provider –> pick your path. 🧵 https://x.com/cline/status/1961201105729401060