Image created with gemini-2.5-flash-image (prompt drafted with claude-sonnet-4-5). Image prompt: A 1961 Ferrari 250 GT California Spyder in Rosso Corsa red with hood open revealing a pristine chrome V12 engine in a bright modern workshop, warm cinematic lighting, technical blueprints and precision tools on nearby workbench, transparent accessibility, high-gloss automotive photography with soft depth of field, no people.

🤯 400 tokens/s on a MacBook? Yes, you read that right! Shaohong Chen just fine-tuned the Qwen3-0.6B LLM in under 2 minutes using Apple’s MLX framework. This is how you turn your MacBook into a serious LLM development rig. A step-by-step guide and performance metrics inside! 🧵 https://x.com/ModelScope2022/status/1977706364563865805
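For context, an MLX LoRA fine-tune of a small Qwen model is only a couple of commands with the mlx-lm package; a minimal sketch, assuming `./data` contains `train.jsonl`/`valid.jsonl` (the data path and iteration count are illustrative, not from the original post):

```shell
pip install -U mlx-lm

# LoRA fine-tune Qwen3-0.6B on a local JSONL dataset
mlx_lm.lora --model Qwen/Qwen3-0.6B --train --data ./data --iters 200

# Generate with the trained adapters (saved to ./adapters by default)
mlx_lm.generate --model Qwen/Qwen3-0.6B --adapter-path adapters \
  --prompt "Hello"
```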

ByteDance just released Sa2VA on Hugging Face. This MLLM marries SAM2 with LLaVA for dense grounded understanding of images & videos, offering SOTA performance in segmentation, grounding, and QA. https://x.com/HuggingPapers/status/1978745567258829153

Meta just dropped MobileLLM-Pro on Hugging Face: a 1B-parameter foundational language model in the MobileLLM series, designed to deliver high-quality, efficient on-device inference across a wide range of general language modeling tasks. It ships in two variants, including a pre-trained base model. https://x.com/_akhaliq/status/1978916251456925757

Just type “add a girlfriend” to any video on the new Grok Imagine https://x.com/elonmusk/status/1977982448861381081

We just dropped our Open Agent Builder example app 👀 A 100% open-source, n8n-style workflow builder powered by Firecrawl, @LangChainAI, @convex_dev, @ClerkDev and more. Check it out! https://x.com/firecrawl_dev/status/1978878728827478289

Cua ❤️ open-weight models. Highly requested by the Discord community, we tested Moondream3 and Salesforce GTA-1 for UI grounding in computer-use agents. 1/3 https://x.com/trycua/status/1976001242901119401

Agents don’t just chat; they act by fetching data, sending messages, calling APIs, and updating records, which makes securing them a whole new challenge. Our latest blog post breaks down how to implement authentication and authorization for agents. 🔒 https://x.com/LangChainAI/status/1978121116867567644
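The core pattern here is gating every agent action on the acting user's permissions. A minimal sketch in plain Python (the scope names and registry are hypothetical illustrations, not LangChain's actual API):

```python
from dataclasses import dataclass, field

@dataclass
class Principal:
    """The user an agent acts on behalf of (hypothetical shape)."""
    user_id: str
    scopes: set = field(default_factory=set)

# Each tool an agent can call is mapped to the scope it requires;
# tools not listed here are denied by default.
REQUIRED_SCOPES = {
    "fetch_data": "data:read",
    "send_message": "messages:write",
    "update_record": "crm:write",
}

def authorize(principal: Principal, tool: str) -> bool:
    """True only if the acting user holds the scope the tool requires."""
    needed = REQUIRED_SCOPES.get(tool)
    return needed is not None and needed in principal.scopes

def call_tool(principal: Principal, tool: str, fn, *args):
    """Check authorization before executing any tool on the agent's behalf."""
    if not authorize(principal, tool):
        raise PermissionError(f"{principal.user_id} may not call {tool}")
    return fn(*args)
```

The deny-by-default registry is the important design choice: an agent that hallucinates a tool name gets a `PermissionError`, not silent access.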

Open-sourcing retrieve-dspy! 💻🚀 While developing Search Mode for Weaviate’s Query Agent, we dove into the literature. It was amazing, and overwhelming, to see how many different takes on Compound Retrieval Systems there are, from perspectives on reranking and beyond! 📚 https://x.com/CShorten30/status/1978567334424932523

Salesforce AI Research introduces MCP-Universe: the first benchmark to truly test LLM agents in real-world scenarios with live Model Context Protocol servers. https://x.com/HuggingPapers/status/1959347736429674567

Claude-Haiku-4.5 is out in anycoder on Hugging Face https://x.com/_akhaliq/status/1978578382309753205

We have a new comprehensive Model Context Protocol (MCP) documentation section to help you connect your AI applications to external tools and data sources through a standardized interface. 🔌 Learn how MCP works, from connecting LLMs to databases, tools, and services. https://x.com/llama_index/status/1957840992360710557

Just shipped Privacy AI 1.3.2. This update adds full MLX model support: you can now run text and vision models locally using Apple’s MLX engine. Models can be downloaded directly from Hugging Face, and the new download manager supports resume-on-failure and more. https://x.com/best_privacy_ai/status/1977736637086920765
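Resume-on-failure downloads generally come down to HTTP range requests: check how many bytes are already on disk, then ask the server for the rest. A minimal sketch (function names are mine, not Privacy AI's), assuming the server honors `Range`:

```python
import os
import urllib.request

def resume_headers(path: str) -> dict:
    """Build a Range header to resume a partial download (RFC 9110)."""
    done = os.path.getsize(path) if os.path.exists(path) else 0
    return {"Range": f"bytes={done}-"} if done else {}

def download_with_resume(url: str, path: str, chunk: int = 1 << 16):
    """Append only the missing bytes; a 206 response continues where
    the previous attempt left off, a 200 restarts from byte zero."""
    req = urllib.request.Request(url, headers=resume_headers(path))
    with urllib.request.urlopen(req) as resp, open(path, "ab") as f:
        while block := resp.read(chunk):
            f.write(block)
```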

Qwen3-VL 30B-A3B at 4-bit precision, running on Apple silicon at 80 tok/s with MLX! @awnihannun @Prince_Canuma @ostensiblyneil @lmstudio https://x.com/vincentaamato/status/1977776546736713741

Beautiful work from @sundarpichai @demishassabis and team with open weights on HF: https://x.com/ClementDelangue/status/1978613819451551780

Introducing: HuggingChat Omni 💫 Select the best model for every prompt automatically 🚀 – Automatic model selection for your queries – 115 models available across 15 providers. Available now to all Hugging Face users. 100% open source. https://x.com/victormustar/status/1978817795312808065
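Omni's router is itself a model, but the idea of per-prompt selection can be illustrated with a toy rule-based router (the model names and keywords below are placeholders I made up, not Omni's actual routing table):

```python
# Toy per-prompt model routing: first matching keyword group wins.
ROUTES = [
    (("code", "python", "bug"), "Qwen/Qwen3-Coder"),
    (("image", "photo", "diagram"), "Qwen/Qwen3-VL-8B"),
]
DEFAULT_MODEL = "meta-llama/Llama-3.1-8B-Instruct"

def route(prompt: str) -> str:
    """Pick a model for this prompt; fall back to a generalist."""
    p = prompt.lower()
    for keywords, model in ROUTES:
        if any(k in p for k in keywords):
            return model
    return DEFAULT_MODEL
```

A production router replaces the keyword table with a learned classifier, but the interface (prompt in, model id out) stays the same.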

The model + resources are now on HuggingFace and GitHub so researchers can keep building and experimenting. More details here: https://x.com/sundarpichai/status/1978507235610226931

What if you could orchestrate your entire microservices setup using LlamaIndex Workflows? That’s the idea we explored, and the result is a working demo that brings together @docker, @apachekafka and Workflows to manage a small-scale e-commerce system. https://x.com/llama_index/status/1978137596900593667

Here are some long-awaited performance numbers for DGX Spark using llama.cpp https://x.com/ggerganov/status/1978106631884828843

Tiny Recursion Model (TRM) results on ARC-AGI – ARC-AGI-1: 40%, $1.76/task – ARC-AGI-2: 6.2%, $2.10/task. Thank you to @jm_alexia for contributing TRM, a well-written, open-source, and thorough piece of research for the community, based on the HRM from @makingAGI https://x.com/arcprize/status/1978872651180577060

ICYMI: there are two new Nanonets OCR models, Nanonets-OCR2-3B and Nanonets-OCR2-1.5B-exp 🙌🏻 These models can handle forms (checkboxes), recognize watermarks, describe images and charts in docs, and more! They even handle flowcharts 🤯 They're multilingual and Apache-2.0 licensed 🙏🏻 https://x.com/mervenoyann/status/1978837720353927415

BOOM: We’ve just re-launched HuggingChat v2 💬 – 115 open-source models in a single interface, stronger than ChatGPT 🔥 Introducing: HuggingChat Omni 💫 > Select the best model for every prompt automatically 🚀 > Automatic model selection for your queries https://x.com/reach_vb/status/1978854312647307426

I asked Sam Altman if he’s worried about watermark removers on Sora. Everyone’s seen the unfiltered Jake Paul videos with no watermark (generated by Sora) going viral on socials… So what’s the pull for creators to enable face cameos with that risk? Sam’s take: open source… https://x.com/rowancheung/status/1977769948882559135

npm install -g cline Not just Cline in the terminal, but the primitive you can build on. > scriptable > open-source > open-model Return to the primitives. (Preview) https://x.com/cline/status/1978874789193486749

Introducing the most advanced open-source chat template built on top of AI SDK. → Agents (Orchestration, handoffs & context guardrails) → Artifacts (Canvas, Charts) → Rate limits (Messages, tool permissions) Link ⬇️🧵 https://x.com/pontusab/status/1976304838200983572

Am I wrong in sensing a paradigm shift in AI? Feels like we’re moving from a world obsessed with generalist LLM APIs to one where more and more companies are training, optimizing, and running their own models built on open source (especially smaller, specialized ones). https://x.com/ClementDelangue/status/1978113358772449379

State of DeepSeek models: DeepSeek launched its new V3.1 Terminus and V3.2 Exp hybrid reasoning models in quick succession in September, making meaningful steps in both intelligence and cost efficiency. Both models can be used in reasoning and non-reasoning modes. https://x.com/ArtificialAnlys/status/1977809542621851654

Qwen3-VL 235B is now live on Ollama’s cloud, free to try! https://x.com/Alibaba_Qwen/status/1978288558587674672

Qwen3-VL 235B is available on Ollama’s cloud! It’s free to try. ollama run qwen3-vl:235b-cloud The smaller models, and the ability to run fully on-device will be coming very soon! See examples and how to use the model on Ollama! 👇👇👇 https://x.com/ollama/status/1978225292784062817

To celebrate this, we built some notebooks to fine-tune Qwen3-VL-4B with SFT/GRPO in a free Colab notebook 🥹 https://x.com/mervenoyann/status/1978153606462550220

Qwen3-VL is very good for JSON structured output and is insanely fast 💨 Thanks @Alibaba_Qwen team! https://x.com/andrejusb/status/1978076341158244835
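Even with models that are strong at structured output, client code benefits from defensive parsing, since responses sometimes wrap the JSON in prose or markdown fences. A small helper (my own sketch, not tied to any Qwen tooling) that pulls the first JSON object out of a response:

```python
import json

def extract_json(text: str):
    """Return the first parseable JSON object found in `text`.

    Uses naive brace matching (ignores braces inside strings), which
    is good enough for typical model output. Raises ValueError if no
    object parses.
    """
    start = text.find("{")
    while start != -1:
        depth = 0
        for i in range(start, len(text)):
            if text[i] == "{":
                depth += 1
            elif text[i] == "}":
                depth -= 1
                if depth == 0:
                    try:
                        return json.loads(text[start:i + 1])
                    except json.JSONDecodeError:
                        break  # not valid JSON; try the next candidate
        start = text.find("{", start + 1)
    raise ValueError("no JSON object found in model output")
```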

The new dense Qwen3-VL models from @Alibaba_Qwen have day-0 support on MLX-VLM! Get started today: > pip install -U mlx-vlm Model weights: https://x.com/Prince_Canuma/status/1978164715848134699
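Day-0 support means generation is one command away once the package is installed; a sketch assuming a quantized community conversion exists on the Hub (the exact repo name below is illustrative):

```shell
pip install -U mlx-vlm

# Describe a local image with a (hypothetical) 4-bit Qwen3-VL conversion
python -m mlx_vlm.generate \
  --model mlx-community/Qwen3-VL-4B-Instruct-4bit \
  --prompt "Describe this image." \
  --image photo.jpg \
  --max-tokens 256
```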

Kaggle link: https://x.com/Alibaba_Qwen/status/1978290751436943415

We’re open-sourcing several core components from the Qwen3Guard Technical Report, now available for research and community use: 🔹 Qwen3-4B-SafeRL, a safety-aligned model fine-tuned via reinforcement learning using feedback from Qwen3Guard-Gen-4B. https://x.com/Alibaba_Qwen/status/1978732145297576081

The next generation of Qwen-VL models is here! > Qwen3-VL 4B (dense, ~3GB) > Qwen3-VL 8B (dense, ~6GB) > Qwen3-VL 30B (MoE, ~18GB) These models come with comprehensive upgrades to visual perception, spatial reasoning, and image understanding. Supported with 🍎MLX on Mac. https://x.com/lmstudio/status/1978205419802616188

🚀Qwen3-VL-235B-A22B-Instruct is now #1 on OpenRouter for image processing — 48% market share! 🎉Huge thanks to our amazing community. https://x.com/Alibaba_Qwen/status/1977566109198151692

These are the Qwen3-VL models I have been most looking forward to. In 4B and 8B sizes, I expect these will work well even on just CPUs. https://x.com/simonw/status/1978151711987372227

Qwen3-VL has already become one of the most popular multimodal models supported by vLLM. Try it out! https://x.com/rogerw0108/status/1978158856611024913

Excited to announce the launch of Qwen3-VL-Flash on Alibaba Cloud Model Studio! 🚀 A powerful new vision-language model that combines reasoning and non-reasoning modes, outperforming the open-source Qwen3-VL-30B-A3B and Qwen2.5-72B with faster responses and stronger capabilities. https://x.com/Alibaba_Qwen/status/1978841775411503304

Day 3 on nanochat and it’s getting integrated! – We now have weights for the small and large models, with 20 and 32 layers; Karpathy shared his own checkpoints on the Hub. – I’ve got demos for both weights on the Hub as Spaces. – We have a PR on transformers to integrate NanoChat. https://x.com/ben_burtenshaw/status/1978832914952401081

Discover more from Ethan B. Holland
