Image created with OpenAI GPT-Image-1. Image prompt: Cheesy late-night infomercial freeze-frame, oversaturated NTSC palette, VHS static—phone-bank set with confetti cannons featuring giant coupon “ALIBABA SUPER-SAVER AI™” flashing in kanji and English; gold rim-light, CRT bloom, high-resolution

#CVPR2025 Picks #3: Alibaba just released VideoRefer-VideoLLaMA3 (2B & 7B video LLMs under an Apache 2.0 license!). These models can understand videos, segment objects, and answer questions about them throughout the video, all at the same time 🤯 See it in action ⤵️ https://x.com/mervenoyann/status/1935739721772081336

Controllable and Expressive One-Shot Video Head Swapping https://humanaigc.github.io/SwapAnyHead/

Running mlx-lm locally with MCP using @huggingface’s tiny-agents is actually pretty easy and works quite well. A demo running Qwen3 4B with an MCP client for the local file system: https://x.com/awnihannun/status/1931755333011349831
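For reference, tiny-agents (shipped with huggingface_hub) is driven by a small agent config file. A minimal sketch of what such a config might look like for a local OpenAI-compatible mlx-lm server plus a filesystem MCP server; the field names, endpoint URL, and directory path here are assumptions, so check the tiny-agents documentation for the exact schema:

```json
{
  "model": "mlx-community/Qwen3-4B-4bit",
  "endpointUrl": "http://localhost:8080/v1",
  "servers": [
    {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/workdir"]
    }
  ]
}
```

The rough flow in the demo is: start a local server with mlx-lm (e.g. via its `mlx_lm.server` entry point, assuming default port 8080), then point `tiny-agents run` at a config like the one above.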

RT @menloresearch: Meet Jan-nano, a 4B model that outscores DeepSeek-V3-671B using MCP. It's built on Qwen3-4B with DAPO fine-tuning, it h… https://x.com/mervenoyann/status/1934909412117741601

Stop using VLMs blindly ✋🏻 Compare different VLM outputs on a huge variety of inputs (from reasoning to OCR!) 🔥
> has support for multiple VLMs: Gemma 3, Qwen2.5-VL, Llama 4
> recommend us new models or inputs, we'll add 🫡
https://x.com/mervenoyann/status/1935708014645784713

🚀 Excited to launch Qwen3 models in MLX format today! Now available in four quantization levels: 4-bit, 6-bit, 8-bit, and BF16, optimized for the MLX framework. 👉 Try it now on Hugging Face: https://x.com/Alibaba_Qwen/status/1934517774635991412
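To put those quantization levels in context, here is a quick back-of-the-envelope estimate of weight memory for a ~4B-parameter model. This is illustrative arithmetic only; real MLX checkpoints also store quantization scales, embeddings, and metadata, so actual file sizes differ somewhat:

```python
def approx_weight_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough weight-only memory footprint in gigabytes (decimal GB)."""
    return n_params * bits_per_weight / 8 / 1e9

# A ~4B-parameter model, roughly the size of Qwen3-4B.
n_params = 4e9
for label, bits in [("4-bit", 4), ("6-bit", 6), ("8-bit", 8), ("BF16", 16)]:
    print(f"{label}: ~{approx_weight_gb(n_params, bits):.1f} GB")
# 4-bit: ~2.0 GB, 6-bit: ~3.0 GB, 8-bit: ~4.0 GB, BF16: ~8.0 GB
```

The takeaway: dropping from BF16 to 4-bit cuts raw weight memory by roughly 4x, which is what makes these models practical to run on a laptop.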

Upgraded from Llama 3 to Qwen3 as my go-to model for research experiments, so I implemented Qwen3 from scratch: https://x.com/rasbt/status/1936041873099063333
