Image created with OpenAI GPT-Image-1. Image prompt: mid‑1990s web‑browser screenshot, CRT glow, 256‑color dithering — Visitor guest‑book entries rendered in Times New Roman — Chinese dragon scroll banner titled “Qwen Model” — crisp pixel edges, screen‑door scan‑lines, phosphor glow

Huawei’s AI lab denies that one of its Pangu models copied Alibaba’s Qwen | Reuters https://www.reuters.com/business/media-telecom/huaweis-ai-lab-denies-that-one-its-pangu-models-copied-alibabas-qwen-2025-07-07/

ByteDance released Tar 1.5B and 7B: image-text-in, image-text-out models 👏 They use an image tokenizer unified with text, and de-tokenize with either of two models (an LLM or a diffusion decoder). The model itself is a full LLM (Qwen2); the tokenizer converts images into discrete tokens 🤯 https://x.com/mervenoyann/status/1942539723089621055
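The unified-tokenizer idea above can be sketched in a few lines. This is a toy illustration under stated assumptions, not ByteDance's actual code: all class names, vocabulary sizes, and the quantizer are invented stand-ins. The point is that images are mapped into the same discrete id space as text, so a single LLM can consume and emit both modalities, and any image ids it emits are later handed to a separate de-tokenizer.

```python
# Hypothetical sketch of a unified image-text token pipeline.
# Names and sizes are invented; the real Tar tokenizer is learned.
from dataclasses import dataclass

TEXT_VOCAB = 32_000      # assumed text vocabulary size
IMAGE_CODEBOOK = 8_192   # assumed visual codebook size

@dataclass
class UnifiedTokenizer:
    """Maps text and images into one shared token-id space."""

    def encode_text(self, text: str) -> list[int]:
        # Toy stand-in: byte-level ids within the text range.
        return [b % TEXT_VOCAB for b in text.encode()]

    def encode_image(self, pixels: list[float]) -> list[int]:
        # Toy stand-in for a learned visual quantizer: bucket each
        # value into the image codebook, offset past the text vocab.
        return [TEXT_VOCAB + int(abs(p) * 100) % IMAGE_CODEBOOK
                for p in pixels]

def is_image_token(tok: int) -> bool:
    return tok >= TEXT_VOCAB

tok = UnifiedTokenizer()
seq = tok.encode_text("a red bird") + tok.encode_image([0.1, 0.52, 0.9])
# An LLM trained on this shared space just predicts the next id;
# image ids in its output are passed to a de-tokenizer
# (an autoregressive or diffusion decoder, per the blurb above).
image_ids = [t for t in seq if is_image_token(t)]
print(len(seq), len(image_ids))
```

The interesting design choice the blurb highlights is that the backbone stays a plain text LLM; the multimodality lives entirely in the tokenizer and de-tokenizer, not the transformer itself.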

Skywork-R1V3: a multimodal reasoning model. Reportedly SOTA among open-source models, up there with frontier models on STEM vision/reasoning evals, while staying strong on text. Uses mixed preference optimization (PPO & GRPO++). Derived from Qwen2.5 through many steps. Great paper. https://x.com/teortaxesTex/status/1942641002902090171

Discover more from Ethan B. Holland