Image created with gemini-3.1-flash-image-preview and claude-sonnet-4-5. Image prompt: Animation cel style: blue-skinned genie emerging from ornate golden lamp, gesturing toward majestic red and gold Chinese dragon coiled beside him, magical cyan wisps flowing between them, red paper lanterns and stylized clouds in background, Disney-quality hand-drawn aesthetic with clean lines and cinematic lighting, warm jewel tones, open composition with space for title text across top third.

🚀 Introducing Qwen3-Max-Thinking, our most capable reasoning model yet. Trained with massive scale and advanced RL, it delivers strong performance across reasoning, knowledge, tool use, and agent capabilities. ✨ Key innovations: ✅ Adaptive tool-use: intelligently leverages… https://x.com/Alibaba_Qwen/status/2015805330652111144

New DeepSeek-OCR-2 model! 1. Utilizes Qwen2 500M as a vision encoder instead of ViT 300M 2. Adds a non-causal mask alongside the causal mask 3. Accuracy boost of 3.73 points, from 87.36% to 91.09% 4. Edit distance 0.100 vs 0.129 for OCR v1 And we added DS-OCR-2 fine-tuning support in Unsloth! https://x.com/danielhanchen/status/2016043326760485313

Qwen blog: Qwen3-Max-Thinking https://qwen.ai/blog?id=qwen3-max-thinking

Qwen3-Max-Thinking debuts with a focus on hard math and code https://www.testingcatalog.com/qwen3-max-thinking-debuts-with-focus-on-hard-math-code/

Qwen3-ForcedAligner-0.6B https://x.com/Alibaba_Qwen/status/2016859224077455413

📢 New Model Drop: Qwen3 Max Thinking is now live on Yupp! It’s @Alibaba_Qwen’s latest flagship reasoning model. We can’t wait to see what you learn, build and imagine – and how the model fares on our user-preference leaderboards. https://x.com/yupp_ai/status/2015812409823522952

🎉 Congrats @Alibaba_Qwen on the Qwen3-ASR release — vLLM has day-0 support. 52 languages, 2000x throughput on the 0.6B model, singing voice recognition, and SOTA accuracy on the 1.7B. Serve it now in vLLM! 🚀 Learn more: https://x.com/vllm_project/status/2016865238323515412

Qwen3-ASR and Qwen3-ForcedAligner are now open source — production-ready speech models designed for messy, real-world audio, with competitive performance and strong robustness. ● 52 languages & dialects with auto language ID (30 languages + 22 dialects/accents) ● Robust in… https://x.com/Alibaba_Qwen/status/2016858705917075645

Qwen3-ASR is out🚀 https://t.co/pVnuuNPMEL ✨ 0.6B & 1.7B – Apache 2.0 ✨ 30 languages + 22 Chinese dialects, plus English accents across regions ✨ Single model for language ID + ASR (no extra pipeline stitching) ✨ Qwen3-ForcedAligner-0.6B, a strong forced aligner https://x.com/AdinaYakup/status/2016865634559152162

Qwen3-ASR is the first open-source LLM-based ASR in the industry with native streaming support. Demo: https://t.co/y2X1slCMcs vLLM Example:… https://x.com/Alibaba_Qwen/status/2016900512478875991

Big thanks to vLLM for providing Day 0 support for Qwen3-ASR. https://x.com/Alibaba_Qwen/status/2016905051395260838
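The threads above claim day-0 vLLM support for Qwen3-ASR. A minimal sketch of what serving and querying it could look like, assuming vLLM exposes the model through its OpenAI-compatible `/v1/audio/transcriptions` endpoint (as it does for other ASR models) — the model id `Qwen/Qwen3-ASR-1.7B` and the endpoint path for this model are assumptions, not confirmed by the posts:

```shell
# Sketch, not an official recipe: launch an OpenAI-compatible vLLM server
# for the 1.7B ASR model (model id is an assumption).
vllm serve Qwen/Qwen3-ASR-1.7B --port 8000

# In a second terminal: submit an audio file for transcription via the
# OpenAI-compatible transcription route (assumed available for this model).
curl http://localhost:8000/v1/audio/transcriptions \
  -F model=Qwen/Qwen3-ASR-1.7B \
  -F file=@sample.wav
```

Check the vLLM example linked from the Qwen thread above for the authoritative invocation.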

What the heck: Qwen3-Max-Thinking outperforms all SOTA models (Gemini 3.0 Pro, GPT-5.2, …) on HLE with search tools and even reaches almost 60% overall. Really impressive evals! OpenAI and Anthropic need to hurry up their R&D. https://x.com/kimmonismus/status/2015820838243561742

🚨 Qwen3 Max Thinking is in the Text Arena! @Alibaba_Qwen’s Qwen3 Max Preview debuted last fall in the top 10 – so let’s see what this variant can do. Bring your toughest prompts and we’ll see how it stacks up against other frontier AI models in the most competitive arena. 💪 https://x.com/arena/status/2015803787680808996

LLaMA Factory – an open-source unified toolkit for training, fine-tuning, and deploying 100+ LLMs and multimodal models. It wraps training into a clear CLI + Web UI, supporting everything from SFT to RL, all without glue code. What it gives you: – Fine-tuning for LLaMA, Qwen,… https://x.com/TheTuringPost/status/2014827186629595429
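To make the "clear CLI, no glue code" claim concrete, here is a sketch of LLaMA Factory's config-driven workflow: a small YAML file plus one command for LoRA SFT. The config keys follow the project's published example YAMLs; the specific model, dataset name, and hyperparameter values below are illustrative assumptions, not from the post:

```shell
# Sketch: LoRA supervised fine-tuning with LLaMA Factory's CLI.
# Model/dataset choices here are placeholders for illustration.
cat > qwen_lora_sft.yaml <<'EOF'
model_name_or_path: Qwen/Qwen2.5-7B-Instruct
stage: sft                  # supervised fine-tuning
do_train: true
finetuning_type: lora
lora_target: all
dataset: alpaca_en_demo     # demo dataset shipped with the repo
template: qwen
output_dir: saves/qwen-lora-sft
per_device_train_batch_size: 1
learning_rate: 1.0e-4
num_train_epochs: 1.0
EOF

llamafactory-cli train qwen_lora_sft.yaml

# The same run can be configured from the browser instead:
# llamafactory-cli webui
```

Swapping `stage` and a few keys moves the same config between SFT, reward modeling, and RL, which is the "unified toolkit" point the post is making.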

🌟🚀 Sparse Attention Models Can Get Sparser We’ve updated The Sparse Frontier – the largest empirical analysis of training-free sparse attention to date – from the Qwen 2.5 to the Qwen 3 model family, now including Llama 3.1 and Gemma 3. Key findings: 📊 Larger sparse models outperform… https://x.com/p_nawrot/status/2017161371566178304


Discover more from Ethan B. Holland
