Image created with Ideogram 3.0. Image prompt: Lower-East-Side street-corner photograph reminiscent of a late-80s album cover: weathered red-brick tenement with exterior fire-escapes, canvas awning shading racks of vintage clothes; above the awning, a hand-painted board reads ‘Qwen SPORTSWEAR’; a hanging blade sign in cursive script reads ‘Qwen Boutique’; an origami crane named ‘Qwen’ hangs from a fire-escape ladder; warm golden-hour light, subtle 35mm film grain, muted yet punchy color palette, gritty NYC vibe.

Alibaba’s Qwen team made Deep Research for Qwen Chat available to all users. It’s much like ChatGPT’s Deep Research, letting users prepare detailed reports on different subjects in a matter of minutes. https://x.com/adcock_brett/status/1924133804630753660

🚀 Qwen Web Dev just got even better! ✨ One prompt. One website. One click to deploy. 💡 Let your creativity shine — and share it with the world. 🔥 What will you build today? https://x.com/Alibaba_Qwen/status/1924299942614688111

Together AI and Agentica launched DeepCoder-14B-Preview, a code generation model that competes with top reasoning models like OpenAI’s o1 and DeepSeek-R1, but at a fraction of the size. Built on a 14-billion-parameter Qwen model, DeepCoder uses a highly optimized reinforcement … https://x.com/DeepLearningAI/status/1924570759793369303

Do LLMs Really Understand Cell Biology? An interesting paper evaluating LLMs’ potential to understand cell biology. Finding: specialist models don’t work so well, while generalist models such as Qwen and DeepSeek exhibit preliminary understanding capabilities within … https://x.com/omarsar0/status/1922662317986099522

Multimodal model support is here in 0.7! Ollama now supports multimodal models via its new engine. Cool vision models to try👇 – Llama 4 Scout & Maverick – Gemma 3 – Qwen 2.5 VL – Mistral Small 3.1 and more 😍 Blog post 🧵👇 https://x.com/ollama/status/1923139667563528347

Qwen introduces: Parallel Scaling Law for Language Models. “We introduce the third and more inference-efficient scaling paradigm: increasing the model’s parallel computation during both training and inference time.” “We draw inspiration from classifier-free guidance (CFG).” “In … https://x.com/iScienceLuvr/status/1923262107845525660
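The idea the tweet describes can be sketched roughly: apply P different learned transforms to the same input, run P forward passes in parallel, and aggregate the outputs with learned weights. Below is a minimal NumPy sketch of that shape only; the specific transforms (random shifts) and the softmax aggregation here are illustrative assumptions, not the paper’s actual method, and `model` is a stand-in for a frozen LLM forward pass.

```python
import numpy as np

rng = np.random.default_rng(0)

def model(x):
    # Stand-in for a frozen model forward pass: a fixed projection + nonlinearity.
    W = np.full((4, 4), 0.5)
    return np.tanh(x @ W)

P, d = 4, 4                # P parallel streams, hidden size d
x = rng.normal(size=(1, d))

# P "learned" input transforms (here: small random shifts, an assumption).
shifts = rng.normal(scale=0.1, size=(P, d))

# Run the model P times on transformed copies of the same input.
outputs = np.stack([model(x + s) for s in shifts])   # shape (P, 1, d)

# Aggregate with "learned" weights (here: softmax over random logits).
logits = rng.normal(size=P)
w = np.exp(logits) / np.exp(logits).sum()            # weights sum to 1
y = (w[:, None, None] * outputs).sum(axis=0)         # shape (1, d)

print(y.shape)  # (1, 4)
```

The appeal is that the P streams can run concurrently, so latency grows far more slowly than it would by making the model P times deeper or wider.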

Qwen3 is abliterated! ✂️✂️✂️ What started as a weekend hack turned into three, but I’m happy with the result. Qwen3 was challenging with much stronger alignment and a new thinking mode that interfered with refusals. Here’s what I did to abliterate it https://x.com/maximelabonne/status/1924412611430404492

Lumina-Next on Qwen base, from Salesforce. Slightly surpasses Janus-Pro. I hope we start seeing actually multimodally pretrained unified models soon. https://x.com/teortaxesTex/status/1922961229233946869

You can now run Qwen3-32B on @HuggingFace with Cerebras Inference — and it’s ⚡️! Typing the question took longer than getting the answer 😅 https://x.com/fdaudens/status/1923107187284394368

Qwen3 Technical Report Author’s Explanation: https://x.com/TheAITimeline/status/1924232110383960163

It was the week of video generation at @huggingface, on top of many new LLMs, VLMs and more! Let’s have a wrap 🌯 LLMs 💬 > Alibaba Qwen released WorldPM-72B, new World Preference Model trained with 15M preference samples (OS) > II-Medical-8B, new LLM for medical reasoning that … https://x.com/mervenoyann/status/1924430139242283172
