Image created with gemini-3.1-flash-image-preview and claude-sonnet-4-5. Image prompt: Vintage 1990s screen-printed t-shirt graphic on worn mustard-yellow cotton fabric, bold deep red ink illustration of an open passport with cartoon passport stamps showing simplified world landmarks, chunky text reading INTERNATIONAL arched across top, simple cartoon airplane dotted trail, retro novelty beach shop style with slightly imperfect printed texture and fabric aging.
How AI Is Turbocharging the War in Iran – WSJ https://www.wsj.com/tech/ai/how-ai-is-turbocharging-the-war-in-iran-aca59002
The biggest barrier for AI applications in Africa isn’t model complexity — it’s the scarcity of data for the 2,000+ languages spoken there. We just released WAXAL. This open-access dataset delivers 2,400+ hours of high-quality speech data for 27 Sub-Saharan African languages.
https://x.com/GoogleResearch/status/2032482132619387348
Europe doesn’t need another AI strategy paper… It needs 10 teams with €125M and zero excuses. I was in Munich last week for the first Next Frontier AI event by SPRIND – Bundesagentur für Sprunginnovationen. The energy was real. Not panel-talk energy. Builder energy! …
https://x.com/IlirAliu_/status/2031296862267875669
I tried many AI models with OpenClaw and found Kimi AI to be the most token-efficient, good at coding, and the easiest to set up.
https://x.com/cz_binance/status/2031313379235606989
Learn how to run Qwen3.5 locally using Claude Code. Our guide shows you how to run Qwen3.5 on your server for local agentic coding. We then build a Qwen 3.5 agent that autonomously fine-tunes models using Unsloth. Works on 24GB RAM or less. Guide: https://x.com/UnslothAI/status/2031008078850924840
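For reference, the core of an Unsloth setup like the one the guide builds on is only a few lines. A minimal sketch, assuming a 4-bit Qwen3.5 checkpoint; the model id below is a placeholder, not a confirmed repo name:

```python
# Minimal Unsloth LoRA setup sketch. The model id is hypothetical; the linked
# guide's agentic fine-tuning flow builds on top of a loop like this.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3.5",    # placeholder id; use the guide's checkpoint
    max_seq_length=4096,
    load_in_4bit=True,               # 4-bit weights are what make <=24GB RAM feasible
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                            # LoRA rank; small adapters keep memory low
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
)
# From here, a standard TRL SFTTrainer run completes the fine-tune; the guide
# wraps this in an agent that launches such runs autonomously.
```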
DeepSeek MoE training efficiency has been completely commoditized (though it’s startling how much of that is reimplementation of DeepSeek’s own Open Source Week releases 1 year ago, updated for Blackwell). Very cool
https://x.com/teortaxesTex/status/2031263831595335702
🧵 1/4 Still waiting for DeepSeek-V4? We (@Zai_org) made DSA 1.8× faster with minimal code change — and it’s ready to deliver real inference gains on GLM-5. IndexCache removes 50% of indexer computations in DeepSeek Sparse Attention with virtually zero quality loss. On GLM-5 …
https://x.com/realYushiBai/status/2032299919999189107
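The thread doesn’t spell out the mechanism, but the general idea of an index cache is easy to illustrate: during decoding, the indexer features of already-seen tokens never change, so they can be stored rather than recomputed. A toy sketch of that idea (not Zai’s IndexCache, and not DSA’s real indexer):

```python
import torch

class ToyIndexerCache:
    """Toy sketch: cache per-token indexer key features across decode steps.

    In DeepSeek Sparse Attention, a lightweight indexer scores past tokens so
    the main attention only attends to the top-k. Past tokens' indexer features
    are fixed during decoding, so caching them avoids recomputation. This is an
    illustration of the concept only, not the actual IndexCache code.
    """
    def __init__(self):
        self.keys = None  # cached indexer keys, shape (seq_len, dim)

    def step(self, new_key: torch.Tensor, query: torch.Tensor, top_k: int) -> torch.Tensor:
        # Only the newest token's indexer key is computed and appended;
        # everything already in the cache is reused as-is.
        self.keys = new_key if self.keys is None else torch.cat([self.keys, new_key], dim=0)
        scores = self.keys @ query                 # relevance of each past token
        k = min(top_k, self.keys.shape[0])
        return scores.topk(k).indices              # token indices for sparse attention
```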
Europe is still far behind the US in AI. But this week in Munich something interesting started… 200+ researchers. Founders. Investors. All in one room in Munich asking the same question: ❓How does Europe build its own frontier AI labs? Wednesday was the kickoff of the Next
https://x.com/IlirAliu_/status/2029994158463590880
rant time: the use of anonymous “sources” in English-language reporting on Chinese tech is honestly outrageous, most evidently with all the exclusive “scoops” we’ve seen in the past year claiming to know when DeepSeek’s next big model is dropping. All these reports from the most …
https://x.com/vince_chow1/status/2031002233060634953
GPT-5.4 is great at coding, knowledge work, computer use, etc, and it’s nice to see how much people are enjoying it. But it’s also my favorite model to talk to! We have missed the mark on model personality for a while, so it feels extra good to be moving in the right direction.
https://x.com/sama/status/2030319489993298349
Our next kernel competition is now open for submissions! A $1.1M cash-prize competition, sponsored by AMD, on optimizing DeepSeek-R1-0528 and GPT-OSS-120B on MI355X. Registration:
https://x.com/GPU_MODE/status/2029974019018244223
Announcing NVIDIA Nemotron 3 Super! 💚120B-12A Hybrid SSM Latent MoE, designed for Blackwell 💚36 on AAIndex v4 💚up to 2.2X faster than GPT-OSS-120B in FP4 💚Open data, open recipe, open weights Models, Tech report, etc. here: https://t.co/CAYpP1iK3i And yes, Ultra is coming!
https://x.com/ctnzr/status/2031762077325406428
Another week, another noteworthy open-weight LLM release. Nvidia’s Nemotron 3 Super 120B-A12B looks pretty good. Benchmarks are on par with Qwen3.5 122B and GPT-OSS 120B, but the throughput is great! Below is a short, visual architecture rundown.
https://x.com/rasbt/status/2032084724743553129
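The rundown is visual, but the basic shape of a hybrid stack is simple to caricature: mostly cheap recurrent/SSM sequence mixers, with full attention interleaved every few layers. The sketch below is an invented illustration of that layer pattern (a GRU stands in for the SSM mixer; Nemotron’s actual blocks, ratios, and latent-MoE routing are in NVIDIA’s tech report):

```python
import torch.nn as nn

class ToyHybridStack(nn.Module):
    """Caricature of a hybrid layout: recurrent mixers with periodic attention.

    Nothing here matches Nemotron 3's real architecture; it only shows why
    hybrids are fast, since most layers avoid quadratic attention.
    """
    def __init__(self, d_model: int = 512, n_layers: int = 8, attn_every: int = 4):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.MultiheadAttention(d_model, num_heads=8, batch_first=True)
            if (i + 1) % attn_every == 0
            else nn.GRU(d_model, d_model, batch_first=True)   # stand-in for an SSM block
            for i in range(n_layers)
        )

    def forward(self, x):
        for layer in self.layers:
            if isinstance(layer, nn.MultiheadAttention):
                x = x + layer(x, x, x, need_weights=False)[0]  # global mixing, O(n^2)
            else:
                x = x + layer(x)[0]                            # sequential mixing, O(n)
        return x
```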
We’re excited to be day-0 launch partners for NVIDIA Nemotron 3 Super! You can try it now on Baseten, or read @rapprach’s blog to learn more about the new model: https://x.com/baseten/status/2031775755253026965
We open sourced WAXAL! – Multilingual speech dataset for African languages – 17 languages for TTS – 19 languages for ASR Over 100 million speakers across 40 Sub-Saharan African countries
https://x.com/osanseviero/status/2032452729059045881
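If the release follows the usual Hugging Face pattern, loading it should look roughly like this; the repo id and config name below are guesses, so check the actual release page for the real path and schema:

```python
# Hedged sketch of streaming the WAXAL ASR data with the `datasets` library.
# "google/waxal" and the "asr" config are hypothetical placeholders.
from datasets import load_dataset

ds = load_dataset("google/waxal", "asr", split="train", streaming=True)
sample = next(iter(ds))
print(sample.keys())  # expect audio and transcript fields, per the announcement
```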
🤖 New models: Qwen3.5, COLQwen3, ColModernVBERT, Ring 2.5, Ovis 2.6, Nemotron embed/rerank VL 🎙️ ASR: FunASR, FireRedASR2, Qwen3-ASR realtime streaming 📦 PyTorch 2.10 upgrade (breaking change for env deps) 🔗 Transformers v5 compatibility Speculative decoding: Nemotron-H MTP, …
https://x.com/vllm_project/status/2030178782259171382
🚀 Three attention paradigms are emerging in modern LLMs: Hybrid (Linear + Full), GQA, and DSA. Two recent models illustrate these design choices well: Qwen3.5 and MiniMax M2.5. Here’s a quick breakdown of their architectures from Zhihu contributor kaiyuan👇 🧠 Qwen3.5 — Hybrid …
https://x.com/ZhihuFrontier/status/2031686944040915152
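Of the three paradigms, GQA is the easiest to show in a few lines: many query heads share a smaller set of KV heads, shrinking the KV cache. A minimal single-sequence sketch (causal masking omitted; shapes are illustrative, not either model’s API):

```python
import torch
import torch.nn.functional as F

def grouped_query_attention(q, k, v, n_kv_heads: int):
    """Minimal GQA sketch: n_heads query heads share n_kv_heads KV heads.

    q: (n_heads, seq, head_dim); k, v: (n_kv_heads, seq, head_dim).
    Hybrid (linear + full) and DSA change *which* tokens are attended;
    GQA only changes how K/V are shared across heads.
    """
    group = q.shape[0] // n_kv_heads
    # Each KV head is reused by `group` consecutive query heads, so the KV
    # cache is n_kv_heads/n_heads the size of standard multi-head attention.
    k = k.repeat_interleave(group, dim=0)
    v = v.repeat_interleave(group, dim=0)
    scores = q @ k.transpose(-1, -2) / q.shape[-1] ** 0.5
    return F.softmax(scores, dim=-1) @ v
```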
🚀 vLLM v0.17.0 is here! 699 commits from 272 contributors (48 new!) This is a big one. Highlights: ⚡ FlashAttention 4 integration 🧠 Qwen3.5 model family with GDN (Gated Delta Networks) 🏗️ Model Runner V2 maturation: Pipeline Parallel, Decode Context Parallel, Eagle3 + CUDA …
https://x.com/vllm_project/status/2030178775212671148
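For anyone who wants to kick the tires, the offline-inference entry point is unchanged. A minimal sketch; the model id is a placeholder for one of the newly supported Qwen3.5 checkpoints, not a confirmed repo name:

```python
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen3.5")  # hypothetical id; any vLLM-supported model works
params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Explain gated delta networks in one paragraph."], params)
print(outputs[0].outputs[0].text)
```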
RWKV-7 G1e is here (13B/7B/3B/1B). Although Qwen 3.5 is strong, we are improving every month too 🙂 G1f in April. (G1d models all released too).
https://x.com/BlinkDL_AI/status/2031226189654966418