“The new DeepSeek V3 0324 in 4-bit runs at > 20 toks/sec on a 512GB M3 Ultra with mlx-lm!” https://x.com/awnihannun/status/1904177084609827054
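A quick sanity check on why a 4-bit quant fits on a 512GB machine: taking the model at roughly 685B total parameters (my assumption, from the Hugging Face model card), the weights alone land well under 512GB at 4 bits per weight:

```python
def quantized_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GB (decimal), ignoring KV cache,
    activations, and quantization overhead."""
    return n_params * bits_per_weight / 8 / 1e9

params = 685e9  # assumption: ~685B total parameters
print(quantized_size_gb(params, 8))  # original fp8 weights: ~685 GB, too big
print(quantized_size_gb(params, 4))  # 4-bit quant: ~342 GB, fits with headroom
```

The headroom left over (roughly 170GB here) is what the KV cache and runtime overhead live in.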

AK on X: “DeepSeek just quietly dropped DeepSeek-V3-0324 on Huggingface https://t.co/6YFtRE5u3C” / X
https://x.com/_akhaliq/status/1904154585242935516

DeepSeek narrows China-US AI gap to three months, 01.AI founder Lee Kai-fu says | Reuters https://www.reuters.com/technology/artificial-intelligence/deepseek-narrows-china-us-ai-gap-three-months-01ai-founder-lee-kai-fu-says-2025-03-25/

“Introducing Together Chat! Use DeepSeek R1 (hosted in North America) & other top open source models to do web search, coding, image generation, & image analysis. Available today for free!” https://x.com/togethercompute/status/1904204860217500123

“DeepSeek V3-0324 is now available on Hugging Face through @SambaNovaAI 250+ t/s — fastest in the world Smashes benchmarks like MMLU-Pro (81.2) & AIME (59.4), outperforming Gemini 2.0 Pro & Claude 3.7 Sonnet” https://x.com/_akhaliq/status/1905350698797334860

“Very detailed qualitative evaluation of 0324. > it surpasses DeepSeek-R1! It even surpasses Claude-3.7! … It ranks third in the KCORES large model arena with a score of 328.3, second only to claude-3.7-sonnet-thinking and claude-3.5 That’s what you get by scaling post-training” https://x.com/teortaxesTex/status/1904292164672115077

“The wild whale DeepSeek 🐳 just dropped a new model, MIT license. We at @hyperbolic_labs now serve DeepSeek-V3-0324, the first inference provider serving this model on @huggingface. My vibe check: It definitely has some <think> model smell. Watch it ace “how many r’s in …” https://x.com/Yuchenj_UW/status/1904223627509465116

“Comparing DeepSeek V3-0324 APIs: We are now tracking 10 APIs for DeepSeek’s new model, including DeepSeek’s first-party API and offerings from Fireworks, DeepInfra, Hyperbolic, Nebius, CentML, …” https://x.com/ArtificialAnlys/status/1905279539250676065

“You can already run the latest open source DeepSeek V3-0324 via MLX 🔥” https://x.com/reach_vb/status/1904204090868900140

Teortaxes▶️ (DeepSeek Twitter 🐋 diehard fan, 2023 – ∞) (@teortaxesTex) / X https://x.com/teortaxesTex/

“For your single-node inference pleasure – Cognitive Computations presents AWQ quants of DeepSeek-V3-0324. Props to @casper_hansen_ and v2ray for their assistance!” https://x.com/cognitivecompai/status/1904653165519085775

“A new base open-source model from @deepseek_ai is out and you can already test it on the hub thanks to our amazing inference partners @FireworksAI_HQ & @hyperbolic_labs!” https://x.com/ClementDelangue/status/1904237660237115542

“Deepseek API change log updated for 0324, h/t @OedoSoldier
MMLU-Pro: 75.9 → 81.2 (+5.3)
GPQA: 59.1 → 68.4 (+9.3)
AIME: 39.6 → 59.4 (+19.8)
LiveCodeBench: 39.2 → 49.2 (+10.0)
and that’s not mentioning a plethora of QoL improvements. These are some phenomenal scores.” https://x.com/teortaxesTex/status/1904364173426893235
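The jumps quoted in the changelog are straightforward new-minus-old differences of the API-reported scores (e.g. AIME: 59.4 − 39.6 = +19.8); a quick check:

```python
# (old, new) scores as quoted in the changelog tweet above.
scores = {
    "MMLU-Pro":      (75.9, 81.2),
    "GPQA":          (59.1, 68.4),
    "AIME":          (39.6, 59.4),
    "LiveCodeBench": (39.2, 49.2),
}
deltas = {name: round(new - old, 1) for name, (old, new) in scores.items()}
# deltas matches the (+x.x) figures in the tweet.
```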

“DeepSeek takes the lead: DeepSeek V3-0324 is now the highest scoring non-reasoning model This is the first time an open weights model is the leading non-reasoning model, a milestone for open source. DeepSeek V3-0324 has jumped forward 7 points in Artificial Analysis” https://x.com/ArtificialAnlys/status/1904467255083348244

“DeepSeek’s new model is now available directly on Hugging Face through @hyperbolic_labs. Hyperbolic now serves DeepSeek-V3-0324, the first inference provider serving this model on Hugging Face.” https://x.com/_akhaliq/status/1904231386430799938

deepseek-ai/DeepSeek-V3-0324 · Hugging Face https://huggingface.co/deepseek-ai/DeepSeek-V3-0324

“Compared to leading reasoning models, including DeepSeek’s own R1, DeepSeek V3-0324 remains behind – but for many uses, the increased latency associated with letting reasoning models ‘think’ before answering makes them unusable.” https://x.com/ArtificialAnlys/status/1904467262364692970

deepseek-ai/DeepSeek-V3-0324 at main https://huggingface.co/deepseek-ai/DeepSeek-V3-0324/tree/main

“2.7bit dynamic quants for DeepSeek V3 are here!
1. Use temperature 0.0-0.3
2. Use min_p=0.01
3. Non-dynamic quants seem to always create “seizured” typed results – see docs for more details
4. 1.78bit (150GB) still uploading!
5. Flappy Bird + Heptagon works!
More details: 1.” https://x.com/danielhanchen/status/1904707162074669072
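For context on the min_p=0.01 recommendation: min_p sampling keeps only tokens whose probability is at least min_p times that of the single most likely token, which prunes the noisy tail that low-bit quants tend to amplify. A minimal sketch in plain Python (function name and example values are mine, not Unsloth's):

```python
def min_p_filter(probs, min_p=0.01):
    """Zero out tokens with probability below min_p * max(probs), then
    renormalize. With min_p=0.01, a token survives only if it is at least
    1% as likely as the most likely token."""
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]

probs = [0.60, 0.30, 0.09, 0.009, 0.001]
filtered = min_p_filter(probs)  # threshold 0.01 * 0.60 = 0.006 drops only the last token
```

Unlike a fixed top_p cutoff, the threshold scales with the model's confidence: a peaky distribution prunes aggressively, a flat one keeps more candidates.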

DeepSeek-V3-0324 Release | DeepSeek API Docs https://api-docs.deepseek.com/news/news250325
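The release notes linked above describe an OpenAI-compatible chat endpoint. A hedged sketch of the request body — the endpoint URL and the `deepseek-chat` model id (which the docs point at the latest V3 weights) are my reading of the docs, not verified here:

```python
import json

# Assumption: OpenAI-compatible chat completions API, with "deepseek-chat"
# serving the latest V3 checkpoint (V3-0324 as of this post).
URL = "https://api.deepseek.com/chat/completions"

payload = {
    "model": "deepseek-chat",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "How many r's are in 'strawberry'?"},
    ],
    "temperature": 0.3,
    "stream": False,
}
body = json.dumps(payload)
# POST `body` to URL with an "Authorization: Bearer <api key>" header.
```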

“A wild Deepseek has appeared” https://x.com/Teknium1/status/1904147049219494148

“The «commoditize your complement» theory of DeepSeek makes no economic sense. China reaps no strategic advantage from this – only goodwill and acceleration of domestic research (see all PRC labs adopting DSMoE now). Enjoy it while it lasts. While Xi isn’t senile and Leo-pilled.” https://x.com/teortaxesTex/status/1904008640542937273

“the dialectics of decentralization:
all of the world’s data for one model
one model available on every computer
DeepSeek will build AGI and we must help them.” https://x.com/teortaxesTex/status/1904851047270559935

Testing DeepSeek R1 locally for RAG with Ollama and Kibana – Elasticsearch Labs https://www.elastic.co/search-labs/blog/deepseek-rag-ollama-playground

Discover more from Ethan B. Holland
