Ethan B. Holland

Over 56,100 manually organized AI links and counting

International: AI News Week Ending 03/28/2025

March 28, 2025

“The new Deep Seek V3 0324 in 4-bit runs at > 20 toks/sec on a 512GB M3 Ultra with mlx-lm! https://x.com/awnihannun/status/1904177084609827054

AK on X: “DeepSeek just quietly dropped DeepSeek-V3-0324 on Huggingface https://t.co/6YFtRE5u3C” https://x.com/_akhaliq/status/1904154585242935516

Chinese Startup Behind Manus AI Agent Seeks $500 Million Valuation — The Information https://www.theinformation.com/articles/chinese-startup-behind-manus-ai-agent-seeks-500-million-valuation

Qwen2.5 Omni: See, Hear, Talk, Write, Do It All! | Qwen https://qwenlm.github.io/blog/qwen2.5-omni/

China, US need to cooperate in AI, says US-China relations committee head | Reuters https://www.reuters.com/technology/artificial-intelligence/china-us-need-have-cooperation-ai-says-us-china-relations-committee-head-2025-03-24/

DeepSeek narrows China-US AI gap to three months, 01.AI founder Lee Kai-fu says | Reuters https://www.reuters.com/technology/artificial-intelligence/deepseek-narrows-china-us-ai-gap-three-months-01ai-founder-lee-kai-fu-says-2025-03-25/

Rebuilding TikTok in America https://www.perplexity.ai/hub/blog/rebuilding-tiktok-in-america

Alibaba-affiliate Ant uses Chinese, U.S. chips to cut AI costs https://www.cnbc.com/2025/03/24/alibaba-affiliate-ant-uses-china-us-chips-to-cut-ai-costs.html

U.S. blacklists over 50 Chinese companies in bid to curb Beijing’s AI, chip capabilities https://www.cnbc.com/2025/03/26/us-blacklists-50-chinese-companies-in-bid-to-curb-beijings-ai-chip-capabilities.html

“Alibaba just released Qwen2.5-VL-32B-Instruct on Hugging Face further optimize this VLM with reinforcement learning and have found significant improvements in human preference and also mathematical reasoning https://x.com/_akhaliq/status/1904242971043607002

“Qwen3 coming soon 👀 https://x.com/fdaudens/status/1903482331312103653

“72B too big for VLM? 7B not strong enough! Then you should use our 32B model, Qwen2.5-VL-32B-Instruct! Blog: https://x.com/Alibaba_Qwen/status/1904227859616641534

Sakana AI on X: “Sakana AI super-powers AI reasoning using Japan’s own Sudoku Puzzles! Read more here → https://t.co/Sxqnpi0TuV At @NVIDIAGTC, Llion Jones @YesThisIsLion announced the release of our new reasoning benchmark based on the modern variant Sudoku to challenge the AI community. We https://t.co/8LnE9EpjYg” https://x.com/SakanaAILabs/status/1902913196358611278

Reasoning Efficiency Redefined! Meet Tencent’s ‘Hunyuan-T1’—The First Mamba-Powered Ultra-Large Model
https://llm.hunyuan.tencent.com/#/blog/hy-t1?lang=en

“✨ Excited to share QVQ-Max, our visual reasoning model that’s still evolving We’ve been experimenting with this approach for a while – try it out on Qwen Chat! (https://t.co/FmQ0B9tiE7) 🚀 Just upload any image or video, ask away, and hit the “Thinking” button to see how it https://x.com/Alibaba_Qwen/status/1905342260100956210

“Introducing Together Chat! Use DeepSeek R1 (hosted in North America) & other top open source models to do web search, coding, image generation, & image analysis. Available today for free! https://x.com/togethercompute/status/1904204860217500123

“DeepSeek V3-0324 is now available on Hugging Face through @SambaNovaAI 250+ t/s — fastest in the world Smashes benchmarks like MMLU-Pro (81.2) & AIME (59.4), outperforming Gemini 2.0 Pro & Claude 3.7 Sonnet https://x.com/_akhaliq/status/1905350698797334860

“Very detailed qualitative evaluation of 0324. > it surpasses DeepSeek-R1! It even surpasses Claude-3.7! … It ranks third in the KCORES large model arena with a score of 328.3, second only to claude-3.7-sonnet-thinking and claude-3.5 That’s what you get by scaling post-training https://x.com/teortaxesTex/status/1904292164672115077

“The wild whale DeepSeek 🐳 just dropped a new model, MIT license. We at @hyperbolic_labs now serve DeepSeek-V3-0324, the first inference provider serving this model on @huggingface. My vibe check: It definitely has some <think> model smell. Watch it ace “how many r’s in https://x.com/Yuchenj_UW/status/1904223627509465116

Qwen2.5-VL-32B: Smarter and Lighter | Qwen https://qwenlm.github.io/blog/qwen2.5-vl-32b/

QVQ-Max: Think with Evidence | Qwen https://qwenlm.github.io/blog/qvq-max-preview/

“Comparing DeepSeek V3-0324 APIs: We are now tracking 10 APIs for DeepSeek’s new model, including DeepSeek’s first-party API and offerings from Fireworks, DeepInfra, Hyperbolic, Nebius, CentML, https://x.com/ArtificialAnlys/status/1905279539250676065

“You can already run the latest open source DeepSeek V3-0324 via MLX 🔥 https://x.com/reach_vb/status/1904204090868900140

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) (@teortaxesTex) / X https://x.com/teortaxesTex/

“For your single-node inference pleasure – Cognitive Computations presents AWQ quants of DeepSeek-V3-0324. Props to @casper_hansen_ and v2ray for their assistance! https://x.com/cognitivecompai/status/1904653165519085775

“A new base open-source model from @deepseek_ai is out and you can already test it on the hub thanks to our amazing inference partners @FireworksAI_HQ & @hyperbolic_labs! https://x.com/ClementDelangue/status/1904237660237115542

Qwen/Qwen2.5-Omni-7B · Hugging Face https://huggingface.co/Qwen/Qwen2.5-Omni-7B

“Deepseek API change log updated for 0324, h/t @OedoSoldier MMLU-Pro: 75.9 → 81.2 (+5.3) GPQA: 59.1 → 68.4 (+9.3) AIME: 39.6 → 59.4 (+19.8) LiveCodeBench: 39.2 → 49.2 (+10.0) and that’s not mentioning a plethora of QoL improvements. These are some phenomenal scores. https://x.com/teortaxesTex/status/1904364173426893235

“DeepSeek takes the lead: DeepSeek V3-0324 is now the highest scoring non-reasoning model This is the first time an open weights model is the leading non-reasoning model, a milestone for open source. DeepSeek V3-0324 has jumped forward 7 points in Artificial Analysis https://x.com/ArtificialAnlys/status/1904467255083348244

“This is cool. Qwen is the solid leader on open source multimodality. https://x.com/teortaxesTex/status/1904950082480279943

“DeepSeek new model is now available directly on Hugging Face through @hyperbolic_labs hyperbolic labs now serves DeepSeek-V3-0324, the first inference provider serving this model on hugging face https://x.com/_akhaliq/status/1904231386430799938

deepseek-ai/DeepSeek-V3-0324 · Hugging Face https://huggingface.co/deepseek-ai/DeepSeek-V3-0324

“Compared to leading reasoning models, including DeepSeek’s own R1, DeepSeek V3-0324 remains behind – but for many uses, the increased latency associated with letting reasoning models ‘think’ before answering makes them unusable. https://x.com/ArtificialAnlys/status/1904467262364692970

deepseek-ai/DeepSeek-V3-0324 at main https://huggingface.co/deepseek-ai/DeepSeek-V3-0324/tree/main

“2.7bit dynamic quants for DeepSeek V3 are here! 1. Use temperature 0.0-0.3 2. Use min_p=0.01 3. Non dynamic quants seem to always create “seizured” typed results – see docs for more details 4. 1.78bit (150GB) still uploading! 5. Flappy Bird + Heptagon works! More details: 1. https://x.com/danielhanchen/status/1904707162074669072
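
The sampling settings recommended in the post above (temperature 0.0-0.3, min_p=0.01) can be illustrated with a minimal sketch of min-p sampling. This is not the code used by any particular inference runtime; the function names `min_p_filter` and `sample_token` are hypothetical, and treating temperature 0 as greedy decoding is an assumption of this sketch.

```python
import math
import random


def min_p_filter(logits, temperature=0.3, min_p=0.01):
    """Return (indices, renormalized probabilities) of tokens kept by min-p."""
    # Apply temperature, then a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # min-p keeps tokens whose probability is at least min_p times the top probability.
    cutoff = min_p * max(probs)
    kept = [(i, p) for i, p in enumerate(probs) if p >= cutoff]
    norm = sum(p for _, p in kept)
    return [i for i, _ in kept], [p / norm for _, p in kept]


def sample_token(logits, temperature=0.3, min_p=0.01, rng=random):
    """Sample one token index; temperature <= 0 falls back to greedy (assumption)."""
    if temperature <= 0.0:
        return max(range(len(logits)), key=lambda i: logits[i])
    indices, weights = min_p_filter(logits, temperature, min_p)
    return rng.choices(indices, weights=weights, k=1)[0]


if __name__ == "__main__":
    toy_logits = [5.0, 4.0, -10.0]  # made-up values for illustration only
    print(min_p_filter(toy_logits))
    print(sample_token(toy_logits))
```

With the toy logits, the third token falls below the cutoff and is never sampled, which is the intended effect of a small min_p: very unlikely tokens are pruned while plausible alternatives remain available.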

DeepSeek-V3-0324 Release | DeepSeek API Docs https://api-docs.deepseek.com/news/news250325

“A wild Deepseek has appeared https://x.com/Teknium1/status/1904147049219494148

“The «commoditize your complement» theory of DeepSeek makes no economic sense. China reaps no strategic advantage from this – only goodwill and acceleration of domestic research (see all PRC labs adopting DSMoE now). Enjoy it while it lasts. While Xi isn’t senile and Leo-pilled. https://x.com/teortaxesTex/status/1904008640542937273

Leaked data exposes a Chinese AI censorship machine | TechCrunch https://techcrunch.com/2025/03/26/leaked-data-exposes-a-chinese-ai-censorship-machine/

“I beg people to start reasoning about China starting with the premise “China is a unique country and cannot be understood with rankings for normal countries”. Ideally we’d do that for every country but it’s more vital here. You need to focus on fundamentals, not muh GDP.” https://x.com/teortaxesTex/status/1904711779030008108

BMW and Alibaba Deepen Strategic Partnership in China https://www.alizila.com/bmw-and-alibaba-deepen-strategic-partnership-in-china-harnessing-qwens-ai-power-to-redefine-intelligent-in-car-experiences/

“I say that the seeming Chinese inability to match eg. ASML does not tell us anything about their inherent deficiency in creativity, ootb thinking, outlier geniuses. Our high-end tech is legitimately that hard. Westerners *too* can’t make a second ASML. https://x.com/teortaxesTex/status/1904723137553379748

EU to invest $1.4 billion in artificial intelligence, cybersecurity and digital skills | Reuters https://www.reuters.com/technology/artificial-intelligence/eu-invest-14-billion-artificial-intelligence-cybersecurity-digital-skills-2025-03-28/

“Qwen 2.5 7B Omni looks pretty rad – end to end multimodal LLM 🔥 Key features: > Novel TMRoPE (Time-aligned Multimodal RoPE) for synchronizing video & audio timestamps > Supports live interactions with low-latency streaming > Outperforms both streaming & non-streaming speech https://x.com/reach_vb/status/1904946172021936351

“Voice Chat + Video Chat! Just in Qwen Chat ( https://x.com/Alibaba_Qwen/status/1904944923159445914

“the dialectics of decentralization: all of the world’s data for one model one model available on every computer DeepSeek will build AGI and we must help them. https://x.com/teortaxesTex/status/1904851047270559935

“So proud of the team. I think this US-Japan Defense challenge is just the first step for @SakanaAILabs to help accelerate defense innovation in Japan.” https://x.com/hardmaru/status/1904320457396162563

“Sakana AI Wins Award at US-Japan Competition for Defense Innovation https://x.com/SakanaAILabs/status/1904156111621754905

“ByteDance just announced InfiniteYou available on Hugging Face Flexible Photo Recrafting While Preserving Your Identity https://x.com/_akhaliq/status/1902937194198700280

ByteDance/InfiniteYou · Hugging Face https://huggingface.co/ByteDance/InfiniteYou

OpenAI and Meta Seek AI Alliance With India’s Reliance — The Information https://www.theinformation.com/articles/openai-meta-seek-ai-alliance-indias-reliance

“OpenAI possibly cutting ChatGPT’s India price by 75%–85%. OpenAI’s top brass has been in multiple talks with India’s Reliance, kicking around a possible partnership. Reliance wants to push OpenAI APIs to Indian businesses, but Microsoft might need to co-sign thanks to its https://x.com/rohanpaul_ai/status/1903776469400400265

“Qwen just DROPPED a 32B VLM – beats Qwen 2.5 72B and GPT 4o Mini – Apache 2.0 licensed 🔥 Vision Tasks (vs. Qwen2-VL-72B): > MMMU: 70.0 (vs. 64.5) > MathVista: 74.7 (vs. 70.5) > OCRBenchV2: 57.2/59.1 (vs. 47.8/46.1) > Android Control: 69.6/93.3 (vs. 66.4/84.4) Text Tasks (vs. https://x.com/reach_vb/status/1904234593576014312

Testing DeepSeek R1 locally for RAG with Ollama and Kibana – Elasticsearch Labs https://www.elastic.co/search-labs/blog/deepseek-rag-ollama-playground