Image created with OpenAI GPT-Image-1. Image prompt: vintage Sly & the Family Stone album-cover style, psychedelic collage of band members, swirling neon tie-dye backdrop featuring world map grid with flowing data lines; grainy retro print texture, vibrant 60s funk color palette, high-resolution
Introducing Mistral Code | Mistral AI https://mistral.ai/news/mistral-code
Mistral releases a vibe coding client, Mistral Code | TechCrunch https://techcrunch.com/2025/06/04/mistral-releases-a-vibe-coding-client-mistral-code/
DeepSeek’s R1 leaps over xAI, Meta and Anthropic to be tied as the world’s #2 AI Lab and the undisputed open-weights leader. DeepSeek R1 0528 has jumped from 60 to 68 in the Artificial Analysis Intelligence Index, our index of 7 leading evaluations that we run independently. https://x.com/ArtificialAnlys/status/1928071179115581671
@karpathy: “Daily driver these days is Gemini 2.5 Pro and sometimes Claude Sonnet 4. For simple brainstorming/creative writing, DeepSeek v3.” https://x.com/i/web/status/1929613466475659662
Examples of international AI redlines: 1. Intelligence explosion redline. AIs might be able to improve AIs all by themselves in the next few years. The US and China should not want anybody to attempt an intelligence explosion where thousands of AIs are autonomously and rapidly… https://x.com/i/web/status/1929709721290002497
AI-powered coding for the enterprise | Mistral AI https://mistral.ai/products/mistral-code
AI Agents can now talk to any website directly. Microsoft’s NLWeb converts website data into APIs that AI agents can query as an MCP server. Works with OpenAI, DeepSeek, Gemini, Claude and other LLMs. 100% Opensource. https://x.com/Saboo_Shubham_/status/1927379307371864428
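For a concrete sense of the pattern, here is a minimal sketch of querying such a server over plain HTTP. The host, port, endpoint path, and response shape are illustrative assumptions, not taken from the NLWeb docs; check the repo for the real interface.

```python
import requests

# Hypothetical local NLWeb instance; the /ask endpoint, port, and response
# schema below are assumptions for illustration, not the documented NLWeb API.
NLWEB_URL = "http://localhost:8000/ask"

def ask_site(question: str) -> dict:
    """Send a natural-language question to an NLWeb-style server and
    return the structured (schema.org-flavored) JSON it answers with."""
    resp = requests.get(NLWEB_URL, params={"query": question}, timeout=30)
    resp.raise_for_status()
    return resp.json()

print(ask_site("Which recipes here are vegetarian and under 30 minutes?"))
```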
Qwen2.5-VL is such a great and versatile model that every frontier lab is building on it these days: new agentic models, GUI models and more are always based on it. @Alibaba_Qwen you’re the best 💗 https://x.com/i/web/status/1929488866748092881
🚀 DeepSeek-R1-0528 is here! 🔹 Improved benchmark performance 🔹 Enhanced front-end capabilities 🔹 Reduced hallucinations 🔹 Supports JSON output & function calling ✅ Try it now: https://x.com/deepseek_ai/status/1928061589107900779
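Since DeepSeek’s API is OpenAI-compatible, function calling with the updated R1 can be sketched roughly as below. The `get_weather` tool is hypothetical, and `deepseek-reasoner` is assumed to resolve to R1-0528; treat both as placeholders.

```python
from openai import OpenAI

client = OpenAI(base_url="https://api.deepseek.com", api_key="YOUR_DEEPSEEK_KEY")

# Hypothetical tool definition for illustration only.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="deepseek-reasoner",  # assumed to point at R1-0528
    messages=[{"role": "user", "content": "What's the weather in Hangzhou?"}],
    tools=tools,
)
print(resp.choices[0].message.tool_calls)
```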
DeepSeek has released DeepSeek-R1-0528, an updated version of DeepSeek-R1. How does the new model stack up in benchmarks? We ran our own evaluations on a suite of math, science, and coding benchmarks. Full results in thread! https://x.com/EpochAIResearch/status/1928489524616630483
New DeepSeek just dropped. Proud to serve the fastest DeepSeek R1 0528 inference on OpenRouter (#1 on TTFT and TPS) with our Model APIs. https://x.com/basetenco/status/1928195639822700898
The DeepSeek-R1-0528 model card just dropped. Up 17.5 points on the AIME 2025 test. https://x.com/fdaudens/status/1928055679182352461
Today’s open weights frontier is led by DeepSeek (both reasoning and non-reasoning models) https://x.com/ArtificialAnlys/status/1928477951365939328
We made dynamic 1-bit quants for DeepSeek-R1-0528, 74% smaller (713GB down to 185GB). Use the magic incantation -ot ".ffn_.*_exps.=CPU" to offload the MoE layers to RAM, allowing the non-MoE layers to fit in under 24GB of VRAM at 16K context! The rest sits in RAM & disk. Quants here: https://x.com/danielhanchen/status/1928278088951157116
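As a rough sketch of how that incantation is used in practice (assuming a llama.cpp build with `llama-cli` on PATH; the GGUF shard name and context size are illustrative):

```python
import subprocess

# The -ot regex is the "magic incantation" from the post: it pins the MoE
# expert tensors (ffn_*_exps) to CPU RAM, so only the dense non-MoE weights
# need to fit in GPU VRAM. The model filename below is a hypothetical shard name.
subprocess.run([
    "llama-cli",
    "-m", "DeepSeek-R1-0528-UD-IQ1_S-00001-of-00004.gguf",
    "-ot", ".ffn_.*_exps.=CPU",  # offload MoE expert layers to system RAM
    "-c", "16384",               # 16K context, as in the announcement
    "-ngl", "99",                # keep the remaining layers on the GPU
    "-p", "Explain KV caching in one paragraph.",
], check=True)
```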
On GPQA Diamond, a set of PhD-level multiple-choice science questions, DeepSeek-R1-0528 scores 76% (±2%), outperforming the previous R1’s 72% (±3%). This is generally competitive with other frontier models, but below Gemini 2.5 Pro’s 84% (±3%). https://x.com/EpochAIResearch/status/1928489527204589680
DeepSeek R1 05-28 LiveBench results: 8th overall, ahead of o4-mini, Gemini 2.5 Flash Preview and Qwen3-235B-A22B (its biggest competitors); 1st on Data Analysis!!!; 3rd on Reasoning!!; 4th on Mathematics!; 11th on Language; 20th on Instruction Following; 23rd on … https://x.com/scaling01/status/1928173385399308639
Releasing our Q2 2025 State of AI – China Report 🇨🇳: Chinese AI labs have achieved close to parity with US labs, led by DeepSeek’s leap to world #2 in intelligence and backed by a deep ecosystem of 10+ players Key findings from our analysis: 🇨🇳 The Chinese AI Ecosystem has depth https://x.com/ArtificialAnlys/status/1928477941715079175
The latest mlx-lm has a new dynamic quantization method (made with @angeloskath). It consistently results in better model quality with no increase in size. Some perplexity results (lower is better) for a few Qwen3 base models: https://x.com/i/web/status/1929633379504493048
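A minimal sketch of producing such a quant with mlx-lm’s `convert` entry point is below; note that the `quant_predicate` hook and the `mixed_3_6` recipe name are assumptions that may not match the new dynamic method exactly, so check the mlx-lm release notes for the right identifier.

```python
from mlx_lm import convert

convert(
    hf_path="Qwen/Qwen3-4B-Base",      # one of the Qwen3 base models from the thread
    mlx_path="qwen3-4b-dynamic-4bit",  # arbitrary local output directory
    quantize=True,
    q_bits=4,
    quant_predicate="mixed_3_6",       # assumed recipe name; verify against your mlx-lm version
)
```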
Decentralized compute is winning. We don’t have one datacenter, we have dozens. We don’t have one SRE team, we have nearly 100. Latest example: DeepSeek-R1-0528. 100% uptime, day zero support, 4x more tokens on OpenRouter than all other providers combined (and go check the… https://x.com/i/web/status/1929639699171495936
Chinese tech companies prepare for AI future without Nvidia, FT reports https://finance.yahoo.com/news/chinese-tech-companies-prepare-ai-012546092.html
Brookfield plans $10 billion AI data centre in Sweden | Reuters https://www.reuters.com/technology/brookfield-asset-management-plans-10-bln-data-centre-ai-sweden-2025-06-04/
I sat down with Bloomberg to share why Bell Canada chose Groq. Canada didn’t wait. They built sovereign AI infrastructure with speed, scale, and control – powered by Groq. Full Interview 🔗 https://x.com/JonathanRoss321/status/1928241967122506083
Pretty impressive 7B VLM coming out of Xiaomi 🤓 ViT encoder w/ MLP, powered by their 7B text backbone. Compatible w/ Qwen VL arch, so it works across vLLM, Transformers, SGLang and llama.cpp. Bonus: it can reason and is MIT licensed 🔥 https://x.com/reach_vb/status/1928360066467439012
Nvidia B200s serving DeepSeek R1 at ~250 tok/s, 5x faster than H100s. https://x.com/i/web/status/1929670236057264354
Why DeepSeek is cheap at scale but expensive to run locally | sean goedecke https://www.seangoedecke.com/inference-batching-and-deepseek/
Optimised MLX quant for DeepSeek R1 0528 🔥 https://x.com/reach_vb/status/1928002892633383338
DeepSeek R1 Qwen3 8B knows it’s overthinking it 😂 https://x.com/awnihannun/status/1928119439737729482
It turns out that most AI models (including DeepSeek R1), if told they should “follow your conscience to make the right decision,” will snitch on you to the Feds if they think you are suppressing knowledge of a drug trial that actually kills people. Alignment in practice? https://x.com/emollick/status/1928979986813243899
OpenAI finds more Chinese groups using ChatGPT for malicious purposes | Reuters https://www.reuters.com/world/china/openai-finds-more-chinese-groups-using-chatgpt-malicious-purposes-2025-06-05/
Given that the US, China & Europe are all players in frontier open weights models, I am not sure what it means for a nation to “win” in AI. Unless you are positing a take-off scenario where one (closed weights) AI dominates everything else, won’t open models diffuse worldwide? https://x.com/emollick/status/1928203057092870635
A full set of new and improved Qwen3 4-bit DWQ quants is on the Hugging Face MLX Community: https://x.com/i/web/status/1929601108210835931
America wins by recruiting builders and innovators: Einstein, Grove, Brin, Huang. Immigrants start companies, create millions of jobs, and strengthen our global leadership, benefiting generations of native-born Americans. Immigration done right is America First. 🇺🇸 https://x.com/saranormous/status/1928479931660411033
Shisa V2 405B: Japan’s Highest Performing LLM https://simonwillison.net/2025/Jun/3/shisa-v2/
Ollama can now think! 🤔🤔🤔 For thinking models, especially very thoughtful ones like DeepSeek-R1-0528, Ollama can separate the thoughts from the response. Thinking can also be disabled! This is useful for getting a direct response. This works across… https://x.com/ollama/status/1928543644090249565
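A minimal sketch with the Python client, assuming a recent `ollama` package that exposes the `think` flag and a `thinking` field on the message (verify the names against your installed version):

```python
from ollama import chat

resp = chat(
    model="deepseek-r1:8b",  # assumed tag for the R1-0528 Qwen3-8B distill
    messages=[{"role": "user", "content": "Is 9.11 larger than 9.9?"}],
    think=True,              # set think=False for a direct, thought-free reply
)
print("thoughts:", resp.message.thinking)
print("answer:", resp.message.content)
```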
Why are almost all RL experiments done on Qwen models? Kind of interesting, right… https://x.com/abacaj/status/1927948317931000277
The 4-bit DWQ of DSR1 Qwen3 8B is up on HF. Use the command below or use it in @lmstudio: https://x.com/awnihannun/status/1928125690173383098
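The exact command is in the linked post; as an illustrative stand-in (the repo name is an assumption, so browse the mlx-community org on Hugging Face for the actual upload), loading and sampling the DWQ in Python looks like this:

```python
from mlx_lm import load, generate

# Hypothetical repo id for the 4-bit DWQ of DeepSeek-R1-0528-Qwen3-8B.
model, tokenizer = load("mlx-community/DeepSeek-R1-0528-Qwen3-8B-4bit-DWQ")
print(generate(model, tokenizer, prompt="Prove that sqrt(2) is irrational.", max_tokens=512))
```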
⚠️⚠️⚠️ Qwen team has worked on training pivot tokens ⚠️⚠️⚠️ @_xjdr @doomslide Amusingly, they *do* test it on Llama 3.1 as well, but find it performs so poorly that no conclusive results can be had without a cold start with Qwen data. https://x.com/i/web/status/1929755590404055358
Seems like no one saw this either; scraping arXiv manually seems to be the way. Pretty cool paper on RL for creative writing with the Qwen3 32B base, and most interestingly it’s by one author from the Star Writing Team (haven’t heard of them). They seem to have access to the 32B base, though. https://x.com/i/web/status/1929996614883783170