Image created with gemini-2.5-flash-image with claude-sonnet-4-5. Image prompt: Cinematic wide shot of multiple futuristic orbital command stations positioned around Earth in deep space, holographic data beams connecting stations to territories below, cool blue and muted green lighting with dramatic rim light, Ender’s Game inspired aesthetic, sleek military sci-fi architecture, high contrast with Earth partially shadowed, conveying geopolitical AI strategy and international coordination at massive scale

ByteDance’s Volcano Engine debuts coding agent at $1.3 promo price https://www.techinasia.com/news/bytedances-volcano-engine-debuts-coding-agent-at-1-3-promo-price

ByteDance unveils China’s most affordable AI coding agent at just US$1.30 a month | South China Morning Post https://www.scmp.com/tech/big-tech/article/3332365/bytedance-unveils-chinas-most-affordable-ai-coding-agent-just-us130-month

GPT-5, Claude, Kimi, and Gemini: “I can travel back in time to any time before 1500 and change only one thing, what is the single thing you would change, nothing obvious.” https://x.com/emollick/status/1987355374928769395

Perceptron’s platform is here — built for Physical AI Developers can now use Isaac-0.1 or Qwen3VL 235B via: Perceptron API — fast, reliable multimodal intelligence Python SDK — simple, grounded prompting for vision + language Build apps that see and understand the world. https://x.com/perceptroninc/status/1988713482460750290

Introducing Meta Omnilingual Automatic Speech Recognition (ASR), a suite of models providing ASR capabilities for over 1,600 languages, including 500 low-coverage languages never before served by any ASR system. While most ASR systems focus on a limited set of languages that are https://x.com/AIatMeta/status/1987946571439444361

Omnilingual ASR: Advancing Automatic Speech Recognition for 1,600+ Languages https://ai.meta.com/blog/omnilingual-asr-advancing-automatic-speech-recognition/

Nvidia’s Jensen Huang: ‘China is going to win the AI race,’ FT reports https://finance.yahoo.com/news/nvidias-jensen-huang-says-china-211900769.html

Baidu just dropped an open-source multimodal AI that it claims beats GPT-5 and Gemini | VentureBeat https://venturebeat.com/ai/baidu-just-dropped-an-open-source-multimodal-ai-that-it-claims-beats-gpt-5

“Group chats in ChatGPT are now piloting in Japan, New Zealand, South Korea, and Taiwan. A new way to collaborate with friends, family, or coworkers and ChatGPT in the same conversation.” https://x.com/OpenAI/status/1989138776585851038

Most models: think → tool call → think → tool call. K2 Thinking: keeps tool calls inside the reasoning trace so multi-step workflows don’t drift. We’ll show how Moonshot post-trained for agentic tool calling and demo complex workflows running in one model call. https://x.com/togethercompute/status/1988009780149878904

It turns out that Kimi K2 Thinking is also a beast at deep research. It can run 200-300 tool requests for impressive multi-agent capabilities. Would you like to see a code example of it? https://x.com/omarsar0/status/1987912692099682399

Kimi K2 Thinking is impressive. So I built a multi-agent deep researcher, Kimi Deep Researcher. It generates long research reports on any topic, powered by subagents (web searcher, analyzer, and synthesizer). It can do 100s of tool calls per session. Repo soon! https://x.com/omarsar0/status/1988974710592516454

These are pretty impressive benchmarks from a Chinese open weights model. Especially big is the agentic capability, which has generally lagged in the open weights models. Be interesting to see independent confirmation soon, I found K2 a solid, but kind of weird, model to use. https://x.com/emollick/status/1986452925418270871

🚀 Hello, Kimi K2 Thinking! The Open-Source Thinking Agent Model is here. 🔹 SOTA on HLE (44.9%) and BrowseComp (60.2%) 🔹 Executes up to 200 – 300 sequential tool calls without human interference 🔹 Excels in reasoning, agentic search, and coding 🔹 256K context window Built https://x.com/Kimi_Moonshot/status/1986449512538513505

🚀We’re going live with @Kimi_Moonshot on Nov 19 for a technical deep dive on Kimi K2 Thinking Learn about the 1T parameter MoE that allows your AI agent to make 300 tool calls in one run. Register: https://x.com/togethercompute/status/1988009777247510564

from Kimi AMA: – K3 will likely use KDA or some other hybrid attention mechanism – Kimi-K2 will get vision https://x.com/scaling01/status/1987916859400659011

I wonder if part of what makes Kimi K2 Thinking impressive is that it produces a lot more thinking tokens for even minor & non-technical queries than any model I have used. This is the thinking trace for “write me a really good sentence about cheese”: it is 1,595 tokens long! https://x.com/emollick/status/1987286609713107261

Try Kimi-K2-Thinking now on Together AI https://x.com/togethercompute/status/1988011880443470217

I’m sorry Kimi bros. The problem is and was 100% the OpenRouter API and it’s starting to piss me off that long reasoning always breaks. Just use Kimi API for now and not OpenRouter if you have requests that take a lot of reasoning tokens. Simpler requests work fine with https://x.com/scaling01/status/1987938809628291168

since testing Kimi-K2 Thinking I have become very wary of providers on OpenRouter. might switch to original provider APIs only. they need to do quality testing for every model and provider https://x.com/scaling01/status/1988399213563236810

Kimi K2 Thinking passes the Lem Test the first time, very few models have done so Just like Kimi K2, however, this remains a very weird & interesting model in a way that is hard to benchmark. Its writing is often very good but sometimes doesn’t hold up under close investigation https://x.com/emollick/status/1986552301922738651

Thanks everyone for testing Kimi K2 Thinking and sharing benchmark results! We’ve noticed that benchmark outcomes can vary across providers. Some third-party endpoints show substantial accuracy drops (e.g., 20+ pp), which has negatively affected scores on reasoning-heavy tasks https://x.com/Kimi_Moonshot/status/1987892275092025635

Kimi AMA on K2 Thinking: 1. $4.6M training cost is not an official number 2. Trained on H800s (nerfed H100s) 3. KDA (Kimi Delta Attention) hybrids with NoPE MLA perform better than full MLA with RoPE 4. Muon scales well to 1T parameters. “there are tens of optimizers and https://x.com/Yuchenj_UW/status/1987940704929395187

Test out Kimi K2 Thinking vs. all the frontier models for yourself at: https://x.com/arena/status/1987947224173781185

Testing Kimi K-2 has reminded me of how insane it is that firms picking AIs are treating them as fungible based on benchmarks. Kimi & Grok & Claude & every other model have strengths, quirks & weaknesses that can make a big difference in aggregate. Develop your own benchmarks! https://x.com/emollick/status/1986604851770360213

In our new Expert and Occupational leaderboards: The previous, non-thinking Kimi K2 is ranked #7 for Hard Prompts, particularly excelling in the ‘Legal & Government’ category under the ‘Occupational’ leaderboard, while falling behind in ‘Instruction Following’. Kimi K2 Thinking https://x.com/arena/status/1987947222299013630

k2 vision is happening. this is not a drill. https://x.com/code_star/status/1987917177417289794

Whenever people ask me, “Is Muon optimizer just hype?” I need to show them this. Muon isn’t just verified and used in Kimi; other frontier labs like OpenAI are using it and its variants. It’s also in PyTorch stable now! https://x.com/Yuchenj_UW/status/1987955443420065816

Latest LisanBench results for Kimi-K2 Thinking Kimi-K2 Thinking is the best open-source model and 7th best model overall, right between GPT-5 and GPT-5-Mini Raw Scores: Glicko-2 ratings – better indicator of relative strength Kimi-K2 Thinking managed to set new high-scores https://x.com/scaling01/status/1987952884927934966

🚨 Leaderboard Update! Kimi K2 Thinking by @Kimi_Moonshot has landed on the Text leaderboard as the #2 open source model (MIT modified), tied for #7 overall. These are real-world results. With only a six-point difference with @Zai_org’s GLM 4.6, the competition is tight. Kimi https://x.com/arena/status/1987947219224526902

Our very own @RLanceMartin outlined a new playbook for AI engineering on the High Signal Podcast. In this conversation, he touches on: 🔶 Why top products from Claude Code to Manus are constantly re-architecting to keep up with tomorrow’s models 🔶 How to use context engineering https://x.com/LangChainAI/status/1989152093127782765

UK firms plan 3% pay rises in coming year, see AI hit to jobs, survey shows | Reuters https://www.reuters.com/business/world-at-work/uk-firms-plan-3-pay-rises-coming-year-see-ai-hit-jobs-survey-shows-2025-11-10/

Only a few countries have enough power to build many >1 GW data centers like Stargate. E.g. 30 GW is ~5% of the US’ power, ~2.5% of China’s, but ~90% of the UK’s. Other countries can build some frontier data centers and grow their power capacity, but they need more time/money https://x.com/EpochAIResearch/status/1987944152542441763
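
As a quick check on the arithmetic in that post, the quoted percentages can be inverted to see what total power each one implies. This is only a sketch: the 30 GW figure and the three shares come from the post, while the function name and the derived totals are illustrative and are not numbers Epoch published.

```python
# Figures quoted in the post: 30 GW expressed as a share of each country's power.
REFERENCE_GW = 30.0
quoted_share = {"US": 0.05, "China": 0.025, "UK": 0.90}


def implied_total_gw(reference_gw: float, share: float) -> float:
    """Total power implied if `reference_gw` equals `share` of the whole."""
    return reference_gw / share


# Derived values (an assumption-laden back-of-envelope, not source data).
implied = {
    country: implied_total_gw(REFERENCE_GW, share)
    for country, share in quoted_share.items()
}

for country, total in implied.items():
    print(f"{country}: roughly {total:.0f} GW implied total")
```

Read this way, the same 30 GW build-out that is a rounding error for the US or China would consume nearly all of the UK's implied total, which is the point the post is making.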

China’s DeepSeek makes rare public comment, calls for AI ‘whistle-blower’ on job losses | South China Morning Post https://www.scmp.com/tech/big-tech/article/3332086/chinas-deepseek-makes-rare-public-comment-calls-ai-whistle-blower-job-losses

A new addition to the ERNIE open-source model family is here! Meet ERNIE-4.5-VL-28B-A3B-Thinking, our lightweight multimodal reasoning model. > 3B active parameters with enhanced semantic alignment between visual and language modalities > Outperforming Gemini-2.5-Pro and https://x.com/Baidu_Inc/status/1988182106359411178

The Government of Kazakhstan, @OpenAI, and Freedom Holding Corp. signed a strategic agreement to bring ChatGPT Edu to 165,000 educators across the country, advancing digital literacy with enhanced privacy and data protection. https://x.com/KazakhEmbassy/status/1987252296267248002

🚀 Qwen DeepResearch 2511 is LIVE! 🚀 We’ve just dropped a major upgrade, making your research deeper, faster, and smarter! 🔗: https://x.com/Alibaba_Qwen/status/1989026687611461705

🚀 Qwen Code v0.2.1 is here! We shipped 8 versions(v0.1.0->v0.2.1) in just 17 days with major improvements: What’s New: 🌐 Free Web Search: Support for multiple providers. Qwen OAuth users get 2000 free searches per day! 🎯 Smarter Code Editing: New fuzzy matching pipeline https://x.com/Alibaba_Qwen/status/1989368317011009901

QwenEdit-2509 Photo2Anime: LoRA transforms photos into anime; delivers better results than prompting for “anime” without it. https://x.com/wildmindai/status/1988309389259010112

Qwen Image Edit Light Restoration app Easily remove shadows and relight in seconds. Here’s how + 4 wild examples:👇 https://x.com/minchoi/status/1988008926797787208

America and China build robots. Europe builds committees. Tesla just showed Optimus in pilot production. Humanoids being assembled like cars. Meanwhile, in Europe, we’re still arguing over regulation, ethics boards, and frameworks that nobody in the real world reads. This https://x.com/IlirAliu_/status/1986869259226456142

I would like policy discussion to be much clearer about what “winning the international AI race” means. Policymakers do not seem to believe in a takeoff scenario based on other decisions, and without an apotheosis as a “finish line,” it isn’t that clear to me what we are racing to. https://x.com/emollick/status/1986297613642084407

We’re releasing a large update to 📄FinePDFs! – 350B+ highly educational tokens in 69 languages, with incredible perf 🚀 – 69 edu classifiers, powered by ModernBert and mmBERT – 300k+ EDU annotations for each of 69 languages from Qwen3-235B https://x.com/HKydlicek/status/1988328336469459449
