Image created with gemini-3.1-flash-image-preview (prompt drafted with claude-sonnet-4-5). Image prompt: Photorealistic winter bay at dusk with large three-dimensional Chinese calligraphy characters naturally formed within translucent ice blocks scattered across frozen surface, characters catching golden sunset light creating amber and blue gradients through ice, 4K nature documentary quality, horizontal landscape composition with bold sans-serif ‘Zhipu AI’ title text prominently displayed, no CGI effects, physically grounded ice formations with realistic textures and atmospheric depth
GLM-5: From Vibe Coding to Agentic Engineering https://simonwillison.net/2026/Feb/11/glm-5/
GLM-5: From Vibe Coding to Agentic Engineering https://z.ai/blog/glm-5
Introducing GLM-5: From Vibe Coding to Agentic Engineering GLM-5 is built for complex systems engineering and long-horizon agentic tasks. Compared to GLM-4.5, it scales from 355B params (32B active) to 744B (40B active), with pre-training data growing from 23T to 28.5T tokens. https://x.com/Zai_org/status/2021638634739527773
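The scaling claim above can be sanity-checked with a quick back-of-envelope script; the only inputs are the parameter counts quoted in the tweet. Interestingly, GLM-5 activates a smaller fraction of its weights per token than GLM-4.5 did.

```python
# Active-parameter fractions for the MoE sizes quoted above
# (355B total / 32B active for GLM-4.5, 744B total / 40B active for GLM-5).
glm45_total, glm45_active = 355, 32   # billions of parameters
glm5_total, glm5_active = 744, 40

frac_45 = glm45_active / glm45_total
frac_5 = glm5_active / glm5_total

print(f"GLM-4.5 active fraction: {frac_45:.1%}")  # ~9.0%
print(f"GLM-5   active fraction: {frac_5:.1%}")   # ~5.4%
```

So while the total size roughly doubled, per-token compute grew far more slowly; the MoE got sparser.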
GLM-5 was pre-trained on 28.5T tokens and uses DeepSeek Sparse Attention https://x.com/scaling01/status/2021627498451370331
A new model is now available on https://x.com/Zai_org/status/2021564343029203032
GLM Coding Plan has seen strong growth in users and usage. To sustain service quality, we’ve been investing heavily in compute and model optimization. To reflect these rising costs, we’re adjusting GLM Coding Plan pricing effective February 11, 2026: – First-purchase discounts https://x.com/Zai_org/status/2021656635668901985
User traffic has increased tenfold in a very short time. We’re currently scaling to handle the load. https://x.com/Zai_org/status/2021585714551443676
Weights are also available on ModelScope: https://x.com/Zai_org/status/2021703681104568337
❤️ GLM-5 is on Ollama’s cloud! It’s free to start, with higher limits available on the paid plans. ollama run glm-5:cloud It’s fast. You can connect it to Claude Code, Codex, OpenCode, OpenClaw via ollama launch! Claude: ollama launch claude --model glm-5:cloud https://x.com/ollama/status/2021667631405674845
🎉 The mysterious Pony Alpha is finally revealed, congrats to @Zai_org on releasing GLM-5! SGLang is ready with day-0 support. 🛠️ 744B params (40B active) model built for complex systems engineering & long-horizon agentic tasks 📚 28.5T tokens pretraining for a stronger https://x.com/lmsysorg/status/2021639499374375014
🔥 Congrats to @Zai_org on launching GLM-5 — 744B parameters (40B active), trained on 28.5T tokens, integrating DeepSeek Sparse Attention to keep deployment cost manageable while preserving long-context capacity. vLLM has day-0 support for GLM-5-FP8 with: 📖 DeepSeek Sparse https://x.com/vllm_project/status/2021656482698387852
🚀 Zhipu AI GLM-5: A Real Step Into the Top Tier? Zhihu contributor toyama nao offers a concise verdict: “A hard road upward — the stairway to godhood.” 🔮 From recovery to contention: over the past six months (4.5 → 5.0), Zhipu has climbed back into China’s first tier and now https://x.com/ZhihuFrontier/status/2022161058321047681
GLM-5 by @Zai_org is now the #1 open model in Code Arena, tied with Kimi-K2.5-Thinking! Overall #6, on par with Gemini-3-pro, 100+ pts below Claude-Opus-4.6 in agentic webdev tasks. Congrats to the @Zai_org GLM team on the new milestone! 👏 https://x.com/arena/status/2021996281141629219
GLM-5 from @Zai_org just climbed to #1 among open models in Text Arena! ▫️ #1 open model, on par with claude-sonnet-4.5 & gpt-5.1-high ▫️ #11 overall, scoring 1452, +11 pts over GLM-4.7. Test it out in the Code Arena and keep voting; we’ll see how GLM-5 performs for agentic coding https://x.com/arena/status/2021725350481526904
GLM-5 is coming to Coding Plan Pro users within one week, and we’re working to bring it to everyone after that. To be upfront: compute is very tight. Even before the GLM-5 launch, we were pushing every chip to its limit just to serve inference. We appreciate your understanding. https://x.com/Zai_org/status/2021656633320018365
GLM-5 is now on AI Gateway. Better long-range planning, multiple thinking modes, and improved multi-step agent tasks versus previous https://t.co/Yqx8kVZ3i8 models. Use model: 'zai/glm-5' to get started. https://x.com/vercel_dev/status/2021655129347539117
GLM-5 is the new leading open weights model! GLM-5 leads the Artificial Analysis Intelligence Index amongst open weights models and makes large gains over GLM-4.7 in GDPval-AA, our agentic benchmark focused on economically valuable work tasks. GLM-5 is @Zai_org’s first new https://x.com/ArtificialAnlys/status/2021678229418066004
GLM-5 is ZAI’s new flagship. 744B params (40B active), trained on 28.5T tokens, and built for complex systems engineering and long-horizon agentic tasks. Two things worth paying attention to: 1. They integrated DeepSeek Sparse Attention to cut deployment costs while keeping https://x.com/cline/status/2021999167875555694
GLM-5 just launched — now available in Qoder. On Qoder Bench — our benchmark for real-world software engineering tasks — GLM-5 outperforms Sonnet 4.5 and approaches Opus 4.5. At a fraction of the cost. High demand expected — brief waits possible during peak hours. Scaling in https://x.com/qoder_ai_ide/status/2021639227814092802
GLM-5, the latest frontier open model from @Zai_org, is available now on Modal. We partnered with https://t.co/nhqgwNEWkB to release an endpoint that will be free for a limited time. https://x.com/modal/status/2021645783733616800
Pony Alpha Stealth model reveal: GLM-5 from @Zai_org. GLM-5 is a new 744B foundation model for coding and agentic use cases. It achieves SOTA scores on top agent benchmarks, and has been used successfully in many agent flows during its Stealth period. Live now on OpenRouter! https://x.com/OpenRouter/status/2021639702789730631
Average throughput of GLM-5 on OpenRouter is 14 tps. https://x.com/scaling01/status/2021981416452764058
Build more. Spend less. GLM-5 is now on YouWare. Landing pages, portfolios, prototypes. All handled fast, with a 200K context window. Save your premium credits for the big builds. https://x.com/YouWareAI/status/2021982784948936874
Congrats @Zai_org on GLM-5! Love the permissive MIT license (vs K2.5’s modified MIT). Haven’t chatted with it yet so no vibes, but from the numbers I’m not compelled to switch from @Kimi_Moonshot K2.5: • Similar evals, but GLM-5’s are at bf16 while K2.5’s are at int4 – GLM-5 https://x.com/QuixiAI/status/2021651135615184988
Day-0 with @Zai_org: GLM-5 is live on DeepInfra 🔥 Built for long-horizon agents that plan, orchestrate, and self-correct. Serving ~100 TPS at launch and, as usual, the best price on the market! https://x.com/DeepInfra/status/2021666854088110318
GLM-5 is 2x the total parameters of GLM 4.5, plus DeepSeek Sparse Attention for efficient long context. This is going to be a crazy model. https://x.com/eliebakouch/status/2020824645868630065
“GLM MoE DSA” is landing in transformers 👀 https://x.com/xeophon/status/2020815776890909052
GLM-4.7-Flash-GGUF is now the most downloaded model on @UnslothAI. https://x.com/Zai_org/status/2021207517557051627
GLM-5 already available on OpenRouter (with even lower prices). https://x.com/scaling01/status/2021637257103651040
GLM-5 has a 200k context length and a maximum output of 128k. https://x.com/scaling01/status/2021628691357298928
GLM-5 is massive. 745B params. LETS FUCKING GOOOOO This should be fun! https://x.com/scaling01/status/2020840989947298156
GLM-5 pricing: $1 input and $3.2 output, almost 8 times cheaper than Opus. There is also a GLM-5 Code variant that is more expensive 👀 https://x.com/scaling01/status/2021628971939418522
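Taking the quoted prices at face value and assuming, as is standard for API pricing, that they are USD per million tokens, the cost of a typical request is easy to estimate (the Opus comparison is the tweet author's, not computed here):

```python
# Hypothetical cost estimate from the prices quoted above,
# ASSUMED to be USD per million tokens (standard API convention).
PRICE_IN = 1.0    # $ per 1M input tokens
PRICE_OUT = 3.2   # $ per 1M output tokens

def call_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated dollar cost of one request at the quoted rates."""
    return input_tokens / 1e6 * PRICE_IN + output_tokens / 1e6 * PRICE_OUT

# e.g. a 10k-token prompt with a 2k-token response:
print(f"${call_cost(10_000, 2_000):.4f}")  # $0.0164
```

At these rates even long agentic sessions stay in the cents-per-request range, which is the point the pricing tweets are making.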
GLM-5 runs with mlx-lm on a single 512GB M3 Ultra in Q4. It’s quite good in my initial testing and pretty fast as well. It generated a highly functional Space Invaders game using 7.1k tokens at 15.4 tok/s and 419GB of memory. Thanks to @ActuallyIsaak and @kernelpool for the port. https://x.com/awnihannun/status/2022007608811696158
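As a rough check on those local-inference figures: at the reported decode speed, the quoted generation took a little under eight minutes of wall-clock time (ignoring prompt processing, which the tweet doesn't break out):

```python
# Wall-clock estimate from the figures quoted above.
tokens = 7_100   # tokens generated for the Space Invaders game
speed = 15.4     # decode speed in tok/s on the M3 Ultra

seconds = tokens / speed
print(f"{seconds:.0f} s (~{seconds / 60:.1f} min)")  # 461 s (~7.7 min)
```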
https://t.co/ctlyPtiB3j GLM-5 architecture is out: ~740B parameters, ~50B active, 78 layers, MLA attention lifted from DeepSeek V3, plus DeepSeek V3.2’s sparse attention indexer for 200k context. Basically DeepSeek V3 scale with DSA bolted on. https://x.com/QuixiAI/status/2021111352895393960
GLM-5 is out on @huggingface 🔥 > A40B/744B, trained on more tokens (28.5T) > outperforms/on par with closed SOTA > allows commercial use (MIT licensed) 💗 use with vLLM/SGLang locally or through HF Inference Providers, thanks to @novita_labs and @Zai_org 📦 https://x.com/mervenoyann/status/2021642658188538348
DeepSeek V4-lite, Minimax 2.5, GLM-5: what a bloodbath. Will Qwen accelerate the release of 3.5? https://x.com/teortaxesTex/status/2021586965594857487
Z.ai said they are GPU starved, openly. : r/LocalLLaMA https://www.reddit.com/r/LocalLLaMA/comments/1r26zsg/zai_said_they_are_gpu_starved_openly/