Image created with Flux Pro v1.1 Ultra. Image prompt: Chips, sunlit café table near the pool, metallic microchip keychain clipped to a beach bag zipper, subtle reflections, photorealistic, editorial, minimal, landscape, vacation, no text overlays
Exclusive | Oracle, OpenAI Sign $300 Billion Cloud Deal – WSJ https://www.wsj.com/business/openai-oracle-sign-300-billion-computing-deal-among-biggest-in-history-ff27c8fe?mod=hp_lead_pos1
Oracle Stock Skyrockets as Software Giant Scores Massive AI Deals – WSJ https://www.wsj.com/business/earnings/oracle-stock-orcl-ai-deals-047216cd
⚡️ Efficient weight updates for RL at trillion-parameter scale 💡 Best practice from Kimi @Kimi_Moonshot. vLLM is proud to collaborate on checkpoint-engine: • Broadcast weight sync for 1T params in ~20s across 1000s of GPUs • Dynamic P2P updates for elastic clusters https://x.com/vllm_project/status/1965824120920342916
Introducing checkpoint-engine: our open-source, lightweight middleware for efficient, in-place weight updates in LLM inference engines, especially effective for RL. ✅ Update a 1T model on thousands of GPUs in ~20s ✅ Supports both broadcast (sync) & P2P (dynamic) updates https://x.com/Kimi_Moonshot/status/1965785427530629243
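The in-place update idea can be sketched in plain PyTorch. This is a minimal illustration, not checkpoint-engine's actual API: in a real cluster, each tensor would be sent with torch.distributed.broadcast across thousands of GPUs rather than copied locally, but the key point is the same — new values land in the inference engine's existing parameter storage, so no model reload is needed between RL steps.

```python
import torch

def inplace_weight_update(params: dict, new_weights: dict) -> None:
    """Overwrite each parameter's existing storage with new values.

    Illustrative sketch only. In a distributed setup, rank 0 would call
    torch.distributed.broadcast(tensor, src=0) for each tensor so every
    inference worker receives it into the same pre-allocated buffer.
    """
    with torch.no_grad():
        for name, p in params.items():
            # copy_ writes in place: the storage pointer never changes,
            # so the serving engine keeps using the same buffers
            p.copy_(new_weights[name])

# usage: update a toy "model" of two tensors
params = {"w": torch.zeros(4), "b": torch.zeros(2)}
trained = {"w": torch.ones(4), "b": torch.full((2,), 2.0)}
inplace_weight_update(params, trained)
```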
@Alibaba_Qwen (Gated) Attention is all you need. Excited to offer both Qwen3-Next models on dedicated deployments backed by 4xH100 GPUs. https://x.com/basetenco/status/1966224960223158768
In 2025, getting the right cloud GPU at the right cost depends on capacity, contracts, and timing. Our new report maps providers, pricing models, hardware shifts, and strategies to help ML teams optimize availability and costs. https://x.com/dstackai/status/1965807328508399984
Underdiscussed topic: GPUs aren’t the bottleneck. For fast distributed GenAI post-training, network & storage matter as much. By tuning the network and storage on the cloud, @makneee found 10x speedup on @nebiusai cloud, even with the same GPUs and code. https://x.com/skypilot_org/status/1966208445339807816
Nebius stock soars on AI infrastructure deal with Microsoft https://www.cnbc.com/2025/09/08/nebius-stock-soars-on-ai-infrastructure-deal-with-microsoft-.html
Writing fast GPU kernels is important, though not nearly as important as writing correct ones. That’s why @marksaroufim and folks from Meta have released BackendBench. Now BackendBench also lives on @PrimeIntellect environment hub. 1/3 https://x.com/m_sirovatka/status/1965891832942047350
Heads up: if you’re looking to try CUDA-13 (most useful for Blackwell gpus), PyTorch nightly already has cu130 builds: https://x.com/StasBekman/status/1965826539540590791
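For reference, nightly wheels install via PyTorch's standard nightly index URL pattern (the exact command is an assumption based on that pattern; verify the current one on pytorch.org):

```shell
# Install a PyTorch nightly built against CUDA 13.0 (cu130 wheels)
pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu130
```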
This is a handy report on the state of cloud GPUs in 2025: costs, performance, playbooks by @dstackai https://x.com/StasBekman/status/1965817531043811339
🚀 Excited to announce QuTLASS v0.1.0 🎉 QuTLASS is a high-performance library for low-precision deep learning kernels, following NVIDIA CUTLASS. The new release brings 4-bit NVFP4 microscaling and fast transforms to NVIDIA Blackwell GPUs (including the B200!) [1/N] https://x.com/DAlistarh/status/1965157635617087885
Don’t forget, you can try it out at NVIDIA API Catalog https://x.com/Alibaba_Qwen/status/1966206151391064143
remembering the time i tried to buy GPUs from oracle: the first sales guy I talked to didn’t know if the $10/hr on their website was per H100 or per 8xH100. he scheduled a follow-up call, in which they brought in 2x more sales people than our company’s total headcount https://x.com/vikhyatk/status/1965943667237204069