About This Week’s Covers
This week’s newsletter cover was inspired by the new Taylor Swift album “The Life of a Showgirl.”

I gave the original album art to GPT and asked it to write a prompt for an image tool that removed the text and swapped in a humanoid robot in a similar pose, reframed as a landscape image. I tried the prompt in all of the major tools and Flux won easily. I added the text in Photoshop.
I used my nine-week-old GPT rubric + Flux Pro Ultra to automatically incorporate all of the categories into a Vegas showgirl theme. I gave it a one-sentence description of the theme, and GPT automatically generated 46 cover image prompts and sent them through the API (Flux this week, but I can change it) with no supervision. All ideas and compositions came from GPT autonomously.
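The pipeline described above, one theme sentence in, dozens of cover prompts out, images generated unsupervised, can be sketched roughly like this. This is not the actual rubric or code; the category list, prompt template, and render hook are all placeholders for illustration:

```python
# A minimal sketch (not the real pipeline) of theme -> prompts -> images.
# The categories and the render step are hypothetical stand-ins.

CATEGORIES = ["Agents and Copilots", "Open Source", "Robotics", "Video"]

def build_cover_prompts(theme: str, categories: list[str]) -> list[str]:
    """Expand a one-sentence theme into one image prompt per category.
    In the real workflow GPT writes these; here we just template them."""
    return [
        f"{theme} Landscape composition, no text, styled around '{c}'."
        for c in categories
    ]

def render_all(prompts: list[str], render=lambda p: f"image-for:{p[:24]}"):
    """Send each prompt to an image API (Flux this week, swappable).
    `render` is injected so the loop isn't tied to one provider."""
    return [render(p) for p in prompts]

if __name__ == "__main__":
    prompts = build_cover_prompts(
        "A Vegas showgirl theme with neon feathers and stage lights.",
        CATEGORIES,
    )
    print(len(render_all(prompts)))
```

Swapping image providers then only means swapping the injected `render` function, which matches the “Flux this week, but I can change it” setup.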
I love the aesthetic of the imagery this week. I’d give the covers a B-. My favorite six are below:

This Week By The Numbers
Total Organized Headlines: 552
- AGI: 8 stories
- Accounting and Finance: 4 stories
- Agents and Copilots: 200 stories
- Alibaba: 14 stories
- Amazon: 2 stories
- Anthropic: 31 stories
- Apple: 1 story
- Audio: 8 stories
- Augmented Reality (AR/VR): 52 stories
- Autonomous Vehicles: 4 stories
- Benchmarks: 147 stories
- Business and Enterprise: 40 stories
- ByteDance: 3 stories
- Chips and Hardware: 18 stories
- DeepSeek: 2 stories
- Education: 19 stories
- Ethics/Legal/Security: 131 stories
- Google: 66 stories
- HuggingFace: 17 stories
- Images: 26 stories
- International: 43 stories
- Llama: 4 stories
- Locally Run: 62 stories
- Meta: 15 stories
- Microsoft: 10 stories
- Mobile: 4 stories
- Multimodal: 11 stories
- NVIDIA: 7 stories
- Open Source: 107 stories
- OpenAI: 208 stories
- Perplexity: 7 stories
- Podcasts/YouTube: 7 stories
- Publishing: 35 stories
- Qwen: 14 stories
- RAG: 7 stories
- Robotics Embodiment: 30 stories
- Science and Medicine: 14 stories
- Technical and Dev: 190 stories
- Video: 65 stories
- X: 28 stories
This Week’s Executive Summaries
Here’s everything you need to know about AI news for the week ending August 8, 2025.
ChatGPT is on track to reach 700 million weekly active users by the end of next week. That’s up from 500 million users at the end of March and four times the weekly users at this time last year.
The seven largest technology companies spent over $100 billion on AI infrastructure in the past three months; over the last six months, that build-out contributed more to U.S. economic growth than all consumer spending combined.
Meta is selling $2 billion in assets to help pay for its new data center infrastructure and growth.
OpenAI is reportedly working on a secondary sale that values the company at $500 billion.
OpenAI released GPT-5 to quite a bit of fanfare, but for most users the only visible difference is that GPT-5 now chooses which underlying model to use, without you having to tell it.
That said, feedback is pouring in; if you’re interested, tons of videos and examples are below.
Google released a world simulation model that creates entire explorable 3D worlds with just a prompt. This is going to be a big boost for robot training and potentially AR/VR, gaming, and videos. It’s almost “see it to believe it” level tech. Here are some examples:
Genie 3: A new frontier for world models – Google DeepMind
https://deepmind.google/discover/blog/genie-3-a-new-frontier-for-world-models/
Now, it’s worth spending some time talking about open source software and what it means to the AI race between the US and China. There are a ton of links below in the executive summary, but here are the main points:
Likely on the advice of the tech companies, the White House last week made open source AI development a national priority in order to keep up with the Chinese models. For a decade, the United States led the open source development race; in 2025, however, China has begun to dominate with the strongest models. As of last week, four of the top five open source models were Chinese.
A petition is circulating called the American Truly Open Models (ATOM) Project. After Meta’s Llama models lost their lead, the petition pushes for consensus that frontier companies should continue releasing leading open source options.
This week OpenAI released two strong open source models that have reopened the debate over whether the US is back in the lead. Regardless of the U.S. status, the gap between the best closed frontier models and the leading open source models is now only about 10%!
The OpenAI open source models are remarkable because they can be run locally on a MacBook Pro. This is a really big deal for privacy fanatics. Technical users can also fine-tune the models for their own needs.
One notable element of the two locally run open source models from OpenAI is that they have access to web browsing tools and can read websites and follow links between pages. Another harbinger of the death of the organic page view.
While the OpenAI open models are getting a lot of praise, some critics have noticed that the models seem to be tailored to beat benchmarks. While they excel at some tasks, they’re almost worthless in other common sense cases. We’ll know more in the coming weeks.
Hugging Face released a 200 page playbook on how companies can train their own private models using open source software. It’s a fantastic resource.
Former Google DeepMind researchers at Reflection AI are in discussions to raise over $1 billion to build a new open source model.
In addition to the open-source news, last week brought a lot of ethical headlines that are worth walking through.
Anthropic cut off OpenAI’s access to Claude, claiming that OpenAI was violating their API terms.
Over 100 Nobel laureates, professors, whistleblowers and organizations have signed an open letter calling on OpenAI to provide more information about their corporate restructuring plans. OpenAI has been all over the map when it comes to their nonprofit versus for-profit status. It’s worth reading the open letter which is also in the summaries below.
OpenAI became an official vendor for US government agencies and will now provide ChatGPT access to all federal employees for just one dollar per year per agency. This means that every federal employee now has essentially a PhD at their fingertips, assuming they use it wisely and ethically. We are entering a new era of potential misuse and security risks, in addition to potential positive outcomes.
The Prime Minister of Sweden has been transparent that he often uses GPT for a second opinion when he has questions or ideas about what he should do. I’m happy that he is transparent about it, because I do the same thing and think everyone should use GPT for second opinions. However, there is also a risk of security breaches if the advice gets too close to policy decisions.
North Carolina’s Department of the State Treasurer tested GPT for three months and dramatically reduced the time operational tasks take. For example, a 90-minute audit was reduced to 30 minutes, and some tasks that used to take 20 minutes now take only 20 seconds. The study, conducted by North Carolina Central University, showed that 85% of participants had a positive experience and saved between 30 and 60 minutes every day.
As AI use becomes more common across consumers, as well as the government, experts are beginning to caution that the next election cycle is at risk for tremendous misinformation.
President Trump’s media company has integrated Perplexity’s AI search tools, allowing users to search the web with AI.
OpenAI put out a statement that they are dedicated to quality of user experience rather than engagement, attention, and addiction. I am unsure how I feel about OpenAI’s veracity or ability to be candid; right now in my mind, they are somewhere between Google and Meta. That said, I appreciate that they have at least made a gesture toward not getting people addicted to their product.
In speaking to my friends, I’ve discovered that I rarely chat with AI. Instead, I have a task I need to achieve and I execute it surgically. I’m still digesting the fact that people sit down and have a conversation with AI. That’s not anywhere near how I use it. I’m thankful for that to be honest.
Speaking of veracity and conflicts of interest, I’ve been wary of the fact that Internet backbone provider Cloudflare is positioning itself as an AI firewall and protector of privacy and training data. Back in the day, Akamai sold its consumer tracking tools (Akamai ADS) to MediaMath because it did not want to get into the business of arbitraging its position as a backbone and cache for additional profit.
Cloudflare’s marketing began by saying that they could block AI agents that were ignoring robots.txt. Then it evolved to attempting to enable commission structures for publisher payments. And now Cloudflare almost seems to be actively trying to block the efficiency of AI beyond just crawlers and training.
This has manifested in a battle between Cloudflare and Perplexity. Cloudflare argues that Perplexity is circumventing transparent web usage as an AI agent, but if I don’t feel like browsing the Internet and I want an agent to do it for me, I should have that right without Cloudflare trying to take a position on my behalf. This will become especially important as mundane tasks that would have usually required a lot of clicking around can now simply be delegated to an agent.
I think the agents will win handily in the long game, just as the internet ate up or transformed essentially every legacy business before it.
Conversely, yet very much related, Google is claiming that referrals from search engines to websites have remained stable year over year… even though everything I’ve read shows that traffic is down as much as 50%.
I can’t imagine how Google’s AI search overviews are driving traffic to websites. I trust Google about as far as I can throw them. And that’s from 20 years of working with them closely and being almost constantly disappointed.
Frontier model companies continue to position themselves as stewards of education.
Anthropic signed the Pledge To America’s Youth, which commits to advancing AI education.
Much like OpenAI’s recently launched study mode, Gemini launched Guided Learning to help students learn and study better and build deep understanding of subjects, including the why and how rather than just getting an answer.
Google also created an AI for education accelerator that provides free AI training and career certificates to college students in the United States. This includes a $1 billion commitment to AI literacy.
Students 18 and over can get Gemini Pro FREE for one year.
This week saw a lot of developments with agents and copilots:
Google launched Gemini Deep Think, a consumer version of the reasoning system that recently achieved gold-medal performance at the International Mathematical Olympiad.
ByteDance released an open source agent for software engineering called Trae Agent.
ByteDance also released a math proof system that is breaking records on benchmarks. It performed 4x better than the rest of the leading models.
Google launched its coding copilot Jules. It’s too early to see if this is the Claude Code killer. I’m still using GPT for most things and Claude for debugging, because I’m not heavily into coding. Most of my serious coder friends use Claude in the command line.
Singapore start-up Manus announced Wide Research which empowers multiple agents across multiple models (as a wrapper). It’s supposedly incredibly powerful for dauntingly scaled tasks like researching all Fortune 100 companies at the same time.
Perplexity partnered with OpenTable to power agents to book reservations. This sort of small breakthrough will add up and change our world, once we can simply talk to our phones and get mundane challenges accomplished easily.
Andreessen Horowitz is backing EliseAI, which makes AI voice agents for property management and healthcare, at a $2B valuation!
ElevenLabs announced their redesigned interface for AI voice conversation agents. I’m very excited about this one.
Shopify continues to roll out conversational commerce agents.
Conversational agents will be common in a year or less, I would guess. It will be fun to see what Amazon and Apple do alongside these new competitors/APIs.
Anthropic released Opus 4.1 (up from 4) with improvements to coding and reasoning.
AI leaders have lately been routinely (if not intentionally) aligned in their message of democratizing knowledge:
“someday soon something smarter than the smartest person you know will be running on a device in your pocket, helping you with whatever you want. this is a very remarkable thing.” -Sam Altman
“Something I think about a lot: who knows how many brilliant ideas never saw the light of day because ‘I don’t know how to do that.’ Pretty crazy to think that with AI everyone now has a reasonable VC advisor, coder, or professor on hand to teach you about anything you want” -Mustafa Suleyman
Last week I found myself sheepishly saying “I think SaaS is dead.” I felt self-conscious.
Then to my surprise Sam Altman tweeted “entering the fast fashion era of SaaS very soon”.
Two weeks ago, Runway announced Aleph, a powerful video tool. More great examples have poured in, and a few are below (the rest are in the full executive summary section).
Also a few weeks ago, Google released their video tool Veo 3. Examples keep coming in and they are worth watching.
This week’s humanities reading is an excerpt from the end of Don Quixote by Miguel de Cervantes:
Here lies the noble fearless knight,
Whose valor rose to such a height;
When Death at last did strike him down,
His was the victory and renown.
He reck’d the world of little prize,
And was a bugbear in men’s eyes;
But had the fortune in his age
To live a fool and die a sage.
Full Executive Summaries with Links, Generated by Claude 4
ChatGPT approaches 700 million weekly users in rapid growth
ChatGPT is set to reach 700 million weekly active users this week, marking a significant increase from 500 million users at the end of March and representing a fourfold growth compared to last year. The AI chatbot’s expanding user base reflects its growing adoption by individuals and teams who use it for learning, creative tasks, and problem-solving. This milestone demonstrates the rapid mainstream acceptance of conversational AI tools, as millions integrate ChatGPT into their daily workflows for various personal and professional applications.
This week, ChatGPT is on track to reach 700M weekly active users — up from 500M at the end of March and 4× since last year. Every day, people and teams are learning, creating, and solving harder problems. Big week ahead. Grateful to the team for making ChatGPT more useful and… / X https://x.com/nickaturley/status/1952385556664520875
Tech giants’ AI spending now drives more economic growth than consumers
The seven largest technology companies spent over $100 billion on AI infrastructure like data centers in just the past three months, contributing more to U.S. economic growth in the last six months than all consumer spending combined. This massive investment in AI capabilities by companies like Microsoft, Google, and Amazon represents an unprecedented shift in what drives the American economy, as businesses race to build the computing power needed for artificial intelligence systems.
Christopher Mims 🤌 on X: “The AI infrastructure build-out is so gigantic that in the past 6 months, it contributed more to the growth of the U.S. economy than /all of consumer spending/ The ‘magnificent 7’ spent more than $100 billion on data centers and the like in the past three months *alone* 1/🧵 https://t.co/sHMK1zI0sP” / X https://x.com/mims/status/1951256592642441239
Meta plans to sell $2 billion in data center assets to share AI costs
Meta Platforms is selling $2 billion worth of data center land and construction assets to bring in partners who can help pay for the expensive infrastructure needed to run artificial intelligence systems. The company reclassified these assets as “held-for-sale” in June and expects to transfer them to a third party within the next year for joint data center development. This move represents a significant change for tech companies, which traditionally funded their own growth but now face enormous costs for AI infrastructure. Meta’s CEO Mark Zuckerberg has described plans to build AI data center “superclusters” that would each cover a significant portion of Manhattan’s footprint, requiring hundreds of billions of dollars in investment. The company raised its annual spending forecast to between $66 billion and $72 billion, though executives said AI-improved advertising is helping offset these rising infrastructure costs.
Meta to share AI infrastructure costs via $2 billion asset sale | Reuters https://www.reuters.com/business/meta-share-ai-infrastructure-costs-via-2-billion-asset-sale-2025-08-01/
OpenAI seeks secondary sale that would triple its valuation to $500 billion
OpenAI is reportedly negotiating a secondary sale that would value the company at $500 billion, a massive jump from its current $300 billion valuation and the $157 billion it achieved in last year’s funding round. The sale, which would allow current and former employees to sell their shares, comes as the company’s annual revenue has surged from $10 billion in June to $13 billion and is expected to reach $20 billion by year-end. The timing coincides with hints that OpenAI will announce GPT-5 tomorrow, with early testers reporting the model shows strong improvements in science, coding and math capabilities, though the leap won’t be as dramatic as from GPT-3 to GPT-4. The potential valuation reflects broader investor enthusiasm for AI companies, with competitors Anthropic, Mistral AI and Cohere also reportedly seeking major funding rounds at significantly higher valuations.
OpenAI reportedly in talks for secondary sale at $500B valuation – SiliconANGLE https://siliconangle.com/2025/08/06/openai-reportedly-talks-secondary-sale-500b-valuation/
US prioritizes open-source AI to compete with China’s growing influence
The White House has made open-source AI development a national priority after Chinese models like DeepSeek-R1 gained widespread adoption among American developers and researchers. China’s recent open-source AI releases have become the most popular models globally, with thousands of US companies now building on Chinese foundations rather than American ones. This represents a major shift from 2016-2020 when the US led in open-source AI development, but American tech giants have since locked their best models behind proprietary APIs. The administration recognizes that falling behind in open-source AI could mean losing the broader AI race, as open models drive faster innovation, allow for security auditing, and prevent vendor lock-in. Industry leaders are calling for renewed US commitment to open AI development through companies like Meta and research institutions to ensure American leadership in this critical technology.
Why open-source AI became an American national priority | VentureBeat https://venturebeat.com/ai/why-open-source-ai-became-an-american-national-priority/
US launches initiative to compete with China in open AI models
The United States has fallen behind China in developing and adopting open-source AI models, despite initially leading with Meta’s Llama system. The American Truly Open Models (ATOM) Project aims to address this gap by building support for US-based open AI development. Open models allow anyone to access, modify, and build upon AI technology, making them crucial for innovation and preventing any single country or company from controlling AI advancement. The project highlights growing concerns that America’s early advantage in accessible AI technology is slipping away to international competitors.
America needs to take open models more seriously. This summer the early lead in open model adoption of the US via Llama has been overtaken by Chinese models. With The American Truly Open Models (ATOM) Project we’re looking to build support and express the urgency of this issue. https://x.com/natolambert/status/1952370970762871102
I signed this because, despite worrying about misuse of open models more than most, I would like that to be the bottleneck rather than “is it beneficial to big companies commercially/reputationally etc.” There are many benefits to the US investing here. https://x.com/Miles_Brundage/status/1952400404668657966
very excited by the ATOM project / X https://x.com/finbarrtimbers/status/1952401883391520794
Meta’s Llama 4 struggles reshape global AI development landscape
The underperformance of Meta’s Llama 4 language model has triggered significant changes in the artificial intelligence industry. The model’s shortcomings have pushed open-source AI development leadership toward China, as Western companies struggle to maintain competitive alternatives. Many organizations that previously relied on running Llama models locally have been forced to switch to proprietary, closed-source systems due to the lack of viable upgrades. This shift has also intensified competition for AI talent in the United States, as companies scramble to recruit experts who can help them develop competitive models independently rather than relying on Meta’s open-source offerings.
The relative failure of Llama 4 turned out to be very consequential to the AI landscape. It led to shifting the locus of open weights development to China, a move towards closed models as companies running local Llama couldn’t continue to upgrade, & big talent wars in the US. / X https://x.com/emollick/status/1951433537485500476
OpenAI’s new model narrows gap between open and closed AI systems
OpenAI’s latest release has sparked debate about whether open-source AI models are catching up to proprietary ones. While some argue that the US now has competitive open-weight models, others point out that Chinese open-source models still outperform Western alternatives in certain areas. The performance gap appears to be shrinking, with reports suggesting that advanced closed models like GPT-5 are only about 10% better at coding tasks than open-weight models that can run on consumer hardware. This development raises questions about whether OpenAI will continue releasing open models and what it means for the timeline to artificial general intelligence. Some observers suggest that if major companies like Anthropic don’t produce significantly better models soon, AGI may be further away than previously thought.
Did yesterday’s release shift the needle in the open vs. closed debate? Today in @ReedAlbergotti’s newsletter https://x.com/fdaudens/status/1953147586312872057
“OpenAI / America is still ahead in the race” -> no. There is no western open-source model that beats or ties the best chinese open-source models. / X https://x.com/scaling01/status/1952900225120780705
The US now likely has the leading open weights models (or close to it)… … but the real question is whether this is a one-off situation from OpenAI, in which case the lead will evaporate quickly as others catch up. But also unclear what their incentives are to keep updating. / X https://x.com/emollick/status/1952836130958917894
It seems the closed-source vs open-weights landscape has been leveled. GPT-5 is just 10% better at coding than an open-weight model you can run on a consumer desktop and soon laptop. If Anthropic cannot come up with a good model, then we will probably not see AGI for a while. / X https://x.com/Tim_Dettmers/status/1953521836299350494
Reflection AI seeks billion-dollar funding to build open-source models
Reflection AI, a one-year-old startup founded by former DeepMind researchers, is reportedly in discussions to raise over $1 billion in funding. The company aims to develop open-source artificial intelligence models that would compete directly with established players like DeepSeek, Meta, and Mistral. This significant fundraising effort highlights the growing competition in the open-source AI space, where companies are racing to create powerful language models that developers can freely access and modify. The involvement of DeepMind veterans suggests the startup has strong technical expertise, while the massive funding target indicates investor confidence in the open-source AI market’s potential.
Late night scoop w/ @KevKubernetes @nmasc_: Reflection AI, the 1-yr-old startup founded by DeepMind researchers, is in talks to raise $1B+ as it looks to develop open-source models to compete with DeepSeek, Meta and Mistral: https://x.com/steph_palazzolo/status/1952555858761588892
OpenAI releases powerful open-source language models for local use
OpenAI has released gpt-oss, a collection of open-source language models that match the performance of their o4-mini model while running entirely on personal devices. The release includes two versions: a 120-billion parameter model that runs on high-end laptops and a smaller 20-billion parameter version that works on smartphones. These models represent a significant shift in AI accessibility, allowing users to run advanced language processing locally without relying on cloud services or internet connections. The models quickly gained traction in the developer community, reaching the top spot on Hugging Face, a popular AI model repository, within just two hours of release. This development marks an important step toward democratizing AI technology by giving users full control over powerful language models on their own hardware.
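As a concrete illustration of “running entirely on personal devices”: local runners such as Ollama expose an OpenAI-compatible chat endpoint on localhost, so querying a local gpt-oss looks almost identical to calling the cloud API. A minimal sketch, assuming Ollama’s default port and a `gpt-oss:20b` model tag (check your runner’s model list):

```python
# Sketch of querying a locally served gpt-oss model through an
# OpenAI-compatible endpoint (e.g. Ollama's, by default on port 11434).
# The "gpt-oss:20b" tag is an assumption; use whatever your runner reports.
import json
import urllib.request

def build_chat_payload(prompt: str, model: str = "gpt-oss:20b") -> dict:
    """Standard OpenAI-style chat-completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }

def ask_local(prompt: str, base: str = "http://localhost:11434/v1") -> str:
    """POST to the local server; requires the model to be pulled already."""
    req = urllib.request.Request(
        f"{base}/chat/completions",
        data=json.dumps(build_chat_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__":
    # Print the request body only; calling ask_local needs a running server.
    print(json.dumps(build_chat_payload("Summarize this week's AI news.")))
```

Because the payload shape matches the hosted API, switching between cloud and fully local inference is mostly a matter of changing the base URL, which is the privacy win the summary describes.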
🚨 It’s official: OpenAI’s gpt-oss-120b & gpt-oss-20b just landed on Hugging Face! Brand new open-weight LLMs ready for anyone to try, fine-tune, and run anywhere. Here’s what makes this drop a big deal: https://x.com/fdaudens/status/1952781183575593234
And just like that, @OpenAI gpt-oss is now the number one trending model on @huggingface, out of almost 2M open models 🚀 People sometimes forget that they’ve already transformed the field: GPT-2, released back in 2019 is HF’s most downloaded text-generation model ever, and https://x.com/ClementDelangue/status/1952827283808375168
BREAKING: OpenAI just released two open-weight models: gpt-oss-120b and gpt-oss-20b. The 120B model is on par with o4-mini on reasoning benchmarks and can run on a single 80GB GPU. The 20B model achieves similar results to o3-mini and can run on edge devices with 16GB of https://x.com/rowancheung/status/1952777754904072566
Frontier models, capable of agentic reasoning, can now run on your Macbook Pro 🧑💻 @OpenAI’s release of GPT-OSS 20B and 120B are the biggest releases in open-source this year. Build agentic workflows with @llama_index that run 100% locally! Huge props to @LoganMarkewich and https://x.com/jerryjliu0/status/1952883595787239563
gpt-oss for entirely local tool use: / X https://x.com/gdb/status/1952802157956350221
gpt-oss https://gpt-oss.com/
gpt-oss is a big deal; it is a state-of-the-art open-weights reasoning model, with strong real-world performance comparable to o4-mini, that you can run locally on your own computer (or phone with the smaller size). We believe this is the best and most usable open model in the… / X https://x.com/sama/status/1952778518225723434
gpt-oss is out! we made an open model that performs at the level of o4-mini and runs on a high-end laptop (WTF!!) (and a smaller one that runs on a phone). super proud of the team; big triumph of technology. / X https://x.com/sama/status/1952777539052814448
gpt-oss-120b & gpt-oss-20b Model Card | OpenAI https://openai.com/index/gpt-oss-model-card/
Introducing gpt-oss | OpenAI https://openai.com/index/introducing-gpt-oss/
Just released gpt-oss: state-of-the-art open-weight language models that deliver strong real-world performance. Runs locally on a laptop! https://x.com/gdb/status/1952780717638942910
Open models by OpenAI | OpenAI https://openai.com/open-models/
Well, it took just 2 hours for OSS-GPT to hit #1 on @huggingface. Don’t remember seeing anything rise that fast! https://x.com/fdaudens/status/1952814865795698954
A hypothesis: gpt-oss is trained entirely on synthetic data, from pre-training to post-training. The approach enhances safety and helps smaller models achieve better performance.”” / X https://x.com/huybery/status/1952905224890532316
attention is 0.84% of gpt oss, intelligence is stored in those 99.16% mlp layer, attn is key to unlock it https://x.com/shxf0072/status/1953143243992166849
curious about the training data of OpenAI’s new gpt-oss models? i was too. so i generated 10M examples from gpt-oss-20b, ran some analysis, and the results were… pretty bizarre time for a deep dive 🧵 https://x.com/jxmnop/status/1953899426075816164
Everyone is sleeping on AMD for local models – gpt-oss 20B running on an AMD GPU @ 52 tok/sec in a <$1000 laptop https://x.com/dzhng/status/1953132623280165193
GPT-OSS-120B casually calculating the product of two random 30-digit numbers. without any tools, just 18k tokens https://x.com/scaling01/status/1952892387539259455
I think gpt-oss was always expected to be put in an agent harness that uses search for all its world knowledge. Ive always argued this is not a valid replacement, the rich connections it builds from actual backprop on the worlds knowledge – not just facts, but the aggregate… / X https://x.com/Teknium1/status/1953230352568467761
I’m thrilled @OpenAI has released two open weight models. Thank you to all my friends at OpenAI for this gift! I’m also encouraged that from my quick tests gpt-oss-120b looks strong (though we should still wait for rigorous 3rd party evals). / X https://x.com/AndrewYNg/status/1952838045235126510
I’ve written the full story of Attention Sinks — a technical deep-dive into how the mechanism was developed and how our research ended up being used in OpenAI’s new OSS models. For those interested in the details: https://x.com/Guangxuan_Xiao/status/1953656755109376040
ICYMI: you can vibe test the latest gpt-oss models on gpt-oss[.]com 💥 We partnered with @OpenAI to bring easy access to the model right down to a browser near you! https://x.com/reach_vb/status/1953041435999010916
Ollama and @nvidia collaborate to accelerate gpt-oss on GeForce RTX and RTX PRO GPUs. NVIDIA and Ollama are advancing their partnership to boost model performance on NVIDIA GeForce RTX and RTX PRO GPUs. This collaboration enables users on RTX-powered PCs to accurately leverage https://x.com/ollama/status/1952782326926328313
Our new @OpenAI open models https://x.com/polynoamial/status/1952778238368887184
RT @ggerganov: Llama.cpp supports the new gpt-oss model in native MXFP4 format The ggml inference engine (powering llama.cpp) can run the… / X https://x.com/ggerganov/status/1952978670328660152
RT @OpenAIDevs: Student credits for gpt-oss With @huggingface, we’re offering 500 students $50 in inference credits to explore gpt-oss… / X https://x.com/reach_vb/status/1953010091377958984
RT @satyanadella: Excited to bring OpenAI’s gpt-oss models to Azure AI Foundry and to Windows via Foundry Local. It’s hybrid AI in action… / X https://x.com/xikun_zhang_/status/1952902211278913629
Thank you @OpenAI for open-sourcing these great models! 🙌 We’re proud to be the official launch partner for gpt-oss (20B & 120B) – now supported in vLLM 🎉 ⚡ MXFP4 quant = fast & efficient 🌀 Hybrid attention (sliding + full) 🤖 Strong agentic abilities 🚀 Easy deployment 👉🏻 / X https://x.com/vllm_project/status/1952784530466849091
We fixed some issues for @OpenAI’s gpt-oss model! 1. Jinja template has extra \n s, didn’t parse thinking sections + tool calling wasn’t rendered correctly 2. Some versions miss <|channel|>final -> this is a must! 3. F16 infs: use F32+BF16! We made a few free Colab notebooks as https://x.com/danielhanchen/status/1953901104150065544
OpenAI models gain web browsing and code execution abilities
OpenAI has enhanced its GPT models with two built-in tools that significantly expand their capabilities. The models can now browse the web to search for information, read websites, follow links between pages, and cite their sources – similar to how a person would research topics online. They also include an interactive Python notebook that allows the models to write and run code directly, enabling them to perform calculations, analyze data, and create visualizations. These additions transform the models from text-only systems into more versatile assistants that can gather real-time information and solve computational problems, making them more practical for tasks like research, fact-checking, and technical analysis.
The gpt-oss models have been post-trained to use two specific first-party tools: 1. a web browser that can search, read pages, follow links, and cite sources 2. an interactive python notebook This will give gpt-oss based agents super powerful capabilities out of the box! https://x.com/corbtt/status/1952810876165312805
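The harness side of a setup like this can be sketched as a small dispatch loop that routes the model’s tool calls to local implementations. Everything below is an illustrative assumption — the tool names ("python", "browser.search") and the message shape are made up for the sketch, not OpenAI’s actual harmony format:

```python
# Illustrative sketch of the kind of tool-dispatch loop a gpt-oss harness
# might run. Tool names and the tool_call dict shape are assumptions for
# illustration, not OpenAI's actual harness format.
import io
import contextlib

def run_python(code: str) -> str:
    """Execute a model-written snippet and capture its stdout (no sandboxing here)."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, {})
    return buf.getvalue()

def dispatch(tool_call: dict) -> str:
    """Route a model-issued tool call to the matching local tool."""
    name, args = tool_call["name"], tool_call["arguments"]
    if name == "python":
        return run_python(args["code"])
    if name == "browser.search":
        # A real harness would query the web here; we return a stub result.
        return f"[stub results for {args['query']!r}]"
    raise ValueError(f"unknown tool: {name}")

# Example: the model asks the notebook tool to run a calculation.
result = dispatch({"name": "python", "arguments": {"code": "print(6 * 7)"}})
```

A production harness would add sandboxing around `exec` and a real search backend, but the loop shape — parse tool call, run tool, feed the result back to the model — is the same.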
OpenAI’s GPT-OSS model shows mixed performance in early testing
Early users of OpenAI’s new GPT-OSS-120B model report inconsistent performance, with the system excelling at mathematical problems and benchmarks while struggling with practical tasks like coding and creative writing. The model scored 41.8% on the Aider Polyglot coding test, significantly below competitors like Kimi-K2 (59.1%) and DeepSeek-R1 (56.9%), and only slightly better than the much smaller Qwen3 32B model (40.0%). Users describe erratic behavior where the model switches between professional-level coding and making up basic facts that it refuses to correct, leading some to question its real-world usefulness beyond academic benchmarks. The consensus among early testers suggests the model lacks common sense and practical judgment despite its strong performance on standardized tests.
GPT-OSS models seem to be slopmaxxed on math/coding and reasoning – they are great at that but they completely lack taste and common sense at least that’s my vibe so far https://x.com/scaling01/status/1952881329772564764
holy shit get ready for a hallucination fiesta with gpt-oss https://x.com/scaling01/status/1952781018554933261
I was just about to make a post that GPT-OSS-120B is nontheless an overall good for the very low end. But I honestly don’t know what it is good at, except benchmarks. Coding seems to suck, creative writing is terrible… So it’s just a math model? https://x.com/scaling01/status/1953047913954791696
i’ve spent the last couple hours talking to gpt-oss and can safely say it’s unlike any model i’ve tested one second it’s coding for me at a professional level, the next it’s making up basic facts and clinging to them no matter what i say something very strange is going on https://x.com/jxmnop/status/1953216881361600729
Is it over for gpt-oss ? What are these Aider Polyglot scores? https://x.com/scaling01/status/1952780629772321257
It’s looking bad bois.. Aider Polyglot results for GPT-OSS-120B: 41.8% for comparison: Kimi-K2: 59.1% DeepSeek-R1: 56.9% Qwen3 32B: 40.0% https://x.com/scaling01/status/1953047534122713130
Anthropic cuts off OpenAI’s access to Claude API over terms violations
Anthropic revoked OpenAI’s API access to its Claude models this week, citing violations of terms of service that prohibit using the service to build competing products or train rival AI models. According to sources, OpenAI staff had been using Claude’s coding tools through developer access to evaluate its capabilities against their own models and test safety responses, particularly as OpenAI reportedly prepares to release GPT-5 with improved coding abilities. While OpenAI called the benchmarking “industry standard” and noted their API remains available to Anthropic, Anthropic said it would continue providing OpenAI access specifically for safety evaluations and benchmarking purposes. The move follows a pattern of tech companies restricting competitor access to their services, including Anthropic’s recent cutoff of AI coding startup Windsurf after rumors of a potential OpenAI acquisition.
@aidan_mclau This take is sad to see but you might not have full context. We cut OpenAI’s access for violating our API terms and for the heavy usage of Claude Code among OAI tech staff. We’re going to continue providing API access for safety evals and benchmarking. That’s important to us. https://x.com/sammcallister/status/1951642025381511608
Anthropic Revokes OpenAI’s Access to Claude | WIRED https://www.wired.com/story/anthropic-revokes-openais-access-to-claude/
Nobel laureates and experts demand transparency from OpenAI about restructuring
Over 100 Nobel laureates, professors, whistleblowers, public figures, artists, and nonprofit organizations have signed an open letter calling on OpenAI to provide clear information about its corporate restructuring plans. The group is requesting that the artificial intelligence company be transparent about changes to its organizational structure, which could affect its original nonprofit mission and governance. The letter represents growing concern among academics and civil society about how OpenAI’s evolution from a nonprofit research organization to a more commercially-focused entity might impact its commitment to developing AI safely and for the benefit of humanity.
🚨 Breaking: A group of 100+ Nobel laureates, professors, whistleblowers, public figures, artists, and nonprofit organizations just released a letter asking OpenAI to tell the truth about its restructuring. Here’s what they had to say: 🧵 https://x.com/TheMidasProj/status/1952326634981543979
OpenAI becomes official vendor for US government agencies
OpenAI has secured approval as an official AI vendor for the U.S. government and will provide ChatGPT access to all federal employees through a partnership with the General Services Administration (GSA). The company is offering the AI assistant to federal agencies for just $1 per year per agency, making the technology available across the entire federal workforce. This arrangement aims to help government employees use AI tools for their work while maintaining the privacy and security standards required for government operations. The partnership represents a significant expansion of AI adoption in the public sector, potentially affecting how millions of federal workers complete their daily tasks.
America’s hardest problems need the world’s most capable AI. OpenAI is now officially an approved U.S. Government AI vendor. We’re bringing privacy, security, and innovation to the nation’s most critical missions. 🇺🇸 https://x.com/cryps1s/status/1952749787994112275
In partnership with the Government Services Administration, we are providing ChatGPT to the entire U.S. federal workforce for essentially no cost for the next year. https://x.com/gdb/status/1953120865115074805
OpenAI for the U.S. government: https://x.com/gdb/status/1952756538399228091
Providing ChatGPT to the entire U.S. federal workforce | OpenAI https://openai.com/index/providing-chatgpt-to-the-entire-us-federal-workforce/
we are providing ChatGPT access to the entire federal workforce! (for $1 a year per agency) https://x.com/sama/status/1953103336044990779
Federal agencies grapple with employee AI adoption challenges
A new analysis reveals that many government employees are already using AI tools, often without official guidance or oversight. This widespread informal adoption raises critical questions about how federal agencies will manage these technologies to improve services rather than create new problems. The situation highlights a gap between grassroots AI usage by staff and the need for leadership to establish proper frameworks, training, and innovation labs within agencies. Without coordinated strategies from agency leadership and dedicated innovation teams, the benefits of AI could be lost to inefficiency, security risks, or misuse.
The giant question is: now that The Crowd in government has access to AI tools (which, given representative surveys, many were already using) how are they going to be used to make things better, not worse? Where are Leadership & The Lab inside agencies? https://x.com/emollick/status/1953118449611272575
Swedish Prime Minister adopts ChatGPT for government communications and tasks
Sweden’s Prime Minister has begun using ChatGPT to assist with various government functions, marking one of the first instances of a national leader publicly adopting AI tools for official duties. The integration includes using the AI assistant for drafting communications, analyzing policy documents, and streamlining administrative tasks. This move reflects growing acceptance of AI technology in government operations and could influence how other world leaders approach digital transformation. While specific details about security measures and usage guidelines remain limited, the adoption signals a shift toward AI-assisted governance in democratic nations.
ChatGPT for helping the Swedish Prime Minister: https://x.com/gdb/status/1952111193868673335
North Carolina state employees cut task times from minutes to seconds with AI
North Carolina’s Department of State Treasurer tested ChatGPT for three months and found it dramatically reduced work time for public employees. Tasks that previously took 20 minutes were completed in 20 seconds, while a 90-minute audit review was cut to 30 minutes. The independent study by N.C. Central University showed 85% of participating employees had positive experiences and saved 30-60 minutes daily. Employees used the AI tool to draft communications, summarize long documents, translate technical information into plain language, and explore new problem-solving approaches. The report emphasized that ChatGPT enhanced rather than replaced human judgment, with employees applying their expertise to refine AI-generated results.
ChatGPT for speeding up North Carolina public servants (e.g. reducing some tasks from 20 minutes to 20 seconds): https://x.com/gdb/status/1951376444363514100
State Treasurer Briner: “OpenAI Report Shows Many Benefits, Offers Great Promise” | NC Treasurer https://www.nctreasurer.gov/news/press-releases/2025/08/01/state-treasurer-briner-openai-report-shows-many-benefits-offers-great-promise
AI technology poses major risks for 2028 presidential election
A technology expert warns that artificial intelligence tools could dramatically impact the 2028 U.S. presidential election in unprecedented ways. The concern centers on how AI could be used to spread misinformation, create convincing fake content, or manipulate voters at a massive scale. The expert suggests that current safeguards and public awareness are insufficient to handle these emerging threats, calling for immediate discussions about protective measures. This warning reflects growing anxiety among technologists about AI’s potential to disrupt democratic processes, particularly as the technology becomes more sophisticated and accessible over the next few years.
honestly scared about the power and scale of ai technologies that’ll be used in the upcoming 2028 presidential election. it could be a civilizational turning point. we aren’t ready. we should probably start preparing, or at least talking about how we could prepare. https://x.com/DavidSHolz/status/1952541453491867792
Truth Social partners with Perplexity to add AI search capabilities
Donald Trump’s media company has integrated Perplexity’s AI search technology into Truth Social, allowing users to search the web directly from the platform’s browser version. The partnership, announced Wednesday, is currently in public beta testing. Trump Media CEO Devin Nunes, who also chairs the President’s Intelligence Advisory Board, described the addition as strengthening Truth Social’s role in what he called the “Patriot Economy.” The integration marks Truth Social’s entry into AI-powered search features, following the trend of social media platforms incorporating artificial intelligence tools to enhance user experience.
Trump Is Launching an AI Search Engine Powered by Perplexity https://www.404media.co/trump-is-launching-an-ai-search-engine-powered-by-perplexity/
OpenAI designs ChatGPT to support users’ wellbeing and productivity
OpenAI is reshaping ChatGPT to be a tool that enhances users’ lives rather than capturing their attention. The company has introduced features to help during difficult times, added break reminders to prevent overuse, and is developing improved life advice capabilities. These changes are being guided by expert input to ensure the AI assistant supports users in achieving their goals while maintaining healthy usage patterns. The focus represents a shift toward building AI that prioritizes user wellbeing over engagement metrics.
We build ChatGPT to help you thrive in the ways you choose — not to hold your attention, but to help you use it well. We’re improving support for tough moments, have rolled out break reminders, and are developing better life advice, all guided by expert input. https://x.com/OpenAI/status/1952414411131671025
What we’re optimizing ChatGPT for | OpenAI https://openai.com/index/how-we’re-optimizing-chatgpt/
Perplexity challenges Cloudflare’s stance on AI agents and web access
Perplexity has issued a strong response to Cloudflare’s position on AI agents accessing websites, arguing that AI agents are simply extensions of human users and should be treated as such. The dispute centers on whether AI agents should have different access rights than human users when browsing the web. Perplexity’s rebuttal suggests that Cloudflare’s leadership either misunderstands fundamental AI concepts or is taking a stance that prioritizes appearance over substance, with the company stating that Cloudflare’s position shows they are “more flair than cloud.”
RT @balajis: Good rebuttal to Cloudflare by Perplexity. The core point is that an AI agent is just an extension of a human. So when it mak… https://x.com/jeremyphoward/status/1952818615578968265
The bluster around this issue reveals that Cloudflare’s leadership is either dangerously misinformed on the basics of AI, or simply more flair than cloud. https://x.com/perplexity_ai/status/1952532113095643185
Google defends AI search amid publisher traffic decline concerns
Google is pushing back against reports that its AI search features are harming website traffic, claiming that overall clicks from its search engine to websites have remained “relatively stable” year-over-year. The company’s VP of Search, Liz Reid, argues that while some sites are seeing decreased traffic, others are gaining, with users increasingly seeking out forums, videos, and social content for authentic perspectives. However, Google hasn’t provided specific data to support these claims, and independent studies show concerning trends – one report found that news searches resulting in zero clicks to publishers grew from 56% to 69% between May 2024 and May 2025. The company acknowledges that user behavior is shifting, with younger users often starting searches on TikTok, Instagram, or Reddit instead of Google, suggesting that changes in web traffic patterns may reflect broader shifts in how people use the internet rather than just the impact of AI features.
Google denies AI search features are killing website traffic | TechCrunch https://techcrunch.com/2025/08/06/google-denies-ai-search-features-are-killing-website-traffic/
Hugging Face releases comprehensive guide for companies to build custom AI models
Hugging Face has published a 200-page “Ultra-Scale Playbook” that teaches companies how to train their own large language models similar to DeepSeek R1, Llama, or GPT-5. The guide covers advanced technical concepts like 5D parallelism, which distributes the massive training workload across many processors along several axes at once. The company argues that just as every tech company writes its own software code, it should also be able to train its own AI models, viewing artificial intelligence as the next evolution of software development. This democratization of AI training knowledge could enable more organizations to develop specialized models tailored to their specific needs rather than relying solely on general-purpose models from major tech companies.
Every tech company can and should train their own deepseek R1, Llama or GPT5, just like every tech company writes their own code (and AI is no more than software 2.0). This is why we’re releasing the Ultra-Scale Playbook. 200 pages to master: – 5D parallelism (DP, TP, PP, EP, https://x.com/ClementDelangue/status/1952048356710039700
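As a toy illustration of the first axis named in that tweet (“DP”, data parallelism): split a batch across workers, compute each shard’s gradient locally, then average the results — the role an all-reduce plays in real training. This is a conceptual sketch with a one-parameter linear model, not code from the playbook:

```python
# Toy data parallelism: scatter a batch, compute per-shard gradients,
# then average them ("all-reduce"). Purely conceptual; real training
# does this with frameworks, not plain Python.
def gradient(shard: list, w: float) -> float:
    # d/dw of mean squared error for the model y = w * x with targets y = 2 * x.
    return sum(2 * (w * x - 2 * x) * x for x in shard) / len(shard)

def data_parallel_step(batch: list, w: float, workers: int = 4) -> float:
    shards = [batch[i::workers] for i in range(workers)]  # scatter the batch
    grads = [gradient(s, w) for s in shards if s]         # local compute per worker
    return sum(grads) / len(grads)                        # all-reduce: average

# With equal-sized shards this reproduces the single-worker gradient exactly.
g = data_parallel_step([1.0, 2.0, 3.0, 4.0], w=0.0)
```

The other parallelism axes (tensor, pipeline, expert, …) slice the model itself rather than the data, which is where the playbook’s 200 pages come in.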
OpenAI launches $500K challenge to test open-source model safety
OpenAI has announced a $500,000 red teaming competition inviting researchers and developers worldwide to find security vulnerabilities in its newly released open-source gpt-oss models. Participants will search for novel risks and potential weaknesses in the systems, with their findings judged by experts from OpenAI, Anthropic, Google, UK AISI, and Apollo. The challenge aims to improve AI safety by identifying and fixing problems before they can cause harm, with the results used to strengthen the models’ defenses against misuse or unexpected behaviors.
Red teamers assemble! ⚔️💰 We’re putting $500K on the line to stress‑test just released open‑source model. Find novel risks, get your work reviewed by OpenAI, Anthropic, Google, UK AISI, Apollo, and help harden AI for everyone. https://x.com/woj_zaremba/status/1952886644090241209
We’re launching a $500K Red Teaming Challenge to strengthen open source safety. Researchers, developers, and enthusiasts worldwide are invited to help uncover novel risks—judged by experts from OpenAI and other leading labs. https://x.com/OpenAI/status/1952818694054355349
Tech companies pledge to expand AI education for American students
Over 100 organizations have committed to a new initiative called the Pledge to America’s Youth, which aims to teach AI and cybersecurity skills to students across the United States. The participating companies will partner with schools, educators, and local communities to develop programs that help young people understand and work with artificial intelligence technology. This effort addresses growing concerns that students need these technical skills to prepare for future jobs, as AI becomes increasingly important in many industries. The pledge represents one of the largest coordinated efforts to date to ensure American students have access to AI education, though specific details about funding, curriculum, and implementation timelines have not been announced.
We joined the Pledge to America’s Youth along with 100+ organizations committed to advancing AI education. We’ll work with educators, students, and communities nationwide to build essential AI and cybersecurity skills for the next generation. https://x.com/AnthropicAI/status/1953864587192770921
Google introduces Guided Learning feature in Gemini for deeper understanding
Google has launched Guided Learning in Gemini, a new educational feature that acts as a personal learning companion rather than just providing quick answers. The tool uses open-ended questions, step-by-step breakdowns, and multimodal content including images, videos, and quizzes to help users actively engage with subjects and build deep understanding. Developed in partnership with educators and learning experts since 2022, the feature is powered by LearnLM models that incorporate educational research and learning science principles. Students can use it for exam preparation, writing papers, or exploring personal interests, while teachers can easily share it through Google Classroom to encourage critical thinking. The feature creates a judgment-free conversational space where learners can explore topics at their own pace, representing Google’s shift from simply answering questions to fostering genuine comprehension and skill development.
Guided Learning in Gemini: From answers to understanding https://blog.google/outreach-initiatives/education/guided-learning/
Here are new @GeminiApp tools to help you learn, understand and study better this school year ✏️ – Guided Learning helps you build a deep understanding of subjects, with step-by-step breakdowns that uncover the “why” and “how” – Gemini’s responses automatically integrate https://x.com/Google/status/1953143185011617891
Google commits one billion dollars to AI education initiatives
Google has announced a major education initiative that will provide free AI training and Google Career Certificates to college students across the United States through its new AI for Education Accelerator program. The tech giant is committing $1 billion over the next three years to support AI literacy programs, research, and other educational efforts. This investment aims to prepare students for careers in artificial intelligence by giving them access to professional training and certification programs at no cost. The initiative represents one of the largest corporate investments in AI education to date and could help address the growing demand for workers with AI skills across industries.
New: The Google AI for Education Accelerator will provide free AI training & Google Career Certificates to college students in the U.S. We’re also committing $1 billion to AI literacy, research and more over the next 3 years → https://x.com/Google/status/1953126394847768936
ChatGPT launches study mode to help adults relearn math
OpenAI has introduced a new Study Mode feature in ChatGPT designed to help users learn subjects like algebra through interactive tutoring. The mode acts as a personal tutor, breaking down complex math concepts into manageable steps and providing practice problems with detailed explanations. Users can ask questions, work through examples at their own pace, and receive personalized feedback on their work. The feature aims to make learning math less intimidating for adults who struggled with it in school or need to refresh their knowledge. Early users report that the conversational approach and patience of the AI tutor helps them understand concepts they previously found difficult.
ChatGPT study mode for learning algebra as an adult: https://x.com/gdb/status/1951792801143980238
I bombed algebra in high school. ChatGPT’s new Study Mode is my redemption arc 😅 https://x.com/sharongoldman/status/1950988509352743014
Google releases Deep Think for Gemini app subscribers
Google has launched Deep Think, an advanced AI feature for Google AI Ultra subscribers that uses parallel thinking techniques to solve complex problems. The tool, available through the Gemini app, represents an improved version of technology that achieved gold-medal performance at the International Mathematical Olympiad. Deep Think excels at tasks requiring creativity and strategic planning, including web development, scientific research, and coding challenges. The system works by extending “thinking time” to explore multiple ideas simultaneously before arriving at optimal solutions. Google is also providing select mathematicians access to the full competition-level model for research purposes. Users can activate Deep Think through a toggle in the Gemini app’s prompt bar, with the feature automatically integrating tools like code execution and Google Search.
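The “parallel thinking” idea can be illustrated with a generic best-of-n pattern: explore several candidate solutions concurrently, score each with a verifier, and keep the best. This is a conceptual sketch only — the proposal and scoring functions are stand-ins, not Google’s actual Deep Think mechanism:

```python
# Generic best-of-n sketch of parallel thinking: run n independent
# "reasoning paths" concurrently, then select the highest-scoring candidate.
from concurrent.futures import ThreadPoolExecutor

def propose(seed: int) -> int:
    # Stand-in for one independent reasoning path producing a candidate answer.
    return (seed * 37) % 100

def score(candidate: int) -> int:
    # Stand-in for a critic/verifier: here, closeness to a target value of 50.
    return -abs(candidate - 50)

def best_of_n(n: int = 8) -> int:
    """Run n proposal paths in parallel and return the highest-scoring one."""
    with ThreadPoolExecutor() as pool:
        candidates = list(pool.map(propose, range(n)))
    return max(candidates, key=score)

answer = best_of_n(8)
```

The trade-off is the one the reviews below note: spending more compute per question buys better answers but caps how many questions you can ask per day.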
“”Claude Code can now handle long-running tasks in the background. Start your dev server, run tests, or build your project without blocking your workflow https://x.com/_catwu/status/1953926541370630538
ByteDance launches open source AI agent for software engineering
ByteDance, the Chinese company behind TikTok, has released Trae Agent, an AI tool that helps developers write and manage software code through simple English commands. The system uses large language models to understand what users want to build and can handle complex programming tasks through an interactive command-line interface. It works with popular AI services from OpenAI and Anthropic, and the company has made the entire codebase freely available as open source software, allowing anyone to use, modify, or improve it.
Gemini 2.5 Deep Think is state-of-the-art performance across many challenging benchmarks!”” / X https://x.com/demishassabis/status/1951468051578142848
Gemini 2.5: Deep Think is now rolling out https://blog.google/products/gemini/gemini-2-5-deep-think/
Gemini Deep Think, our SOTA model with parallel thinking that won the IMO Gold Medal 🥇, is now available in the Gemini App for Ultra subscribers!! Should we put it in the Gemini API next? https://x.com/OfficialLoganK/status/1951260803459338394
not enough people are talking about the delta between the parallel thinking uplifts of oai vs gdm AIME o3 pro: +3% (from 90->93 on 2024) deep think: +11.2% (from 88->99.2 on 2025) Knowledge o3 pro: +3% (on GPQA) deep think: +13.2% (on HLE) Coding o3 pro: +9.1% (on Codeforces https://x.com/swyx/status/1951460518293807241
Played with Deep Think, a dramatic improvement for Google. It is getting close to O3 Pro – I’d say, a solidly second best model right now. Far less verbose! With limits of about 10 a day, not ready for the professional use, though. https://x.com/MParakhin/status/1952028947153371631
Google launches AI coding assistant Jules with new pricing tiers
Google has officially launched Jules, its AI-powered coding assistant, after a two-month beta period that saw thousands of developers complete over 140,000 code improvements. The tool, which runs on Gemini 2.5 Pro, works differently from competitors by operating asynchronously – meaning developers can assign it tasks and walk away while it clones repositories, analyzes code, and implements fixes in the background. Google introduced a free tier limited to 15 daily tasks, with paid plans at $19.99 and $124.99 monthly offering higher limits. During beta testing, 45% of the 2.28 million visits came from mobile devices, prompting Google to explore mobile-specific features. The company also clarified its privacy policy, confirming that private repository data won’t be used for AI training, while public repository data may be used.
China’s ByteDance just released an LLM-based agent for general purpose software engineering tasks. Trae Agent comes with an interactive CLI that can execute complex workflows using simple English prompts. It works with OpenAI and Anthropic API. 100% opensource. https://x.com/Saboo_Shubham_/status/1942047679758151783
Manus launches Wide Research for large-scale parallel computing tasks
Manus AI has released Wide Research, a feature that allows users to control multiple AI agents working in parallel to handle complex research tasks. The system lets users analyze hundreds of items simultaneously – like comparing Fortune 500 companies or MBA programs – through simple chat interactions. Unlike traditional multi-agent systems with fixed roles, each agent in Wide Research is a full Manus instance that can adapt to any task. The feature runs on Manus’s cloud computing platform, which provides each user session with a dedicated virtual machine. Wide Research is initially available to Pro subscribers, with plans to expand to other tiers. The company positions this as the first step in making supercomputing power accessible to non-technical users through conversational interfaces.
Google’s AI coding agent Jules is now out of beta | TechCrunch https://techcrunch.com/2025/08/06/googles-ai-coding-agent-jules-is-now-out-of-beta/
ByteDance’s SeedProver sets new benchmark for mathematical problem solving
ByteDance has released SeedProver, an AI system that achieved record-breaking performance on PutnamBench, a challenging mathematical problem-solving test. The model correctly solved 331 out of 657 problems, nearly four times better than previous leading systems, and maintained strong performance even with limited computing resources, solving 201 problems under lightweight conditions. SeedProver outperformed DeepMind’s AlphaGeometry2 and achieved perfect scores on certain mathematical tasks, demonstrating significant progress in AI’s ability to tackle complex mathematical reasoning that typically challenges even advanced mathematics students.
Introducing Wide Research https://manus.im/blog/introducing-wide-research
Perplexity adds restaurant booking through OpenTable partnership
Perplexity, the AI-powered search engine, has partnered with OpenTable to let users make restaurant reservations directly within its platform. This integration means people can search for restaurants and book tables without leaving Perplexity’s interface, combining the AI assistant’s ability to answer dining-related questions with immediate booking capabilities. The feature streamlines the process of finding and reserving restaurants by eliminating the need to switch between different apps or websites.
ByteDance dropped SeedProver. This model scored 331/657 on PutnamBench (nearly 4× better than the previous state of the art) and 201/657 under lightweight inference (pass@64‑256 equivalent). Its reported performance surpasses DeepMind’s AlphaGeometry2 and achieves 100% on https://x.com/cgeorgiaw/status/1952301113446699347
Anthropic upgrades Claude Opus with improved coding and reasoning abilities
Anthropic released Claude Opus 4.1, an update to their AI model that improves performance on coding tasks, research, and reasoning. The model achieved 74.5% accuracy on SWE-bench Verified, a benchmark that tests AI systems’ ability to solve real software engineering problems. Companies testing the update reported significant improvements, with GitHub noting better multi-file code refactoring capabilities and Rakuten Group highlighting the model’s precision in debugging large codebases without introducing new errors. Windsurf found the performance jump from Opus 4 to 4.1 was comparable to the improvement seen between previous major model versions. The update is available through Claude’s paid services and various cloud platforms at the same price as the previous version, with Anthropic recommending all users upgrade from Opus 4 to take advantage of the enhanced capabilities.
Need a table? Just ask. Perplexity is partnering with @OpenTable to bring restaurant reservations directly into Perplexity products. https://x.com/perplexity_ai/status/1952434779036774488
OpenAI CEO predicts pocket-sized AI will surpass human intelligence
OpenAI CEO Sam Altman has stated that artificial intelligence systems more capable than the smartest humans will soon run on personal devices like smartphones. In a recent social media post, Altman described this development as “very remarkable,” suggesting that these advanced AI assistants will be able to help users with any task they need. While he didn’t provide a specific timeline, his use of “someday soon” indicates he believes this breakthrough in portable AI technology is approaching rapidly. The prediction represents a significant leap from current AI capabilities, which typically require substantial computing power and internet connectivity to operate at high levels.
Claude Opus 4.1 (“”claude-leopard-v2-02-prod””) “”Opus 4.1 is here – Try our latest model for more problem solving power.”””” / X https://x.com/btibor91/status/1952366658326036781
Claude Opus 4.1 \ Anthropic https://www.anthropic.com/news/claude-opus-4-1
Claude Opus 4.1 beats GPT-5 on SWE bench https://x.com/Sauers_/status/1953504854044704973
Claude Opus 4.1 is available in Cursor! Let us know what you think.”” / X https://x.com/cursor_ai/status/1952782293925298655
Going live with the fellas @tbpn in an hour to talk about Opus 4.1 and Claude Code”” / X https://x.com/alexalbert__/status/1952801100299681959
AI tools democratize access to expert knowledge and skills
The widespread availability of AI assistants is breaking down traditional barriers to learning and creation. Where people once abandoned promising ideas due to lack of technical knowledge or access to experts, they can now tap into AI tools that serve as on-demand advisors, programmers, and teachers. This shift means that individuals no longer need formal training or personal connections to explore new fields, build prototypes, or understand complex topics. The technology essentially provides everyone with a knowledgeable mentor available 24/7, potentially unlocking countless innovations that would have otherwise remained unrealized due to knowledge gaps or resource constraints.
someday soon something smarter than the smartest person you know will be running on a device in your pocket, helping you with whatever you want. this is a very remarkable thing.” / X https://x.com/sama/status/1952879515287601465
AI tools are becoming as disposable as fast fashion
The software industry is experiencing a fundamental shift where AI-powered tools are being created, used briefly, and discarded at an unprecedented pace, much as fast fashion operates. This trend suggests that software-as-a-service (SaaS) products are moving away from long-term subscriptions toward temporary, single-purpose applications that users adopt for specific tasks and quickly abandon. The comparison to fast fashion raises concerns about sustainability, quality, and the waste of rapidly cycling through digital tools. It also highlights how dramatically AI has lowered the barriers to creating software: developers can now ship new applications as quickly as clothing retailers produce new styles.
Something I think about a lot: who knows how many brilliant ideas never saw the light of day because “I don’t know how to do that.” Pretty crazy to think that with AI everyone now has a reasonable VC advisor, coder, or professor on hand to teach you about anything you want” / X https://x.com/mustafasuleyman/status/1951323569905934427
Shopify launches AI shopping tools for conversational commerce platforms
Shopify has released new tools that allow AI assistants to search for products and complete purchases directly within chat conversations. The system works through three main components: a catalog search that finds products across Shopify merchants, a universal shopping cart that collects items from multiple stores, and a checkout system that processes payments while maintaining both the merchant’s and AI platform’s branding. The tools handle complex product variations like subscriptions and bundles automatically, and meet standard compliance requirements including GDPR and payment security standards. Currently in early access, developers must apply for approval to integrate these shopping capabilities into their AI applications.
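The three components described above (catalog search, universal cart, branded checkout) can be sketched as a single flow. Everything in this sketch is illustrative: the function names, the `Cart` class, and the toy catalog are stand-ins, not Shopify’s actual agentic-commerce API (the Shopify docs linked in this section describe the real interfaces).

```python
# Illustrative agentic-commerce flow: search across merchants, build one
# cart spanning multiple stores, then hand off to checkout. Hypothetical
# names throughout; only the shape of the flow comes from the summary.

class Cart:
    """Toy universal cart that can collect items from multiple merchants."""
    def __init__(self):
        self.items = []

    def add(self, item):
        self.items.append(item)

def run_shopping_flow(search, cart, checkout, query):
    """1) catalog search, 2) universal cart, 3) checkout with payment."""
    products = search(query)          # catalog search across Shopify merchants
    for product in products[:2]:      # agent picks items (here: first two hits)
        cart.add(product)             # cart may span more than one store
    return checkout(cart)             # co-branded checkout processes payment

# Toy stand-ins so the flow runs end to end.
catalog = {
    "espresso kit": [
        {"sku": "A1", "merchant": "RoastCo", "price": 48.0},
        {"sku": "B7", "merchant": "BeanBarn", "price": 35.0},
    ]
}

order = run_shopping_flow(
    search=lambda q: catalog.get(q, []),
    cart=Cart(),
    checkout=lambda c: {"status": "paid", "total": sum(i["price"] for i in c.items)},
    query="espresso kit",
)
```

The point of the sketch is the separation of concerns: the AI platform owns the conversation, while search, cart, and checkout stay behind stable interfaces the agent calls into.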
entering the fast fashion era of SaaS very soon” / X https://x.com/sama/status/1952084574366032354
AI voice startup EliseAI reaches $2 billion valuation with Andreessen backing
EliseAI, a company that develops AI voice agents for property management and healthcare industries, has secured funding from venture capital firm Andreessen Horowitz at a $2 billion valuation. The investment highlights growing investor interest in AI voice technology, which allows businesses to automate phone conversations with customers. EliseAI’s software can handle tasks like scheduling apartment tours, answering tenant questions, and managing healthcare appointments through natural-sounding phone conversations. The significant valuation reflects strong demand for voice AI solutions that can reduce staffing costs while maintaining customer service quality in industries that rely heavily on phone communications.
Agentic commerce has arrived https://shopify.dev/docs/agents
ElevenLabs launches redesigned interface for AI voice conversation agents
ElevenLabs has unveiled a completely redesigned Conversational Agents page that showcases their latest AI voice technology. The new interface represents a significant shift in the company’s brand direction and provides users with access to more advanced AI agents capable of natural voice conversations. The redesign emphasizes the growing importance of voice-based AI interactions and positions ElevenLabs’ agents as more sophisticated tools for businesses and developers looking to integrate conversational AI into their products.
Earlier this summer, we told you that AI voice agents are hot. For an idea of just how hot: Andreessen Horowitz is backing EliseAI, which makes AI voice agents for property mgmt + healthcare, at a $2B valuation. w/ @srimuppidi @coryweinberg https://x.com/steph_palazzolo/status/1952740505747382364
OpenAI launches GPT-5 with built-in reasoning capabilities
OpenAI released GPT-5, which automatically decides when to think deeply about complex problems versus responding quickly to simple questions. The system uses a smart router that chooses between a fast model for basic tasks and a deeper reasoning model called “GPT-5 thinking” for harder problems, eliminating the need for users to manually switch between different AI models. The company claims GPT-5 reduces hallucinations by 45% compared to GPT-4o and scores significantly higher on coding, math, and health-related benchmarks. GPT-5 becomes the default model for all ChatGPT users, replacing previous versions, with paid subscribers getting higher usage limits and access to GPT-5 pro for the most complex tasks. The model shows particular improvements in creating functional websites and apps from single prompts, following complex instructions more reliably, and providing more accurate health information while being less overly agreeable than previous versions.
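The routing idea in that description can be illustrated with a toy sketch: a cheap classifier inspects each request and picks a tier before any expensive model runs. This is purely conceptual, not OpenAI’s implementation; the heuristics are invented, though the tier names match those quoted from the GPT-5 system card elsewhere in this section.

```python
# Conceptual sketch of a model router: estimate difficulty cheaply, then
# dispatch to a fast model or a deeper reasoning model. The signal list and
# threshold are placeholder heuristics, not how GPT-5's router actually works.

HARD_SIGNALS = ("prove", "step by step", "debug", "optimize", "why")

def route(prompt: str) -> str:
    """Return the model tier that should serve this request."""
    words = prompt.split()
    looks_hard = len(words) > 40 or any(s in prompt.lower() for s in HARD_SIGNALS)
    return "gpt-5-thinking" if looks_hard else "gpt-5-main"
```

For example, `route("What is 2+2?")` picks the fast tier, while `route("Debug this memory leak step by step")` escalates to the reasoning tier. The practical consequence for users is the one the summary notes: difficulty estimation moves from manual model selection into the system itself.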
The design team at @elevenlabsio is making big moves 🚀 We’ve released a brand new Conversational Agents page, redesigned from the ground up for the most powerful version of AI Agents yet. It’s a big deal because it represents a new direction for the ElevenLabs brand – a https://x.com/RomaTesla/status/1949808534595526806
Runway releases Aleph video AI with improved scene consistency
Runway has launched Aleph, its latest AI video generation tool that addresses one of the biggest challenges in AI-generated videos: maintaining consistency across different scenes. The system demonstrates its capabilities through various examples, including transforming a woman riding a snail into different scenarios – from nighttime settings to mechanical creatures being chased by police cars. Users are already experimenting with the tool in creative ways, such as transitioning footage from shooting ranges to battlefields using FPV glasses, and integrating it with 3D software like Blender for more complex workflows. The release includes both web and API access, along with community showcases and weekly challenges through Discord to encourage user experimentation.
💥 It’s here! GPT-5 is rolling out in ChatGPT for everyone, starting today. It’s a 🤯 good model, and we’ve simplified the UI alongside it. No more choosing between gpt-4o and o4-mini. When you ask a hard question and the model needs to think hard, it does. When it can give you” / X https://x.com/kevinweil/status/1953502681181618277
A compilation of experiences I made with GPT-5 in one shot. The poem camera app is particularly impressive because the model came up with all the details, like the way the photos stack in the gallery, the photo developing animation, etc https://x.com/skirano/status/1953516768317628818
AMA with @sama + some members of the GPT-5 team Tomorrow 11am PT. https://x.com/OpenAI/status/1953548075760595186
ChatGPT Plus subscription lost 95% of its value over night and this is without accounting for the loss of GPT-4.5” / X https://x.com/scaling01/status/1953782641838190782
Codex CLI + GPT-5:” / X https://x.com/gdb/status/1953556751762288653
Does OpenAI not do basic integration testing? At the time of release, the first code sample provided in the GPT-5 docs could not be run, because someone accidentally deleted the `output_text` property. My CI notified me. Why didn’t theirs? https://x.com/jeremyphoward/status/1953610071654772985
going to try live-tweeting the GPT-5 livestream. first, GPT-5 in an integrated model, meaning no more model switcher and it decides when it needs to think harder or not. it is very smart, intuitive, and fast. it is available to everyone, including the free tier, w/reasoning!” / X https://x.com/sama/status/1953502614676811865
GPT-5 (medium reasoning) is the new leader on the Short Story Creative Writing benchmark! GPT-5 mini (medium reasoning) is much better than o4-mini (medium reasoning). Claude Opus 4.1 shows gains over Opus 4. https://x.com/LechMazur/status/1953658077300875656
GPT-5 (medium reasoning) sets a new record on the Confabulations/Hallucinations on Provided Texts benchmark! https://x.com/LechMazur/status/1953582063686434834
GPT-5 claims #1 spot on LiveBench https://x.com/scaling01/status/1953602929375813677
gpt-5 for long context reasoning:” / X https://x.com/gdb/status/1953747271666819380
GPT-5 gets 74.9 on SWE-bench. Wonder what the budget per task is. https://x.com/OfirPress/status/1953502998627221519
GPT-5 Hands-On: Welcome to the Stone Age https://www.latent.space/p/gpt-5-review
GPT-5 in the high reasoning setting hit the 100K token limit for our evaluations on 10/290 Tier 1-3 samples (3%). This means our evaluation might slightly underestimate the reasoning capabilities of GPT-5.” / X https://x.com/EpochAIResearch/status/1953615908695314564
GPT-5 is extremely sensitive to instructions. Either give it demonstrations or tell it explicitly how you want the output. Avoid doing both. If you do, GPT-5 will override the examples with your output instructions. Sharing more just in case you face this issue:” / X https://x.com/omarsar0/status/1953876255037612531
GPT-5 is here – and it’s #1 across the board. 🥇#1 in Text, WebDev, and Vision Arena 🥇#1 in Hard Prompts, Coding, Math, Creativity, Long Queries, and more Tested under the codename “summit”, GPT-5 now holds the highest Arena score to date. Huge congrats to @OpenAI on this https://x.com/lmarena_ai/status/1953504958378356941
GPT-5 is here! 🚀 For the first time, users don’t have to choose between models — or even think about model names. Just one seamless, unified experience. It’s also the first time frontier intelligence is available to everyone, including free users! GPT-5 sets new highs across” / X https://x.com/ElaineYaLe6/status/1953607005144506454
GPT-5 is here. Rolling out to everyone starting today. https://x.com/OpenAI/status/1953504357821165774
GPT-5 is live in Cline. We’ve been working with OpenAI to get this model ready, and here’s our take: it’s disciplined, persistent, & highly competent. It’s collaborative in planning & and a diligent operator while acting. It plans thoroughly, asks optioned follow-ups when https://x.com/cline/status/1953525433808695319
GPT-5 is now available in Cursor. It’s the most intelligent coding model our team has tested. We’re launching it for free for the time being. Enjoy!”” / X https://x.com/cursor_ai/status/1953519580627742750
GPT-5 is now available on Perplexity and Comet for Max and Pro subscribers. Just ask. https://x.com/perplexity_ai/status/1953537170964459632
GPT-5 new SOTA on WeirdML beating o3-pro https://x.com/scaling01/status/1953919743842238472
GPT-5 only a 3% improvement over o3 at reproducing scientific papers https://x.com/scaling01/status/1953503883331846629
GPT-5 pricing is insane IT’S OVER https://x.com/scaling01/status/1953509084008710547
GPT-5 rollout updates: *We are going to double GPT-5 rate limits for ChatGPT Plus users as we finish rollout. *We will let Plus users choose to continue to use 4o. We will watch usage as we think about how long to offer legacy models for. *GPT-5 will seem smarter starting” / X https://x.com/sama/status/1953893841381273969
GPT-5 sentiment from the trenches (AKA 24 hours in Cline users’ hands): It’s a precision instrument, not a Swiss Army knife. Give it detailed prompts and it delivers exactly what you asked for — no tangents, no hallucinations about “finished” code. However, it’s less performant https://x.com/cline/status/1953898747928441017
GPT-5 sets a new record on FrontierMath! On our scaffold, GPT-5 with high reasoning effort scores 24.8% (±2.5%) and 8.3% (±4.0%) in tiers 1-3 and 4, respectively. https://x.com/EpochAIResearch/status/1953615906535313664
GPT-5 system card capability evals reactions thread. First observation: ~no improvement on all the coding evals that aren’t SWEBench https://x.com/eli_lifland/status/1953507434238288230
GPT-5 Thinking is less deceptive than o3 However when elicited to display deceptive behaviour it jumps to 28% https://x.com/scaling01/status/1953504438691221856
GPT-5 was doing 2B tokens per minute 3 hours after launch 🤯” / X https://x.com/kevinweil/status/1953649263411704195
GPT-5 with big improvements in Tau-Bench except the airline category https://x.com/scaling01/status/1953505637242974695
GPT-5 with high reasoning effort on SimpleBench https://x.com/scaling01/status/1953771276549358041
GPT-5: $0.625/$5.00 with flex pricing is ridiculous https://x.com/scaling01/status/1953517149768593903
GPT-5’s Router: how it works and why Frontier Labs are now targeting the Pareto Frontier https://www.latent.space/p/gpt5-router
Hallucinations are almost gone with GPT-5 https://x.com/scaling01/status/1953507569609134506
ICYMI, OpenAI released an insane amount of guides on how to use GPT-5. > Examples > Prompting guide > New features guide > Reasoning tips > Setting verbosity > New tool calling features > Migration guide And much more. https://x.com/omarsar0/status/1953583336603234726
If GPT-5 made this chart I’m bearish 😭 https://x.com/iScienceLuvr/status/1953503815292092904
In a new report, we evaluate whether GPT-5 poses significant catastrophic risks via AI R&D acceleration, rogue replication, or sabotage of AI labs. We conclude that this seems unlikely. However, capability trends continue rapidly, and models display increasing eval awareness. https://x.com/METR_Evals/status/1953525150374150654
Introducing GPT-5 | OpenAI https://openai.com/index/introducing-gpt-5/
Introducing GPT-5 Our best AI system yet, rolling out to all ChatGPT users and developers starting today. https://x.com/OpenAI/status/1953526577297600557
Long context reasoning performance: A stand out is long context reasoning performance as shown by our AA-LCR evaluation whereby GPT-5 occupies the #1 and #2 positions. https://x.com/ArtificialAnlys/status/1953507713222422866
Lots of excitement about GPT-5 in Codex CLI via your ChatGPT plan. Some details: 1. Yes, if you sign in with ChatGPT, usage is included via your paid plan! 2. Still determining exact rate limits, but the goal is to be generous: — Pro users should basically not hit limits” / X https://x.com/embirico/status/1953590991870697896
made a little Sankey to show you why I’m fuming ChatGPT Plus before vs after the GPT-5 release https://x.com/scaling01/status/1953780931552031056
Markets disappointed by GPT-5 OpenAI getting crushed on Polymarket https://x.com/scaling01/status/1953515099257282763
model switching in gpt-5 very cool!” / X https://x.com/sama/status/1953526708742537220
New in Notion AI’s toolbelt: @OpenAI’s GPT-5 It’s fast, thorough, and handles complex work 15% better than other models we’ve tested. A great choice for tasks with multiple moving parts. Gradual rollout starting today. https://x.com/NotionHQ/status/1953506907924443645
OpenAI GPT-5 System Card released “GPT-5 is a unified system with a smart and fast model that answers most questions, a deeper reasoning model for harder problems, and a real-time router that quickly decides which model to use based on conversation type, complexity, tool needs, https://x.com/iScienceLuvr/status/1953503173932724614
Priority Processing debuts with GPT-5. under-hyped imo for apps where millisecond matters, pay extra and get our fastest token speeds just add “service_tier”: “priority” to your requests https://x.com/jeffintime/status/1953857260729643836
Quick PSA. Settings for minimizing GPT-5 latency (time to first token). “service_tier”: “priority”, “reasoning_effort”: “minimal”, “verbosity”: “low”. P50 TTFT with these settings is ~750ms. With the defaults, it’s >3s. The default settings are the right starting point for https://x.com/kwindla/status/1953868672470331423
RT @lmarena_ai: GPT-5 is here – and it’s #1 across the board. 🥇#1 in Text, WebDev, and Vision Arena 🥇#1 in Hard Prompts, Coding, Math, Cre…” / X https://x.com/aidan_mclau/status/1953517672941158577
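Collected in one place, the latency settings from the post above look like this. The key names and values are copied from the tweet; whether your client passes them flat like this or nested differently depends on the SDK you use, so treat this as a sketch rather than canonical OpenAI API usage.

```python
# Low-latency request settings for GPT-5, as reported in the post above
# (~750ms P50 time-to-first-token vs >3s with defaults). Key names come
# from the tweet; the helper and its defaults are our own convenience.

LOW_LATENCY_OPTS = {
    "service_tier": "priority",      # paid fast lane (Priority Processing)
    "reasoning_effort": "minimal",   # skip deep thinking on simple turns
    "verbosity": "low",              # shorter answers, fewer output tokens
}

def request_options(model="gpt-5", **overrides):
    """Merge the low-latency defaults with per-call overrides."""
    return {"model": model, **LOW_LATENCY_OPTS, **overrides}
```

For example, `request_options(verbosity="high")` keeps the priority tier and minimal reasoning but asks for a longer answer; for voice or other millisecond-sensitive apps you would start from the defaults above.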
The GPT 5 launch included a chart showing 52.8 as a bigger number than 69.1, which in turn is shown as the same magnitude as 30.8. Not quite ASI… https://x.com/jeremyphoward/status/1953509671446196715
The straight up GPT-5 in Codex CLI fixed a bug in 3 minutes that I was working on for three or four hours this morning…can’t wait to try in Cursor.” / X https://x.com/sound4movement/status/1953583522587017345
Think harder is back! Routing changes in GPT-5 OpenAI means capability is moving from model selection to prompting https://x.com/dariusemrani/status/1953591404003045562
this is the detail of GPT-5 I’m most proud of GPT-4 launched at $30/$60, no cache discount since then, it’s been an unrelenting cross-team push to collapse the cost of intelligence. we’re nowhere near done” / X https://x.com/jeffintime/status/1953534466854453751
We are actively evaluating GPT-5 models on document understanding capabilities 🔎📄 – specifically screenshotting the page and feeding it into the model. A WIP preliminary finding is that even though on paper GPT-5 is $1.25 per 1M tokens, it uses 4-5x more tokens than GPT-4.1, https://x.com/jerryjliu0/status/1953582723672814054
We’re also releasing v0.16 of the Codex CLI today. – GPT-5 is now the default model – Use with your ChatGPT plan – A new, refreshed terminal UI `npm i -g @openai/codex` to update” / X https://x.com/OpenAIDevs/status/1953559797883891735
We’ve put together some guides on how to get started with GPT-5: 💬 Prompting guide: https://x.com/OpenAIDevs/status/1953528513480347840
What the hell man, this is such a lame way to technically not lie. «A unified system» is… literally just SEPARATE CoT + non-CoT models + a router. > OpenAI reasoning models, including gpt-5-thinking, gpt-5-thinking-mini, and gpt-5-thinking-nano > gpt-5-main just fuck off washed https://x.com/teortaxesTex/status/1953512363031757048
Google DeepMind unveils Genie 3 interactive world generator
Google DeepMind has released Genie 3, an AI system that creates interactive 3D environments from text descriptions. Users can explore these generated worlds in real-time at 24 frames per second, with the system maintaining visual consistency for several minutes at 720p resolution. The technology represents a significant advance from previous versions, with only seven months between Genie 2 and Genie 3. The system can simulate complex physical properties like water, lighting effects, and natural phenomena without using traditional 3D models or game engines. Researchers describe the experience as navigating through a controlled dream-like environment. DeepMind positions this as a stepping stone toward artificial general intelligence, as it enables training AI agents in unlimited simulated environments. The technology could eventually enable virtual reality experiences by generating offset views for each eye, bringing science fiction concepts like Star Trek’s holodeck closer to reality.
One of the big issues with AI videos is consistency across scenes. It isn’t there, but it is getting closer. This is Runway Aleph on a woman riding a snail… “it is night” “the snail is mechanical” “show me the front” “the snail is very fast and is being pursued by police cars” https://x.com/emollick/status/1951856889995653305
Runway Aleph transporting me from the range to the battlefield. Footage captured with FPV glasses. https://x.com/bilawalsidhu/status/1951433057665425837
The general release of Runway Aleph on both web and via API. A community showcase. And our weekly Discord challenge. Get caught up on what happened This Week with Runway. https://x.com/runwayml/status/1951634909501575659
Using @Blender and @runwayml Aleph. So many interesting things happening here. If you are exploring similar workflows, DM me. https://x.com/c_valenzuelab/status/1952419024291188794
Google DeepMind launches Veo 3 with native audio generation capabilities
Google DeepMind has released Veo 3, a video generation model that can create videos with synchronized sound effects, ambient noise, and dialogue without requiring separate audio tools. The model produces 4K resolution videos and demonstrates improved accuracy in following user instructions and depicting realistic physics. Early users have created diverse content including stop-motion animations, character-consistent scenes, and cinematic sequences, with the technology now available through APIs at $0.40 per second for the fast version. The system allows creators to specify detailed camera movements, shot compositions, and audio elements in their prompts, enabling production of content ranging from advertisements to short films entirely through text descriptions.
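Creators often structure those detailed camera and audio instructions as JSON. Below is a minimal sketch in that style, using the "shot" fields visible in the prompts shared later in this section; the "audio" block and the exact schema are assumptions based on community convention, not a documented Veo API contract.

```python
# A structured Veo 3 prompt in the community JSON style: a "shot" object for
# camera language plus an assumed "audio" object for Veo 3's native sound.
# Field names are illustrative, not an official schema.
import json

prompt = {
    "shot": {
        "composition": "POV first-person perspective, 35mm lens, shallow depth of field",
        "camera_motion": "handheld with natural inertia, subtle sway",
        "frame_rate": "24 fps",
    },
    "audio": {
        "dialogue": "none",
        "ambient": "wind and distant surf, no music",
    },
}

# Serialize for pasting into a prompt box or sending through an API call.
print(json.dumps(prompt, indent=2))
```

The structure matters less than the habit it encodes: separating composition, motion, and audio into named fields makes prompts easier to iterate on than one long sentence.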
Genie 2 vs Genie 3. Just 7 months between them. The bitter lesson continues to be bitter. https://x.com/bilawalsidhu/status/1952792880285896710
Genie 3 feels like playing a dream – a controlled hallucination of reality. Really makes you wonder if reality is just the same – except instead of just a few minutes, we can recall a whole lifetime. https://x.com/bilawalsidhu/status/1952895900390404231
Genie 3 generates interactive video in real-time. Just need to generate offset left/right eye views and you’ve got stereo VR worlds. No 3D models, no game engine – just generated dreams you can walk through.” / X https://x.com/bilawalsidhu/status/1953094066993803454
genie 3 is wild. imagine looking over at your reflection in the tv screen and it’s just you standing there with a gopro strapped to your head… 🤯” / X https://x.com/bilawalsidhu/status/1953158780835012881
Genie 3: A new frontier for world models – Google DeepMind https://deepmind.google/discover/blog/genie-3-a-new-frontier-for-world-models/
Genie-3 just achieved what AAA game engines do – but WITHOUT any 3D models. Interactive REAL-TIME video generation @ 24 fps Wild how this model figured out complex effects like exposure shifts, volumetric god rays, and phenomena we need to code explicitly in 3D engines TL;DR 🧵 https://x.com/bilawalsidhu/status/1952742891295764620
Google is flexing their AI muscles again. DeepMind has unveiled Genie 3, a real-time interactive, general-purpose world model that generates environments from text prompts, with visual memory extending as far back as one minute to keep scenes consistent. It could help advance https://x.com/TheHumanoidHub/status/1952801280059183210
hollup! so can you pre-load a chunk of a real video into genie’s world memory, so what you’ve seen IRL is actually what you see when you look around? can genie 3 basically do neural scene reconstruction in real time?! even if it’s not a “factual” rendition of the world like” / X https://x.com/bilawalsidhu/status/1953187618700574851
Introducing Genie 3, the most advanced world simulator ever created, enabled by numerous research breakthroughs. 🤯 Featuring high fidelity visuals, 20-24 fps, prompting on the go, world memory, and more. https://x.com/OfficialLoganK/status/1952732206176112915
One word: relentless. just in the past two weeks, we’ve shipped: 🌐 Genie 3 – the most advanced world simulator ever 🤔 Gemini 2.5 Pro Deep Think available to Ultra subs 🎓 Gemini Pro free for uni students & $1B for US ed 🌍 AlphaEarth – a geospatial model of the entire planet” / X https://x.com/demishassabis/status/1953887339094143156
RT @OriolVinyalsML: Incredible evolution of “Neural Video Games”: from GQN (2018) to Genie3 (2025). The future is exciting! https://x.com/demishassabis/status/1952890039643353219
Sparks of in-context learning in Genie 3. You can prompt Genie 3 with a video (e.g. Veo 3) then control from there. Genie 3 will mimic the dynamics. I think we have only scratched the surface of what can be done with prompting and post-training of foundational world models.” / X https://x.com/_rockt/status/1953117236975030653
We need to go deeper. Genie 3 is having its inception moment.” / X https://x.com/shlomifruchter/status/1953155882902274126
We’re entering the era of infinite AI training environments. Google DeepMind just announced Genie 3, the first real-time interactive world model that creates worlds from text prompts. The video below shows a controllable environment generated by Genie 3 in real time. Insane. https://x.com/rowancheung/status/1952732216959623583
World modeling for robotics is incredibly hard because (1) control of humanoid robots & 5-finger hands is wayyy harder than ⬆️⬅️⬇️➡️ in games (Genie 3); and (2) object interaction is much more diverse than FSD, which needs to *avoid* coming into contact. Our GR00T Dreams work was” / X https://x.com/DrJimFan/status/1952760780706984051
Grok 4 outperforms major AI models in reasoning and chess competitions
Grok 4 has outperformed other leading AI models in two high-profile head-to-head evaluations. The model achieved a 15.9% score on the ARC-AGI-2 test, which measures an AI’s ability to solve novel reasoning problems, well ahead of GPT-5’s 9.9%. Additionally, Grok 4 defeated Google’s Gemini AI in the semi-finals of the Kaggle AI Chess competition, advancing to the grand finale. These results suggest Grok 4 has made notable gains in both abstract reasoning and strategic game-playing, areas considered important indicators of artificial intelligence progress.
Veo – Google DeepMind – Veo 3 video generation lets you add sound effects, ambient noise, and even dialogue to your creations – generating all audio natively. It also delivers best in class quality, excelling in physics, realism and prompt adherence. https://deepmind.google/models/veo/
🌋 Volcano rock 🌊 Ocean wave ⚡ Storm cloud ASMR Made With #Veo3 Automated With My AI Autopilot! https://x.com/Mentor/status/1942016976827863103
🚨Veo3 Update is Here🚨 Wow, this will change how I make film with AI! With Google Veo3, you can now make yourself talk anything in any language, anywhere. What would you create?? https://x.com/herokominato/status/1942729320948256828
An alien vlogs his first Ahmedabad trip, discovering ‘Khalasi’. 👽 Google Veo 3 vividly realizes imagination, speaking Gujarati effortlessly with 100% auto-generated audio. #AI #Khalasi #GenerativeAI #Veo3 https://x.com/drashyakuruwa/status/1942647461522333777
Comparing Kling 2.1 with audio via Thinksound in @replicate (1st 10 sec vid) to Veo3 in Flow Studio (2nd 8 sec vid). Very impressed with Thinksound. Landscape design and real project photo (for i2v) by VizX Design Studio. https://x.com/Clearstory3D/status/1944505549543833656
I AM DYING 😭🤣 made this with Veo3. also side note this just goes to show that any AI brookejlacey video will never, ever be as good as the original. I will be the ultimate reality dealer, mark my words. https://x.com/brookejlacey/status/1944477615827611691
I made a stop-motion animation for Chanel No. 5 using Veo3. Prompt share: A claymation Paris street at midnight — the Eiffel Tower sparkles in the background, and a tiny clay bottle of Chanel No.5 tiptoes across cobblestones. Make sure everything used stop motion animation. https://x.com/crystalsssup/status/1942162692938354804
I made this in 4 hours! Veo 3 image to video character consistency. Midjourney for character creation and style. Runway ML for coverage. Veo 3 image to video with json prompts #AIart #veo3 #midjourneyv7 #runwayml #rockwilerai #ai #PromptEngineering https://x.com/therockwiler/status/1942883991117336812
It’s kinda crazy how easy it is to make ads with AI now! This is my very first try using VEO3. Not bad, right? 😉 https://x.com/agentsrihan/status/1942987346921533463
Just saw the vampire rap generated with Veo3 by my friend @WuxiaRocks. I didn’t know Veo3 can do such cool rap lipsync. Now, I want to create cool raps for brands I love. First up, @Netlify. @biilmann, what do you think?😉 https://x.com/zeng_wt/status/1943684922214171125
people are using veo3 to bring history to life in the form of vlogs 🤣 via HistoryVisualizedbyAI on YouTube https://x.com/tanayj/status/1934373978098778145
Prompt + image segmentation with @GoogleDeepMind VEO3 + @ultralytics 🚀 Here’s the simple workflow: ✅ Generated a video using VEO3 (prompt shared in the comments) ✅ Processed the clip directly with YOLOE for prompt-based image segmentation. Prompt in the comments👇 #AI https://x.com/muhammdrizwanmr/status/1941015898082468277
Quick play with Veo3 + Astra @topazlabs https://x.com/AllarHaltsonen/status/1941202785363788125
sailing like You’ve never seen before 🚤🌊 hyperreal waves, golden sunsets, & raw human emotion brought to life with cinematic precision. generated using #Veo3, now available on @moofeedcom. this isn’t just AI. It’s storytelling in motion. https://x.com/iUllr/status/1943956162874867858
The actual budget friendly launch trailer. Every shot generated with @GoogleDeepMind Veo3. With the philosophical idea that “Death isn’t the end, forgetting is”. We proudly introducing EzCall AI. Time to speak the words you never got to say https://x.com/zhaoyuWu8/status/1942285389651403102
This entire short film was made using AI. No actors. No cameras. Just prompts, imagination, and tech. Watch it now #AIshortfilm #AIFilmmaking #AhmedabadCrash #RadheWorks #GenAI #OpenAI #FutureOfCinema #CinematicAI #Veo3 https://x.com/punit19nov/status/1942220841657508003
This is AI … but with real actors! The Hollywood film is about to change. We made with Veo3, Runway Reference and Flux Kontext using Me and My friend’s @Jamesgulles_ performances. Will the future be shaped by AI creators or by filmmakers? https://x.com/herokominato/status/1941844050451243187
Two word VEO3 prompt experimentations: > Cat Kaleidoscope 🔊 I find this calming kind of like ASMR but with hypnotic video https://x.com/rBKeeper/status/1943202740945006659
Veo 3 Fast and Veo 3 image-to-video are now available in the API! 📹 Veo 3 Fast is $0.40 per second of video (with audio) and comes with production ready rate limits and has comparable quality in certain cases! https://x.com/OfficialLoganK/status/1950959720606396655
Veo3 (Fast) is actually much better with consistent character than Veo3 (Quality). Here’s a 4-scene video of a Japanese tight rope walker on top of a sky scraper. Prompts in the comment. https://x.com/juminoz/status/1942399268192674285
Veo3 fast { “shot”: { “composition”: “High-angle tracking shot from a helicopter, 200mm telephoto lens on a stabilized gimbal system, shot on RED Helium 8K S35”, “camera_motion”: “aerial tracking following the emus’ path, with gradual zoom-in”, “frame_rate”: https://x.com/IamEmily2050/status/1941126453715948005
Veo3 fast on the Gemini app 😀 { “shot”: { “composition”: “POV first-person perspective, 35mm lens, shot on ARRI Alexa Mini, shallow depth of field with focus pulls on the trainer’s face, sword, and gate”, “camera_motion”: “handheld with natural inertia—subtle sway https://x.com/IamEmily2050/status/1943206541496369569
Grok launches ultra-fast AI image and video generation tools
xAI has released Grok Imagine, a new AI feature that creates images and videos at unprecedented speeds. The tool, available through the Grok app, generates visual content so quickly that users report it produces images faster than they can scroll through the results. Initially offered as a free trial to US users, the feature is now available to all X Premium subscribers. Early users describe the generation speed and quality as exceptional, with the system capable of creating both still images and videos from simple text prompts.
Grok 4 is still state-of-the-art on ARC-AGI-2 among frontier models. 15.9% for Grok 4 vs 9.9% for GPT-5. https://x.com/fchollet/status/1953511631054680085
Grok-4 BEATS GPT-5 on ARC-AGI-2 https://x.com/scaling01/status/1953509485453902173
RT @cb_doge: 🚨 BREAKING: Grok 4 defeats Google’s Gemini in the Kaggle AI Chess semi-final and moves on to the grand finale! 🤖♟️🔥 https://t.…” / X https://x.com/hyhieu226/status/1953220787084902888
ElevenLabs expands into AI music generation with commercial licensing deals
ElevenLabs, known for its text-to-speech AI tools, has launched a music generation service that creates songs from text prompts. The company claims the AI-generated music is cleared for commercial use through licensing agreements with Merlin Network and Kobalt Music Group, which represent artists like Adele, Nirvana, Beck, and Childish Gambino. Artists must opt in to have their music used in AI training, and they receive a share of the revenue from the deals. The service comes with strict usage restrictions, prohibiting users from referencing specific artists, song titles, or lyrics in their prompts, and banning use in certain industries such as firearms, tobacco, and political campaigns. The launch follows legal challenges faced by competitors Suno and Udio, which were sued by the Recording Industry Association of America for allegedly training their models on copyrighted material without permission.
Grok 4 Imagine generates images faster than I can scroll. How is this even possible? It’s so good. 😭 https://x.com/tetsuoai/status/1951444393065586840
RT @elonmusk: For the next few days, Grok Imagine video generation is free to all US users! Download the Grok app and try it out. https://x.com/Yuhu_ai_/status/1953367318521655594
Super fast image & video generation via Imagine in the @Grok app is now available to all 𝕏 Premium users https://x.com/elonmusk/status/1952535613560983757
You guys need to try imagine mode on grok app. It’s incredible. https://x.com/tobi/status/1951789462268391749
Claude’s system prompt receives updates for improved performance
Anthropic has updated Claude’s system prompt to improve the assistant’s capabilities and responses. The changes aim to make Claude follow instructions more accurately, explain things more clearly, and behave more consistently across different types of tasks, with refinements to how it processes user requests, handles edge cases, and structures its output. The updates are part of Anthropic’s ongoing effort to make Claude more helpful and reliable across applications ranging from creative writing to technical problem-solving. The company has not disclosed every specific change but says the updates should produce more natural conversations and better alignment with user intent.
ElevenLabs launches an AI music generator, which it claims is cleared for commercial use | TechCrunch https://techcrunch.com/2025/08/05/elevenlabs-launches-an-ai-music-generator-which-it-claims-is-cleared-for-commercial-use/
Music Terms | ElevenLabs https://elevenlabs.io/music-terms
6 AI Visuals and Charts: Week Ending August 08, 2025
concerning https://x.com/DZhang50/status/1953510507631071658
FantasyPortrait: Enhancing Multi-Character Portrait Animation with Expression-Augmented Diffusion Transformers https://fantasy-amap.github.io/fantasy-portrait/
4d gaussian splat processed from one of the largest volumetric capture stages ever made. And fun fact, it was Intel that originally made it! I remember seeing the results in 2017, and my mind was blown. You could capture something once, and reframe it infinitely in post. Alas, https://x.com/bilawalsidhu/status/1952000783492186424
Relightable Full-body Gaussian Codec Avatars https://neuralbodies.github.io/RFGCA/
more tests reskinning google earth photogrammetry renders with runway’s aleph video-to-video ai model https://x.com/bilawalsidhu/status/1950717547206037511
i think the whole interactive video vs. explicit 3d debate is about to get supercharged this week. meanwhile, here’s me reskinning 3d gaussian splat renders with runway aleph. https://x.com/bilawalsidhu/status/1952489882024386819
Top 39 Links of The Week – Organized by Category
AGI
Ethan Mollick on X: “Lots of vague statements from leaders of the various AI labs about starting to see signs of self-improvement in AI systems (including Zuckerberg today), seems like proof that this is indeed happening would be pretty significant. (thanks o3 for providing details & saving me time) https://t.co/gl648hyzE0” https://x.com/emollick/status/1950801459915727309
ARVR
This is game engine 2.0. Some day, all the complexity of UE5 will be absorbed by a data-driven blob of attention weights. Those weights take as input game controller commands and directly animate a spacetime chunk of pixels. Agrim and I were close friends and coauthors back at https://x.com/DrJimFan/status/1952747404379504855
AgentsCopilots
Developers, brace yourselves. @lovable_dev just dropped a wild new AI agent — it builds apps, games, and tools in under 10 minutes. No code. Just prompts. I built 50+ working apps and games. Here’s what I tried: Bubble Shooter Game (Fully Playable) https://x.com/ketan_tayal16/status/1948724087418769465
This is crazy… Lovable’s new agent writes better code than humans, works non-stop, never quits …and costs less than Netflix. Here 8 apps people built in under an hour: 1. Hand controlled shooter game https://x.com/AngryTomtweets/status/1948655404160102876
The future of shopping is here, brought to you by @Shopify and @Copilot. Imagine having a perfect shopping assistant in your pocket. Shop with conversation, not clicks – you’ll never go back to endless scrolling. Very excited to partner with the Shopify team. Lots more to come! https://x.com/mustafasuleyman/status/1952804181061799961
Perplexity is using stealth, undeclared crawlers to evade website no-crawl directives https://blog.cloudflare.com/perplexity-is-using-stealth-undeclared-crawlers-to-evade-website-no-crawl-directives/
Audio
🚨The World’s Best Just Got Better — MiniMax Speech 2.5 is live – 40 languages – Voice cloning so real, it feels human – Accent, age, emotion — every detail, perfectly preserved. No more robotic voices. Just pure expression. 👉Try it now → https://x.com/MiniMax__AI/status/1953424332577026372
BusinessAI
Y Combinator on X: “https://t.co/0d2RAFb7in (@IronLedgerAI) provides AI agents for property accounting, starting with accounts payable. They’re already eliminating 100s of hours of work for 1000s of multifamily units every month. https://t.co/s1YWqqEAal https://t.co/UiUZnNRtlf” https://x.com/ycombinator/status/1950647587549089921
Andrew Tulloch – the man at Thinky who turned down Zuck’s $1.5B offer, had been at Meta for 11 years. Just like when Google lost Noam Shazeer, and had to pay $2.7B to get him back. Always know who your MVP is, and treat them right. Or pay 100X later, and still hear “hell no.” https://x.com/Yuchenj_UW/status/1951677714173477031
EducationAI
🚀@RiselyAI’s agents automate admin work across college campuses. Their first product, the AI Advisor, unifies student data to flag at-risk students, deliver personalized support, and improve retention. Congrats on the launch, @shahryarsabbasi, @sadiasaifuddin1 & @danial_asif! https://x.com/ycombinator/status/1950602852751253983
New AI Breakthrough from Google: Google developed a new active learning method that drastically reduces the amount of training data needed to fine-tune LLMs for complex tasks. “We describe a new, scalable curation process for active learning that can drastically reduce the https://x.com/Dr_Singularity/status/1953573112726839663
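The post above is truncated before it describes Google's actual curation process, so nothing here should be read as their method. At a high level, though, active learning means spending labeling effort on the examples the current model is least sure about. A minimal uncertainty-sampling sketch, with all names illustrative:

```python
# Minimal uncertainty-sampling sketch (illustrative only; NOT Google's method).
# From a pool of unlabeled examples, select the ones the current model is
# least confident about, so scarce labeling effort goes where it helps most.

def select_for_labeling(pool, predict_proba, budget):
    """Return the `budget` examples whose top-class probability is lowest."""
    scored = [(max(predict_proba(x)), x) for x in pool]
    scored.sort(key=lambda pair: pair[0])  # least confident first
    return [x for _, x in scored[:budget]]

# Toy stand-in for a binary classifier: treat each value as the model's
# positive-class probability, so confidence is distance from 0.5.
def toy_predict_proba(x):
    return [1 - x, x]

pool = [0.05, 0.45, 0.55, 0.95]
picked = select_for_labeling(pool, toy_predict_proba, budget=2)
print(picked)  # the two examples nearest the decision boundary
```

The point of the loop is the selection criterion: confidently classified examples (0.05, 0.95) are skipped, while borderline ones (0.45, 0.55) are sent for labeling.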
we’re building chatgpt to help you make progress, learn something new, and solve problems: https://x.com/gdb/status/1952431265749340212
Runway is now being used by hundreds of schools and universities all across the world by all kinds of students, from film to architecture, design, and engineering. We recently sat down with USC’s School of Cinematic Arts and UPenn Architecture to learn more about how they are https://x.com/c_valenzuelab/status/1951568696155017286
EthicsLegalSecurity
Ethan Mollick on X: “Plus, these judges are likely using free or default models. Even o1-preview significantly reduced hallucinations, let alone more recent or more grounded models. (Though, to be clear, AI should definitely not be used to create legal opinions from sitting judges at this point) https://t.co/WjryQVjSm9” https://x.com/emollick/status/1950742345546150237
A reporter asked me for my off-the-record take on recent safety research from Anthropic. After I drafted an off-the-record reply, I realized that I was actually fine with it being on the record, so: *** Since I never expected any of the current alignment technology to work in https://x.com/ESYudkowsky/status/1952422379478741301
🚨New prompting report, from us: Don’t bother with threats. Does threatening an AI really make it perform better (the way Google founder Brin claimed)? How about offering to tip the AI? We find no impact of threats or tips on average performance (but variance at question level) https://x.com/emollick/status/1951289250915221589
Swedish Prime Minister is using AI models “quite often” at his job. He says he uses it to get a “second opinion” and asks questions such as “what have others done?” At the moment he is not uploading any documents. IMO, when these models are capable of giving seemingly better https://x.com/rohanpaul_ai/status/1952025736111366590
Guys, I understand you like drama, but this is a remark about the AI development at large. We are seeing the plateau: just scaling up is coming to an end. For EVERYONE, not one company in particular. https://x.com/francoisfleuret/status/1953530837619630254
It’s story time, reimagined. Now you can create personalized, illustrated storybooks about anything, complete with read-aloud narration. Try Storybook in 3 easy steps: 1. Open Gemini at https://x.com/GeminiApp/status/1952770641133781255
Gemini Embeddings are already in production with thousands of customers in only ~2 weeks, crazy to see a space which has seen little momentum lately reignited by content engineering + new models. https://x.com/OfficialLoganK/status/1950947167524295080
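The Gemini Embeddings post above is about adoption rather than mechanics, but the most common downstream use of any embedding model is similarity scoring for search and retrieval. A minimal sketch, with toy hard-coded vectors standing in for real embeddings (in practice these would come from an embedding API such as Gemini's):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy 3-dimensional "embeddings" (real embedding vectors have hundreds to
# thousands of dimensions and come from a model, not hand-typed values).
doc_a = [0.20, 0.80, 0.10]
doc_b = [0.21, 0.79, 0.12]   # near-duplicate of doc_a
doc_c = [0.90, 0.05, 0.40]   # unrelated content

sim_ab = cosine_similarity(doc_a, doc_b)
sim_ac = cosine_similarity(doc_a, doc_c)
print(sim_ab > sim_ac)  # near-duplicates score higher than unrelated pairs
```

Ranking documents by this score against a query embedding is the core of semantic search, the workload the quoted post describes moving into production.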
How AI is helping advance the science of bioacoustics to save endangered species – Google DeepMind https://deepmind.google/discover/blog/how-ai-is-helping-advance-the-science-of-bioacoustics-to-save-endangered-species/
Imagery
Today we’re releasing our *HD* Video mode to Pro and Mega subscribers. It’s ~3.2x more expensive with ~4x more pixels than our default SD video outputs. This is for professionals that need the absolute highest quality footage possible out of Midjourney and we hope you enjoy it. https://x.com/midjourney/status/1953265002254921958
MicrosoftAI
Interestingly, an economics paper that came out in 2023 predicting which jobs would overlap most with AI turned out to be right. A new Microsoft study of actual AI use by workers (more on that in another post) found a 90% correlation between real world overlap & the predictions. https://x.com/emollick/status/1950931672968429835
OpenAI
In my experience, almost every important person who tells you that they are using AI is using GPT-4o, even if they have a premium account. https://x.com/emollick/status/1952120741815550200
The surprise deprecation of GPT-4o for ChatGPT consumers | Hacker News https://news.ycombinator.com/item?id=44839842
Introducing Stargate Norway | OpenAI https://openai.com/index/introducing-stargate-norway/
Jensen Huang congratulates OpenAI on Stargate Norway. It will run on GB300 Superchips, scaling to hundreds of thousands of GPUs – purpose-built for training, reasoning, and real-time inference. “Just as electricity and the internet became foundational to modern life, AI will https://x.com/vitrupo/status/1950828090260955165
I am agnostic about the quantitative size of the current health hazard of ChatGPT psychosis. I see tons of it myself, but I could be seeing a biased selection. I make a big deal out of ChatGPT’s driving *some* humans insane because it looks *deliberate*! https://x.com/ESYudkowsky/status/1951324864163487984
🤔 “(it’s a person demoing a local agent on their mac)” https://x.com/sama/status/1952767676922974463
We just removed a feature from @ChatGPTapp that allowed users to make their conversations discoverable by search engines, such as Google. This was a short-lived experiment to help people discover useful conversations. This feature required users to opt-in, first by picking a chat https://x.com/cryps1s/status/1951041845938499669
OpenSource
🚀We’re expanding the Tencent Hunyuan open-source LLM ecosystem with four compact models (0.5B, 1.8B, 4B, 7B)! Designed for low-power scenarios like consumer-grade GPUs, smart vehicles, smart home devices, mobile phones, and PCs, these models support cost-effective fine-tuning https://x.com/TencentHunyuan/status/1952262079051940322
Perplexity
Invisible × Perplexity: Infrastructure Meets Agentic Browser – Invisible https://www.getinvisible.com/articles/invisible-perplexity-infrastructure-meets-agentic-browser
Robotics
Elon says Tesla already has world model for Optimus. https://x.com/TheHumanoidHub/status/1952771309383077906
ScienceMedicine
With AI, researchers predict the location of virtually any protein within a human cell | MIT News | Massachusetts Institute of Technology https://news.mit.edu/2025/researchers-predict-protein-location-within-human-cell-using-ai-0515
We’re excited to introduce the Open Direct Air Capture 2025 dataset, the largest open dataset for discovering advanced materials that capture CO2 directly from the air. Developed by Meta FAIR, @GeorgiaTech, and @cusp_ai, this release enables rapid, accurate screening of carbon https://x.com/AIatMeta/status/1952477453857017948
TechPapers
I think everyone interested in AI should read the model cards for the frontier models, especially the safety sections, which give you a sense of immediate concerns: Gemini Deep Think: https://x.com/emollick/status/1952218373397647411
TwitterXGrok
Grok Imagine usage is growing like wildfire. 14 million images generated yesterday, now over 20 million today! https://x.com/elonmusk/status/1952636922477572324
Grok Imagine is now live to all SuperGrok and Premium+ subscribers. Update your Grok app to version 1.1.33 and try it out. https://x.com/chaitualuru/status/1952174534142067092
Grok-4 ranks #1 on LisanBench https://x.com/scaling01/status/1953843352366903622