$Photorealistic, cinematic editorial portrait in a bathtub; landscape 16:9. A single sleek matte-graphite humanoid robot reclines diagonally in pale, milky sea-foam/aquamarine water. Camera is slightly above and to the left, gentle three-quarter angle. The robot’s head sits near the upper-right third, turned subtly toward camera as if making eye contact; glossy black faceplate above the waterline with crisp studio-strip reflections and tiny water droplets. A thin meniscus clings to the jawline and cheek edges. Pose and limb placement: torso reclined; left arm bent and arched over/behind the head, forearm curving around the crown; right forearm angled across the torso; right hand relaxed near the water surface in the lower-left quadrant with splayed articulated fingers just touching the surface, creating delicate concentric ripples and a shallow indentation in the water skin. Right shoulder sits deeper than the left; neck long and elegant. Submersion behavior and refraction: everything below the waterline—lower chest, abdomen, parts of both arms and the right hand—takes on a soft jade/sea-foam color shift with reduced contrast; edges soften through realistic refraction and slight magnification; tiny trapped micro-bubbles and wispy white waterline swirls cling to panel seams. Above the waterline, materials retain natural finishes: matte graphite body panels, satin gray composite sections, high-gloss black faceplate; subtle specular highlights tracing shoulders, biceps covers, and finger segments. Any indicator glyphs are dim or off. Collage elements: multiple glossy photo-paper clippings (varied rectangular sizes) float and overlap around the robot, some fully on top of the water, some half-submerged. Corners are wet and slightly curled; edges show silvering; each clipping contains cropped close-ups of this robot (faceplate fragment, shoulder joint, hand segments, chest panel). Accurate wet-paper physics: refraction and distortion where clippings dip below the surface, tiny beads of water on top, slight glare streaks from lights. Lighting and palette: soft warm key light from upper-left; gentle fill from lower-right; faint rim along the right shoulder and side of the faceplate. Golden highlights skate across ripples and clippings; overall teal-orange balance with pale jade water, warm honey reflections, and a few sparse copper-glitter flecks drifting near upper-left and lower-right. Background limited to bathtub interior only; soft corner vignette. Rendering details: ultra-real materials and water physics; shallow depth of field with a 50mm, f/2.8 feel; natural film grain; mild halation on brightest highlights; immaculate edges; no posterization or banding. Extend water and collage laterally to fill the wider canvas while keeping the robot centered as described. Strict exclusions: no text or typography, no readable logos/watermarks/brand names, no extra robots or limbs, no faucets or bathroom fixtures, no UI overlays. Style keywords: photorealistic, cinematic editorial beauty, wet refraction, underwater caustics, glossy reflections, magazine-cover quality.$

AI News #97: Week Ending August 08, 2025 with 50 Executive Summaries, Top 44 Links, and 6 Helpful Visuals

August 9, 2025

About This Week’s Covers

This week’s newsletter cover was inspired by the new Taylor Swift album “The Life of A Showgirl”

I gave the original album art to GPT and asked for a prompt to give an image tool that removed the text and swapped in a humanoid robot in a similar pose in a landscape image. I tried the prompt in all of the major tools and Flux won easily. I added the text in Photoshop.

I used my nine-week-old GPT rubric + Flux Pro Ultra to automatically incorporate all of the categories into a Vegas showgirl theme. I gave it a one-sentence description of the theme, and GPT automatically generated 46 cover image prompts and sent them through the API (Flux this week, but I can change it) with no supervision. All ideas and compositions came from GPT autonomously.

I love the aesthetic of the imagery this week. I’d give the covers a B-. My favorite six are below:

This Week By The Numbers

Total Organized Headlines: 552

This Week’s Executive Summaries

Here’s everything you need to know about AI news for the week ending August 8, 2025.

ChatGPT is on track to reach 700 million weekly active users by the end of next week. That’s up from 500 million users at the end of March and four times the weekly users at this time last year.

The seven largest technology companies spent over $100 billion on AI infrastructure in the past three months; that’s more than all U.S. consumer spending combined in the last six months.

Meta is selling $2 billion in assets to help pay for its new data center infrastructure and growth.

OpenAI is reportedly working on a secondary sale that values the company at $500 billion.

OpenAI released GPT-5 to quite a bit of fanfare, but for most users the only visible difference was GPT chooses which model to use, without you having to tell it.

That said, feedback is pouring in with tons of videos and examples, if you’re interested, below.

Google released a world simulation model that creates entire explorable 3D worlds with just a prompt. This is going to be a big boost for robot training and potentially AR/VR, gaming, and videos. It’s almost “see it to believe it” level tech. Here are some examples:

Genie 3: A new frontier for world models – Google DeepMind
https://deepmind.google/discover/blog/genie-3-a-new-frontier-for-world-models/

Now, it’s worth spending some time talking about open source software and what it means to the AI race between the US and China. There are a ton of links below in the executive summary, but here are the main points:

Likely under advisement by the tech companies, the White House last week made open source AI development a national priority in order to keep up with the Chinese models. For a decade, the United States has led the open source development race, however in 2025 China has begun to dominate the strongest models. As of last week four out of the top five open source models were Chinese.

A petition is circulating called the American Truly Open Models project or ATOM. After Meta’s llama model lost its lead, the petition is pushing for consensus that frontier companies continue releasing leading open source options.

This week OpenAI released two strong, open source models that have reopened the debate of whether or not the US is back in the lead. Regardless of the U.S. status, the gap between the best closed frontier models and the leading open source models is now only 10%!

The OpenAI open source models are remarkable because they can be run on a MacBook Pro locally. This is a really big deal for privacy fanatics. Technical users are able to modify the code to fine tune their models as well.

One notable element of the two locally run open source models from OpenAI is that they have access to web browsing tools and can read websites and follow links between pages. Another harbinger of the death of the organic page view.

While the OpenAI open models are getting a lot of praise, some critics have noticed that the models seem to be tailored to beat benchmarks. While they excel at some tasks, they’re almost worthless in other common sense cases. We’ll know more in the coming weeks.

Hugging Face released a 200 page playbook on how companies can train their own private models using open source software. It’s a fantastic resource.

Former Google DeepMind researchers at the Reflection AI are in discussions to raise over $1 billion to build a new open source model.

In addition to open-source news last week, there have been a lot of ethical headlines that are worth walking through.

Anthropic cut off OpenAI’s access to Claude, claiming that OpenAI was violating their API terms.

Over 100 Nobel laureates, professors, whistleblowers and organizations have signed an open letter calling on OpenAI to provide more information about their corporate restructuring plans. OpenAI has been all over the map when it comes to their nonprofit versus for-profit status. It’s worth reading the open letter which is also in the summaries below.

OpenAI became an official vendor for US government agencies and will now provide ChatGPT access to all federal employees for just one dollar per year per agency. This means that every federal employee now has essentially a PhD at their fingertips, assuming they use it wisely and ethically. We are entering a new era of potential misuse and security risks, in addition to potential positive outcomes.

The Prime Minister of Sweden has been transparent that he often uses GPT for a second opinion when he has questions or ideas of what he should do. I’m happy that he is transparent about it because I do the same thing and I think everyone should use GPT for second opinions… however there’s also the risk of security breaches if the advice is too close to policy decisions.

North Carolina’s Department of the State Treasurer tested GPT for three months and dramatically reduced the time it takes for operational tasks. For example, a 90 minute audit was reduced to 30 minutes. Some tasks that used to take 20 minutes now only take 20 seconds. The study was conducted by North Carolina Central University and showed that 85% of participants had a positive experience and saved between 30 and 60 minutes every day.

As AI use becomes more common across consumers, as well as the government, experts are beginning to caution that the next election cycle is at risk for tremendous misinformation.

President Trump’s media company has integrated Perplexity’s AI search tools, allowing users to search the web with AI.

OpenAI put out a statement that they are dedicated to quality of user experience rather than engagement, attention, and addiction. I am unsure how I feel about OpenAI’s veracity or ability to be candid. Right now in my mind, they are somewhere between Google and Meta. That said I appreciate that they have made at least a gesture to try not to get people addicted to their product.

In speaking to my friends, I’ve discovered that I rarely chat with AI. Instead, I have a task I need to achieve and I execute it surgically. I’m still digesting the fact that people sit down and have a conversation with AI. That’s not anywhere near how I use it. I’m thankful for that to be honest.

Speaking of veracity and conflict of interests, I’ve been wary about the fact that Internet backbone provider Cloudflare is positioning themselves as an AI firewall and protector of privacy and training. Back in the day, Akamai sold their consumer tracking tools (Akamai ADS) to MediaMath, because they did not want to get into the business of arbitraging their position as a backbone and cache for additional profit.

Cloudflare’s marketing began by saying that they could block AI agents that were ignoring robots.txt. Then it evolved to attempting to enable commission structures for publisher payments. And now Cloudflare almost seems to be actively trying to block the efficiency of AI beyond just crawlers and training.

This has manifested in a battle between Cloudflare and Perplexity. Cloudflare argues that Perplexity is circumventing transparent web usage as an AI agent, but if I don’t feel like browsing the Internet and I want an agent to do it for me, I should have that right without Cloudflare trying to take a position on my behalf. This will become especially important as mundane tasks that would have usually required a lot of clicking around can now simply be delegated to an agent.

I think the agents will win handily in the long game, just as the internet ate up or transformed essentially every legacy business before it.

Conversely, yet very much related, Google is claiming that referrals from search engines to websites have remained stable year over year… even though everything I’ve read shows that traffic is down as much as 50%.

I can’t imagine how Google’s AI search overviews are driving traffic to websites. I trust Google about as far as I can throw them. And that’s from 20 years of working with them closely and being almost constantly disappointed.

Frontier models continue to position themselves as stewards for education.

Anthropic signed the Pledge To America’s Youth, which commits to advancing AI education.

Much like OpenAI launched study mode recently, Gemini launched guided learning to help students learn and study better and build deep understanding of subjects, including the why and how rather than just getting an answer.

Google also created an AI for education accelerator that provides free AI training and career certificates to college students in the United States. This includes a $1 billion commitment to AI literacy.

Students 18 and over can get Gemini Pro FREE for one year.

This week saw a lot of developments with agents and copilots:

Google launched Gemini DeepThink which is a consumer version of the thinking engine used to win the International Math Olympiad recently.

ByteDance released an open source agent for software engineering called Trae Agent.

ByteDance also released a math proof system that is breaking records on benchmarks. It performed 4x better than the rest of the leading models.

Google launched its coding copilot Jules. It’s too early to see if this is the Claude Code killer. I’m still using GPT for most things and Claude for debugging, because I’m not heavily into coding. Most of my serious coder friends use Claude in the command line.

Singapore start-up Manus announced Wide Research which empowers multiple agents across multiple models (as a wrapper). It’s supposedly incredibly powerful for dauntingly scaled tasks like researching all Fortune 100 companies at the same time.

Perplexity partnered with OpenTable to power agents to book reservations. This sort of small breakthrough will add up and change our world, once we can simply talk to our phones and get mundane challenges accomplished easily.

Andreessen Horowitz is backing voice agent platform, EliseAI, which makes AI voice agents for property management, and healthcare, at a $2B valuation!

ElevenLabs announced their redesigned interface for AI voice conversation agents. I’m very excited about this one.

Shopify continues to roll out conversational commerce agents.

Conversational agents will be common in a year or less, I would guess. It will be fun to see what Amazon and Apple do alongside these new competitors/APIs.

Anthropic released Opus 4.1 (up from 4) with improvements to coding and reasoning.

AI leaders are routinely (if not intentionally) aligned lately with their message of democratizing knowledge:

“someday soon something smarter than the smartest person you know will be running on a device in your pocket, helping you with whatever you want. this is a very remarkable thing.” -Sam Altman

“Something I think about a lot: who knows how many brilliant ideas never saw the light of day because “I don’t know how to do that.”” Pretty crazy to think that with AI everyone now has a reasonable VC advisor, coder, or professor on hand to teach you about anything you want” -Mustafa Suleyman

Last week I found myself sheepishly saying “I think Saas is dead”. I felt self-conscious.

Then to my surprise Sam Altman tweeted “entering the fast fashion era of SaaS very soon”.

Two weeks ago, Runway announced Aleph, a powerful video tool. More great examples have poured in, and a few are below (the rest are in the full executive summary section).

Also a few weeks ago, Google released their video tool Veo 3. Examples keep coming in and they are worth watching.

This week’s humanities reading is an excerpt from the end of Don Quixote by Miguel de Cervantes:

Here lies the noble fearless knight, Whose valor rose to such a height; When Death at last did strike him down, His was the victory and renown. He reck’d the world of little prize, And was a bugbear in men’s eyes; But had the fortune in his age To live a fool and die a sage.

Full Executive Summaries with Links, Generated by Claude 4

ChatGPT approaches 700 million weekly users in rapid growth
ChatGPT is set to reach 700 million weekly active users this week, marking a significant increase from 500 million users at the end of March and representing a fourfold growth compared to last year. The AI chatbot’s expanding user base reflects its growing adoption by individuals and teams who use it for learning, creative tasks, and problem-solving. This milestone demonstrates the rapid mainstream acceptance of conversational AI tools, as millions integrate ChatGPT into their daily workflows for various personal and professional applications.

This week, ChatGPT is on track to reach 700M weekly active users — up from 500M at the end of March and 4× since last year. Every day, people and teams are learning, creating, and solving harder problems. Big week ahead. Grateful to the team for making ChatGPT more useful and”” / X https://x.com/nickaturley/status/1952385556664520875

Tech giants’ AI spending now drives more economic growth than consumers
The seven largest technology companies spent over $100 billion on AI infrastructure like data centers in just the past three months, contributing more to U.S. economic growth in the last six months than all consumer spending combined. This massive investment in AI capabilities by companies like Microsoft, Google, and Amazon represents an unprecedented shift in what drives the American economy, as businesses race to build the computing power needed for artificial intelligence systems.

Christopher Mims 🤌 on X: “The AI infrastructure build-out is so gigantic that in the past 6 months, it contributed more to the growth of the U.S. economy than /all of consumer spending/ The ‘magnificent 7’ spent more than $100 billion on data centers and the like in the past three months *alone* 1/🧵 https://t.co/sHMK1zI0sP” / X https://x.com/mims/status/1951256592642441239

Meta plans to sell $2 billion in data center assets to share AI costs
Meta Platforms is selling $2 billion worth of data center land and construction assets to bring in partners who can help pay for the expensive infrastructure needed to run artificial intelligence systems. The company reclassified these assets as “held-for-sale” in June and expects to transfer them to a third party within the next year for joint data center development. This move represents a significant change for tech companies, which traditionally funded their own growth but now face enormous costs for AI infrastructure. Meta’s CEO Mark Zuckerberg has described plans to build AI data center “superclusters” that would each cover a significant portion of Manhattan’s footprint, requiring hundreds of billions of dollars in investment. The company raised its annual spending forecast to between $66 billion and $72 billion, though executives said AI-improved advertising is helping offset these rising infrastructure costs.

Meta to share AI infrastructure costs via $2 billion asset sale | Reuters https://www.reuters.com/business/meta-share-ai-infrastructure-costs-via-2-billion-asset-sale-2025-08-01/

OpenAI seeks secondary sale that would triple its valuation to $500 billion
OpenAI is reportedly negotiating a secondary sale that would value the company at $500 billion, a massive jump from its current $300 billion valuation and the $157 billion it achieved in last year’s funding round. The sale, which would allow current and former employees to sell their shares, comes as the company’s annual revenue has surged from $10 billion in June to $13 billion and is expected to reach $20 billion by year-end. The timing coincides with hints that OpenAI will announce GPT-5 tomorrow, with early testers reporting the model shows strong improvements in science, coding and math capabilities, though the leap won’t be as dramatic as from GPT-3 to GPT-4. The potential valuation reflects broader investor enthusiasm for AI companies, with competitors Anthropic, Mistral AI and Cohere also reportedly seeking major funding rounds at significantly higher valuations.

OpenAI reportedly in talks for secondary sale at $500B valuation – SiliconANGLE https://siliconangle.com/2025/08/06/openai-reportedly-talks-secondary-sale-500b-valuation/

US prioritizes open-source AI to compete with China’s growing influence
The White House has made open-source AI development a national priority after Chinese models like DeepSeek-R1 gained widespread adoption among American developers and researchers. China’s recent open-source AI releases have become the most popular models globally, with thousands of US companies now building on Chinese foundations rather than American ones. This represents a major shift from 2016-2020 when the US led in open-source AI development, but American tech giants have since locked their best models behind proprietary APIs. The administration recognizes that falling behind in open-source AI could mean losing the broader AI race, as open models drive faster innovation, allow for security auditing, and prevent vendor lock-in. Industry leaders are calling for renewed US commitment to open AI development through companies like Meta and research institutions to ensure American leadership in this critical technology.

Why open-source AI became an American national priority | VentureBeat https://venturebeat.com/ai/why-open-source-ai-became-an-american-national-priority/

US launches initiative to compete with China in open AI models
The United States has fallen behind China in developing and adopting open-source AI models, despite initially leading with Meta’s Llama system. The American Truly Open Models (ATOM) Project aims to address this gap by building support for US-based open AI development. Open models allow anyone to access, modify, and build upon AI technology, making them crucial for innovation and preventing any single country or company from controlling AI advancement. The project highlights growing concerns that America’s early advantage in accessible AI technology is slipping away to international competitors.

America needs to take open models more seriously. This summer the early lead in open model adoption of the US via Llama has been overtaken by Chinese models. With The American Truly Open Models (ATOM) Project we’re looking to build support and express the urgency of this issue. https://x.com/natolambert/status/1952370970762871102

I signed this because, despite worrying about misuse of open models more than most, I would like that to be the bottleneck rather than “”is it beneficial to big companies commercially/reputationally etc.”” There are many benefits to the US investing here. https://x.com/Miles_Brundage/status/1952400404668657966

very excited by the ATOM project”” / X https://x.com/finbarrtimbers/status/1952401883391520794

Meta’s Llama 4 struggles reshape global AI development landscape
The underperformance of Meta’s Llama 4 language model has triggered significant changes in the artificial intelligence industry. The model’s shortcomings have pushed open-source AI development leadership toward China, as Western companies struggle to maintain competitive alternatives. Many organizations that previously relied on running Llama models locally have been forced to switch to proprietary, closed-source systems due to the lack of viable upgrades. This shift has also intensified competition for AI talent in the United States, as companies scramble to recruit experts who can help them develop competitive models independently rather than relying on Meta’s open-source offerings.

The relative failure of Llama 4 turned out to be very consequential to the AI landscape. It led to the shifting the locus of open weights development to China, a move towards closed models as companies running local Llama couldn’t continue to upgrade, & big talent wars in the US.”” / X https://x.com/emollick/status/1951433537485500476

OpenAI’s new model narrows gap between open and closed AI systems
OpenAI’s latest release has sparked debate about whether open-source AI models are catching up to proprietary ones. While some argue that the US now has competitive open-weight models, others point out that Chinese open-source models still outperform Western alternatives in certain areas. The performance gap appears to be shrinking, with reports suggesting that advanced closed models like GPT-5 are only about 10% better at coding tasks than open-weight models that can run on consumer hardware. This development raises questions about whether OpenAI will continue releasing open models and what this means for the timeline of achieving artificial general intelligence, with some observers suggesting that if major companies like Anthropic don’t produce significantly better models soon, AGI may be further away than previously thought.

Did yesterday’s release shift the needle in the open vs. closed debate? Today in @ReedAlbergotti’s newsletter https://x.com/fdaudens/status/1953147586312872057

OpenAI / America is still ahead in the race”” -> no There is no western open-source model that beats or ties the best chinese open-source models.”” / X https://x.com/scaling01/status/1952900225120780705

The US now likely has the leading open weights models (or close to it)… … but the real question is whether this is a one-off situation from OpenAI, in which case the lead will evaporate quickly as others catch up. But also unclear what their incentives are to keep updating.”” / X https://x.com/emollick/status/1952836130958917894

It seems the closed-source vs open-weights landscape has been leveled. GPT-5 is just 10% better at coding than an open-weight model you can run on a consumer desktop and soon laptop. If Anthropic cannot come up with a good model, then we will probably not see AGI for a while.”” / X https://x.com/Tim_Dettmers/status/1953521836299350494

Reflection AI seeks billion-dollar funding to build open-source models
Reflection AI, a one-year-old startup founded by former DeepMind researchers, is reportedly in discussions to raise over $1 billion in funding. The company aims to develop open-source artificial intelligence models that would compete directly with established players like DeepSeek, Meta, and Mistral. This significant fundraising effort highlights the growing competition in the open-source AI space, where companies are racing to create powerful language models that developers can freely access and modify. The involvement of DeepMind veterans suggests the startup has strong technical expertise, while the massive funding target indicates investor confidence in the open-source AI market’s potential.

Late night scoop w/ @KevKubernetes @nmasc_: Reflection AI, the 1-yr-old startup founded by DeepMind researchers, is in talks to raise $1B+ as it looks to develop open-source models to compete with DeepSeek, Meta and Mistral: https://x.com/steph_palazzolo/status/1952555858761588892

OpenAI releases powerful open-source language models for local use
OpenAI has released gpt-oss, a collection of open-source language models that match the performance of their o4-mini model while running entirely on personal devices. The release includes two versions: a 120-billion parameter model that runs on high-end laptops and a smaller 20-billion parameter version that works on smartphones. These models represent a significant shift in AI accessibility, allowing users to run advanced language processing locally without relying on cloud services or internet connections. The models quickly gained traction in the developer community, reaching the top spot on Hugging Face, a popular AI model repository, within just two hours of release. This development marks an important step toward democratizing AI technology by giving users full control over powerful language models on their own hardware.

🚨 It’s official: OpenAI’s gpt-oss-120b & gpt-oss-20b just landed on Hugging Face! Brand new open-weight LLMs ready for anyone to try, fine-tune, and run anywhere. Here’s what makes this drop a big deal: https://x.com/fdaudens/status/1952781183575593234

And just like that, @OpenAI gpt-oss is now the number one trending model on @huggingface, out of almost 2M open models 🚀 People sometimes forget that they’ve already transformed the field: GPT-2, released back in 2019 is HF’s most downloaded text-generation model ever, and https://x.com/ClementDelangue/status/1952827283808375168

BREAKING: OpenAI just released two open-weight models: gpt-oss-120b and gpt-oss-20b. The 120B model is on par with o4-mini on reasoning benchmarks and can run on a single 80GB GPU. The 20B model achieves similar results to o3-mini and can run on edge devices with 16GB of https://x.com/rowancheung/status/1952777754904072566

Frontier models, capable of agentic reasoning, can now run on your Macbook Pro 🧑‍💻 @OpenAI’s release of GPT-OSS 20B and 120B are the biggest releases in open-source this year. Build agentic workflows with @llama_index that run 100% locally! Huge props to @LoganMarkewich and https://x.com/jerryjliu0/status/1952883595787239563

gpt-oss for entirely local tool use:”” / X https://x.com/gdb/status/1952802157956350221

gpt-oss https://gpt-oss.com/

gpt-oss is a big deal; it is a state-of-the-art open-weights reasoning model, with strong real-world performance comparable to o4-mini, that you can run locally on your own computer (or phone with the smaller size). We believe this is the best and most usable open model in the”” / X https://x.com/sama/status/1952778518225723434

gpt-oss is out! we made an open model that performs at the level of o4-mini and runs on a high-end laptop (WTF!!) (and a smaller one that runs on a phone). super proud of the team; big triumph of technology.”” / X https://x.com/sama/status/1952777539052814448

gpt-oss-120b & gpt-oss-20b Model Card | OpenAI https://openai.com/index/gpt-oss-model-card/

Introducing gpt-oss | OpenAI https://openai.com/index/introducing-gpt-oss/

Just released gpt-oss: state-of-the-art open-weight language models that deliver strong real-world performance. Runs locally on a laptop! https://x.com/gdb/status/1952780717638942910

Open models by OpenAI | OpenAI https://openai.com/open-models/

Well, it took just 2 hours for OSS-GPT to hit #1 on @huggingface. Don’t remember seeing anything rise that fast! https://x.com/fdaudens/status/1952814865795698954

A hypothesis: gpt-oss is trained entirely on synthetic data, from pre-training to post-training. The approach enhances safety and helps smaller models achieve better performance.”” / X https://x.com/huybery/status/1952905224890532316

attention is 0.84% of gpt oss, intelligence is stored in those 99.16% mlp layer, attn is key to unlock it https://x.com/shxf0072/status/1953143243992166849

curious about the training data of OpenAI’s new gpt-oss models? i was too. so i generated 10M examples from gpt-oss-20b, ran some analysis, and the results were… pretty bizarre time for a deep dive 🧵 https://x.com/jxmnop/status/1953899426075816164

Everyone is sleeping on AMD for local models – gpt-oss 20B running on an AMD GPU @ 52 tok/sec in a <$1000 laptop https://x.com/dzhng/status/1953132623280165193

GPT-OSS-120B casually calculating the product of two random 30-digit numbers. without any tools, just 18k tokens https://x.com/scaling01/status/1952892387539259455

I think gpt-oss was always expected to be put in an agent harness that uses search for all its world knowledge. Ive always argued this is not a valid replacement, the rich connections it builds from actual backprop on the worlds knowledge – not just facts, but the aggregate”” / X https://x.com/Teknium1/status/1953230352568467761

I’m thrilled @OpenAI has released two open weight models. Thank you to all my friends at OpenAI for this gift! I’m also encouraged that from my quick tests gpt-oss-120b looks strong (though we should still wait for rigorous 3rd party evals).”” / X https://x.com/AndrewYNg/status/1952838045235126510

I’ve written the full story of Attention Sinks — a technical deep-dive into how the mechanism was developed and how our research ended up being used in OpenAI’s new OSS models. For those interested in the details: https://x.com/Guangxuan_Xiao/status/1953656755109376040

ICYMI: you can vibe test the latest gpt-oss models on gpt-oss[.]com 💥 We partnered with @OpenAI to bring easy access to the model right down to a browser near you! https://x.com/reach_vb/status/1953041435999010916

Ollama and @nvidia collaborate to accelerate gpt-oss on GeForce RTX and RTX PRO GPUs. NVIDIA and Ollama are advancing their partnership to boost model performance on NVIDIA GeForce RTX and RTX PRO GPUs. This collaboration enables users on RTX-powered PCs to accurately leverage https://x.com/ollama/status/1952782326926328313

Our new @OpenAI open models https://x.com/polynoamial/status/1952778238368887184

RT @ggerganov: Llama.cpp supports the new gpt-oss model in native MXFP4 format The ggml inference engine (powering llama.cpp) can run the…”” / X https://x.com/ggerganov/status/1952978670328660152

RT @OpenAIDevs: Student credits for gpt-oss With @huggingface, we’re offering 500 students $50 in inference credits to explore gpt-oss.…”” / X https://x.com/reach_vb/status/1953010091377958984

RT @satyanadella: Excited to bring OpenAI’s gpt-oss models to Azure AI Foundry and to Windows via Foundry Local. It’s hybrid AI in action:…”” / X https://x.com/xikun_zhang_/status/1952902211278913629

Thank you @OpenAI for open-sourcing these great models! 🙌 We’re proud to be the official launch partner for gpt-oss (20B & 120B) – now supported in vLLM 🎉 ⚡ MXFP4 quant = fast & efficient 🌀 Hybrid attention (sliding + full) 🤖 Strong agentic abilities 🚀 Easy deployment 👉🏻”” / X https://x.com/vllm_project/status/1952784530466849091

We fixed some issues for @OpenAI’s gpt-oss model! 1. Jinja template has extra \n s, didn’t parse thinking sections + tool calling wasn’t rendered correctly 2. Some versions miss <|channel|>final -> this is a must! 3. F16 infs: use F32+BF16! We made a few free Colab notebooks as https://x.com/danielhanchen/status/1953901104150065544

OpenAI models gain web browsing and code execution abilities
OpenAI has enhanced its GPT models with two built-in tools that significantly expand their capabilities. The models can now browse the web to search for information, read websites, follow links between pages, and cite their sources – similar to how a person would research topics online. They also include an interactive Python notebook that allows the models to write and run code directly, enabling them to perform calculations, analyze data, and create visualizations. These additions transform the models from text-only systems into more versatile assistants that can gather real-time information and solve computational problems, making them more practical for tasks like research, fact-checking, and technical analysis.

The gpt-oss models have been post-trained to use two specific first-party tools: 1. a web browser that can search, read pages, follow links, and cite sources 2. an interactive python notebook This will give gpt-oss based agents super powerful capabilities out of the box! https://x.com/corbtt/status/1952810876165312805

OpenAI’s GPT-OSS model shows mixed performance in early testing
Early users of OpenAI’s new GPT-OSS-120B model report inconsistent performance, with the system excelling at mathematical problems and benchmarks while struggling with practical tasks like coding and creative writing. The model scored 41.8% on the Aider Polyglot coding test, significantly below competitors like Kimi-K2 (59.1%) and DeepSeek-R1 (56.9%), and only slightly better than the much smaller Qwen3 32B model (40.0%). Users describe erratic behavior where the model switches between professional-level coding and making up basic facts that it refuses to correct, leading some to question its real-world usefulness beyond academic benchmarks. The consensus among early testers suggests the model lacks common sense and practical judgment despite its strong performance on standardized tests.

GPT-OSS models seem to be slopmaxxed on math/coding and reasoning – they are great at that but they completely lack taste and common sense at least that’s my vibe so far”” / X https://x.com/scaling01/status/1952881329772564764

holy shit get ready for a hallucination fiesta with gpt-oss https://x.com/scaling01/status/1952781018554933261

I was just about to make a post that GPT-OSS-120B is nontheless an overall good for the very low end. But I honestly don’t know what it is good at, except benchmarks. Coding seems to suck, creative writing is terrible… So it’s just a math model? https://x.com/scaling01/status/1953047913954791696

i’ve spent the last couple hours talking to gpt-oss and can safely say it’s unlike any model i’ve tested one second it’s coding for me at a professional level, the next it’s making up basic facts and clinging to them no matter what i say something very strange is going on”” / X https://x.com/jxmnop/status/1953216881361600729

Is it over for gpt-oss ? What are these Aider Polyglot scores? https://x.com/scaling01/status/1952780629772321257

It’s looking bad bois.. Aider Polyglot results for GPT-OSS-120B: 41.8% for comparison: Kimi-K2: 59.1% DeepSeek-R1: 56.9% Qwen3 32B: 40.0% https://x.com/scaling01/status/1953047534122713130

Anthropic cuts off OpenAI’s access to Claude API over terms violations
Anthropic revoked OpenAI’s API access to its Claude models this week, citing violations of terms of service that prohibit using the service to build competing products or train rival AI models. According to sources, OpenAI staff had been using Claude’s coding tools through developer access to evaluate its capabilities against their own models and test safety responses, particularly as OpenAI reportedly prepares to release GPT-5 with improved coding abilities. While OpenAI called the benchmarking “industry standard” and noted their API remains available to Anthropic, the AI company said it would continue providing OpenAI access specifically for safety evaluations and benchmarking purposes. The move follows a pattern of tech companies restricting competitor access to their services, including Anthropic’s recent cutoff of AI coding startup Windsurf after rumors of a potential OpenAI acquisition.

@aidan_mclau This take is sad to see but you might not have full context. We cut OpenAl’s access for violating our APl terms and for the heavy usage of Claude Code among OAI tech staff. We’re going to continue providing API access for safety evals and benchmarking. That’s important to us. https://x.com/sammcallister/status/1951642025381511608

Anthropic Revokes OpenAI’s Access to Claude | WIRED https://www.wired.com/story/anthropic-revokes-openais-access-to-claude/

Nobel laureates and experts demand transparency from OpenAI about restructuring
Over 100 Nobel laureates, professors, whistleblowers, public figures, artists, and nonprofit organizations have signed an open letter calling on OpenAI to provide clear information about its corporate restructuring plans. The group is requesting that the artificial intelligence company be transparent about changes to its organizational structure, which could affect its original nonprofit mission and governance. The letter represents growing concern among academics and civil society about how OpenAI’s evolution from a nonprofit research organization to a more commercially-focused entity might impact its commitment to developing AI safely and for the benefit of humanity.

🚨 Breaking: A group of 100+ Nobel laureates, professors, whistleblowers, public figures, artists, and nonprofit organizations just released a letter asking OpenAI to tell the truth about its restructuring. Here’s what they had to say: 🧵 https://x.com/TheMidasProj/status/1952326634981543979

The Midas Project on X: “🚨 Breaking: A group of 100+ Nobel laureates, professors, whistleblowers, public figures, artists, and nonprofit organizations just released a letter asking OpenAI to tell the truth about its restructuring. Here’s what they had to say: 🧵 https://t.co/zhIccjnWU4” / X https://x.com/TheMidasProj/status/1952326634981543979

OpenAI becomes official vendor for US government agencies
OpenAI has secured approval as an official AI vendor for the U.S. government and will provide ChatGPT access to all federal employees through a partnership with the Government Services Administration. The company is offering the AI assistant to federal agencies for just $1 per year per agency, making the technology available across the entire federal workforce. This arrangement aims to help government employees use AI tools for their work while maintaining the privacy and security standards required for government operations. The partnership represents a significant expansion of AI adoption in the public sector, potentially affecting how millions of federal workers complete their daily tasks.

America’s hardest problems need the world’s most capable AI. OpenAI is now officially an approved U.S. Government AI vendor. We’re bringing privacy, security, and innovation to the nation’s most critical missions. 🇺🇸 https://x.com/cryps1s/status/1952749787994112275

In partnership with the Government Services Administration, we are providing ChatGPT to the entire U.S. federal workforce for essentially no cost for the next year. https://x.com/gdb/status/1953120865115074805

OpenAI for the U.S. government:”” / X https://x.com/gdb/status/1952756538399228091

Providing ChatGPT to the entire U.S. federal workforce | OpenAI https://openai.com/index/providing-chatgpt-to-the-entire-us-federal-workforce/

we are providing ChatGPT access to the entire federal workforce! (for $1 a year per agency) https://x.com/sama/status/1953103336044990779

Federal agencies grapple with employee AI adoption challenges
A new analysis reveals that many government employees are already using AI tools, often without official guidance or oversight. This widespread informal adoption raises critical questions about how federal agencies will manage these technologies to improve services rather than create new problems. The situation highlights a gap between grassroots AI usage by staff and the need for leadership to establish proper frameworks, training, and innovation labs within agencies. Without coordinated strategies from agency leadership and dedicated innovation teams, the benefits of AI could be lost to inefficiency, security risks, or misuse.

The giant question is: now that The Crowd in government has access to AI tools (which, given representative surveys, many were already using) how are they going to be used to make things better, not worse? Where are Leadership & The Lab inside agencies? https://x.com/emollick/status/1953118449611272575

Swedish Prime Minister adopts ChatGPT for government communications and tasks
Sweden’s Prime Minister has begun using ChatGPT to assist with various government functions, marking one of the first instances of a national leader publicly adopting AI tools for official duties. The integration includes using the AI assistant for drafting communications, analyzing policy documents, and streamlining administrative tasks. This move reflects growing acceptance of AI technology in government operations and could influence how other world leaders approach digital transformation. While specific details about security measures and usage guidelines remain limited, the adoption signals a shift toward AI-assisted governance in democratic nations.

ChatGPT for helping the Swedish Prime Minister:”” / X https://x.com/gdb/status/1952111193868673335

North Carolina state employees cut task times from minutes to seconds with AI
North Carolina’s Department of State Treasurer tested ChatGPT for three months and found it dramatically reduced work time for public employees. Tasks that previously took 20 minutes were completed in 20 seconds, while a 90-minute audit review was cut to 30 minutes. The independent study by N.C. Central University showed 85% of participating employees had positive experiences and saved 30-60 minutes daily. Employees used the AI tool to draft communications, summarize long documents, translate technical information into plain language, and explore new problem-solving approaches. The report emphasized that ChatGPT enhanced rather than replaced human judgment, with employees applying their expertise to refine AI-generated results.

ChatGPT for speeding up North Carolina public servants (e.g. reducing some tasks from 20 minutes to 20 seconds): https://x.com/gdb/status/1951376444363514100

State Treasurer Briner: “”OpenAI Report Shows Many Benefits, Offers Great Promise”” | NC Treasurer https://www.nctreasurer.gov/news/press-releases/2025/08/01/state-treasurer-briner-openai-report-shows-many-benefits-offers-great-promise

AI technology poses major risks for 2028 presidential election
A technology expert warns that artificial intelligence tools could dramatically impact the 2028 U.S. presidential election in unprecedented ways. The concern centers on how AI could be used to spread misinformation, create convincing fake content, or manipulate voters at a massive scale. The expert suggests that current safeguards and public awareness are insufficient to handle these emerging threats, calling for immediate discussions about protective measures. This warning reflects growing anxiety among technologists about AI’s potential to disrupt democratic processes, particularly as the technology becomes more sophisticated and accessible over the next few years.

honestly scared about the power and scale of ai technologies that’ll be used in the upcoming 2028 presidential election. it could be a civilizational turning point. we aren’t ready. we should probably start preparing, or at least talking about how we could prepare.”” / X https://x.com/DavidSHolz/status/1952541453491867792

Truth Social partners with Perplexity to add AI search capabilities
Donald Trump’s media company has integrated Perplexity’s AI search technology into Truth Social, allowing users to search the web directly from the platform’s browser version. The partnership, announced Wednesday, is currently in public beta testing. Trump Media CEO Devin Nunes, who also chairs the President’s Intelligence Advisory Board, described the addition as strengthening Truth Social’s role in what he called the “Patriot Economy.” The integration marks Truth Social’s entry into AI-powered search features, following the trend of social media platforms incorporating artificial intelligence tools to enhance user experience.

Trump Is Launching an AI Search Engine Powered by Perplexity https://www.404media.co/trump-is-launching-an-ai-search-engine-powered-by-perplexity/

OpenAI designs ChatGPT to support users’ wellbeing and productivity (GARGBAGE SUMMARY)
OpenAI is reshaping ChatGPT to be a tool that enhances users’ lives rather than capturing their attention. The company has introduced features to help during difficult times, added break reminders to prevent overuse, and is developing improved life advice capabilities. These changes are being guided by expert input to ensure the AI assistant supports users in achieving their goals while maintaining healthy usage patterns. The focus represents a shift toward building AI that prioritizes user wellbeing over engagement metrics.

We build ChatGPT to help you thrive in the ways you choose — not to hold your attention, but to help you use it well. We’re improving support for tough moments, have rolled out break reminders, and are developing better life advice, all guided by expert input.”” / X https://x.com/OpenAI/status/1952414411131671025

What we’re optimizing ChatGPT for | OpenAI https://openai.com/index/how-we’re-optimizing-chatgpt/

Perplexity challenges Cloudflare’s stance on AI agents and web access
Perplexity has issued a strong response to Cloudflare’s position on AI agents accessing websites, arguing that AI agents are simply extensions of human users and should be treated as such. The dispute centers on whether AI agents should have different access rights than human users when browsing the web. Perplexity’s rebuttal suggests that Cloudflare’s leadership either misunderstands fundamental AI concepts or is taking a stance that prioritizes appearance over substance, with the company stating that Cloudflare’s position shows they are “more flair than cloud.”

RT @balajis: Good rebuttal to Cloudflare by Perplexity. The core point is that an AI agent is just an extension of a human. So when it mak…”” / X https://x.com/jeremyphoward/status/1952818615578968265

The bluster around this issue reveals that Cloudflare’s leadership is either dangerously misinformed on the basics of AI, or simply more flair than cloud.”” / X https://x.com/perplexity_ai/status/1952532113095643185

Google defends AI search amid publisher traffic decline concerns
Google is pushing back against reports that its AI search features are harming website traffic, claiming that overall clicks from its search engine to websites have remained “relatively stable” year-over-year. The company’s VP of Search, Liz Reid, argues that while some sites are seeing decreased traffic, others are gaining, with users increasingly seeking out forums, videos, and social content for authentic perspectives. However, Google hasn’t provided specific data to support these claims, and independent studies show concerning trends – one report found that news searches resulting in zero clicks to publishers grew from 56% to 69% between May 2024 and May 2025. The company acknowledges that user behavior is shifting, with younger users often starting searches on TikTok, Instagram, or Reddit instead of Google, suggesting that changes in web traffic patterns may reflect broader shifts in how people use the internet rather than just the impact of AI features.

Google denies AI search features are killing website traffic | TechCrunch https://techcrunch.com/2025/08/06/google-denies-ai-search-features-are-killing-website-traffic/

DeepSeek releases comprehensive guide for companies to build custom AI models *WRONG*. IT’S HUGGINGFACE. ANOTHER BAD SUMMARY – I LEAVE THEM FOR EXAMPLES
DeepSeek has published a 200-page “Ultra-Scale Playbook” that teaches companies how to train their own large language models similar to DeepSeek R1, Llama, or GPT-5. The guide covers advanced technical concepts like 5D parallelism, which helps distribute the massive computational workload across multiple processors. The company argues that just as every tech company writes its own software code, they should also be able to create their own AI models, viewing artificial intelligence as the next evolution of software development. This democratization of AI training knowledge could enable more organizations to develop specialized models tailored to their specific needs rather than relying solely on general-purpose models from major tech companies.

Every tech company can and should train their own deepseek R1, Llama or GPT5, just like every tech company writes their own code (and AI is no more than software 2.0). This is why we’re releasing the Ultra-Scale Playbook. 200 pages to master: – 5D parallelism (DP, TP, PP, EP, https://x.com/ClementDelangue/status/1952048356710039700

Gray Swan launches $500K challenge to test AI model safety **ALSO WRONG- IT’S OPENAI**
Gray Swan has announced a $500,000 competition inviting researchers and developers worldwide to find security vulnerabilities in their newly released open-source AI model. Participants will search for novel risks and potential weaknesses in the system, with their findings reviewed by experts from major AI companies including OpenAI, Anthropic, Google, UK AISI, and Apollo. The challenge aims to improve AI safety by identifying and fixing problems before they can cause harm, with the company using the results to strengthen their model’s defenses against misuse or unexpected behaviors.

Red teamers assemble! ⚔️💰 We’re putting $500K on the line to stress‑test just released open‑source model. Find novel risks, get your work reviewed by OpenAI, Anthropic, Google, UK AISI, Apollo, and help harden AI for everyone.”” / X https://x.com/woj_zaremba/status/1952886644090241209

We’re launching a $500K Red Teaming Challenge to strengthen open source safety. Researchers, developers, and enthusiasts worldwide are invited to help uncover novel risks—judged by experts from OpenAI and other leading labs. https://x.com/OpenAI/status/1952818694054355349

Tech companies pledge to expand AI education for American students
Over 100 organizations have committed to a new initiative called the Pledge to America’s Youth, which aims to teach AI and cybersecurity skills to students across the United States. The participating companies will partner with schools, educators, and local communities to develop programs that help young people understand and work with artificial intelligence technology. This effort addresses growing concerns that students need these technical skills to prepare for future jobs, as AI becomes increasingly important in many industries. The pledge represents one of the largest coordinated efforts to date to ensure American students have access to AI education, though specific details about funding, curriculum, and implementation timelines have not been announced.

We joined the Pledge to America’s Youth along with 100+ organizations committed to advancing AI education. We’ll work with educators, students, and communities nationwide to build essential AI and cybersecurity skills for the next generation.”” / X https://x.com/AnthropicAI/status/1953864587192770921

Google introduces Guided Learning feature in Gemini for deeper understanding
Google has launched Guided Learning in Gemini, a new educational feature that acts as a personal learning companion rather than just providing quick answers. The tool uses open-ended questions, step-by-step breakdowns, and multimodal content including images, videos, and quizzes to help users actively engage with subjects and build deep understanding. Developed in partnership with educators and learning experts since 2022, the feature is powered by LearnLM models that incorporate educational research and learning science principles. Students can use it for exam preparation, writing papers, or exploring personal interests, while teachers can easily share it through Google Classroom to encourage critical thinking. The feature creates a judgment-free conversational space where learners can explore topics at their own pace, representing Google’s shift from simply answering questions to fostering genuine comprehension and skill development.

Guided Learning in Gemini: From answers to understanding https://blog.google/outreach-initiatives/education/guided-learning/

Here are new @GeminiApp tools to help you learn, understand and study better this school year ✏️ – Guided Learning helps you build a deep understanding of subjects, with step-by-step breakdowns that uncover the “why” and “how” – Gemini’s responses automatically integrate https://x.com/Google/status/1953143185011617891

Google commits one billion dollars to AI education initiatives
Google has announced a major education initiative that will provide free AI training and Google Career Certificates to college students across the United States through its new AI for Education Accelerator program. The tech giant is committing $1 billion over the next three years to support AI literacy programs, research, and other educational efforts. This investment aims to prepare students for careers in artificial intelligence by giving them access to professional training and certification programs at no cost. The initiative represents one of the largest corporate investments in AI education to date and could help address the growing demand for workers with AI skills across industries.

New: The Google AI for Education Accelerator will provide free AI training & Google Career Certificates to college students in the U.S. We’re also committing $1 billion to AI literacy, research and more over the next 3 years → https://x.com/Google/status/1953126394847768936

ChatGPT launches study mode to help adults relearn math
OpenAI has introduced a new Study Mode feature in ChatGPT designed to help users learn subjects like algebra through interactive tutoring. The mode acts as a personal tutor, breaking down complex math concepts into manageable steps and providing practice problems with detailed explanations. Users can ask questions, work through examples at their own pace, and receive personalized feedback on their work. The feature aims to make learning math less intimidating for adults who struggled with it in school or need to refresh their knowledge. Early users report that the conversational approach and patience of the AI tutor helps them understand concepts they previously found difficult.

ChatGPT study mode for learning algebra as an adult:”” / X https://x.com/gdb/status/1951792801143980238

I bombed algebra in high school. ChatGPT’s new Study Mode is my redemption arc 😅 https://x.com/sharongoldman/status/1950988509352743014

Google releases Deep Think for Gemini app subscribers
Google has launched Deep Think, an advanced AI feature for Google AI Ultra subscribers that uses parallel thinking techniques to solve complex problems. The tool, available through the Gemini app, represents an improved version of technology that achieved gold-medal performance at the International Mathematical Olympiad. Deep Think excels at tasks requiring creativity and strategic planning, including web development, scientific research, and coding challenges. The system works by extending “thinking time” to explore multiple ideas simultaneously before arriving at optimal solutions. Google is also providing select mathematicians access to the full competition-level model for research purposes. Users can activate Deep Think through a toggle in the Gemini app’s prompt bar, with the feature automatically integrating tools like code execution and Google Search.

“”Claude Code can now handle long-running tasks in the background. Start your dev server, run tests, or build your project without blocking your workflow https://x.com/_catwu/status/1953926541370630538

ByteDance launches open source AI agent for software engineering
ByteDance, the Chinese company behind TikTok, has released Trae Agent, an AI tool that helps developers write and manage software code through simple English commands. The system uses large language models to understand what users want to build and can handle complex programming tasks through an interactive command-line interface. It works with popular AI services from OpenAI and Anthropic, and the company has made the entire codebase freely available as open source software, allowing anyone to use, modify, or improve it.

Gemini 2.5 Deep Think is state-of-the-art performance across many challenging benchmarks!”” / X https://x.com/demishassabis/status/1951468051578142848

Gemini 2.5: Deep Think is now rolling out https://blog.google/products/gemini/gemini-2-5-deep-think/

Gemini Deep Think, our SOTA model with parallel thinking that won the IMO Gold Medal 🥇, is now available in the Gemini App for Ultra subscribers!! Should we put it in the Gemini API next? https://x.com/OfficialLoganK/status/1951260803459338394

not enough people are talking about the delta between the parallel thinking uplifts of oai vs gdm AIME o3 pro: +3% (from 90->93 on 2024) deep think: +11.2% (from 88->99.2 on 2025) Knowledge o3 pro: +3% (on GPQA) deep think: +13.2% (on HLE) Coding o3 pro: +9.1% (on Codeforces https://x.com/swyx/status/1951460518293807241

Played with Deep Think, a dramatic improvement for Google. It is getting close to O3 Pro – I’d say, a solidly second best model right now. Far less verbose! With limits of about 10 a day, not ready for the professional use, though. https://x.com/MParakhin/status/1952028947153371631

Google launches AI coding assistant Jules with new pricing tiers
Google has officially launched Jules, its AI-powered coding assistant, after a two-month beta period that saw thousands of developers complete over 140,000 code improvements. The tool, which runs on Gemini 2.5 Pro, works differently from competitors by operating asynchronously – meaning developers can assign it tasks and walk away while it clones repositories, analyzes code, and implements fixes in the background. Google introduced a free tier limited to 15 daily tasks, with paid plans at $19.99 and $124.99 monthly offering higher limits. During beta testing, 45% of the 2.28 million visits came from mobile devices, prompting Google to explore mobile-specific features. The company also clarified its privacy policy, confirming that private repository data won’t be used for AI training, while public repository data may be used.

China’s ByteDance just released an LLM-based agent for general purpose software engineering tasks. Trae Agent comes with an interactive CLI that can execute complex workflows using simple English prompts. It works with OpenAI and Anthropic API. 100% opensource. https://x.com/Saboo_Shubham_/status/1942047679758151783

Manus launches Wide Research for large-scale parallel computing tasks
Manus AI has released Wide Research, a feature that allows users to control multiple AI agents working in parallel to handle complex research tasks. The system lets users analyze hundreds of items simultaneously – like comparing Fortune 500 companies or MBA programs – through simple chat interactions. Unlike traditional multi-agent systems with fixed roles, each agent in Wide Research is a full Manus instance that can adapt to any task. The feature runs on Manus’s cloud computing platform, which provides each user session with a dedicated virtual machine. Wide Research is initially available to Pro subscribers, with plans to expand to other tiers. The company positions this as the first step in making supercomputing power accessible to non-technical users through conversational interfaces.

Google’s AI coding agent Jules is now out of beta | TechCrunch https://techcrunch.com/2025/08/06/googles-ai-coding-agent-jules-is-now-out-of-beta/

ByteDance’s SeedProver sets new benchmark for mathematical problem solving
ByteDance has released SeedProver, an AI system that achieved record-breaking performance on PutnamBench, a challenging mathematical problem-solving test. The model correctly solved 331 out of 657 problems, nearly four times better than previous leading systems, and maintained strong performance even with limited computing resources, solving 201 problems under lightweight conditions. SeedProver outperformed DeepMind’s AlphaGeometry2 and achieved perfect scores on certain mathematical tasks, demonstrating significant progress in AI’s ability to tackle complex mathematical reasoning that typically challenges even advanced mathematics students.

Introducing Wide Research https://manus.im/blog/introducing-wide-research

Perplexity adds restaurant booking through OpenTable partnership
Perplexity, the AI-powered search engine, has partnered with OpenTable to let users make restaurant reservations directly within its platform. This integration means people can search for restaurants and book tables without leaving Perplexity’s interface, combining the AI assistant’s ability to answer dining-related questions with immediate booking capabilities. The feature streamlines the process of finding and reserving restaurants by eliminating the need to switch between different apps or websites.

ByteDance dropped SeedProver. This model scored 331/657 on PutnamBench (nearly 4× better than the previous state of the art) and 201/657 under lightweight inference (pass@64‑256 equivalent). Its reported performance surpasses DeepMind’s AlphaGeometry2 and achieves 100% on https://x.com/cgeorgiaw/status/1952301113446699347

Anthropic upgrades Claude Opus with improved coding and reasoning abilities
Anthropic released Claude Opus 4.1, an update to their AI model that improves performance on coding tasks, research, and reasoning. The model achieved 74.5% accuracy on SWE-bench Verified, a benchmark that tests AI systems’ ability to solve real software engineering problems. Companies testing the update reported significant improvements, with GitHub noting better multi-file code refactoring capabilities and Rakuten Group highlighting the model’s precision in debugging large codebases without introducing new errors. Windsurf found the performance jump from Opus 4 to 4.1 was comparable to the improvement seen between previous major model versions. The update is available through Claude’s paid services and various cloud platforms at the same price as the previous version, with Anthropic recommending all users upgrade from Opus 4 to take advantage of the enhanced capabilities.

Need a table? Just ask. Perplexity is partnering with @OpenTable to bring restaurant reservations directly into Perplexity products. https://x.com/perplexity_ai/status/1952434779036774488

OpenAI CEO predicts pocket-sized AI will surpass human intelligence
OpenAI CEO Sam Altman has stated that artificial intelligence systems more capable than the smartest humans will soon run on personal devices like smartphones. In a recent social media post, Altman described this development as “very remarkable,” suggesting that these advanced AI assistants will be able to help users with any task they need. While he didn’t provide a specific timeline, his use of “someday soon” indicates he believes this breakthrough in portable AI technology is approaching rapidly. The prediction represents a significant leap from current AI capabilities, which typically require substantial computing power and internet connectivity to operate at high levels.

Claude Opus 4.1 (“”claude-leopard-v2-02-prod””) “”Opus 4.1 is here – Try our latest model for more problem solving power.”””” / X https://x.com/btibor91/status/1952366658326036781

Claude Opus 4.1 \ Anthropic https://www.anthropic.com/news/claude-opus-4-1

Claude Opus 4.1 beats GPT-5 on SWE bench https://x.com/Sauers_/status/1953504854044704973

Claude Opus 4.1 is available in Cursor! Let us know what you think.”” / X https://x.com/cursor_ai/status/1952782293925298655

Going live with the fellas @tbpn in an hour to talk about Opus 4.1 and Claude Code”” / X https://x.com/alexalbert__/status/1952801100299681959

AI tools democratize access to expert knowledge and skills
The widespread availability of AI assistants is breaking down traditional barriers to learning and creation. Where people once abandoned promising ideas due to lack of technical knowledge or access to experts, they can now tap into AI tools that serve as on-demand advisors, programmers, and teachers. This shift means that individuals no longer need formal training or personal connections to explore new fields, build prototypes, or understand complex topics. The technology essentially provides everyone with a knowledgeable mentor available 24/7, potentially unlocking countless innovations that would have otherwise remained unrealized due to knowledge gaps or resource constraints.

someday soon something smarter than the smartest person you know will be running on a device in your pocket, helping you with whatever you want. this is a very remarkable thing.”” / X https://x.com/sama/status/1952879515287601465

AI tools are becoming as disposable as fast fashion
The software industry is experiencing a fundamental shift where AI-powered tools are being created, used briefly, and discarded at an unprecedented pace, similar to how fast fashion operates. This trend suggests that software-as-a-service (SaaS) products are moving away from long-term subscriptions toward temporary, single-purpose applications that users adopt for specific tasks and quickly abandon. The comparison to fast fashion indicates concerns about sustainability, quality, and the environmental impact of rapidly cycling through digital tools, while also highlighting how AI has lowered the barriers to creating new software products so dramatically that developers can pump out applications as quickly as clothing retailers produce new styles.

Something I think about a lot: who knows how many brilliant ideas never saw the light of day because “”I don’t know how to do that.”” Pretty crazy to think that with AI everyone now has a reasonable VC advisor, coder, or professor on hand to teach you about anything you want”” / X https://x.com/mustafasuleyman/status/1951323569905934427

Shopify launches AI shopping tools for conversational commerce platforms
Shopify has released new tools that allow AI assistants to search for products and complete purchases directly within chat conversations. The system works through three main components: a catalog search that finds products across Shopify merchants, a universal shopping cart that collects items from multiple stores, and a checkout system that processes payments while maintaining both the merchant’s and AI platform’s branding. The tools handle complex product variations like subscriptions and bundles automatically, and meet standard compliance requirements including GDPR and payment security standards. Currently in early access, developers must apply for approval to integrate these shopping capabilities into their AI applications.

entering the fast fashion era of SaaS very soon”” / X https://x.com/sama/status/1952084574366032354

AI voice startup EliseAI reaches $2 billion valuation with Andreessen backing
EliseAI, a company that develops AI voice agents for property management and healthcare industries, has secured funding from venture capital firm Andreessen Horowitz at a $2 billion valuation. The investment highlights growing investor interest in AI voice technology, which allows businesses to automate phone conversations with customers. EliseAI’s software can handle tasks like scheduling apartment tours, answering tenant questions, and managing healthcare appointments through natural-sounding phone conversations. The significant valuation reflects strong demand for voice AI solutions that can reduce staffing costs while maintaining customer service quality in industries that rely heavily on phone communications.

Agentic commerce has arrived https://shopify.dev/docs/agents

ElevenLabs launches redesigned interface for AI voice conversation agents
ElevenLabs has unveiled a completely redesigned Conversational Agents page that showcases their latest AI voice technology. The new interface represents a significant shift in the company’s brand direction and provides users with access to more advanced AI agents capable of natural voice conversations. The redesign emphasizes the growing importance of voice-based AI interactions and positions ElevenLabs’ agents as more sophisticated tools for businesses and developers looking to integrate conversational AI into their products.

Earlier this summer, we told you that AI voice agents are hot. For an idea of just how hot: Andreessen Horowitz is backing EliseAI, which makes AI voice agents for property mgmt + healthcare, at a $2B valuation. w/ @srimuppidi @coryweinberg https://x.com/steph_palazzolo/status/1952740505747382364

OpenAI launches GPT-5 with built-in reasoning capabilities
OpenAI released GPT-5, which automatically decides when to think deeply about complex problems versus responding quickly to simple questions. The system uses a smart router that chooses between a fast model for basic tasks and a deeper reasoning model called “GPT-5 thinking” for harder problems, eliminating the need for users to manually switch between different AI models. The company claims GPT-5 reduces hallucinations by 45% compared to GPT-4o and scores significantly higher on coding, math, and health-related benchmarks. GPT-5 becomes the default model for all ChatGPT users, replacing previous versions, with paid subscribers getting higher usage limits and access to GPT-5 pro for the most complex tasks. The model shows particular improvements in creating functional websites and apps from single prompts, following complex instructions more reliably, and providing more accurate health information while being less overly agreeable than previous versions.

The design team at @elevenlabsio is making big moves🚀 We’ve released a brand new Conversational Agents page, redesigned from the ground up for the most powerful version of Al Agents yet. It’s a big deal because it represents a new direction for the ElevenLabs brand – a https://x.com/RomaTesla/status/1949808534595526806

Runway releases Aleph video AI with improved scene consistency
Runway has launched Aleph, its latest AI video generation tool that addresses one of the biggest challenges in AI-generated videos: maintaining consistency across different scenes. The system demonstrates its capabilities through various examples, including transforming a woman riding a snail into different scenarios – from nighttime settings to mechanical creatures being chased by police cars. Users are already experimenting with the tool in creative ways, such as transitioning footage from shooting ranges to battlefields using FPV glasses, and integrating it with 3D software like Blender for more complex workflows. The release includes both web and API access, along with community showcases and weekly challenges through Discord to encourage user experimentation.

💥 It’s here! GPT-5 is rolling out in ChatGPT for everyone, starting today. It’s a 🤯 good model, and we’ve simplified the UI alongside it. No more choosing between gpt-4o and o4-mini. When you ask a hard question and the model needs to think hard, it does. When it can give you”” / X https://x.com/kevinweil/status/1953502681181618277

A compilation of experiences I made with GPT-5 in one shot. The poem camera app is particularly impressive because the model came up with all the details, like the way the photos stack in the gallery, the photo developing animation, etc https://x.com/skirano/status/1953516768317628818

AMA with @sama + some members of the GPT-5 team Tomorrow 11am PT. https://x.com/OpenAI/status/1953548075760595186

ChatGPT Plus subscription lost 95% of its value over night and this is without accounting for the loss of GPT-4.5″” / X https://x.com/scaling01/status/1953782641838190782

Codex CLI + GPT-5:”” / X https://x.com/gdb/status/1953556751762288653

Does OpenAI not do basic integration testing? At the time of release, the first code sample provided in the GPT-5 docs could not be run, because someone accidentally deleted the `output_text` property. My CI notified me. Why didn’t theirs? https://x.com/jeremyphoward/status/1953610071654772985

going to try live-tweeting the GPT-5 livestream. first, GPT-5 in an integrated model, meaning no more model switcher and it decides when it needs to think harder or not. it is very smart, intuitive, and fast. it is available to everyone, including the free tier, w/reasoning!”” / X https://x.com/sama/status/1953502614676811865

GPT-5 (medium reasoning) is the new leader on the Short Story Creative Writing benchmark! GPT-5 mini (medium reasoning) is much better than o4-mini (medium reasoning). Claude Opus 4.1 shows gains over Opus 4. https://x.com/LechMazur/status/1953658077300875656

GPT-5 (medium reasoning) sets a new record on the Confabulations/Hallucinations on Provided Texts benchmark! https://x.com/LechMazur/status/1953582063686434834

GPT-5 claims #1 spot on LiveBench https://x.com/scaling01/status/1953602929375813677

gpt-5 for long context reasoning:”” / X https://x.com/gdb/status/1953747271666819380

GPT-5 gets 74.9 on SWE-bench. Wonder what the budget per task is. https://x.com/OfirPress/status/1953502998627221519

GPT-5 Hands-On: Welcome to the Stone Age https://www.latent.space/p/gpt-5-review

GPT-5 in the high reasoning setting hit the 100K token limit for our evaluations on 10/290 Tier 1-3 samples (3%). This means our evaluation might slightly underestimate the reasoning capabilities of GPT-5.”” / X https://x.com/EpochAIResearch/status/1953615908695314564

GPT-5 is extremely sensitive to instructions. Either give it demonstrations or tell it explicitly how you want the output. Avoid doing both. If you do, GPT-5 will override the examples with your output instructions. Sharing more just in case you face this issue:”” / X https://x.com/omarsar0/status/1953876255037612531

GPT-5 is here – and it’s #1 across the board. 🥇#1 in Text, WebDev, and Vision Arena 🥇#1 in Hard Prompts, Coding, Math, Creativity, Long Queries, and more Tested under the codename “summit”, GPT-5 now holds the highest Arena score to date. Huge congrats to @OpenAI on this https://x.com/lmarena_ai/status/1953504958378356941

GPT-5 is here! 🚀 For the first time, users don’t have to choose between models — or even think about model names. Just one seamless, unified experience. It’s also the first time frontier intelligence is available to everyone, including free users! GPT-5 sets new highs across”” / X https://x.com/ElaineYaLe6/status/1953607005144506454

GPT-5 is here. Rolling out to everyone starting today. https://x.com/OpenAI/status/1953504357821165774

GPT-5 is live in Cline. We’ve been working with OpenAI to get this model ready, and here’s our take: it’s disciplined, persistent, & highly competent. It’s collaborative in planning & and a diligent operator while acting. It plans thoroughly, asks optioned follow-ups when https://x.com/cline/status/1953525433808695319

GPT-5 is now available in Cursor. It’s the most intelligent coding model our team has tested. We’re launching it for free for the time being. Enjoy!”” / X https://x.com/cursor_ai/status/1953519580627742750

GPT-5 is now available on Perplexity and Comet for Max and Pro subscribers. Just ask. https://x.com/perplexity_ai/status/1953537170964459632

GPT-5 new SOTA on WeirdML beating o3-pro https://x.com/scaling01/status/1953919743842238472

GPT-5 only a 3% improvement over o3 at reproducing scientific papers https://x.com/scaling01/status/1953503883331846629

GPT-5 pricing is insane IT’S OVER https://x.com/scaling01/status/1953509084008710547

GPT-5 rollout updates: *We are going to double GPT-5 rate limits for ChatGPT Plus users as we finish rollout. *We will let Plus users choose to continue to use 4o. We will watch usage as we think about how long to offer legacy models for. *GPT-5 will seem smarter starting”” / X https://x.com/sama/status/1953893841381273969

GPT-5 sentiment from the trenches (AKA 24 hours in Cline users’ hands): It’s a precision instrument, not a Swiss Army knife. Give it detailed prompts and it delivers exactly what you asked for — no tangents, no hallucinations about “”finished”” code. However, it’s less performant https://x.com/cline/status/1953898747928441017

GPT-5 sets a new record on FrontierMath! On our scaffold, GPT-5 with high reasoning effort scores 24.8% (±2.5%) and 8.3% (±4.0%) in tiers 1-3 and 4, respectively. https://x.com/EpochAIResearch/status/1953615906535313664

GPT-5 system card capability evals reactions thread. First observation: ~no improvement on all the coding evals that aren’t SWEBench https://x.com/eli_lifland/status/1953507434238288230

GPT-5 Thinking is less deceptive than o3 However when elicited to display deceptive behaviour it jumps to 28% https://x.com/scaling01/status/1953504438691221856

GPT-5 was doing 2B tokens per minute 3 hours after launch 🤯”” / X https://x.com/kevinweil/status/1953649263411704195

GPT-5 with big improvements in Tau-Bench except the airline category https://x.com/scaling01/status/1953505637242974695

GPT-5 with high reasoning effort on SimpleBench https://x.com/scaling01/status/1953771276549358041

GPT-5: $0.625/$5.00 with flex pricing is ridiculous https://x.com/scaling01/status/1953517149768593903

GPT-5’s Router: how it works and why Frontier Labs are now targeting the Pareto Frontier https://www.latent.space/p/gpt5-router

Hallucinations are almost gone with GPT-5 https://x.com/scaling01/status/1953507569609134506

ICYMI, OpenAI released an insane amount of guides on how to use GPT-5. > Examples > Prompting guide > New features guide > Reasoning tips > Setting verbosity > New tool calling features > Migration guide And much more. https://x.com/omarsar0/status/1953583336603234726

If GPT-5 made this chart I’m bearish 😭 https://x.com/iScienceLuvr/status/1953503815292092904

In a new report, we evaluate whether GPT-5 poses significant catastrophic risks via AI R&D acceleration, rogue replication, or sabotage of AI labs. We conclude that this seems unlikely. However, capability trends continue rapidly, and models display increasing eval awareness. https://x.com/METR_Evals/status/1953525150374150654

Introducing GPT-5 | OpenAI https://openai.com/index/introducing-gpt-5/

Introducing GPT-5 Our best AI system yet, rolling out to all ChatGPT users and developers starting today. https://x.com/OpenAI/status/1953526577297600557

Long context reasoning performance: A stand out is long context reasoning performance as shown by our AA-LCR evaluation whereby GPT-5 occupies the #1 and #2 positions. https://x.com/ArtificialAnlys/status/1953507713222422866

Lots of excitement about GPT-5 in Codex CLI via your ChatGPT plan. Some details: 1. Yes, if you sign in with ChatGPT, usage is included via your paid plan! 2. Still determining exact rate limits, but the goal is to be generous: — Pro users should basically not hit limits”” / X https://x.com/embirico/status/1953590991870697896

made a little Sankey to show you why I’m fuming ChatGPT Plus before vs after the GPT-5 release https://x.com/scaling01/status/1953780931552031056

Markets disappointed by GPT-5 OpenAI getting crushed on Polymarket https://x.com/scaling01/status/1953515099257282763

model switching in gpt-5 very cool!”” / X https://x.com/sama/status/1953526708742537220

New in Notion AI’s toolbelt: @OpenAI’s GPT-5 It’s fast, thorough, and handles complex work 15% better than other models we’ve tested. A great choice for tasks with multiple moving parts. Gradual rollout starting today. https://x.com/NotionHQ/status/1953506907924443645

OpenAI GPT-5 System Card released “”GPT-5 is a unified system with a smart and fast model that answers most questions, a deeper reasoning model for harder problems, and a real-time router that quickly decides which model to use based on conversation type, complexity, tool needs, https://x.com/iScienceLuvr/status/1953503173932724614

Priority Processing debuts with GPT-5. under-hyped imo for apps where millisecond matters, pay extra and get our fastest token speeds just add “”service_tier””: “”priority”” to your requests https://x.com/jeffintime/status/1953857260729643136

Quick PSA. Settings for minimizing GPT-5 latency (time to first token). “”service_tier””: “”priority””, “”reasoning_effort””: “”minimal””, “”verbosity””: “”low””. P50 TTFT with these settings is ~750ms. With the defaults, it’s >3s. The default settings are the right starting point for https://x.com/kwindla/status/1953868672470331423

RT @lmarena_ai: GPT-5 is here – and it’s #1 across the board. 🥇#1 in Text, WebDev, and Vision Arena 🥇#1 in Hard Prompts, Coding, Math, Cre…”” / X https://x.com/aidan_mclau/status/1953517672941158577

The GPT 5 launch included a chart showing 52.8 as a bigger number than 69.1, which in turn is shown as the same magnitude as 30.8. Not quite ASI… https://x.com/jeremyphoward/status/1953509671446196715

The straight up GPT-5 in Codex CLI fixed a bug in 3 minutes that I was working on for three or four hours this morning…can’t wait to try in Cursor.”” / X https://x.com/sound4movement/status/1953583522587017345

Think harder is back! Routing changes in GPT-5 OpenAI means capability is moving from model selection to prompting https://x.com/dariusemrani/status/1953591404003045562

this is the detail of GPT-5 I’m most proud of GPT-4 launched at $30/$60, no cache discount since then, it’s been an unrelenting cross-team push to collapse the cost of intelligence. we’re nowhere near done”” / X https://x.com/jeffintime/status/1953534466854453751

We are actively evaluating GPT-5 models on document understanding capabilities 🔎📄 – specifically screenshotting the page and feeding it into the model. A WIP preliminary finding is that even though on paper GPT-5 is $1.25 per 1M tokens, it uses 4-5x more tokens than GPT-4.1, https://x.com/jerryjliu0/status/1953582723672814054

We’re also releasing v0.16 of the Codex CLI today. – GPT-5 is now the default model – Use with your ChatGPT plan – A new, refreshed terminal UI `npm i -g @openai/codex` to update”” / X https://x.com/OpenAIDevs/status/1953559797883891735

We’ve put together some guides on how to get started with GPT-5: 💬 Prompting guide: https://x.com/OpenAIDevs/status/1953528513480347840

What the hell man, this is such a lame way to technically not lie. «A unified system» is… literally just SEPARATE CoT + non-CoT models + a router. > OpenAI reasoning models, including gpt-5-thinking, gpt-5-thinking-mini, and gpt-5-thinking-nano > gpt-5-main just fuck off washed https://x.com/teortaxesTex/status/1953512363031757048

Google DeepMind unveils Genie 3 interactive world generator
Google DeepMind has released Genie 3, an AI system that creates interactive 3D environments from text descriptions. Users can explore these generated worlds in real-time at 24 frames per second, with the system maintaining visual consistency for several minutes at 720p resolution. The technology represents a significant advance from previous versions, with only seven months between Genie 2 and Genie 3. The system can simulate complex physical properties like water, lighting effects, and natural phenomena without using traditional 3D models or game engines. Researchers describe the experience as navigating through a controlled dream-like environment. DeepMind positions this as a stepping stone toward artificial general intelligence, as it enables training AI agents in unlimited simulated environments. The technology could eventually enable virtual reality experiences by generating offset views for each eye, bringing science fiction concepts like Star Trek’s holodeck closer to reality.

One of the big issues with AI videos is consistency across scenes. It isn’t there, but it is getting closer. This is Runway Aleph on a woman riding a snail… “”it is night”” “”the snail is mechanical”” “”show me the front”” “”the snail is very fast and is being pursued by police cars”” https://x.com/emollick/status/1951856889995653305

Runway Aleph transporting me from the range to the battlefield. Footage captured with FPV glasses. https://x.com/bilawalsidhu/status/1951433057665425837

The general release of Runway Aleph on both web and via API. A community showcase. And our weekly Discord challenge. Get caught up on what happened This Week with Runway. https://x.com/runwayml/status/1951634909501575659

Using @Blender and @runwayml Aleph. So many interesting things happening here. If you are exploring similar workflows, DM me. https://x.com/c_valenzuelab/status/1952419024291188794

Google DeepMind launches Veo 3 with native audio generation capabilities
Google DeepMind has released Veo 3, a video generation model that can create videos with synchronized sound effects, ambient noise, and dialogue without requiring separate audio tools. The model produces 4K resolution videos and demonstrates improved accuracy in following user instructions and depicting realistic physics. Early users have created diverse content including stop-motion animations, character-consistent scenes, and cinematic sequences, with the technology now available through APIs at $0.40 per second for the fast version. The system allows creators to specify detailed camera movements, shot compositions, and audio elements in their prompts, enabling production of content ranging from advertisements to short films entirely through text descriptions.

Genie 2 vs Genie 3. Just 7 months between them. The bitter lesson continues to be bitter. https://x.com/bilawalsidhu/status/1952792880285896710

Genie 3 feels like playing a dream – a controlled hallucination of reality. Really makes you wonder if reality is just the same – except instead of just a few minutes, we can recall a whole lifetime. https://x.com/bilawalsidhu/status/1952895900390404231

Genie 3 generates interactive video in real-time. Just need to generate offset left/right eye views and you’ve got stereo VR worlds. No 3D models, no game engine – just generated dreams you can walk through. The holodeck is closer than you think.”” / X https://x.com/bilawalsidhu/status/1953094066993803454

genie 3 is wild. imagine looking over at your reflection in the tv screen and it’s just you standing there with a gopro strapped to your head… 🤯”” / X https://x.com/bilawalsidhu/status/1953158780835012881

Genie 3: A new frontier for world models – Google DeepMind https://deepmind.google/discover/blog/genie-3-a-new-frontier-for-world-models/

Genie-3 just achieved what AAA game engines do – but WITHOUT any 3D models. Interactive REAL-TIME video generation @ 24 fps Wild how this model figured out complex effects like exposure shifts, volumetric god rays, and phenomena we need to code explicitly in 3D engines TL;DR 🧵 https://x.com/bilawalsidhu/status/1952742891295764620

Google is flexing their AI muscles again. DeepMind has unveiled Genie 3, a real-time interactive, general-purpose world model that generates environments from text prompts, with visual memory extending as far back as one minute to keep scenes consistent. It could help advance https://x.com/TheHumanoidHub/status/1952801280059183210

hollup! so can you pre-load a chunk of a real video into genie’s world memory, so what you’ve seen IRL is actually what you see when you look around? can genie 3 basically do neural scene reconstruction in real time?! even if it’s not a “”factual”” rendition of the world like”” / X https://x.com/bilawalsidhu/status/1953187618700574851

Introducing Genie 3, the most advanced world simulator ever created, enabled by numerous research breakthroughs. 🤯 Featuring high fidelity visuals, 20-24 fps, prompting on the go, world memory, and more. https://x.com/OfficialLoganK/status/1952732206176112915

One word: relentless. just in the past two weeks, we’ve shipped: 🌐 Genie 3 – the most advanced world simulator ever 🤔 Gemini 2.5 Pro Deep Think available to Ultra subs 🎓 Gemini Pro free for uni students & $1B for US ed 🌍 AlphaEarth – a geospatial model of the entire planet”” / X https://x.com/demishassabis/status/1953887339094143156

RT @OriolVinyalsML: Incredible evolution of “”Neural Video Games””: from GQN (2018) to Genie3 (2025). The future is exciting! https://x.com/demishassabis/status/1952890039643353219

Sparks of in-context learning in Genie 3. You can prompt Genie 3 with a video (e.g. Veo 3) then control from there. Genie 3 will mimic the dynamics. I think we have only scratched the surface of what can be done with prompting and post-training of foundational world models.”” / X https://x.com/_rockt/status/1953117236975030653

We need to go deeper. Genie 3 is having its inception moment.”” / X https://x.com/shlomifruchter/status/1953155882902274126

We’re entering the era of infinite AI training environments. Google DeepMind just announced Genie 3, the first real-time interactive world model that creates worlds from text prompts. The video below shows a controllable environment generated by Genie 3 in real time. Insane. https://x.com/rowancheung/status/1952732216959623583

World modeling for robotics is incredibly hard because (1) control of humanoid robots & 5-finger hands is wayyy harder than ⬆️⬅️⬇️➡️ in games (Genie 3); and (2) object interaction is much more diverse than FSD, which needs to *avoid* coming into contact. Our GR00T Dreams work was”” / X https://x.com/DrJimFan/status/1952760780706984051

Grok 4 outperforms major AI models in reasoning and chess competitions
Grok 4 has demonstrated superior performance compared to other leading AI models in two significant benchmarks. The model achieved a 15.9% score on the ARC-AGI-2 test, which measures an AI’s ability to solve novel reasoning problems, significantly outperforming GPT-5’s 9.9% score. Additionally, Grok 4 defeated Google’s Gemini AI in the semi-finals of the Kaggle AI Chess competition, advancing to the grand finale. These results suggest Grok 4 has made notable improvements in both abstract reasoning capabilities and strategic game-playing, areas that are considered important indicators of artificial intelligence progress.

Veo – Google DeepMind – Veo 3 video generation lets you add sound effects, ambient noise, and even dialogue to your creations – generating all audio natively. It also delivers best in class quality, excelling in physics, realism and prompt adherence. https://deepmind.google/models/veo/

🌋 Volcano rock 🌊 Ocean wave ⚡ Storm cloud ASMR Made With #Veo3 Automated With My AI Autopilot! https://x.com/Mentor/status/1942016976827863103

🚨Veo3 Update is Here🚨 Wow, this will change how I make film with AI! With Google Veo3, you can now make yourself talk anything in any language, anywhere. What would you create?? https://x.com/herokominato/status/1942729320948256828

An alien vlogs his first Ahmedabad trip, discovering ‘Khalasi’. 👽 Google Veo 3 vividly realizes imagination, speaking Gujarati effortlessly with 100% auto-generated audio. #AI #Khalasi #GenerativeAI #Veo3 https://x.com/drashyakuruwa/status/1942647461522333777

Comparing Kling 2.1 with audio via Thinksound in @replicate (1st 10 sec vid) to Veo3 in Flow Studio (2nd 8 sec vid). Very impressed with Thinksound. Landscape design and real project photo (for i2v) by VizX Design Studio. https://x.com/Clearstory3D/status/1944505549543833656

I AM DYING 😭🤣 made this with Veo3. also side note this just goes to show that any AI brookejlacey video will never, ever be as good as the original. I will be the ultimate reality dealer, mark my words. https://x.com/brookejlacey/status/1944477615827611691

I made a stop-motion animation for Chanel No. 5 using Veo3. Prompt share： A claymation Paris street at midnight — the Eiffel Tower sparkles in the background, and a tiny clay bottle of Chanel No.5 tiptoes across cobblestones. Make sure everything used stop motion animation. https://x.com/crystalsssup/status/1942162692938354804

I made this in 4 hours! Veo 3 image to video character consistency. Midjourney for character creation and style. Runway ML for coverage. Veo 3 image to video with json prompts #AIart #veo3 #midjourneyv7 #runwayml #rockwilerai #ai #PromptEngineering https://x.com/therockwiler/status/1942883991117336812

It’s kinda crazy how easy it is to make ads with AI now! This is my very first try using VEO3. Not bad, right? 😉 https://x.com/agentsrihan/status/1942987346921533463

Just saw the vampire rap generated with Veo3 by my friend @WuxiaRocks. I didn’t know Veo3 can do such cool rap lipsync. Now, I want to create cool raps for brands I love. First up, @Netlify. @biilmann, what do you think?😉 https://x.com/zeng_wt/status/1943684922214171125

people are using veo3 to bring history to life in the form of vlogs 🤣 via HistoryVisualizedbyAI on YouTube https://x.com/tanayj/status/1934373978098778145

Prompt + image segmentation with @GoogleDeepMind VEO3 + @ultralytics 🚀 Here’s the simple workflow: ✅ Generated a video using VEO3 (prompt shared in the comments) ✅ Processed the clip directly with YOLOE for prompt-based image segmentation. Prompt in the comments👇 #AI https://x.com/muhammdrizwanmr/status/1941015898082468277

Quick play with Veo3 + Astra @topazlabs https://x.com/AllarHaltsonen/status/1941202785363788125

sailing like You’ve never seen before 🚤🌊 hyperreal waves, golden sunsets, & raw human emotion brought to life with cinematic precision. generated using #Veo3, now available on @moofeedcom. this isn’t just AI. It’s storytelling in motion. https://x.com/iUllr/status/1943956162874867858

The actual budget friendly launch trailer. Every shot generated with @GoogleDeepMind Veo3. With the philosophical idea that “”Death isn’t the end, forgetting is””. We proudly introducing EzCall AI. Time to speak the words you never got to say https://x.com/zhaoyuWu8/status/1942285389651403102

This entire short film was made using AI. No actors. No cameras. Just prompts, imagination, and tech. Watch it now #AIshortfilm #AIFilmmaking #AhmedabadCrash #RadheWorks #GenAI #OpenAI #FutureOfCinema #CinematicAI #Veo3 https://x.com/punit19nov/status/1942220841657508003

This is AI … but with real actors! The Hollywood film is about to change. We made with Veo3, Runway Reference and Flux Kontext using Me and My friend’s @Jamesgulles_ performances. Will the future be shaped by AI creators or by filmmakers? https://x.com/herokominato/status/1941844050451243187

Two word VEO3 prompt experimentations: > Cat Kaleidoscope 🔊 I find this calming kind of like ASMR but with hypnotic video https://x.com/rBKeeper/status/1943202740945006659

Veo 3 Fast and Veo 3 image-to-video are now available in the API! 📹 Veo 3 Fast is $0.40 per second of video (with audio) and comes with production ready rate limits and has comparable quality in certain cases! https://x.com/OfficialLoganK/status/1950959720606396655

Veo3 (Fast) is actually much better with consistent character than Veo3 (Quality). Here’s a 4-scene video of a Japanese tight rope walker on top of a sky scraper. Prompts in the comment. https://x.com/juminoz/status/1942399268192674285

Veo3 fast { “”shot””: { “”composition””: “”High-angle tracking shot from a helicopter, 200mm telephoto lens on a stabilized gimbal system, shot on RED Helium 8K S35″”, “”camera_motion””: “”aerial tracking following the emus’ path, with gradual zoom-in””, “”frame_rate””: https://x.com/IamEmily2050/status/1941126453715948005

Veo3 fast on the Gemini app 😀 { “”shot””: { “”composition””: “”POV first-person perspective, 35mm lens, shot on ARRI Alexa Mini, shallow depth of field with focus pulls on the trainer’s face, sword, and gate””, “”camera_motion””: “”handheld with natural inertia—subtle sway https://x.com/IamEmily2050/status/1943206541496369569

Grok launches ultra-fast AI image and video generation tools
X (formerly Twitter) has released Grok Imagine, a new AI feature that creates images and videos at unprecedented speeds. The tool, available through the Grok app, generates visual content so quickly that users report it produces images faster than they can scroll through results. Initially offered as a free trial to US users, the feature is now available to all X Premium subscribers. Early users describe the generation speed and quality as exceptional, with the system capable of creating both still images and videos through simple text prompts.

Grok 4 is still state-of-the-art on ARC-AGI-2 among frontier models. 15.9% for Grok 4 vs 9.9% for GPT-5. https://x.com/fchollet/status/1953511631054680085

Grok-4 BEATS GPT-5 on ARC-AGI-2 https://x.com/scaling01/status/1953509485453902173

RT @cb_doge: 🚨 BREAKING: Grok 4 defeats Google’s Gemini in the Kaggle AI Chess semi-final and moves on to the grand finale! 🤖♟️🔥 https://t.…”” / X https://x.com/hyhieu226/status/1953220787084902888

ElevenLabs expands into AI music generation with commercial licensing deals
ElevenLabs, known for its text-to-speech AI tools, has launched a music generation service that creates songs from text prompts. The company claims the AI-generated music is cleared for commercial use through licensing agreements with Merlin Network and Kobalt Music Group, which represent artists like Adele, Nirvana, Beck, and Childish Gambino. Artists must opt-in for their music to be used in AI training, and they receive revenue sharing from the deals. The service comes with strict usage restrictions, prohibiting users from referencing specific artists, song titles, or lyrics in their prompts, and banning use in certain industries like firearms, tobacco, and political campaigns. This launch follows legal challenges faced by competitors Suno and Udio, who were sued by the Recording Industry Association of America for allegedly training their models on copyrighted material without permission.

Grok 4 Imagine generates images faster than I can scroll. How is this even possible? It’s so good. 😭”” / X https://x.com/tetsuoai/status/1951444393065586840

RT @elonmusk: For the next few days, Grok Imagine video generation is free to all US users! Download the Grok app and try it out.”” / X https://x.com/Yuhu_ai_/status/1953367318521655594

Super fast image & video generation via Imagine in the @Grok app is now available to all 𝕏 Premium users”” / X https://x.com/elonmusk/status/1952535613560983757

You guys need to try imagine mode on grok app. It’s incredible.”” / X https://x.com/tobi/status/1951789462268391749

Claude’s system prompt receives updates for improved performance
Anthropic has updated Claude’s system prompt to enhance the AI assistant’s capabilities and responses. The changes aim to improve Claude’s ability to follow instructions more accurately, provide clearer explanations, and maintain consistency across different types of tasks. The updates include refinements to how Claude processes user requests, handles edge cases, and structures its outputs. These modifications are part of Anthropic’s ongoing efforts to make Claude more helpful and reliable for users across various applications, from creative writing to technical problem-solving. The company has not disclosed all specific changes but indicates the updates should result in more natural conversations and better alignment with user intentions.

ElevenLabs launches an AI music generator, which it claims is cleared for commercial use | TechCrunch https://techcrunch.com/2025/08/05/elevenlabs-launches-an-ai-music-generator-which-it-claims-is-cleared-for-commercial-use/

Music Terms | ElevenLabs https://elevenlabs.io/music-terms

6 AI Visuals and Charts: Week Ending August 08, 2025

concerning https://x.com/DZhang50/status/1953510507631071658

FantasyPortrait: Enhancing Multi-Character Portrait Animation with Expression-Augmented Diffusion Transformers https://fantasy-amap.github.io/fantasy-portrait/

4d gaussian splat processed from one of the largest volumetric capture stages ever made. And fun fact, it was Intel that originally made it! I remember seeing the results in 2017, and my mind was blown. You could capture something once, and reframe it infinitely in post. Alas, https://x.com/bilawalsidhu/status/1952000783492186424

Relightable Full-body Gaussian Codec Avatars https://neuralbodies.github.io/RFGCA/

more tests reskinning google earth photogrammetry renders with runway’s aleph video-to-video ai model https://x.com/bilawalsidhu/status/1950717547206037511

i think the whole interactive video vs. explicit 3d debate is about get supercharged this week. meanwhile, here’s me reskinning 3d gaussian splat renders with runway aleph. https://x.com/bilawalsidhu/status/1952489882024386819