About This Week’s Covers

This week’s cover was my favorite of the subcategory covers. I wanted to see if OpenAI could create a rubric for a challenging topic, so I gave it the recent Met Gala theme of black dandyism and the related book “Slaves to Fashion: Black Dandyism and the Styling of Black Diasporic Identity”.

I figured this would be a good test to see how GPT o3 would navigate a historical topic with symbolism and deep context and nuance. I asked GPT to first build a rubric, so that I could give it single-word category prompts, and it could apply the rubric to develop historically and culturally sensitive and appropriate fashion looks based on the technology theme. That’s a pretty tough assignment.

The rubric itself was off the charts with incredible depth and thoughtfulness as well as accuracy. However, once the rubric was applied to the category titles, the rubric did not result in an effective prompt. Much of the dandyism theme got lost while the category title word overwhelmed the image composition. Many of the images were just plain bad.

I could’ve fixed this with a few revisions, but my category image exercise is always meant to give AI a single opportunity, and resist the temptation to edit and improve the results. I will create a separate blog post that includes the rubric as well as the prompts so you can check them out.

The main cover that I selected is pretty cool though. The prompt was for the Local category: “Wearing a bespoke trench lined with quilted newspaper clippings from the Black press, a hometown hero strides confidently through a brick alley painted with murals of Harriet Tubman and digital graffiti; a hat tipped low, shot with documentary intimacy on 35mm — celebrating the dignity of place and the power of community-sourced data.” I liked the subtlety of the newspaper peeking out from under the lining of the trench coat.

o3 wrote the prompts, and most of the images were made with GPT Image 1. I ran a few through MidJourney, Flux, and Ideogram to spot check the quality. I prefer the MidJourney images for fashion imagery (by far), but MidJourney still does not have an API and I have not figured out the Discord workaround yet. I’ve included my favorite six category covers below.

This Week By The Numbers

Total Organized Headlines: 548

This Week’s Executive Summaries

The top story this week was Google Gemini’s 2.5 Pro model becoming the first model to achieve the number one rank across all text, vision, and web dev benchmarks simultaneously. Gemini 2.5’s multimodal skills are impressive. The model can watch videos and convert them into functional web applications. Gemini has such a large memory that it can review 50,000 lines of code (!) at once within a chat window with no problem. Google is finally the top dog in the artificial intelligence model universe. At least for this week.

To test Google Gemini 2.5’s incredible memory, professor Ethan Mollick gave the model the entire text of the novel War and Peace, added a single imposter sentence in varying places, and the model caught it every time.

OpenAI’s powerful o3 model successfully analyzed a Harvard Business School case study from a PDF and was able to extract scattered financial data and build a business model comparable to an MBA student.

Both OpenAI and Google have integrated shopping capabilities into their AI features. A big change is coming for retailers (and affiliate models) this year. Sorry Athletic Greens (LOL), I’m rooting against you.

Anthropic has integrated web searching into both its chat model and the API. This is yet another shift in the publishing landscape and the internet itself, as models search the Internet and retrieve information without leaving the chat.

A remarkable trend over the last three weeks is Open AI’s ability to guess where a picture is located. If you haven’t tried it already, give a photo to o3 and ask it to guess where the photo was taken. One of the coolest aspects is that the chat window will display the thought process as the AI tries to figure out where the picture is in the world. While not every image can be located, the ability is uncanny and very disconcerting when you see it in action.

Apple plans to add artificial intelligence search options to its web browser, Safari. While this may seem unexpected, it actually lines up, as we are watching the disintegration of the web browser as websites are absorbed into chat interfaces. In hindsight, it seems like a pretty good move by Apple.

OpenAI is partnering with the FDA to employ artificial intelligence to expedite the drug approval process.

Former Google CEO Eric Schmitt has a startup called FutureHouse which has built five specialized artificial intelligence agents to help scientists navigate large amounts of data and research.

OpenAI has announced a restructuring from a for-profit company to a public benefit corporation. To be candid, a lot of it is just noise to me. However, it is important, and I’ve included all of the relevant links below.

LinkedIn has added AI powered job searching which supposedly allows users to search for jobs using plain language descriptions. If it works, it sounds pretty cool. After seeing the tool Boardy in action, it was clear that putting language search on top of large structured data sets will be very powerful.

These are the main callouts, however there are several more newsworthy stories with full executive summaries below. Never a dull moment or week in artificial intelligence news!

Gemini 2.5 Pro Makes History as First Model to Sweep All Benchmarks
Google’s Gemini 2.5 Pro becomes the first AI model to achieve #1 rankings across all text, vision, and WebDev benchmarks simultaneously. Amazingly. Gemini 2.5 can watch videos and convert them functional web applications. It can also generate complete single page applications and responsive mobile games from scratch on the first try. Its interface development abilities were a dramatic improvement of 147 Elo points on the WebDev Arena leaderboard, surpassing Claude for the first time. Beyond user interfaces, Gemini 2.5 exhibits significantly fewer errors in complex coding tasks. It’s memory window is so large that it can analyze 50,000 lines of code at once.

Big Gemini 2.5 Pro Update! Better coding and UI web applications! We’re excited to drop this I/O preview early, focused on coding, especially UIs, new video-to-code features and improved agentic capabilities. 🌋 > Better on LiveCodeBench and Aider > #1 on @lmsysorg WebDev Arena https://x.com/_philschmid/status/1919770969788313836

Google released an early preview of Gemini 2.5 Pro I/O Edition, delivering: —Enhanced performance for frontend and UI dev, code editing, and agentic workflows —Top score on LM Arena and WebDev Arena, beating Claude 3.7 Sonnet —Video understanding features https://x.com/rowancheung/status/1920018816236499178

The Ultimate LLM Meta-Leaderboard averaged across the 28 best benchmarks Gemini 2.5 Pro > o3 > Sonnet 3.7 Thinking https://x.com/scaling01/status/1919217718420508782

🚨Breaking: @GoogleDeepMind’s latest Gemini-2.5-Pro is now ranked #1 across all LMArena leaderboards 🏆 Highlights: – #1 in all text arenas (Coding, Style Control, Creative Writing, etc) – #1 on the Vision leaderboard with a ~70 pts lead! – #1 on WebDev Arena, surpassing Claude https://x.com/lmarena_ai/status/1919774743038984449

New Gemini-2.5-Pro ranks #1 on WebDev Arena as well, first model surpassing Claude! 🏆 https://x.com/lmarena_ai/status/1919774753398915225

Gemini 2.5 Pro has dethroned Sonnet 3.7 on the WebDevArena Leaderboard. Not even o3 could do that! https://x.com/scaling01/status/1919771796334616759

We will enter a new era for vibe coding! The new Gemini 2.5 Pro can now zero-shot full Single Page Application, Complete Responsive Mobile Games, convert UI screenshots precisly to working code. Can’t wait and see what @cursor_ai ,@github, @windsurf_ai, @boltdotnew, @v0, https://x.com/_philschmid/status/1919774801767317799

Gemini 2.5 Pro update: Coding, web apps with Gemini https://blog.google/products/gemini/gemini-2-5-pro-updates/

Gemini 2.5 Pro Preview: even better coding performance – Google Developers Blog https://developers.googleblog.com/en/gemini-2-5-pro-io-improved-coding-performance/

Gemini 2.5 Finds Single Fake Sentence in War and Peace
Google’s Gemini 2.5 demonstrated remarkable search precision in a test by professor Ethan Mollick. The AI successfully located a single fabricated sentence about “Crab Man the superhero” deliberately inserted into Tolstoy’s “War and Peace,” consistently finding this needle in a haystack of 860,000 tokens. This highlights the model’s enhanced ability to process and analyze extensive documents with exceptional accuracy.

Pretty awesome result from the new version of Gemini 2.5 I changed one line of War and Peace, inserting a sentence into Book 14, Chapter 10 (halfway through), where Princess Mary “spoke to Crab Man the superhero” Gemini 2.5 consistently found this reference among 860,000 tokens https://x.com/emollick/status/1919966879398891579

OpenAI’s O3 Model Solves Complex Harvard Business School Case
OpenAI’s O3 model successfully analyzed a Harvard Business School case study from PDF format, extracting scattered financial data and building coherent business models comparable to MBA-level work. While testing showed no hallucinations in this instance, Ethan Mollick cautions the system still occasionally generates incorrect information. The model demonstrated impressive capabilities in brand strategy, strategic positioning, and finance questions, producing responses that matched the quality expected from business school graduates.

o3 now cracks new Harvard Business School cases from the PDF, in one shot I blurred the figures to not ruin the case, but I asked the AI to figure out financials, which incorporates data scattered throughout the case. More interesting, I asked it to compare to the case’s answer. https://x.com/emollick/status/1918355078253027802

OpenAI and Google Launch Competing AI Shopping Features
Both OpenAI and Google are integrating shopping capabilities into their AI platforms. ChatGPT now offers improved product results with visual details, pricing, reviews, and direct purchase links – available to all users regardless of subscription status. Meanwhile, Google’s upcoming AI Mode will leverage their database of 45 billion products and 250 million places to provide real-time pricing and local shopping recommendations within AI search results.

i have been on a shopping bender this morning, this is much better than i expected!” / X https://x.com/sama/status/1918735773098004680

For example, searching “best vintage shops for mid-century modern furniture. I’m trying to find a coffee table or record cabinet” will give you relevant local options with descriptions, images, ratings, live pricing, availability and more. This all is powered by Google’s https://x.com/rmstein/status/1917976694377574711

Shopping We’re experimenting with making shopping simpler and faster to find, compare, and buy products in ChatGPT. ✅ Improved product results ✅ Visual product details, pricing, and reviews ✅ Direct links to buy Product results are chosen independently and are not ads. https://x.com/OpenAI/status/1916947243044856255

Anthropic Launches Web Search Across All Products and API
Anthropic has integrated web search capabilities into both its API and chat application. The feature adjusts search depth based on query complexity, enabling developers to build applications that deliver up-to-date information without maintaining separate search infrastructure. Claude generates targeted queries, analyzes results with source citations, and can perform progressive searches for more comprehensive answers. Seach helps supports applications requiring real-time data across industries including financial services, legal research, and software development.

Introducing web search on the Anthropic API \ Anthropic https://www.anthropic.com/news/web-search-api

Web search is now available on our API. Developers can augment Claude’s comprehensive knowledge with up-to-date data. https://x.com/AnthropicAI/status/1920209430529900791

Friday feature drop: We’ve improved web search and rolled it out worldwide to all paid plans. Web search now combines light Research functionality, allowing Claude to automatically adjust search depth based on your question.” / X https://x.com/alexalbert__/status/1918349277962879218

We’ve added a web search tool to the Anthropic API, giving Claude direct access to real-time web content. You can control the depth of Claude’s search by changing the max_uses parameter. https://x.com/alexalbert__/status/1920207966256705888

OpenAI’s o3 Model Defeats Master Geoguessr Player, Ignores Fake GPS Data
File under “see it to believe it”. OpenAI’s o3 model outscored a Master I-ranked Geoguessr player 23,179 to 22,054 in a head-to-head match, correctly identifying all five countries and twice pinpointing locations within a few hundred meters. The AI demonstrated impressive visual reasoning skills, analyzing architecture, road markings, vegetation, and signage to determine locations. When the player embedded fake GPS coordinates in image EXIF data, o3 identified the inconsistencies and still accurately located the real settings. While the AI took longer to make decisions (4+ minutes versus the human’s typical 2 minutes), it showed mastery of geographic details that typically require thousands of practice games for humans to develop. The test confirms o3’s geolocation ability isn’t based on hidden metadata but genuine visual analysis and reasoning.

Sam Patterson https://sampatt.com/blog/2025-04-28-can-o3-beat-a-geoguessr-master

Andrej Karpathy Predicts Visual Future for AI Interfaces
Tech pioneer Andrej Karpathy argues that text-based AI interactions are comparable to 1980s computer terminals before graphical interfaces emerged. He envisions future AI interfaces becoming highly visual, utilizing pictures, charts, and animations to leverage the brain’s processing power, as vision represents our highest-bandwidth information channel. These interfaces will likely be generated on-demand for specific user needs, with content dynamically arranged for immediate purposes. Karpathy suggests we’re seeing early signs of this evolution in current features like code highlighting, LaTeX support, markdown formatting, emoji, and artifacts that include Mermaid charts and basic applications.

Chatting” with LLM feels like using an 80s computer terminal. The GUI hasn’t been invented, yet but imo some properties of it can start to be predicted. 1 it will be visual (like GUIs of the past) because vision (pictures, charts, animations, not so much reading) is the 10-lane https://x.com/karpathy/status/1917920257257459899

Apple Plans to Add AI Search Options to Safari
Apple is looking to incorporate AI search engines from OpenAI, Perplexity, and Anthropic into Safari, according to testimony from Apple’s Services VP Eddy Cue. Speaking during the DOJ lawsuit against Alphabet, Cue revealed Safari searches declined last month for the first time due to growing AI use. He believes AI search will eventually replace traditional search engines like Google, though these services need improvement before becoming default options. Apple has already held discussions with Perplexity about potential integration.

Apple Explores Move to AI Search in Browser Amid Google Fallout – Bloomberg https://www.bloomberg.com/news/articles/2025-05-07/apple-working-to-move-to-ai-search-in-browser-amid-google-fallout?embedded-checkout=true

Apple is looking to add AI search engines to Safari | TechCrunch https://techcrunch.com/2025/05/07/apple-is-looking-to-add-ai-search-engines-to-safari/

Anthropic Enables Custom MCP Server Integration with Claude’s Website
Anthropic has introduced functionality allowing users to connect any custom MCP server to claude.ai. Users simply need to host their server remotely and provide the URL to add it to the platform. According to Anthropic’s Alex Albert, this is a big deal and may have been overlooked in the initial launch announcement despite its significance for developers and organizations seeking to extend Claude’s functionality with custom tools and service integrations.

Not sure if some folks are fully realizing what we launched today so to make it more explicit: You can bring any custom MCP server into claude dot ai now. All you need is to host it somewhere (remote MCP) and provide the URL link to add it in. https://x.com/alexalbert__/status/1918047745790914772

FDA and OpenAI in Talks to Use AI for Drug Evaluation
The FDA is meeting with OpenAI to explore using artificial intelligence to accelerate drug approval processes through a project called cderGPT. This AI system would assist the FDA’s Center for Drug Evaluation and Research, which oversees both prescription and over-the-counter medications. Representatives from Elon Musk’s DOGE have also participated in these discussions. The initiative targets the final stages of drug development, which traditionally can take more than a decade from research to market approval. While AI could help automate documentation review and data analysis, questions remain about controlling for AI reliability in these critical regulatory processes. The FDA has broader plans to implement AI tools agency-wide for scientific reviews starting in June 2025.

OpenAI and the FDA Are Holding Talks About Using AI In Drug Evaluation | WIRED https://www.wired.com/story/openai-fda-doge-ai-drug-evaluation/

Eric Schmidt’s FutureHouse Launches AI Agents for Scientific Research – Similar to the FDA and OpenAI Story
FutureHouse, backed by former Google CEO Eric Schmidt, has introduced five specialized AI agents designed to help scientists navigate vast amounts of research data. The platform includes Crow for general research, Falcon for literature reviews, Owl for identifying previous research, Phoenix for chemistry workflows, and now Finch for biological data analysis. The agents address a critical bottleneck in scientific discovery by processing millions of research papers and specialized databases that would overwhelm human researchers. According to FutureHouse, their agents have demonstrated better precision than PhD-level researchers in head-to-head literature search tasks. The platform offers both web interface and API access, allowing researchers to build automated systems that continuously monitor new publications or conduct large-scale literature searches.

Former Google CEO Eric Schmidt-backed FutureHouse launched four ‘superhuman’ AI agents for scientific discovery —Crow for general research —Falcon for deep literature reviews —Owl for identifying previous research —Phoenix for chemistry workflows https://x.com/rowancheung/status/1919286217197170877

FutureHouse Platform: Superintelligent AI Agents for Scientific Discovery | FutureHouse https://www.futurehouse.org/research-announcements/launching-futurehouse-platform-ai-agents

Former Google CEO Eric Schmidt-backed FutureHouse released Finch, an AI agent for discovery in biology Currently in beta, Finch can do open-ended and directed data analysis It joins FutureHouse’s four previously announced ‘superintelligent’ AI agents https://x.com/rowancheung/status/1920018905352769783

OpenAI Transitions For-Profit Arm to Public Benefit Corporation
OpenAI announced it will convert its for-profit arm into a Public Benefit Corporation (PBC) while maintaining nonprofit control with a majority stake. The nonprofit will serve as both controller and significant shareholder of the PBC, ensuring the organization adheres to its original mission of ensuring artificial general intelligence benefits humanity. This structural change follows discussions with Delaware and California Attorneys General. In a letter to employees, CEO Sam Altman emphasized OpenAI’s commitment to “democratic AI” that puts powerful tools in everyone’s hands rather than limiting access to a select few. The company aims to simplify its previous capped-profit structure, transitioning to a standard capital model where all employees hold stock, while keeping the nonprofit in control.

OpenAI Restructuring: Microsoft (MSFT) Is Key Holdout of Plan – Bloomberg https://www.bloomberg.com/news/articles/2025-05-05/microsoft-said-to-be-key-holdout-for-openai-restructuring-plan?embedded-checkout=true

OpenAI said it will convert its existing for-profit arm into a public benefit corporation—but keep the non-profit in control with a majority stake over the PBC. The move came after pressure from ex-employees and a long legal battle with Elon Musk https://x.com/rowancheung/status/1919656417780248613

A message from Bret Taylor, Chair of the OpenAI Board of Directors, and a letter from @sama about our structure. https://x.com/OpenAI/status/1919453166979957115

Evolving OpenAI’s Structure | Hacker News https://news.ycombinator.com/item?id=43897772

Elon Musk Lawyer Says OpenAI Restructuring Update ‘Changes Nothing’ – Bloomberg https://www.bloomberg.com/news/articles/2025-05-06/musk-s-lawyer-says-openai-restructuring-update-changes-nothing?embedded-checkout=true

OpenAI Adds Former Instacart CEO Fidji Simo to Executive Team
Sam Altman announced that Fidji Simo will join OpenAI as CEO of Applications, reporting directly to him. Simo, currently finishing her tenure at Instacart and already serving on OpenAI’s board, will oversee business and operational teams responsible for bringing AI research to market. Altman remains OpenAI’s CEO with continued oversight of all company pillars, Research, Compute, and Applications, while planning to increase his focus on research initiatives and safety systems.

OpenAI Expands Leadership with Fidji Simo | OpenAI https://openai.com/index/leadership-expansion-with-fidji-simo/

OpenAI to Acquire Coding Tool Windsurf for $3 Billion
OpenAI reportedly agreed to purchase AI-assisted coding platform Windsurf for approximately $3 billion, according to Bloomberg. The acquisition would strengthen ChatGPT’s programming capabilities as competition intensifies in the AI development space. Formerly known as Codeium, Windsurf was previously valued at $1.25 billion last August after securing $150 million in funding led by General Catalyst. This marks OpenAI’s largest acquisition to date, coming as the company’s weekly active users exceeded 400 million in February and amid plans to raise up to $40 billion at a $300 billion valuation.

OpenAI agrees to buy Windsurf for about $3 billion, Bloomberg News reports | Reuters https://www.reuters.com/business/openai-agrees-buy-windsurf-about-3-billion-bloomberg-news-reports-2025-05-06/

OpenAI has now agreed to buy Windsurf formerly Codeium for about $3 billion, its largest acquisition yet. Bloomberg first broke the news of talks several weeks ago. Win for Kleiner, General Catalyst, Greenoaks. Scoop with @rachelmetz https://x.com/Katie_Roof/status/1919547270913048804

Scoop: OpenAI is in talks to acquire AI code editor Windsurf for $3 billion. w/ @rachelmetz & @shiringhaffary https://x.com/KateClarkTweets/status/1912569653777301816

OpenAI Reaches Agreement to Buy Windsurf for $3 Billion – Bloomberg https://www.bloomberg.com/news/articles/2025-05-06/openai-reaches-agreement-to-buy-startup-windsurf-for-3-billion?embedded-checkout=true

Google Introduces AI Max for Search Campaigns
Google is launching AI Max for Search campaigns, a one-click feature suite that enhances targeting and creative capabilities for advertisers. The tool helps campaigns discover new relevant queries by expanding beyond traditional keywords with broad match and keywordless technology. Early data shows advertisers typically see 14% more conversions at similar cost metrics, with even higher performance (27%) for campaigns previously using exact and phrase keywords. The suite includes text customization to generate optimized headlines and descriptions, final URL expansion to direct users to the most relevant landing pages, and enhanced controls like “locations of interest” targeting. L’Oréal reports achieving 2X higher conversion rates at 31% lower costs, while MyConnect saw 16% more leads and discovered that 30% of conversions came from previously untapped search queries. The global beta rollout begins later this month.

Introducing AI Max for Search campaigns https://blog.google/products/ads-commerce/google-ai-max-for-search-campaigns/#relevant

Nvidia’s Parakeet V2 Sets New Speech Recognition Standard
Nvidia’s Parakeet V2 speech recognition model has claimed the top spot on the HuggingFace Open-ASR Leaderboard with a 6.05% Word Error Rate, outperforming competitors like ElevenLabs’ Scribe and OpenAI’s Whisper. The open-source model can transcribe an hour of audio in just one second—about 50 times faster than alternatives. Beyond standard transcription, Parakeet V2 offers specialized capabilities including song-to-lyrics conversion and precise timestamp formatting. The model is available under a CC-BY-4.0 license, making it accessible for developers to incorporate into their applications.

Nvidia released Parakeet V2, a new open-source automatic speech recognition AI —Transcribes an hour of audio in a second —Top model on the Open ASR, beating ElevenLabs’ Scribe and OpenAI’s Whisper —6.05% Word Error Rate —Available under CC-BY-4.0 license https://x.com/rowancheung/status/1919656472574615857

🏆 With our new Parakeet model (parakeet-tdt-0.6b-v2), we have achieved a new standard for automatic speech recognition (ASR) with an 👀 industry-best 6.05% Word Error Rate on the @HuggingFace Open-ASR-Leaderboard. 🦜 Parakeet V2 takes performance to the next level with https://x.com/NVIDIAAIDev/status/1917976429939351944

Anduril Acquires Irish Tech Firm Klas to Enhance Military Edge Computing
Defense technology company Anduril is acquiring Klas, an Irish edge computing and tactical communications specialist. This strategic move integrates Klas’ rugged Voyager hardware systems, designed to function in extreme conditions, with Anduril’s autonomous defense systems. The combined technologies aim to provide military operations with more robust computing power and connectivity in remote or harsh environments where traditional infrastructure fails. Klas will maintain its facilities in Ireland and the U.S., establishing Anduril’s first Dublin office and expanding its international footprint.

Anduril to acquire Ireland’s Klas to bolster AI warfare systems | Reuters https://www.reuters.com/business/aerospace-defense/anduril-acquire-irelands-klas-bolster-ai-warfare-systems-2025-05-05/

LinkedIn Upgrades Job Search with AI-Powered Matching
LinkedIn introduced AI tools that help job seekers find positions matching their career goals without requiring exact keyword matches. Users can now describe what they want in plain language, such as “business development roles in video games” or even abstract goals like “using brand marketing skills to cure cancer,” and the system finds relevant opportunities. The AI understands context and infers skills, helping both applicants who currently submit more applications than before and hiring teams who spend hours reviewing often unqualified candidates. Additional features include a “job match” function that analyzes how well someone fits a position before applying and indicators showing if positions are actively hiring.

LinkedIn’s new AI tools help job seekers find smarter career fits – Fast Company https://www.fastcompany.com/91329471/linkedins-new-ai-tools-help-jobseekers-find-better-fit-roles

DeepSeek Prover-V2 Sets New Benchmark for AI Mathematics
DeepSeek’s Prover-V2 combines informal mathematical reasoning with theorem proving capabilities in a 671B parameter open-source model. It solves 88.9% of problems on the MiniF2F benchmark by breaking down complex proofs into manageable subgoals before formal verification. The system also shows significant improvements on PutnamBench and handles AIME 24 & 25 problems in formal mathematics, marking a substantial advancement in AI’s ability to tackle higher-level mathematical challenges.

DeepSeek released Prover-V2, an open-source AI combining informal math reasoning with theorem proving With 671B params, the model solves 88.9% of problems on MiniF2F It does a ‘cold-start’ to break down proofs into subgoals before formal verification https://x.com/adcock_brett/status/1919060364655800684

We just released DeepSeek-Prover V2. – Solves nearly 90% of miniF2F problems – Significantly improves the SoTA performance on the PutnamBench – Achieves a non-trivial pass rate on AIME 24 & 25 problems in their formal version Github: https://x.com/zhs05232838/status/1917600755936018715

Apple Teams Up with Anthropic to Develop AI Coding Platform
Apple is partnering with Anthropic to create a “vibe-coding” platform that helps programmers write, edit, and test code using generative AI, according to Bloomberg. The system builds on Apple’s existing Xcode software and is powered by Anthropic’s Claude Sonnet model. While currently planned for internal use only, Apple hasn’t ruled out future public release. This collaboration joins Apple’s growing AI partnerships, which include OpenAI for Apple Intelligence features and potentially Google’s Gemini in the future. Claude models are already popular among developers for coding tasks, particularly on platforms like Cursor and Windsurf.

Apple, Anthropic Team Up to Build AI-Powered ‘Vibe-Coding’ Platform – Bloomberg https://www.bloomberg.com/news/articles/2025-05-02/apple-anthropic-team-up-to-build-ai-powered-vibe-coding-platform?embedded-checkout=true

Apple and Anthropic reportedly partner to build an AI coding platform | TechCrunch https://techcrunch.com/2025/05/02/apple-and-anthropic-reportedly-partner-to-build-an-ai-coding-platform/

Netflix Unveils Redesigned TV Experience with Smarter Recommendations
Netflix has redesigned its TV interface to make finding content more intuitive. Chief Product Officer Eunice Kim and CTO Elizabeth Stone announced the update featuring simplified navigation, more responsive recommendations, and improved content discovery. The redesign moves Search and My List shortcuts to the top of the screen for easier access and enhances title information with relevant callouts like “Emmy Award Winner.” On mobile, Netflix is testing AI-powered natural language search and a vertical feed of video clips for easier discovery. The global rollout begins in the coming weeks.

Unveiling Our Innovative New TV Experience Featuring Enhanced Design, Responsive Recommendations and a New Way to Search – About Netflix https://about.netflix.com/en/news/unveiling-our-innovative-new-tv-experience

Andrej Karpathy Tries Vibe Coding, Builds App Without Writing Code
Andrej Karpathy wanted to try vibe coding, so he built a web application called MenuGen without writing any code himself. The app helps restaurant-goers visualize unfamiliar menu items by generating images from photos of menus. While creating a local demo was enjoyable, deploying a production-ready application proved challenging. Karpathy discovered that modern app development involves navigating numerous services, configurations, and APIs rather than just coding. He questions how AI can effectively automate software development when so much work happens outside the code editor in browser interfaces that AI tools cannot access.

I attended a vibe coding hackathon recently and used the chance to build a web app (with auth, payments, deploy, etc.). I tinker but I am not a web dev by background, so besides the app, I was very interested in what it’s like to vibe code a full web app today. As such, I wrote https://x.com/karpathy/status/1917961248031080455

Google to Introduce Gemini Chatbot for Children Under 13
Google will make its Gemini AI chatbot available next week to children under 13 who have parent-managed accounts. The service will allow children to ask questions, get homework help, and create stories through Google’s Family Link program, which requires parents to provide personal information like name and birth date. Google promises specific safety guardrails for younger users and won’t use children’s interactions to train its AI systems. However, the company acknowledges potential risks, advising parents to help children “think critically” about responses, fact-check information, understand that “Gemini isn’t human,” and be prepared that children “may encounter content you don’t want them to see” despite content filters. This move intensifies competition for young users among AI companies, even as children’s advocacy groups like UNICEF warn these systems could confuse, misinform, or manipulate children who might struggle to understand they’re not interacting with humans.

Google Plans to Roll Out Gemini A.I. Chatbot to Children Under 13 – The New York Times https://www.nytimes.com/2025/05/02/technology/google-gemini-ai-chatbot-kids.html

HeyGen Launches Avatar IV for Lifelike Digital Expressions
HeyGen’s Avatar IV transforms single photos into expressive animations through its audio-to-expression engine. The system analyzes voice recordings to capture emotional nuances and generate realistic facial movements that match tone and rhythm. Beyond human subjects, the technology works with pets, fictional characters, or other creative concepts, supporting various camera angles and output formats. Users need only provide one photo and a voice script to create animations in seconds.

HeyGen dropped Avatar IV, an AI for expressive animations With one photo and voice script, its audio to expression engine captures tone, rhythm, and emotion to generate facial motion Also supports different subjects, camera shots, and formats https://x.com/rowancheung/status/1920018838462095760

Avatar IV is here and it changes everything. The most advanced avatar model we’ve ever built. Upload one photo and a script. That’s it. Our new audio to expression engine captures your tone, rhythm, and emotion, then generates facial motion so real it feels alive. And it’s https://x.com/HeyGen_Official/status/1919824467821551828

Meta Advances Vision and Perception Models for Open AI Development
Meta introduced the Perception Language Model (PLM), a new open-source vision-language system designed to tackle challenging visual tasks. Trained on a combination of synthetic data and 2.5 million human-labeled video samples, PLM comes in 1B, 3B and 8B parameter variants to support transparent academic research. The model aims to help the open source community build more capable computer vision systems that can understand fine-grained activities and perform spatiotemporally grounded reasoning. This release is part of a broader initiative from Meta’s FAIR team that includes four additional AI perception tools: the Meta Perception Encoder for image and video classification; Meta Locate 3D for 3D object localization using natural language; Dynamic Byte Latent Transformer for more robust language processing; and Collaborative Reasoner, which improves how language models work together on complex tasks.

Introducing Meta Perception Language Model (PLM): an open & reproducible vision-language model tackling challenging visual tasks. Learn more about how PLM can help the open source community build more capable computer vision systems. Read the research paper, and download the https://x.com/AIatMeta/status/1920153975921521018

9 AI Visuals and Charts: Week Ending May 09, 2025

o3 I want you to make a map of the lighthouses of the great lakes. I want the map in “dark mode “ but each lighthouse marker should be aesthetically sized so it covers the distance it can be seen on an average night and is the color of the light” Few rounds of feedback later… https://x.com/emollick/status/1918888777826676738

Me: “o3, do the first chapter of Genesis as an IKEA instruction manual and show me the flatpack,” “Fix the spelling” Me: “Now come up with an idea of what to do next” o3: “How about Genesis 2 – Garden Starter Kit?” Me: “Sure.” https://x.com/emollick/status/1918545934633357596

scale of stargate 1 site is hard to describe. very easy to overlook the size of machine you’re programming when training frontier models. https://x.com/gdb/status/1920254049590321395

Sim-to-ruin https://x.com/TheHumanoidHub/status/1918080151155511326

Deep Dive into Long Context – YouTube https://www.youtube.com/watch?v=NHMJ9mqKeMQ

Reinforcement Learning for Agents – Will Brown, ML Researcher at Morgan Stanley – YouTube https://www.youtube.com/watch?v=JIsgyk0Paic&t=54s

GraphRAG: The Marriage of Knowledge Graphs and RAG: Emil Eifrem – YouTube https://www.youtube.com/watch?v=knDDGYHnnSI&t=1s

Dyna Robotics introduced DYNA-1, a robot foundation model for high-throughput dexterous tasks The co’s founder released a video showing how the model enabled a robot to fold 850+ napkins in 24 hours, with a 99.4% success rate and zero human intervention https://x.com/adcock_brett/status/1919060493488070677

China’s Deep Robotics launched Lynx M20, a rugged version of its robo-dog It can run through rough terrain and extreme temperature Specialized for tasks like power inspection, emergency response, and logistics https://x.com/adcock_brett/status/1919060538379677767