About This Week’s Covers

This week’s cover was an attempt at humor: artificial general intelligence hiding in plain sight. The concept was a superintelligent computer doing a very poor job of hiding in a field on an over-the-top spring day. While my ego wants to make really great covers, the exercise here is to test how well AI can create them. I tested Ideogram, Flux, GPT, and Midjourney. Flux was the winner, unseating last week’s winner, Ideogram.

Prompt: A vibrant spring meadow under clear blue skies with exaggerated blooming flowers in bright colors. Hidden comically in the middle is a futuristic supercomputer with blinking lights and multiple screens, attempting to disguise itself with a few flowers taped to its sides and a bird’s nest balanced precariously on top. The computer has cartoonish eyes peeking nervously from between server racks. Surrounding the ‘hiding’ computer are woodland animals (rabbits, deer, squirrels) giving suspicious side-eye glances at it. Some butterflies are landing on the computer’s exposed antennas. A few trees line the background, with one tree having a branch bent low from the weight of a keyboard hanging from it. The whole scene is bathed in golden sunshine with lens flares. Text overlay reading ‘AI News 81: 2025/04/18’ in a clean, modern font at the top. 16:9 aspect ratio with vibrant colors and high detail.

Here’s the Ideogram image:

And here’s the ChatGPT image:

The category images also show technology making a pitiful attempt to hide on a spring day. I used Claude 3.7 and Ideogram to automate the creation of 40 category covers in one bulk command (one press of a button generates all 40 images). Ideogram failed miserably; these are perhaps the worst covers since I started this newsletter. Here are my favorite six from the batch. The goal is to create these quickly, not perfectly, and to test the capabilities of the models. We win some, we lose some.

This Week By The Numbers

Total Organized Headlines: 460

This Week’s Executive Summaries

It was another busy week of AI news, and OpenAI stole the show by far. Its new model, o3, is getting rave reviews. In particular, it can combine every ChatGPT tool, including web search, Python coding, visual analysis, and image generation. It’s exceptional at math and coding, and it has a more natural conversational style.

The tool use alone is incredible, but perhaps most remarkable is the fact that the new models can now think with images and incorporate imagery within the context of conversations and requests.

OpenAI claims that o3 is capable of generating original ideas.

ChatGPT now has 1 billion weekly active users.

OpenAI’s CFO says that o3 mini is now the number one competitive coder in the world.

OpenAI also launched a downloadable, locally hosted, open-source command line tool that takes plain English and turns it into code.

OpenAI is also increasing the performance of its API integrations, which means that automation and agents can be more powerful and run locally.

OpenAI also launched an image library where users can go back and review all of their creations in one convenient place.

It’s rumored that OpenAI is creating a social media platform as well.

A company called Firecrawl has released a web scraper that can navigate complex websites and fill out forms.

NVIDIA has come out with a video model that can create one-minute-long Tom and Jerry cartoons.

Florent Daudens at Hugging Face (and a professional friend of mine!) created a really nice integration that connects Claude with the New York Times using the Model Context Protocol (MCP). It’s a good example of how agents and interfaces can be built on structured content.

Google has announced a model that is going to attempt to talk with dolphins. Anthropic is coming out with a voice feature to compete with OpenAI.

A company called Goodfire raised $50 million to study how AI models think.

Meta released two open-weight vision models.

Nvidia announced plans to build $500 billion worth of AI servers in the US. Not everybody believes it’s true.

AI can now find a location using only a single photo with no attached metadata.

ByteDance released a very strong video model that is going to compete with OpenAI and Google.

Safe Superintelligence, the company started by OpenAI cofounder Ilya Sutskever, has reached a $32 billion valuation without a product.

Grok now has a memory feature and can remember past conversations, similar to ChatGPT.

All this and much more in this week’s newsletter!

OpenAI Introduces Smartest Models Yet with Full Tool Integration
Huge news week. OpenAI has launched o3 and o4-mini, its most powerful reasoning models, which can independently combine every ChatGPT tool, including web search, Python coding, visual analysis, and image generation. The models deliver significant performance gains: o3 makes 20% fewer major errors than o1, and o4-mini offers exceptional efficiency for math and coding tasks, while both give more natural, conversational responses. Both models introduce visual reasoning that allows them to “think with images” rather than just see them, and they can deploy multiple tools to solve complex problems, typically in under a minute. They are available now for ChatGPT Plus, Pro, and Team users (with Enterprise and Edu access coming next week). The models also offer better performance-to-cost ratios than their predecessors, a key step toward more agentic AI that can independently execute multi-faceted tasks.

Introducing OpenAI o3 and o4-mini | OpenAI https://openai.com/index/introducing-o3-and-o4-mini/

“really good summary of o3’s strengths https://x.com/aidan_mclau/status/1912580976456474812

“💥 o3 and o4-mini are launching today! Both models are mind-blowing. But maybe the coolest for me has been seeing them use tools as they think. They can search, write code, and manipulate images in the chain of thought, and it’s a huge multiplier. I will never forget the first” / X https://x.com/kevinweil/status/1912554045849411847

“Introducing OpenAI o3 and o4-mini—our smartest and most capable models to date. For the first time, our reasoning models can agentically use and combine every tool within ChatGPT, including web search, Python, image analysis, file interpretation, and image generation. https://x.com/OpenAI/status/1912560057100955661

Rowan Cheung on X: “Some other key updates: Both new models are able to use all ChatGPT tools independently (web browsing, Python, images, etc.). Available now for Plus, Pro, and Team users, and will be replacing o1, o3-mini, and o3-mini-high. o3-pro is coming in “a few weeks”. https://t.co/risspxyDUL” / X https://x.com/rowancheung/status/1912561389824070120

Insane: OpenAI Models Now Use Images in Their Reasoning Process
OpenAI’s o3 and o4-mini models can now incorporate images directly into their thinking process, rather than simply viewing them. They can see objects basically the way we do, if not better, and with more detail and patience. This allows the AI to actively use visual information as part of its reasoning chain. The models can also process uploaded images even when they’re blurry or incorrectly oriented, automatically adjusting them as needed. This marks a significant advance in how AI systems integrate and reason with visual information alongside text.

Rowan Cheung on X: “The biggest change: o3 and o4-mini can now think using images as part of their reasoning process. Uploaded visuals can also be handled even if blurry or rotated — with the models able to adjust them using its own tools. https://t.co/1Z1waVkh7v” / X https://x.com/rowancheung/status/1912561386208825751

““Thinking with Images” has been one of our core bets in Perception since the earliest o-series launch. We quietly shipped o1 vision as a glimpse—and now o3 and o4-mini bring it to life with real polish. Huge shoutout to our amazing team members, especially: – @mckbrando, for” / X https://x.com/jhyuxm/status/1912562461624131982

“OpenAI o3 and o4-mini are our first models to integrate uploaded images directly into their chain of thought. That means they don’t just see an image—they think with it. https://x.com/OpenAI/status/1912560060284502016

OpenAI Launches Free “Best Coder In The World” Tool That Turns Plain English Into Code On Your PC
OpenAI has released Codex CLI, an open-source command-line tool that lets you build, fix, or explain code by typing plain English. It runs locally in your terminal, is built for the o3 and o4-mini models (GPT-4.1 support is coming), and launches alongside $25,000 API-credit grants for early projects. CFO Sarah Friar says o3-mini is already “the No. 1 competitive coder in the world.”

OpenAI Developers on X: “Meet Codex CLI—an open-source local coding agent that turns natural language into working code. Tell Codex CLI what to build, fix, or explain, then watch it bring your ideas to life. https://t.co/jjPZdRIgrm” / X https://x.com/OpenAIDevs/status/1912556874211422572

Rowan Cheung on X: “The open-source Codex CLI agent launching today: — Runs locally in terminal — Interface to “link models with local code and computing tasks” — Built for o3 + o4-mini, GPT-4.1 support coming soon — $25K API credit grants are available for early projects https://t.co/p71wVqdsto” / X https://x.com/rowancheung/status/1912561395591241819

“o3 and o4-mini are super good at coding, so we are releasing a new product, Codex CLI, to make them easier to use. this is a coding agent that runs on your computer. it is fully open source and available today; we expect it to rapidly improve.” / X https://x.com/sama/status/1912558495997784441

Rowan Cheung on X: “CFO Sarah Friar also recently said that o3-mini is already the No. 1 competitive coder in the world: https://t.co/wsHPQqKI5I” / X https://x.com/rowancheung/status/1912561394068697210

OpenAI Launches GPT-4.1 Model Family with Major Coding and Performance Gains (So Many Models To Memorize!)
OpenAI introduced three API models: GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano. They outperform previous versions in coding (54.6% on SWE-bench Verified, 21.4 percentage points above GPT-4o), instruction following (38.3% on MultiChallenge), and long-context handling, with support for up to 1 million tokens and knowledge updated through June 2024. The family offers better performance at lower cost: GPT-4.1 mini matches or exceeds GPT-4o while cutting latency by half and reducing cost by 83%, and testers report more accurate results in specialized domains like tax analysis and SQL generation. OpenAI will turn off GPT-4.5 Preview in the API on July 14, 2025, since GPT-4.1 offers similar or better performance at lower cost. These models are available only via the API, while ChatGPT continues to receive gradual improvements to GPT-4o.

“gpt-4.1 is for developers:” / X https://x.com/OpenAIDevs/status/1912241877199581572

“GPT-4.1 outperforms GPT-4.5 in coding https://x.com/scaling01/status/1911828552452112536

“introducing the gpt-4.1 series. incredible at coding and instruction following, with a 1M token context window https://x.com/stevenheidel/status/1911830165317173740

“One last note: we’ll also begin deprecating GPT-4.5 Preview in the API today as GPT-4.1 offers improved or similar performance on many key capabilities at lower latency and cost. GPT-4.5 in the API will be turned off in three months, on July 14, to allow time to transition (and” / X https://x.com/OpenAIDevs/status/1911860805810716929

Introducing GPT-4.1 in the API | OpenAI https://openai.com/index/gpt-4-1/
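
The 21.4 figure quoted above for SWE-bench Verified reads most naturally as an absolute gap in percentage points rather than a relative gain; that reading is an assumption here, not something the summary states outright. A minimal sketch of what the two interpretations imply (the function names are made up for illustration):

```python
# Figures quoted in the summary above.
gpt41_score = 54.6   # % on SWE-bench Verified
gap_points = 21.4    # ASSUMPTION: percentage points, not a relative gain


def implied_baseline(score, gap):
    """Baseline score implied by an absolute (percentage-point) gap."""
    return round(score - gap, 1)


def relative_gain(score, baseline):
    """Relative improvement over the baseline, in percent."""
    return round((score - baseline) / baseline * 100, 1)


gpt4o_score = implied_baseline(gpt41_score, gap_points)
print(f"Implied GPT-4o score: {gpt4o_score}%")
print(f"Relative gain over that baseline: {relative_gain(gpt41_score, gpt4o_score)}%")
```

Under the percentage-point reading, the implied GPT-4o score is 33.2%, and the relative improvement works out to roughly 64.5%, which is why the distinction matters when comparing headline numbers.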

OpenAI Releases Two New AI Models That Generate Original Ideas
OpenAI just launched two powerful new AI models, called o3 and o4-mini, that researchers say are exceptionally smart and capable of generating genuinely useful and original ideas. Experts are optimistic these latest breakthroughs will positively impact everyday life and help solve some of humanity’s most challenging problems.

“Just released o3 and o4-mini! These models feel incredibly smart. We’ve heard from top scientists that they produce useful novel ideas. Excited to see their positive impact on people’s daily lives and humanity’s hardest problems!” / X https://x.com/gdb/status/1912575762483540322

OpenAI’s Latest Breakthrough: AI That Comes Up With New Ideas — The Information https://www.theinformation.com/articles/openais-latest-breakthrough-ai-comes-new-ideas

New Image Library Rolls Out for ChatGPT Users
OpenAI has launched a dedicated image library for ChatGPT, giving users a centralized place to view and manage all their AI-generated images. Available now to Free, Plus, and Pro users on both mobile and desktop at chatgpt.com, the feature simplifies access to past creations, making it easier to browse and reuse images.

“All of your image creations, all in one place. Introducing the new library for your ChatGPT image creations—rolling out now to all Free, Plus, and Pro users on mobile and https://x.com/OpenAI/status/1912255254512722102

OpenAI Team Shares Behind-the-Scenes Look at GPT-4.5 Development
Sam Altman revealed that growing interest in how GPT-4.5 was built led to a new podcast featuring key members of the project team. In the discussion, Alex Paino, Dan Selsam, and Amin Tootoonchian—who played major roles in creating GPT-4.5—talk about the development process and hint at what’s coming next. The episode offers an insider’s look at how one of today’s most advanced AI models came to life.

Pre-Training GPT-4.5 – YouTube https://www.youtube.com/watch?v=6nJZopACRuQ

OpenAI Eyes Social Media with ChatGPT-Powered Platform
OpenAI is quietly developing a new social media platform that could rival Elon Musk’s X (formerly Twitter) and Meta’s Instagram, according to The Verge and CNBC. Early prototypes reportedly feature a social feed powered by ChatGPT’s image generation tools. CEO Sam Altman has been privately seeking feedback, but it’s unclear whether the platform will launch as a standalone app or be integrated into ChatGPT. The move could intensify OpenAI’s feud with Musk, a former co-founder who left the company in 2018 and has since sued OpenAI, accusing it of straying from its original mission. OpenAI, in turn, has counter-sued. As tech giants race to integrate AI into social media, OpenAI’s project signals its intent to join the front lines.

OpenAI is working on X-like social media network, the Verge reports | Reuters https://www.reuters.com/technology/artificial-intelligence/openai-is-working-x-like-social-media-platform-verge-reports-2025-04-15/

Firecrawl Unveils Smarter Web-Scraping Agent
Firecrawl has released FIRE-1, a new AI-powered web scraper that acts more like a human browsing the web. Unlike traditional scrapers, FIRE-1 can navigate complex websites, interact with dynamic content, and even fill out forms to collect the data users need. This agentic approach marks a big leap in how efficiently and intelligently tools can gather information online.

“Agentic web scrapers are here! Firecrawl just launched FIRE-1, their new agent-powered web-scraper. This is really dope! It navigates complex websites, interacts with dynamic content, and fills forms to scrape the data you need. https://x.com/omarsar0/status/1912596779784143002

NVIDIA Pushes Video AI Forward with One-Minute Cartoon Generator
NVIDIA has unveiled a new research paper showcasing its ability to generate one-minute-long, story-driven Tom and Jerry-style cartoons using AI. The team added “Test-Time Training” (TTT) layers to a pre-trained Transformer, allowing it to create full-length animated clips in a single shot—without any post-editing. This tackles a major challenge for current AI models, which struggle with generating long videos due to memory limitations. TTT layers, which are more expressive than existing methods like Mamba or Gated DeltaNet, helped the model produce more coherent and consistent animations. In human tests, videos made with TTT layers scored 34 Elo points higher than other methods. While early results show promise, NVIDIA notes that there’s room to improve quality and scalability.

“Today, we’re releasing a new paper – One-Minute Video Generation with Test-Time Training. We add TTT layers to a pre-trained Transformer and fine-tune it to generate one-minute Tom and Jerry cartoons with strong temporal consistency. Every video below is produced directly by https://x.com/karansdalal/status/1909312851795411093
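
To put the 34 Elo points in perspective, here is a rough sketch that converts an Elo gap into an expected head-to-head preference rate. It assumes the standard logistic Elo formula with a 400-point scale; the summary does not say exactly how the paper computed its ratings, so treat this as an approximation.

```python
def elo_expected_score(rating_gap, scale=400.0):
    """Expected win probability for the higher-rated side under standard Elo.

    ASSUMPTION: the conventional 400-point logistic scale applies.
    """
    return 1.0 / (1.0 + 10 ** (-rating_gap / scale))


ttt_gap = 34  # Elo points quoted in the summary above
preference = elo_expected_score(ttt_gap)
print(f"Expected preference rate: {preference:.1%}")
```

Under those assumptions, a 34-point gap means evaluators would prefer the TTT videos in roughly 55% of pairwise comparisons: a real edge, but a modest one.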

Hugging Face Links Claude AI to The New York Times
Florent Daudens from Hugging Face shared a new integration that connects Claude, Anthropic’s AI assistant, directly to The New York Times using the Model Context Protocol (MCP). This setup lets publishers control exactly which content they make available through their API. Meanwhile, users get quick, reliable access to trusted news right within their AI tools.

“Just tested something cool: connecting Claude directly to the New York Times via MCP. It feels like a smart way for publishers to stay in control of what they share through their API, while users get fast, reliable access through the AI tools they already use. https://x.com/fdaudens/status/1912586133466214526
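
For readers curious what an MCP exchange looks like on the wire, here is a small sketch of a tool-call request. MCP messages follow JSON-RPC 2.0 and use a `tools/call` method; the tool name `search_articles` and its arguments are hypothetical, and this is not the integration Florent built.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request asking an MCP server to run one tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# HYPOTHETICAL tool and arguments: the publisher decides which tools exist
# and what each one is allowed to return.
request = build_tool_call(1, "search_articles", {"query": "AI regulation", "limit": 3})
print(json.dumps(request, indent=2))
```

The point of the design is that the assistant can only ask for what the server exposes as a tool, which is how a publisher stays in control of what it shares.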

Claude Gets Smarter: AI Assistant Now Does Deep Research and Connects to Your Google Workspace
Anthropic has unveiled powerful new updates to Claude, its AI assistant, aimed at making it an even more helpful work companion. The latest features include Research, which allows Claude to search both your documents and the web to deliver fast, well-cited answers, and a new integration with Google Workspace that connects to Gmail, Calendar, and Docs. This enables Claude to pull insights directly from your work context—whether it’s drafting meeting notes, surfacing key documents, or helping prepare for major presentations. With improved document cataloging for enterprise users and expanded global availability, Claude is stepping up as a trusted partner for tasks ranging from sales briefings to family planning. The new tools are now in beta for users on Max, Team, and Enterprise plans in the U.S., Japan, and Brazil.

Claude takes research to new places \ Anthropic https://www.anthropic.com/news/research

Google’s AI Helps Scientists Get Closer to Talking with Dolphins
Google has introduced DolphinGemma, an AI model trained to analyze and generate dolphin vocalizations. Developed in partnership with Georgia Tech and the Wild Dolphin Project (WDP), this tool processes decades of underwater audio and video to detect patterns in dolphin clicks, whistles, and squawks, sounds linked to behaviors like courtship, fighting, or mother-calf bonding. DolphinGemma, built on Google’s lightweight Gemma model architecture, runs directly on Pixel smartphones used in the field and supports real-time analysis. Researchers hope it will speed up discoveries about how dolphins communicate, while also powering interactive systems like CHAT, which enables dolphins to mimic synthetic sounds to “request” favorite objects.

“Meet DolphinGemma, an AI helping us dive deeper into the world of dolphin communication. 🐬 https://x.com/GoogleDeepMind/status/1911767367534735832

DolphinGemma: How AI can decipher dolphin communication https://blog.google/technology/ai/dolphingemma/

Claude May Soon Speak – Anthropic Preps Voice AI with Unique Personalities
Anthropic is reportedly gearing up to launch a voice feature for its Claude AI assistant, adding a conversational layer that could compete with ChatGPT’s voice mode. According to Bloomberg, the new “voice mode” may debut as early as this month, offering users three distinct voice options: Mellow, Airy, and Buttery. The company hinted at voice capabilities last month, and app researchers recently spotted signs of the feature in Claude’s iOS app. While Anthropic hasn’t officially commented, the move reflects its growing ambition to challenge OpenAI by making Claude more interactive and personalized.

Anthropic Nears Launch of Voice Assistant Feature for Claude to Rival OpenAI – Bloomberg https://www.bloomberg.com/news/articles/2025-04-15/anthropic-is-readying-a-voice-assistant-feature-to-rival-openai?embedded-checkout=true

Cohere Releases Embed 4: Supercharged Multimodal Search for Smarter Business AI
Cohere has launched Embed 4, a cutting-edge embedding model designed to power smarter search and retrieval in enterprise AI applications. Built for real-world business needs, Embed 4 understands complex, multimodal documents—like financial reports, medical charts, and product manuals—that mix text, images, tables, and code. It supports over 100 languages, handles documents up to 200 pages long, and performs well even with messy or scanned data. Optimized for industries like finance, healthcare, and manufacturing, Embed 4 helps companies find insights faster without clunky pre-processing steps. With improved accuracy, lower storage costs, and deployment flexibility across cloud or on-premise setups, Embed 4 sets a new standard for AI-ready document search—and it’s available now via Cohere and Microsoft Azure.

Introducing Embed 4: Multimodal search for business https://cohere.com/blog/embed-4

Google Eyes “Omni” AI Future by Merging Gemini with Video-Generating Veo 2
In a recent podcast appearance, Google DeepMind CEO Demis Hassabis revealed plans to merge Gemini, Google’s powerful AI model, with Veo 2, its advanced video-generation tool. The goal is to create a truly multimodal AI assistant that better understands and interacts with the physical world. Veo 2, now available to advanced users via the Gemini app, can create cinematic 8-second videos from a single prompt and demonstrates a striking grasp of real-world physics, thanks in part to training on vast amounts of video data, likely sourced from YouTube. This move reflects a broader industry trend toward “omni” models that can interpret and generate across text, audio, images, and video. With similar efforts underway from OpenAI and Amazon, the race to build universal AI assistants is clearly accelerating.

“1/ Today, Veo 2, our state-of-the-art video model, is rolling out to Gemini Advanced + Whisk! You can create 8s, high-res videos from text prompts in @GeminiApp with fluid character movement + lifelike scenes across a range of styles. Tip: the more detailed your description, the https://x.com/sundarpichai/status/1912191784312271034

DeepMind CEO Demis Hassabis says Google will eventually combine its Gemini and Veo AI models | TechCrunch https://techcrunch.com/2025/04/10/deepmind-ceo-demis-hassabis-says-google-will-eventually-combine-its-gemini-and-veo-ai-models/

“Dive into video creation with @GeminiApp — rolling out today.🪂 Transform text prompts into cinematic 8-second videos with Veo 2 in Gemini Advanced. Select Veo 2 from the model dropdown menu to get started. Prompt: Write the word “GOOGLE” out of skydiving parachutes opening up https://x.com/Google/status/1912190959820898355

“You write the script, Veo 2 brings it to life. 🎥 Starting today, @GeminiApp Advanced users can create stunning 8-second videos, in 720p cinematic quality, with just one text prompt. ✨ https://x.com/GoogleDeepMind/status/1912191340424601835

“Veo 2 is super fun to play with, and people have been creating some amazing videos with it. Its implicit understanding of the physics of the world is kind of mindblowing. Looking forward to seeing more people enjoy it now that it’s part of @GeminiApp!” / X https://x.com/demishassabis/status/1912197180187897985

Hugging Face Enters the Robot Arena with Pollen Robotics Acquisition
Hugging Face has made a bold move into open-source robotics by acquiring Pollen Robotics, a French startup known for its humanoid robot, Reachy 2. This VR-compatible robot is already being used in top research labs like Cornell and Carnegie Mellon and is available for purchase at $70,000. With this acquisition, Hugging Face strengthens its robotics initiative, *LeRobot*, aimed at providing models, tools, and datasets for embodied AI. The company, valued at $4.5 billion, is betting that open, customizable robots—not closed, expensive systems—will be the future interface for AI.

“Hugging Face acquired Pollen Robotics, a French startup building open-source humanoids Pollen is already selling Reachy 2, an open and VR-compatible humanoid for research, education, and embodied AI A big move from HF in open robotics! https://x.com/rowancheung/status/1912034498276900999

“Hugging Face has acquired a French robotics startup, Pollen Robotics, for an undisclosed amount. The Reachy 2 robot, developed by Pollen, is an open-source and VR-compatible bimanual robot for advancing embodied AI, available to order at $70K. Hugging Face, the leading platform https://x.com/TheHumanoidHub/status/1911824405296230652

“Super happy to announce that we are acquiring @pollenrobotics to bring open-source robots to the world! 🤖 Since @RemiCadene joined us from Tesla, we’ve become the most widely used software platform for open robotics thanks to @LeRobotHF and the Hugging Face Hub. Now, we’re https://x.com/huggingface/status/1911785683376648232

“Hugging Face just acquired humanoid robotics company Pollen Robotics https://x.com/_akhaliq/status/1911786756938006756

“Super inspiring to see @huggingface double down on AI for Robotics by acquiring @pollenrobotics, an open source robot manufacturer. https://x.com/ben_burtenshaw/status/1911843020309213547

AI company Hugging Face buys humanoid robot company Pollen Robotics, maker of Reachy 2 | Fortune https://fortune.com/2025/04/14/ai-company-hugging-face-buys-humanoid-robot-company-pollen-robotics-reachy-2/

Goodfire Secures $50M to Decode the Inner Workings of AI
AI interpretability startup Goodfire has raised $50 million in Series A funding to push forward its mission of making AI systems understandable and controllable from the inside out. Led by Menlo Ventures and backed by investors including Anthropic, Lightspeed, and B Capital, the funding will support the development of *Ember*—Goodfire’s platform designed to unlock AI models’ inner workings. Founded by veterans from OpenAI and DeepMind, Goodfire is tackling the “black box” problem in neural networks by investing in mechanistic interpretability research. Ember gives users direct insight into how AI systems make decisions, enabling safer, more reliable applications. With growing partnerships and research in fields from biology to language models, Goodfire aims to transform AI from mysterious to manageable.

Announcing Our $50M Series A to Advance AI Interpretability Research Funding from Menlo Ventures powers our mission to decode the neurons of AI models, reshaping how they’re understood and designed https://www.goodfire.ai/blog/announcing-our-50m-series-a

Meta Unveils Powerful Open Source Llama 4 Models, Challenges GPT-4 and Claude
Meta has released two cutting-edge, open-weight vision-language models—*Llama 4 Scout* and *Llama 4 Maverick*—with a third, *Llama 4 Behemoth*, still in training. Built on a mixture-of-experts (MoE) architecture, these models boost efficiency by using only select parameters during inference. *Scout* features a massive 10 million-token context window, the largest of its kind. *Maverick* has reportedly outperformed OpenAI’s GPT-4o in benchmarks, while *Behemoth*, a 2-trillion-parameter model, aims to surpass GPT-4.5 and Claude 3.7 Sonnet. Meta’s Llama 4 series sets a bold new standard for open-source, multimodal AI.

“Meta released two open-weight vision-language models, Llama 4 Scout and Llama 4 Maverick, and previewed a third, Llama 4 Behemoth. Built on a mixture-of-experts (MoE) architecture, these models offer greater efficiency by activating only a subset of parameters during inference. https://x.com/DeepLearningAI/status/1911841914590015586

“Meta released the Llama 4 family of natively multimodal, open-source models—with context windows up to 10M tokens! Currently, the series has two MoE models: 109B param Scout and 400B param Maverick, and a third, 2T param Behemoth, currently in training https://x.com/adcock_brett/status/1911450182937346285
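
The mixture-of-experts idea described above (activating only a subset of parameters for each input) can be sketched in a few lines. This is a toy illustration of top-k routing, not Meta's implementation: the experts are plain functions and the router scores are hard-coded, whereas a real model learns both.

```python
import math


def top_k_route(gate_scores, k):
    """Pick the k highest-scoring experts and normalize their weights."""
    ranked = sorted(range(len(gate_scores)), key=lambda i: gate_scores[i], reverse=True)
    chosen = ranked[:k]
    # Softmax over the chosen experts only, so their weights sum to 1.
    peak = max(gate_scores[i] for i in chosen)
    exps = [math.exp(gate_scores[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


def moe_forward(x, experts, gate_scores, k=2):
    """Run only the routed experts and blend their outputs."""
    routes = top_k_route(gate_scores, k)
    output = sum(weight * experts[i](x) for i, weight in routes)
    return output, [i for i, _ in routes]


# Toy stand-ins: each "expert" is a simple function, not a neural network.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x * x]
# HYPOTHETICAL router scores; a real router computes these from the input.
gate_scores = [0.1, 2.0, -1.0, 1.0]

output, used = moe_forward(3.0, experts, gate_scores, k=2)
print(f"experts used: {used} of {len(experts)}; output: {output:.3f}")
```

Only two of the four experts run for this input, which is the efficiency win: the model can hold many parameters in total while paying the compute cost of only a few per token.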

Nvidia Bets Big on U.S. AI Manufacturing with $500 Billion Plan
Nvidia has announced a massive push to build AI servers worth up to $500 billion in the U.S. over the next four years, aligning with the Trump administration’s call for domestic tech manufacturing. The effort includes producing its powerful Blackwell AI chips at TSMC’s Arizona plant and building supercomputers in Texas through partners like Foxconn and Wistron. While some analysts see the headline figure as inflated, the move signals a strategic shift from Nvidia, which has largely relied on overseas production. CEO Jensen Huang emphasized the benefits of boosting supply chain resilience and meeting soaring AI demand. The announcement also follows U.S. tariff exemptions for certain tech imports, highlighting a balancing act between economic policy and the fast-growing AI sector.

Nvidia to produce AI servers worth up to $500 billion in US over four years | Reuters https://www.reuters.com/technology/artificial-intelligence/nvidia-says-working-with-partners-make-ai-supercomputers-us-2025-04-14/

ChatGPT Reaches 1 Billion Weekly Users Milestone
OpenAI’s ChatGPT now has 1 billion weekly active users, a major milestone in its rapid growth. The tool is becoming a daily staple for people around the world, and the post reporting the figure says OpenAI is not far from 1 billion daily active users.

“ChatGPT has hit 1 billion weekly active users (WAU). OpenAI is not far from the vaunted 1 billion daily active users (DAU) club. https://x.com/bilawalsidhu/status/1911125917218508945

OpenAI Eyes $3B Acquisition of AI Coding Startup Windsurf
OpenAI is in advanced talks to acquire Windsurf, a rising AI-powered coding assistant formerly known as Codeium, in a deal valued at around $3 billion. Windsurf has quickly become a favorite among developers alongside rivals like Cursor and Replit, offering tools that help users “vibe code”—a fast, AI-assisted way to generate software. The potential acquisition marks OpenAI’s largest to date and signals its push to stay ahead in the competitive generative AI space, where companies like Google, Anthropic, and xAI are racing to innovate. This move follows OpenAI’s record-setting $40 billion funding round and recent launch of its new AI models, o3 and o4-mini, which can interpret rough user sketches.

OpenAI in talks to pay about $3 billion to acquire startup Windsurf https://www.cnbc.com/2025/04/16/openai-in-talks-to-pay-about-3-billion-to-acquire-startup-windsurf.html

OpenAI Said to Be In Talks to Buy Windsurf for About $3 Billion – Bloomberg https://www.bloomberg.com/news/articles/2025-04-16/openai-said-to-be-in-talks-to-buy-windsurf-for-about-3-billion?embedded-checkout=true

AI Tool Can Reveal Home Addresses From Photos, Even With Metadata Removed
Even if you scrub metadata from your photos, AI can pinpoint your location using visual clues in the background, like buildings, terrain, and shadows. This technology builds on systems like Google’s Visual Positioning System (VPS), originally developed for navigation. Now, it’s being used in unsettling ways, with reports that GeoSpy can match a single social media photo to a specific home address and even link to shots of the interior through real estate history. As AI becomes more powerful, experts warn that everyday posts may reveal far more than intended.

“Think scrubbing metadata keeps you safe? Not from GeoSpy. A single photo can still reveal your exact geolocation — thanks to AI. I worked on Google’s Visual Positioning System (VPS). That’s how I know how real and how dual use this capability has become. https://x.com/bilawalsidhu/status/1911442948723524049

ByteDance Unveils Seaweed, a Lean but Powerful Video AI Model
ByteDance has introduced Seaweed-7B, a new 7-billion-parameter video AI model that punches well above its weight. Despite using less computing power than its competitors, Seaweed matches or outperforms larger models like Sora and Google’s Veo. The model supports text-to-video, image-to-video, and even audio-driven video generation, producing up to 20-second clips with impressive realism. It excels at generating lifelike human characters with expressive gestures and synchronized lip movements. Seaweed can also create long-form stories, real-time video at 24fps, and high-resolution clips up to 2K. From image transitions to camera-controlled 3D scenes, Seaweed proves versatile across a range of creative and practical video tasks, all while staying compute-efficient.

“TikTok parent ByteDance just dropped Seaweed, a hyper-efficient 7B-param video AI —Supports text-to-video, image-to-video, and audio-driven synthesis —Clips up to 20s —Matches or outperforms larger models like Sora, Kling 1.6, and Veo https://x.com/rowancheung/status/1912034520473157660

Seaweed https://seaweed.video/

Ilya Sutskever’s New AI Startup Hits $32B Valuation
Safe Superintelligence (SSI), the stealthy AI company launched by OpenAI co-founder Ilya Sutskever, has reached a massive $32 billion valuation after securing $2 billion in new funding. The round was reportedly led by investment firm Greenoaks. Sutskever, who departed OpenAI in May 2024 following internal tensions, co-founded SSI with Daniel Gross and Daniel Levy. The company is focused on developing a single product: a safe superintelligent AI. While details remain scarce, SSI’s sparse website reinforces its mission-driven approach to AI safety and advancement.

“Ilya Sutskever’s new AI startup Safe Superintelligence (SSI) just hit a $32B valuation after raising $2B in a funding round reportedly led by Greenoaks https://x.com/rohanpaul_ai/status/1911404858223046672

Grok Adds Memory Feature for Smarter, Personalized Replies
Elon Musk’s xAI has launched a global update to Grok, its AI chatbot, introducing a new Memory feature. Grok can now remember past conversations, allowing it to offer more personalized advice and recommendations over time.

“BREAKING: xAI @grok rolling out Memory Feature, globally https://x.com/ns123abc/status/1910839241170432340

“Grok now remembers your conversations. When you ask for recommendations or advice, you’ll get personalized responses. https://x.com/grok/status/1912670182012801156

There’s Only One Lonely AI Visual: Week Ending April 18, 2025

Jerry Tworek on X: “Scaling is incredibly hard and demanding and leaves very little room for error in every little part of the training stack But once it works, it’s beautiful to see it” https://x.com/MillionInt/status/1912568397419954642

Top 22 Links of The Week – Organized by Category

AGI

“Today, we’re announcing our $50M Series A and sharing a preview of Ember – a universal neural programming platform that gives direct, programmable access to any AI model’s internal thoughts. https://x.com/GoodfireAI/status/1912929145870536935

“We did not “solve math”. For example, our models are still not great at writing proofs. o3 and o4-mini are nowhere close to getting International Mathematics Olympiad gold medals.” https://x.com/polynoamial/status/1912575974782423164

ARVR

“Jim Fan’s predictions: ⦿ In the next 2–5 years, robotics will uncover its own scaling laws – similar to those seen in LLMs – by analyzing how model size, real‑world data, simulation data, and compute affect performance. ⦿ Within the next 20 years, robotics will accelerate https://x.com/TheHumanoidHub/status/1910367639425384568

AgentsCopilots

“🔧🤖 MCP-Use Tools Just launched: An open-source library that connects any LLM to MCP tools for custom agents, featuring seamless integration with LangChain and support for web browsing, Airbnb search, and 3D modeling capabilities. Explore the implementation on GitHub 🚀 https://x.com/LangChainAI/status/1911449301542195582

“🚨 Breaking: Google just open-sourced the Agent Development Kit (ADK) a framework for building AI agents and multi-agent systems. – Build agents in under 100 lines. – Supports MCP More information and how to get started 👇 1/5 https://x.com/ai_for_success/status/1910017257335402816

“Transforming government service delivery in Abu Dhabi with LangGraph The Abu Dhabi government’s AI Assistant, TAMM 3.0, now delivers 940+ services across all platforms with personalized, seamless interactions. Built on LangGraph, their key workflows include: 🔍 Fast, accurate” https://x.com/LangChainAI/status/1912207364448743797

“OMG !! Google AgentSpace looks insane , You need to see this. Google has launched new Agent2Agent(A2A) : An open protocol to enable AI agents from different vendors and frameworks to securely communicate, collaborate, and coordinate actions across enterprise platforms. More https://x.com/ai_for_success/status/1909949067871793453

“the ability of the new models to effectively use tools together has somehow really surprised me intellectually i knew this was going to happen but it hits different to see it” https://x.com/sama/status/1912564175253172356

“You can also enable pagination. There are so many limitations like this with traditional web scrapers, so this looks super promising. Just make sure to prompt it properly and be very specific about the results you want. More here: https://x.com/omarsar0/status/1912599033144619411

“Let’s build a multi-agent brand monitoring system using DeepSeek-R1 (100% local):” https://x.com/akshay_pachaar/status/1910309289186762804

Amazon

Amazon CEO Andy Jassy’s 2024 Letter to Shareholders https://www.aboutamazon.com/news/company-news/amazon-ceo-andy-jassy-2024-letter-to-shareholders

“Amazon released a speech-to-speech AI called “Nova Sonic” Available on Bedrock, the model generates speech with a 1.09 sec latency, outperforming the best OpenAI models at 20% cost Amazon also launched Reel 1.1 AI for extended 2-min video generations https://x.com/adcock_brett/status/1911450262977368259

“Excited about the launch of Amazon Nova Sonic, our new speech-to-speech model that helps make AI voice applications feel remarkably natural. It’s designed to understand not just what people say, but how they say it – working with tone, style, and conversation flow including https://x.com/ajassy/status/1909691335877312757

Anthropic

“A new version of the MCP spec was finalized today. Some of the major changes: – Auth framework based on OAuth 2.1 – Replaced the previous HTTP+SSE transport with Streamable HTTP transport – Support for JSON-RPC batching – Tool annotations for better describing tool behavior” https://x.com/alexalbert__/status/1904908450473324721

BusinessAI

“RIP Tableau and PowerBI. Enter Julius AI. This is what Julius can do: https://x.com/mdancho84/status/1909571551193670024

Assort Health Secures $26 Million in Funding to Expand Specialty-Specific Generative AI Platform for Managing Patient Phone Calls https://www.assorthealth.com/blog/assort-health-secures-26-million-in-funding-to-expand-specialty-specific-generative-ai-platform-for-managing-patient-phone-calls

“heard from some startup engineers that they lost several work hours gawking, stupefied, after they plugged 4.1 mini/nano into every previously-expensive part of their stack you can just do gpt-4o-quality things 25 × cheaper now” https://x.com/aidan_mclau/status/1911850291026362426

Google

Google Workspace Updates: Use Gemini in Google Classroom to generate questions or a quiz based on specific text https://workspaceupdates.googleblog.com/2025/04/use-gemini-in-google-classroom-to-generate-questions-from-text.html

Multimodality

Moondream https://moondream.ai/blog/moondream-2025-04-14-release

OpenAI

“o3 as a management coach, for personalized learning, and more: https://x.com/gdb/status/1912568418626420804

Our updated Preparedness Framework | OpenAI https://openai.com/index/updating-our-preparedness-framework/

Robotics

“The Beijing Half Marathon, originally set for April 13, is now postponed to April 19 due to bad weather. Humanoids from over 20 teams will run alongside humans. Rules: ⦿ Only bipedal robots (no wheels) – remote-controlled or autonomous ⦿ Height: 1.6 ft to 6.5 ft ⦿ 10-minute https://x.com/TheHumanoidHub/status/1910416585191547055
