This week’s cover is inspired by Microsoft’s deep fake video engine, VASA-1, which can turn a single profile photo into a talking head. It is so powerful that Microsoft will not release it to the public in any form. The cover depicts a robotic Wicked Queen from “Snow White” looking away from the magic mirror, which is showing a human reflection. Image prompted in MidJourney. Font is Snotex.
Executive Summary
As usual, I focus on what I think is most important to know, not just the most popular headlines.
- VASA-1 – Microsoft Research: Microsoft can take a single image of a person and create a viable deep fake video. It is a bit like Viggle, or many of the ByteDance products (DreamTalk, DiffPortrait3D, MagicVideo-V2, and DreamTuner). However, this one is remarkably convincing. To my knowledge, Microsoft is not releasing it due to ethics concerns: “Given such context, we have no plans to release an online demo, API, product, additional implementation details, or any related offerings until we are certain that the technology will be used responsibly and in accordance with proper regulations.”
- Quote of the week: Microsoft Research’s Chris Bishop: “When AI models regurgitate information in response to prompts we call them stochastic parrots; when humans do it we give them university degrees.”
- Boston Dynamics introduces a fully electric humanoid robot that “exceeds human performance”, via YouTube
- Meta drops new Llama model and integrates their chatbot into their products: AI is now all over Facebook and Instagram, prompting people to ask more, learn more, have a chat. Is this Meta’s first move against Google? If Meta can keep people on their products, they can train their models and retain those users. Please take time to read the links below in the top links section. It might feel dry, but if you want to be up to speed, understanding Llama is critical. “Meta is playing the long game and will go down in history as a seminal company! Most AI innovation in the OSD ecosystem will happen on the Llama architecture! The probability that the next breakthrough happens because of these Llama models is very high!” -Bindu Reddy
- Adobe plans a lot more AI in Adobe Premiere: “Bringing generative AI to video editing workflows in Adobe Premiere Pro” via Adobe Blog. Many of the mundane, difficult tasks in video editing are going to be helped by AI. For example, removing an object across frames with one click, or inserting an object. Adobe is also partnering with third-party AI tools like Runway, Pika, and OpenAI.
- Autonomous jet dogfighting: “USAF Test Pilot School and DARPA announce breakthrough in aerospace machine learning. Dogfighting is a highly complex scenario that the X-62A utilized to successfully prove using non-deterministic artificial intelligence safely is possible within aerospace. The AI dogfights paired the X-62A VISTA against manned F-16 aircraft in the skies above Edwards. Initial flight safety was built up first using defensive maneuvers, before switching to offensive high-aspect nose-to-nose engagements where the dogfighting aircraft got as close as 2,000 feet at 1,200 miles per hour.” https://www.edwards.af.mil/
- Autonomous submarines: “Anduril is now making giant f*cking autonomous submarines. First Ghost Shark Debuts in Australia – On Schedule & On Budget Anduril, the Royal Australian Navy (RAN).” via PatrickJBlum
- Infinite memory: When most people chat with AI, their prompts never hit the context window limit. Context windows are basically AI’s short-term memory, and they are already huge. For context, pun intended, Gemini can fit 5.5 Harry Potter books in its memory. This week both Google and Meta announced memory breakthroughs: “Google’s new technique gives LLMs infinite context” and “Meta announces Megalodon Efficient LLM Pretraining and Inference with Unlimited Context Length”
- Artificial General Intelligence (AGI): Dan Schulman (former PayPal CEO) says “GPT-5 will be a freak out moment” and “80% of the jobs out there will be reduced 80% in scope”
- Apple AI rumblings are getting louder heading into June: Apple’s iOS 18 AI will run on-device, preserving privacy, rather than server-side, via Apple Insider
- Tesla’s LLM may drive your car: Nvidia’s Jim Fan is my #1 trusted source of AI expertise. Here’s what he writes: “What excites me the most about Grok-1.5V is the potential to solve edge cases in self-driving. Using language for “chain of thought” will help the car break down a complex scenario, reason with rules and counterfactuals, and explain its decisions. What Grok-1.5V can help is to lift pixel->action mapping to pixel->language->action instead. With Tesla AI’s highly mature data pipeline, it is not hard to label tons of edge cases with high-quality human explanation traces, and finetune Grok to be far better than GPT-4V and Gemini for multimodal FSD reasoning. Tesla is spinning an unparalleled data flywheel that could scale far beyond.” See his full Tweet.
- Humane’s AI pin gets creamed: “The Worst Product I’ve Ever Reviewed” – Marques Brownlee “The Humane AI Pin is lost in translation” – The Verge.
- Another AI pin launches: Limitless AI (formerly Rewind) is a new wearable gadget and app for remembering your meetings, via The Verge
- Adobe uses MidJourney to train: “Adobe’s ‘Ethical’ Firefly AI Was Trained on Midjourney Images” Bloomberg
- DeepMind spending: DeepMind CEO Says Google Will Spend More Than $100 Billion on AI – Bloomberg
- MidJourney’s website opens to all, includes social features: “Midjourney has officially started testing social features on the web! The feature is called “Rooms” and it’s kind of reminiscent of a Discord server, but better”. Via Nick Floats
- Product photography AI tool: Flair AI allows brands to “build scenes and recreate professional photoshoots with remarkable accuracy to showcase your products!” via mickeyxfriedman
Top 55 Links of The Week
These are the must-click links, in order by topic. Even if they look boring, click them! I did the work, so you don’t have to worry. All are 10/10 would recommend.
- #1 Link of the Week – Microsoft Image to Video
- VASA-1 – Microsoft Research is so powerful, they won’t release it. “Given such context, we have no plans to release an online demo, API, product, additional implementation details, or any related offerings until we are certain that the technology will be used responsibly and in accordance with proper regulations.”
- “Ultra realistic AI-video from a photo This is VASA-1 from Microsoft research The improvements in quality we’re getting between each new release is incredible.
- Adobe plans a LOT more AI in Adobe Premiere
- Bringing generative AI to video editing workflows in Adobe Premiere Pro | Adobe Blog
- Generative AI in Premiere Pro powered by Adobe Firefly | Adobe Video – YouTube
- AI Dogfighting
- AI Flew X-62 VISTA During Simulated Dogfight Against Manned F-16 – The Aviationist
- “DARPA and USAF just quietly dropped that an AI-piloted F-16 engaged in a dogfight against a human over Area 51.”
- LLMs will drive your car
- “Tesla FSD v13 will likely be grokking language tokens. What excites me the most about Grok-1.5V is the potential to solve edge cases in self-driving. Using language for “chain of thought” will help the car break down a complex scenario, reason with rules and counterfactuals, and…”
- Science breakthroughs
- “GPT-4 ranked higher than the majority of physicians in psychiatry… it performed similarly to the median physician in general surgery & internal medicine… GPT-4 performance was lower in pediatrics & OB/GYN but remained higher than a considerable fraction of active doctors.”
- GPT versus Resident Physicians — A Benchmark Based on Official Board Scores | NEJM AI
- “Large language models (LLMs) are approaching expert-level ophthalmological knowledge and reasoning”
- Bezos Earth Fund Announces $100 Million for AI Solutions to Tackle Climate Change and Nature Loss
- Large language models approach expert-level clinical knowledge and reasoning in ophthalmology: A head-to-head cross-sectional study | PLOS Digital Health
- AI speeds up drug design for Parkinson’s ten-fold | University of Cambridge
- The Next Frontier for Brain Implants Is Artificial Vision | WIRED
- Robots are coming quickly
- “Humanoid robots will exceed the supply of iPhones in the next decade. Gradually, then suddenly.”
- “AI winter? No. Even if GPT-5 plateaus. Robotics hasn’t even started to scale yet. Embodied intelligence in the physical world will be a powerhouse for economic value. Friendly reminder to everyone that LLM is not all of AI. It is just one piece of a bigger puzzle.”
- Boston Dynamics retired its hydraulic Atlas yesterday, making room for a brand new, all-electric Atlas destined for factory work.
- “We promise this is not a person in a bodysuit.
- An Electric New Era for Atlas | Boston Dynamics
- All New Atlas | Boston Dynamics – YouTube
- Hello, Electric Atlas – IEEE Spectrum – Boston Dynamics introduces a fully electric humanoid robot that “exceeds human performance”
- “Google DeepMind researchers shared new demos of its low-cost ALOHA 2 autonomous robots. The new demos showcase the robot’s ability to tie shoelaces, hang shirts, and more.
- “For the past year we’ve been working on ALOHA Unleashed 🌋 @GoogleDeepmind – pushing the scale and dexterity of tasks on our ALOHA 2 fleet. Here is a thread with some of the coolest videos! The first task is hanging a shirt on a hanger (autonomous 1x)
- Watch how fast Whisper runs using Llama 3 (worth it):
- Meta is on fire this week with Llama – read these
- “Introducing Meta Llama 3: the most capable openly available LLM to date. Today we’re releasing 8B & 70B models that deliver on new capabilities such as improved reasoning and set a new state-of-the-art for models of their sizes. Today’s release includes the first two Llama..
- “Meta is playing the long game and will go down in history as a seminal company! Most AI innovation in the OSD ecosystem will happen on the Llama architecture! The probability that the next breakthrough happens because of these Llama models is very high! Secretive closed…”
- “I think Meta and Llama-3 is the final nail in the coffin to several misconceptions I’ve been fighting against for the last year. Llama-3 Chat was trained on over 10M Instruction/Chat samples, and is one of the only finetunes that shows significant improvements to MMLU.…
- “HISTORIC MOMENT! Llama-3 70B numbers are INSANE! At 82.0 MMLU, it’s FAR AND AWAY the best OSS Model. GSM-8K, Math, and Human Eval are MIND BLOWING as well. The OSS community is definitely going to beat GPT-4 in a matter of weeks!! Xmas came very very early
- “The upcoming Llama-3-400B+ will mark the watershed moment that the community gains open-weight access to a GPT-4-class model. It will change the calculus for many research efforts and grassroot startups. I pulled the numbers on Claude 3 Opus, GPT-4-2024-04-09, and Gemini.…
- “It’s crazy that GPT 3.5 – that started the global AI revolution – was presented only 15 months ago. And now we have a model that is better AND can be run locally on a normal computer – Llama 3 8B.”
- Meta’s battle with ChatGPT begins now. Meta’s AI assistant is being put everywhere across Instagram, WhatsApp, and Facebook. Meanwhile, the company’s next major AI model, Llama 3, has arrived.
- “My mind is blown. @GroqInc is serving LLaMA 3 at over 800 tokens per second! 800. Tokens. Per. Second. This unlocks so many incredible use-cases. It’s one thing to see my demo — it’s another thing entirely to experience it for yourself. Do yourself a favor and try it asap.
- “Meta AI built into the search box of all $META products. 3B daily users, opening these apps multiple times a day, have access to LLAMA 3. What does this means for $GOOGL?”
- “Meta open source is by far the greatest threat and competitor to Google. If LLaMA 4/5 becomes default on devices with intelligence it disintermediates Google completely and they directly compete on advertising revenues. Defaults are super powerful & intents super $$s”
- “Meta releasing near gpt-4 level models is really driving the price of tokens down because anyone can take the weights and optimize the runtime eg groq, togetherapi, fireworks etc. Definitely not good for OpenAI”
- “Because anyone can work with them, open models are likely to improve very quickly, creating a lot of capabilities focused on factors ranging from speed to costs. Here is the new Llama 3 70B being served by Groq (with a q) at 224 tokens/second. This is real-time of me using it.
- Artificial General Intelligence (AGI)
- “This chart shows the three big questions of AI right now (even given the limitations of MMLU as a test): 1) OpenAI was very far ahead, now it isn’t. Will GPT-5 be another leap? 2) Will open source models converge with closed source? 3) Will rapid progress continue or hit a wall?
- “The current state-of-play in the key question of how good AI gets: So much depends on GPT-5. OpenAI had a year+ lead in creating a GPT-4 class model. Now there are four GPT-4 class models. If exponential growth is still possible, OpenAI should be the first to show us. Or not.
- OpenAI and Meta Reportedly Preparing New AI Models Capable of Reasoning
- AI now beats humans at basic tasks — new benchmarks are needed, says major report
- Augmented and Virtual Reality
- “Making 3D objects from a single photograph was a big challenge, but Tencent’s image mesh can create one in 10 seconds There is a free demo to play with. I made a penguin mech in Midjourney (the upper lefthand image) and it made it a 3D model. Impressive.
- “InstantMesh from Tencent is insane – Super fast Image-to-3D with high quality output. Link below – Generate a 3D model from a single image in 30 seconds for free”
- “Computer vision is so dope. You can map your space in real-time with the phone in your pocket — creating a simplified 3D model with ‘semantically meaningful’ entities representing the room, walls, windows and furniture within that space. Basically, a 3D floor plan.”
- AI agents
- “Multi-agent collaboration has emerged as a key AI agentic design pattern. Given a complex task like writing software, a multi-agent approach would break down the task into subtasks to be executed by different roles — such as a software engineer, product manager, designer, QA”
- “3/ Introducing our solution with Neo Sapiens: At the core of this innovation is the “boss” agent, a unique component capable of generating a team of N agents tailored to handle any assigned task with dynamic adaptability based on the task’s complexity.
- “Today we are adding an important new capability to Poe: multi-bot chat. This feature lets you easily chat with multiple models in a single thread. (1/n)
- “🤖 Introducing @Taskade’s Multi-AI Agents, now entering Beta! Imagine one AI agent researching while another converts insights into tasks. They can write articles, perform research, summarize findings, and edit content—all at once! 🚀 For early access, reply ‘AI Agent’! ✨
- “New toys for capturing reality in immaculate 3D with LiDAR SLAM. It’s so fun seeing it build out a 3D representation of your space in real-time. The Lixel K1 captures 200,000 points per second and has a 36 megapixel panoramic camera. These puppies can also produce radiance… https://twitter.com/bilawalsidhu/status/1781443758909010226
- Elections
- Tracking AI use in global elections – Rest of World
- OpenAI
- “Introducing a series of updates to the Assistants API 🧵 With the new file search tool, you can quickly integrate knowledge retrieval, now allowing up to 10,000 files per assistant. It works with our new vector store objects for automated file parsing, chunking, and embedding.
- “Claude 3 was the best model in the world for a grand total of 17 seconds. Right after it happened, OpenAI updated GPT-4 and put it back on top. As a reminder, models can’t cheat to get higher scores on this leaderboard. The score represents which answers people like better.
- Elon betting big
- Elon Musk to Raise $4 Billion for xAI, Challenging ChatGPT
- Synthetic data not a problem?
- “Big day for unexpectedly powerful LLM releases. Microsoft’s open source WizardLM 2 (also note that it used synthetic inputs in training, maybe “running out of data” will not be a big deal):
- HuggingFace app
- “we just shipped HuggingChat on iOS 💬 The app is super polished and gives you access to the community’s best open AI models, on the go. Give it a try! link to Appstore below
- Viggle: Lil Yachty Image to Video
- This AI Meme Went Viral (Here’s How It Was Made) – YouTube – https://www.youtube.com/watch?v=_C-1yAHPCIQ&t=523s
- See also my post discussing the other similar technologies: https://ethanbholland.com/2024/04/08/the-viggle-ai-memes-impact-on-image-to-video-awareness/
The Rest: AI News of The Week
Don’t let the volume overwhelm you. Have fun and skim these. The links are organized by topic, sorted from ‘coolest’ to ‘least cool’, and each topic is clearly defined with a headline. I’ve added a description and glossary of what the topics mean, beneath each label, in plain language. I do the work so you don’t have to! When you visit the pages, note that the links and descriptions are often pulled directly from tweets or articles, so it’s not always my voice. Pause when you see something that interests you. Reach out to me any time. I enjoy sharing and discussing these items.
Agency/Agents/Copilots News of the Week: Agency is when AI can do things for you (like Googling an actress’s name or fetching the latest weather forecast). An agent is one step further, when AI is given autonomy to take action on your behalf (“Alexa, book a reservation for three at Peak in Hudson Yards for Friday night”). A co-pilot is an assistant (like spell check or autofill).
This week’s latest agent news: https://ethanbholland.com/2024/04/19/agents-and-copilots-ai-news-week-ending-04-19-2024/
Amazon News of The Week: Individual company products will often be placed in the categories they match (image, audio, agents, robots, etc). Occasionally, I’ll dedicate space to a company’s news if it’s broad or a major product release.
This week’s latest Amazon AI news: https://ethanbholland.com/2024/04/19/amazon-ai-news-week-ending-04-19-2024/
Artificial General Intelligence (AGI) News of the Week: Artificial General Intelligence, in a nutshell, is when artificial intelligence is able to beat humans at everything (including embodying physical forms and completing physical tasks). It’s usually a thought catalyst for predictions, like when AGI will occur. 10 years? 25 years? 100? AGI is an event horizon that is tough to define, tough to imagine, and tough to predict. OpenAI defined AGI in its charter as “highly autonomous systems that outperform humans at most economically valuable work”. OpenAI has a section of its website dedicated to AGI. Google’s DeepMind published my favorite report on the five levels of artificial intelligence on the way to AGI (see also here).
This week’s latest Artificial General Intelligence (AGI) news: https://ethanbholland.com/2024/04/19/artificial-general-intelligence-agi-news-week-ending-04-19-2024/
AI Audio News of the Week: In this case, AI audio can mean a few things. The first is “generative audio” which refers to creating sounds with AI, much like ChatGPT writes words or MidJourney creates images. For example, asking for the “sound of waves crashing on the beach” would be text to sound. Another example would be an AI ‘watching’ a video and adding sound to it, like a foley artist would add footsteps or a creaking door to a movie scene. Lastly, AI audio can refer to microphones that only pick up certain speakers’ voices or headsets that cancel out all voices but your friends’.
This week’s latest AI audio news: https://ethanbholland.com/2024/04/19/audio-news-week-ending-04-19-2024/
Autonomous Vehicles/Driverless Cars News of the Week: Driverless car news doesn’t always get its own category, because it’s so close to robot embodiment. I go with my gut each week around what to place in each category. My recommendation would be to follow Robotics/Embodiment also, as the two fields are converging.
This week’s autonomous vehicle news: https://ethanbholland.com/2024/04/19/autonomous-vehicles-news-week-ending-04-19-2024/
Augmented and Virtual Reality (AR/VR) News of the Week: Augmented reality is when you see images or information on top of the real world. A car windshield with a heads-up display of the speed. Or glasses that have facial recognition and overlay the names of everyone in view. Virtual reality is when you are transported into another place, usually wearing goggles, but a flight simulator could also be considered virtual reality.
This week’s latest AR/VR news: https://ethanbholland.com/2024/04/19/augmented-and-virtual-reality-ar-vr-news-week-ending-04-19-2024/
Business/Enterprise AI News of the Week: This broad category is for stories that impact corporations and large-scale AI implementation. Enterprise refers to a type of AI that is often custom built for a business or leverages an API to connect secure data to an AI model.
This week’s latest enterprise AI news: https://ethanbholland.com/2024/04/19/business-and-enterprise-ai-news-week-ending-04-19-2024/
Chips and Hardware AI News of the Week: Most of the chip news is usually about NVIDIA, yet more and more, Meta, Google, and OpenAI are moving toward their own chip manufacturing. I have to make the call whether to put Meta, Google, and OpenAI’s chip news under this section or their company sections. Lately, I’m putting each company’s chip news into the company category, rather than the chips category. These are the rest of the chip headlines.
This week’s latest chips and hardware news: https://ethanbholland.com/2024/04/19/chips-and-hardware-week-ending-04-19-2024/
Consumer Electronics AI News of the Week: This is a broad category meant to capture end-user tools and products that incorporate artificial intelligence into their features, from high-end grills to smartphones.
This week’s latest consumer AI news: https://ethanbholland.com/2024/04/19/consumer-products-week-ending-04-19-2024/
Education AI News of the Week: There is a lot of buzz around the impact of AI in education. This section focuses both on the risks and rewards of how AI can impact learning. It’s broader than just K-12 and includes things like skills, trade, professional, and higher education. This is not about how to learn AI, it’s about AI’s impact on learning.
This week’s latest education news: https://ethanbholland.com/2024/04/19/education-ai-news-week-ending-04-19-2024/
Ethics/Legal/Security AI News of the Week: This section focuses on the impact AI is having on ethics (deep fakes, war, trust, false information, plagiarism, job loss, income), legal (rights, laws, regulations), and security (hacking, phishing, national interests, safety). For huge news stories like the NY Times suing OpenAI, I usually put them under the main section or give them their own page.
This week’s latest AI ethics/legal/security news: https://ethanbholland.com/2024/04/19/ethics-legal-security-ai-news-week-ending-04-19-2024/
Google AI News of the Week: Individual company products will often be placed in the categories they match (image, audio, agents, robots, etc). Occasionally, I’ll dedicate space to a company’s news if it’s broad or a major product release.
This week’s latest Google AI news: https://ethanbholland.com/2024/04/19/google-ai-news-week-ending-04-19-2024/
Imagery News of the Week: AI imagery covers “generative AI” image tools. This is usually text-to-image, where a user enters a prompt (“a polar bear walking through NYC”) and a tool like DALL-E or MidJourney generates an image in the likeness of the description. This is different than AI vision, where an AI “looks at” an image and can derive context, details, and contents. AI vision is a subset of AI called multimodality. Imagery, in this case, is for image creation and modification/editing. Adobe Photoshop’s AI tools would fall into this category. I’ll also include things like automatic masking and object removal, even though that’s in between imagery and vision… but practically speaking it fits into editing.
This week’s latest AI image news: https://ethanbholland.com/2024/04/19/imagery-news-week-ending-04-19-2024/
International AI News of the Week: A lot of international news will get cross-listed in the chips, security, or open-source categories; however, it’s nice to have a separate category for worldwide AI news.
This week’s latest international AI news: https://ethanbholland.com/2024/04/19/international-ai-news-week-ending-04-19-2024/
Locally Run AI Models News of the Week: This is a niche mostly for serious AI followers. It refers to AI that can be privately downloaded and run on a device without an internet connection. These have an array of powerful implications, from ethics of rogue users with untethered agents, to practical uses like Apple running a full AI on your phone, to corporate installations for security, to embodied robots with AI running in their virtual brain.
This week’s latest locally run AI news: https://ethanbholland.com/2024/04/19/locally-run-ai-models-news-week-ending-04-19-2024/
Meta AI News of the Week: This is a space dedicated to Meta-specific AI advancements and news stories.
This week’s Meta AI news: https://ethanbholland.com/2024/04/19/meta-ai-news-week-ending-04-19-2024/
Microsoft AI News of the Week: This is a space dedicated to Microsoft-specific AI advancements and news stories.
This week’s Microsoft AI news: https://ethanbholland.com/2024/04/19/microsoft-ai-news-week-ending-04-19-2024/
Multimodal AI News of the Week: This is a broad topic for a single AI model that demonstrates an ability to interact with more than one modality (imagery, video, audio, text). Often multimodal news will end up in one of these categories. I’m playing it by ear on a case-by-case basis. Please be patient with my organizational challenges.
This week’s multimodal AI news: https://ethanbholland.com/2024/04/19/multimodality-news-week-ending-04-19-2024/
OpenAI: OpenAI is the leading force in the AI boom of 2023 and now 2024. This section focuses on news that is specific to OpenAI. This section will compete with all of the other sections (imagery, vision, ethics, etc) because OpenAI is so broad. I won’t be able to consistently pick when to put things under OpenAI or other sections, so bear with me.
This week’s latest OpenAI news: https://ethanbholland.com/2024/04/19/openai-news-week-ending-04-19-2024/
Open Source Models: An open source AI model refers to a class of artificial intelligence models with public source code. They can be inspected, copied, installed, and customized on private computers. In contrast, a closed source model is proprietary and owned by a company that you pay to use (like PowerPoint or Photoshop). One of the most famous open source language models is a French model called Mistral. Its code is completely publicly available, and anyone can download it and customize it. On one hand, open source is a transparent and powerful way to democratize AI, but on the other hand, open source models circumvent the guard rails and copyright protections that private companies implement. Open source models are the wild west of artificial intelligence, but also the potential saving grace (depending on who you ask). It’s a bit like gun control debates but for computing power.
This week’s latest open source news: https://ethanbholland.com/2024/04/19/open-source-ai-news-week-ending-04-19-2024/
Podcast/YouTube Clips of the Week: This is for more general interviews and explainer videos and podcasts that provide access to leadership, demos of new products, and walkthroughs and tutorials. Videos focused on specific topics will live in the topic category (e.g., images), but broader videos will live here.
This week’s latest podcasts and YouTube clips: https://ethanbholland.com/2024/04/19/podcasts-youtube-op-eds-week-ending-04-19-2024/
Publishing AI News of the Week: These are stories about AI’s impact on the publishing industry. From copyright and crawling to the death of page views or even the end of browsers.
This week’s latest publishing AI news: https://ethanbholland.com/2024/04/19/publishing-news-week-ending-04-19-2024/
RAG Retrieval-Augmented Generation News of the Week: RAG allows a language model to “reference an authoritative knowledge base outside of its training data sources before generating a response” (via Amazon). Historically, RAG was prone to hallucinations; however, new methods are improving its reliability. There is enough news about RAG that I want to start tracking it separately for my own use.
This week’s latest RAG (Retrieval-Augmented Generation) AI news: https://ethanbholland.com/2024/04/19/retrieval-augmented-generation-rag-news-week-ending-04-19-2024/
Robotics/Embodiment News of the Week: This is the most intense area of AI. Embodiment refers to putting an AI inside of a machine. It’s “embodying” the object and therefore giving a robot agency in the real world. An example would be using a large language model as an interface to a complex coding task. Just as you ask “Alexa, play Bad Blood by Taylor Swift on Spotify” using plain language, with embodiment you could ask a robot to “Go to the laundry basket and bring me all of the red shirts”. The language model in the robot would translate your request into the proper code to go get the red shirts. The robot was never trained on the task. Another type of embodiment would be training a robot using virtual reality simulations. Using a simulation, a robot could be trained on thousands of scenarios until the real world can be swapped in and the robot doesn’t “notice”. This section also includes factory automation and human prosthetics. There will be some overlap with other categories like autonomous vehicles. I first learned about embodiment from Alan Thompson. I highly recommend his video explainer: https://youtu.be/peLqYP9BAUg?si=2FzrvDlw-qaQFaCx.
This week’s latest robot and embodiment AI news: https://ethanbholland.com/2024/04/19/robotics-and-embodiment-news-week-ending-04-19-2024/
Science/Medicine AI News of the Week: AI’s strength is learning patterns. This applies nicely to medical diagnosis and identifying trends. When combined with data and AI vision, this means AI is good at looking at x-rays. Language models are helping with patient interface, and robotics and augmented reality are advancing surgery. Powerful enterprise models like Google’s AlphaFold can master protein folding. Other models can read ancient scrolls without opening them.
This week’s latest AI science and medicine news: https://ethanbholland.com/2024/04/19/science-and-medicine-news-week-ending-04-19-2024/
AI Video News of the Week: AI video in this case refers to generative video, much like imagery meant generative imagery. This is usually text-to-video, where a user enters a prompt (“a wizard walking out of a flaming building”) and a tool like Pika or Runway generates a video in the likeness of the description. It also covers animation of still images, where an image is given motion (like a photo of a waterfall appearing to have flowing water). As with images, this is different than AI vision, where an AI “looks at” an image or video and can derive context, details, and contents. Video, in this case, is video creation and modification/editing.
This week’s latest AI video news: https://ethanbholland.com/2024/04/19/video-news-week-ending-04-19-2024/
X/Twitter/Grok: Grok is one of several AIs developed by X, and it’s a bit blended in with Tesla and other Elon Musk technology. Not every week will have a Grok section, but like Meta, Google, Apple, and OpenAI, X will be in the news enough to have its own section.
This week’s latest X news: https://ethanbholland.com/2024/04/19/twitter-x-grok-week-ending-04-19-2024/
Technical and AI Developer News of the Week: Everything that is too technical for general consumption goes here. These are stories I think are important, but might be inaccessible and confusing. It’s also a space for developer news and deep dives into how AI works, under the hood.
This week’s technical and dev AI news: https://ethanbholland.com/2024/04/19/tech-and-development-week-ending-04-19-2024/
Credits/Sources

Most of these weekly links come from just a few prolific oversharing sources. Please follow them, as they work hard to find the news each week and they make it a lot easier for me to compile.
- Robert Scoble: https://x.com/Scobleizer
- Ethan Mollick: https://www.linkedin.com/in/emollick/
- Alan Thompson: https://lifearchitect.ai/
- Theoretically Media: https://www.youtube.com/@TheoreticallyMedia
- The Rundown: https://www.therundown.ai/
- Bilawal Sidhu: https://twitter.com/bilawalsidhu/
- TLDR: https://tldr.tech/ai
- Jeremiah Owyang: https://twitter.com/jowyang
- Nick St. Pierre: https://twitter.com/nickfloats
- Dr. Jim Fan: https://twitter.com/DrJimFan
- All About AI: https://www.youtube.com/@AllAboutAI
- Marshall Kirkpatrick: https://aitimetoimpact.com/
- AI News (Smol Talk): https://buttondown.email/ainews/archive/
For previous issues, please visit the archives!

Thanks for reading!




