This week’s cover is a remix of Tyler, The Creator’s album “Flower Boy”. Music is the big news this week, as a company called Udio released a text-to-music tool that creates radio-quality songs. Tyler’s photo on the album has been replaced by a robot. The robot was prompted in MidJourney using the /describe tool and masked using Photoshop. The “AI, The Creator” text is Cooper Black, via Tyler’s album “Goblin”. The rest of the text is Futura Extra Bold, from “Call Me If You Get Lost”.

Executive Summary

  • Apple AI may obliterate the smartphone: Apple is working on a multimodal AI model that understands iPhone interfaces. The model will know what’s on the screen, which apps are open, what’s running in the background, how to interact between apps, and how to browse the web and act on behalf of users. If this happens, you may not need to touch the phone, open browsers, or change apps; the screen will simply do and show what you want.
  • Brace yourselves for the end of smartphones and apps: Former Apple designer Jony Ive and OpenAI’s Sam Altman are seeking funding for a personal AI device. Just as Apple (see above) is creating multimodal models that let you talk with your phone without opening or closing apps or browsers, the Rabbit R1 and Open Interpreter’s O1 are personal devices that hope to make smartphones obsolete. If that weren’t a strong enough hint of the future, Altman and Ive now want to build a similar device.
  • Meta’s AR glasses are secretly robot training tools: “Today we’re releasing OpenEQA — the Open-Vocabulary Embodied Question Answering Benchmark. It measures an AI agent’s understanding of physical environments by probing it with open vocabulary questions like ‘Where did I leave my badge?’” Remember, the word “embodied” is tech talk for AI that’s been given a body. The Meta AR glasses are really robot trainers. It’s hiding in plain sight in the press release, no pun intended.
  • Google Gemini is blowing people’s minds with huge memory and strong listening abilities: Google’s Gemini can fit more than five Harry Potter books inside the prompt field.  Imagine holding over a million words in your short-term memory, in one glance (all the words in a pile), and answering questions about the books, instantly.  As quickly as a calculator does math, Gemini can contextualize and discuss a million words.  Further, it can hold and process audio and video files in the same manner, without a transcript.
  • The new Search Engine Optimization: Researchers are learning how to manipulate training data to force products to show up in chats with LLMs.  via hima_lakkaraju
  • Code-writing breakthroughs:  An automated AI software coding agent “solved 67 GitHub issues in less than ten minutes each, whereas developers spent more than 2.77 days on average” via: abacaj  
  • Elon Musk predicts AI will be smarter than people next year: Computing power is no longer an issue. The only thing holding us back now is electricity supply. via The Guardian
  • Udio enables high-quality text-to-song generation: Radio-quality audio generated by text prompts, for free, in less than 90 seconds. Suno, the earlier song creation tool, just got crushed. Insane pace of improvement. via @udio
  • Robotaxis: Tesla is unveiling its new ‘robotaxi’ on August 8. via electrek
  • Apple integrating AI chips into its new computers: Apple is also gearing up to overhaul its entire lineup of Macs with an upcoming AI-focused M4 chip family. via _akhaliq and Bloomberg
  • Google may have discovered something better than transformers: The T in GPT stands for “transformer” and, despite being made famous by OpenAI, the transformer was invented at Google. This week Google released a “model with new Griffin architecture that outperforms transformers. Across multiple sizes, Griffin outperforms the benchmark scores of transformers baseline in controlled tests in both the MMLU score across different parameter sizes as well as the average score of many benchmarks. The architecture also offers efficiency advantages with faster inference and lower memory usage when inferencing long contexts.” via rohanpaul_ai
  • Meta is launching its GPT-4 competitor by May: “Meta confirms that its Llama 3 open source LLM is coming in the next month” via TechCrunch
  • JP Morgan CEO Jamie Dimon says AI is as transformational as the printing press: “JP Morgan CEO Jamie Dimon believes AI will be as transformative as the printing press. The bank is using AI for marketing and risk management, with over 2,000 employees working on 400 AI applications.” via CNBC
  • Intel releases new Gaudi 3 AI chip to challenge NVIDIA: “Intel just unveiled its Gaudi 3 AI chip at the company’s Vision event, taking direct aim at Nvidia. The chip offers enterprises an open, flexible alternative for deploying generative AI at scale and promises 50% faster training than NVIDIA’s H100” via intc.com

Top 48 Links of The Week

AI Visuals and Charts: Week Ending 04/12/2024

Apple’s new AI model can operate a smartphone

Via https://twitter.com/_akhaliq/status/1777542957383446691/photo/1 

AI v. Humans 

Via https://aiindex.stanford.edu/report/ 

Figure-01 electromechanical humanoid robot  

Via https://twitter.com/adcock_brett/status/1776672870816739369

“More reports are coming in about how much more performant the new GPT-4 model is. Seems like there is a huge jump in multiple categories”

Via https://twitter.com/bindureddy/status/1778108344051572746 

The Rest: AI News of The Week

Don’t let the volume overwhelm you.  Have fun and skim these. The links are organized by topic, sorted from ‘coolest’ to ‘least cool’, and each topic is clearly defined with a headline.  I’ve added a description and glossary of what the topics mean, beneath each label, in plain language.  I do the work so you don’t have to!   When you visit the pages, note that the links and descriptions are often pulled directly from tweets or articles, so it’s not always my voice.  Pause when you see something that interests you.  Reach out to me any time. I enjoy sharing and discussing these items.

Agency/Agents/Copilots News of the Week: Agency is when AI can do things for you (like Googling an actress’s name or fetching the latest weather forecast). An agent is one step further, when AI is given autonomy to take action on your behalf (“Alexa, book a reservation for three at Peak in Hudson Yards for Friday night”). A co-pilot is an assistant (like spell check or autofill).
This week’s latest agent news: https://ethanbholland.com/2024/04/12/agents-and-copilots-ai-news-week-ending-04-12-2024/

Amazon News of The Week: Individual company products will often be placed in the categories they match (image, audio, agents, robots, etc). Occasionally, I’ll dedicate space to a company’s news if it’s broad or a major product release.
This week’s latest Amazon AI news: https://ethanbholland.com/2024/04/25/amazon-ai-news-week-ending-04-12-2024/

Anthropic News of the Week: Anthropic is a company that builds LLMs, like OpenAI, Mistral, and Meta. Their main AI brand is Claude. As with Amazon and Apple, individual Anthropic company posts will often be placed in the categories they match (image, audio, agents, robots, etc). Occasionally, I’ll dedicate space to a company’s news if it’s broad or a major product release.
This week’s Anthropic news: https://ethanbholland.com/2024/04/12/anthropic-news-week-ending-04-12-2024/

Apple News of the Week: As with Amazon, individual Apple company products will often be placed in the categories they match (image, audio, agents, robots, etc). Occasionally, I’ll dedicate space to a company’s news if it’s broad or a major product release.
This week’s latest Apple AI news: https://ethanbholland.com/2024/04/12/apple-ai-news-week-ending-04-12-2024/

Artificial General Intelligence (AGI) News of the Week: Artificial General Intelligence, in a nutshell, is when artificial intelligence is able to beat humans at everything (including embodying physical forms and completing physical tasks).  It’s usually a thought catalyst for predictions, like when AGI will occur. 10 years? 25 years? 100? AGI is an event horizon that is tough to define, tough to imagine, and tough to predict. OpenAI defined AGI in its charter as “highly autonomous systems that outperform humans at most economically valuable work”. OpenAI has a section of its website dedicated to AGI. Google’s DeepMind published my favorite report on the five levels of artificial intelligence on the way to AGI (see also here).
This week’s latest Artificial General Intelligence (AGI) news: https://ethanbholland.com/2024/04/12/artificial-general-intelligence-agi-news-week-ending-04-12-2024/

AI Audio News of the Week: In this case, AI audio can mean a few things. The first is “generative audio” which refers to creating sounds with AI, much like ChatGPT writes words or MidJourney creates images. For example, asking for the “sound of waves crashing on the beach” would be text to sound. Another example would be an AI ‘watching’ a video and adding sound to it, like a foley artist would add footsteps or a creaking door to a movie scene. Lastly, AI audio can refer to microphones that only pick up certain speakers’ voices or headsets that cancel out all voices but your friends’.
This week’s latest AI audio news: https://ethanbholland.com/2024/04/12/audio-news-week-ending-04-12-2024/

Autonomous Vehicles/Driverless Cars News of the Week: Driverless car news doesn’t always get its own category, because it’s so close to robot embodiment. I go with my gut each week around what to place in each category. My recommendation would be to follow Robotics/Embodiment also, as the two fields are converging.
This week’s autonomous vehicle news: https://ethanbholland.com/2024/04/12/autonomous-vehicles-news-week-ending-04-12-2024/

Augmented and Virtual Reality (AR/VR) News of the Week: Augmented reality is when you see images or information on top of the real world.  A car windshield with a heads-up display of the speed. Or glasses that have facial recognition and overlay the names of everyone in view. Virtual reality is when you are transported into another place, usually wearing goggles, but a flight simulator could also be considered virtual reality.
This week’s latest AR/VR news: https://ethanbholland.com/2024/04/12/augmented-and-virtual-reality-ar-vr-news-week-ending-04-12-2024/

Business/Enterprise AI News of the Week: This broad category is for stories that impact corporations and large-scale AI implementation. Enterprise refers to a type of AI that is often custom built for a business or leverages an API to connect secure data to an AI model.
This week’s latest enterprise AI news: https://ethanbholland.com/2024/04/12/business-and-enterprise-ai-news-week-ending-04-12-2024/

Chips and Hardware AI News of the Week: Most chip news is usually about NVIDIA, yet more and more, Meta, Google, and OpenAI are moving toward their own manufacturing. I have to make the call whether to put Meta, Google, and OpenAI’s chip news under this section or their company sections. Lately, I’m putting each company’s chip news into the company category, rather than the chips category. This is the rest of the chip headlines.
This week’s latest chips and hardware news: https://ethanbholland.com/2024/04/12/chips-and-hardware-week-ending-04-12-2024/

Consumer Electronics AI News of the Week: This is a broad category meant to capture end user tools and products that incorporate artificial intelligence into their features, from high-end grills to smartphones.
This week’s latest consumer AI news: https://ethanbholland.com/2024/04/12/consumer-products-week-ending-04-12-2024/

Ethics/Legal/Security AI News of the Week: This section focuses on the impact AI is having on ethics (deep fakes, war, trust, false information, plagiarism, job loss, income), legal (rights, laws, regulations), and security (hacking, phishing, national interests, safety). For huge news stories like the NY Times suing OpenAI, I usually put them under the main section or give them their own page.
This week’s latest AI ethics/legal/security news: https://ethanbholland.com/2024/04/12/ethics-legal-security-ai-news-week-ending-04-12-2024/

Google AI News of the Week: Individual company products will often be placed in the categories they match (image, audio, agents, robots, etc). Occasionally, I’ll dedicate space to a company’s news if it’s broad or a major product release.
This week’s latest Google AI news: https://ethanbholland.com/2024/04/12/google-ai-news-week-ending-04-12-2024/

Imagery News of the Week: AI imagery covers “generative AI” image tools. This is usually text-to-image, where a user enters a prompt (“a polar bear walking through NYC”) and a tool like DALL-E or MidJourney generates an image in the likeness of the description. This is different than AI vision, where an AI “looks at” an image and can derive context, details, and contents. AI vision is a subset of AI called multimodality. Imagery, in this case, is for image creation and modification/editing. Adobe Photoshop’s AI tools would fall into this category. I’ll also include things like automatic masking and object removal, even though that’s in between imagery and vision… but practically speaking it fits into editing.
This week’s latest AI image news: https://ethanbholland.com/2024/04/12/imagery-news-week-ending-04-12-2024/

International AI News of the Week: A lot of international news will get cross-listed in the chips, security, or open-source categories; however, it’s nice to have a separate category for worldwide AI news.
This week’s latest international AI news: https://ethanbholland.com/2024/04/12/international-ai-news-week-ending-04-12-2024/

Locally Run AI Models News of the Week: This is a niche mostly for serious AI followers. It refers to AI that can be privately downloaded and run on a device without an internet connection. These have an array of powerful implications, from ethics of rogue users with untethered agents, to practical uses like Apple running a full AI on your phone, to corporate installations for security, to embodied robots with AI running in their virtual brain.
This week’s latest locally run AI news: https://ethanbholland.com/2024/04/12/locally-run-ai-models-news-week-ending-04-12-2024/

Meta AI News of the Week: This is a space dedicated to Meta-specific AI advancements and news stories.
This week’s Meta AI news: https://ethanbholland.com/2024/04/12/meta-ai-news-week-ending-04-12-2024/

Microsoft AI News of the Week: This is a space dedicated to Microsoft-specific AI advancements and news stories.
This week’s Microsoft AI news: https://ethanbholland.com/2024/04/12/microsoft-ai-news-week-ending-04-12-2024/

Mobile AI News of the Week: In April 2024, I added a dedicated category for mobile. Previously, I put most of the mobile news into either the company sections (Apple v. Google v. Microsoft) or locally run AI. It also ended up in the chips and hardware section, or the consumer products category. There is enough mobile news to at least start cross-linking it all in one place.
This week’s latest mobile AI news: https://ethanbholland.com/2024/04/12/mobile-2/

Multimodal AI News of the Week: This is a broad topic for a single AI model that demonstrates an ability to interact with more than one modality (imagery, video, audio, text). Often multimodal news will end up in one of those individual categories. I’m playing it by ear on a case-by-case basis. Please be patient with my organizational challenges.
This week’s multimodal AI news: https://ethanbholland.com/2024/04/12/multimodality-news-week-ending-04-12-2024/

OpenAI: OpenAI is the leading force in the AI boom of 2023 and now 2024. This section focuses on news that is specific to OpenAI. This section will compete with all of the other sections (imagery, vision, ethics, etc) because OpenAI is so broad. I won’t be able to consistently pick when to put things under OpenAI or other sections, so bear with me.
This week’s latest OpenAI news: https://ethanbholland.com/2024/04/12/openai-news-week-ending-04-12-2024/

Open Source Models: An open source AI model refers to a class of artificial intelligence models with public source code. They can be inspected, copied, installed, and customized on private computers. In contrast, a closed source model is proprietary and owned by a company that you pay to use (like PowerPoint or Photoshop). One of the most famous open source language models is a French model called Mistral. Its code is completely publicly available, and anyone can download it and customize it. On one hand, open source is a transparent and powerful way to democratize AI, but on the other hand, open source models circumvent the guard rails and copyright protections that private companies implement. Open source models are the wild west of artificial intelligence, but also the potential saving grace (depending on who you ask). It’s a bit like gun control debates but for computing power.
This week’s latest open source news: https://ethanbholland.com/2024/04/12/open-source-ai-news-week-ending-04-12-2024/

Podcast/YouTube Clips of the Week: This is for more general interviews, explainer videos, and podcasts that provide access to leadership, demos of new products, and walkthroughs and tutorials. Videos focused on specific topics will live in the topic category (e.g., images), but broader videos will live here.
This week’s latest podcasts and YouTube clips: https://ethanbholland.com/2024/04/12/podcasts-youtube-op-eds-week-ending-04-12-2024/

Publishing AI News of the Week: These are stories about AI’s impact on the publishing industry. From copyright and crawling to the death of page views or even the end of browsers.
This week’s latest publishing AI news: https://ethanbholland.com/2024/04/12/publishing-news-week-ending-04-12-2024/

Robotics/Embodiment News of the Week: This is the most intense area of AI. Embodiment refers to putting an AI inside of a machine. It’s “embodying” the object and therefore giving a robot agency in the real world. An example would be using a large language model as an interface to a complex coding task. Just as you ask “Alexa, play Bad Blood by Taylor Swift on Spotify” using plain language, with embodiment you could ask a robot to “Go to the laundry basket and bring me all of the red shirts”. The language model in the robot would translate your request into the proper code to go get the red shirts, even though the robot was never trained on that task. Another type of embodiment would be training a robot using virtual reality simulations. Using a simulation, a robot could be trained on thousands of scenarios until the simulation can be swapped out for the real world and the robot doesn’t “notice”. This section also includes factory automation and human prosthetics. There will be some overlap with other categories like autonomous vehicles. I first learned about embodiment from Alan Thompson. I highly recommend his video explainer: https://youtu.be/peLqYP9BAUg?si=2FzrvDlw-qaQFaCx.
This week’s latest robot and embodiment AI news: https://ethanbholland.com/2024/04/12/robotics-and-embodiment-news-week-ending-04-12-2024/

Science/Medicine AI News of the Week: AI’s strength is learning patterns. This applies nicely to medical diagnosis and identifying trends. When combined with data and AI vision, this means AI is good at looking at X-rays. Language models are helping with patient interface, and robotics and augmented reality are advancing surgery. Powerful enterprise models like Google’s AlphaFold can master protein folding. Other models can read ancient scrolls without opening them.
This week’s latest AI science and medicine news: https://ethanbholland.com/2024/04/12/science-and-medicine-news-week-ending-04-12-2024/

AI Video News of the Week: AI video in this case refers to generative video, much like imagery meant generative imagery. This is usually text-to-video, where a user enters a prompt (“a wizard walking out of a flaming building”) and a tool like Pika or Runway generates a video in the likeness of the description. It also covers animation of still images, where an image is given motion (like a photo of a waterfall appearing to have flowing water). As with images, this is different than AI vision, where an AI “looks at” an image or video and can derive context, details, and contents. Video, in this case, is video creation and modification/editing.
This week’s latest AI video news: https://ethanbholland.com/2024/04/12/video-news-week-ending-04-12-2024/

X/Twitter/Grok: Grok is one of several AIs developed by X, and it’s a bit blended in with Tesla and other Elon Musk technology. Not every week will have a Grok section, but like Meta, Google, Apple, and OpenAI, X will be in the news enough to have its own section.
This week’s latest X news: https://ethanbholland.com/2024/04/12/twitter-x-grok-week-ending-04-12-2024/

Technical and AI Developer News of the Week: Everything that is too technical for general consumption goes here. These are stories I think are important, but might be inaccessible and confusing. It’s also a space for developer news and deep dives into how AI works, under the hood.
This week’s technical and dev AI news: https://ethanbholland.com/2024/04/12/tech-and-development-news-week-ending-04-12-2024/

Credits/Sources

Most of these weekly links come from just a few prolific oversharing sources. Please follow them, as they work hard to find the news each week and they make it a lot easier for me to compile.

For previous issues, please visit the archives!

Thanks for reading!

