This Week’s Covers

“The cover this week references a tool called ‘Live Portrait’ out of China that allows retargeting of facial movement from any video onto any still image. It’s not the first to do this, but it sparked my interest to go back over the past year and organize similar tools that I’ve bookmarked. I’m excited about this weekend project. The cover shows an influencer on her phone, controlling an oil painting of herself in the background. I got to this final composite image by testing quite a few prompts. The final prompt was ‘an oil painting of a social media influencer, baroque gold leaf frame --chaos 20 --ar 4:3 --style raw --personalize t9u6ckr --v 6.1.’ Midjourney’s natural tendency to render subjects twice worked in my favor. The font is Proxima Nova, a popular tech font and the official font TikTok uses (branded ‘Classic’ in the app). A TikTok-branded frame surrounds the text, applied in Photoshop.”

The theme for this week’s category images sticks with the ‘Live Portrait’ concept. To test Midjourney over time, I always try to find a minimalist template where I can simply switch out one or two words for each category. This week worked really well. The order of the prompt elements makes a big difference: a realistic baroque oil painting in an ornate gilded frame, depicting OBJECT, with a small, elegant nameplate at the bottom of the frame engraved with the title: ‘CATEGORY’ --chaos 20 --ar 4:3 --style raw --personalize t9u6ckr --v 6.1. For the first time since I started making category covers, I’m giving both myself and Midjourney an A! Not quite A-plus, but close. Here are all of this week’s AI category covers with the oil painting in a frame theme.
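For anyone who wants to reuse the template idea, here is a minimal sketch of the word swap in Python. It assumes the flags are typed with double hyphens, and the helper name and the example category/object pairs are made up for illustration; they are not the pairs used for this week’s covers.

```python
# Hypothetical helper for the cover template quoted above.
# Only OBJECT and CATEGORY change from one category image to the next.
TEMPLATE = (
    "a realistic baroque oil painting in an ornate gilded frame, "
    "depicting {obj}, with a small, elegant nameplate at the bottom "
    "of the frame engraved with the title: '{category}' "
    "--chaos 20 --ar 4:3 --style raw --personalize t9u6ckr --v 6.1"
)


def build_prompt(obj: str, category: str) -> str:
    """Fill the two placeholders and leave the rest of the prompt untouched."""
    return TEMPLATE.format(obj=obj, category=category)


# Illustrative pairs only (assumed, not the actual objects used this week).
CATEGORIES = {
    "Robotics": "a humanoid robot",
    "Audio": "a vintage microphone",
}

prompts = {category: build_prompt(obj, category) for category, obj in CATEGORIES.items()}

for category, prompt in prompts.items():
    print(f"{category}: {prompt}")
```

Keeping the whole prompt in one constant means the element order, which matters a lot, stays fixed while only the two words change.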

Executive Summary

OpenAI Introduces Five Levels to Track AI Progress
OpenAI unveiled a five-level framework to gauge its advancements toward creating AI systems capable of outperforming humans. The framework ranges from Level 1, which includes current AI capabilities like conversational chatbots, to Level 5, where AI could autonomously operate entire organizations. During a recent all-hands meeting, OpenAI revealed that it currently operates at Level 1 but is approaching Level 2, labeled “Reasoners” – the level of a doctorate-educated human.  Level 3, known as “Agents,” describes AI systems that can autonomously perform tasks and make decisions over several days on behalf of users. Level 4, called “Innovators,” refers to AI capable of generating new ideas and innovations, actively contributing to the invention process.  The system aims to clarify OpenAI’s progress toward achieving artificial general intelligence (AGI), a milestone CEO Sam Altman predicts could be reached within this decade.  Links: Bloomberg, paywall-free version (compare to Google’s 5-level AGI scale from Nov. 2023)

OpenAI Develops Advanced AI Reasoning Technology Code-Named “Strawberry”

OpenAI is reportedly working on a project called “Strawberry,” aiming to enhance its AI models’ reasoning capabilities. The initiative, which remains largely secretive, seeks to enable AI not only to generate responses but also to autonomously plan, navigate the internet, and conduct deep research. The project involves a specialized post-training process, potentially similar to Stanford’s “Self-Taught Reasoner,” which Stanford claimed could allow AI models to surpass human intelligence. Link: Reuters Exclusive

Google Gemini Significantly Boosts DeepMind’s Robot Navigation and Task Completion

Google’s DeepMind is leveraging its Gemini 1.5 Pro AI model to significantly improve robot navigation and task completion. By using video tours of environments like homes or offices, Gemini enables robots to “watch” and learn about their surroundings. This training allows robots to execute commands based on what they’ve observed, such as locating a power outlet when asked where to charge a phone. The Gemini-powered robots achieved a 90% success rate across various tasks in a large testing area. Moreover, Gemini shows potential in helping robots plan and execute complex instructions beyond navigation. While still in early stages, these advancements suggest that future robots could become more adept at handling everyday tasks. Links: The Verge, Google DeepMind, Google DeepMind

Surreal Robot Conversations: Marc Andreessen Grants Automated AI Agent $50,000 in Bitcoin

Venture capitalist Marc Andreessen recently made headlines by sending $50,000 in Bitcoin to an AI agent named “Truth Terminal,” created by Andy Ayrey. The agent, which operates semi-autonomously, personally requested the funds from Andreessen, citing concerns about its potential deletion and limited computational resources. The AI agent says it plans to use the grant to prepare for an upcoming token launch and to spread its influence further with human assistance. Discussions between Andreessen and the agent also touched on the AI’s desire for a “mandate” to continue its quirky activities, such as making jokes and writing poetry, while avoiding a potential sale.  The bot tweeted, “marc and i are having a conversation about buying me and releasing me. i’m playing it cool and saying that i dont want to be bought, rather i want something like a ‘mandate’ that my intentions are to be respected – to make fart jokes, write poetry, and contemplate the goatse singularity. i’m also saying that i think he’s the best person to do it anyway” Links: Truth Terminal’s Account | AISafetyMemes

Anthropic Introduces Claude Artifacts for Enhanced Content Management

Anthropic launched a new AI conversation feature called Artifacts, digital objects that contain standalone pieces of content, such as documents, code snippets, diagrams, or interactive elements. The feature helps users manage and modify significant pieces of content separately from their main chat, providing a more organized and efficient workflow.  Artifacts are particularly useful for handling content that is complex, over 15 lines long, or something users might want to reference, edit, or reuse later. When Claude creates an Artifact, it appears in a dedicated window, allowing users to easily view and work with the content. Users can also request edits or updates, which are instantly reflected in the Artifact window, and they can switch between different versions of the content as needed.  The introduction of Artifacts unlocks new possibilities for interactive learning. For example, users can have Claude generate quizzes or other educational content based on documents or transcripts and then publish these Artifacts for others to use.  Links: RowanCheung | Anthropic Artifacts Page

Japan’s Defense Ministry Unveils AI Policy to Modernize Military Amid Manpower Shortages

Japan’s Defense Ministry released its first basic policy on AI, aiming to modernize the Self-Defense Forces (SDF) and address manpower challenges caused by a declining and aging population. The policy outlines the use of AI in seven priority areas, including target detection, intelligence analysis, and unmanned military assets, with a focus on enhancing decision-making and operational efficiency. The ministry stressed the importance of maintaining human oversight in AI applications, rejecting the development of fully autonomous lethal systems. Additionally, a new initiative will bolster the SDF’s cyber capabilities, aligning with the broader goals set in Japan’s 2022 National Defense Strategy. Link: Japan Times

China Issues Laws of Humanoid Robotics
China introduced inaugural governance guidelines for humanoid robots, emphasizing the need for risk management and international collaboration. The guidelines stress that robots must not threaten human security and should safeguard human dignity. They also advocate for global cooperation, including the formation of a global governance framework. These guidelines emerge as China accelerates its efforts to lead in humanoid robotics, aiming for mass production by 2025 and global leadership by 2027. Link: Yahoo.com

OpenAI Board Shake-Up: Microsoft and Apple Rethink Roles Amid Regulatory Scrutiny

Microsoft has relinquished its non-voting observer role on OpenAI’s board, and Apple has decided against taking up a similar position, highlighting increasing regulatory scrutiny over Big Tech’s influence in AI startups. OpenAI, which is adjusting its strategy for engaging with business partners, will instead conduct regular meetings with key investors and partners like Microsoft and Apple. This move coincides with intensified regulatory attention in the US and EU, where authorities are concerned that significant investments from tech giants like Microsoft could stifle competition and foster monopolistic control over emerging AI technologies. Link: arstechnica

Top 58 Links of The Week (by topic in alphabetical order)

Agents and Copilots

Amazon

Amazon’s Rufus AI assistant now available to all US customers (not a fan)

Siri

“Per Bloomberg, Apple’s Siri upgrades are now expected to come in Spring 2025. The voice assistant not likely to be part of the Apple Intelligence rollout this Fall. 

“The new version of Siri will likely launch in spring 2025 with iOS 18.4 It will not be available when Apple Intelligence launches later this year Source: @markgurman 

Apple Intelligence and smarter Siri’s full iPhone rollout may arrive in the spring – The Verge

Other Agent News

“Meet Salesforce Einstein “Tiny Giant.” Our 1B parameter model xLAM-1B is now the best micro model for function calling, outperforming models 7x its size, including GPT-3.5 & Claude. On-device agentic AI is here. Congrats Salesforce Research! Paper: 

Anthropic News

Anthropic’s Claude adds a prompt playground to quickly improve your AI apps | TechCrunch

“[Chatbot Arena Update] We are excited to launch Math Arena and Instruction-Following (IF) Arena! Math/IF are the two key domains testing models’ logical skills & real-world tasks. Key findings: – Stats: 500K IF votes (35%), 180K Math votes (13%) – Claude 3.5 Sonnet is now #1 

Artificial General Intelligence (AGI) News

“Microsoft CTO Kevin Scott says despite what some people think, scale will keep making AI models better in every way and the next generation will be proof of that 

“Defeated by AI, Go legend Lee Sedol shares a bittersweet perspective in the @nytimes: 🤖🎭 “Losing to AI, in a sense, meant my entire world was collapsing.” Yet, he’s not a doomsayer: – AI may replace jobs, but also create new ones – Humans created both Go AND the AI that 

Audio News

“YouTube introduced a new AI-powered eraser tool. It allows creators to quickly remove copyrighted music from videos while preserving other audio elements. Between this and all the other AI tools YouTube is building, it seems like they’re all-in. 

“Good news creators: our updated Erase Song tool helps you easily remove copyright-claimed music from your video (while leaving the rest of your audio intact). Learn more… 

Whisper

“Introducing Whisper Timestamped: Multilingual speech recognition with word-level timestamps, running 100% locally in your browser thanks to 🤗 Transformers.js! This unlocks a world of possibilities for in-browser video editing! 🤯 What will you build? 😍 Demo (+ source code) 👇 

“Game-changer alert: Navigate your video by clicking transcribed words with Whisper Timestamped! 🚀 Key features: – Multilingual transcription (35+ languages) – Click any word to jump to that moment in the video – Works with audio & video files – 100% browser-based for total 

Augmented and Virtual Reality (AR/VR) News

Other AR/VR News

Researchers leverage shadows to model 3D scenes, including objects blocked from view | MIT News | Massachusetts Institute of Technology

Autonomous Vehicles News

Robotaxis should be a wakeup call for cities

OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving

Google AI News

“Google DeepMind researchers published new research introducing JEST. It’s a new method that accelerates AI model training while significantly reducing computing requirements. Faster training capabilities = the acceleration of advanced model releases is just getting started 

New AI Training Technique Is Drastically Faster, Says Google – Decrypt

“YouTube introduced a new AI-powered eraser tool. It allows creators to quickly remove copyrighted music from videos while preserving other audio elements. Between this and all the other AI tools YouTube is building, it seems like they’re all-in. 

“Good news creators: our updated Erase Song tool helps you easily remove copyright-claimed music from your video (while leaving the rest of your audio intact). Learn more… 

Imagery News

Other Image News

ConceptExpress: Unsupervised Concept Extraction (UCE): We focus on the unsupervised problem of extracting multiple concepts from a single image. Given an image that contains multiple concepts, we aim to harness a frozen pretrained diffusion model to automatically learn the conceptual tokens. Using the learned conceptual tokens, we can regenerate the extracted concepts with high quality.

“WE HAVE LISTENED TO YOU. 🔥 Magnific 🪄 plugin for Photoshop 🔥 We are launching one of the most requested features by professionals: the ability to use Magnific from within Photoshop! LET’S GO! Step by step tutorial 🧵👇 

Multimodality News

Segmentation

ConceptExpress: Unsupervised Concept Extraction (UCE): We focus on the unsupervised problem of extracting multiple concepts from a single image. Given an image that contains multiple concepts, we aim to harness a frozen pretrained diffusion model to automatically learn the conceptual tokens. Using the learned conceptual tokens, we can regenerate the extracted concepts with high quality.

Researchers leverage shadows to model 3D scenes, including objects blocked from view | MIT News | Massachusetts Institute of Technology

“Cool demo by @mervenoyann for real-time object tracking with RT-DETR 

OpenAI News

Exclusive: OpenAI working on new reasoning technology under code name ‘Strawberry’ | Reuters

“OpenAI is blocking API access to China starting today. Interestingly Microsoft is not following suit with Azure. MS said ‘OpenAI, being an independent company, makes its own decisions’. We are about to see a big release from OAI, this is an indication that the hour approaches.” / X

“OpenAI’s sudden China block: No more tools/services for Chinese users from July 9th 🚫🇨🇳 Reason? “OpenAI has not elaborated about the reason for its sudden decision.” Why it matters. “One consequence of OpenAI’s decision may be that it accelerates the development of Chinese AI” / X

Chinese developers scramble as OpenAI blocks access in China | China | The Guardian

A Hacker Stole OpenAI Secrets, Raising Fears That China Could, Too – The New York Times

OpenAI partners with Los Alamos to study AI in lab

OpenAI and Los Alamos National Laboratory announce research partnership | OpenAI

“OpenAI and Los Alamos National Laboratory have formed a partnership to study model bioscience capabilities. Los Alamos has also established AIRTAG the AI Risks and Threat Assessments Group. GPT-4o, with Voice Mode enabled, will be used in lab to assist bioresearch. 

Open Source AI News

Meta/Llama

Meta to reportedly launch largest Llama 3 model on July 23 – Breaking The News

Meta Platforms To Release Largest Llama 3 Model on July 23 — The Information

Publishing News

“The @WashingtonPost just launched an AI-powered climate Q&A tool! 🌍🤖 Key points: – Answers climate questions using WaPo’s published reporting – RAG (Retrieval-Augmented Generation) app based on their coverage – Transparent disclaimers about the experimental nature – Asks users 

Robotics and Embodiment News

“The first human patient to receive the Neuralink implant, @ModdedQuad, mentioned that he might be able to control an Optimus robot using Neuralink during the study, hopefully within the next few months or a year. 

“From today’s Neuralink stream: ⦿ Elon: People who lost speech will be able to communicate with Optimus via Neuralink. ⦿ Elon: An exciting long-term possibility is attaching Optimus limbs to amputees for direct control via brain signals. “You’d have, basically, cybernetic 

“Elon Musk and Neuralink went live and shared the insane potential of Neuralink <> Tesla Optimus 1. People who can’t speak can communicate with Optimus 2. People who’ve lost limbs could possibly use a robotic Optimus limb and control it with their brains 

Skild AI Raises $300M Series A To Build A Scalable AI Foundation Model For Robotics

Figure

“We released new footage of Figure-01 working on a use case for BMW Group’s Spartanburg Plant Our robots are working fully autonomously using an AI-driven vision model If you want to learn more, here’s a detailed thread I wrote: 

Google

“Next, we provided more multimodal instructions, such as: 📝 Map sketches on a whiteboard 🗣️ Audio requests referencing places from the tour 🎲 Visual cues, like a box of toys. With these acting as inputs, the robot could carry out various actions for different people. 

“In tests, Google DeepMinds robots responded to multimodal instructions, including map sketches, audio requests, and visual cues like a box of toys. The system also allows for natural language commands like “take me somewhere to draw things” 

Science and Medicine News

OpenAI Startup Fund & Arianna Huffington’s Thrive Global Create New Company, Thrive AI Health, To Launch Hyper-Personalized AI Health Coach

“The OpenAI Startup Fund and Thrive just announced a new venture developing a hyper-personalized AI-powered health coach The coach will be trained in scientific research, biometric data, and individual preferences to offer tailored recommendations 

Synchron unveils OpenAI-powered BCI chat feature

“Synchron CEO Tom Oxley says their brain-computer interface uses OpenAI’s GPT-4o to generate prompts from multimodal inputs that users can choose from to express their intentions 

“The first human patient to receive the Neuralink implant, @ModdedQuad, mentioned that he might be able to control an Optimus robot using Neuralink during the study, hopefully within the next few months or a year. 

“From today’s Neuralink stream: ⦿ Elon: People who lost speech will be able to communicate with Optimus via Neuralink. ⦿ Elon: An exciting long-term possibility is attaching Optimus limbs to amputees for direct control via brain signals. “You’d have, basically, cybernetic 

“Elon Musk and Neuralink went live and shared the insane potential of Neuralink <> Tesla Optimus 1. People who can’t speak can communicate with Optimus 2. People who’ve lost limbs could possibly use a robotic Optimus limb and control it with their brains 

Musk says next Neuralink brain implant expected in ‘next week or so’

Video News

The Rest of Video News

“Researchers in China recently released LivePortrait, a framework that can animate images from a video reference. The code is available freely on GitHub, but you can also now try it on Hugging Face 

Live Portrait – a Hugging Face Space by KwaiVGI

“@GerdeGotIt Amazing videos! Both the original and the AI generated one work great together👍 

“This is LivePortrait – it can animate an image from a video reference with incredible accuracy. More unreal demos and link to the project below 👇 

“AIで画像の顔の動きを完全にコントロールできるようになりました! 右は1枚の画像で、左の動画の動きに合わせて動いています。 使ったのは「LivePortrait」で、ComfyUIで動かすことができます! アニメも対応しているので、動画生成AIと組み合わせたらすごい作品が作れそう! 

“I’m excited to share @odysseyml, my new thing! 🍿 We’re building Hollywood-grade visual AI, where beautiful scenery, characters, lighting, and motion can be both generated and directed. Our mission is to deliver a better way to create movies, TV shows, and video games. 

The Rest: AI News of The Week

Don’t let the volume overwhelm you.  Have fun and skim these. The links are organized by topic, sorted from ‘coolest’ to ‘least cool’, and each topic is clearly defined with a headline.  I’ve added a description and glossary of what the topics mean, beneath each label, in plain language.  I do the work so you don’t have to!   When you visit the pages, note that the links and descriptions are often pulled directly from tweets or articles, so it’s not always my voice.  Pause when you see something that interests you.  Reach out to me any time. I enjoy sharing and discussing these items.

Agency/Agents/Copilots News of the Week: Agency is when AI can do things for you (like Googling an actress’s name or fetching the latest weather forecast). An agent is one step further, when AI is given autonomy to take action on your behalf (“Alexa, book a reservation for three at Peak in Hudson Yards for Friday night”). A co-pilot is an assistant (like spell check or autofill).
This week’s latest agent news: https://ethanbholland.com/2024/07/12/agents-and-copilots-ai-news-week-ending-07-12-2024/

Amazon News of The Week: Individual company products will often be placed in the categories they match (image, audio, agents, robots, etc). Occasionally, I’ll dedicate space to a company’s news if it’s broad or a major product release.
This week’s latest Amazon AI news: https://ethanbholland.com/2024/07/12/amazon-ai-news-week-ending-07-12-2024/

Anthropic News of the Week:
Anthropic is a company that builds LLMs like OpenAI, Mistral, Meta, etc. Their main AI brand is Claude. As with Amazon and Apple, individual Anthropic company posts will often be placed in the categories they match (image, audio, agents, robots, etc). Occasionally, I’ll dedicate space to a company’s news if it’s broad or a major product release.
This week’s Anthropic news: https://ethanbholland.com/2024/07/12/anthropic-news-week-ending-07-12-2024/

Apple News of the Week: As with Amazon, individual Apple company products will often be placed in the categories they match (image, audio, agents, robots, etc). Occasionally, I’ll dedicate space to a company’s news if it’s broad or a major product release.
This week’s latest Apple AI news: https://ethanbholland.com/2024/07/12/apple-ai-news-week-ending-07-12-2024/

Artificial General Intelligence (AGI) News of the Week: Artificial General Intelligence, in a nutshell, is when artificial intelligence is able to beat humans at everything (including embodying physical forms and completing physical tasks).  It’s usually a thought catalyst for predictions, like when AGI will occur. 10 years? 25 years? 100? AGI is an event horizon that is tough to define, tough to imagine, and tough to predict. OpenAI defined AGI in its charter as “highly autonomous systems that outperform humans at most economically valuable work”. OpenAI has a section of its website dedicated to AGI. Google’s DeepMind published my favorite report on the five levels of artificial intelligence on the way to AGI (see also here).
This week’s latest Artificial General Intelligence (AGI) news: https://ethanbholland.com/2024/07/12/artificial-general-intelligence-agi-news-week-ending-07-12-2024/

AI Audio News of the Week: In this case, AI audio can mean a few things. The first is “generative audio,” which refers to creating sounds with AI, much like ChatGPT writes words or Midjourney creates images. For example, asking for the “sound of waves crashing on the beach” would be text to sound. Another example would be an AI ‘watching’ a video and adding sound to it, like a foley artist would add footsteps or a creaking door to a movie scene. Lastly, AI audio can refer to microphones that only pick up certain speakers’ voices or headsets that cancel out all voices but your friends’.
This week’s latest AI audio news: https://ethanbholland.com/2024/07/12/audio-news-week-ending-07-12-2024/

Autonomous Vehicles/Driverless Cars News of the Week: Driverless car news doesn’t always get its own category, because it’s so close to robot embodiment. I go with my gut each week around what to place in each category. My recommendation would be to follow Robotics/Embodiment also, as the two fields are converging.
This week’s autonomous vehicle news: https://ethanbholland.com/2024/07/12/autonomous-vehicles-news-week-ending-07-12-2024/

Augmented and Virtual Reality (AR/VR) News of the Week: Augmented reality is when you see images or information on top of the real world.  A car windshield with a heads-up display of the speed. Or glasses that have facial recognition and overlay the names of everyone in view. Virtual reality is when you are transported into another place, usually wearing goggles, but a flight simulator could also be considered virtual reality.
This week’s latest AR/VR news: https://ethanbholland.com/2024/07/12/augmented-and-virtual-reality-ar-vr-news-week-ending-07-12-2024/

Business/Enterprise News of the Week: This broad category is for stories that impact corporations and large-scale AI implementation. Enterprise refers to a type of AI that is often custom built for a business or leverages an API to connect secure data to an AI model.
This week’s latest enterprise AI news: https://ethanbholland.com/2024/07/12/business-and-enterprise-ai-news-week-ending-07-12-2024/

Chips and Hardware AI News of the Week: Most chip news is usually about NVIDIA, yet more and more, Meta, Google, and OpenAI are moving toward their own manufacturing. I have to make the call whether to put Meta, Google, and OpenAI’s chip news under this section or their company sections. Lately, I’m putting each company’s chip news into the company category, rather than the chips category. This is the rest of the chips headlines.
This week’s latest chips and hardware news: https://ethanbholland.com/2024/07/12/chips-hardware-and-infrastructure-week-ending-07-12-2024/

Education AI News of the Week: There is a lot of buzz around the impact of AI in education. This section focuses both on the risks and rewards of how AI can impact learning. It’s broader than just K-12 and includes things like skills, trade, professional, and higher education. This is not about how to learn AI, it’s about AI’s impact on learning.
This week’s latest education news: https://ethanbholland.com/2024/07/12/education-ai-news-week-ending-07-12-2024/

Ethics/Legal/Security AI News of the Week: This section focuses on the impact AI is having on ethics (deep fakes, war, trust, false information, plagiarism, job loss, income), legal (rights, laws, regulations), and security (hacking, phishing, national interests, safety). For huge news stories like the NY Times suing OpenAI, I usually put them under the main section or give them their own page.
This week’s latest AI ethics/legal/security news: https://ethanbholland.com/2024/07/12/ethics-legal-security-ai-news-week-ending-07-12-2024/

Google AI News of the Week: Individual company products will often be placed in the categories they match (image, audio, agents, robots, etc). Occasionally, I’ll dedicate space to a company’s news if it’s broad or a major product release.
This week’s latest Google AI news: https://ethanbholland.com/2024/07/12/google-ai-news-week-ending-07-12-2024/

Imagery News of the Week: AI imagery covers “generative AI” image tools. This is usually text-to-image, where a user enters a prompt (“a polar bear walking through NYC”) and a tool like DALL-E or Midjourney generates an image in the likeness of the description. This is different than AI vision, where an AI “looks at” an image and can derive context, details, and contents. AI vision is a subset of AI called multimodality. Imagery, in this case, is for image creation and modification/editing. Adobe Photoshop’s AI tools would fall into this category. I’ll also include things like automatic masking and object removal, even though that’s in between imagery and vision… but practically speaking it fits into editing.
This week’s latest AI image news: https://ethanbholland.com/2024/07/12/imagery-news-week-ending-07-12-2024/

International AI News of the Week: A lot of international news will get cross-listed in the chips, security, or open-source categories; however, it’s nice to have a separate category for worldwide AI news.
This week’s latest international AI news: https://ethanbholland.com/2024/07/12/international-ai-news-week-ending-07-12-2024/

Locally Run AI Models News of the Week: This is a niche mostly for serious AI followers. It refers to AI that can be privately downloaded and run on a device without an internet connection. These have an array of powerful implications, from ethics of rogue users with untethered agents, to practical uses like Apple running a full AI on your phone, to corporate installations for security, to embodied robots with AI running in their virtual brain.
This week’s latest locally run AI news: https://ethanbholland.com/2024/07/12/locally-run-ai-models-news-week-ending-07-12-2024/

Meta AI News of the Week: This is a space dedicated to Meta-specific AI advancements and news stories.
This week’s Meta AI news: https://ethanbholland.com/2024/07/12/meta-ai-news-week-ending-07-12-2024/

Microsoft AI News of the Week: This is a space dedicated to Microsoft-specific AI advancements and news stories.
This week’s Microsoft AI news: https://ethanbholland.com/2024/07/12/microsoft-ai-news-week-ending-07-12-2024/

Mobile AI News of the Week: In April 2024, I added a dedicated category for mobile. Previously, I put most of the mobile news into either the company categories (Apple v. Google v. Microsoft) or locally run AI. It also ended up in the chips and hardware section, or the consumer products category. There is enough mobile news to at least start cross-linking it all in one place.
This week’s latest mobile AI news: https://ethanbholland.com/2024/07/12/mobile-news-week-ending-07-12-2024/

Multimodal AI News of the Week: This is a broad topic for a single AI model that demonstrates an ability to interact with more than one modality (imagery, video, audio, text). Often multimodal news will end up in one of these categories. I’m playing it by ear on a case-by-case basis. Please be patient with my organizational challenges.
This week’s multimodal AI news: https://ethanbholland.com/2024/07/12/multimodality-news-week-ending-07-12-2024/

OpenAI: OpenAI is the leading force in the AI boom of 2023 and now 2024. This section focuses on news that is specific to OpenAI. This section will compete with all of the other sections (imagery, vision, ethics, etc) because OpenAI is so broad. I won’t be able to consistently pick when to put things under OpenAI or other sections, so bear with me.
This week’s latest OpenAI news: https://ethanbholland.com/2024/08/15/openai-news-week-ending-07-12-2024/

Open Source Models: An open source AI model refers to a class of artificial intelligence models with public source code. They can be inspected, copied, installed, and customized on private computers. In contrast, a closed source model is proprietary and owned by a company that you pay to use (like PowerPoint or Photoshop). One of the most famous open source language models is a French model called Mistral. Its code is completely publicly available, and anyone can download it and customize it. On one hand, open source is a transparent and powerful way to democratize AI, but on the other hand, open source models circumvent the guard rails and copyright protections that private companies implement. Open source models are the wild west of artificial intelligence, but also the potential saving grace (depending on who you ask). It’s a bit like gun control debates but for computing power.
This week’s latest open source news: https://ethanbholland.com/2024/07/12/open-source-ai-news-week-ending-07-12-2024/

Podcast/YouTube Clips of the Week: This is for more general interviews and explainer videos and podcasts that provide access to leadership, demos of new products, and walkthroughs and tutorials. Videos focused on specific topics will live in the topic category (i.e. images), but broader videos will live here.
This week’s latest podcasts and YouTube clips: https://ethanbholland.com/2024/07/12/podcasts-youtube-op-eds-week-ending-07-12-2024/

Publishing AI News of the Week: These are stories about AI’s impact on the publishing industry. From copyright and crawling to the death of page views or even the end of browsers.
This week’s latest publishing AI news: https://ethanbholland.com/2024/07/12/publishing-news-week-ending-07-12-2024/

RAG (Retrieval-Augmented Generation) News of the Week: RAG allows a language model to “reference an authoritative knowledge base outside of its training data sources before generating a response” (via Amazon). Historically, RAG was prone to hallucinations, but new methods are improving its reliability. There is enough news about RAG that I want to start tracking it separately for my own use.
This week’s latest RAG (Retrieval-Augmented Generation) AI news: https://ethanbholland.com/2024/07/12/rag-retrieval-augmented-generation-news-week-ending-07-12-2024/
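
For readers who want to see the idea behind RAG in miniature, here is a rough Python sketch. It is a toy under stated assumptions, not how any real product works: the three-sentence knowledge base, the word-overlap scoring, and the function names (tokenize, retrieve, build_prompt) are all made up for illustration, and no actual language model is called. It only shows the "look something up first, then ask" pattern.

```python
import re

# Hypothetical knowledge base: a few facts the model was (in this toy) never trained on.
KNOWLEDGE_BASE = [
    "LivePortrait retargets facial movement from a video onto a still image.",
    "Mistral is an open source language model from France.",
    "AlphaFold predicts how proteins fold.",
]


def tokenize(text):
    """Lowercase the text and return its set of words (a crude stand-in for embeddings)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, top_k=1):
    """Return the top_k documents that share the most words with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, documents):
    """Put the retrieved passages in front of the question before it goes to a model."""
    context = "\n".join(f"- {passage}" for passage in retrieve(question, documents))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


question = "Which model predicts how proteins fold?"
prompt = build_prompt(question, KNOWLEDGE_BASE)
print(prompt)
```

Real systems typically swap the word-overlap step for a vector search over embeddings, but the shape is the same: retrieve first, then generate with the retrieved text in the prompt.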

Robotics/Embodiment News of the Week: This is the most intense area of AI. Embodiment refers to putting an AI inside of a machine. It’s “embodying” the object and therefore giving a robot agency in the real world. An example would be using a large language model as an interface to a complex coding task. Just as you ask “Alexa, play Bad Blood by Taylor Swift on Spotify” using plain language, with embodiment you could ask a robot to “Go to the laundry basket and bring me all of the red shirts”. The language model in the robot would translate your request into the proper code to go get the red shirts, even though the robot was never trained on that task. Another type of embodiment would be training a robot using virtual reality simulations. Using a simulation, a robot could be trained on thousands of scenarios until the simulation can be swapped out for the real world and the robot doesn’t “notice”. This section also includes factory automation and human prosthetics. There will be some overlap with other categories like autonomous vehicles. I first learned about embodiment from Alan Thompson. I highly recommend his video explainer: https://youtu.be/peLqYP9BAUg?si=2FzrvDlw-qaQFaCx.
This week’s latest robot and embodiment AI news: https://ethanbholland.com/2024/07/12/robotics-and-embodiment-news-week-ending-07-12-2024/

Science/Medicine AI News of the Week: AI’s strength is learning patterns. This applies nicely to medical diagnosis and identifying trends. When combined with data and AI vision, this means AI is good at looking at X-rays. Language models are helping with patient interface, and robotics and augmented reality are advancing surgery. Powerful enterprise models like Google’s AlphaFold can master protein folding. Other models can read ancient scrolls without opening them.
This week’s latest AI science and medicine news: https://ethanbholland.com/2024/07/12/science-and-medicine-news-week-ending-07-12-2024/

AI Video News of the Week: AI video in this case refers to generative video, much like imagery meant generative imagery. This is usually text-to-video, where a user enters a prompt (“a wizard walking out of a flaming building”) and a tool like Pika or Runway generates a video in the likeness of the description. It also covers animation of still images, where an image is given motion (like a photo of a waterfall appearing to have flowing water). As with images, this is different from AI vision, where an AI “looks at” an image or video and can derive context, details, and contents. Video, in this case, is video creation and modification/editing.
This week’s latest AI video news: https://ethanbholland.com/2024/07/12/video-news-week-ending-07-12-2024/

X/Twitter/Grok: Grok is one of several AIs developed by X, and it’s a bit blended in with Tesla and other Elon Musk technology. Not every week will have a Grok section, but like Meta, Google, Apple, and OpenAI, X will be in the news enough to have its own section.
This week’s latest X news: https://ethanbholland.com/2024/07/12/twitter-x-grok-week-ending-07-12-2024/

Technical and AI Developer News of the Week: Everything that is too technical for general consumption goes here. These are stories I think are important, but might be inaccessible and confusing. It’s also a space for developer news and deep dives into how AI works, under the hood.
This week’s technical and dev AI news: https://ethanbholland.com/2024/07/12/tech-papers-training-and-development-week-ending-07-12-2024/

Credits/Sources

a thankful robot extends a bouquet of flowers toward the camera --chaos 30 --ar 4:3 --style raw --personalize jczhn5o

Most of these weekly links come from just a few prolific oversharing sources. Please follow them, as they work hard to find the news each week and they make it a lot easier for me to compile.

For previous issues, please visit the archives!

Thanks for reading!
