“Genius. A Home Assistant user hooked up GPT-4 Vision with their security cameras and now can do things like find items in their home.
“Let’s start closing the gap between commercial vision models and open source! The PixelProse dataset contains 16M images labeled with high quality *dense* captions that are specifically designed to be refactored into instructions, question-answer pairs, etc., using an LLM.
“VoCo-LLaMA: Towards Vision Compression with Large Language Models. Vision-Language Models (VLMs) have achieved remarkable success in various multi-modal tasks, but they are often bottlenecked by the limited context window and high computational cost of processing
“I have access to the 2M token version of Gemini 1.5. I think multimodal video is going to have some big effects on management, training & coaching. I gave Gemini an 85 minute video of a meeting. It was able to identify what happened & how to improve it. Not perfect yet, but nice

Be Sure To Read This Week’s Main Post:
This week’s executive overview and top links are here:
AI News #38: Week Ending 06/21/2024 with Executive Summary and Top 91 Links
The post you just read is a deep-dive extension of my weekly newsletter, This Week In AI, an executive summary of the top things to know in AI. Each week, I write an accessible overview so that laypeople can feel confident they are conversant with the week’s AI developments. I include a curated list of the week’s must-click links, offering everyone a hands-on opportunity to explore the most intriguing updates in artificial intelligence across categories including robotics, imagery, video, AR/VR, science, ethics, and more. Beyond the overview, I post these topic-based deeper dives (below). If you haven’t read this week’s overview, I recommend starting there.
- Agents/Copilots
- Amazon
- Apple
- Artificial General Intelligence (AGI)
- Augmented and Virtual Reality (AR/VR)
- Autonomous Vehicles
- AI Audio
- Business and Enterprise AI
- Chips and Hardware
- Consumer Products
- Education
- Ethics/Legal Security
- Images/Photos
- International AI News
- Locally Run AI Models
- Mobile
- Meta
- Microsoft
- OpenAI
- Open Source
- Podcasts/YouTube
- Publishing and News
- Retrieval-Augmented Generation (RAG) News
- Robots and Embodiment
- Safe Intelligence, Inc.
- Science and Medicine
- Video
- Vision/Multimodality
- X/Twitter/Grok
- Tech and Development
Credits/Sources

Most of these weekly links come from just a few prolific oversharing sources. Please follow them, as they work hard to find the news each week and they make it a lot easier for me to compile.
- Robert Scoble: https://x.com/Scobleizer
- Ethan Mollick: https://www.linkedin.com/in/emollick/
- Alan Thompson: https://lifearchitect.ai/
- Theoretically Media: https://www.youtube.com/@TheoreticallyMedia
- The Rundown: https://www.therundown.ai/
- Bilawal Sidhu: https://twitter.com/bilawalsidhu/
- TLDR: https://tldr.tech/ai
- Jeremiah Owyang: https://twitter.com/jowyang
- Nick St. Pierre: https://twitter.com/nickfloats
- Dr. Jim Fan: https://twitter.com/DrJimFan
- All About AI: https://www.youtube.com/@AllAboutAI
- Marshall Kirkpatrick: https://aitimetoimpact.com/
- AI News (Smol Talk): https://buttondown.email/ainews/archive/
- Andrej Karpathy: https://x.com/karpathy
- Brett Adcock: https://x.com/adcock_brett
- Florent Daudens: https://x.com/fdaudens
- Ate-a-Pi: https://x.com/8teAPi
- Francesco Marconi: https://x.com/fpmarconi
- Charlie Beckett: https://x.com/CharlieBeckett
For previous issues, please visit the archives!

Thanks for reading!




