A fashion photoshoot of a runway look inspired by theater. A large screen displays the word “Multimodal” --ar 4:3 --style raw
Twelve Labs Earns $50 Million Series A Co-led by NEA and NVIDIA’s NVentures to Build the Future of Multimodal AI
“ShareGPT4Video Improving Video Understanding and Generation with Better Captions We present the ShareGPT4Video series, aiming to facilitate the video understanding of large video-language models (LVLMs) and the video generation of text-to-video models (T2VMs)
“The No Language Left Behind paper (NLLB) just appeared in Nature. High-quality translation between 200 languages in any direction, with sparse training data, and many low-resource languages.”
“Newly published today in @Nature: No Language Left Behind (NLLB) is an AI model created by researchers at Meta capable of delivering high-quality translations directly between 200 languages – including low-resource languages. Read more in Nature ⬇️
AI is cracking a hard problem – giving computers a sense of smell
“Amazon unveiled Project P.I., an AI system that scans products in fulfillment centers to detect damaged or incorrect items before they ship. Amazon also utilizes a multimodal LLM to investigate issues further, combining customer feedback with Project P.I. images.
“Today we’re announcing LiveKit’s $22.5M Series A to build the transport layer for AI. This wasn’t an easy fundraise. Late last year, we pitched investors that realtime voice and video would become THE way we interact with computers. A few didn’t agree; most said it was at least”
Amazon: AI spots product defects, reduces waste
Amazon’s Project PI AI looks for product defects before they ship – The Verge
“The future of AI glasses is normal looking, light weight and affordable – meet Frame, AI Glasses by @brilliantlabsAR It is shipping to hackers and creators already. Frame is open-source platform with mic, camera, AR display. It leverages your phone (connectivity & audio) and
Using AI to decode dog vocalizations | University of Michigan News
“In a new paper by Nature, researchers from the University of Zurich developed a more effective vision model that keeps autonomous cars from crashing. It works by combining a 5,000 FPS event camera with a 20-FPS RGB camera.
“Announcing Dragonfly, a set of vision-language models leveraging multi-resolution encoding & zoom-in patch selection to unlock fine-grained visual understanding.
Dragonfly: A large vision-language model with multi-resolution zoom
“🌟Introducing “🤖SpatialRGPT: Grounded Spatial Reasoning in Vision Language Model”
“Video-MME The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis In the quest for artificial general intelligence, Multi-modal Large Language Models (MLLMs) have emerged as a focal point in recent advancements. However, the predominant focus
OpenAI
“My AI smart speaker using @OpenAI – now with vision! #GPT4 🔊 📷 Code:
“🚨 ChatGPT adds “Background Conversations” in its latest update. It allows you to keep the conversation going even if you are using other apps or your screen is off. GPT-4o new voice feature might be coming soon!
“This is great – Gemini 1.5 Pro and Flash outperform GPT-4o in Multi-modal LLMs in Video Analysis. Especially given the pricing of Gemini 1.5 Flash. Paper – ‘Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis’
Phi
“Introducing Phi-3 WebGPU, a private and powerful AI chatbot that runs locally in your browser, powered by 🤗 Transformers.js and onnxruntime-web! 🔒 On-device inference: no data sent to a server ⚡️ WebGPU-accelerated (> 20 t/s) 📥 Model downloaded once and cached Try it out! 👇
“Phi-3 Medium (14B) and Small (7B) models are on the @lmsysorg leaderboard! 😍 Medium ranks near GPT-3.5-Turbo-0613, but behind Llama 3 8B. Phi-3 Small is close to Llama-2-70B, and Mistral fine-tunes. This proves that we cannot purely optimize for academic benchmarks. We need to
This week’s executive overview and top links are here:
AI News #36: Week Ending 06/07/2024 with Executive Summary and Top 40 Links
The post you just read is a deep-dive extension of my weekly newsletter, This Week In AI, an executive summary of the top things to know in AI. Each week, I create an accessible overview so that laypeople can feel confident they are conversant with the week’s AI developments. I include a curated list of the week’s must-click links, giving everyone a hands-on opportunity to explore the most intriguing updates in artificial intelligence across various categories, including robotics, imagery, video, AR/VR, science, ethics, and more. Beyond the overview, I post these topic-based deeper dives (below). If you haven’t read this week’s overview, I recommend starting there.
- Agents/Copilots
- Amazon
- Apple
- Artificial General Intelligence (AGI)
- Augmented and Virtual Reality (AR/VR)
- Autonomous Vehicles
- AI Audio
- Business and Enterprise AI
- Chips and Hardware
- Consumer Products
- Education
- Ethics/Legal/Security
- Images/Photos
- International AI News
- Locally Run AI Models
- Mobile
- Meta
- Microsoft
- OpenAI
- Open Source
- Podcasts/YouTube
- Publishing and News
- Retrieval-Augmented Generation (RAG) News
- Robots and Embodiment
- Science and Medicine
- Video
- Vision/Multimodality
- X/Twitter/Grok
- Tech and Development
Credits/Sources

Most of these weekly links come from just a few prolific oversharing sources. Please follow them, as they work hard to find the news each week and they make it a lot easier for me to compile.
- Robert Scoble: https://x.com/Scobleizer
- Ethan Mollick: https://www.linkedin.com/in/emollick/
- Alan Thompson: https://lifearchitect.ai/
- Theoretically Media: https://www.youtube.com/@TheoreticallyMedia
- The Rundown: https://www.therundown.ai/
- Bilawal Sidhu: https://twitter.com/bilawalsidhu/
- TLDR: https://tldr.tech/ai
- Jeremiah Owyang: https://twitter.com/jowyang
- Nick St. Pierre: https://twitter.com/nickfloats
- Dr. Jim Fan: https://twitter.com/DrJimFan
- All About AI: https://www.youtube.com/@AllAboutAI
- Marshall Kirkpatrick: https://aitimetoimpact.com/
- AI News (Smol Talk): https://buttondown.email/ainews/archive/
For previous issues, please visit the archives!

Thanks for reading!




