Ethan B. Holland

Over 53,700 manually organized AI links and counting

Audio: AI News Week Ending 07/11/2025

July 11, 2025

Image created with OpenAI GPT-Image-1. Image prompt: mid‑1990s web‑browser screenshot, CRT glow, 256‑color dithering — Angelfire neon‑flame backdrop with scrolling marquee — embedded RealAudio player titled “AI Audio Demo” — crisp pixel edges, screen‑door scan‑lines, phosphor glow

“Control your browser by just talking in voice back and forth” / X https://x.com/AravSrinivas/status/1943003054397157764

“wdyt of this @googlechrome ? on comet: you can just close tabs on voice mode 🙂 looking forward to what you will ship soon” / X https://x.com/AravSrinivas/status/1943754539322290192

Someone using AI to impersonate Marco Rubio contacted at least five people including foreign ministers, cable says | CNN Politics https://www.cnn.com/2025/07/08/politics/marco-rubio-artificial-intelligence-impersonation

Video game actors’ strike officially ends after AI deal https://www.bbc.com/news/articles/c5ykx117keqo

“crazy that in 2025 i can converse in 1000 tokens/sec on my single GPU machine with AI that’s world-class at math and programming but i still have to type. i can’t speak to it, at least not in low-enough latency to carry a conversation. we don’t have this tech yet. why not?” / X https://x.com/jxmnop/status/1941995444730540050

From idea to deployed Voice Agent Dashboard! How I Built a Voice Agent Dashboard Just using these 4 tools • ChatGPT • UX Pilot • Lovable • Cursor https://x.com/harshsoni_hs/status/1932687619093119239

🚀 Claude 4 models now available through our LeMUR API Transform your audio into actionable insights with our industry-leading speech-to-text API — enhanced with Anthropic’s most advanced AI models. Why LeMUR + Claude 4: 🎯 Unified Solution – Speech-to-text + advanced https://x.com/AssemblyAI/status/1943347120456536282

Introducing NotebookLlama – an open-source version of NotebookLM! 📓🦙 NotebookLlama is a full implementation of NotebookLM that includes all the capabilities that makes it so great for researchers+business users: ✅ Create a knowledge repository of documents. Has likely higher https://x.com/jerryjliu0/status/1941546894532149519

🚨 Veo 3 now lets you generate audio + video starting from an image This one is cool – I started with JUST the first frame of the model and prompted the dialogue, the addition of a second character, and the action in the scene. Huge breakthrough for character consistency! https://x.com/venturetwins/status/1942371183644794987

Kyutai TTS and Unmute are now open source! The text-to-speech is natural, customizable, and fast: it can serve 32 users with a 350ms latency on a single L40S. Try it out and get started on the project page: https://x.com/kyutai_labs/status/1940767331921416302