Ethan B. Holland

Over 50,200 manually organized AI links and counting

Audio: AI News Week Ending 09/05/2025

September 5, 2025

Image created with Flux Pro v1.1 Ultra. Image prompt: Audio, clean waveform trace built from small bananas increasing and decreasing in height, studio tabletop, photorealistic, editorial, minimal, high detail, 3:2 landscape

Duolingo is facing an existential crisis as Google Translate rolls out features to tutor users—and even handle live translation as a bonus | Fortune https://fortune.com/2025/08/27/duolingo-existential-crisis-ai-google-translate-language-learning-live-translation/

NotebookLM is rolling out new audio overview formats:
(Default) Deep Dive: a thorough examination of your sources
Brief: 1-2 minute, bite-sized overviews
Critique: an expert review, offering constructive feedback on your material
Debate: a thoughtful debate between two hosts https://x.com/NotebookLM/status/1962949985546187120

Free AI Sound Effect Generator | Add Sound Effects to Video & Audio | ElevenLabs https://elevenlabs.io/sound-effects

Introducing Lovable Voice Mode Turn your ideas into reality without touching your keyboard. https://x.com/lovable_dev/status/1963255845900484632

VibeVoice from @MicrosoftAI has been #1 trending on HF for the past few days! VibeVoice is a frontier open-source text-to-speech model designed for generating expressive, long-form, multi-speaker conversational audio, such as podcasts, from text. A core innovation of VibeVoice https://x.com/ClementDelangue/status/1963537036616323388

this repo is wild. 100+ production-ready AI Agents, RAG, Multi-Agent teams, Voice Agents, MCP, and LLM apps with step-by-step tutorials. 100% free and open source by @Saboo_Shubham_ 👏 link in next post 👀 https://x.com/MakerThrive/status/1962661273335742780

Apple’s rumored AI search tool for Siri could rely on Google | The Verge https://www.theverge.com/news/770712/apple-ai-search-tool-siri-google-gemini

Lipsync Studio https://higgsfield.ai/create/speech

AHELM: A Holistic Evaluation of Audio-Language Models “we introduce AHELM, a benchmark that aggregates various datasets, including 2 new synthetic audio-text datasets called PARADE, which evaluates the ALMs on avoiding stereotypes, and CoRe-Bench, which measures reasoning over https://x.com/iScienceLuvr/status/1962799344001917360

In less than a day, @StepFun_ai dropped Step-Audio 2 Mini – 8B speech to speech, beats GPT-4o-Audio, Apache 2.0 licensed 🔥
Trained on 8M+ hours, supports 50K+ voices, benchmarks for expressive/grounded speech 🤯
Expressive and emotionally aware generation
Retrieves and https://x.com/reach_vb/status/1961414067668558319

airline customer service AI demo with agent handoffs https://x.com/tom_doerr/status/1962972766174339271

Excited to share our first @MicrosoftAI in-house models: MAI-Voice-1 and MAI-1-preview. Details and how you can test below, with lots more to come⬇️ https://x.com/mustafasuleyman/status/1961111770422186452

Step-Audio 2 is an end-to-end multi-modal large language model designed for industry-strength audio understanding and speech conversation. Check out the model here: https://x.com/reach_vb/status/1961414145938485477

Step-Audio 2 is an end-to-end multi-modal large language model designed for industry-strength audio understanding and speech conversation. Play with the demo here: https://x.com/reach_vb/status/1961471503267979699

Step-Audio 2 is an end-to-end multi-modal large language model designed for industry-strength audio understanding and speech conversation. Try it here: https://x.com/_akhaliq/status/1962644559868883310

Introducing gpt-realtime — our best speech-to-speech model for developers, and updates to the Realtime API https://x.com/OpenAI/status/1961110295486808394

Last week we released @OpenAIDevs gpt-realtime our latest speech-to-speech model. Prompting S2S models can be extremely powerful. To get the most out of the model we compiled a series of prompting tips of what we saw work with early customers. To present them here’s Cedar our https://x.com/dkundel/status/1962916750632353826

abs: https://x.com/iScienceLuvr/status/1962800402409365590

code: https://x.com/iScienceLuvr/status/1962798182964113547

website: https://x.com/iScienceLuvr/status/1962799346292007272