Ethan B. Holland

Over 56,600 manually organized AI links and counting

Audio: AI News Week Ending 04/03/2026

April 3, 2026

Image created with gemini-3.1-flash-image-preview with claude-sonnet-4-5. Image prompt: Using the provided reference image, preserve the exact square faceted perfume bottle with amber-gold liquid, crystal stopper, pure white background, soft shadow, and clean white label typography, but replace the label text with ‘Audio’ in matching black serif font. Add a delicate sterling silver chain draped naturally around the bottle neck with a small dainty pendant depicting a miniature sound waveform oscillation in polished silver jewelry style, keeping the pendant tiny and refined like high-fashion charm jewelry.

33 hours of audio transcribed in 12 minutes! @CohereLabs just released Cohere Transcribe – 2B open-source ASR. 66 eps of 1940s CBS Suspense from @internetarchive on A100 via @huggingface Jobs + Buckets mount: 161x realtime! Script + all transcripts are public.
https://x.com/vanstriendaniel/status/2037548103272632497
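
A quick check of the speed claim above, as a minimal sketch. It uses only the rounded figures from the post (33 hours of audio, 12 minutes of wall-clock time); the function name is hypothetical. The rounded inputs give about 165x, so the quoted 161x presumably comes from the unrounded durations, which the post does not state.

```python
def realtime_factor(audio_hours: float, wall_minutes: float) -> float:
    """Return how many times faster than real time a transcription job ran."""
    audio_minutes = audio_hours * 60
    return audio_minutes / wall_minutes


# Rounded figures from the post: 33 hours of audio in 12 minutes.
factor = realtime_factor(33, 12)
print(f"{factor:.0f}x realtime")  # prints "165x realtime"
```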

Very hyped by the new Cohere Transcribe model 🌍 Works surprisingly well on bad quality audio when the mic doesn’t cooperate. 2B params, 14 supported languages and it’s Apache 2.0. Try the official Hugging Face demo ⬇️
https://x.com/victormustar/status/2037572662659104976

MAI-Transcribe-1 is available at $6 per 1000 minutes of audio via Microsoft Foundry.
https://x.com/ArtificialAnlys/status/2039862709744021938
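
To put the quoted price in more familiar units, here is a small sketch that converts $6 per 1000 minutes into a per-hour cost. The only input taken from the post is the price; the function and constant names are hypothetical, and it assumes simple linear pricing with no minimums or volume tiers.

```python
PRICE_PER_1000_MIN_USD = 6.0  # price quoted in the post


def transcription_cost_usd(audio_minutes: float,
                           price_per_1000_min: float = PRICE_PER_1000_MIN_USD) -> float:
    """Linear cost estimate; assumes no minimum charge or volume discount."""
    return audio_minutes / 1000 * price_per_1000_min


per_hour = transcription_cost_usd(60)
print(f"${per_hour:.2f} per hour of audio")  # prints "$0.36 per hour of audio"
```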

Microsoft has released MAI-Transcribe-1: a speech transcription model achieving 3.0% on AA-WER (#4), and is fast at 69x real-time. The model was developed by Microsoft AI (MAI)’s Superintelligence team and supports 25 languages including English, French, Arabic, Japanese, and
https://x.com/ArtificialAnlys/status/2039862705096659050

State of the Art Speech Recognition with MAI-Transcribe-1 | Microsoft AI
https://microsoft.ai/news/state-of-the-art-speech-recognition-with-mai-transcribe-1/?form=M301FW&OCID=CGE_osocial_Copilot_Free_868j3yepz

Today we’re announcing 3 new world class MAI models, available in Foundry | Microsoft AI
https://microsoft.ai/news/today-were-announcing-3-new-world-class-mai-models-available-in-foundry/

Transform your headphones into a live personal translator on iOS.
https://blog.google/products-and-platforms/products/translate/live-translate-with-headphones/

I would say that Suno is generally a better music generator at this point (though not for all songs), but Lyria is the first music creator available through an API, and seems to not have the same copyright & training issues that plague Suno and restrict how it can be used.
https://x.com/emollick/status/2036962853861662917

meet the music playground, with Lyria 3. construct the perfect prompt with composer mode: describe it, hear it, then export to code and build.
https://x.com/GoogleAIStudio/status/2039055128276148454

Almost signed up for ElevenLabs to narrate my blog. $330/month. Then I tried running an open-source model on my own laptop. Qwen 3.5 14B. Sounds fine. 200 posts a month. Costs me electricity. I almost paid $4,000 a year to rent a model I can run myself. Most AI subscriptions
https://x.com/TheGeorgePu/status/2037473248577782046
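
The arithmetic behind the post above, as a rough sketch. The $330 monthly price and the 200 posts a month come from the post; applying the 200-post volume to the subscription to get a per-post figure is an assumption made here for illustration, and the local electricity cost is left out because the post gives no number for it.

```python
MONTHLY_PRICE_USD = 330   # subscription price quoted in the post
POSTS_PER_MONTH = 200     # volume quoted in the post (assumed to apply to the subscription too)

annual = MONTHLY_PRICE_USD * 12
per_post = MONTHLY_PRICE_USD / POSTS_PER_MONTH

# The post rounds the annual figure up to $4,000.
print(f"${annual:,} per year, ${per_post:.2f} per post")  # prints "$3,960 per year, $1.65 per post"
```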
