Multimodality

Have you checked out the cool Whisper-WebUI yet? 

1. Generate subtitles from files, YouTube, or a microphone

2. Output subtitle formats: SRT, WebVTT, etc.

3. Use Whisper’s end-to-end speech-to-text translation from other languages to English

4. Translate subtitle files text-to-text using Meta’s NLLB or the DeepL API

AI Video Mapping Emotions to Human Expression

Teaching AI to have emotional intelligence, and the challenges of objectively measuring human emotions across cultural contexts.

100% Local Tiny AI Vision Language Model (1.6B) – Very Impressive!!

Prompt Highlighter: Interactive Control for Multi-Modal LLMs

OMG-Seg: Is One Model Good Enough For All Segmentation? https://lxtgh.github.io/project/omg_seg/

Be Sure To Read “This Week In AI”

This week’s executive overview and top links are here:

AI News #17: Week Ending 01/26/2024 with Executive Summary and Top 12 Stories

The post you just read is an extension of my weekly newsletter, This Week In AI, an executive summary of the top things to know in AI. Each week, I create an accessible overview for laypeople to feel confident they are conversant with the week’s AI developments. I include a curated list of must-click links of the week, to offer everyone a hands-on opportunity to explore the most intriguing updates in artificial intelligence across various categories, including robotics, imagery, video, AR/VR, science, ethics, and more. Beyond the overview, I post these topic-based deeper dives (below). If you haven’t read this week’s overview, I recommend starting there.

Credits/Sources

Most of these links come from just a few incredible sources.  Please follow them:


Discover more from Ethan B. Holland
