The first of what might be a weekly post – a bare-bones executive summary of everything I think is worth knowing in AI/tech from the week. Since this is the initial edition, it goes back a few weeks. The format is Kottke.org meets del.icio.us (I was the top user back in the 90s).

Each week, I’ll create an image to represent the themes of the week and attempt to demonstrate the state of generative imagery over time. The first cover image is “five of me in the style of Newsies,” made with Midjourney and InsightFaceSwap (which only supports four faces, so I ran it twice).

I’ve sorted everything with the biggest news first, so the “good stuff” is up top. If you think I’ve missed anything, please share in the comments.

For brevity, I often take the text directly from the tweet or headline, so the copy is a mix of my writing and quotes from others. Attribution is via the link that goes with each item.

Ethan Holland Presentation on AI to the National Association of Broadcasters – Sept 20, 2023

Google Sheet with Links from Presentation:
https://docs.google.com/spreadsheets/d/1spNBm8m_AaNc7-j0P2QP61qxTLpoZ-/edit#gid=226988230

AI Translation with Voice Cloning – HeyGen

The night before my presentation to the NAB, I sat in my hotel room in DC and made this video in English. HeyGen transcribed it, translated it, and cloned my voice in German (video 1), French (video 2), and Spanish (video 3). It generated the new audio and then matched it to my speech and face.

Here’s a test using Draper Media’s Delmarva Home Show promotional video (original was in English, translated to Spanish). HeyGen did pretty well!
https://app.heygen.com/video-translate/share/0f59b5becff2450ab27c472f49a1b064

ChatGPT Agency and Multimodality

“ChatGPT can now browse the internet to provide you with current and authoritative information, complete with direct links to sources. It is no longer limited to data before September 2021”

“ChatGPT can now see, hear, and speak – We are beginning to roll out new voice and image capabilities in ChatGPT. They offer a new, more intuitive type of interface by allowing you to have a voice conversation or show ChatGPT what you’re talking about.”
https://openai.com/blog/chatgpt-can-now-see-hear-and-speak

“ChatGPT took handwritten notes by Hooke and an ancient Catalan drug manual about medicinal mummies and transcribed the handwriting with reasonable accuracy, and even translated from archaic Catalan to English”
https://www.linkedin.com/posts/emollick_the-humanities-are-about-to-change-in-a-major-activity-7112865362676736000-2FEc/

Creative Tools

“Generative fill in photoshop is officially out of beta. Adobe Firefly is now available for commercial use”
https://helpx.adobe.com/firefly/using/firefly-overview.html

“Canva’s new AI tools automate boring, labor-intensive design tasks”
https://www.theverge.com/2023/10/4/23902794/canva-magic-studio-ai-design-new-tools

AI Foley Work – Text to Sound Effects

Digital Video – Change the scene with your voice

Journalism AI 

Event: AP Solutions: Five free AI projects for your newsroom – Oct 26, 2023 
https://ap.zoom.us/webinar/register/WN_v1T54MtvRnO2OYmxpjOeIw#/registration

Facebook Announcements

Meta AI 
https://ai.meta.com/genai/

Language

“Spotify is going to clone podcasters’ voices — and translate them to other languages”
https://www.theverge.com/2023/9/25/23888009/spotify-podcast-translation-voice-replication-open-ai

Example – Lex Fridman in Spanish
https://www.linkedin.com/posts/lexfridman_this-is-me-speaking-spanish-thanks-to-amazing-ugcPost-7112129498757701632-RyiK/

Amazon Updates – Echo and Anthropic

“Amazon Devices Event 2023: Everything you need to know about Alexa, Echo, Fire TV and more”
https://techcrunch.com/2023/09/20/amazon-devices-event-2023-everything-you-need-to-know-about-alexa-echo-fire-tv-and-more/

“Today, we’re announcing that Amazon will invest up to $4 billion in Anthropic.”
https://www.anthropic.com/index/anthropic-amazon

Anthropic Claude on Amazon Bedrock: Build generative AI solutions with Anthropic’s state-of-the-art model Claude
https://aws.amazon.com/bedrock/claude/

Bard Extensions Integrate with Google Tools

Bard can now connect to your Google apps and services
https://blog.google/products/bard/google-bard-new-features-update-sept-2023/
https://bard.google.com/extensions

Robot Embodiment

“Let’s reverse engineer the phenomenal Tesla Optimus. No insider info, just my own analysis.”

“Your robot will know where everything is in your house. Office. Factory. Store. Warehouse. Library. Museum.”

“Tracking Anything with Decoupled Video Segmentation” – tracking piglets in a pen

“Autonomous driving with Chain of Thought – autopilot thinking out loud in text!”

Robot Dexterity
https://tonyzhaozh.github.io/aloha/

Security

“GPT-4 with vision is here. I got to play with it early.  Here’s an example of prompt injection via image that is non-obvious to the user.”

Misc tech news:

“News story or meme? After Elon Musk aXes headlines, it’s hard to tell
If you had to come up with a single move designed to deal a blow to whatever traffic is left and make sharing news more of a hassle, you couldn’t do much better than eliminating headlines from posts.”

Insane Immersive Studio LED Technology (must see)
