Ethan B. Holland

Over 54,400 manually organized AI links and counting

an abstract name tag that reads "Imagery" --ar 5:3 --style raw

Imagery News: Week Ending 05/17/2024

May 17, 2024

I decided to try a theme with this week’s cover imagery to see how creative Midjourney could be with simple prompts. Each category cover image is a name tag + art style. It was pretty neat to see the variations. The goal is not perfection. By posting the mistakes, we’ll get to see how imagery improves over time. Here is the prompt for the cover:

an abstract name tag that reads “Imagery” --ar 5:3 --style raw
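If you want to try the same theme yourself, here is a minimal sketch of how the "name tag + art style" prompts could be assembled. The function name and the extra art styles are my own placeholders for illustration, not the actual styles used for the other covers; only the "abstract" / "Imagery" pairing comes from the prompt above.

```python
def name_tag_prompt(category: str, style: str = "abstract") -> str:
    """Build a cover prompt in the 'name tag + art style' pattern."""
    # Pick "a" or "an" so the prompt reads naturally for any style word.
    article = "an" if style[0].lower() in "aeiou" else "a"
    # --ar sets the aspect ratio; --style raw asks for less default stylization.
    return f'{article} {style} name tag that reads "{category}" --ar 5:3 --style raw'


# Hypothetical category/style pairs, except the first, which matches the cover.
covers = [("Imagery", "abstract"), ("Robotics", "watercolor"), ("Video", "art deco")]
for category, style in covers:
    print(name_tag_prompt(category, style))
```

Paste each printed line into Midjourney as-is; swapping the style word is the only thing that changes from cover to cover.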

Google (Imagen)

We’re introducing Imagen 3: our highest quality text-to-image generation model yet. 🎨

It produces visuals with incredible detail, realistic lighting and fewer distracting artifacts.

From quick sketches to very high-res imagery, here’s a look at what it can create. 👀 #GoogleIO pic.twitter.com/XMrQYGeSiO
— Google DeepMind (@GoogleDeepMind) May 14, 2024

Today we’re introducing Imagen 3, @GoogleDeepMind’s most capable image generation model yet. It understands prompts the way people write, creates more photorealistic images and is our best model for rendering text. #GoogleIO pic.twitter.com/6bjidsz6pJ
— Google (@Google) May 14, 2024

Krea

Midjourney

We're now testing 'private creation rooms' on our website! For all MJ members with >100 images, just click 'rooms' and then 'create room'. This lets you make private spaces to create and explore with friends or coworkers. Have fun and let us know what you think!
— Midjourney (@midjourney) May 17, 2024

Here are 8 style reference codes I've mined from midjourney that I think are cool

No examples, just try them out against one of your favorite prompts & see what happens

If you find something cool, feel free to share in replies

{your prompt} --sref {code}

--sref 3400018089…
— Nick St. Pierre (@nickfloats) May 16, 2024

Here is an example of one of the actual prompts used in @paultrillo’s Sora Music video

It’s so remarkably long that it literally won’t fit in a single post, so I’ll split it into a thread:

💬 continuous shot moving forward zooming through time, with a view of 1980s highschool…
— Nick St. Pierre (@nickfloats) May 17, 2024

i try not to think about competitors too much, but i cannot stop thinking about the aesthetic difference between midjourney and dalle pic.twitter.com/zwCPnCsqrY
— Nick St. Pierre (@nickfloats) May 16, 2024

OpenAI

GPT-4o is a huge step forward for image generation. Not only is it amazing at rendering text and following captions, it also provides a very natural way to iteratively edit and compose visual concepts. 1/8 pic.twitter.com/mSgynXj1XG
— Alex Nichol (@unixpickle) May 13, 2024

Examples of GPT-4o image abilities:

One use case I'm excited about is telling a story with images. In this example, we use the model to create a character and then immerse her in a visually-consistent, fictional world. 2/8 pic.twitter.com/gP7uMWUkoL
— Alex Nichol (@unixpickle) May 13, 2024

Speaking of consistent characters, how about becoming movie stars? Here, the model is able to depict me and @gabeeegoooh as detectives in a stunning movie poster. Note how our names and the movie title are rendered properly! 3/8 pic.twitter.com/epVA0M5joy
— Alex Nichol (@unixpickle) May 13, 2024

The model can also compose ideas across images, e.g. here it is able to add the OpenAI logo to a photo of a coaster. 4/8 pic.twitter.com/Ad92Apmy8m
— Alex Nichol (@unixpickle) May 13, 2024

A neat thing about this model is that it can produce multiple consistent views of a 3D object, allowing us to reconstruct 3D models of complex shapes. 5/8 pic.twitter.com/C4cdlgWBW3
— Alex Nichol (@unixpickle) May 13, 2024

By generating multiple images and context, and leveraging the model's amazing text rendering capabilities, we can do neat things like create custom fonts. 6/8 pic.twitter.com/jsyEPgTy1t
— Alex Nichol (@unixpickle) May 13, 2024

In this example, we can see just how well the model does at rendering a complex image. It uses two separate chat bubbles for the messages, renders a ton of text correctly all at once, and almost perfectly depicts a QWERTY keyboard. 7/8 pic.twitter.com/Y3h6O1xPw6
— Alex Nichol (@unixpickle) May 13, 2024

Heads up! You’ve scrolled to the end of this category. There may have been just one or two links (above), so go back up and double check to be sure you didn’t quickly scroll down past it.

Be Sure To Read This Week’s Main Post:

This week’s executive overview and top links are here:

AI News #33: Week Ending 05/17/2024 with Executive Summary and Top 58 Links

The post you just read is a deep-dive extension of my weekly newsletter, This Week In AI, an executive summary of the top things to know in AI. Each week, I create an accessible overview so laypeople can feel confident they are conversant with the week’s AI developments. I include a curated list of the week’s must-click links to offer everyone a hands-on opportunity to explore the most intriguing updates in artificial intelligence across various categories, including robotics, imagery, video, AR/VR, science, ethics, and more. Beyond the overview, I post these topic-based deeper dives (below). If you haven’t read this week’s overview, I recommend starting there.

Credits/Sources

Most of these weekly links come from just a few prolific oversharing sources. Please follow them, as they work hard to find the news each week and they make it a lot easier for me to compile.

Robert Scoble: https://x.com/Scobleizer
Ethan Mollick: https://www.linkedin.com/in/emollick/
Alan Thompson: https://lifearchitect.ai/
Theoretically Media: https://www.youtube.com/@TheoreticallyMedia
The Rundown: https://www.therundown.ai/
Bilawal Sidhu: https://twitter.com/bilawalsidhu/
TLDR: https://tldr.tech/ai
Jeremiah Owyang: https://twitter.com/jowyang
Nick St. Pierre: https://twitter.com/nickfloats
Dr. Jim Fan: https://twitter.com/DrJimFan
All About AI: https://www.youtube.com/@AllAboutAI
Marshall Kirkpatrick: https://aitimetoimpact.com/
AI News (Smol Talk): https://buttondown.email/ainews/archive/

For previous issues, please visit the archives!

Thanks for reading!
