Image created with Ideogram V2. Image prompt: A vibrant spring meadow with exaggerated blooming flowers in bright colors. Hidden comically in the middle is a giant camera with multiple lenses pointing in all directions, decorated with flowers and twigs as disguise. Photo frames hang from tree branches displaying nature photos. Woodland animals pose for portraits while others work as assistant photographers. Photo editing tools float in midair. A printer spits out finished images onto the grass. The whole scene is bathed in golden sunshine with lens flares. Vibrant colors and high detail. The word “IMAGES” integrated into the scene.

Rowan Cheung on X: “The biggest change: o3 and o4-mini can now think using images as part of their reasoning process. Uploaded visuals can also be handled even if blurry or rotated — with the models able to adjust them using their own tools.”
https://x.com/rowancheung/status/1912561386208825751

“All of your image creations, all in one place. Introducing the new library for your ChatGPT image creations—rolling out now to all Free, Plus, and Pro users on mobile and …”
https://x.com/OpenAI/status/1912255254512722102

“‘Thinking with Images’ has been one of our core bets in Perception since the earliest o-series launch. We quietly shipped o1 vision as a glimpse—and now o3 and o4-mini bring it to life with real polish. Huge shoutout to our amazing team members, especially: – @mckbrando, for …”
https://x.com/jhyuxm/status/1912562461624131982

“💥 o3 and o4-mini are launching today! Both models are mind-blowing. But maybe the coolest for me has been seeing them use tools as they think. They can search, write code, and manipulate images in the chain of thought, and it’s a huge multiplier. I will never forget the first …”
https://x.com/kevinweil/status/1912554045849411847

“Introducing OpenAI o3 and o4-mini—our smartest and most capable models to date. For the first time, our reasoning models can agentically use and combine every tool within ChatGPT, including web search, Python, image analysis, file interpretation, and image generation.”
https://x.com/OpenAI/status/1912560057100955661

“OpenAI o3 and o4-mini are our first models to integrate uploaded images directly into their chain of thought. That means they don’t just see an image—they think with it.”
https://x.com/OpenAI/status/1912560060284502016

“Midjourney released V7, the latest version of its image generation model. The AI brings improved quality in generations, enhanced prompt adherence, and a voice-capable Draft Mode to help users iterate over their creations.”
https://x.com/adcock_brett/status/1911450308795904091

“One of the most exciting new features for our new V7 model is something we call ‘Draft Mode’. Draft mode is half the cost and 10 times the speed and it might be the best way to iterate on ideas ever. Try it with voice, think out loud and let your ideas flow like liquid dreams.”
https://x.com/midjourney/status/1908012965678420091

“Another beautiful example where AI truly gives you 10X to 100X in productivity. 💡 I am creating a LinkedIn carousel or LinkedIn slide in less than a minute and for FREE. And here I am using @MeetGamma. gamma[.]app 🧵 1/n One of the truly useful everyday life use cases of …”
https://x.com/rohanpaul_ai/status/1910710714433405345

REPA-E: Unlocking VAE for End-to-End Tuning of Latent Diffusion Transformers https://end2end-diffusion.github.io/

“i think gwern would have saved us from the slop if anyone had cared to listen at the time. but once diffusion and preference finetuning took off, it was already too late”
https://x.com/nearcyan/status/1912375152182223297

“Amazing interview with @DrYangSong, one of the key researchers we have to thank for diffusion models. The most important lesson IMO: be fearless! The community’s view on score matching was quite pessimistic at the time — he went against the grain and got it to work at scale!”
https://x.com/sedielem/status/1911821106811679094

“W&B’s media panel just got smarter. 🧠 Now you can scroll through images, videos, and other media using any config key—like epoch, train/global_step, or your own custom choice. It’s a faster, more intuitive way to judge progress and debug your models.”
https://x.com/weights_biases/status/1912668063771898267

“The Mogao Reveal: Congratulations to ByteDance Seed on launching Seedream 3.0, the new leading model on the Artificial Analysis Image Leaderboard, beating out GPT-4o, HiDream-I1-Dev, and Recraft V3. Seedream 3.0 is the latest in the Seedream family of bilingual image diffusion …”
https://x.com/ArtificialAnlys/status/1912122278722379903

“was in rome recently and, in one CoT, o3: >reasoned hard >resized the image and zoomed in >searched the internet several times >figured out my location >checked memory; deduced i was on vacation. gave me a better explanation than the museum lol actually blew my mind”
https://x.com/aidan_mclau/status/1912560625005522975

Junfeng5/Liquid_V1_7B · Hugging Face https://huggingface.co/Junfeng5/Liquid_V1_7B

“⚡️ Massive Update Just Dropped: Phase 2.0 for Kling AI! 🎥 KLING 2.0 Master for video generation, 🖼️ KOLORS 2.0 for image generation, 🎮 Multi-Elements Editor, 🎨 Image Editing & Restyle… Kling AI 2.0 is all about empowering creators to bring meaningful stories to life — with …”
https://x.com/Kling_ai/status/1912040247023788459

Cobra https://zhuang2002.github.io/Cobra/

InstantCharacter : Personalize Any Characters with a Scalable Diffusion Transformer Framework https://instantcharacter.github.io/

FireEdit: Fine-grained Instruction-based Image Editing via Region-aware Vision Language Model https://arxiv.org/pdf/2503.19839

