Image created with gemini-3.1-flash-image-preview with claude-opus-4.7. Image prompt: Using the provided reference image, preserve everything exactly — the marigold-orange backdrop, the seated woman with closed eyes and faint smile in her purple-and-white windbreaker, the tattooed singer in the red beanie and layered red vest, the lighting, framing, and depth of field — but replace only the black handheld microphone with a chunky black vintage camera prime lens held to his mouth in the same grip and position, its front glass element glinting and rendered with seamless photographic realism. After generating the image, overlay the text “Images” in the upper-left corner of the frame in large, bold, all-caps ITC Avant Garde Gothic Pro Medium (or a near-identical geometric sans-serif if unavailable), pure white (#FFFFFF), with no date, subtitle, drop shadow, or outline. The text should be substantial in scale — taking up a meaningful portion of the upper-left area — with comfortable margin from the top and left edges, set against the negative space of the orange backdrop so it does not overlap or obscure the singer, the seated woman, or the replaced object.

One fun thing about AI is that it lets you play with interfaces and approaches to displaying information in new ways without a lot of effort. I got an internet-connected e-ink display and set it up to show me the weather as interpreted by nano banana using rotating styles.
https://x.com/emollick/status/2042350797216751802

Introducing ERNIE-Image
https://ernie.baidu.com/blog/posts/ernie-image/

Introducing TIPS v2 👀 Foundational text-image encoder 📸 Can be used as the base for different multimodal applications 🤗 Apache 2.0 🧑‍🍳 New pre-training recipes
https://x.com/osanseviero/status/2044520603647164735

Meet MAI-Image-2-Efficient. Production-ready quality, 22% faster, and 4x more efficient than MAI-Image-2. Priced almost 41% lower too. Plus 40% average lower latency than other leading models. Live now in Microsoft Foundry + MAI Playground.
https://x.com/mustafasuleyman/status/2044083984343466156

Two models, two different parts of the creative process. MAI-Image-2-Efficient is a production workhorse. Volume, speed, tight cost control for iterative workflows. MAI-Image-2 is a precision tool. Highest fidelity, final deliverables, exact details, longer/more complex text.
https://x.com/mustafasuleyman/status/2044467951429116290

We just OCR’d 27,000 arxiv papers into Markdown using an open 5B model, 16 parallel HF Jobs on L40S GPUs, and a mounted bucket. Total cost: $850. Total time: ~29 hours. Jobs that crashed: 0. This now powers “Chat with your paper” on
https://x.com/ClementDelangue/status/2043779449322160270
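
For a sense of scale, the quoted figures work out to roughly 3 cents per paper. The sketch below is only back-of-envelope arithmetic on the numbers in the post. It treats "~29 hours" as exact wall-clock time and assumes all 16 jobs ran for that full duration, neither of which the post confirms, so the per-job-hour figure is an estimate and not a quoted price.

```python
# Back-of-envelope arithmetic using only the figures quoted in the post.
papers = 27_000
total_cost_usd = 850
wall_clock_hours = 29   # quoted as "~29 hours"; treated as exact here
parallel_jobs = 16

cost_per_paper = total_cost_usd / papers        # dollars per paper
papers_per_hour = papers / wall_clock_hours     # overall throughput
papers_per_job = papers / parallel_jobs         # average share per job

# Assumption (not stated in the post): every job ran for the full 29 hours.
job_hours = parallel_jobs * wall_clock_hours
cost_per_job_hour = total_cost_usd / job_hours  # implied, not a quoted rate

print(f"Cost per paper:       ${cost_per_paper:.4f}")
print(f"Papers per hour:      {papers_per_hour:.0f}")
print(f"Papers per job (avg): {papers_per_job:.1f}")
print(f"Implied $/job-hour:   ${cost_per_job_hour:.2f}")
```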
