Image created with gemini-2.5-flash-image with claude-sonnet-4-5. Image prompt: Minimalist luxury stable interior with empty stall, golden nameplate reading LLAMA on polished brass door, pristine marble floors, single dramatic spotlight, cold blue moonlight through windows, untouched hay and water, architectural emptiness, cinematic composition with deep shadows and negative space, bold white sans-serif text LLAMA overlaid across image

“At this point, papers testing whether AI can or cannot do something should try to test the strongest case, as well as a default. It is fine to say Llama 2 failed, but did a serious attempt to use GPT-5.1 Thinking in an agentic harness work? It would help better map the frontier.” (X) https://x.com/emollick/status/1994913383871586563
