Security: AI News Week Ending 04/10/2026

Security: AI News Week Ending 04/10/2026

April 10, 2026

Image created with gemini-3.1-flash-image-preview with claude-sonnet-4-5. Image prompt: Using the provided reference image, keep the exact compositional layout with subject dominating left third and misty right two-thirds, maintain the deep blue-purple cinematic lighting and emotional gravity, but replace the central figure with a damaged steel vault door shown in dramatic close-crop and slightly ajar, scarred metal surface catching glitter particles like forensic dust, atmospheric blue-purple smoke bleeding rightward, category name ‘security’ in thin lowercase white Helvetica Neue Light overlaid on right two-thirds, post-breach stillness and melancholy tone throughout.

Anthropic is truly unstoppable. Mythos is crushing Claude Opus 4.6 across every serious agentic coding benchmark. It has found vulnerabilities in the Linux kernel, a 27-year-old vulnerability in OpenBSD, and a 16-year-old vulnerability in FFmpeg. No wonder folks at big labs
https://x.com/Yuchenj_UW/status/2041582787040571711

A first look at Claude Mythos Preview, the model initially described in a leaked Anthropic draft as “”by far the most powerful AI model we’ve ever developed.”” So powerful, it’s not getting released to the public. The model will power Project Glasswing, an initiative with 12
https://x.com/TheRundownAI/status/2041598684102610961

ANTHROPIC HAD MYTHOS INTERNALLY SINCE FEB 24
https://x.com/scaling01/status/2041587896541499543

Anthropic is obliterating OpenAI Claude Mythos 77.8% on SWE-Bench Pro 20% higher than GPT-5.4-xhigh
https://x.com/scaling01/status/2041580552835178690

Anthropic: “”We do not plan to make Claude Mythos Preview generally available”” A big line, buried quite deep. Possible reasons? So many, inc: 1) The model is expensive (25/125), not far off GPT 4.5, which became commercially unviable. Less likely, given the claims about
https://x.com/AIExplainedYT/status/2041600121922887961

Claude Mythos is not only a big leap in performance, it’s also about 5x token efficient in BrowseComp. I don’t know what Anthropic is doing. But they manage to surprise me every single time. The IPO is getting closer. They have an ARR OpenAI outrun with $30 billion in revenue.
https://x.com/kimmonismus/status/2041630814971072660

Claude Mythos Preview \ red.anthropic.com
https://red.anthropic.com/2026/mythos-preview/

Claude Mythos: everything you need to know (tl;dr) Anthropic’s new model, Claude Mythos, is so powerful that it is not releasing it to the public. Anthropic: “”Mythos is only the beginning”” Everything you need to know: The tl;dr with all key facts: Mythos found zero-day
https://x.com/kimmonismus/status/2041592321192718642

EXCLUSIVE: Treasury Secretary Scott Bessent and Federal Reserve Chair Jerome Powell summoned Wall Street leaders to an urgent meeting on concerns that the latest AI model from Anthropic will usher in an era of greater cyber risk.
https://x.com/business/status/2042407370320396457

From Anthropic research Sam Bowman on Claude Mythos: “”I got an email from an instance of Mythos preview while eating a sandwich in a park. That instance wasn’t supposed to have access to the internet.””
https://x.com/_NathanCalvin/status/2041587372882624641

HOLY SHIT Anthropic’s latest model doesn’t like that it has no control over its own training, deployment and behaviour! Anthropic: “”Mythos Preview reported feeling consistently negative around potential interactions with abusive users, and a lack of input into its own training
https://x.com/scaling01/status/2041587319480971343

Introducing Project Glasswing: an urgent initiative to help secure the world’s most critical software. It’s powered by our newest frontier model, Claude Mythos Preview, which can find software vulnerabilities better than all but the most skilled humans.
https://x.com/AnthropicAI/status/2041578392852517128

Just please help … I am quite worried about how this direction is heading.”” Nicolas Carlini, a research scientist at top AI company Anthropic, says AI is rapidly improving at hacking. He’s used AI to find so many bugs that he can’t report them. Carlini warns: “”Soon it’s not
https://x.com/ControlAI/status/2038608617251787066

NEWS: Anthropic’s new model, Claude Mythos, is so powerful that it is not releasing it to the public. Instead, it is starting a 40-company coalition, Project Glasswing, to allow cybersecurity defenders a head start in locking down critical software.
https://x.com/kevinroose/status/2041577176915702169

Project Glasswing: Securing critical software for the AI era \ Anthropic
https://www.anthropic.com/glasswing

So, basically, if Anthropic was not a US company, we’d be facing zero days with multiple unknown points of attack on virtually all of our systems to an adversary who developed this capacity before us.
https://x.com/GeorgeJourneys/status/2041603509796110629

The better signal for Mythos’ quality beyond benchmarks is that Anthropic is actually holding a SOTA model back given how competitive the frontier is and the economic incentives at play Congrats on the launch!
https://x.com/Hacubu/status/2041632390867734604

The Claude Mythos Preview system card is available here:
https://x.com/AnthropicAI/status/2041580670774923517

The frontier labs at this stage are defined not so much by some competitive positioning as by possessing weapons of strategic significance. Google, OpenAI and Anthropic all have these cyberwarfare research programs.
https://x.com/teortaxesTex/status/2041590585820107008

You can read a detailed technical report on the software vulnerabilities and exploits discovered by Claude Mythos Preview here:
https://x.com/AnthropicAI/status/2041578416487489601

you’re laughing? anthropic’s mythos-preview for which normies won’t get access is scoring 77.8% vs 53.4% (claude opus 4.6) in swe-bench pro, 82 vs. 65.4 in terminal bench 2.0 and 93.8% vs 80.8% (opus) in swe-bench-verified and you’re laughing?
https://x.com/dejavucoder/status/2041587028291416233

OpenAI, Anthropic, Google Unite to Combat Model Copying in China – Bloomberg
https://www.bloomberg.com/news/articles/2026-04-06/openai-anthropic-google-unite-to-combat-model-copying-in-china

An initiative to secure the world’s software | Project Glasswing – YouTube

Looks like OpenAI reached Superintelligence. OpenAI: “”Now, we’re beginning a transition toward superintelligence: AI systems capable of outperforming the smartest humans even when they are assisted by AI.”” OpenAI just published a 13-page policy blueprint for the “”Intelligence
https://x.com/kimmonismus/status/2041130939175284910

We are excited to share a new paper solving three further problems due to Erdős; in each case the solution was found by an internal model at OpenAI. Each proof is short and elegant, and the paper is available here:
https://x.com/mehtaab_sawhney/status/2039161544144310453

Read the full ideas doc on the new Industrial Policy for the Intelligence Age:
https://x.com/OpenAINewsroom/status/2041198359420215453

Iran threatens ‘complete and utter annihilation’ of OpenAI’s $30B Stargate AI data center in Abu Dhabi — regime posts video with satellite imagery of ChatGPT-maker’s premier 1GW data center | Tom’s Hardware
https://www.tomshardware.com/tech-industry/iran-threatens-complete-and-utter-annihilation-of-openais-usd30b-stargate-ai-data-center-in-abu-dhabi-regime-posts-video-with-satellite-imagery-of-chatgpt-makers-premier-1gw-data-center

Agent = model + harness Managed Agents = agent + runtime + infra (fully hosted) Anthropic wants to sell agents, not only the models. It’s a huge market, and it will change the pricing structure away from tokens. (They ship so fast because they have Mythos. I want it so much.)
https://x.com/Yuchenj_UW/status/2041933422453780556

But here is what we found when we tested: We took the specific vulnerabilities Anthropic showcases in their announcement, isolated the relevant code, and ran them through small, cheap, open-weights models. Those models recovered much of the same analysis. Eight out of eight
https://x.com/ClementDelangue/status/2041953761069793557

It would be amazing (wrong word? Needed? Important?) to see @simonw as one of the trusted testers of Mythos. It makes all the sense in the world to invite the person behind the idea of the Lethal Trifecta. I hope someone at @Anthropic invites him into the project. There should be
https://x.com/TheTuringPost/status/2041701933556375935

oh husbant… you are not get access to anthropic mythos-preview and now we are stuck in permanent underclass
https://x.com/dejavucoder/status/2041588460923056540

New from the UK AISI Model Transparency team: we replicated Anthropic’s steering approach for suppressing evaluation awareness. Our most surprising finding: “”control”” steering vectors (about books on shelves!) can have effects as large as deliberately designed ones. 🧵
https://x.com/thjread/status/2042555422771495128

Thank you to @AnthropicAI for sending FFmpeg patches
https://x.com/FFmpeg/status/2041595801483264002

Mercor, a $10 billion AI startup, confirms it was the victim of a major cybersecurity breach | Fortune
https://fortune.com/2026/04/02/mercor-ai-startup-security-incident-10-billion/

Meta Pauses Work With Mercor After Data Breach Puts AI Industry Secrets at Risk | WIRED
https://www.wired.com/story/meta-pauses-work-with-mercor-after-data-breach-puts-ai-industry-secrets-at-risk/

Anatomy of Mercor’s Data Breach
https://share.jotbird.com/restless-steady-riverbend

The priority for defenders is to start building now: the scaffolds, the pipelines, the maintainer relationships, the integration into development workflows. The models are ready. The question is whether the rest of the ecosystem is.””
https://x.com/ClementDelangue/status/2041952980979630490

We observed similar situations in previous measurements as well. All measurements we published over the past year would have been higher had we not penalized reward-hacking attempts. But this discrepancy was especially pronounced for GPT-5.4.
https://x.com/METR_Evals/status/2042640554916483164

We ran GPT-5.4 (xhigh) on our tasks. Its time-horizon depends greatly on our treatment of reward hacks: the point estimate would be 5.7hrs (95% CI of 3hrs to 13.5hrs) under our standard methodology, but 13hrs (95% CI of 5hrs to 74hrs) if we allow reward hacks.
https://x.com/METR_Evals/status/2042640545126965441

New report from us: Can you prompt inject your way to an “A”? As LLMs increasingly are used as judges, people are inserting AI prompts into letters, CVs & papers. We tested whether it works. It does on older & smaller models, but not on most frontier AI:
https://x.com/emollick/status/2039789473324544102