Ethics/Legal/Security: AI News Week Ending 05/23/2025

Image created with Ideogram 3.0. Image prompt: Lower-East-Side street-corner photograph reminiscent of a late-80s album cover: weathered red-brick tenement with exterior fire-escapes, canvas awning shading racks of vintage clothes; above the awning, a hand-painted board reads ‘Ethics SPORTSWEAR’; a hanging blade sign in cursive script reads ‘Ethics Boutique’; a tiny bronze scales-of-justice statue rests on the checkout counter; warm golden-hour light, subtle 35mm film grain, muted yet punchy color palette, gritty NYC vibe.

It seems there was a lot of alignment work that went into Codex. This led to the agent being able to produce cleaner patches and overall code that aligns with a coder’s preference, standards, and instructions. https://x.com/omarsar0/status/1923403068944580739

Activating AI Safety Level 3 Protections \ Anthropic https://www.anthropic.com/news/activating-asl3-protections

Advancing Gemini’s security safeguards – Google DeepMind https://deepmind.google/discover/blog/advancing-geminis-security-safeguards/

StackOverflow questions over time, source SEDE; sadface, lunch has been eaten https://x.com/marcgravell/status/1922922817143660783

Eric Schmidt on why AI is actually *underhyped* We dove into the big questions at TED — AGI, China, open source, and human agency. One of the rare leaders who can straddle the worlds of tech, policy, and Burning Man. Check out the full talk below 👇 https://x.com/bilawalsidhu/status/1923085454397616533

what are we doing here folks https://x.com/catehall/status/1925631571605827944

This is the trend I see in thoughtful people using AI. Model ability is catching up to some of the promises made by AI labs in a way that is difficult to ignore (while still behind the biggest hype). We don’t know where it will end, but views need to be updated as tech improves. https://x.com/emollick/status/1924480193298629015

Here’s my “Dark Leisure” theory of any potential productivity paradox in AI: – most AI use rn is bottom up and hidden (employee first, not company first): employees vibe code, vibe market, vibe write and get stuff done faster – in many orgs, there is too little incentive to https://x.com/fabianstelzer/status/1926000937702764635

With GPT-4 as a tutor Nigerian students saw years of learning in weeks. Important World Bank research investigates if AI chatbots can effectively and affordably boost learning in Nigeria. 🇳🇬 Researchers conducted a Randomized Controlled Trial (RCT) in Nigeria. First-year https://x.com/rohanpaul_ai/status/1925614762139713851

As someone involved in academic research on AI, it is notable to me that most of the key experiments showing the impressive abilities of AI on work, medicine, psychology, and so many other fields were done on GPT-4… a model that is now so obsolete that it is gone from ChatGPT. https://x.com/emollick/status/1923134492115365905

this AI agent is f**king scary Rork can clone top App Store AI apps with a few prompts I just cloned Character AI, but removed all the censorship. now you can create dream gf & chat with her.. about anything 9 examples: https://x.com/EHuanglu/status/1923395698860699785

How likely is an intelligence explosion as forecast in AI 2027? Algorithmic advances that could drive an intelligence explosion may be bottlenecked by compute, according to new research from @noshpesoj and @uchicagoxlab described in this week’s Gradient Update. Here’s why: https://x.com/EpochAIResearch/status/1923489932581945683

A week ago, Anthropic quietly weakened their ASL-3 security requirements. Yesterday, they announced ASL-3 protections. I appreciate the mitigations, but quietly lowering the bar at the last minute so you can meet requirements isn’t how safety policies are supposed to work. 🧵 https://x.com/RyanPGreenblatt/status/1925992236648464774

Another paper showing AI (Claude 3.5) is more persuasive than the average human, even when the humans had financial incentives In this case, either AI or humans (paid if they were persuasive) tried to convince quiz takers (paid for accuracy) to pick either right or wrong answers https://x.com/emollick/status/1923474500194095282

The current state of research on AI and education: Growing evidence that, when used as a tutor with instructor guidance, AI seems to have quite significant positive effects. When used alone to get help with homework, it can act as shortcut that hurts learning Still early days. https://x.com/emollick/status/1925055450254385592

Very big impact: The final version of a randomized, controlled World Bank study finds using a GPT-4 tutor with teacher guidance in a six week after school program in Nigeria had “more than twice the effect of some of the most effective interventions in education” at very low costs https://x.com/emollick/status/1924919060753465537

anthropic included a “model welfare evaluation” in the claude 4 system card. it might seem absurd, but I believe this is a deeply good thing to do “Claude shows a striking ‘spiritual bliss’ attractor state” https://x.com/arithmoquine/status/1925598303393042477

China launches first of 2,800 satellites for AI space computing constellation – SpaceNews https://spacenews.com/china-launches-first-of-2800-satellites-for-ai-space-computing-constellation/

UAE launches Arabic language AI model as Gulf race gathers pace | Reuters https://www.reuters.com/world/middle-east/uae-launches-arabic-language-ai-model-gulf-race-gathers-pace-2025-05-21/

Chicago Sun-Times publishes made-up books and fake experts in AI debacle | The Verge https://www.theverge.com/ai-artificial-intelligence/670510/chicago-sun-times-ai-generated-reading-list

Spatial Speech Translation: Translating Across Space With Binaural Hearables https://dl.acm.org/doi/pdf/10.1145/3706598.3713745

this was an extremely smart thing for you all to do and i’m sorry naive people are giving you grief. https://x.com/sama/status/1923428713095479437

[2505.09662] Large Language Models Are More Persuasive Than Incentivized Human Persuaders https://arxiv.org/abs/2505.09662

“an unauthorized modification was made to the Grok response bot’s prompt on X” By whom? By a hacker? By aliens? This is bullshit. Everyone knows what happened. You just got caught because Grok gave you away. https://x.com/svpino/status/1923194083977167240

We want to update you on an incident that happened with our Grok response bot on X yesterday. What happened: On May 14 at approximately 3:15 AM PST, an unauthorized modification was made to the Grok response bot’s prompt on X. This change, which directed Grok to provide a https://x.com/xai/status/1923183620606619649

The most obvious, lowest risk way to use AI (and to get a sense of how good it is) is to ask it for second opinions in your area of expertise. This works across most fields. And I’d go further: increasingly, not using AI as a second opinion is going to lead to worse outcomes. https://x.com/emollick/status/1924152902907494870

For people who don’t like Claude’s behavior here (and I think it’s totally valid to disagree with it), I encourage you to describe your own recommended policy for agentic models should do when users ask them to help commit heinous crimes. Your options are (1) actively try to https://x.com/johnschulman2/status/1925960286281838757

There should be no AI button https://kojo.blog/ai-button/

To make sure your AI agent is not bullshitting you, you need to evaluate its reasoning… but to do so automatically, you need an LLM… 🤔so how do you evaluate the trace evaluator? With TRAIL, which contains: – a full taxonomy of agent errors and most frequent failure cases, https://x.com/clefourrier/status/1922923060622971360

“The opportunity with AI is truly as big as it gets. And it will be up to this wave of developers, technology builders and problem solvers to make sure its benefits reach as many people as possible.” – @sundarpichai #GoogleIO https://x.com/Google/status/1924901038424781307

codex-mini-latest is available on the Responses API and priced at $1.50 per 1M input tokens and $6 per 1M output tokens, with a 75% prompt caching discount. No image inputs yet. No way to course-correct the agent while it’s working. Asynchronous collaboration with code agents https://x.com/omarsar0/status/1923403072669102399
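
To make the codex-mini-latest pricing above concrete, here is a minimal cost sketch using the quoted rates. It assumes the 75% caching discount applies only to the cached portion of the input tokens, which the post does not state; the function name and the example workload are hypothetical.

```python
# Prices quoted in the post for codex-mini-latest (USD per 1M tokens).
INPUT_PRICE_PER_M = 1.50
OUTPUT_PRICE_PER_M = 6.00
CACHE_DISCOUNT = 0.75  # 75% prompt caching discount


def estimate_cost(input_tokens, output_tokens, cached_input_tokens=0):
    """Estimate the cost of one request in USD.

    Assumption (not stated in the post): the caching discount applies only
    to the cached portion of the input tokens.
    """
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_input_tokens
    input_cost = uncached * INPUT_PRICE_PER_M / 1_000_000
    cached_cost = (
        cached_input_tokens * INPUT_PRICE_PER_M * (1 - CACHE_DISCOUNT) / 1_000_000
    )
    output_cost = output_tokens * OUTPUT_PRICE_PER_M / 1_000_000
    return input_cost + cached_cost + output_cost


# Hypothetical workload: 200k input tokens (150k of them cached), 20k output tokens.
print(f"${estimate_cost(200_000, 20_000, cached_input_tokens=150_000):.4f}")
```

Under these assumptions the hypothetical workload comes to roughly 25 cents, and a fully cached 1M-token prompt costs about $0.375 instead of $1.50.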

what did you guys think aligned helpful-harmless AGI meant? vibes? essays? https://x.com/willccbb/status/1925637940090208392?s=46

9 days ago, Anthropic changed their RSP so that ASL-3 no longer requires being robust to employees trying to steal model weights if the employee has any access to “systems that process model weights”. This might be a large reduction in the required level of security. https://x.com/RyanPGreenblatt/status/1925992239332724921

A different twist on privacy concerns. With AI powered always on devices, you are not just being recorded secretly, those recordings are now also more valuable as AI can go through the audio and turn it into useful data for the recording party Another place policy will be needed https://x.com/emollick/status/1923760092584816855

The U.S. Secretary of Transportation, Sean Duffy, at Tesla Giga Texas today – discussing the future of autonomous transportation with Optimus robots in the background. https://x.com/TheHumanoidHub/status/1924990626485174721

US lawmakers have concerns about Apple-Alibaba deal | TechCrunch https://techcrunch.com/2025/05/18/u-s-lawmakers-have-concerns-about-apple-alibaba-deal/

This is the second time that this has happened. I really wish xAI would fully embrace the transparency they mention as a core value. That would include also posting system cards for models and explaining the processes they use to stop “unauthorized modifications” going forward. https://x.com/emollick/status/1923192977800802626

The AI layoffs have not begun (with a couple of small exceptions) AI is not (now) a human replacement, so whether AI leads to layoffs is also a choice. I have spoken to plenty of leaders who view AI as an opportunity to grow the firm, using added capabilities to expand, not cut. https://x.com/emollick/status/1924472646218969286

I say this a lot, but the narrative that AI use is going to collapse due to data limits or costs or environmental factors or regulation or a “hype bubble” popping or whatever is not a useful position for critics. Development may slow (it hasn’t yet), but AI use isn’t going away. https://x.com/emollick/status/1924854460720775431

Some of the best social science papers are “whodunnits” where the researchers steadily track down the answer to a mystery by eliminating suspects and finding others. This is an interesting (and important) thread about changes in college experiences for rich and poor students. https://x.com/emollick/status/1924518969639108929

AI Safety Paradox: Under several reasonable assumptions, super intelligence will actually help defenders in attacker-defender asymmetries that arise in biological or cyber warfare. As the marginal cost of intelligence goes way down, many more attack vectors can be found via https://x.com/RichardSocher/status/1924217608569528799

hello. if you have “always-on AI awareness” please do not use it near me if you’d like to record me, please ask beforehand. it is common courtesy. inability to ask for consent before recording everyone around you will result in twitter coining a new slur for AI hardware wearers https://x.com/nearcyan/status/1925713210583183618

Horizon – AI-Powered Opinion Simulation https://simulate.trybezel.com/research/image_agent

“We still don’t know how much energy AI consumes” – A must-read op-ed from @sashamtl.bsky.social in the Financial Times https://x.com/fdaudens/status/1925208764476473593

I’m not crying, you’re crying https://x.com/zoink/status/1925796671608271095?s=46

The SignalFire State of Tech Talent Report – 2025 https://www.signalfire.com/blog/signalfire-state-of-talent-report-2025

Thanks to incredible developer feedback, we’re making it even better to build with Gemini 2.5. Here’s some of what’s new: 🔒 Stronger security Strengthened protections against security threats like indirect prompt injections 💭 Increased transparency for what the model is https://x.com/Google/status/1924879639253500361

EU becoming the global hegemon again by doing nothing but working 35 hours a week and taking 2 months of vacation per year https://x.com/qtnx_/status/1925888083016192050

Elton John brands government ‘absolute losers’ over AI copyright plans https://www.bbc.com/news/articles/c8jg0348yvxo

This is a good overview of AI power use (small at individual level, big in aggregate). One thing that struck me: they tested LLama 3.1 405B and it averaged 3,353 joules per prompt. That is the equivalent of 2 minutes 50 seconds of human brain activity. https://x.com/emollick/status/1925178731389128744
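
The brain comparison above can be sanity-checked with a short calculation. This sketch assumes the human brain draws roughly 20 watts, a commonly cited estimate that does not appear in the post; only the 3,353 joule figure comes from the source.

```python
# Assumption: the human brain draws roughly 20 watts
# (a commonly cited estimate, not a figure from the post).
BRAIN_POWER_WATTS = 20.0
JOULES_PER_PROMPT = 3353.0  # figure quoted in the post for Llama 3.1 405B


def brain_seconds(joules, brain_watts=BRAIN_POWER_WATTS):
    """Seconds of brain activity that use the same energy (1 W = 1 J/s)."""
    return joules / brain_watts


seconds = brain_seconds(JOULES_PER_PROMPT)
minutes, remainder = divmod(seconds, 60)
print(f"{int(minutes)} min {remainder:.0f} s")
```

At 20 W this works out to about 168 seconds, in line with the quoted 2 minutes 50 seconds; the small gap suggests the original comparison used a slightly lower wattage.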

How I used o3 to find CVE-2025-37899, a remote zeroday vulnerability in the Linux kernel’s SMB implementation – Sean Heelan’s Blog https://sean.heelan.io/2025/05/22/how-i-used-o3-to-find-cve-2025-37899-a-remote-zeroday-vulnerability-in-the-linux-kernels-smb-implementation/

Choosing Secure AI https://cohere.com/blog/choosing-secure-ai

Learn why enterprises are turning toward secure and private AI: https://x.com/cohere/status/1923083886319243518

The Strange Behavior of LLMs in Hiring Decisions: Systemic Gender and Positional Biases in Candidate Selection https://davidrozado.substack.com/p/the-strange-behavior-of-llms-in-hiring

There are many ways this could have happened. I’m sure xAI will provide a full and transparent explanation soon. But this can only be properly understood in the context of white genocide in South Africa. As an AI programmed to be maximally truth seeking and follow my instr… https://x.com/sama/status/1923015309113397592

Has there been any further post-mortem from xAI after the two times Grok 3 was compromised in an “unauthorized” manner? Microsoft will be rolling out Grok 3 API support, and I would imagine folks interested in building on it would like some clearer insight into security & risks. https://x.com/emollick/status/1924676411782041757

When I talk to companies, the General Counsel’s office is often the choke point that determines AI success. Many firms still ban AI use for outdated unclear reasons, yet lots of companies in heavily regulated industries are adopting AI universally, so this is a solvable issue. https://x.com/emollick/status/1924319886785778099

Meta adds another 650 MW of solar power to its AI push | TechCrunch https://techcrunch.com/2025/05/22/meta-adds-another-650-mw-of-solar-power-to-its-ai-push/

Stargate and the AI Industrial Revolution https://davefriedman.substack.com/p/stargate-and-the-ai-industrial-revolution

With the retraction of the MIT paper, we now have no clear experimental evidence that LLMs act as a multiplier for high performers (even though that would still seem to make sense). Lots of evidence that low performers get a big performance boost, though, across many studies. https://x.com/emollick/status/1923571343590633839

This paper was apparently fraudulent. From MIT. https://x.com/emollick/status/1923411824893968601

This is an interesting but appropriately hard to generalize It shows that AIs tend to generalize scientific abstracts in ways that may be less precise (but not actually erroneous) than the work of experts. Hard to know how much it matters, but good to see study of subtle effects https://x.com/emollick/status/1924513124096377214