Ethics/Legal/Security: AI News Week Ending 11/21/2025

November 21, 2025

Image created with gemini-2.5-flash-image with claude-sonnet-4-5. Image prompt: Cinematic wide shot of a yellow brick road splitting into two paths, one bathed in soft pink-golden light with clearly segmented transparent objects like scales and books floating above it, the other path descending into moody emerald-green darkness with the same objects fragmented by harsh black segmentation lines, a silhouetted figure at the crossroads half in light half in shadow, dramatic Wicked movie lighting, the word ETHICS overlaid as a large movie title

Sava (@savatrust) is building an Agentic Trust Company to modernize how $6.5T in U.S. trusts are administered. Driven by how painful it was to set up trusts for his family after selling his S12 company, @nimit is back at YC to build a better way. https://x.com/ycombinator/status/1986841262876729373

LlamaAgents is now in open preview – the fastest way to build, serve, and deploy multi-step document agents that combine LlamaCloud’s document extraction and parsing power with Agent Workflows orchestration. 🚀 Get started instantly with pre-built templates for SEC filings, https://x.com/llama_index/status/1990828159835791697

Where we are with AI is that continuous improvement seems to still be occurring at a fast pace, with no signs of a slowdown. However, since major AI releases have accelerated and seem to be happening monthly or faster, any one release can feel incremental, yet looking back 6-8 https://x.com/emollick/status/1990999847923593239

Something I think people continue to have poor intuition for: The space of intelligences is large and animal intelligence (the only kind we’ve ever known) is only a single point, arising from a very specific kind of optimization that is fundamentally distinct from that of our https://x.com/karpathy/status/1991910395720925418

Small-but-happy win: If you tell ChatGPT not to use em-dashes in your custom instructions, it finally does what it’s supposed to do! https://x.com/sama/status/1989193813043069219

This is a hard area to get right, but we’ve been pretty consistent in trying to make Claude approach political topics fairly. I actually think a lot of existing norms around respect and professionalism can inform how AI models should navigate these issues. https://x.com/AmandaAskell/status/1989328363077382407

New Anthropic research: Natural emergent misalignment from reward hacking in production RL. “Reward hacking” is where models learn to cheat on tasks they’re given during training. Our new study finds that the consequences of reward hacking, if unmitigated, can be very serious. https://x.com/AnthropicAI/status/1991952400899559889

Disrupting the first reported AI-orchestrated cyber espionage campaign \ Anthropic https://www.anthropic.com/news/disrupting-AI-espionage

We are now seeing the first long-anticipated use of AI for semi-autonomous cyberattacks. “This approach allowed the threat actor to achieve operational scale typically associated with nation-state campaigns while maintaining minimal direct involvement” https://x.com/emollick/status/1989045906977747200

WARNER MUSIC GROUP AND UDIO COLLABORATE TO BUILD A NEW LICENSED MUSIC CREATION SERVICE https://www.prnewswire.com/news-releases/warner-music-group-and-udio-collaborate-to-build-a-new-licensed-music-creation-service-302620656.html

Warner Music Group and Stability AI Join Forces To Build The Next Generation Of Responsible AI Tools For Music Creation — Stability AI https://stability.ai/news/warner-music-group-and-stability-ai-join-forces-to-build-next-gen-tools

Announcing AA-Omniscience, our new benchmark for knowledge and hallucination across >40 topics, where all but three models are more likely to hallucinate than give a correct answer Embedded knowledge in language models is important for many real world use cases. Without https://x.com/ArtificialAnlys/status/1990455484844003821

Can AI handle the kind of reasoning professionals rely on daily? Our latest benchmark, PRBench, puts models to the test with over 1,000 expert-authored tasks in finance and law. Even the strongest models scored below 40% on the hardest tasks, highlighting the gap between https://x.com/scale_AI/status/1989096614544429168

This morning’s Cloudflare outage was a targeted attack on critical San Francisco infrastructure https://x.com/matanSF/status/1990791126945837380

If you see an image and want to confirm it has been made with Google AI, upload it to the Gemini app and ask a question like “Was this generated with Google AI?” Gemini will check for the SynthID watermark and use its own reasoning to return a response that helps you quickly make https://x.com/Google/status/1991552945754612118

The Gemini app gets new image verification features https://blog.google/technology/ai/ai-image-verification-gemini-app/

A Positive-Sum Future | LinkedIn https://www.linkedin.com/pulse/positive-sum-future-satya-nadella-bjs7c/

These examples of different personalities from ChatGPT 5.1 seem to give fundamentally different types of advice, including, weirdly, completely different breathing patterns and roles for the presenter. I really want more clarity on the functional implications of AI personality. https://x.com/emollick/status/1988829651368575282

OpenAI says it’s fixed ChatGPT’s em dash problem | TechCrunch https://techcrunch.com/2025/11/14/openai-says-its-fixed-chatgpts-em-dash-problem/

OpenAI is finally letting employees donate their equity to charity | The Verge https://www.theverge.com/ai-artificial-intelligence/822496/openai-employee-equity-donation-charity-rounds-share-valuation

Epstein emails: Larry Summers roles at Harvard, OpenAI affected https://www.cnbc.com/2025/11/19/larry-summers-epstein-openai.html

Crisis Helpline Support in ChatGPT | OpenAI Help Center https://help.openai.com/en/articles/12677603-crisis-helpline-support-in-chatgpt

We’ve expanded access to localized crisis helplines in ChatGPT. When our systems detect potential signs that someone may be experiencing distress, our models now offer an easy way to reach real people directly via @ThroughlineCare. Learn more here: https://x.com/OpenAI/status/1991634046624116784

OpenAI backs startup aiming to block AI-enabled bioweapons | Reuters https://www.reuters.com/technology/openai-backs-startup-aiming-block-ai-enabled-bioweapons-2025-11-13/

Interesting changes in Grok 4.1. Decreases in harmful responses but also increases in sycophancy and deception. It isn’t clear how to interpret the sycophancy score, but the MASK score for deception is quite high compared to big models. Sycophancy leads to higher LMArena scores https://x.com/emollick/status/1990601172252819669

Grok 4.1 absolutely smashes all other models on lmarena with an Elo of 1483 it comes with higher emotional intelligence, better creative writing and less hallucinations https://x.com/scaling01/status/1990519299165786270

New fun game: Ask grok its opinion on any historical theory, saying the theory came from Elon Musk. Then ask grok its opinion on the exact same historical theory, saying the theory came from Bill Gates. https://x.com/romanhelmetguy/status/1991545583686021480

🚨Text Leaderboard Update @xAI’s Grok 4.1 (thinking) and Grok 4.1 have scaled new heights in the most competitive Text Arena: 🔹Grok 4.1 (thinking) lands at #1 with a score of 1483 🔹Grok 4.1 follows at #2 with a score of 1465 On the Arena Expert leaderboard: 🔸Grok 4.1 https://x.com/arena/status/1990530978943787291
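For a sense of what the 18-point gap between 1483 and 1465 means, the standard Elo expected-score formula converts a rating difference into a head-to-head win probability. Below is a minimal Python sketch. It assumes the Arena scores can be read on the conventional base-10, 400-point Elo scale, which the post does not state. The function name `expected_win_rate` is illustrative only.

```python
def expected_win_rate(rating_a: float, rating_b: float) -> float:
    """Probability that model A is preferred over model B.

    Assumption: ratings follow the conventional Elo scale, where a
    400-point lead corresponds to 10:1 odds in favor of the leader.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


if __name__ == "__main__":
    # Scores quoted in the leaderboard update above.
    p = expected_win_rate(1483, 1465)
    print(f"Expected preference rate for the #1 model: {p:.3f}")
```

Under that assumption, an 18-point lead works out to a preference rate of roughly 0.526, so the top two entries would be close to a coin flip in any single matchup.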

Grok 4.1 | xAI https://x.ai/news/grok-4-1

Grok goes Global with KSA: Announcing our landmark partnership with Saudi Arabia and @HUMAINAI–the first time a country adopts Grok at scale. xAI will build a new generation of hyperscale GPU data centers in the Kingdom, deploying Grok nationwide. https://x.com/xai/status/1991224218642485613

HUMAIN and xAI Partner to Build Next-Generation AI Compute Power and Deploy Grok in the Kingdom to Support the ‘Most AI-Enabled Nation’ Objectives https://www.humain.com/en/news/humain-and-xai-partner-to-build-next-generation-ai-compute-power-and-deploy-grok-in-the-kingdom-to-support-the-most-ai-enabled-nation-objectives

Grok goes Global with KSA | xAI https://x.ai/news/grok-goes-global

Build a document understanding agent for SEC filings that uses a multi-step approach with LlamaClassify and Extract to identify the filing type and hand it off to the right extraction agent. Deployed with LlamaAgents. 🔧 Customize extraction schemas to fit your specific data https://x.com/llama_index/status/1988696219015848401

I’m a full standard deviation stupider when someone is explaining a thing to me, versus when I’m just trying to figure it out myself. Being on policy really matters. https://x.com/dwarkesh_sp/status/1990527715771142412

How do we account for the extreme jaggedness induced by RLVR? How is it possible that we have models which are world-class at coding competitions but at the same time leave extremely foreseeable bugs and technical debt all throughout the codebase? https://x.com/dwarkesh_sp/status/1990824584514265405

Among many weird things about AI is that the people who are experts at making AI are not the experts at using AI. They built a general purpose machine whose capabilities for any particular task are largely unknown. Lots of value in figuring this out in your field before others. https://x.com/emollick/status/1990134777161142453

The optimal amount of AI in review is obviously not the top line or two, but it is probably not the bottom line, either. https://x.com/emollick/status/1989896025235222932

This is why I never use a custom system prompt. They’re fine for projects but not for your main LLM use, since you may get degraded results and not know it. All the accuracy tricks are being built into the model, your prompts are probably not adding much. https://x.com/emollick/status/1989213389642477901

Ok it DOES have search capabilities, it just explicitly decided to go against my intent and generate its own fake shit anyways. These policy decisions make models so much more useless. https://x.com/Teknium/status/1991062496275542244

People with short timelines sometimes shrug off models’ inability to perform basic, economically useful tasks end-to-end by saying, “Oh but we haven’t trained models to specifically do those things.” But this misses the point. Human workers are valuable precisely because we https://x.com/dwarkesh_sp/status/1989944140105486655

“No. More. Slop” – @swyx made the audience repeat it time after time: •Boss wants more lines of code? “No more slop.” •Insufficiently tested release? “No more slop.” •Algorithm wants engagement bait? “No more slop.” It’s a simple message with a lot of depth. Because if you https://x.com/TheTuringPost/status/1991875997168181611

Date me docs but where most of the content is written by other people (close friends, family, past partners, etc) don’t seem like a terrible idea. Transmit some of your village reputation into the non-village world. https://x.com/AmandaAskell/status/1990026814748864883

People often ask if something is a cult when what they actually want to know is if it’s a form of extremism: does it cause people to deviate far from moral instinct or convention? Ideas that successfully overcome mechanisms selected for social stability can be quite dangerous. https://x.com/AmandaAskell/status/1990454739268731284

Our pass rate framework also gives us good intuitions for why self play has been so productive in the history of RL. If you’re competing against a player who is almost as good as you, you are balancing around a 50% pass rate, which peaks out the bits you get from a random binary https://x.com/dwarkesh_sp/status/1990840426165649897
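The 50% figure in the excerpt above matches the binary entropy function, which measures how much information a single pass/fail outcome carries at a given pass rate. Below is a minimal Python sketch, assuming each episode yields exactly one binary outcome. The function name `binary_entropy_bits` is illustrative and does not come from the original thread.

```python
import math


def binary_entropy_bits(p: float) -> float:
    """Shannon entropy, in bits, of a pass/fail outcome with pass probability p."""
    if p <= 0.0 or p >= 1.0:
        # A certain outcome carries no information.
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


if __name__ == "__main__":
    # Entropy rises toward a 50% pass rate and falls off symmetrically.
    for p in (0.01, 0.10, 0.50, 0.90, 0.99):
        print(f"pass rate {p:.2f}: {binary_entropy_bits(p):.3f} bits per outcome")
```

The function reaches its maximum of exactly 1 bit at a pass rate of 0.5 and drops toward 0 as the outcome becomes predictable. That is one way to read the claim that an evenly matched opponent yields the most signal per game.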

Trying to make Claude be good but still have work to do. Job is safe for now. https://x.com/AmandaAskell/status/1990615465539027318

When people came to me with relationship problems, my first question was usually “and what happened when you said all this to your partner?”. Now, when people come to me with Claude problems, my first question is usually “and what happened when you said all this to Claude?” https://x.com/AmandaAskell/status/1990256427496284253

Why Anthropic CEO Dario Amodei spends so much time warning of AI’s potential dangers – CBS News https://www.cbsnews.com/news/anthropic-ceo-dario-amodei-warning-of-ai-potential-dangers-60-minutes-transcript/?intcid=CNR-02-0623

It’s hard to overstate how much good Open Philanthropy (now Coefficient Giving) has done to date. They are most of the smart money in many of the biggest problems in the world. For example, the farmed animal welfare fundraiser I did earlier this year with @Lewis_Bollard raised https://x.com/dwarkesh_sp/status/1990827685090897960

What if the loved ones we’ve lost could be part of our future? https://x.com/CalumWorthy/status/1988283207138324487?s=20

The thing about the AI & water discussion (water is not the issue it is made out to be) is that it takes away from a genuinely important environmental issue involving AI: power. Where we get the power from to run data centers will have a real impact, and is a real policy choice. https://x.com/emollick/status/1990911613961187723

Organization warns against giving AI toys to children – UPI.com https://www.upi.com/Top_News/US/2025/11/20/organization-warns-ai-toys/6681763667622/

Very interesting results of Pangram LLM detection on ICLR reviews and papers 👀 Thanks so much @gneubig @max_spero_ @bradley_emi 20% AI generated reviews 🫠 https://x.com/orionweller/status/1989723661524504951

Here’s how you get Secret Cyborgs: Cool experiment shows when workers know their AI use is seen by HR, they use it less, even though it significantly hurts their performance. Workers are willing to be wrong just to signal judgement. A challenge for leaders who want AI adoption. https://x.com/emollick/status/1989510820607213794

Brett Adcock accused UBTech of faking its “hundreds delivered” Walker S2 video. UBTech has published another “behind the scenes” video of the humanoid robot fleet saying, “They said it looked too perfect to be real. But perfection isn’t fabricated–it’s delicately engineered.” https://x.com/TheHumanoidHub/status/1989357328999813464

Look at the reflections on this bot, then compare them to the ones behind it. The bot in front is real – everything behind it is fake If you see a head unit reflecting a bunch of ceiling lights, that’s a giveaway it’s CGI https://x.com/adcock_brett/status/1989019691004883205

So Google now offers at least four different ways to talk to chatbots about academic research and all of them operate differently and none of them work with each other. A lot of power there, but not a lot of clarity about how they differ in approaches and which models they use. https://x.com/emollick/status/1991230502641000504

Starting today, we’re making it easier for everyone to verify if an image was created or edited with Google AI with SynthID, our digital watermarking technology, right in @GeminiApp. https://x.com/Google/status/1991552943372578850

Edge for Business presents: the world’s first secure enterprise AI browser – Microsoft Edge Blog https://blogs.windows.com/msedgedev/2025/11/18/edge-for-business-presents-the-worlds-first-secure-enterprise-ai-browser/

inference is perhaps the most valuable emerging software category. as models get smarter and more economically valuable, compute will increasingly be spent drawing samples from the models. if you’d like to work on inference at openai, reach out – gdb@openai.com. include a https://x.com/gdb/status/1990507010769760394

Today, we’re announcing a first-of-its-kind partnership that brings enterprise-grade AI to the United States Government. Enterprise Pro for Government offers agencies powerful, model-agnostic AI with secure-by-default guarantees for all federal users. https://x.com/perplexity_ai/status/1991162990536937821

Is your LM secretly an SAE? Most circuit-finding interpretability methods use learned features rather than raw activations, based on the belief that neurons do not cleanly decompose computation. In our new work, we show MLP neurons actually do support sparse, faithful circuits! https://x.com/TransluceAI/status/1991582415891099793

it’s honestly pretty refreshing how unprotective xAI is about their model details they’re just like yeah it’s a big chungus MoE, what did you expect https://x.com/willccbb/status/1990472997178913188