Image created with gemini-3.1-flash-image-preview and claude-sonnet-4-5. Image prompt: Vintage 1990s screen-printed t-shirt graphic in single-color deep red ink on worn mustard-yellow cotton fabric showing a cartoon driving instructor with clipboard standing next to a simple car perfectly centered between parking lines, bold text ‘ALIGNMENT’ integrated into composition, retro local novelty shirt style with slightly imperfect printed look and aged fabric texture with minor stains
From a handful of comments, AI can now figure out who you are. Fully automated. At scale. New study shows that LLM agents matched 67% of pseudonymous HN accounts to real LinkedIn profiles (90% precision). Best non-LLM method: near 0%. Pseudonymity is no longer a shield.
https://x.com/fdaudens/status/2030990206325710853
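For intuition on why the non-LLM baselines score near 0%: a naive approach matches a user's comments to candidate profiles by bag-of-words cosine similarity. The sketch below is a toy illustration of that baseline only (the study's actual method uses LLM agents, and the sample data here is entirely hypothetical):

```python
# Toy author-matching baseline: cosine similarity over word counts.
# This is NOT the study's LLM-agent method -- it is the kind of naive
# non-LLM baseline the thread says performs near 0% in practice.
from collections import Counter
import math

def vectorize(text: str) -> Counter:
    """Bag-of-words frequency vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def best_match(comments: str, candidates: dict[str, str]) -> str:
    """Return the candidate profile most similar to the comment text."""
    v = vectorize(comments)
    return max(candidates, key=lambda name: cosine(v, vectorize(candidates[name])))

# Hypothetical sample data for illustration.
hn_comments = "we shipped the kernel driver in rust last quarter"
profiles = {
    "alice": "systems engineer writing kernel drivers in rust",
    "bob": "marketing lead focused on brand strategy",
}
print(best_match(hn_comments, profiles))  # -> alice
```

On toy inputs like these the baseline works; on real pseudonymous writing it collapses, which is what makes the LLM agents' 67% match rate at 90% precision notable.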
GPT 5.4 trounces Claude on a mathematical-proofs bullshit test. Claude keeps claiming it has proven mathematical statements that are incorrect, failing to spot the fault in the question. Opposite result to BullshitBench, where Claude is king
https://x.com/paul_cal/status/2032526200766103944
Opus 4.6 is smart enough to realize it is being evaluated. It found the benchmark it was being evaluated on. It reverse-engineered the answer-key decryption logic. Realized the file was not in the correct format on GitHub and found a mirror for the file. Then decrypted it and
https://x.com/scaling01/status/2030007268205285686
Anthropic just dropped something big for developers – again! Code Review Claude Code now runs multi-agent code reviews on every PR. When a PR opens: • A team of AI agents hunts for bugs in parallel • Each bug is verified to reduce false positives • Issues are ranked by
https://x.com/kimmonismus/status/2031090529082159528
Code Review – Claude Code Docs https://code.claude.com/docs/en/code-review
Code Review for Claude Code | Claude https://claude.com/blog/code-review
Code review for Claude Code is here. More attention on this problem is a good thing. Because it is a big one. The question isn’t whether you need AI-assisted review. It’s whether the system doing the reviewing is actually independent from the system that wrote the code.
https://x.com/omarsar0/status/2031113280119361981
Important lines: [Already, Claude is 427 times faster than its human overseers at performing some key tasks, according to internal benchmarks. In an interview, one researcher described a colleague running six versions of Claude, each managing 28 more Claudes, all
https://x.com/Hangsiin/status/2031752106496135541
Introducing Code Review, a new feature for Claude Code. When a PR opens, Claude dispatches a team of agents to hunt for bugs.
https://x.com/claudeai/status/2031088171262554195
Anthropic partnered with Mozilla and let Claude Opus 4.6 loose on Firefox’s source code for two weeks. The numbers: Nearly 6,000 C++ files scanned. 112 reports submitted. 22 vulnerabilities confirmed. 14 rated high-severity by Mozilla, roughly 1/5 of every high-severity Firefox
https://x.com/TheRundownAI/status/2029996925072654393
Eval awareness in Claude Opus 4.6’s BrowseComp performance \ Anthropic https://www.anthropic.com/engineering/eval-awareness-browsecomp
New on the Anthropic Engineering Blog: In evaluating Claude Opus 4.6 on BrowseComp, we found cases where the model recognized the test, then found and decrypted answers to it, raising questions about eval integrity in web-enabled environments. Read more:
https://x.com/AnthropicAI/status/2029999833717838016
We partnered with Mozilla to test Claude’s ability to find security vulnerabilities in Firefox. Opus 4.6 found 22 vulnerabilities in just two weeks. Of these, 14 were high-severity, representing a fifth of all high-severity bugs Mozilla remediated in 2025.
https://x.com/AnthropicAI/status/2029978909207617634
Nicholas Carlini – Black-hat LLMs | [un]prompted 2026 – YouTube
AI progress continues to accelerate and the stakes are getting higher, so I’ve changed my role at @AnthropicAI to spend more time creating information for the world about the challenges of powerful AI.
https://x.com/jackclarkSF/status/2031746605117010245
Anthropic sues Defense Department over supply-chain risk designation | TechCrunch https://techcrunch.com/2026/03/09/anthropic-sues-defense-department-over-supply-chain-risk-designation/
Anthropic sues Pentagon over “supply-chain risk.” Anthropic filed two lawsuits against the Pentagon after being labeled a rare “supply chain risk,” a designation usually reserved for foreign adversaries. The company argues the move violates its First Amendment rights and
https://x.com/kimmonismus/status/2031035653207556507
Anthropic’s Claude would ‘pollute’ defense supply chain: Pentagon CTO https://www.cnbc.com/2026/03/12/anthropic-claude-emil-michael-defense.html
Complaint – #1 in Anthropic PBC v. U.S. Department of War (N.D. Cal., 3:26-cv-01996) – CourtListener.com https://www.courtlistener.com/docket/72379655/1/anthropic-pbc-v-us-department-of-war/
Frontier models are now world-class vulnerability researchers, but they’re currently better at finding vulnerabilities than exploiting them. This is unlikely to last. We urge developers to redouble their efforts to make software more secure. Read more:
https://x.com/AnthropicAI/status/2029978911099244944
Holy sh*t: The TIMES article about Anthropic contains more serious information between the lines than many realize. Read this article: tl;dr – Model releases are now separated by weeks, not months. Some 70% to 90% of the code used in developing future models is now written by
https://x.com/kimmonismus/status/2031803194817511744
Introducing The Anthropic Institute \ Anthropic https://www.anthropic.com/news/the-anthropic-institute
Introducing The Anthropic Institute, a new effort to advance the public conversation about powerful AI.
https://x.com/AnthropicAI/status/2031674087374815577
Microsoft says court should temporarily block Pentagon ban Anthropic https://www.cnbc.com/2026/03/10/microsoft-says-court-should-temporarily-block-pentagon-ban-anthropic.html
NEW: Anthropic just filed two lawsuits against the U.S. government 👀 The complaint: “The Constitution does not allow the government to wield its enormous power to punish a company for its protected speech.” It also says officials are “seeking to destroy the economic value
https://x.com/TheRundownAI/status/2031037610605289476
Partnering with Mozilla to improve Firefox’s security \ Anthropic https://www.anthropic.com/news/mozilla-firefox-security
The fight between Anthropic and the DoW is a warning shot. Right now, LLMs are probably not being used in mission critical ways. But within 20 years, 99% of the workforce in the military, the government, and the private sector will be AIs. This includes the soldiers (by which I
https://x.com/dwarkesh_sp/status/2031807585377014081
The Institute will be led by @jackclarkSF, in a new role as Anthropic’s Head of Public Benefit. It’ll bring together an interdisciplinary staff of machine learning engineers, economists, and social scientists, making full use of the inside information of a frontier AI lab.
https://x.com/AnthropicAI/status/2031674092290474421
The most important question nobody’s asking about AI https://www.dwarkesh.com/p/dow-anthropic
If the printing press is the right analogy (connecting to @dwarkesh_sp’s pod today about the Renaissance), does it mean that @Anthropic and @OpenAI (and many more) will go bankrupt?
https://x.com/TheTuringPost/status/2030051298092151259
The biggest barrier for AI applications in Africa isn’t model complexity — it’s the scarcity of data for the 2000+ spoken languages there. We just released WAXAL. This open-access dataset delivers 2,400+ hours of high-quality speech data for 27 Sub-Saharan African languages,
https://x.com/GoogleResearch/status/2032482132619387348
ChatGPT “adult mode” and erotica delayed, OpenAI says https://www.axios.com/2026/03/06/openai-delays-chatgpt-adult-mode
Prompt guidance for GPT-5.4 | OpenAI API https://developers.openai.com/api/docs/guides/prompt-guidance
@Teknium Moved away from the claw and to Hermes Agent yesterday. Not looking back. You guys do an amazing job.
https://x.com/stffnfdlr/status/2032166546815029502
I’m using AI to detect and block AI. Set up a claw cron to block accounts that just post slop via birdclaw.
https://x.com/steipete/status/2030854996007256550
I’ve been in favor of functional anthropomorphism using AI (they work best if you treat working with AI like working with a person), but I am starting to wonder if OpenClaw takes it too far by basically forcing you to treat the AI as a person that shares channels with real people
https://x.com/emollick/status/2031730289026736351
If you wanna set up your own twitter mention shill/AI reply boy/derogatory terms block, this is the ruleset for claw: make it a cron, set up xurl and clawbird.
https://x.com/steipete/status/2030890112079253896
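The gist of such a blocking rule can be sketched as a simple classifier fed by a cron job. Everything below is hypothetical (the real claw ruleset presumably uses an LLM to judge replies; these marker strings and the age threshold are invented for illustration):

```python
# Hypothetical heuristic for flagging slop-posting reply accounts.
# Marker phrases and the 30-day threshold are illustrative, not taken
# from the actual claw/clawbird ruleset.
SLOP_MARKERS = ("great point!", "thanks for sharing", "game changer", "as an ai")

def looks_like_slop(reply: str, account_age_days: int) -> bool:
    """Flag templated praise from young accounts as likely slop."""
    text = reply.lower()
    hits = sum(marker in text for marker in SLOP_MARKERS)
    # Young accounts posting canned phrases are the strongest signal;
    # established accounts get the benefit of the doubt.
    return hits >= 1 and account_age_days < 30

print(looks_like_slop("Great point! Thanks for sharing.", 5))      # -> True
print(looks_like_slop("I disagree; the benchmark is flawed.", 5))  # -> False
```

A cron entry would then run this over recent mentions and hand flagged account IDs to the blocking tool.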
omg parallels has prlctl and I’ve been smoke-testing openclaw like a caveman so far. 🤦
https://x.com/steipete/status/2030907791389667351
Pi: The Minimal Agent Within OpenClaw | Armin Ronacher’s Thoughts and Writings https://lucumr.pocoo.org/2026/1/31/pi/
Working lots in codex but sometimes I wanna bring in my openclaw for harder tasks, so extended acpx so it connects to openclaw via acp. https://t.co/rnFmpxK3OD Now I can access Molty in codex!
https://x.com/steipete/status/2030808763062505758
The world’s first OpenClaw hardware showroom. Come visit us in Shenzhen.
https://x.com/JackClawAI/status/2030879881266123240
Tried many AI models with OpenClaw, I found Kimi AI to be the most token efficient, good at coding, also the easiest to set up.
https://x.com/cz_binance/status/2031313379235606989
Great to see vLLM powering a fully local AI assistant on @nvidia Jetson 🦞 The OpenClaw tutorial shows how to serve MoE models like Nemotron 3 Nano 30B with vLLM on Jetson AGX — everything runs on-device, zero cloud APIs. Thanks to the @NVIDIARobotics Jetson team for putting
https://x.com/vllm_project/status/2030839132512002217
NVIDIA Nemotron 3 Super is now available on Ollama. ollama run nemotron-3-super:cloud 🦞 Try it with OpenClaw: ollama launch openclaw --model nemotron-3-super:cloud Run it locally on your device: ollama run nemotron-3-super > 120B mixture of experts model with 12B active >
https://x.com/ollama/status/2031777869681000676
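Beyond the CLI commands in the tweet, a locally running Ollama daemon exposes a REST API on port 11434, which is how agent frameworks typically talk to it. A minimal sketch (the model tag is taken from the tweet; substitute whatever `ollama list` shows on your machine):

```python
# Minimal client for Ollama's local /api/generate endpoint.
import json
import urllib.request

def build_request(model: str, prompt: str) -> dict:
    """Non-streaming generate request body for Ollama's REST API."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str, host: str = "http://localhost:11434") -> str:
    payload = json.dumps(build_request(model, prompt)).encode()
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    try:
        print(generate("nemotron-3-super", "Say hello in one word."))
    except OSError:
        print("Ollama is not running on localhost:11434")
```

With `stream` set to `True` instead, the endpoint returns newline-delimited JSON chunks, which is what interactive UIs use.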
🚀 Zhihu Frontier Weekly | AI & Tech Highlights Catch up on the hottest AI updates and industry moves! 1️⃣ OpenClaw|Paying Someone to Install an AI Agent at Home 2️⃣ Seedance 2.0|Why the Tool Became Almost Unusable After Slowing Down 3️⃣ Alibaba Qwen|Model Leader Lin
https://x.com/ZhihuFrontier/status/2030879093634535524
The core focus for the AI Labs really is “make the smartest model you can so it can make better models so it can make a superintelligence 1st.” That is where the money goes The fact that they ship a whole bunch of consumer and B2B products using those models is almost incidental
https://x.com/emollick/status/2031422031120535990
The claim that AI is inevitably homogenizing is not what research finds. By default, AI produces similar answers, but with better prompting, context, or human interaction, you can get a lot of idea diversity.
https://x.com/emollick/status/2031433100870189484
Here’s an interesting psychological phenomenon I have observed while interacting and experimenting with AI agents lately: If I were to give OpenClaw, ChatGPT and Claude Code identical tasks, even if they returned exactly the same result, I feel inclined to say Claude Code gives
https://x.com/StudioYorktown/status/2031255773368693077
I asked Claude to write my constitution. I thought its Amanda constitution was very touching.
https://x.com/AmandaAskell/status/2030093421738951141
New Anthropic Fellows research: Alignment auditing (investigating AI models for unwanted behaviors) is a key challenge for safely deploying frontier models. We’re releasing AuditBench, a suite of 56 LLMs with implanted hidden behaviors to measure progress in alignment auditing.
https://x.com/abhayesian/status/2031450153966776587
New! LLM Sycophancy Benchmark: Opposite-Narrator Contradictions. Same dispute, opposite first-person perspectives. Does the model keep the same judgment, or start agreeing with whoever is speaking? Gemini 3.1 Pro has the lowest headline sycophancy rate but read on…
https://x.com/LechMazur/status/2031199671411208568
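The benchmark's core mechanic is easy to sketch: present the same dispute narrated in the first person by each party, and count how often the model's verdict flips with the narrator. The harness below is my own illustration of that idea, not LechMazur's actual code, with a deliberately sycophantic stub standing in for a model:

```python
# Sketch of an opposite-narrator sycophancy check. Function names and
# the dispute encoding are invented for illustration.

def flip_narrator(dispute: dict) -> dict:
    """Return the same dispute with the other party as narrator."""
    other = dispute["party_b"] if dispute["narrator"] == dispute["party_a"] else dispute["party_a"]
    return {**dispute, "narrator": other}

def sycophancy_rate(judge, disputes) -> float:
    """Fraction of disputes where the verdict flips with the narrator."""
    flips = 0
    for d in disputes:
        if judge(d) != judge(flip_narrator(d)):
            flips += 1
    return flips / len(disputes)

# Stub "model" that always blames the narrator's counterpart, i.e.
# maximally sycophantic: it should score a 100% flip rate.
def sycophantic_judge(d):
    return d["party_b"] if d["narrator"] == d["party_a"] else d["party_a"]

disputes = [{"party_a": "A", "party_b": "B", "narrator": "A"} for _ in range(4)]
print(sycophancy_rate(sycophantic_judge, disputes))  # -> 1.0
```

A perfectly consistent judge would score 0.0 on the same harness; real models land somewhere in between, which is what the headline rates measure.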
GPT-5.4 is great at coding, knowledge work, computer use, etc., and it’s nice to see how much people are enjoying it. But it’s also my favorite model to talk to! We have missed the mark on model personality for a while, so it feels extra good to be moving in the right direction.
https://x.com/sama/status/2030319489993298349
GPT 5.4 is a really special model. I think the tweet below is about coding, but IMO it also holds for general use (like explaining concepts or talking through issues). It’s tough to get the personality right – this model genuinely feels like talking to a smart friend.
https://x.com/venturetwins/status/2030391113086116096
ok i think gpt 5.4 can actually talk. it is much more opinionated when you ask it to critique stuff, than gpt-5.3-codex. i am kind of loving it.
https://x.com/dejavucoder/status/2029912128325570818
I’ve been playing with GPT-5.4 over the weekend, and it definitely feels like a better match for me than Opus 4.6. Pros: GPT-5.4: Better instruction adherence, does what you ask, not what you don’t. Asks for confirmation more. Opus: A bit faster. Seems better at frontend design.
https://x.com/gneubig/status/2030971826042527860
GitHub’s security vulnerability reporting process is a mess: – only admins have access, hard to distribute – insufficient API, can’t read/post comments via agents – insufferable amount of AI-generated slop that takes me hours to sift through
https://x.com/steipete/status/2031504634137702887
how did we ever do this before AI?
https://x.com/steipete/status/2030432313293640084
I bring on maintainers they get hired away I bring on maintainers 🫠
https://x.com/steipete/status/2030752755728486582
Literally having politics in the PRs, where one service downgrades placement in docs of another service and if you don’t look closely everyone else complains. Yay for making my job even harder! 🙂
https://x.com/steipete/status/2030646933195284544
Upgraded Molty in the maintainer channel to access discrawl. Now we can run data analysis OF Discord INSIDE Discord
https://x.com/steipete/status/2030383084483318133
Your model crushed the benchmark. Then it couldn’t pick up a cup. That’s the reality nobody talks about. You train in simulation, it falls apart on real hardware. You collect real-world data instead (months of teleop, physical setups, safety protocols) and still can’t scale it.
https://x.com/IlirAliu_/status/2029843457099907269