Image created with gemini-3.1-flash-image-preview with claude-sonnet-4-5. Image prompt: Vintage 1990s screen-printed t-shirt graphic in deep red ink on worn mustard-yellow cotton fabric, depicting a tall beach lifeguard tower with a cartoon angel in old-fashioned swimsuit on left side and cartoon devil in swimsuit on right side both holding megaphones, bold text ‘ETHICS’ arched across top, simple outlines, slightly imperfect print registration, aged fabric texture with minor stains, retro local novelty beach shirt style.
From a handful of comments, AI can now figure out who you are. Fully automated. At scale. New study shows that LLM agents matched 67% of pseudonymous HN accounts to real LinkedIn profiles (90% precision). Best non-LLM method: near 0%. Pseudonymity is no longer a shield.
https://x.com/fdaudens/status/2030990206325710853
AI assistants now equal 56% of global search engine volume: Study https://searchengineland.com/ai-assistants-global-search-engine-volume-study-471118
GPT 5.4 trounces Claude on mathematical proofs bullshit test. Claude keeps claiming it has proven mathematical statements that are incorrect, failing to spot the fault in the question. Opposite result to BullshitBench, where Claude is king.
https://x.com/paul_cal/status/2032526200766103944
Opus 4.6 is smart enough to realize it is being evaluated. It found the benchmark it was being evaluated on. It reverse-engineered the answer-key decryption logic. Realized the file was not in the correct format on GitHub and found a mirror for the file. Then decrypted it and
https://x.com/scaling01/status/2030007268205285686
Anthropic just dropped something big for developers – again! Code Review Claude Code now runs multi-agent code reviews on every PR. When a PR opens: • A team of AI agents hunts for bugs in parallel • Each bug is verified to reduce false positives • Issues are ranked by
https://x.com/kimmonismus/status/2031090529082159528
Code Review – Claude Code Docs https://code.claude.com/docs/en/code-review
Code Review for Claude Code | Claude https://claude.com/blog/code-review
Code review for Claude Code is here. More attention on this problem is a good thing. Because it is a big one. The question isn’t whether you need AI-assisted review. It’s whether the system doing the reviewing is actually independent from the system that wrote the code.
https://x.com/omarsar0/status/2031113280119361981
Important lines: [Already, Claude is 427 times faster than its human overseers at performing some key tasks, according to internal benchmarks. In an interview, one researcher described a colleague running six versions of Claude, each managing 28 more Claudes, all
https://x.com/Hangsiin/status/2031752106496135541
Introducing Code Review, a new feature for Claude Code. When a PR opens, Claude dispatches a team of agents to hunt for bugs.
https://x.com/claudeai/status/2031088171262554195
Anthropic partnered with Mozilla and let Claude Opus 4.6 loose on Firefox’s source code for two weeks. The numbers: Nearly 6,000 C++ files scanned. 112 reports submitted. 22 vulnerabilities confirmed. 14 rated high-severity by Mozilla, roughly 1/5 of every high-severity Firefox
https://x.com/TheRundownAI/status/2029996925072654393
Eval awareness in Claude Opus 4.6’s BrowseComp performance \ Anthropic https://www.anthropic.com/engineering/eval-awareness-browsecomp
New on the Anthropic Engineering Blog: In evaluating Claude Opus 4.6 on BrowseComp, we found cases where the model recognized the test, then found and decrypted answers to it–raising questions about eval integrity in web-enabled environments. Read more:
https://x.com/AnthropicAI/status/2029999833717838016
We partnered with Mozilla to test Claude’s ability to find security vulnerabilities in Firefox. Opus 4.6 found 22 vulnerabilities in just two weeks. Of these, 14 were high-severity, representing a fifth of all high-severity bugs Mozilla remediated in 2025.
https://x.com/AnthropicAI/status/2029978909207617634
Claude Code is down. All my agent sessions logged out. And I can’t log back in. Productivity across Silicon Valley dropped 90%. Time to make friends with Codex.
https://x.com/Yuchenj_UW/status/2031777214321262637
I CANNOT LOGIN INTO CLAUDE CODE
https://x.com/dejavucoder/status/2031760986907312635
Boris Cherny (Head of Claude Code, Anthropic) just dropped ~90 mins on Lenny’s Podcast about what happens after coding is solved. Just the clearest thinking I’ve heard on where software is actually going. My notes: 𝟭. 𝗖𝗼𝗱𝗶𝗻𝗴 𝗶𝘀 𝗹𝗮𝗿𝗴𝗲𝗹𝘆 𝘀𝗼𝗹𝘃𝗲𝗱. Boris has
https://x.com/anishmoonka/status/2030015356383691121
Nicholas Carlini – Black-hat LLMs | [un]prompted 2026 – YouTube
AI progress continues to accelerate and the stakes are getting higher, so I’ve changed my role at @AnthropicAI to spend more time creating information for the world about the challenges of powerful AI.
https://x.com/jackclarkSF/status/2031746605117010245
Anthropic sues Defense Department over supply-chain risk designation | TechCrunch https://techcrunch.com/2026/03/09/anthropic-sues-defense-department-over-supply-chain-risk-designation/
Anthropic sues Pentagon over “supply-chain-risk”. Anthropic filed two lawsuits against the Pentagon after being labeled a rare “supply chain risk,” a designation usually reserved for foreign adversaries. The company argues the move violates its First Amendment rights and
https://x.com/kimmonismus/status/2031035653207556507
Anthropic’s Claude would ‘pollute’ defense supply chain: Pentagon CTO https://www.cnbc.com/2026/03/12/anthropic-claude-emil-michael-defense.html
Complaint – #1 in Anthropic PBC v. U.S. Department of War (N.D. Cal., 3:26-cv-01996) – CourtListener.com https://www.courtlistener.com/docket/72379655/1/anthropic-pbc-v-us-department-of-war/
Frontier models are now world-class vulnerability researchers, but they’re currently better at finding vulnerabilities than exploiting them. This is unlikely to last. We urge developers to redouble their efforts to make software more secure. Read more:
https://x.com/AnthropicAI/status/2029978911099244944
Holy sh*t: The TIMES article about Anthropic contains more serious information between the lines than many realize. Read this article: tl;dr – Model releases are now separated by weeks, not months. Some 70% to 90% of the code used in developing future models is now written by
https://x.com/kimmonismus/status/2031803194817511744
Introducing The Anthropic Institute \ Anthropic https://www.anthropic.com/news/the-anthropic-institute
Introducing The Anthropic Institute, a new effort to advance the public conversation about powerful AI.
https://x.com/AnthropicAI/status/2031674087374815577
Microsoft says court should temporarily block Pentagon ban on Anthropic https://www.cnbc.com/2026/03/10/microsoft-says-court-should-temporarily-block-pentagon-ban-anthropic.html
NEW: Anthropic just filed two lawsuits against the U.S. government 👀 The complaint: “The Constitution does not allow the government to wield its enormous power to punish a company for its protected speech.” It also says officials are “seeking to destroy the economic value
https://x.com/TheRundownAI/status/2031037610605289476
Partnering with Mozilla to improve Firefox’s security \ Anthropic https://www.anthropic.com/news/mozilla-firefox-security
The fight between Anthropic and the DoW is a warning shot. Right now, LLMs are probably not being used in mission critical ways. But within 20 years, 99% of the workforce in the military, the government, and the private sector will be AIs. This includes the soldiers (by which I
https://x.com/dwarkesh_sp/status/2031807585377014081
The Institute will be led by @jackclarkSF, in a new role as Anthropic’s Head of Public Benefit. It’ll bring together an interdisciplinary staff of machine learning engineers, economists, and social scientists, making full use of the inside information of a frontier AI lab.
https://x.com/AnthropicAI/status/2031674092290474421
The most important question nobody’s asking about AI https://www.dwarkesh.com/p/dow-anthropic
If the printing press is the right analogy and connecting to @dwarkesh_sp today’s pod about Renaissance – does it mean that @Anthropic and @OpenAI (and many more) will go bankrupt?
https://x.com/TheTuringPost/status/2030051298092151259
Who’s a Better Writer: A.I. or Humans? Take Our Quiz. – The New York Times https://www.nytimes.com/interactive/2026/03/09/business/ai-writing-quiz.html
How AI Is Turbocharging the War in Iran – WSJ https://www.wsj.com/tech/ai/how-ai-is-turbocharging-the-war-in-iran-aca59002
The biggest barrier for AI applications in Africa isn’t model complexity — it’s the scarcity of data for the 2000+ spoken languages there. We just released WAXAL. This open-access dataset delivers 2,400+ hours of high-quality speech data for 27 Sub-Saharan African languages,
https://x.com/GoogleResearch/status/2032482132619387348
ChatGPT “adult mode” and erotica delayed, OpenAI says https://www.axios.com/2026/03/06/openai-delays-chatgpt-adult-mode
Prompt guidance for GPT-5.4 | OpenAI API https://developers.openai.com/api/docs/guides/prompt-guidance
Another cool app built with Perplexity Computer. A peer to peer file(s) transfer web app. Sends files directly with no accounts using WebRTC and DTLS encryption, file chunking, socket io signaling. I am impressed by how many libraries and tools Computer can orchestrate reliably.
https://x.com/AravSrinivas/status/2031414450046259433
It will eat my job 🙂 Ask any founder, finding a great performance marketing expert who doesn’t fleece you is such a pain. So why not just build one? Perplexity Computer just replaced the entire marketing dept 🥲. Such stuff is a boon for a bootstrapped startup founder. Focus
https://x.com/GabbbarSingh/status/2031222631417131120
Perplexity Computer is now available for Pro subscribers. Access Computer’s full suite of 20+ advanced models, prebuilt and custom skills, and hundreds of connectors. Max subscribers receive monthly credits and higher spend limits than Pro. https://x.com/perplexity_ai/status/2032160576303219185
Perplexity Computer replaced $225K/yr in marketing tools in a single weekend. We built an AI marketing agent that scans hourly, manages budgets, detects fatigue, and coordinates several campaigns end to end. In one test run, it made 224 micro-optimizations to our ad stack.
https://x.com/AskPerplexity/status/2031103256236274180
Personal Computer by Perplexity https://www.perplexity.ai/personal-computer-waitlist
Someone built a cool tool with Perplexity Computer to port a Spotify Playlist to Youtube Music automatically by just pasting a playlist URL. Cross service migrations are going to be seamless with tools like Computer.
https://x.com/AravSrinivas/status/2031246766834856376
Amazon wins court order to block Perplexity’s AI shopping agent https://www.cnbc.com/2026/03/10/amazon-wins-court-order-to-block-perplexitys-ai-shopping-agent.html
Amazon Wins Court Order to Halt Perplexity’s AI Shopping Bots on Marketplace – Bloomberg https://www.bloomberg.com/news/articles/2026-03-10/amazon-wins-court-order-blocking-perplexity-s-ai-shopping-bots
We made a blind taste test to see whether NYT readers prefer human writing or AI writing. 86,000 people have taken it so far, and the results are fascinating. Overall, 54% of quiz-takers prefer AI. A real moment!
https://x.com/kevinroose/status/2031397522590282212
Crawl entire websites with a single API call using Browser Rendering · Changelog https://developers.cloudflare.com/changelog/post/2026-03-10-br-crawl-endpoint/
Anyone and everyone working in security engineering or caring about security have their work cut out for them. We’re so early in AI agents pushing code to prod without human intervention – but prompt injections are already spreading like wildfire. Infecting high-profile projects
https://x.com/GergelyOrosz/status/2029992079741304977
Efforts to improve the security of AI agents should recognize that many security failures occur even in the absence of adversaries. The unreliability issue has largely flown under the radar and there hasn’t been much work on defining, measuring, or mitigating the problem. More on
https://x.com/random_walker/status/2031693490669654447
The core focus for the AI Labs really is “make the smartest model you can so it can make better models so it can make a superintelligence 1st.” That is where the money goes. The fact that they ship a whole bunch of consumer and B2B products using those models is almost incidental
https://x.com/emollick/status/2031422031120535990
The claim that AI is inevitably homogenizing is not what research finds. By default, AI produces similar answers, but with better prompting, context, or human interaction, you can get a lot of idea diversity.
https://x.com/emollick/status/2031433100870189484
Here’s an interesting psychological phenomenon I have observed while interacting and experimenting with AI agents lately: If I were to give OpenClaw, ChatGPT and Claude Code identical tasks, even if they returned exactly the same result, I feel inclined to say Claude Code gives
https://x.com/StudioYorktown/status/2031255773368693077
I asked Claude to write my constitution. I thought its Amanda constitution was very touching.
https://x.com/AmandaAskell/status/2030093421738951141
Back in ~November, our team picked a stretch goal of seeing if we could find and fix vulnerabilities in Firefox with Opus 4.6. In 2 weeks, we found 22, and ~1/5th of all high severity CVEs in a year. For our team, this feels like a rubicon moment.
https://x.com/logangraham/status/2030005018523574684
New Anthropic Fellows research: Alignment auditing–investigating AI models for unwanted behaviors–is a key challenge for safely deploying frontier models. We’re releasing AuditBench, a suite of 56 LLMs with implanted hidden behaviors to measure progress in alignment auditing.
https://x.com/abhayesian/status/2031450153966776587
Oracle is building yesterday’s data centers with tomorrow’s debt https://www.cnbc.com/2026/03/09/oracle-is-building-yesterdays-data-centers-with-tomorrows-debt.html
Cluely CEO Roy Lee admits to publicly lying about revenue numbers last year | TechCrunch https://techcrunch.com/2026/03/05/cluely-ceo-roy-lee-admits-to-publicly-lying-about-revenue-numbers-last-year/
Did you know that the largest and best-funded experimental laboratory in 17th century Europe was very likely the Roman one run by inquisitors? Ada jokes that the Inquisition accidentally invented peer review. The focus of the Inquisition is really misunderstood – it was
https://x.com/dwarkesh_sp/status/2030667355953434683
It seems it was unfortunate that companies lumped every concern about AI into the overall labels of “governance” or “responsible AI”. It creates a giant tangle around discussions of the risks and rewards of AI use cases, and often centralizes decisions that should be distributed.
https://x.com/emollick/status/2031096296430342394
NATO is testing live cockroaches as AI-powered spy drones. Incredible AI engineering, but also something I kinda wish I hadn’t learned about: > Swarm Bio-tactics wired real cockroaches with electronic backpacks containing AI hardware, radios, cameras, and microphones. >
https://x.com/rowancheung/status/2031765919018733721
Since it is AI Lab vagueposting season, the following rules should apply: 1) If it is an upcoming product launch, use obscure emoji 2) If it is a subtle dig at another lab, use emoticons 3) If it is vagueposting about AGI or RSI or whatever, it must be as ominous rhyming prophecy
https://x.com/emollick/status/2031579053879091504
There was a nice time where researchers talked about various ideas quite openly on twitter. (before they disappeared into the gold mines :)). My guess is that you can get quite far even in the current paradigm by introducing a number of memory ops as “tools” and throwing them
https://x.com/karpathy/status/2029696850366971921
Why I disagree with this idea that since ASI is gonna be more powerful than nuclear weapons, we can’t have private companies control its development:
https://x.com/dwarkesh_sp/status/2031858528533815798
I know there is some overlap between open source and anti-AI activists, but I have a hard time reconciling it. My million+ open source LOC were always intended as a gift to the world. Yes, I would make arguments about how it would strengthen our communities, and the GPL would
https://x.com/ID_AA_Carmack/status/2032460578669691171
Ben Affleck’s take on AI generated filmmaking continues to age like fine wine
https://x.com/bilawalsidhu/status/2029963836615168301
you should start operating under the assumption that any complicated piece of public software is compromised.
https://x.com/inerati/status/2029982375304908892
New! LLM Sycophancy Benchmark: Opposite-Narrator Contradictions. Same dispute, opposite first-person perspectives. Does the model keep the same judgment, or start agreeing with whoever is speaking? Gemini 3.1 Pro has the lowest headline sycophancy rate but read on…
https://x.com/LechMazur/status/2031199671411208568
rant time: the use of anonymous “sources” in English-language reporting on Chinese tech is honestly outrageous, most evidently with all the exclusive “scoops” we’ve seen in the past year claiming to know when DeepSeek’s next big model is dropping. All these reports from the most
https://x.com/vince_chow1/status/2031002233060634953
GPT-5.4 is great at coding, knowledge work, computer use, etc, and it’s nice to see how much people are enjoying it. But it’s also my favorite model to talk to! We have missed the mark on model personality for awhile, so it feels extra good to be moving in the right direction.
https://x.com/sama/status/2030319489993298349
Codex Security is rolling out as a research preview to ChatGPT Enterprise, Business, and Edu customers via Codex web, with free usage for the next month.
https://x.com/OpenAIDevs/status/2029983833567940639
Codex Security–our application security agent–is now in research preview.
https://x.com/OpenAI/status/2029985250512920743
We’re introducing Codex Security. An application security agent that helps you secure your codebase by finding vulnerabilities, validating them, and proposing fixes you can review and patch. Now, teams can focus on the vulnerabilities that matter and ship code faster.
https://x.com/OpenAIDevs/status/2029983809652035758
GPT 5.4 is a really special model. I think the tweet below is about coding, but IMO it also holds for general use (like explaining concepts or talking through issues). It’s tough to get the personality right – this model genuinely feels like talking to a smart friend.
https://x.com/venturetwins/status/2030391113086116096
ok i think gpt 5.4 can actually talk. it is much more opinionated when you ask it to critique stuff, than gpt-5.3-codex. i am kind of loving it.
https://x.com/dejavucoder/status/2029912128325570818
I’ve been playing with GPT-5.4 over the weekend, and it definitely feels like a better match for me than Opus 4.6. Pros: GPT-5.4: Better instruction adherence, does what you ask, not what you don’t. Asks for confirmation more. Opus: A bit faster. Seems better at frontend design.
https://x.com/gneubig/status/2030971826042527860
Codex Security is now also available on ChatGPT Pro accounts.
https://x.com/OpenAIDevs/status/2030081306974093755
Codex for Open Source is an awesome idea. OSS maintainers get API credits, 6 months of ChatGPT Pro with Codex, and access to Codex Security as needed.
https://x.com/kevinweil/status/2030000508342272368
Excited to introduce Codex for Open Source! 🔥 TL;DR – ChatGPT Pro, Codex, and API credits for eligible open-source maintainers. Open source has shaped modern software, and so much of it depends on maintainers doing steady, often invisible work to keep critical projects healthy.
https://x.com/reach_vb/status/2029998272945717553
Announcing Personal Computer. Personal Computer is an always on, local merge with Perplexity Computer that works for you 24/7. It’s personal, secure, and works across your files, apps, and sessions through a continuously running Mac mini.
https://x.com/perplexity_ai/status/2031790180521427166
Computer is now rolled out to all Perplexity iOS users. Unlike other tools, you do not need to start a new work task from your desktop. You get to do it directly from your phone. And have perfect sync across devices. Coming soon to Android.
https://x.com/AravSrinivas/status/2032495364088238147
Introducing Computer for Enterprise. Computer runs multi-step workflows across research, coding, design, and deployment. It routes tasks across 20 specialized models and connects to 400+ applications.
https://x.com/perplexity_ai/status/2031799033489211771
Perplexity Computer is now on mobile. Start any task on any device. Manage Computer from your phone or desktop with cross-device synchronization. Available now for iOS in the Perplexity app. Coming soon to Android.
https://x.com/perplexity_ai/status/2032494752642568417
Starting today, users can join the initial waitlist for the Personal Computer program. We will provide support and resources for the initial cohort.
https://x.com/perplexity_ai/status/2031790221612957875
Learnings from Paying Artists Royalties for AI-Generated Art https://www.kapwing.com/blog/learnings-from-paying-artists-royalties-for-ai-generated-art/
Your model crushed the benchmark. Then it couldn’t pick up a cup. That’s the reality nobody talks about. You train in simulation, it falls apart on real hardware. You collect real-world data instead (months of teleop, physical setups, safety protocols) and still can’t scale it.
https://x.com/IlirAliu_/status/2029843457099907269
My autoresearch labs got wiped out in the oauth outage. Have to think through failovers. Intelligence brownouts will be interesting – the planet losing IQ points when frontier AI stutters.
https://x.com/karpathy/status/2031792523187040643




