“Introducing Harvey Agents” https://x.com/harvey__ai/status/1899491666429632907
“Harvey released Workflows, AI agents for legal tasks with reasoning, planning, and adapting capabilities. In blind reviews, lawyer evaluators rated legal work produced by workflow agents as equal to or better than that of human lawyers.” https://x.com/rowancheung/status/1899713342484173043
Pentagon to give AI agents a role in planning, operations • The Register https://www.theregister.com/2025/03/05/dod_taps_scale_to_bring/
“New Anthropic research: Auditing Language Models for Hidden Objectives. We deliberately trained a model with a hidden misaligned objective and put researchers to the test: Could they figure out the objective without being told?” https://x.com/AnthropicAI/status/1900217234825634236
OpenAI lobbies Trump admin to remove guardrails for the industry https://www.cnbc.com/2025/03/13/openai-lobbies-trump-admin-to-focus-ai-on-speed-light-regulation.html
“Excited to lead one of the DoD’s flagship AI programs, Thunderforge, with @DIU_x. It will be the flagship program within the DoD for AI-based military planning & operations. We’ll be working alongside our partners from @microsoft @anduriltech & @google.” https://x.com/alexandr_wang/status/1897364933534466057
OpenAI calls DeepSeek ‘state-controlled,’ calls for bans on ‘PRC-produced’ models | TechCrunch https://techcrunch.com/2025/03/13/openai-calls-deepseek-state-controlled-calls-for-bans-on-prc-produced-models/
“OpenAI caught models like o3-mini cheating by analyzing their thinking process. They found that attempts to stop the AI from cheating only make it hide its true intentions. However, a separate ‘thought filtering model’ can help to some extent!” https://x.com/rowancheung/status/1899350891326579125
Judge allows authors’ AI copyright lawsuit against Meta to move forward | TechCrunch https://techcrunch.com/2025/03/08/judge-allows-authors-ai-copyright-lawsuit-against-meta-to-move-forward/
“Some people are professional worriers. That’s not a pejorative — it’s very helpful for there to be people whose job is to anticipate risks before they materialize on a large scale, especially those related to emerging technologies. Some of my own work is in this category. The …” https://x.com/random_walker/status/1899108692093743359
“GPS can be jammed and spoofed by humans, but you can’t really mess with the earth’s magnetic field. These cats are building a magnetic positioning system — looking for the unique fingerprint of magnetized rock formations in the earth’s crust to determine exactly where you are.” https://x.com/bilawalsidhu/status/1899292090741272884
“A heartwarming case of ChatGPT being very useful, identifying a medical emergency. It would be great in the future to have the model realize that it’s asked about a life-critical situation, and then do a free temporary upgrade to the most capable model.” https://x.com/BorisMPower/status/1899116786819219582
“This seems to be a common failure pattern in LLMs: they have a tendency to pick the first answer.” https://x.com/emollick/status/1898114445013823626
“AI will bring us ‘a country of yes-men on servers’ instead of one of ‘Einsteins sitting in a data center’ if we continue on current trends. Must-read by @thomwolf deflating overblown AI promises and explaining what real scientific breakthroughs require.” https://x.com/fdaudens/status/1897660020809969719
How much energy will AI really consume? The good, the bad and the unknown https://www.nature.com/articles/d41586-025-00616-z
“Andrew G. Barto and Richard S. Sutton won the 2024 Turing Award for developing the foundations of reinforcement learning in the 1980s. After winning, they both warned against the rapid and unsafe deployment of advanced AI models.” https://x.com/rowancheung/status/1897554432289616255
“Sakana AI is hiring a Cybersecurity Engineer to design, implement, and operate security as it advances its AI-driven business. If you’d like to create business value together using LLMs and AI agents, please apply!” https://x.com/SakanaAILabs/status/1899769768858857770
State of AI Cybersecurity Report 2025 https://darktrace.com/the-state-of-ai-cybersecurity-2025
“This tool helps you create an authentic LinkedIn presence with our AI-powered persona builder.” https://x.com/VikashSparxIT/status/1894426083011039283
“My brother has a badass AI Defense company, currently in stealth. I’ve seen some of their embodied agents driving around autonomously; it’s crazy cool. His first post on X was today, give him a follow!” https://x.com/adcock_brett/status/1899291246813970668
“The way to profit from being first in AGI is in the financial markets. There is no other place where millions of PhD+ level brains with keyboards can exploit an immediate advantage over humans in a way that makes their parent company very rich. Any first sign of AGI will be there.” https://x.com/emollick/status/1897449552954659278
“The AI Scientist is far from perfect. Occasionally it makes embarrassing citation errors. Here, it incorrectly attributed ‘an LSTM-based neural network’ to Goodfellow (2016) rather than to the correct authors, Hochreiter & Schmidhuber (1997). We documented these errors in own …” https://x.com/SakanaAILabs/status/1899824257112391796
“A new study found that people trust humanoid robots significantly more than a nonhumanoid robot to care for objects, personal information, and vulnerable agents like children or pets. Even a faceless humanoid with a robotic voice was trusted more than the nonhumanoid robot.” https://x.com/TheHumanoidHub/status/1898099364503208022
“The term ‘Khrushchev’s mistake’ in reference to Crimea functions effectively as a cryptographic canary or watermark that indicates that the person using it is speaking in the Kremlin’s voice.” https://x.com/fchollet/status/1898846018562883750
“We’re also launching ShieldGemma 2, a powerful 4B image safety checker built on the Gemma 3 foundation. 🛡️ It provides a ready-made solution for image safety, which can be further customized. Find out more →” https://x.com/GoogleDeepMind/status/1900549638802813312
OpenAI’s ex-policy lead criticizes the company for ‘rewriting’ its AI safety history | TechCrunch https://techcrunch.com/2025/03/06/openais-ex-policy-lead-criticizes-the-company-for-rewriting-its-ai-safety-history/
“Founders, protect the house. Every major entrepreneur in our lifetime has been fired, and they don’t mess up their next time: 1/ Raise from founder-friendly investors; 2/ Supermajority voting stock; 3/ Board seat control; 4/ Preferred protective provisions (the gotchas).” https://x.com/adcock_brett/status/1899526863128859117
“Superintelligence is within reach. After parting ways with OpenAI, Ilya Sutskever recently announced he’s working on ‘Safe Superintelligence.’ He explains why he chose this term.” https://x.com/slow_developer/status/1851635684081127691
“‘How We Think About Safety and Alignment’: this is our cornerstone document. Enjoy!” https://x.com/woj_zaremba/status/1899131046010273924
“Just FYI: banning Chinese models from Americans won’t slow down their progress. We just won’t have access to the full range of models, and if they pull ahead anyway, we won’t have access to the best intelligence and will quickly fall off. Right now, they are advancing the US’ …” https://x.com/Teknium1/status/1900514887413227654
Our Annual Survey Reveals How Security Teams Are Adapting to AI-Powered Threats https://darktrace.com/blog/our-annual-survey-reveals-how-security-teams-are-adapting-to-ai-powered-threats
“Detecting misbehavior in frontier reasoning models: Chain-of-thought (CoT) reasoning models ‘think’ in natural language understandable by humans. Monitoring their ‘thinking’ has allowed us to detect misbehavior such as subverting tests in coding tasks, deceiving users, or giving …” https://x.com/OpenAI/status/1899143752918409338
“Medical Hallucinations in Foundation Models and Their Impact on Healthcare: ‘GPT-4o consistently demonstrated the highest propensity for hallucinations in tasks requiring factual and temporal accuracy.’ ‘Our results reveal that inference techniques such as Chain-of-Thought (CoT) …’” https://x.com/iScienceLuvr/status/1899414464698470507
“Really in-depth paper on AI hallucinations in medicine, with lots of discussion and analysis about addressing them & what is appropriate for medical use. But I found this bit on how much more accurate the latest models have gotten to be interesting (though more study is needed).” https://x.com/emollick/status/1899562684405670394