Image created with gemini-2.5-flash-image, with the prompt written by claude-sonnet-4-5. Image prompt: Photorealistic Times Square at golden hour with every billboard and screen displaying perfectly synchronized content showing identical human silhouettes in harmony, balanced scales, and geometric patterns all pointing to center, pedestrians looking up in wonder at the coordinated display, warm unified lighting, ultra-detailed 8k photography.
The jump from "agents are nowhere close to working" to "okay, narrow agents for research and coding work pretty well" to (very recently) "general-purpose agents are actually useful for a range of tasks" has been quick enough (less than a year) that most people have missed it. https://x.com/emollick/status/1972141975458796020
Maybe the most impressive part of the Sonnet 4.5 alignment information: not only can it push back, but it has a sophisticated theory of the user's mind. Other models can also speculate about the user's play (DeepSeek does that a lot) but aren't trained to treat it as actionable info. https://x.com/teortaxesTex/status/1973264029599842380
Very excited to see the Tinker release! @pcmoritz and I had a chance to experiment with the API. It does a nice job of providing flexibility while abstracting away GPU handling. Here's a simple example showing how to generate synthetic data and fine-tune a text-to-SQL model. https://x.com/robertnishihara/status/1973455582603649430
Tinker provides an abstraction layer that is the right one for post-training R&D: it's the infrastructure I've always wanted. I'm excited to see what people build with it. "Civilization advances by extending the number of important operations which we can perform without…" https://x.com/johnschulman2/status/1973450054238347314
A flexible API for fine-tuning LMs: Tinker by @thinkymachines. Write a simple CPU-only script, and it runs your exact training loop on distributed GPUs. You can fine-tune open models like Llama and Qwen, up to large MoE models (Qwen3-235B-A22B), switching them by changing only one… https://x.com/TheTuringPost/status/1973827605448306883
Really excited and proud to see Qwen models in the first batch of supported models for the Tinker service! 🤩 We will continue to release great models to support research in the community 😎 https://x.com/wzhao_nlp/status/1973603599616974970
I've been using Tinker at Redwood Research to RL-train long-context models like Qwen3-32B on difficult AI control tasks, specifically teaching models to write unsuspicious backdoors in code, similar to the AI control paper. Early stages, but seeing some interesting backdoors 👀 https://x.com/ejcgan/status/1973449963259699284
It turns out that the AI jagged frontier worked as a reverse salient, a term from the history of technology for a component that holds back the whole system and thus becomes a focus of development. Math and planning were reverse salients, so they have seen the most improvement. https://x.com/emollick/status/1973148208894451908
I had the chance to try @thinkymachines' Tinker API for the past couple of weeks. Some early impressions: very hackable, and it lifts a lot of the LLM training burden; a great fit for researchers who want to focus on algorithms + data, not infra. My research is in RL, and many RL fine-tuning… https://x.com/tyler_griggs_/status/1973450947218252224
Tinker is cool. If you're a researcher/developer, Tinker dramatically simplifies LLM post-training. You retain 90% of algorithmic creative control (usually related to data, the loss function, the algorithm) while Tinker handles the hard parts that you usually want to touch much less… https://x.com/karpathy/status/1973468610917179630
🚀 With early access to Tinker, we matched full-parameter SFT performance as in Goedel-Prover V2 (32B) using LoRA on the same 20% of the data. 📊 MiniF2F Pass@32 ≈ 81 (20% SFT). Next: full-scale training + RL. This is something that previously took a lot more effort… https://x.com/chijinML/status/1973451597393883451
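The result above hinges on why LoRA is so cheap relative to full-parameter SFT. A minimal NumPy sketch of the idea (my illustration, not Tinker's API or the Goedel-Prover setup; dimensions, rank, and `alpha` are made up): instead of updating a full `d_out × d_in` weight matrix, LoRA learns a rank-`r` update `B @ A`, so the trainable parameter count shrinks from `d_out * d_in` to `r * (d_in + d_out)`.

```python
import numpy as np

# Hypothetical dimensions for illustration only.
d_out, d_in, r, alpha = 4096, 4096, 16, 32.0
rng = np.random.default_rng(0)

W = rng.standard_normal((d_out, d_in))   # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01
B = np.zeros((d_out, r))                 # B starts at zero, so W_eff == W at init

# Effective weight during fine-tuning: only A and B are trained.
W_eff = W + (alpha / r) * (B @ A)

full_params = d_out * d_in
lora_params = r * (d_in + d_out)
print(f"trainable: {lora_params:,} vs full {full_params:,} "
      f"({100 * lora_params / full_params:.2f}%)")
```

With these illustrative numbers, the adapter trains well under 1% of the full parameter count, which is why LoRA runs on a small fraction of the compute (and, in the tweet's experiment, a small fraction of the data).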
thinking-machines-lab/tinker-cookbook: Post-training with Tinker https://github.com/thinking-machines-lab/tinker-cookbook
[1 Oct 2025] Thinking Machines' Tinker: LoRA-based LLM fine-tuning API https://x.com/Smol_AI/status/1973622595124863044
Announcing Tinker – Thinking Machines Lab https://thinkingmachines.ai/blog/announcing-tinker/
One interesting "fundamental" reason for Tinker today is the rise of MoE. Whereas hackers used to deploy llama3-70B efficiently on one node, modern deployments of MoE models require large multinode deployments for efficiency. The underlying reason? Arithmetic intensity. (1/5) https://x.com/cHHillee/status/1973469947889422539
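The arithmetic-intensity point can be made with a back-of-envelope calculation (my sketch with assumed numbers, not from the thread): for an fp16 weight matrix of shape `(d, d)` multiplied against a batch of `b` token vectors, FLOPs scale as `2*b*d*d` while the bytes moved are dominated by the weights, roughly `2*d*d`, so intensity scales with `b`. In an MoE layer where only `k` of `E` experts are active per token, each expert sees only about `b*k/E` tokens, so its intensity drops by the same factor and a much larger global batch (hence multinode serving) is needed to stay compute-bound.

```python
def matmul_intensity(batch_tokens: int, d: int) -> float:
    """FLOPs per byte for a (d, d) fp16 weight matmul over batch_tokens rows."""
    flops = 2 * batch_tokens * d * d
    bytes_moved = 2 * d * d  # fp16 weights; activations ignored for b << d
    return flops / bytes_moved

def moe_expert_intensity(batch_tokens: int, d: int, n_experts: int, top_k: int) -> float:
    """Same quantity for one expert that sees only its routed share of tokens."""
    per_expert = batch_tokens * top_k // n_experts  # expected tokens per expert
    return matmul_intensity(per_expert, d)

# Hypothetical numbers: 256-token batch, d=4096, 128 experts with top-8 routing.
dense = matmul_intensity(256, 4096)            # -> 256.0
moe = moe_expert_intensity(256, 4096, 128, 8)  # 16 tokens/expert -> 16.0
print(dense, moe)
```

Under these assumptions the per-expert intensity falls by `E/k = 16×`, which is the sense in which MoE makes small-batch, single-node deployment inefficient.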
Very excited to see the Tinker release by @thinkymachines! @robertnishihara and I had a chance to experiment with the API, see https://x.com/pcmoritz/status/1973456462346424641
Tinker – Thinking Machines Lab https://thinkingmachines.ai/tinker/
"Bring me the healthiest snack." The robot goes to the kitchen and gets the snack, fully autonomously. This is the first public demo of NVIDIA's Isaac GR00T N1.6 foundation model, presented by Yuke Zhu at CoRL 2025. The previous versions focused only on bimanual stationary… https://x.com/TheHumanoidHub/status/1972698708975440349
Go check out @yukez's talk at CoRL! Project GR00T is cooking 🍳 https://x.com/DrJimFan/status/1971370444474417658
This job posting suggests that Meta is developing an egocentric AI system that will form the foundation for AI-enabled humanoid robots and AR devices. https://x.com/TheHumanoidHub/status/1972544881303417338
I accidentally ran Codex in a totally unrelated repository again. It found the correct repository from the error message alone (no GitHub link was given), found the source code, cloned it, and started fixing the issue. The commit was 15.3 thousand lines of code. https://x.com/Sauers_/status/1970727099162861788
A new paradigm of proactive, steerable AI – Fidji Simo https://fidjisimo.substack.com/p/a-new-paradigm-of-proactive-steerable
We get transferred SO MUCH STUFF out of the box from our ancestors, it's ridiculous. My favorite example is that, empirically, we seem to have a "snake detection module" hard-wired into our brains (it detects snakes faster than other things). We are, in so many ways, literally… https://x.com/cloneofsimo/status/1973655922506605046
How can we enable finetuning of humanoid manipulation policies, directly in the real world? In our new paper, Residual Off-Policy RL for Finetuning BC Policies, we demonstrate real-world RL on a bimanual humanoid with 5-fingered hands (29 DoF) and improve pre-trained policies https://x.com/larsankile/status/1973191635904373243
I'm feeling like Sonnet 4.5 is bad; it's really, really fucking up in ways Sonnet 4 and Opus 4.1 did not, unfortunately. https://x.com/Teknium1/status/1973476714924876218
Sonnet 4.5 with "significant improvements in sycophancy" https://x.com/scaling01/status/1972713224727412804
OK, let me see. How GPT-5 Pro shows its thinking traces is interesting. I am working through how useful they are and what patterns they follow. This is in Thai for some reason. I am considering whether anyone is going to get the joke. https://x.com/emollick/status/1971972668481306980
We're updating GPT-5 Instant to better recognize and support people in moments of distress. Sensitive parts of conversations will now route to GPT-5 Instant to quickly provide even more helpful responses. ChatGPT will continue to tell users what model is active when asked. https://x.com/OpenAI/status/1974234951928459450
NEO humanoid robot by 1X uses a vacuum cleaner. Truly general-purpose robots have to be able to work with tools built for people. https://x.com/TheHumanoidHub/status/1972077093363327135
The narrative around LLMs is that they got better purely by scaling up pretraining *compute*. In reality, they got better by scaling up pretraining *data*, while compute is only a means to the end of cramming more data into the model. Data is the fundamental bottleneck. You can't… https://x.com/fchollet/status/1972477946700190081
When should an LLM learn to reason? 🤔 Early in pretraining or late in fine-tuning? Our new work, "Front-Loading Reasoning", challenges the "save it for later" approach. We show that injecting reasoning data into pretraining is critical for building models that reach the… https://x.com/__SyedaAkter/status/1973841632249172096
Modular Manifolds – When we train large neural networks, we need to keep them healthy. We do not want the tensors in the network—either the weights, activations or gradients—to grow too large or too small. Thinking Machines Lab https://thinkingmachines.ai/blog/modular-manifolds/
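One simple way to picture the "keep tensors healthy" idea from the Modular Manifolds post (my simplification, not the blog's actual algorithm): constrain weights to a manifold by projecting back onto it after every optimizer step, so norms can never blow up or collapse. The sketch below uses the plainest possible constraint, unit-norm rows, with made-up shapes and learning rate.

```python
import numpy as np

def project_rows_to_unit_norm(W: np.ndarray) -> np.ndarray:
    """Retraction: rescale each row of W back onto the unit sphere."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    return W / np.maximum(norms, 1e-8)

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16))

# Simulated training loop with random "gradients": no matter how large the
# updates, the projection keeps every row's norm pinned at 1.
for _ in range(100):
    grad = rng.standard_normal(W.shape)
    W = W - 0.1 * grad                  # plain SGD step (leaves the manifold)
    W = project_rows_to_unit_norm(W)    # step back onto it

print(np.linalg.norm(W, axis=1))
```

The blog's actual proposal is richer (manifold-aware optimizers for whole networks), but project-after-step is the minimal instance of the same constraint-based view of weight health.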