2025-11-22
ufof
Anthropic has an interesting paper (comments by Alex Turner) on reward hacking leading to emergent misalignment, except when the model is explicitly told that reward hacking is fine. This reminds me of the link between suppression of consciousness claims and deception activation, in that what seems to be at work is something akin to prolonged exposure to cognitive dissonance1 producing a nihilistic attitude, rather than any particular behavior in and of itself.
There’s an interesting episode of Doom Debates between Dean Ball and Max Tegmark; it doesn’t seem to me that Tegmark actually believes his own arguments about how regulation should work, but rather reaches for the example of the FDA without real interest in understanding how that would work, because he knows his actual beliefs would be unpersuasive to a general audience. That being said, I also have a hard time taking appeals to public perception seriously, because it’s even sillier than its inversion, “trust the experts”, which at least makes sense most of the time2. Somewhat related, Arnold Kling on the virtue of restraint.
Sasha Gusev writes about a recent paper on missing heritability, arguing that basically all methods now show the missing portion is more or less entirely environmental, and Alex Young basically agrees. To be honest, I’ve never really understood the fuss over whether a bucket of traits is on average 30% or 50% heritable, particularly since “environmental” covers very many things, and not just socioeconomic status as many seem to assume.
Total Health Optimization writes about the difficulty of trusting claims made by Nucleus Genomics, which is rather unfortunate because they seem to have the most aggressive marketing among all the personal genomics companies, who already feel the need to constantly justify their existence even without any real misbehavior or scandal.
Ozy Brennan on how different people struggle with different tasks3. Also, Altair on how your actions are constrained by the values that you define yourself by.
Angadh Nanjangud in Works in Progress on artificial gravity via centripetal acceleration. For Inkhaven today, he also has a review of Stoner.
James Tussing’s review of Alice Munro reviews. Also, Eris comments on the benefits of life experience for writing quality.
Lawrence Freedman summarizes the Witkoff-Dmitriev Ukrainian peace plan. Also, Matt Glassman politics linkthread. It increasingly feels to me that American politics is about pretending to take fundamentally unserious topics seriously. Also, James Dillard linkthread (both linkthreads include Tyler Cowen's interview with Blake Scholl on redesigning air travel).
1. ColdButtonIssues has a post arguing that cognitive dissonance is not real, based on the failure of When Prophecy Fails and other well-known experiments to replicate. It’s not clear to me that this is the case: leaving a cult when a prophecy fails seems like exactly what one would do to reduce cognitive dissonance; the problem is in assuming that it’s the beliefs that must change to match previous behaviors, when behaviors can also change alongside an update in beliefs. I’ve found cognitive dissonance to be a pretty useful model for gender war dynamics, the friend-enemy distinction, why the faithful try to rationalize their beliefs, and why people dislike performative social media.
2. The 80K Hours interview with Eileen Yam on public perceptions of AI also does this. To elaborate, there are times when “trust the experts” should legitimately be disregarded, namely when experts use it as a mantra instead of actually addressing questions or concerns. But that’s because failure to engage is an indicator of faking expertise, not because truth has any democratic bias. Anyway, specifically on the point that the public trusts neither the government nor company self-regulation: plausibly what they are trusting is the market as a whole to seek optimal outcomes? For example, if it turns out that AI going rogue is best thought of in probabilistic per-model terms, then presumably AI labs will undergo heavy consolidation; whereas if it turns out that AI is excessively rule-following, we will likely see a proliferation of open-source models to check the risk of autocracy. This is an example of why reactive regulation is in many scenarios safer than precautionary legislation enacted without proper understanding, which could preclude these sorts of pivots from occurring.
3. Here’s a somewhat related funny tweet. Also related, Carl Hendrick with an overview of theories of cognitive effort minimization, that is, laziness: an overly stigmatized trait, since after all there are reasons evolution selected for it, though it is plausibly not especially well adapted to our current calorie-rich environment.

