The Boxes Were Already Open
Bjørn Flindt Temte,
Apr 08, 2026
A few days ago I linked to a paper from Anthropic on how AI systems represent emotions internally. This post references that paper and makes the following argument: "the prevailing assumption about large language models - that they have nothing at stake in their interactions with us - is incoherent with their own observable behaviour." Essentially, the stakes are recorded precisely in what Anthropic called the 'functional emotions'. The stakes don't have to 'feel' a certain way to exist. "It does not require claiming the AI 'cares about' the collaboration in a phenomenologically rich sense," writes Temte in an earlier paper. "It requires only the much weaker claim: the system's behaviour is functionally organised around protecting something, and 'having something at stake' is what we call that pattern when we observe it in any other system."
Today: Total: [] [Share]

