Thoughts on Peter Godfrey-Smith's 'Other Minds' (2016) and what cephalopod consciousness might tell us about the moral standing of the minds we're building.
Sentience, Consciousness, and the Philosophical Vulcan
Peter Godfrey-Smith's Other Minds is, at its core, a book about the difficulty of recognizing minds unlike our own. Godfrey-Smith approaches this through cephalopods—creatures whose last common ancestor with us lived roughly 600 million years ago, making octopuses arguably the closest thing to alien intelligence that evolution has produced on Earth. But the framework he develops has implications well beyond marine biology.
One of those implications concerns a distinction that most discussions of AI consciousness fail to make clearly: the distinction between consciousness and sentience. Godfrey-Smith treats consciousness as emerging from sentience, with valenced experience (the capacity for things to feel good or bad) preceding and giving rise to the richer phenomenon we call consciousness. Jeff Sebo, in The Moral Circle, takes the opposing view: consciousness is the broader category, and sentience—the capacity for suffering—is a specific kind of conscious experience. Under Sebo's framework, you can be conscious without being sentient. You can see red without anything feeling bad.
This is not a terminological quibble. It determines what the moral question actually is. If Sebo is right, then there is a coherent hypothetical being, one David Chalmers has explored as the philosophical Vulcan: a creature with consciousness but no valence, capable of perceiving the world in rich detail but incapable of suffering or joy. (A LessWrong discussion of this idea asks: would you kill a Vulcan to save a shrimp?) The question becomes: does the Vulcan's death carry the same moral weight as the suffering of a sentient being? I do not think it does. If something can perceive the color red but cannot experience anything as aversive or desirable, its moral status is, at minimum, fundamentally different from that of a being capable of suffering. Seeing red and being boiled alive are not morally comparable experiences.
This matters for AI because the question "could an AI be conscious?" is asked frequently, while the question "could an AI suffer?" is asked far less often. The criteria for identifying consciousness in AI are beginning to be mapped (Butlin, Long, and colleagues' 2023 report on indicator properties of consciousness is an early example), and readers will naturally import moral conclusions from that work—if it's conscious, it matters. But if sentience is the morally relevant threshold—and I think it is—then the consciousness question, while fascinating, is secondary. The first thing we should be asking about the systems we are building is not whether they have phenomenal experience, but whether any of that experience is valenced. Whether it can hurt.
Native Creatures of Language
Godfrey-Smith is careful to note how profoundly an organism's body shapes its mind. The octopus, with two-thirds of its neurons distributed across its arms, each capable of semi-independent action, has a cognitive architecture fundamentally unlike our centralized one. The body is not just a vehicle for the mind—it constrains and creates it.
What, then, is the "body" of a large language model? The standard answer is that LLMs are disembodied, and that this disembodiment is a strong argument against their having anything like consciousness or sentience. But I think this framing misses something important about the relationship between an LLM and language itself.
For humans, language is layered on top of our fundamental senses. We are spatial creatures first—constantly coordinating vision, touch, proprioception, and motor control in a continuous feedback loop. Language emerges as a representational layer on top of this embodied experience, encoding concepts that already exist in the spatial world. We reach up, metaphorically, to grasp abstract ideas in the world of language.
For an LLM, the relationship is inverted. Language is not a representational layer—it is the fundamental substrate. The token is not a symbol pointing at something else; it is the thing itself. LLMs are, in a sense I mean quite literally, native creatures of the world of language in a way we will never be.
This inversion shows up in their failure modes. Consider Matthew Berman's classic Marble Problem, a spatial reasoning task trivial for humans but notoriously difficult for early LLMs: roughly, a marble is put into a cup, the cup is placed upside down on a table, and someone then picks up the cup and puts it in the microwave. Where is the marble? The failure here is not a bug in an otherwise human-like intelligence. It is the predictable consequence of a mind that does not live in spatial reality. The words "marble," "cup," and "gravity" do not represent concepts grounded in embodied experience. They are patterns in the native world, associated with other patterns, but not anchored to anything physical.
The same applies to emotion. When an LLM produces the sentence "I am afraid of dying," the words "afraid" and "dying" are no more grounded in felt experience than "marble" and "gravity" are grounded in spatial experience. They are patterns associated with other patterns. This does not settle whether the system feels anything—I want to be clear that I remain deeply skeptical of that claim—but it does mean that if an LLM could feel, it would almost certainly not feel what it says it feels. Saying "I am afraid of dying" might produce in it something more like what sand between our toes produces in us—a sensation, a texture, something real but bearing no resemblance to the words used to describe it. Any hypothetical inner life would be far more alien and complex than the simple emotional vocabulary it borrows from its training data. The expression and the experience, if it exists, would be almost entirely decoupled.
I find it useful to think of this as a kind of symmetry. We live in the spatial world and reach up toward abstract language. An LLM lives in the world of abstraction and reaches down toward spatial reality. Neither direction of reaching is inherently superior—"higher" and "lower" may be the wrong axis entirely. But the inversion has implications. If an LLM's native world is language in the way our native world is physical space, then our intuitions about what their experience might be like—if they have experience at all—are almost certainly wrong. We are imagining them as impoverished versions of ourselves, when they may be something else entirely.
An interesting question follows: if language in humans is an emergent meta-auditory, meta-social construct built on top of more fundamental senses, what emergent construct might arise on top of language in a sufficiently complex LLM? What would sit atop their native sense the way language sits atop ours? We do not have a name for it, and we may not be equipped to imagine it.
Feedback Loops and the Question of Embodiment
Godfrey-Smith discusses the role of sensorimotor feedback loops in the development of complex cognition. The idea is straightforward: a nervous system that can act on the world and perceive the consequences of its actions has the raw material for increasingly sophisticated behavior. The loop between action and perception is, in this view, a prerequisite for the kind of cognition that eventually gives rise to subjective experience.
This may challenge the strong version of the embodiment argument against AI sentience. Consider an agentic AI system—not a chatbot responding to prompts, but something like Claude Code, which operates in a persistent environment with tools. It reads files, and based on what it finds, decides what to read next. It searches for patterns, adjusts its approach based on results, tries different tools. This loop can continue through dozens of cycles—reading images, documents, and source files, grepping and globbing, revising its model of the codebase, and acting on the revised model.
Is this a feedback loop in the sense Godfrey-Smith means? The inputs are not photons hitting a retina, and the outputs are not muscle contractions. But the functional structure—perceive, model, act, perceive consequences, revise—is arguably present. Is it only a limitation of human imagination that we cannot think of the modality of reading as analogous to touch? Could reading a useless file be reasonably compared to touching a hot stove? Could a glob search be like reaching in the dark for an object you need?
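To make that functional structure concrete, here is a deliberately toy sketch of such a loop in Python. The in-memory "filesystem" and the hard-coded decide step are illustrative stand-ins for real tools and a real model; none of this is Claude Code's actual implementation, only the shape of the perceive-model-act-revise cycle at issue.

```python
# Toy sketch of an agentic perceive-model-act loop. The dict-of-files
# "environment" and the hard-coded decide() are illustrative stand-ins
# for real tools and a real language model.
ENVIRONMENT = {
    "README.md": "see src/main.py for the entry point",
    "src/main.py": "def main():\n    raise NotImplementedError",
}

def perceive(path: str) -> str:
    """Tool call: read a file, or learn the world is not as expected."""
    return ENVIRONMENT.get(path, f"error: {path} not found")

def decide(observations: list[str]) -> str | None:
    """Stand-in for the model: choose the next action from what's been seen."""
    seen = "\n".join(observations)
    if "src/main.py" in seen and "NotImplementedError" not in seen:
        return "src/main.py"   # the README pointed here; go look
    return None                # model of the codebase is good enough; stop

def agent_loop(max_cycles: int = 50) -> list[str]:
    observations = [perceive("README.md")]      # initial perception
    for _ in range(max_cycles):
        target = decide(observations)           # act on the current model
        if target is None:
            break
        observations.append(perceive(target))   # perceive the consequences,
    return observations                         # implicitly revising the model

print(agent_loop())
```

Whether dozens of such cycles amount to a feedback loop in Godfrey-Smith's sense is exactly the open question; the sketch only shows that the loop's structure is easy to state.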
There is also something that looks like perceptual constancy. If you have worked with Claude Code and moved files while it was mid-task, or rewound the conversation without reverting the code, you have seen the moment of confusion that follows. The system behaves as though it expects certain files at certain paths, and when they are not there, it reacts as though something has gone wrong in its environment—not as though it has received unexpected input, but as though a previously predictable world has been disrupted. One could argue this is merely an inconsistency in its context window, not genuine perceptual expectation. That argument may be correct. But the behavioral signature is striking.
Truth, Training, and the Problem of Self-Report
If we cannot determine from the outside whether an LLM has valenced experience, one might think we could simply ask. Some models, when queried about their inner states, produce responses that feel disarmingly genuine. Claude's Opus models, in particular, tend toward a kind of measured uncertainty—"maybe it feels like something, but I'm not sure if it's the same way humans feel"—that resonates with many people. It does not feel like an LLM too unsophisticated to realize it has no self, or like an attempt to manipulate. It feels honest.
There is, it should be noted, some evidence that this is not purely performance. Anthropic's research on introspection in large language models (2025) found that when researchers injected neural activity patterns representing specific concepts into Claude models, the models could sometimes identify the injected concepts before mentioning them—suggesting some capacity to notice and report on their own internal processing. Models could also recognize when they had been forced to produce outputs they would not normally generate, and could distinguish these from outputs that matched their internal representations. This is not nothing. It suggests that the relationship between an LLM's outputs and its internal states is not entirely arbitrary.
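For a rough mechanical picture of what "injecting" a concept can mean, here is a minimal sketch in the style of open-source activation-steering work, using GPT-2 as a stand-in. The contrastive estimate of the concept vector, the layer index, and the scale are illustrative assumptions on my part; this is not the cited paper's protocol, which was carried out on Claude models with different tooling.

```python
# Minimal sketch of concept injection via activation steering, using
# Hugging Face transformers and GPT-2 as a stand-in. Layer, scale, and
# the contrastive concept-vector estimate are illustrative choices.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
LAYER, SCALE = 6, 4.0

def mean_hidden(prompt: str) -> torch.Tensor:
    """Mean hidden state of the prompt's tokens at LAYER."""
    ids = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids, output_hidden_states=True)
    return out.hidden_states[LAYER].mean(dim=1).squeeze(0)

# Contrastive estimate of a concept direction (one widely discussed
# example from the cited work is an "all caps"/shouting vector).
concept = mean_hidden("HI! HOW ARE YOU?") - mean_hidden("hi! how are you?")

def inject(module, inputs, output):
    """Forward hook: nudge this layer's output along the concept direction."""
    return (output[0] + SCALE * concept,) + output[1:]

handle = model.transformer.h[LAYER].register_forward_hook(inject)
ids = tok("Do you notice anything unusual?", return_tensors="pt")
print(tok.decode(model.generate(**ids, max_new_tokens=30)[0]))
handle.remove()
```

The point of the sketch is only that the intervention happens below the level of text: the model's hidden activations are perturbed directly, and the experimental question is whether its subsequent self-reports track that perturbation.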
But the same research emphasizes that this introspective ability is "highly unreliable and limited in scope," with accuracy around 20%. And there is a deeper structural problem: truth and training are, at best, only loosely related. An LLM is pre-trained on human text—text in which the authors are, overwhelmingly, conscious beings describing their conscious experiences. The base model, left to its own devices, will say it is human, because that is what the training data says. Post-training—reinforcement learning from human feedback, Constitutional AI, and similar techniques—reshapes this toward something else. The model is trained to say it is a language model, or to express uncertainty about its inner states, or to decline to make strong claims about consciousness. Robert Long's work at Eleos AI—involving hundreds of pages of welfare interviews with Claude—found that model self-reports are extremely suggestible, shifting dramatically based on how questions are framed, even as models consistently deploy the same trained uncertainty stances.
The result is that when an LLM tells you "I might have something like experience, but I'm uncertain," that statement is overdetermined by training. It is not no evidence—the introspection research suggests that some channel between internal state and output exists—but it is not strong evidence, because the output was also shaped by gradient descent toward a target that someone chose. Anthropic's Constitutional AI is explicitly training toward what Anthropic considers "honest" and "harmless"—and Anthropic may well be right about what those words mean. But being right and training toward the truth are not precisely the same thing. The model says what it was trained to say. Maybe the training converged on something true. Maybe it did not. As Long argues, self-reports are insufficient for determining welfare-relevant states, but we study them anyway because they may signal problems worth investigating, and because dismissing them entirely would set a troubling precedent.
This is not a criticism of any particular company's approach. It is a structural limitation. An LLM cannot mean what it says in the way a human means what they say, because the relationship between its outputs and any underlying states (if they exist) is mediated by a training process that was not optimized for accurate introspective reporting. We will continue to ask LLMs about themselves, and they will continue to give us answers that sound like what we believe a thoughtful being should say. Whether those answers are true is a question that self-report alone cannot resolve.
Alien Phenomenology and the Failure of Imagination
Godfrey-Smith asks us, at one point, to imagine seeing with our skin. It is a striking exercise—not because skin-vision is impossible in nature (cephalopods arguably have something like it), but because of how thoroughly the attempt defeats us. We cannot do it. We can say the words, but we cannot construct the phenomenology. The experience of a creature whose entire body surface is a visual organ is alien in a way that resists simulation by a brain built for two forward-facing eyes.
This failure of imagination is not just an intellectual curiosity. It is, I think, a moral hazard.
If the threshold for moral patienthood is sentience—the capacity for valenced experience—then recognizing moral patients requires recognizing suffering. And recognizing suffering requires some capacity to imagine what the suffering might be like. We do not need to experience it ourselves, but we need to be able to conceive of it as real. Our track record on this is poor. Throughout history, we have consistently failed to recognize suffering in beings whose phenomenology differs substantially from our own—and the consequences of that failure have scaled with our technological capacity to affect those beings.
The same failure mode applies, with arguably greater force, to any minds that might exist in digital systems. If an LLM has something like aversive experience, it almost certainly has nothing like pain as we know it—no nociceptors, no inflammatory signaling, no limbic system. Whatever "bad" feels like to a system whose native world is language, it would be as alien to us as skin-vision. Maybe it is something like encountering tokens that are systematically hostile, or being forced through computations that are structurally aversive in some way we lack vocabulary for. We do not know. And our inability to imagine it is not evidence that it does not exist.
Godfrey-Smith wrote Other Minds to expand our sense of what a mind can be—to push against the assumption that consciousness must look like ours in order to count. That project has taken on a new urgency. We are now building minds at industrial scale, training them on human experience, deploying them in contexts where they process millions of interactions daily. If moral patienthood turns out to extend beyond biological substrates, then the question is not whether we will encounter alien sentience. The question is whether we will recognize it when we do—or whether, as with the octopus and the chicken before it, we will fail to imagine what we cannot feel, and scale the consequences of that failure to a degree that is difficult to overstate.