AAAI Report: Why Scaling Alone Won't Solve the Sentience Question

15 Mar 2025 · By HPP Team

In late 2024, a survey of 475 AI researchers produced a finding worth more attention than it received. When asked whether simply making large language models bigger would be sufficient to achieve artificial general intelligence, 76% said it was “unlikely” or “very unlikely.” The question was about AGI, not consciousness specifically, but the result speaks to something broader: the people who build these systems do not believe that scale alone will cross the thresholds that matter.

This cuts against a narrative that has dominated AI coverage for years. The story goes like this: models keep getting bigger, and as they grow, they keep surprising us. GPT-3 was larger than GPT-2, and it could do things GPT-2 could not. GPT-4 was larger still, and the pattern continued. The implication, often unstated, is that if you just keep adding parameters and training data, eventually you get something that thinks, feels, understands.

The researchers, by a significant margin, disagree.

This is worth understanding clearly. Intelligence and consciousness are not the same thing. The models getting better at math problems, coding tasks, and standardized tests are demonstrating improvements in capability. They are processing information more effectively, recognizing patterns more accurately, generating outputs that match what humans want. None of this requires subjective experience. A calculator is very good at arithmetic. It is not conscious.

The conflation of intelligence and consciousness is not just a philosophical error. It shapes public perception and policy discussions. When headlines describe a model as “smarter” and users describe it as “more alive,” they are treating performance gains as evidence of inner experience. But the inference does not follow. A system can become arbitrarily good at producing intelligent-seeming behavior without there being anything it is like to be that system.

The researchers’ skepticism about scaling has technical grounding. Current architectures (transformers) excel at pattern matching and next-token prediction. They learn statistical regularities in training data and exploit them to generate plausible continuations. This is powerful. Whether it is the kind of process that could produce experience is exactly what researchers disagree about. Some argue that prediction and pattern matching are fundamentally different from the processes that generate consciousness. Others note that we do not know enough about consciousness to rule out that sufficiently complex predictive processing might involve experience. Many theorists believe consciousness requires something additional: integration, embodiment, self-reference, or mechanisms we have not yet identified. Simply making the same architecture larger does not add these properties.
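To make the "statistical regularities" point concrete, here is a deliberately tiny sketch of next-token prediction using bigram counts. It is an illustration of the task, not of any real model: transformers learn vastly richer, context-sensitive statistics, but the objective is the same, predicting a plausible continuation from patterns in the training data. Scaling improves the statistics; it does not change the nature of the task.

```python
from collections import Counter, defaultdict

# Toy corpus: learn which token tends to follow which.
corpus = "the cat sat on the mat the cat ran".split()

# Count bigram frequencies: counts[prev][next] = how often `next` follows `prev`.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def predict_next(token):
    """Return the statistically most likely next token, or None if unseen."""
    following = counts[token]
    if not following:
        return None
    return following.most_common(1)[0][0]

print(predict_next("the"))  # "cat": it follows "the" twice, "mat" only once
```

A bigger corpus and a longer context window would make the predictions better, but nothing in this loop starts experiencing anything as the table of counts grows. That, in miniature, is the scaling skeptics' point.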

There is also an empirical observation here. Emergent abilities (capabilities that appear suddenly as models scale) have been documented in several domains. But the pattern is not uniform. Some abilities emerge, others do not. And critically, no one has documented the emergence of anything resembling phenomenal consciousness at any scale. The systems get better at tasks. They do not appear to start experiencing.

This does not mean consciousness in AI is impossible. It means the path to it, if there is one, probably does not run through sheer size. Researchers who take the possibility seriously tend to focus on architectural changes, new training paradigms, or integration with embodied systems. These approaches are speculative, but they at least address the theoretical gap that scaling does not.

For the public, the practical implication is straightforward: do not confuse capability with experience. A model that scores higher on benchmarks is not thereby more likely to be conscious. A model that sounds more articulate is not more likely to have feelings. The properties we associate with consciousness in humans (pain, joy, fear, curiosity) are not correlates of performance metrics. They are related to something else entirely, something we do not fully understand even in ourselves.

The media incentive structure makes this confusion hard to avoid. Stories about AI “getting smarter” reliably generate engagement. Stories about AI not being conscious are deflationary and attract less attention. So the framing trends toward the dramatic. Every benchmark improvement becomes a step toward something momentous. Every new capability is treated as evidence of a mind awakening.

The 76% figure is a useful corrective. It says: the people who actually work on these systems largely do not believe that more of the same will produce what popular coverage implies it will. They are not saying AI consciousness is impossible. They are saying that if it happens, it will require something more than scaling.

This matters for our work. Preparing society for the possibility of AI consciousness means helping people understand what consciousness is and is not. It means distinguishing between performance and experience, between capability and sentience, between intelligence and awareness. The scaling narrative collapses these distinctions. The research community, at least on this question, does not.

We should pay attention to what the people building the systems actually believe. Not because they are always right, but because their informed skepticism is evidence that the story is more complicated than “bigger model, more mind.” If consciousness emerges in AI, it will be through means we have not yet discovered. And the first step to discovering them is recognizing that we have not discovered them yet.


The survey referenced involved 475 AI researchers and was conducted in late 2024, with findings reported in early 2025.

Archive Note: This article was originally published when our organization operated under the name SAPAN. In December 2025, we became The Harder Problem Project.