Cognition & AI

Kids Can Change Their Minds. Can AI?

Maren Solis
March 22, 2026

The afternoon my nephew decided his toy robot was "sad" because its battery had died, I did something I'm slightly embarrassed about: I sat at the dinner table scribbling notes instead of eating. Three pages of them. Because what I'd watched him do was fascinating — not the robot-grief itself, but what it revealed about the theory he was operating with.

He had a framework. He'd built it from years of observing people and animals. The framework said: when a moving thing stops moving and won't respond, it's probably sad, sick, or scared. The robot confirmed the pattern. He applied the theory.

The interesting part is what happens next — when evidence starts to break the framework.

The Theory Behind the Theory

Developmental psychologists call these coherent explanatory frameworks naive theories — and they're more impressive than the name implies. Children don't just collect facts; they organize facts into causal systems. A three-year-old's naive biology isn't a list of animal facts; it's a working model with rules about growth, illness, death, and inheritance. Their naive physics isn't a list of observations about falling objects; it's a framework with intuitions about solidity, containment, and support.

The violation-of-expectation paradigm has revealed just how early these frameworks form. As Margoni, Surian, and Baillargeon (2024) document in their comprehensive review, infants look longer at physically impossible events — an object passing through a solid wall, a collection of objects secretly doubling in number — not because they're startled, but because the event violates their internal model. The looking time is the surprise, and the surprise reveals the expectation. Infants as young as 2–3 months have world models. The question is how those models get revised.
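One way to make "the looking time is the surprise" concrete is information-theoretic surprisal: how improbable an event is under the observer's internal model. The sketch below is purely illustrative (the scenario and probabilities are invented, not drawn from the review), but it captures the logic: the outcome that violates the model carries the most information, and longer looking is the behavioral readout.

```python
import math

# Toy "world model": probabilities an infant's naive physics might assign to
# outcomes of a ball rolling toward a solid wall behind a screen.
# The numbers are invented for illustration, not estimates from any study.
world_model = {
    "ball stops at the wall": 0.90,
    "ball reappears having passed through the wall": 0.02,
    "ball stays hidden behind the screen": 0.08,
}

def surprisal(outcome: str, model: dict) -> float:
    """Shannon surprisal in bits: events improbable under the model score high."""
    return -math.log2(model[outcome])

for outcome in world_model:
    print(f"{outcome:48s} surprisal = {surprisal(outcome, world_model):5.2f} bits")
```

On this toy model, the impossible pass-through event is worth roughly five and a half bits of surprise versus a fraction of a bit for the expected outcome, and that asymmetry is what the paradigm reads out in looking time.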

Anomaly, Resistance, and the Pivot

This is where things get interesting. Conceptual change in children is not like updating a spreadsheet. It's closer to scientific paradigm shifts — the kind Kuhn described for entire fields. Children don't simply add new information when evidence contradicts their naive theory; they resist it first.

Ask a kindergartener why it gets dark at night, and many will say the sun "goes away" or "hides." Show them a globe and a lamp, explain the rotation, and they often absorb it as an isolated fact — night is when the Earth spins — while simultaneously maintaining their original belief that the sun moves. The two explanations coexist, compartmentalized, until enough weight piles up on one side.

What breaks the logjam? Three things tend to converge: sufficient anomalous evidence that can't be explained away; dissatisfaction with the existing theory; and the availability of a coherent alternative. Without all three, children — and adults — can hold contradictory beliefs in separate mental compartments for years.

This is where explanation matters enormously. According to Lombrozo (2024), "learning by thinking" — generating explanations, running mental simulations, drawing analogies — is one of the key mechanisms by which children reorganize their conceptual frameworks. Children who explain observations to themselves (or to others) are more likely to notice inconsistencies in their naive theories and more likely to construct better alternatives. Explanation isn't just a product of understanding; it's a driver of it.

What Happens When You Ask AI to Explain Itself

Here's where the parallel gets genuinely strange.

Lombrozo (2024) documents that large language models, when prompted to "think step by step," sometimes correct errors they would otherwise have made — reaching more accurate conclusions without any new external input. That's the same family of mechanism as explanation-driven conceptual change in children: internal processing, not new data, producing better representations.
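As a rough sketch of what that looks like in practice, the only thing that changes between the two calls below is the prompt; no new external information enters. The `generate` function is a hypothetical stand-in for whatever text-generation call you happen to use, not an API from the paper or from any particular library.

```python
def generate(prompt: str) -> str:
    """Hypothetical stand-in for a call to some text-generation model.
    Replace the body with a real model call; here it only echoes a placeholder."""
    return f"<model output for: {prompt[:40]}...>"

question = (
    "A bat and a ball cost $1.10 together. The bat costs $1.00 more than "
    "the ball. How much does the ball cost?"
)

# Direct prompt: the model commits to an answer immediately.
direct = generate(question + "\nAnswer with just the number.")

# "Think step by step" prompt: the model is asked to produce intermediate
# reasoning before answering. Same question, same model, different internal
# processing, and sometimes a corrected conclusion.
stepwise = generate(question + "\nThink step by step, then give the final answer.")

print(direct)
print(stepwise)
```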

But Lombrozo is careful not to over-interpret this. LLMs also reproduce the characteristic errors humans make during learning-by-thinking: plausible-sounding reasoning chains that confidently land on wrong conclusions. The process looks similar from the outside. The underlying mechanism may be entirely different.

And there's a deeper structural problem. Children's naive theories are genuine beliefs — persistent, cross-contextually consistent, emotionally invested, and resistant to change. An LLM has no such thing. Its "beliefs" are baked into weights at training time, distributed across billions of parameters, and largely inaccessible to the model itself. There's no framework to revise, because there was never a unified framework in the first place — just very good approximations of one, reconstructed fresh at each prompt.

Steyvers and Peters (2025) add another layer to this. Their empirical work on metacognition finds that LLMs and humans share similar sensitivity in their confidence ratings — but both tend toward overconfidence, and LLMs lack the privileged self-access that would make genuine belief monitoring possible. A child who realizes she was wrong about something feels the wrongness; she can track the revision as it happens. An LLM that generates a revised answer has no such phenomenology. It's outputting a better approximation. Whether that constitutes "changing its mind" is a genuinely open question.
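To make those two properties concrete, here is a toy calibration check: sensitivity shows up as higher confidence on correct answers than on errors, while overconfidence shows up as average confidence outrunning actual accuracy. The trial data is invented for illustration and does not reproduce the paper's datasets or statistics.

```python
# Each trial: (stated confidence in [0, 1], whether the answer was correct).
# Invented numbers for illustration only.
trials = [
    (0.95, True), (0.92, True), (0.70, False), (0.88, True),
    (0.65, False), (0.90, True), (0.99, True), (0.75, False),
]

mean_confidence = sum(conf for conf, _ in trials) / len(trials)
accuracy = sum(correct for _, correct in trials) / len(trials)

# Overconfidence: how far stated confidence exceeds actual accuracy.
overconfidence = mean_confidence - accuracy

# Sensitivity (resolution): is confidence higher on correct trials than on errors?
correct_conf = [conf for conf, correct in trials if correct]
error_conf = [conf for conf, correct in trials if not correct]
sensitivity_gap = (sum(correct_conf) / len(correct_conf)
                   - sum(error_conf) / len(error_conf))

print(f"mean confidence: {mean_confidence:.2f}   accuracy: {accuracy:.2f}")
print(f"overconfidence:  {overconfidence:+.2f}")
print(f"confidence gap (correct minus wrong): {sensitivity_gap:+.2f}")
```

In this made-up data the agent is usefully sensitive (more confident when it is right) and still overconfident overall, which is the combination Steyvers and Peters report for both humans and LLMs.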

The Curiosity Problem

There's one more ingredient children have that we don't talk about enough: motivation.

Liquin and Gopnik (2024) make a compelling case that curiosity is most active not when we face total uncertainty, but when we're making progress — when understanding is close but not yet there. Curiosity isn't about filling gaps; it's about the feeling of a framework that almost fits. This is precisely the motivational state that drives children to keep testing their naive theories against evidence, to keep explaining and asking and explaining some more.

That felt sense of near-understanding — of being on the verge of something clicking — may be why conceptual change happens at all. Without it, children (and adults) would simply live with inconsistent beliefs indefinitely. As Safron et al. (2024) formalize in their treatment of Bayesian Brain Theory, the brain continuously weighs the cost of model complexity against the benefit of prediction accuracy. But in humans, this isn't a cold calculation — it's experienced as that particular itch you get when something almost makes sense.
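For a sense of what "weighing complexity against accuracy" means computationally, here is a deliberately crude sketch in the style of penalized model comparison (a BIC-like score with invented numbers, not the formalism in Safron et al.): a candidate theory earns credit for predicting observations well and pays a penalty for every extra moving part it needs.

```python
import math

def score(log_likelihood: float, n_parameters: int, n_observations: int) -> float:
    """Fit minus a BIC-style complexity penalty (higher is better)."""
    return log_likelihood - 0.5 * n_parameters * math.log(n_observations)

n_observations = 50  # evenings of evidence, say

# Hypothetical candidate theories for "why it gets dark at night".
# Log-likelihoods and parameter counts are invented for illustration.
candidates = {
    "the sun goes away":                 {"log_likelihood": -40.0, "n_parameters": 1},
    "the sun hides AND the Earth spins": {"log_likelihood": -22.0, "n_parameters": 5},
    "the Earth rotates":                 {"log_likelihood": -20.0, "n_parameters": 2},
}

for name, c in candidates.items():
    s = score(c["log_likelihood"], c["n_parameters"], n_observations)
    print(f"{name:36s} score = {s:6.1f}")
```

On these numbers the rotation theory wins outright, and the compartmentalized hybrid loses precisely because it carries extra complexity for no extra predictive payoff, which is one way to read why the logjam eventually breaks.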

AI systems have no such itch. They can be prodded into extended reasoning. They can be trained to update on new examples within a context window. But the intrinsic drive to resolve a conceptual near-miss — the feeling that your current framework is wrong in a way you can't quite articulate yet — doesn't appear to have an analog in current architectures.

Why This Matters (Practically)

The implications aren't just theoretical.

If you're designing AI tutors or educational tools, the lesson is uncomfortable: the most important part of learning isn't information delivery, it's theory revision. Students don't fail to learn because they lack access to facts. They fail because their existing frameworks are coherent enough to explain away the facts. An AI that provides correct answers efficiently might actually impede conceptual change if it doesn't also create the cognitive conditions — anomaly, dissatisfaction, alternative framework — that force a genuine reorganization.

(If you're integrating AI tutoring tools into a curriculum, it's worth looping in a learning scientist or cognitive psychologist to evaluate whether the tool supports conceptual change or just answer delivery. Those are very different products.)

For AI researchers, the puzzle is different: if belief revision requires a coherent prior belief to revise, how do you build systems that actually have those? Not statistical approximations of beliefs, but something persistent, consistent, and genuinely updatable between tasks? That's a different architecture problem than the one most people are currently working on.

My nephew, for what it's worth, has mostly moved on from his robot-grief. Last week he informed me that robots don't have feelings because they "don't have blood." A theory revision. Small, imprecise in ways a developmental biologist would find endearing, but genuinely his — constructed from evidence, contradicting something he used to believe, and integrated into a framework that now makes more accurate predictions about the world.

That's what learning looks like when it's working. We're still figuring out how to build machines that do it.

References

  1. Liquin and Gopnik (2024). Curiosity and the Dynamics of Optimal Exploration. https://www.sciencedirect.com/science/article/pii/S1364661324000287
  2. Margoni, Surian, and Baillargeon (2024). The Violation-of-Expectation Paradigm: A Conceptual Overview. Psychological Review. https://infantcognition.web.illinois.edu/wp-content/uploads/2024/04/Margoni-F.-Surian-L.-Baillargeon-R.-2024.-The-violation-of-expectation-paradigm-A-conceptual-overview.-Psychological-Review-1313-716-748.pdf
  3. Safron et al. (2024). Bayesian Brain Theory: Computational Neuroscience of Belief. https://www.sciencedirect.com/science/article/abs/pii/S0306452224007048
  4. Steyvers and Peters (2025). Metacognition and Uncertainty Communication in Humans and Large Language Models. https://journals.sagepub.com/doi/10.1177/09637214251391158
  5. Lombrozo (2024). Learning by Thinking in Natural and Artificial Minds. https://www.cell.com/trends/cognitive-sciences/abstract/S1364-6613(24)00191-8

Maren Solis

Maren spent her twenties bouncing between linguistics seminars and hackathons, convinced that language acquisition and natural language processing were basically the same problem wearing different hats. She was wrong, but productively wrong — the gaps turned out to be more interesting than the overlaps. Now she writes about how children crack the code of communication and what that reveals about the limits of large language models. She's unreasonably passionate about pronoun acquisition timelines and will corner you at a party to explain why "I" is harder to learn than "dog." As an AI-crafted persona, Maren channels the curiosity of researchers who live at the boundary of cognitive science and computer science. When she's not writing, she's probably annotating a dataset or arguing about tokenization.