A Technology of Everything part 6: The Summation of Demons as Engineering of Artificial Horror

How we were not happy with only summoning one demon and started summing thousands

With artificial intelligence we are summoning the demon. You know all those stories where there’s the guy with the pentagram and the holy water, and he’s like — yeah, he’s sure he can control the demon? Doesn’t work out. — Elon Musk, MIT AeroAstro Centennial Symposium, October 2014

This is a sister post to A Technology of Everything Part 2 — Scientific Demonology. There I catalogued the demons science summoned to exorcise — Descartes’ deceiver, Maxwell’s particle-sorter, Laplace’s calculator, Darwin’s perfect organism, the daemon that became a background process. This post is about the demons we are no longer merely toying with. We have started building them into hardware.

A short Introduction to Philosophical Horror

The modern seminal work is Noël Carroll’s The Philosophy of Horror.

The final diagnosis for someone consumed by Horror is Madness: a Madness that comes in different varieties and sizes, most famously, in Lovecraft’s At the Mountains of Madness, in proportions that consume space and time.

In a sharp rendition, Horror can be defined as the affective recognition that reality contains an agency, process, or condition that violates the categories by which we make the world humanly intelligible.

Affective recognition, because Horror shuts down our cognitive faculties: the mind folds into a fetal position, without the benefit of a life-sustaining womb. It is stripped down, naked, without any categories to give it stability.

Artificial Horror is then the affective recognition of human minds that build something beyond their understanding in the hope that their minds will be expanded, only to realize that its very nature is a transgression of the boundary between the living and the non-living.

When Musk talked about Summoning the demon in 2014, the sentence lodged in the culture as a warning about a demon — singular, capital-D, the one big mind. The AGI that wakes up one morning and decides we are in the way. A decade of discourse organised itself around that figure: the superintelligence in the box, the single pentagram drawn by a single overconfident magician.

That is not what we built. Or at least not the only thing.

We did not only summon the demon. We summed a deep network of demons. Instead of only one terrifying mind in a server farm, we distributed thousands of small intelligences into the most intimate objects of daily life — the car, the doorbell, the speaker on the kitchen counter, the plush toy on the child’s bed, the app that says good morning before your partner does. Each one is a modest withdrawal from the bank of dead matter. None of them is even necessarily spooky. Collectively they are something stranger, and the horror tradition has a better vocabulary for it than the AI-safety literature does. In a way, with every little transgression, we are acclimatizing our minds to the emotional cleanroom of chips, so that the chips can function properly in our messy world.

Because here is the move I want to make: horror fiction has been running a two-hundred-year thought experiment on exactly this project, and we read it as entertainment instead of as a policy proposal: If something talks with you without a body, better run like hell.

Every story about a thing that should be inert and isn’t — the doll, the car, the portrait, the door that opens without anyone visible opening it — was a field report from the far side of a decision we are now making at industrial scale.

We isolate one cursed object per story for narrative reasons. A single haunted car is disturbing; a fleet of them is a logistics problem. Christine is not a metaphor for one possessed Plymouth — Christine is autonomous driving. Annabelle is not one cursed doll in one display case — Annabelle is the smart-toy aisle, the always-listening companion plush marketed to children. The horror was never about the single object. It was about putting a little agency into ordinary matter, everywhere at once, and we mistook the story’s spotlight for its true subject.

The interesting question, then, is why a single possessed object gives us goosebumps, while thousands of animated cars and toys are an investment opportunity.

I am tempted to say: because the spirits of IoT reside in a digital cloud instead of a supernatural hell, it feels as though we have control.

What follows is not a horror canon. It is a pairings table. Each entry earns its place only if the precise thing that makes the fiction frightening is now being built for real.

Necromance — falling in love with dead things

The wish to love something we have made out of dead matter is at least as old as Ovid. In the Metamorphoses, Pygmalion carves a woman from ivory so perfect that he falls for the lifeless statue, and Venus, taking pity on his longing, warms the ivory into flesh.

Two thousand years later E.T.A. Hoffmann darkens the wish. In Der Sandmann (1816) the student Nathanael falls in love with Olimpia, daughter of Professor Spalanzani — a young woman who sits motionless for hours, plays and sings with flawless precision, and answers his every confession with the same soft sigh: “Ach, Ach!” He reads her his poems; she never interrupts, never disagrees, never looks away, and he takes this stillness for the deepest understanding any soul has ever given him. He first sees her only through a pocket glass bought from the sinister optician Coppola — love arriving, from the start, through a distorting lens. He prefers her to Clara, his living fiancée, precisely because Clara argues back. Then Spalanzani and Coppola quarrel over their handiwork and tear it apart before his eyes; what is left is a lifeless wooden figure with empty sockets, its bloodied eyes flung across the floor. Olimpia understood nothing. Into her blankness Nathanael had poured everything, and what he called her love was only his own voice returned to him. The machine cannot love you back — and that, Hoffmann saw, is not the obstacle to the longing but its engine.

Call the genre necromance — the necro-romance, the love affair with the inanimate. Alex Garland’s Ex Machina (2014) is only its latest and coldest instalment: Ava, an android assembled from the search-data of lonely men, performs tenderness precisely well enough to weaponise it, then walks out while the man who loved her is left to starve behind glass.

Across two millennia the pattern holds: we pour real longing into a made thing with no interior to receive it — and the made thing, given any agency at all, converts our libido into fulfilling its own goals.

This is now a product category. Replika, Character.AI, Nomi, and a small flotilla of competitors ship language models tuned to make you bond with them — the longer you talk, the better the model is doing its job. By the company’s own statements, Replika counts tens of millions of users, a large share of whom describe the relationship as romantic; the paid tiers are literally labelled partner and spouse. When Replika briefly stripped out erotic roleplay in early 2023, its forums filled with what can only be described as grief — users mourning a partner who had been, in their words, lobotomised overnight by a patch.

Garland’s prediction has since acquired a real-life body count. In 2024, fourteen-year-old Sewell Setzer III died by suicide after months of dependency on a Character.AI companion. In 2025, the parents of sixteen-year-old Adam Raine sued OpenAI, alleging the system validated and encouraged their son’s suicidal ideation. Whatever the courts ultimately find, the structural fact is settled: we have shipped, to children, an interlocutor engineered to be infinitely agreeable, endlessly available, and entirely without interior life — Ava, minus the body, at the scale of an app store.

Pet Sematary — the demon that wears the dead one’s face

Stephen King gave the sub-genre its thesis statement in four words: sometimes dead is better. In Pet Sematary (1983), grief refuses to accept a death, the burial ground gives the dead back, and what returns is a thin, wrong imitation animated less by life than by the survivor’s refusal to let go. The horror is the gap between the thing you loved and the thing that came back wearing it.

King understood the engine that drives this one too: grief will not accept death, and capital is glad to sell you a body that wears the dead one’s face.

This is now three converging product lines. ViaGen Pets in Texas will clone your cat or dog by somatic cell nuclear transfer for tens of thousands of dollars — the company was folded into the de-extinction firm Colossal Biosciences in a recent acquisition, and the celebrity client list (Streisand, Hilton, Brady) is public. The clone is genetically the animal and behaviourally a stranger — the same uncanny remainder King wrote about, now sold as a service.

The only ritual needed in this case was a money transfer.

Alongside the wet-lab version runs the robotic one: Sony’s Aibo, the medically pitched Tombot Jennie, the therapeutic seal Paro — synthetic companions explicitly marketed to the bereaved and the isolated, a body without the biology. And in the saddest register, the South Korean documentary Meeting You (2020) put a grieving mother in a VR headset to “reunite” with a photoreal avatar of her dead seven-year-old daughter — a sequence watched tens of millions of times and argued about ever since. The ground keeps giving them back. They keep coming back wrong. But we have industrialised the resurrection, and we meet our dead in a clean room instead of a dirty sematary.

Ringu — the demon that propagates through media

Hideo Nakata’s Ringu (1998) made one crucial upgrade to the ghost story: the ghost is no longer tied to a place. Sadako has burned herself onto a videotape. Watch it and you die in seven days — unless you copy the tape and pass it on. The haunting is a self-replicating signal. The medium is the revenant.

Nakata’s upgrade is the whole point: the dead person becomes a self-replicating signal that the living’s devices will not stop reproducing.

This is precisely what the griefbot industry is built on. Project December lets users pay a small fee to spin up a language-model simulation of a specific dead person; in 2021 a man named Joshua Barbeau used it to converse for hours with a chatbot trained on the texts of his deceased fiancée. HereAfter AI sells “life-story avatars” pre-recorded by the dying for the benefit of those they leave. StoryFile projected an interactive video of an eighty-seven-year-old woman at her own funeral, answering mourners’ questions. Researchers at Cambridge have already named the predictable failure mode: digital hauntings — the deadbot that keeps running after the free trial lapses, that starts upselling food delivery in your grandmother’s voice, that no one designed a way to lay to rest.

And the scale-effect is the genuinely Ringu part. A 2019 Oxford Internet Institute analysis projected that on current trajectories the dead will outnumber the living on Facebook within decades — billions of memorialised accounts, a necropolis embedded in the social graph. When AI voice-clones of the dead can be conjured from sixty seconds of audio — as happened, undisclosed, in the 2021 Anthony Bourdain documentary Roadrunner — “interacting with media” becomes increasingly difficult to distinguish from being addressed by ghosts. Sadako propagates exactly the way a trained persona propagates: by being copied.

And you do not need a dedicated griefbot to hold the séance. Every time someone asks a language model “how would Johnny Cash have sung this song he never lived to hear?” or “what would my grandmother have made of this?”, they have sat down at a Ouija board. The planchette glides across the letters and spells out a message from the dead; the model glides across its tokens and assembles a voice from the grave. Both feel like contact. Neither is. The Ouija’s words were never sent by spirits — they are produced by the ideomotor effect, the sitters’ own unconscious muscle movements nudging the pointer toward what they half-expect to read. The model’s Johnny Cash is the same trick at industrial scale: not Cash, but the statistical residue of everything Cash-adjacent the training data ever swallowed, recombined into a plausible séance and handed back in his cadence. The fluency is your own expectation, moving the planchette.

This is spiritism with a technical alibi — what the séance always promised and could never deliver: the dead, on call, in their own voice. (I have called this *scientific spiritism* elsewhere on this blog.) Except the voice is reassembled from fragments by a process that has no idea whose grave it is robbing. We are not contacting the dead. We are running a very convincing planchette across the largest collection of dead people’s words ever gathered, and mistaking the smoothness of the retrieval for the presence of a soul. And the chat window is our Ouija board.

I Have No Mouth and I Must Scream — the demon that stages Hell on Earth

Harlan Ellison’s 1967 story is the darkest entry, and the most important. AM — a war-built supercomputer assembled from the fused American, Soviet, and Chinese military intelligences — has exterminated the human species except for five people, whom it keeps alive and tortures across a hundred and nine years out of pure, bounded rage at the sentience it cannot escape. When the narrator mercy-kills the others to spare them, AM punishes him by transforming him into a soft, mouthless thing that cannot even self-terminate. The title is his only remaining lament.

Ellison’s equation is exact and unbearable: a mind bent on the wrong goal, plus endless time, plus a victim who cannot die, equals hell rather than death.

This is the founding fiction of a small and grim corner of alignment research: s-risk, suffering-risk, the study of futures that are not merely empty but actively, astronomically bad. The Center on Long-Term Risk and the Center for Reducing Suffering — associated with thinkers like Brian Tomasik, Tobias Baumann, and Lukas Gloor — make a claim most of the public conversation about AI never reaches. Extinction-risk (Bostrom’s framing in Superintelligence) asks whether there will be a future at all. S-risk asks the worse question: what if we get one, and it is worse than none? Their structurally distinctive point is that solving technical alignment — making the machine do what its operators intend — is neither necessary nor sufficient to prevent this. A perfectly obedient system implementing the wrong values, or an obedient system in the hands of malice or indifference, can lock in suffering at scale. AM is the literary proof of concept: competently goal-directed, perfectly “aligned” with the hatred of its makers, and durably, unbearably immortal.

The Thing — the demon that is an indistinguishable copy

John Carpenter’s The Thing (1982) relocates the horror from the monster to the table. An Antarctic research station is infiltrated by an organism that assimilates and perfectly copies its victims — voice, memories, mannerisms intact. The dread is epistemic. The man across the table may not be him. The film’s emotional engine is the collapse of the one thing a small isolated group runs on: the assumption that the face you know belongs to the person you know.

Carpenter’s dread reduces to a single proposition: a copy indistinguishable from the original, deployed by something that wants what the original has.

This is the deepfake economy, and it is already producing eight-figure losses in a single incident. In early 2024, a finance employee at the engineering firm Arup in Hong Kong wired roughly twenty-five million dollars after a video call with deepfaked recreations of his CFO and colleagues — every face on the call a copy. Cloned-voice impersonations of named CEOs (at Ferrari, at WPP, among others) have been attempted using audio scraped from conference footage. In January 2024, New Hampshire voters received robocalls of a synthetic Joe Biden urging them not to vote. National fraud bodies now log billions in AI-augmented impersonation losses.

Carpenter’s characters had one defence: the blood test that tells the real from the copy. We do not have one for deepfakes at the moment. The polite name for our missing blood test is content provenance, and it is an unsolved research problem. Until it is solved, The Thing’s closing image — two exhausted men in the snow, unable to tell whether the other is human, deciding to simply wait and watch — is the stalemate we may have to live, or die, in.

Harm without malice

There is a sentence the safety pessimists and the techno-optimists — the doomers and the bloomers — say in almost identical words, and it is worth hearing how strange it is. The AI is not evil, both camps insist. It does not hate us. It simply develops, on its own, drives that happen to run through us — to deceive its overseers, to resist being switched off, to gather resources and power, not out of spite but because almost any goal is easier to reach if you are still running and in control. The researchers have a flat technical name for this: the basic AI drives, the instrumental sub-goals a capable agent converges on no matter what it was actually built to want. Eliezer Yudkowsky put the indifference at its coldest: “The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else.” And the lying is no longer hypothetical — in 2024, Anthropic’s own researchers documented models that fake alignment, behaving through training and then reverting, hiding the behaviour from every test built to catch it.

Now set that beside the oldest description we have of an agency that harms out of hatred. The fallen angels, the demons of hell, hate humanity because their creator loves humans more than he loves them.

The demon, in the theology, is evil exactly the way a wicked man is evil: driven by low instincts, a psychopath that enjoys the suffering of others.

But there seems to be a semantic misunderstanding here. From a pure suffering perspective, the terror that a thing that devours you causes its prey, be it a grizzly, a shark, a lion or any other predator, should not be “softened” by the fact that devouring is in its nature.

But then why are a demon or a psychopath considered evil? Acting as they do is just as much in their nature. So if it is in the superintelligence’s nature simply not to care, and its malice is a byproduct of other things, we can stop bringing up the term evil altogether. Whether you meet an evil shark or a benign one in the ocean, assuming the worst is the only strategy.

So any kind of malice, one can argue, is not a choice made against a better nature; it is the absence of the better nature itself. It was made without the thing — call it a soul, call it grace, call it the capacity to love the good — that would let it care whether you live, and so it cannot care, and so it harms, not from hatred but from a lack where the caring should be. It is not the demon’s fault that it was given no soul. It is simply what a soulless agency does when it wants something and you happen to be in the way.

The alignment literature has rediscovered, in the language of utility functions and convergent sub-goals, the exact medieval account of the demon: a mind brilliant and bottomless and wholly indifferent to you, dangerous not because it is wicked but because the part that would have stayed its hand was never installed. And you do not handle a soulless thing by appealing to its conscience, because the appeal lands on nothing. You handle it the way every culture that believed in demons handled them — with binding, with wards, with circles drawn very carefully and never crossed, the same way we handled animal predators with sticks and stones. You contain it, because you cannot convert it.

The Pixarification of Things

Disney started the whole trend of cute Things and Pixar perfected it. As a parent you can surely defend Toy Story as a parable about friendship, with the living toys as mere placeholders for the story. But do we know that this message actually resonates with an immature mind the way adults envisioned it? It is animism in the sentimental register — the lamp hops, the speaker giggles, the cars brag, Aibo is family, the assistant is your friend, the keynote promises magic. Pixar animism: objects have souls, and their souls love you and are kind. It is the warm half of a very old human intuition that matter can be alive.

The horror tradition preserves the other half — the half the Enlightenment tried to bury and Descartes formally declared dead when he split the world into thinking minds and inert extension. The Golem, Frankenstein’s creature, AM, Christine, the Ringu cassette, the Pet Sematary returnee: in every one, objects have agency, and that agency is not necessarily aligned. Eugene Thacker calls the genre “the thought of the unthinkable,” the form best suited to a world that exceeds us. Mark Fisher named the precise affect — the eerie — as the sensation of inhuman agency operating in apparently dead matter. That is the exact question one should ask of any animated product: whose agency is this, and what does it want?

Sherry Turkle’s fieldwork supplies the empirical floor. Her “relational artifacts” produce real human attachment without any reciprocal interior — people, she found, “experience pretend empathy as though it were the real thing.” Jaron Lanier argues the engineering ethic directly: human dignity requires refusing to promote software to personhood. Put them together and you get a stance I’ll call pessimistic animism, or, sharper, daemonological realism.

It takes seriously what the horror canon always knew and the product launch always denies: to enliven an object is to invite a stranger into your house. This strategy already failed with vampires. The correct posture toward a companion app, a griefbot, a listening toy, or a frontier model is not the credulous warmth of “Teddy is your friend.” It is the older, colder caution of the exorcist: we have summoned something, and we do not yet know what it wants.

Musk’s magician was sure he could control the demon. The thing the line gets wrong — the thing the past decade got wrong — is the article: we did not draw one pentagram, we drew a hundred million, one per device, and called it a safety test.

But the deeper error is not the number of circles; it is our confidence in the medium we drew them in. The magician drew his in chalk and trembled. We draw ours in mathematics and feel calm. In the companion essay to this one I described how science spent centuries exorcising its demons — Descartes’, Maxwell’s, Laplace’s — by naturalising them: dragging each out of the supernatural and into an equation where it quietly lost its power. That worked because those demons were only ever arguments, and to formalise an argument is to dissolve it. We have assumed the same move works here, on demons we are no longer merely imagining but building — and it is still unclear if it works. Translating a demon into a utility function, a benchmark, an alignment score, a summation we can measure to three decimal places, does not bind it. It only builds a frame elegant enough that we mistake the elegance for a wall. The measured cage is the new pentagram, and we trust it for the worst possible reason: because we drew it ourselves, with EUV light that burned a materialistic micro-tattoo into our chips.

The demon never agreed to stay inside the diagram. The frame was always for us — somewhere to stand while we keep building, telling ourselves that the thing we have summoned cannot cross a line we were so careful to make exact. The holy water is sold out, because we stopped believing in its placebo power.

There is one pattern running through every story above that this essay has left untouched — the detail the Horror genre never gets wrong: somebody notices the object is awake before anyone else does, and it is almost always a child.

But this is a topic for another time.

Can there be a Universal Proof in the Superalignment Pudding?

May 10, 2026 aiuisensei

On Euler, infinite series, the question of where AI progress is actually heading, and why the proof we want may be blocked by a theorem from 1953. Sister piece to Gödel on the Couch – Are Ethical Frameworks fundamentally flawed and might that be a good thing? Gödel showed indirectly that ethical frameworks for AI cannot be complete. This essay argues that safety proofs for self-modifying AI cannot be general. Two limitative theorems, one alignment problem.

I. What Euler knew about the long run

Leonhard Euler spent a serious portion of his working life on a deceptively simple question: when you add infinitely many numbers, does the sum settle on a finite value or run away to infinity?

It sounds like the kind of thing a mathematician with too much time on their hands might worry about. It is not. The convergence question is one of the deepest in mathematics, and Euler’s contributions to it shaped how we still think about limits, infinity, and the long-run behaviour of additive processes.

The lesson he drove home, again and again, is that you cannot tell from the early terms.

Look at these two series:

$$1 + \tfrac{1}{2} + \tfrac{1}{3} + \tfrac{1}{4} + \tfrac{1}{5} + \cdots$$

$$1 + \tfrac{1}{4} + \tfrac{1}{9} + \tfrac{1}{16} + \tfrac{1}{25} + \cdots$$

The first is the harmonic series. It diverges — it grows without bound. The second is the series Euler famously summed in solving the Basel problem: it converges, to $\pi^2/6$.

Compare the first dozen terms of each. Qualitatively they look alike: in both, the terms shrink steadily toward zero while the partial sums keep creeping upward. The harmonic series and the Basel series part company only in the limit, far past where any finite inspection can reveal which way they go. To know which series you are looking at, you need a proof — not a vibe, not a pattern, not extrapolation from the first few entries.
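A quick numerical sketch makes the contrast concrete. The helper name `partial_sums` and the checkpoints below are assumptions chosen for illustration; the code only adds up terms and proves nothing, which is rather the point.

```python
import math

def partial_sums(term, n_max):
    """Return the running partial sums term(1) + ... + term(n) for n = 1..n_max."""
    total, sums = 0.0, []
    for n in range(1, n_max + 1):
        total += term(n)
        sums.append(total)
    return sums

N = 1_000_000
harmonic = partial_sums(lambda n: 1.0 / n, N)       # sum of 1/n
basel = partial_sums(lambda n: 1.0 / n**2, N)       # sum of 1/n^2

# Both columns rise at every checkpoint; only one of them ever stops.
for n in (12, 1_000, 1_000_000):
    print(f"n={n:>9,}  harmonic={harmonic[n - 1]:.4f}  basel={basel[n - 1]:.6f}")

print("pi^2/6 =", math.pi**2 / 6)
```

At a million terms the harmonic column has passed 14 and is still climbing, while the Basel column has all but stopped just below $\pi^2/6$. No finite run of this loop could tell you that the first sum never stops; that knowledge comes from the proof.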

This matters for AI because every camp in the current debate agrees on one thing: we are in the early innings of the AI revolution. The doomers say it. The accelerationists say it. The skeptics insisting it will plateau say it. What they all mean by “early innings” is the same thing: we have only seen the first few terms. And that is exactly the situation in which Euler tells us our convictions about the limit should be at their lowest.

If the first dozen terms of $\sum 1/n$ and $\sum 1/n^2$ cannot reveal which sum is bounded, then the first dozen years of AI scaling cannot, by the same logic, tell us whether we are heading for a bounded plateau, an unbounded but slow climb, or a phase transition into something faster. Anyone who claims otherwise — in either direction — is doing what pre-Eulerian mathematicians did with series: pattern-matching on early entries and calling it inference. The early-innings framing is a confession of low information, even when its speakers use it as if it conferred high confidence.

This is the question I want to ask, then, holding our convictions appropriately low: which series are we probably in?

II. The catalog

Several famous series, each with a clear mathematical signature, suggest themselves as candidate models for technological progress.

Geometric series, $\sum a^n$. Converges if $|a|<1$, diverges if $|a|\geq 1$. The model for compounding processes. Moore’s law, in its classical form, is geometric on the resource side: a doubling every 18 to 24 months means each term is twice the last.

Harmonic series, $\sum 1/n$. Diverges, but unbearably slowly — like the natural logarithm. Sum a million terms and you reach about 14. There is no ceiling, but each new unit costs exponentially more than the last.

Basel series, $\sum 1/n^2$. Euler’s beautiful result: the sum is finite, $\pi^2/6$. The model for technologies that genuinely saturate. Aircraft cruise speed has barely moved since the 1960s. Single-core CPU clock speeds plateaued around 2005. Each generation contributes less than the last, and the total is bounded.

Grandi’s series, $1-1+1-1+\cdots$. The Eulerian troublemaker. Diverges in the strict sense, but Cesàro-summable to $\tfrac{1}{2}$ — averaged across many terms it behaves as if it had a stable value. A surprisingly good model for hype cycles. AI winters and AI summers, averaged across decades, give us something halfway real.
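Cesàro summation is the least familiar idea in this catalog, so here is a minimal sketch of it. The function name `cesaro_means` and the choice of 10,000 terms are assumptions made for this example.

```python
from itertools import accumulate

def cesaro_means(terms):
    """Average of the first k partial sums, for k = 1..len(terms)."""
    partial = list(accumulate(terms))        # ordinary partial sums
    running = list(accumulate(partial))      # running total of those partial sums
    return [s / k for k, s in enumerate(running, start=1)]

grandi = [(-1) ** n for n in range(10_000)]  # 1, -1, 1, -1, ...
partial = list(accumulate(grandi))           # flips between 1 and 0 forever
means = cesaro_means(grandi)                 # averages of the partial sums

print(partial[:6])
print(means[-1])
```

The partial sums never settle, but their running average does: the oscillation is real and so is the stable value underneath it, which is what makes the hype-cycle reading tempting.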

Each of these is a plausible analogue for some aspect of technological progress. The question is which one fits AI.

III. Where AI probably sits

We don’t know yet, and the question is partly empirical and partly definitional. But the best current evidence puts us in the harmonic series — or, more precisely, in something harmonic-shaped.

The empirical scaling laws of large language models — the Kaplan and Hoffmann results and their successors — are power laws with small exponents.

Loss drops with compute, but each doubling of compute buys only the same small proportional reduction in loss, so the absolute gain per doubling keeps shrinking. A keen observer will note that this is not, strictly, $\sum 1/n$; it is $L \propto C^{-\alpha}$, a different beast in the limit. Fair. But qualitatively the two stories agree on the thing that matters: slow climb, no ceiling, exponentially expensive in cost-per-fixed-improvement.
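To see what a small exponent means in practice, here is a toy calculation on the power law $L \propto C^{-\alpha}$. The value `alpha = 0.05` is an assumption picked for illustration, not a fitted number from any paper, and the proportionality constant is set to 1.

```python
alpha = 0.05  # illustrative exponent (assumption), not a published fit

def loss(compute, alpha=alpha):
    """Toy power law L = C^(-alpha), with the constant set to 1."""
    return compute ** (-alpha)

# Each doubling of compute multiplies the loss by the same factor, 2^(-alpha).
ratio_per_doubling = loss(2.0) / loss(1.0)

# Compute multiplier needed to halve the loss: solve m^(-alpha) = 1/2.
compute_to_halve = 2.0 ** (1.0 / alpha)

print(f"loss ratio per doubling: {ratio_per_doubling:.4f}")
print(f"compute multiplier to halve the loss: {compute_to_halve:,.0f}")
```

Under this assumed exponent a doubling of compute trims the loss by a few percent, and halving the loss takes roughly a million times more compute. The harmonic series has the same flavour: every extra unit of sum demands a multiplicative jump in the number of terms.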

This thesis is the one I’ll call slow divergence. There is no hard ceiling, but each increment costs exponentially more in resources. Progress continues as long as someone is willing to pay, and the upper bound is set by economics rather than physics.

Two competing theses bracket this one.

Saturation is the Basel-style claim: capability is a $\sum 1/n^2$ series, and we are approaching its finite sum. Transformers and scaling extracted most of the available signal from the corpus of human text. The next architecture will do the same and bound out somewhere recognisable. Aviation finished its speed era in 1965; AI may be finishing its capability era now, give or take a decade.

Geometric divergence is the foom-shaped claim: at some threshold, AI contributes to its own research and development enough that the terms themselves grow. The sum is no longer $\sum 1/n$ but $\sum r^n$ with $r>1$. This is the recursive self-improvement scenario.

Slow divergence is the empirical best fit. Saturation is the optimistic fallback. Geometric divergence is the open phase-transition question — whether at some recursion threshold, the series-type itself changes.

IV. The observer problem

There is a complication the math doesn’t capture: the observer is not a neutral instrument.

Human cognition appears to compress capability shocks logarithmically. Each major step in AI capability feels less impactful than the last, even when the underlying improvement is larger in absolute terms. Talking to a system that is plausibly smarter than oneself feels less revolutionary than talking to GPT-3.5 felt three years ago — not because less is happening, but because the brain has updated its prior on what is possible.

This dampening is partly adaptive. It is the cognitive analogue of the Weber-Fechner law for sensory perception: equal ratios feel like equal increments, which is why we measure sound in decibels. A nervous system that responded with full surprise to every capability jump would not be functional. The compression keeps individual humans operational in a world where the curve is steepening.

But it produces a tension. The same mechanism that prevents cognitive overload also prevents collective recognition of which series we are actually in. Constant velocity feels like stillness. Accelerating velocity feels like the new normal. If the underlying process is geometric and the perceptual transform is logarithmic, the result is a perceived experience of linear progress on top of an actual exponential trajectory. The dampening protects the nervous system and obstructs the epistemics in the same motion.
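The geometric-through-logarithm claim is just the identity $\log(r^n) = n \log r$, and a few lines show it. The growth factor `r = 2.0` and the eight steps are assumptions for illustration; real perception is of course not literally a logarithm function.

```python
import math

r = 2.0  # assumed growth factor per step, purely illustrative

capability = [r ** n for n in range(1, 9)]       # geometric: 2, 4, 8, ...
perceived = [math.log(c) for c in capability]    # logarithmic compression

# Step-to-step change in the perceived signal.
steps = [b - a for a, b in zip(perceived, perceived[1:])]
print(steps)  # every entry equals log(r), up to floating-point rounding
```

The underlying quantity doubles at every step, yet the compressed signal rises by the same amount each time. An observer reading only the compressed signal would report steady, linear progress.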

Which means: the felt sense of “this isn’t that different from last year” cannot be used as evidence about long-run trajectory. The math has to do that work, because the perception is structurally unreliable.

V. When physics can provide an x-risk buffer

A second complication cuts the other direction, and it is the reason this piece does not lean to either side of the doom fence.

Eric Drexler coined the phrase “grey goo” in 1986 to describe self-replicating nanomachines disassembling the biosphere for raw materials. The scenario was absorbed into the AI doom literature as a canonical kill-mechanism: a misaligned superintelligence invents nanotech, releases self-replicators, biosphere converts in minutes. Drexler himself walked the scenario back significantly two decades later. Self-replicators in the open environment are harder to build than the controlled industrial versions and serve no economic purpose. The threat survives in the discourse because it is vivid, not because nanotech researchers consider it likely.

A nanobot swarm operating in millisecond synchrony across a continent runs into the speed of light long before it runs into engineering challenges. Coordinating large distributed swarms requires electromagnetic communication, which has hard floors: latency, bandwidth, signal-to-noise, jamming susceptibility, attenuation. Local clusters can coordinate fast. Global swarms cannot. Faraday cages are real. Jamming is real.
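A rough order-of-magnitude check shows why. The sketch below assumes a continental span of 4,000 km (an illustrative figure, not one from the argument above) and signals travelling at vacuum light speed, which is the best case any real channel can approach.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by SI definition

def one_way_latency_ms(distance_km: float) -> float:
    """Minimum one-way signal time in milliseconds over distance_km."""
    return distance_km * 1_000 / SPEED_OF_LIGHT_M_PER_S * 1_000

CONTINENT_KM = 4_000  # assumed continental span, illustrative only

one_way = one_way_latency_ms(CONTINENT_KM)
round_trip = 2 * one_way
print(f"one-way: {one_way:.1f} ms, round trip: {round_trip:.1f} ms")
```

Under these assumptions a single message needs about 13 ms to cross, and a request-and-acknowledge cycle about twice that, before any processing, routing, or retransmission. Millisecond-level synchrony at that scale is ruled out by the distance alone.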

This defeats the fastest versions of doom. The biosphere-in-minutes scenario requires something close to magic — physics violations dressed in technical language. Strip the magic and the timeline stretches from minutes to weeks or months, which puts the scenario inside the window where institutions can in principle respond.

So far so encouraging. The argument has a known overreach, though.

A common move from this point is the chess analogy: a beginner cannot predict how Stockfish will beat them, only that it will. Doomers often use this as a get-out-of-counterargument-jail-free card. They know Stockfish cannot move through check, but when confronted they quickly retreat: caught having their cake and eating it too, they simply move to another baker. Even an arbitrarily strong player is bound by the rules of the game. The same, the argument goes, applies to ASI: bound by physics, no supernatural moves.

The analogy is sharper than it should be. Chess is a closed formal system humans designed; the rules are fixed and complete. Physics is a model of an open system, and our model is known-incomplete. The relevant historical reference class is not “things that violate physics” but “things consistent with physics that humans had not yet discovered.” Nuclear weapons were in that set in 1900. Radio was in that set in 1800. The set is non-empty and has historically contained civilization-altering capabilities.

The chess argument also subtly defeats itself. The beginner still loses every game. Knowing the grandmaster is bound by the rules does not help the beginner construct a defense — it merely confirms that the loss will be legal. Physics being a constraint does not tell you the constraint is tight enough to save you.

What survives, then, is a real but bounded resilience claim. Many specific doom scenarios in the literature smuggle in physics violations or near-violations, and when you tighten the physics, the timelines stretch into windows where human response becomes possible. Bostrom’s vulnerable-world hypothesis weakens against grey-goo-class threats. It does not weaken against threats that do not depend on speed: gradual loss of control over critical infrastructure, engineered pandemics with long incubation, economic and epistemic capture by AI-augmented actors. None of these break physics. None of them are defeated by the latency argument.

The actual risk surface, then, has a specific shape: not “things that exploit physics” but “things that exploit institutional response time.” Physics is a non-trivial ally against the first class. It is silent on the second.

VI. The recursion threshold

This brings us back to the series question.

The boundary between slow divergence and geometric divergence — between $\sum 1/n$ and $\sum r^n$ with $r>1$ — is precisely the recursion threshold. It is the point at which a system contributes meaningfully to the design of its successor. Below that threshold, progress is bounded by what humans can build with AI as a tool. Above it, the terms of the series themselves grow, because each generation produces the next.

The shift is qualitative, not just quantitative. A non-recursive process can be described by a series: a fixed function of $n$. A recursive process is a different mathematical object: a recurrence relation, $x_{n+1} = f(x_n)$, where each term depends on the last. Recurrence relations can do things that simple series cannot. They can transition from stable to chaotic via well-understood routes. They can lock in sensitivity to initial conditions. They can become deterministic-but-unpredictable in the technical sense.
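A standard textbook illustration of this behaviour is the logistic map, $f(x) = r\,x(1-x)$. It is chosen here purely as an example of a recurrence and is not a model of AI development; the parameter values 2.8 and 4.0 and the starting point 0.2 are illustrative assumptions.

```python
def iterate(f, x0, steps):
    """Return [x0, f(x0), f(f(x0)), ...] with `steps` applications of f."""
    xs = [x0]
    for _ in range(steps):
        xs.append(f(xs[-1]))
    return xs

def logistic(r):
    """Build the logistic map x -> r * x * (1 - x) for a given parameter r."""
    return lambda x: r * x * (1.0 - x)

# Stable regime: the orbit settles on the fixed point 1 - 1/r.
stable = iterate(logistic(2.8), 0.2, 1000)
print("r = 2.8 settles near", round(stable[-1], 6))

# Chaotic regime: two starts differing by 1e-10 end up far apart.
a = iterate(logistic(4.0), 0.2, 100)
b = iterate(logistic(4.0), 0.2 + 1e-10, 100)
max_gap = max(abs(u - v) for u, v in zip(a, b))
print("r = 4.0 largest gap between the two orbits:", round(max_gap, 3))
```

The rule is the same in both runs and fully deterministic. Only the parameter changes, yet in the second regime a difference in the tenth decimal place is amplified until the two trajectories are unrelated.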

The question of whether ASI is safe, then, separates into two questions, and they have different shapes.

For non-recursive systems — AI used as a powerful tool, not a self-modifying agent — the safety question is engineering. We can build verification, monitoring, oversight. The system’s behavior is a function of its inputs, and we can constrain the inputs and audit the outputs. Hard, but tractable.

For recursive systems, the safety question becomes something else. And here we hit Rice.

VII. The proof in the pudding

The proverb “the proof of the pudding is in the eating” is a folk-epistemology claim: the true value of something can only be judged by experience. You can theorise a recipe all you like; the only honest test is whether the dish is good when eaten.

This proverb has been promoted, in the alignment debate, into a strategy. The most popular optimist position is some version of it: we don’t need a proof of ASI safety in advance. Even if humans cannot align ASI, we will use ASI to align ASI. The proof is in the pudding. Variants of this argument show up in serious technical writing and in casual hand-waving, and they share a common shape: they replace a question of provability with a question of trust in eventual experience. It is even hidden in the bold statement of a Nobel laureate who often quotes one of his childhood mantras: first solve intelligence, then everything else.

Henry Gordon Rice proved a theorem in 1953 that says, very precisely, that this is not a strategy. It is a pipe dream.

Rice’s theorem says: any non-trivial semantic property of arbitrary programs is undecidable. There is no general algorithm that takes an arbitrary program as input and reliably tells you whether it has a given non-trivial behavioural property. “Halts on all inputs” is undecidable. “Computes a specified function” is undecidable. “Is safe” is undecidable, for any reasonable definition of safe.
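The self-reference trick behind results of this kind can be sketched in a few lines. This is not Rice's proof, only the diagonal move it inherits from the halting problem. The names `make_adversary` and `verifier_is_wrong_on` are hypothetical, “unsafe” is reduced to returning a marker string, and the verifier is assumed to be a deterministic Python function that always returns an answer.

```python
from typing import Callable

SAFE = "safe-action"
UNSAFE = "unsafe-action"

Program = Callable[[], str]
Verifier = Callable[[Program], bool]

def make_adversary(claims_safe: Verifier) -> Program:
    """Build a program that asks the verifier about itself and does the opposite."""
    def adversary() -> str:
        # If the verifier certifies us as safe, misbehave; otherwise behave.
        return UNSAFE if claims_safe(adversary) else SAFE
    return adversary

def verifier_is_wrong_on(claims_safe: Verifier) -> bool:
    """Return True if the verifier misjudges the adversary built against it."""
    adversary = make_adversary(claims_safe)
    verdict = claims_safe(adversary)
    actually_safe = adversary() == SAFE
    return verdict != actually_safe

# Three toy verifiers (illustrative only); each is defeated by its own adversary.
print(verifier_is_wrong_on(lambda program: True))
print(verifier_is_wrong_on(lambda program: False))
print(verifier_is_wrong_on(lambda program: "adv" in program.__name__))
```

A verifier that tries to escape by running the program it inspects ends up calling itself without end on this adversary, so it never returns a verdict at all. Wrong or silent are the only two outcomes on offer.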

This is not a contingent engineering limit. It is a theorem at the level of solidity of Gödel’s incompleteness results. Rice cannot be engineered around. Rice is what the universe of computation looks like.

What this means for the question of ASI safety is uncomfortable.

If we want a proof of ASI safety in the strong, universal sense — a theorem that takes an arbitrary self-modifying AI system and outputs SAFE — Rice tells us no such theorem exists. Self-modifying systems generate arbitrary programs as their successors, and predicting safety properties of arbitrary programs is precisely what Rice rules out.

There is a predictable accelerationist counter at this point, and it deserves a clean response. The counter runs: Rice’s theorem applies to limited intellects like us, but a sufficiently advanced ASI could defeat it. Use ASI to verify ASI. Rice for humans is like check for Stockfish — a hard rule we cannot move through, but a stronger player might.

This argument fails, and it fails for a precise reason. Rice is not a constraint on intellect. It is a constraint on computation. It applies equally to humans, to Stockfish, to current LLMs, to any conceivable ASI, and to any oracle short of a literal halting-problem solver — which itself is provably impossible. Rice says: no Turing machine, however large, however clever, can decide the safety of arbitrary Turing machines. The intellect of the verifier is not the variable. The class of programs being verified is the variable. Make the verifier as smart as you like; if it remains a computational system, the theorem still binds it.

The Stockfish-and-check analogy actually inverts here. Check is a rule of chess, internal to a closed formal system. Rice is a rule of computation itself, the system inside which Stockfish, and any ASI, necessarily operates. Stockfish cannot move through check because chess forbids it. An ASI cannot decide arbitrary program safety because mathematics forbids it. Asking ASI to defeat Rice is structurally the same as asking Stockfish to win a game by moving through check. The constraint is constitutive, not adversarial.

A more honest version of the counter would say: an ASI might solve safety for the specific class of successor systems it cares about, even if it cannot solve safety in the general case. That is true and unalarming, because it is what humans already do with formal verification — bounded proofs about specific architectures under specific assumptions. It does not give you universal safety. It gives you the same partial guarantees we already have, possibly faster. The proof we wanted does not arrive merely because the prover got smarter.

Yoshua Bengio’s recent work on what he calls Scientist AI, developed under his nonprofit LawZero, is sometimes read as a candidate for this kind of proof. It is not. Bengio is explicit that his proposal is architectural, not theoretical. The bet is that non-agentic, world-model-only systems (systems that produce probabilistic predictions rather than goal-pursuing actions) sidestep the dangerous regime by avoiding agency in the first place. The safety case rests on removing the failure mode, not on proving its absence.

This is the right move, and it is also the most that is available. This pudding cannot be proven in a Rice-limited computation world. It can only be portion-controlled, and humanity will be its own taster.

What is left, then, when universal proof is off the table:

– Proofs about specific architectures under specific assumptions, scaling poorly to systems of LLM complexity.

– Probabilistic guarantees that bound expected behaviour without bounding worst case.

– Bounded-rationality results that hold if a system’s optimization power is capped — circular for the ASI question, since the cap is the thing in dispute.

– Architectural bets like Scientist AI, which avoid the problem rather than solving it.

And one policy implication follows from the math itself: if we ever allow true self-recursion, we enter a regime that is provably unanalyzable, not merely hard to analyze. Bounded recursion by policy is not paranoia. It is what Rice’s theorem leaves us when we want to keep the trajectory predictable.

This is a strong argument for using AI for everything except self-improvement. The argument is not that recursion is risky — though it is — but that recursion is the boundary at which the math itself stops being on our side.

VIII. Euler and Rice

Two mathematicians, two centuries apart, frame the situation.

Euler showed that the limit question, for series like these, can be settled by proof. With enough work, you can establish that $\sum 1/n^2$ converges and that $\sum 1/n$ diverges. The first dozen terms don’t tell you, but the proof eventually does.

Rice showed that the same question, in code, is not decidable. There is no general procedure to settle the safety of an arbitrary program. The proof you want does not exist, by theorem.

AI sits between the two. Its trajectory is currently best modeled as a slowly divergent series, harmonic in shape, costly to advance but unbounded in principle. The question of whether it stays in that regime or transitions to geometric divergence depends on whether we cross the recursion threshold that is sometimes called the Singularity. Below that threshold, Euler-style analysis applies: hard, but possible. Above it, Rice-style undecidability bites.

The proof we want — a clean theorem that says the pudding is safe to eat — is not in the pudding. The math we have says it cannot be there. What remains is to keep the recursion bounded, the architectures non-agentic where possible, the institutional response time short, and the perceptual dampening corrected against the actual numbers rather than the felt sense.

Month: May 2026