The Machine That Must Behave · The Jagged Frontier

Consider the difference between a song and a video game. A song is a finished sequence. It is recorded once, and from then on it plays the same way every time; a listener consumes it from beginning to end. A game is not like that. A game is a generative system — a set of rules that defines a vast space of possible states, which a player explores by acting. The designer does not author the experience directly. The designer authors the machine that produces the experience, and the player drives it.

This distinction matters more than it first appears, because it changes what “produce it in a factory” would even mean. The survey’s opening chapter argued that the master variable across every creative medium is the verifier — wherever a machine can cheaply check whether an output is good, automation tends to win. Music and film were treated there as finished artifacts. A game is not an artifact. To manufacture one in an AI factory would mean generating a machine that behaves coherently across a combinatorial explosion of things the player might do: every order of operations, every edge case, every sequence break.

That requirement exposes the fault line cleanly. A song that is five percent incoherent is still listenable; a listener forgives a weak bridge. A game that is five percent broken is unshippable. The quest flag fails to fire, the character clips through the floor, the save file corrupts — and the product is not a flawed version of itself but no product at all. Generative models are superb at producing plausible surfaces and weak at guaranteeing invariants, the properties that must hold true across every state the system can enter. Surface is cheap; invariants are not.
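To make the word "invariant" concrete, here is a minimal Python sketch that tries every order in which a player could perform three actions in a toy quest and checks one property after each step. The quest, the action names, the flags, and the deliberate bug are all hypothetical, invented for illustration. A real game has far too many states to enumerate this way, which is why guarantees are expensive.

```python
from itertools import permutations

# Hypothetical toy quest: three actions the player may take in any order.
ACTIONS = ("talk_to_npc", "pick_up_key", "open_door")


def step(state, action):
    """Return the state after one action; a state is a frozenset of flags."""
    flags = set(state)
    if action == "talk_to_npc":
        flags.add("quest_started")
    elif action == "pick_up_key":
        flags.add("has_key")
    elif action == "open_door":
        # Deliberate bug: the door opens without checking for the key.
        flags.add("door_open")
    return frozenset(flags)


def invariant(state):
    """The door may only be open if the player holds the key."""
    return "door_open" not in state or "has_key" in state


def find_violations():
    """Try every ordering of the actions; return the orderings that break the invariant."""
    broken = []
    for order in permutations(ACTIONS):
        state = frozenset()
        for action in order:
            state = step(state, action)
            if not invariant(state):
                broken.append(order)
                break
    return broken


if __name__ == "__main__":
    for order in find_violations():
        print("invariant broken by order:", " -> ".join(order))
```

The surface of this quest looks fine if the player follows the expected route. The bug appears only in the orderings where the door is tried before the key is picked up, which is the kind of sequence break a plausible-looking output can hide.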

The best verifier and the worst

And yet games also possess something the earlier media lacked: in part, the best verifier of all. Does the code compile? Do the automated tests pass? Can an agent finish the level? Is the difficulty balanced? These are checkable questions with mechanical answers. Where a question is checkable, a machine can be trained to exceed human demonstrations on it — the same regime in which game-playing systems long ago surpassed their teachers. So part of a game sits squarely in the territory where automation is strong.
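One of those checkable questions, whether a level can be finished at all, reduces in the simplest case to a reachability search. The sketch below is an assumption-laden illustration, not any studio's actual tooling: the grid encoding (`S` for start, `G` for goal, `#` for wall) and the function name `level_is_completable` are hypothetical, and the "agent" is a plain breadth-first search, not a learned player.

```python
from collections import deque


def level_is_completable(grid):
    """Breadth-first search from 'S' to 'G'; '#' cells are walls."""
    rows, cols = len(grid), len(grid[0])
    start = next(
        (r, c) for r in range(rows) for c in range(cols) if grid[r][c] == "S"
    )
    seen = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        if grid[r][c] == "G":
            return True
        # Explore the four orthogonal neighbours that are in bounds and open.
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (
                0 <= nr < rows
                and 0 <= nc < cols
                and grid[nr][nc] != "#"
                and (nr, nc) not in seen
            ):
                seen.add((nr, nc))
                queue.append((nr, nc))
    return False


# Two hypothetical 3x3 levels: one with a path to the goal, one walled off.
SOLVABLE = ["S.#", "..#", "#.G"]
BLOCKED = ["S.#", ".##", "#.G"]

if __name__ == "__main__":
    print(level_is_completable(SOLVABLE))
    print(level_is_completable(BLOCKED))
```

The answer is mechanical and cheap, so a generator can produce candidate levels and discard the unfinishable ones without a human in the loop. Nothing in the check says whether the level is worth playing.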

The complication is that a game is not one thing in one regime. It decomposes into layers, and each layer sits in a different verifier regime. The honest answer to whether games can be produced by machines is therefore not a single answer but a stack of them.

Mechanics and code

The mechanics-and-code layer is verifiable, which is why the simple end of the spectrum is essentially already solved. The rules of Tetris or Angry Birds are tiny, the state space is small, the whole implementation is a few hundred lines, and a model can write it and test it against itself. No unsolved step remains to act as a bottleneck.

The evidence is already visible on the storefronts. Steam saw over 230 games released in the first week of January 2026 alone, many of them AI-made, with the year on track to beat the prior record of more than 19,000 titles. Legitimate games now launch, as developers put it, into a black hole of spam. The industry coined a term for the phenomenon during 2025: gameslop, low-quality AI-assembled titles flooding the stores with minimal human curation. The factory at this tier is not a forecast. It is already running.

Assets

The asset layer — art, textures, voice, music — is the music-and-image case in a new costume, and it is automating fast inside real studios. Activision has used AI for everything from concept art to player surveys and has sold AI-generated in-game items for real money. This is where the labor conflict is sharpest. In July 2025 the actors’ union SAG-AFTRA ratified an AI-focused Interactive Media Agreement with roughly ninety-five percent approval, establishing that a performer’s voice cannot be cloned without written consent.

The legal complication runs deeper than labor. The US Copyright Office has confirmed that content generated by AI with no human creative contribution cannot be copyrighted. For a studio contemplating a fully AI-dependent pipeline, this is not a footnote but a structural problem: an unownable game is a weak commercial asset, because anyone may copy it freely.

Content and levels

The content-and-level layer is where it helps to remember that games have had AI factories for forty years. They are called procedural generation — algorithms that assemble levels, items, or terrain rather than placing each one by hand. Rogue did it in 1980; Diablo, Minecraft, and No Man’s Sky scaled it up enormously.

Procedural generation also taught the cautionary lesson the whole question circles. No Man’s Sky generated eighteen quintillion planets, and players found them boring, because infinite variation without authored meaning is just noise. Quantity of generated content has steeply diminishing returns on meaning. The games that use generation well — Spelunky, Dead Cells — do something narrower and more disciplined: they machine-combine hand-authored, carefully tuned chunks inside tight human-designed constraints. Human-designed grammar, machine-assembled instances. That hybrid, not autonomous generation, is the realistic model, and it has the same shape as the bifurcation seen in music.
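The hybrid can be sketched in a few lines of Python. Everything here is hypothetical: the chunk names, their difficulty scores, the edge heights, and the two design rules are invented for illustration and are not taken from Spelunky, Dead Cells, or any real generator. The point is only the division of labour: the chunk list and the rules are the human-authored grammar, and the random choice is the machine assembly.

```python
import random

# Hypothetical hand-authored chunks. 'left' and 'right' are the floor heights
# at each edge, so that neighbouring chunks can be required to line up.
CHUNKS = [
    {"name": "flat_run", "difficulty": 1, "left": 0, "right": 0},
    {"name": "step_up", "difficulty": 2, "left": 0, "right": 1},
    {"name": "high_ledge", "difficulty": 3, "left": 1, "right": 1},
    {"name": "drop_down", "difficulty": 2, "left": 1, "right": 0},
    {"name": "spike_pit", "difficulty": 4, "left": 0, "right": 0},
]
HARD = 3  # chunks at or above this difficulty count as "hard"


def generate_level(length, seed=None):
    """Assemble a level from CHUNKS under two human-designed rules:
    adjacent edges must match, and two hard chunks never appear in a row."""
    rng = random.Random(seed)
    level = []
    edge = 0  # every level starts at ground height
    previous_was_hard = False
    for _ in range(length):
        candidates = [
            c
            for c in CHUNKS
            if c["left"] == edge
            and not (previous_was_hard and c["difficulty"] >= HARD)
        ]
        chunk = rng.choice(candidates)
        level.append(chunk["name"])
        edge = chunk["right"]
        previous_was_hard = chunk["difficulty"] >= HARD
    return level


if __name__ == "__main__":
    print(generate_level(8, seed=1))
```

With this particular chunk set the candidate list is never empty, because every edge height has at least one non-hard chunk that starts there. A larger hand-authored library would need the same property checked, which is itself part of the design work the generator does not do.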

World models

Then there is the genuinely new thing: world models that hallucinate the game frame by frame as the player acts, with no engine and no assets underneath at all. Google’s Genie 3, shown in August 2025, generates real-time interactive 3D worlds at 720p and 24 frames per second that stay coherent for several minutes — a real leap over Genie 2’s blurry, passive fifteen-second clips. DeepMind opened a Project Genie preview to top-tier subscribers in late January 2026.

Two observations temper the excitement. The first is that the limits are the coherence problem that recurs throughout this survey: the model’s visual memory reaches back only about a minute even when a scene holds together for longer, it cannot be fully steered, and it is costly to run at scale. Maintaining a persistent, rule-governed world over a hundred hours of play is exactly the long-horizon coherence these models are worst at. The second observation is more telling than the first. Genie’s developers are not, by their own account, aiming to make games. They are building it as a way to train AI agents inside simulated environments. The frontier where the AI is the game remains a tech demo of a possible future, not a shippable product.

Is it fun

The top layer is the question of whether the game is fun, and here the difficulty is worse than anything the earlier media posed. If the survey has had a hardest case all along, it has been judging whether a work is good — the Faust problem, where taste resists any objective measure. Fun is that problem made worse. Fun has no objective function, no formula a machine can score against, and it is emergent, arising from the interaction of systems rather than residing in any single component. There is no way to evaluate it without playing, and no way to playtest cheaply at the scale at which a machine generates candidates. AI agents serving as automated playtesters can help find exploits and check balance, but they cannot tell anyone whether the thing is worth making in the first place.

The complexity axis

Stack those layers against the complexity of the game itself and the picture comes into focus. The original question proposed the right cut: simple games like Tetris and Angry Birds at one end, vast ones like Elden Ring and GTA 6 at the other.

Tier	Example	Verifier regime	AI resistance
Simple	Tetris, Angry Birds	Code-checkable; tiny state space	Low — already a factory
Mid-tier	Indie, roguelike, mobile	Mixed; assets and levels automatable	Moderate; where the economic disruption is most violent
AAA	Elden Ring, GTA 6	Fun and coherence un-checkable at scale	High — the irreducibility moat

Simple games are gone, already manufactured at volume. But even here, the originals — the exact falling-block tension of Tetris, the feel of Angry Birds’ physics — were tail-event design insights, rare and unpredictable. A machine clones them infinitely without having invented them, and would struggle to invent the next one. The factory copies the discovery; it did not make it.

The mid-tier — indie, roguelike, mobile — is where the economic disruption is most violent, because it is where automation collapses the cost structure most sharply. AI shrinks team size and timeline until a solo developer approaches what a small studio used to be. The market floods, and discovery — the problem of being found at all — dies under the volume. The industry frames this as a contest between a bull case and a bear case. The bull case holds that AI supercharges independent developers to challenge the giants. The bear case holds that a flood of disposable copycats kills the independent scene and hands more power to the few studios holding strong intellectual property. Both appear to be happening at once, against a backdrop of layoffs that make cost-cutting adoption move faster than enthusiasm for it.

The irreducibility moat

The AAA tier (Elden Ring, GTA 6) is more resistant than anything else this survey has examined, in any medium, and the reason closes the loop back to where the survey began. A great open-world game is literally a designed complex system: emergent and non-linear, where the meaning arises from the interaction of rules with player agency and cannot be specified from the parts. It has to be grown and tuned through iteration, not derived.

This is not the provenance moat that protects an original work of art, where value rests on who made it and when. It is an irreducibility moat. The reason an Elden Ring cannot be spat out of a factory is the same reason a complex system cannot be computed faster than simply running it. There is no shortcut around the play.

The distinction maps onto Cynefin, the sense-making framework that sorts problems by how knowable they are. Tetris lives in the Clear and Complicated domains — ordered, knowable, and therefore automatable, where cause and effect can be analyzed in advance. Elden Ring lives in the Complex domain, where one can only probe, sense, and respond, and where the design reveals itself only by being played. No amount of analysis substitutes for the probe.

The cheapest way to learn how a complex system behaves is to run it. For a great game, running it means playing it.

On top of irreducibility, the AAA game is also a scene good. People play GTA in the week it launches because everyone else is playing it, and that shared social event cannot be synthesized by any model. The brand and the unownability of AI-only assets, already discussed, only deepen a moat that complexity builds on its own.

A material, not only a threat

So the realistic near term is not an AI Netflix for games, generating finished worlds on demand. It is replacement at the bottom and augmentation at the top. The large studios will pour AI into their pipelines to survive cost pressure — assets, code, quality-assurance agents, NPC dialogue — which changes who can afford to make big games without letting a factory generate one autonomously.

There is one way in which games are special. In every other medium AI is purely a threat to the product: a cheaper substitute competing with the human-made version. In games, AI is also a new material inside the product. Generative characters that improvise, dialogue that never repeats, worlds that respond to the player in ways no designer scripted: these turn part of the disruption into a new design space rather than only a replacement. The factory comes for the slop tier. The top tier absorbs AI as a tool and, if anything, grows more valuable because the slop tier exists.