Module 14

AI Futures: AGI, ASI, and the Trajectories

Definitions, timelines, capability trajectories, alignment, hardware, and the constraints that bound the next two decades.

~25 min read · Advanced · Builds on M10

Three CEOs, Three Timelines, Three Different Things

In the first months of 2026, three of the most-watched people in the AI industry made public statements about when transformative AI would arrive.

Dario Amodei, head of Anthropic, repeated his line that “powerful AI” — systems with the intellectual range of Nobel laureates across many disciplines — would emerge in late 2026 or early 2027.

Demis Hassabis of Google DeepMind, asked the same question at a conference in India, said “maybe within five years,” and added that getting there would require one or two further breakthroughs on the order of the Transformer or AlphaGo.

Yann LeCun, formerly Meta’s chief AI scientist, repeated his standing position: at least a decade, probably much longer, and almost certainly not from large language models alone.

Three credible people, asked the same question, gave timelines that span more than a decade. Amodei is talking about a deployable system that exceeds expert human performance across many cognitive tasks. Hassabis is talking about a system that can also genuinely invent — that could, in principle, derive relativity or design Go from first principles. LeCun is talking about something else again: a system grounded in models of the physical world, with reasoning architectures that the field has not yet built.

Most of the disagreement is definitional. The empirical questions — how fast capabilities are progressing, where the bottlenecks are — have a narrower spread than the timeline numbers suggest.

A Short Tour of the Words

Four terms appear repeatedly in this discussion, and the same word can carry quite different meanings in different rooms.

Artificial General Intelligence (AGI) is the most contested. In its older usage, AGI meant a system that could perform any intellectual task a human can, at roughly human level, across domains. In recent industry usage it has drifted toward “an AI that can perform most economically valuable computer work autonomously” — a narrower and more measurable target. Sam Altman has called it “a very sloppy term.”

Artificial Superintelligence (ASI) denotes a system that substantially exceeds the best human performance across virtually every cognitive domain, including scientific creativity, social judgement, and what people sometimes call wisdom. ASI sits beyond AGI by definition, but the gap between them is unspecified — it could be years, or it could be very short.

The Singularity starts from a mechanism: a system that can improve itself, then use the improved version to improve again — recursive self-improvement. The point at which that loop makes AI-driven change so rapid that human civilization is irreversibly transformed is, in Ray Kurzweil’s formulation, the Singularity.

High-Level Machine Intelligence (HLMI) is the term used in academic surveys, defined as machines accomplishing every task better and more cheaply than human workers. It is roughly equivalent to economic-AGI.

These distinctions matter because the timelines change dramatically depending on which one you mean. A 2023 survey of nearly 1,700 AI researchers — the Expert Survey on Progress in AI, or ESPAI — placed the median estimate for HLMI at 2047. The 2024–2025 forecaster surveys, with smaller and more selected samples, put it at 2030. The Metaculus community, in early 2026, estimated a 50% chance of AGI by November 2033 — and revised that estimate outward during late 2025, against the prevailing direction of the news cycle.

A seventeen-year spread is large, but it is not random. Each of those numbers is produced by a population with predictable biases. The ESPAI sample is wide and includes researchers across academic and industrial settings, where caution about claims that a senior colleague might dispute is a professional norm; survey work in the field has documented that this tends to push median timelines later. The 2024–2025 forecaster surveys are smaller, more self-selected, and weight more heavily toward people who already work near the frontier — a population that has watched recent capability jumps from inside, and would be unusual if it did not extrapolate aggressively. The Metaculus community sits between the two: a large, self-selected group of forecasters with a track record on shorter-horizon questions, whose collective revisions are useful precisely because the direction of the revision is observable. The fact that Metaculus moved its AGI estimate later in late 2025, while the news cycle was loud with capability announcements, is the kind of signal worth weighting more than any single point estimate. One further question belongs in the kit: who benefits from the prediction? A CEO’s timeline is, among other things, a message to investors, recruits, and regulators.

The State in Early 2026

This section is a status report; a reader who already follows the industry closely can skip to the next section and lose nothing.

Frontier model releases have moved from a quarterly to a rolling cadence. February 2026 saw GPT-5.3-Codex (the model OpenAI markets as its agentic-coding flagship), Claude Opus 4.6, and Google’s Gemini 3.1 Pro within a few weeks of one another; April added Claude Opus 4.7, Google’s Deep Research Max, and a gated preview release Anthropic calls Claude Mythos. The “model war” is no longer something that happens twice a year; it happens continuously.

The capability frontier has shifted from chat to long-running computer work. The systems that the labs are now most eager to demonstrate are agents that plan, use tools, run code, browse the web, hand off to subagents, and recover from errors over hours or days.

Reasoning, in the technical sense, has become a commodity feature. Every major lab now ships some form of inference-time deliberation: OpenAI’s “thinking” models, Google’s Deep Think, Anthropic’s extended and adaptive thinking, DeepSeek’s reasoning models (trained with a pipeline in which the model is rewarded for solving problems whose answers can be programmatically checked, an approach known as reinforcement learning with verifiable rewards, or RLVR), and reasoning variants from Grok and Qwen. The interesting question has shifted from “can the model reason?” to “can it plan, use tools, verify, recover from errors, and keep useful state over hours?”
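The idea of a programmatically checkable reward can be made concrete with a toy example. This is a minimal sketch, not any lab's actual pipeline: the function name `verifiable_reward` and the integer-answer task are hypothetical, chosen only to show that the reward comes from a deterministic check and not from a human rating.

```python
def verifiable_reward(model_answer: str, expected: int) -> float:
    """Return 1.0 if the model's final answer parses to the expected integer, else 0.0.

    Hypothetical checker for illustration: real pipelines use richer verifiers
    (unit tests, proof checkers, exact-match graders), but the shape is the same.
    """
    try:
        return 1.0 if int(model_answer.strip()) == expected else 0.0
    except ValueError:
        # Anything that is not a clean integer earns no reward.
        return 0.0


print(verifiable_reward(" 42\n", 42))
print(verifiable_reward("forty-two", 42))
```

The design point is that nothing subjective enters the signal, which is also why this style of training is confined to domains where answers can be checked cheaply.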

Older benchmarks are saturating, and the new ones look very different. Where the industry once measured itself against MMLU and HumanEval, the current frontier is agentic and operational: SWE-Bench Pro, Terminal-Bench 2.0, OSWorld, Humanity’s Last Exam, ARC-AGI-2 and 3, BrowseComp, GDPval-AA, CyberGym. These are harder to compare across labs because each one uses different evaluation harnesses, different sets of tools the model is allowed to call, different effort settings (how much inference-time compute the model is given to think), and different contamination screens (procedures for checking whether benchmark questions leaked into training data).

Sidebar (Exponential realist): Inference cost ($/Mtoken), Nov 2022 – Oct 2024

The cost of running a large language model fell from roughly $20 per million input tokens in late 2022 to about $0.07 per million tokens by late 2024 — a drop of about 286× in two years. The mechanism is a mix of hardware specialization, model distillation (training smaller models on the outputs of larger ones), competitive pressure across providers, and serving-side optimization.
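The arithmetic behind that figure can be checked in a few lines. This is a minimal sketch that takes the two endpoint prices above at face value and assumes, purely for illustration, a smooth exponential decline over a window of exactly two years; the variable names are hypothetical.

```python
import math

# Endpoint prices quoted in the text, in dollars per million tokens.
price_late_2022 = 20.00
price_late_2024 = 0.07
years = 2.0  # assumption: the window is treated as exactly two years

fold_drop = price_late_2022 / price_late_2024
# Assumption: a constant yearly rate of decline between the two endpoints.
annual_factor = fold_drop ** (1 / years)
# Months needed for the price to halve at that constant rate.
halving_months = 12 * math.log(2) / math.log(annual_factor)

print(f"total drop: {fold_drop:.0f}x")
print(f"implied yearly drop: {annual_factor:.1f}x")
print(f"implied halving time: {halving_months:.1f} months")
```

Under those assumptions the price halves roughly every three months, which gives a sense of how steep the curve is.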

This is the kind of curve worth taking seriously. It is anchored in a measurable metric, the time window is short enough to verify, and the underlying drivers are well understood. If even half of this rate continues through 2028, the set of tasks that are economical to automate expands further than most current revenue projections assume — which is one of the reasons revenue and infrastructure spending have diverged so sharply.

The label that fits this picture is uneven progress. Capabilities are arriving faster than institutional readiness to deploy them. Enterprise pilots fail more often than they succeed. The MIT GenAI Divide study in 2025 found that most pilots produced no measurable profit-and-loss impact. A controlled trial by METR, also in 2025, found that experienced developers using early-2025 AI tools took 19% longer on tasks in familiar codebases. The same period also saw companies cite AI in 55,000 announced job cuts, an order of magnitude above two years prior.

Why Forecasts Fail: The Extrapolation Trap

There is a recurring way that confident predictions about the future go wrong, and it is worth naming before reading any of the trajectories below, including the ones in this module. In 1966 the novelist Harry Harrison tried to picture New York thirty-three years out, in Make Room! Make Room!: a city of 35 million, most of them starving, food and water exhausted. He reached it honestly, by taking the population and consumption trends of his day and continuing the lines. His own arithmetic ran as follows: the United States in 1950, with 9.5 percent of the world’s population, consumed 50 percent of its raw materials, so “within fifteen years” it would consume “over 83 percent.” What the lines could not capture was that the system was already bending away from them as he wrote. The post-war birth boom had peaked years earlier and the contraceptive pill, approved in 1960, was spreading, so US population growth was decelerating rather than exploding. On the resource side, rising scarcity pushes prices up, which pushes substitution and efficiency, slackening the very shortages a straight line projects. The prediction failed not from bad arithmetic but from straight-line thinking about an adaptive system.

The systematic study of that failure is Philip Tetlock’s. Across decades of scored forecasts he found that credentialed experts — especially the confident, media-facing kind — often predict complex political and economic events no better than chance, and that the people who do reliably better, his “superforecasters,” are set apart less by what they know than by how they work: they think in explicit probabilities, break a big question into checkable pieces, and update in small steps as evidence arrives. The lesson is not that forecasting is hopeless but that its honest form looks nothing like the confident point prediction — and, as Module 2 argued, the reason is structural. The agents in a complex system adapt, so the further out you extrapolate, the more of what you are extrapolating has already moved.
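What “scored forecasts” means in practice can be shown with the Brier score, a standard scoring rule for probability forecasts. The text above does not name a scoring rule, so its use here is an assumption, and the two forecasters and four questions are invented for illustration.

```python
def brier_score(forecasts: list[float], outcomes: list[int]) -> float:
    """Mean squared gap between forecast probabilities and 0/1 outcomes.

    Lower is better; always answering 0.5 scores exactly 0.25.
    """
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# Hypothetical data: four yes/no questions and how they resolved.
outcomes = [1, 0, 0, 1]
confident_pundit = [1.0, 1.0, 0.0, 0.0]  # certain every time, right half the time
calibrated = [0.7, 0.3, 0.2, 0.6]        # explicit, hedged probabilities

print(f"confident pundit: {brier_score(confident_pundit, outcomes):.3f}")
print(f"calibrated:       {brier_score(calibrated, outcomes):.3f}")
```

The rule punishes confident misses heavily, so the pundit who is certain and right only half the time scores worse than someone who answers 0.5 to everything.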

This is the discipline the rest of the module tries to keep. The capability-trajectory table further down is a Harrison-style extrapolation — a set of lines continued — and is labelled as exactly that, to be tracked rather than trusted. The five constraints that follow are there for a Tetlock-style reason: the usual way an AI extrapolation fails is that one of them binds first, the way decelerating fertility and the price mechanism overtook Harrison’s.

Five Constraints Worth Keeping in Mind

Landauer’s principle sets an absolute minimum energy cost for erasing a bit of information: roughly 3 × 10⁻²¹ joules at room temperature. To get a sense of scale: current GPUs spend somewhere between 10⁻¹⁵ and 10⁻¹⁴ joules on a single floating-point multiply-accumulate operation, which is about a million times above the Landauer floor. The floor is genuinely physical — no engineering can remove it — and the gap to it is the headroom that hardware efficiency improvements are eating into. Compute scaling faces a physical ceiling, not just an economic one.
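The Landauer figure follows directly from the formula k·T·ln 2. The sketch below assumes 300 K for “room temperature” and takes the per-operation GPU energy range from the paragraph above at face value.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # exact SI value of the Boltzmann constant
room_temp_k = 300.0               # assumption: "room temperature" taken as 300 K

# Landauer bound: the minimum energy to erase one bit is k * T * ln(2).
landauer_j = BOLTZMANN_J_PER_K * room_temp_k * math.log(2)

# Per-operation energy range quoted in the text for current GPUs.
gpu_low_j, gpu_high_j = 1e-15, 1e-14

print(f"Landauer floor: {landauer_j:.2e} J per bit")
print(f"headroom: {gpu_low_j / landauer_j:.1e}x to {gpu_high_j / landauer_j:.1e}x")
```

The comparison is order-of-magnitude only: the floor is per bit erased, while a multiply-accumulate touches many bits, so the usable headroom is somewhat smaller than the raw ratio suggests.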

The Kaplan and Chinchilla scaling laws describe a curve that flattens: roughly halving a model’s error takes about ten times the compute — a regularity Module 10 lets you drive for yourself in its scaling-laws explorer. The laws are at once the strongest evidence for continued progress (the curve has not yet broken) and the clearest constraint on its pace, which is why recent gains have shifted to new recipes — mixture-of-experts routing, test-time deliberation, verifiable-reward training — each with its own scaling curve still to discover.
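The rule of thumb above pins down an exponent if one assumes a pure power law. The following sketch makes that assumption explicit; the exponent it derives is implied by the “ten times the compute halves the error” rule as stated here, not a fitted value taken from the Kaplan or Chinchilla papers.

```python
import math

# Assumption: error follows a pure power law, error = a * compute ** (-alpha).
# If 10x compute halves the error, then 10 ** (-alpha) = 1/2.
alpha = math.log(2) / math.log(10)


def error_after(compute_multiplier: float, start_error: float = 1.0) -> float:
    """Error remaining after scaling compute by `compute_multiplier`."""
    return start_error * compute_multiplier ** (-alpha)


print(f"implied exponent alpha = {alpha:.3f}")
for multiplier in (10, 100, 1000):
    print(f"{multiplier:>5}x compute -> {error_after(multiplier):.3f} of the starting error")
```

Read this way, a thousandfold increase in compute buys three halvings, which is why the same curve supports both optimism and caution.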

Goodhart’s law — when a measure becomes a target, it ceases to be a good measure — is the reason benchmark saturation does not equal AGI. As specific benchmarks become things that labs optimize against, their signal degrades. A model that scores 95% on a suite of named tests may not be 95% as useful in production, and the gap widens with optimization.

Technology adoption S-curves begin from an observation about reach: some inventions, such as the internet, change life profoundly, while others affect only a small corner of it, like a faster way to produce East Frisian rock sugar. An invention of the first kind is called a general-purpose technology, and it is usually not very useful on its own: it needs complementary inventions, and people must first work out what it is for. Its adoption therefore traces an S-curve: slow, then steep, then slow again.

One curve, four phases: invention, development, takeover, ubiquity. Electricity crossed it in roughly forty years; the internet in about fifteen.

Even the fastest crossings — forty years for electricity, about fifteen for the internet — are slow by the standards of current AI predictions, and the slow part is rarely the technology. It is the development phase: the surrounding work of complementary invention, organizational change, training, regulation, and trust.
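The S-curve shape can be sketched with a logistic function. This is an illustration under stated assumptions, not a fitted model: the logistic form is one common choice, and “crossing” is defined here, arbitrarily, as the time taken to go from 10% to 90% adoption.

```python
import math


def adoption_share(year: float, midpoint: float, crossing_years: float) -> float:
    """Logistic adoption share in [0, 1].

    Assumption: `crossing_years` is the time taken to go from 10% to 90%
    adoption, which fixes the steepness at 2 * ln(9) / crossing_years.
    """
    steepness = 2 * math.log(9) / crossing_years
    return 1 / (1 + math.exp(-steepness * (year - midpoint)))


# Durations from the text; the midpoint is set to 0 so years are relative.
for name, span in (("electricity", 40), ("internet", 15)):
    shares = [adoption_share(t, 0, span) for t in (-span / 2, 0, span / 2)]
    print(name, [f"{s:.0%}" for s in shares])
```

Both technologies trace the same curve; only the time axis is stretched, which is the sense in which the slow part is the surrounding adaptation and not the invention.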

Principal-agent problems describe the structural difficulty of one party (the agent) acting on behalf of another (the principal) when their incentives diverge and the principal cannot fully observe the agent’s actions. Deploying autonomous AI agents creates this problem at unprecedented scale and intensity. Crucially, capability and verifiability are in tension: the more capable the agent, the harder it becomes to monitor what it is actually doing. This places a natural speed limit on how much autonomy organizations will delegate, independent of how capable AI systems become.

When you encounter a confident long-range AI prediction, ask which of these constraints it is implicitly betting against — unbounded compute, costless scaling, faithful benchmarks, instant adoption, or unconditional delegation.

Capability Trajectories: Software as the Leading Indicator

If you want to track how fast AI capability is actually moving, software development is the most informative single domain. It is measurable in ways that essay writing and “creativity” are not — code either compiles, passes its tests, and ships, or it does not. It is also the domain that the major labs use internally, which means improvements are visible quickly in their own deployment.

A reasonable trajectory for the rest of the decade, extrapolated from observed progress in 2024 and 2025, runs roughly as follows:

2025 — Agentic coding: systems that autonomously generate, refine, and manage multi-file projects, with sustained multi-hour runs.
2026 — Autonomous refactoring agents capable of full-project changes (framework migrations, dependency upgrades, technical-debt sweeps) with minimal supervision and CI/CD validation.
2027 — Codebase co-ownership: AI agents acting as persistent contributors that open pull requests, schedule refactors, and maintain documentation alongside human teams.
2028 — Specification-to-deployment: a system can take an ambiguous human specification, ask clarifying questions, generate architecture, code, and tests, and deploy to production.
2029 — Heterogeneous multi-agent collaboration, with specialized agents (frontend, backend, infrastructure, security) coordinating across repositories.
2030 — Continuous autonomous optimization of live systems.

This is an extrapolation, not a forecast; track it against reality.

Sidebar (Exponential realist): Frontier training compute (FLOPs), 2010 – 2025

The compute used to train frontier AI systems has grown roughly 4× per year for the last decade and a half, fast enough that the largest training runs in 2025 use about a billion times more compute than the largest runs in 2010 (4¹⁵ ≈ 1.07 × 10⁹). The drivers are well understood: more capable accelerators, larger and cheaper clusters, and a willingness to pay training bills measured in hundreds of millions of dollars.
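The “billion times” figure is just the annual rate compounded. A short check, taking the 4× rate and the 2010 to 2025 window from the paragraph above and assuming the rate was constant throughout:

```python
import math

growth_per_year = 4.0  # rate quoted in the text
years = 2025 - 2010    # fifteen annual steps

# Assumption: the same multiplier applies every year of the window.
total_growth = growth_per_year ** years
doubling_months = 12 * math.log(2) / math.log(growth_per_year)

print(f"cumulative growth: {total_growth:.2e}x")
print(f"doubling time: {doubling_months:.0f} months")
```

A 4× annual rate is equivalent to doubling every six months, which is a useful way to hold the number in mind when comparing it with slower curves.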

This trend cannot continue indefinitely. The energy required is growing roughly in proportion, and energy is now the binding constraint, not silicon. The honest forecast is not “compute will keep growing at 4× per year forever” but “the curve will bend within this decade as it runs into grid capacity, capital cost, and Landauer-floor efficiency limits.” When it bends matters more for AI futures than whether it bends.

There is also a second-order point that recurs in this trajectory. Ray Kurzweil identified computer programming as “the main bottleneck for superintelligent AI”: once an AI is good enough at programming to improve its own programming skill, a positive feedback loop opens. Both Anthropic and OpenAI have publicly stated that they use their own coding models to help debug, deploy, and optimize parts of their training and serving infrastructure. Neither lab has quantified the share of internal infrastructure work that is now AI-assisted, so the strength of the loop cannot be inferred from public information; what can be observed is that the loop now exists in a non-trivial form, where two years ago it did not.

From Alignment Theory to Deployment Controls

The discussion of AI safety has shifted noticeably between 2020 and 2026. In the earlier period it was largely a discussion of alignment in the abstract: how to specify human values to a sufficiently powerful optimizer, how to avoid Goodhart-style reward hacking, how to think about long-run existential risk from systems that did not yet exist. In 2026 the conversation is substantially more concrete.

Constitutional AI, the approach of training models at the level of identity and values rather than specific rules, remains the most visible alignment technique in production. It is incomplete; no current method is a complete control regime for highly agentic systems. Mechanistic interpretability — the project of understanding what neural networks are actually computing — has made real progress, with millions of internal features identified and circuits mapped for specific behaviors. It is now possible, in narrow cases, to detect signs of deception or hidden objectives inside a model.

The most interesting recent shift is the emergence of capability gating as an active safety instrument. Anthropic’s Project Glasswing, announced in April 2026, is the clearest example. The Claude Mythos Preview model is deemed useful enough for critical defensive cybersecurity work — Anthropic reports it has been used to find thousands of high-severity vulnerabilities in critical software — but not appropriate for general release.

The mechanism behind that decision is worth being clear-eyed about. The gating decision is made by Anthropic’s internal safety and policy teams, against criteria the company sets itself. Access is granted to vetted partners under contractual restrictions; there is no public appeal process and no independent external auditor with veto authority over the decision. This is a meaningful change in posture from earlier publish-with-a-system-card releases — but in regulatory terms it is self-governance by a private firm, not an externally enforced safety regime. Whether that distinction matters depends on whether the underlying trust holds. It is the kind of arrangement that works while it works and reveals its limits when it does not.

Misuse risk has also become more concrete than long-run alignment risk on near-term horizons. The same coding capability that helps patch software helps find and exploit vulnerabilities. Bio and cyber safeguards are now central to the release decisions of every frontier lab. Misuse, not classical alignment, is the binding near-term problem.

The governance landscape has, meanwhile, fragmented along familiar lines. The United States revoked President Biden’s AI safety executive order in January 2025 and has pivoted toward competitiveness and deregulation, with a “Winning the Race” action plan oriented around roughly ninety deregulatory actions. The European Union is proceeding alone with the AI Act, whose high-risk system rules come into force in August 2026. State-level transparency legislation in California (SB 53) and New York (the RAISE Act) has emerged as a pragmatic middle ground. Export controls on advanced chips — the lever the U.S. has actually been willing to pull — remain the single most consequential governance instrument in play. What is emerging is a layered patchwork rather than a regime.

Hardware and Energy: The Floor Under Everything

Accelerated AI servers have made data centre electricity demand a first-order constraint: grid connection times, generation buildout, cooling capacity, and long-term power purchase arrangements now determine where AI infrastructure can be built and how fast. The sidebar carries the numbers.

Sidebar (Exponential realist): Global data centre electricity (TWh/year), 2024 – 2030 (IEA base case)

The IEA’s base case puts data centre electricity demand at roughly 415 TWh in 2024, rising to about 945 TWh by 2030 — a more-than-doubling in six years, with AI accelerators responsible for most of the increase. This is the first time in modern energy history that a single computational technology is projected to be a meaningful driver of global electricity demand growth.
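The two endpoints imply a compound growth rate that is easier to compare with other energy figures than the raw multiple. The sketch below uses only the two numbers quoted above and assumes smooth growth between them.

```python
demand_2024_twh = 415.0  # IEA base case, as quoted in the text
demand_2030_twh = 945.0
years = 2030 - 2024

multiple = demand_2030_twh / demand_2024_twh
# Assumption: smooth compound growth between the two endpoint years.
annual_growth = multiple ** (1 / years) - 1

print(f"2030 demand is {multiple:.2f}x the 2024 level")
print(f"implied compound growth: {annual_growth:.1%} per year")
```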

The mechanism is straightforward — more accelerators, higher utilization, more inference per query as reasoning becomes the default — but the second-order effects are large. New data centre capacity is being sited in regions with available power and cooling, not in regions with the most users; this is reshaping the geography of cloud infrastructure. It is also reviving interest in nuclear power for baseload, and the first-of-their-kind thorium reactor experiments are beginning to attract serious capital.

Hardware itself continues to advance, but not at an accelerating rate. Specialized chips for training and inference, neural processing units, agentic accelerators: the pattern is roughly five-to-tenfold gains every three or four years (about 1.5× to 2.2× per year), a steady exponential rather than the accelerating jumps that singularity-style forecasts assume. Compute has also become a domain of geopolitical competition. The U.S. export controls on advanced chips to China are the single largest non-market intervention in the AI industry, and the multi-year lead they create is, on current evidence, the central determinant of where frontier models will be trained over the next few years.
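Converting the hardware range to a yearly rate makes it comparable with other growth figures in this module. A minimal sketch, assuming constant compounding within each period; the helper name `annualized` is hypothetical.

```python
def annualized(gain: float, years: float) -> float:
    """Constant yearly multiplier that compounds to `gain` over `years`."""
    return gain ** (1 / years)


# Range quoted in the text: 5x to 10x every three or four years.
slow = annualized(5, 4)   # smallest gain over the longest period
fast = annualized(10, 3)  # largest gain over the shortest period

print(f"hardware gains: {slow:.2f}x to {fast:.2f}x per year")
```

Set against the roughly 4× per year quoted earlier for frontier training compute, the difference has to come from larger clusters and larger budgets, not from faster chips.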

The longer-term hardware story has more speculation in it than observation. Brain-computer interfaces are an active research area but remain experimental, and Kurzweil’s prediction that the neocortex will be connected to the cloud in the 2030s should be read as a long-range hypothesis, not a forecast. Thorium reactors, fusion, and other breakthrough energy paths could, in principle, lift the energy ceiling.

The Economic Question: Productivity, Labor, Concentration

The economic story of AI in the mid-2020s does not yet fit any tidy narrative. Three numbers, each measuring something different, give a sense of the spending side: global private AI venture funding hit $225.8 billion in 2025 (sixty-one percent of all global venture capital); hyperscaler capital expenditure is projected in the hundreds of billions for 2026; and NVIDIA’s single-quarter revenue reached $57 billion in the third quarter of its fiscal 2026. Read together, they describe a single pattern: capital concentrated at the infrastructure layer at a scale that has no real precedent outside of wartime mobilization. And yet enterprise AI revenue remains a small fraction of the spending on infrastructure being built to serve it. The phrase “infrastructure-to-revenue gap” has entered the analyst vocabulary, and it implies, depending on who is using it, either an inevitable correction or a Jevons-paradox demand explosion still to come.

Self-reported AI productivity gains of 30 to 75 percent are common in surveys, and they generally fail to show up in organizational metrics. The METR controlled trial mentioned earlier — in which experienced developers using AI tools were 19% slower in familiar codebases — is a useful counterweight to the survey numbers. A March 2026 NBER survey of executives found positive expected productivity effects concentrated in high-skill services and finance, which is closer to the picture that careful workflow studies have produced. AI gains are real, conditional on workflow fit, and unevenly distributed.

The labor story is more anticipatory than demonstrated. Companies cited AI in 55,000 job cuts in 2025, a roughly twelvefold increase over two years, driven by expectation rather than measured productivity. Dario Amodei has predicted that fifty percent of entry-level white-collar jobs could be displaced within one to five years.

Two mechanisms carry most of the weight of that argument. The first is that AI capabilities are advancing faster than organizational labor markets can adapt. Hiring decisions, training programs, and licensing regimes typically operate on multi-year cycles; capability frontiers in 2024 and 2025 moved on quarterly cycles. If that mismatch persists, expectations alone — independent of measured productivity — will keep producing the kind of AI-cited job cuts now visible in the announcements data. The second is what Amodei calls climbing the cognitive ladder from the bottom: routine, structured, well-supervised tasks fall first, and these are concentrated in entry-level roles. Translation, basic copywriting, tier-one customer support, simple legal drafting, and parts of junior software engineering already show measurable substitution by AI in 2026 enterprise data. If this pattern continues upward, the result is not a redistribution of work across professions but a shrinking floor — fewer entry-level roles, harder ascent for anyone whose career path runs through them.

Comparative advantage, the Ricardian observation that even when one party is absolutely better at everything, both still gain by specializing, is the standard answer to this kind of forecast. The claim is checkable on four numbers. Suppose an AI produces 2 units of task A per hour to a human’s 1, and 10 units of task B to the human’s 1: better at both. But each unit of A costs the AI 5 units of B not produced (its hour yields 2 A or 10 B); it costs the human only 1. Counted in B forgone, the human is the cheaper producer of A, and total output rises when the human takes A and the AI stays on B. That is the guarantee: humans retain economic roles even when AI is cognitively superior across the board, provided that AI is not also cheaper per unit of output across every task simultaneously.

That proviso is doing all the work, and current evidence is mixed. On the side that supports it: many tasks remain bound to physical presence, regulated authorization, or accountability that an automated system cannot legally bear, and human wages for these are, for now, stable or rising. On the other side: in domains where the work is purely symbolic and quality can be checked cheaply (translation between major languages, basic copywriting, tier-one technical support, transcription, simple data entry), inference cost has already crossed the wage line for entry-level workers, in the sense that the dollar cost of producing an acceptable unit of output via API call is below the per-unit labor cost. The frontier of that crossover is moving outward; whether it stops at any natural boundary is the open question, and the one on which the comparative-advantage argument either holds or breaks.
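The four-number example can be checked directly. The sketch below uses the production rates given above; the one-hour budget per producer and the requirement of exactly one unit of A are assumptions added to make the comparison concrete, and the function names are hypothetical.

```python
# Output per hour for each producer, using the four numbers in the text.
RATES = {"ai": {"A": 2, "B": 10}, "human": {"A": 1, "B": 1}}


def opportunity_cost_of_a(producer: str) -> float:
    """Units of B forgone to make one unit of A."""
    return RATES[producer]["B"] / RATES[producer]["A"]


def total_output(hours_on_a: dict[str, float]) -> dict[str, float]:
    """Combined output when each producer splits one hour between A and B."""
    totals = {"A": 0.0, "B": 0.0}
    for producer, share in hours_on_a.items():
        totals["A"] += RATES[producer]["A"] * share
        totals["B"] += RATES[producer]["B"] * (1 - share)
    return totals


# Assumption for illustration: exactly one unit of A is needed in total.
ai_makes_a = total_output({"ai": 0.5, "human": 0.0})     # AI spends half its hour on A
human_makes_a = total_output({"ai": 0.0, "human": 1.0})  # human specializes in A

print("cost of A in B forgone:", opportunity_cost_of_a("ai"), opportunity_cost_of_a("human"))
print("AI makes the A:   ", ai_makes_a)
print("human makes the A:", human_makes_a)
```

Both allocations deliver the same one unit of A, but specialization leaves more B, even though the AI is faster at both tasks. The sketch counts hours only; it says nothing about dollar cost per unit, which is exactly the proviso the argument depends on.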

The deeper structural question may be concentration. AI infrastructure spending already represents a substantial share of U.S. economic growth. Amodei has warned of Gilded Age levels of wealth concentration — personal fortunes in the trillions, single firms generating enormous annual revenue, and an unusually tight coupling between economic and political power. Whether this concern proves prescient or alarmist, it is the kind of second-order effect that abundance forecasts often skip over.

Discontinuities and an Adoption-Phase Lens

What about the discontinuities: a post-transformer architecture, an energy breakthrough, the recursive self-improvement loop? Assessing them invites a reasoning error that is easiest to see enacted. Tell Romans living at the time of Jesus about tanks, and they would have feared their downfall: nothing in their world could stop one. Their descendants, today’s Italians, would not be much impressed by a single tank. The error is evaluating a future capability inside the present world; by the time such a capability exists, the world that must absorb it (defenses, institutions, rival systems) will have changed along the way.

Capability is one axis; adoption is another, and the internet’s history makes a useful lens available for it. The pattern there had three stages: an infrastructure-and-platforms phase (roughly 1993–1997 for the internet), a hype-bubble phase (1998–2000), and a practical-integration phase in which the technology became invisible (post-2003). Three indicators distinguish them: infrastructure spending running ahead of revenue, public-market valuations climbing on adoption metrics rather than profit, and enterprise pilots failing more often than they succeed. On those indicators, which internet year does early 2026 most resemble? Pick one before reading on.

The reading here is roughly 1996 or 1997: the infrastructure phase mature, the hype phase visibly assembling, the integration phase still over the horizon. If that placement is even approximately right, a correction-and-consolidation phase is more likely than not before the technology settles into routine use — even on a moderately accelerating path.

What Would Change Our View

The honest closing position is that a moderate-acceleration baseline — continued improvement, no runaway self-improvement — is a working hypothesis, not a forecast. A short list of things that would warrant updating it — sharply, in either direction — is more useful than a single point estimate; the list is the superforecaster’s move from the extrapolation-trap section, practiced rather than preached.

A genuine post-transformer architectural jump, of the kind LeCun expects and Ilya Sutskever (formerly chief scientist at OpenAI, now leading Safe Superintelligence Inc.) predicts, would tighten the timeline for high-acceleration scenarios and would also weaken the most aggressive industry forecasts that assume continued dividends from current architectures. A breakthrough that decoupled compute from energy (fusion at grid scale, mature thorium fission, or something not yet on the table) would remove what is currently the binding constraint and force a re-evaluation of every scenario that assumes energy-constrained scaling. Strong evidence that Kurzweil’s programming feedback loop is compounding, meaning agentic-coding capability that improves faster than the trajectory in this module’s table, would shorten timelines materially. Credible global coordination on alignment, or its conspicuous absence after a major incident, would also move the baseline, though in opposite directions.

In the other direction, a sustained plateau in agentic reliability — the long-horizon error problem turning out to be harder than the labs currently expect — would extend timelines significantly, perhaps by years. So would a deep correction in AI infrastructure financing, of the kind the infrastructure-to-revenue gap could trigger. So would a serious misuse event that produced a regulatory step-change comparable to nuclear governance after Three Mile Island.

The right intellectual posture toward AI futures, on the available evidence, is calibration rather than conviction. The spread of credible forecasts is genuinely wide, and most of the disagreement is about definitions and constraints rather than about the direction of progress.

The question to carry forward into the rest of Part IV is not “when does AGI arrive?” It is “which trends are real, which constraints are binding, and which of the things we currently treat as background — energy, governance, deployment friction, the limits of human oversight — will turn out to be the foreground?”