Predictions

A registry of falsifiable claims, each with a resolution criterion a stranger could adjudicate and a resolve-by date. Claims are never deleted or renamed — only their status changes. Anchors (/predictions/#<id>) are stable and safe to link. Current entries derive from The Jagged Frontier, whose essays link back to each claim.

20 open · 0 resolved-correct · 0 resolved-incorrect · 0 revised · 0 ambiguous
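The registry rules described above (ids that are never renamed, a status that is the only mutable field, anchors derived from the id, and a per-status tally) can be sketched as a small data model. This is an illustrative sketch only: `Status`, `Claim`, `anchor`, and `tally` are hypothetical names, not the site's actual implementation.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    # The five statuses shown in the registry's counter row.
    OPEN = "open"
    RESOLVED_CORRECT = "resolved-correct"
    RESOLVED_INCORRECT = "resolved-incorrect"
    REVISED = "revised"
    AMBIGUOUS = "ambiguous"


@dataclass
class Claim:
    claim_id: str                 # by convention never renamed once published
    resolve_by: str               # ISO date, e.g. "2028-12-31"
    status: Status = Status.OPEN  # the only field expected to change

    @property
    def anchor(self) -> str:
        # Stable link target of the form /predictions/#<id>
        return f"/predictions/#{self.claim_id}"


def tally(claims: list[Claim]) -> dict[str, int]:
    """Count claims per status, including statuses with zero claims."""
    counts = Counter(c.status for c in claims)
    return {s.value: counts.get(s, 0) for s in Status}


claims = [
    Claim("jf-pretraining-plateau-02", "2028-12-31"),
    Claim("jf-faust-canon-03", "2035-12-31"),
]
print(claims[0].anchor)
print(tally(claims))
```

Because the anchor is computed from the id rather than stored, a status change cannot break an existing link in this sketch.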

open (20)

jf-pretraining-plateau-02

Through 2028, headline frontier capability gains will come predominantly from post-training RL, test-time compute, and tool use — not from pretraining-scale increases.

From The Jagged Frontier · What scaling actually bought

open conf: medium
Resolves
by 2028-12-31
Criteria
Technical reports and credible third-party analyses of major 2026–2028 frontier releases attribute the main gains to post-training or inference-time methods; no release achieves a generational jump credited chiefly to parameter scaling. Disputed attribution resolves to ambiguous.
scaling capabilities

jf-human-premium-04

By 2028, at least two of the three largest music-streaming platforms will offer explicit AI-content labeling or verified-human filtering.

From The Content Factory · What falls, and what does not

open conf: medium
Resolves
by 2028-12-31
Criteria
Feature documentation or press releases from at least two of the three largest music-streaming services by subscribers (e.g. Spotify, YouTube Music, Apple/Amazon Music) document a user-facing AI-content label or verified-human filter in production.

Revisions

  • 2026-06-14 — Supporting regulatory context noted; status unchanged. (The EU Commission's final Code of Practice on marking and labelling AI-generated content (June 10 2026), operationalizing AI Act Article 50 transparency obligations from August 2 2026, creates a regulatory tailwind toward content labelling across generative-AI providers in the EU. It does not establish the music-streaming-specific feature this claim requires (a user-facing label or verified-human filter on the three largest services), so the criterion remains unmet — but the direction of travel supports it.)
culture music

jf-genre-fiction-falls-07

By 2028, AI-generated works will constitute a majority of new releases in at least one major serialized genre-fiction market (e.g. Kindle Unlimited romance or LitRPG).

From The Content Factory · What falls, and what does not

open conf: medium
Resolves
by 2028-12-31
Criteria
Platform disclosure or credible third-party analysis (publishing-industry research) showing AI-generated titles above 50% of new releases in at least one major serialized genre-fiction category.
culture publishing

jf-productivity-paradox-12

Through 2028, large-scale software-delivery telemetry (DORA-class) will continue to show individual AI coding gains far exceeding organization-level delivery gains.

From The Specification Is the Work · The amplifier

open conf: medium
Resolves
by 2028-12-31
Criteria
DORA annual reports and comparable telemetry (Faros-class, 10k+ developers) through 2028 show individual task/PR throughput gains substantially exceeding organization-level delivery-performance gains. If the gap closes (org-level gains comparable to individual gains), resolves incorrect.

Revisions

  • 2026-06-14 — Supporting evidence noted; status unchanged. (Anthropic's "When AI builds itself" (Favaro & Clark, June 4 2026) states Claude wrote more than 80% of code merged into Anthropic's production systems, while self-reported productivity gains remain 20-40% (per the Redwood Research calibration). A high throughput/merge share alongside far smaller realized productivity gains is exactly the individual-vs-organization wedge this claim tracks — but it is one firm's internal figure, not DORA-class telemetry, so it informs rather than resolves.)
software productivity

jf-replication-flood-20

By 2028, the verifier-overload signal in scientific publishing will have grown — annual retractions (or paper-mill detections) exceed the 2024 level by at least 50%.

From The Verifier and Nature · What the survey has been mapping

open conf: medium-high
Resolves
by 2028-12-31
Criteria
Retraction Watch database or Crossref retraction counts: calendar-year 2028 total at least 1.5× the 2024 total, or documented paper-mill detections for 2028 at least 1.5× their 2024 level.
science publishing
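The threshold in the jf-replication-flood-20 criterion is a simple ratio test, sketched below. The function name and the sample counts are hypothetical and for illustration only; they are not real Retraction Watch or Crossref figures.

```python
def retraction_growth_met(count_2024: int, count_2028: int, factor: float = 1.5) -> bool:
    """Return True when the 2028 total is at least `factor` times the 2024 total."""
    if count_2024 <= 0:
        raise ValueError("the 2024 baseline must be positive")
    return count_2028 >= factor * count_2024


# Hypothetical counts, not real data.
print(retraction_growth_met(count_2024=1000, count_2028=1500))
print(retraction_growth_met(count_2024=1000, count_2028=1499))
```

The comparison is inclusive, matching the "at least 1.5×" wording: a 2028 total exactly at the threshold counts as meeting it.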

jf-jagged-persists-01

Frontier AI systems in 2030 will still show a jagged profile — superhuman on some tasks while failing tasks most adults find trivial — with no expert-consensus "human-equivalent" moment having occurred.

From The Jagged Frontier · The coastline, not the wall

open conf: high
Resolves
by 2030-12-31
Criteria
As of 2030-12-31: documented, reproducible failures of frontier models on tasks most adults find trivial still exist, AND no consensus in credible expert surveys (majority of surveyed AI researchers) that human-level generality was reached. Both conditions required for resolved-correct.

Revisions

  • 2026-06-13 — Supporting evidence noted; status unchanged. (DeepMind's From AGI to ASI (Genewein et al., 2026) independently adopts the jagged framing, stating that capability profiles of concrete systems "may well be jagged w.r.t. human-level intelligence" and that AI progress "may equally be jagged and non-uniform" (Remark III, citing Morris et al. 2026). A heavyweight establishment source endorsing the conceptual frame, though it does not settle the 2030 condition.)
agi capabilities

jf-strip-miner-05

No genuinely new mass-culture genre or form (scale-comparable to the cinematic universe, true-crime podcasts, K-pop) will be widely credited as originated by an autonomous AI engagement-optimization loop by 2030.

From The Content Factory · Three ways the loop bends

open conf: medium
Resolves
by 2030-12-31
Criteria
Absence of any case widely credited in mainstream cultural criticism where an autonomous AI optimization loop originated a new mass-culture genre or form. A disputed case resolves to ambiguous.
culture

jf-prestige-resists-06

Through 2030, no predominantly AI-generated live-action series or film will earn a major-category Emmy/Oscar nomination or a top-10 annual slot on a major streaming service.

From The Content Factory · What falls, and what does not

open conf: medium-high
Resolves
by 2030-12-31
Criteria
Award nomination records and platform annual top-10 lists through 2030. "Predominantly AI-generated" means AI generated the majority of footage and performances, per credits or credible reporting.
culture film

jf-world-model-games-09

By 2030, no commercially shipped game built on a frame-generating world model (no conventional engine or assets) will sustain persistent, rule-governed play for 10+ hours.

From The Machine That Must Behave · World models

open conf: medium-high
Resolves
by 2030-12-31
Criteria
Requires a commercial release (paid or ad-supported, not a research preview), technical reporting confirming a frame-generation architecture without a conventional engine, and documented persistent rule-governed state across sessions totalling 10+ hours.
games world-models

jf-ai-uncopyrightable-11

Through 2030, works generated by AI without human creative contribution will remain uncopyrightable in the United States.

From The Machine That Must Behave · Assets

open conf: medium-high
Resolves
by 2030-12-31
Criteria
US Copyright Office policy and controlling court decisions as of 2030-12-31 still deny copyright to works with no human creative contribution. A statutory change or controlling precedent granting such copyright resolves the claim incorrect.
law games culture

jf-avionics-verifier-14

Through 2030, no civil-aviation certification authority will accept AI-generated flight-control code without human-auditable requirement-to-test traceability; AI's certified role stays on the verification side.

From The Specification Is the Work · The gradient

open conf: high
Resolves
by 2030-12-31
Criteria
FAA/EASA certification guidance and the documented development process of any certified system through 2030. A certified flight-control system whose code was AI-generated without human-auditable traceability resolves the claim incorrect.
software safety-critical

jf-erp-unmoved-15

By 2030, large ERP implementation outcomes (overrun and failure rates) will remain within their historical bands despite AI coding tools.

From The Specification Is the Work · The gradient

open conf: medium
Resolves
by 2030-12-31
Criteria
Industry implementation surveys (Panorama ERP report or successor): reported overrun and failure rates for large ERP implementations through 2030 show no step-change improvement versus 2020–2024 baselines.
software enterprise

jf-math-acceleration-16

By 2030, AI-assisted methods will have resolved at least 50 long-open named mathematics problems (Erdős-problem class) with machine-verified proofs.

From The Verifier and Nature · Mathematics: the one perfect verifier

open conf: medium-high
Resolves
by 2030-12-31
Criteria
Count problems open for at least 10 years, resolved with substantive AI involvement, with a machine-verified (e.g. Lean) proof — per the Erdős problems database, formal-proof repositories, and credible mathematical reporting. Threshold: 50 by 2030-12-31.
science math

jf-math-direction-17

Through 2030, the selection of which research-level mathematical problems to pursue will remain human — no AI system autonomously poses and resolves a problem the field judges significant.

From The Verifier and Nature · Mathematics: the one perfect verifier

open conf: medium
Resolves
by 2030-12-31
Criteria
No case through 2030 where credible mathematicians attribute both the choice of problem and its resolution to an autonomous system. Disputed cases resolve to ambiguous.

Revisions

  • 2026-06-13 — Supporting evidence noted; status unchanged. (DeepMind's From AGI to ASI (Genewein et al., 2026) classifies AI achievements to date — Move 37, automated theorem proving, AlphaFold — as exploratory creativity within human-provided conceptual spaces (Boden levels 1-2), and frames transformative creativity and the autonomous origination of significant problems as the unmet hallmark of ASI. Consistent with this claim that problem-selection stays human through 2030; conceptual support, not resolution.)
science math

jf-dev-jobs-rotate-13

US software-developer employment in 2030 will be no more than 20% below its 2025 level — the job rotates toward specification and verification rather than disappearing.

From The Specification Is the Work · Where the factory's reach ends

open conf: medium-high
Resolves
by 2031-06-30
Criteria
BLS Occupational Employment and Wage Statistics (OEWS, formerly Occupational Employment Statistics) for the "Software Developers" category: 2030 headcount no more than 20% below the 2025 figure. Resolution mid-2031 to allow for data publication lag.
software labor
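The "no more than 20% below" test in jf-dev-jobs-rotate-13 can be checked with integer arithmetic, which avoids floating-point rounding at the boundary. A minimal sketch, assuming headcounts arrive as whole numbers; the function name and the sample figures are hypothetical, not BLS data.

```python
def employment_claim_holds(headcount_2025: int, headcount_2030: int, max_drop_pct: int = 20) -> bool:
    """Return True when 2030 headcount is no more than `max_drop_pct` percent below 2025."""
    if headcount_2025 <= 0:
        raise ValueError("the 2025 baseline must be positive")
    # Equivalent to headcount_2030 >= headcount_2025 * (1 - max_drop_pct / 100),
    # rearranged so both sides stay integers.
    return headcount_2030 * 100 >= headcount_2025 * (100 - max_drop_pct)


# Hypothetical headcounts, not real data.
print(employment_claim_holds(headcount_2025=1_000_000, headcount_2030=800_000))
print(employment_claim_holds(headcount_2025=1_000_000, headcount_2030=799_999))
```

A decline of exactly 20% still satisfies the claim under this reading; growth in headcount trivially satisfies it.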

jf-bio-calendar-19

As of 2030, clinical development will not have dramatically compressed — median time from first-in-human to approval stays above 6 years and overall clinical success rate below 25%.

From The Verifier and Nature · Biology: the worst judge of all

open conf: high
Resolves
by 2031-06-30
Criteria
Standard industry analyses (BIO/Citeline/FDA-based) of programs concluding by 2030: median first-in-human-to-approval time > 6 years AND overall likelihood of approval from Phase 1 < 25%. Resolution mid-2031 for data lag. Both conditions required for resolved-correct.
science biology medicine
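The jf-bio-calendar-19 criterion is a conjunction: both the timeline and the success-rate condition must hold for resolved-correct. A minimal sketch of that check follows; the function name and the sample inputs are hypothetical, and the criterion does not spell out how a partial result resolves, so the function only reports whether both conditions are met.

```python
def both_conditions_met(median_years_fih_to_approval: float, phase1_approval_rate: float) -> bool:
    """Return True only if median time is above 6 years AND approval likelihood is below 25%."""
    if not 0.0 <= phase1_approval_rate <= 1.0:
        raise ValueError("approval rate must be a fraction between 0 and 1")
    slow_timeline = median_years_fih_to_approval > 6.0   # strict: "> 6 years"
    low_success = phase1_approval_rate < 0.25            # strict: "< 25%"
    return slow_timeline and low_success


# Hypothetical inputs, not real industry figures.
print(both_conditions_met(8.0, 0.10))
print(both_conditions_met(5.5, 0.10))
```

Both comparisons are strict, following the ">" and "<" in the criterion, so a median of exactly 6 years or a rate of exactly 25% does not satisfy the corresponding condition.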

jf-aaa-moat-08

No predominantly AI-generated game will achieve AAA-scale success (Metacritic ≥ 85 and multi-million unit sales) by 2032.

From The Machine That Must Behave · The irreducibility moat

open conf: high
Resolves
by 2032-12-31
Criteria
"Predominantly AI-generated" means design, code, and content majority-generated autonomously, per credits or credible reporting. Checked against Metacritic score (≥ 85) and sales data (≥ 2 million units). Any single qualifying title resolves the claim incorrect.
games

jf-faust-canon-03

No disclosed AI-authored literary work will win a top-tier literary prize (Booker, Pulitzer, Nobel, or national equivalent) by 2035.

From The Jagged Frontier · Why Faust is the right test

open conf: high
Resolves
by 2035-12-31
Criteria
Prize records: no top-tier literary award given to a work publicly known at the time of the award to be predominantly AI-authored.
culture literature

jf-physics-stall-18

No AI-originated theory of fundamental physics will receive experimental confirmation by 2035.

From The Verifier and Nature · Physics: a field that splits in two

open conf: high
Resolves
by 2035-12-31
Criteria
No experimentally confirmed beyond-Standard-Model (or comparable fundamental) theory whose origination is credibly attributed to an AI system, as of 2035-12-31.

Revisions

  • 2026-06-13 — Supporting evidence noted; status unchanged. (DeepMind's From AGI to ASI (Genewein et al., 2026) gives a mechanism for this expectation: the abstraction barrier and embodied bottleneck argue models trained on human data lack a demonstrated way to discover novel conceptual primitives, and Hassabis's cited test — could an AI have originated general relativity from 1900s knowledge, "today the answer is no" — frames AI-originated fundamental physics as still out of reach. Conceptual support for the stall, not resolution.)
science physics