AI 2027 vs Reality

Author Johannes Haus

The Big Picture

The AI 2027 scenario predicted a specific, aggressive path from today’s AI to superintelligence by 2027. Thirteen months after publication, where do things actually stand?

The short answer: Reality is tracking at roughly 70% of the scenario’s predicted pace. The direction is right. The speed is somewhat slower. And the implications depend on whether that gap stays constant or narrows.

The Speed Ratio

The AI Futures Project — the team behind AI 2027 — graded their own 2025 predictions in February 2026. Their assessment:

  • Aggregate pace: 58-66% of predicted speed on quantitative metrics
  • Qualitative predictions: mostly on pace
  • Individual prediction aggregate: mean 75%, median 84%

Our independent tracking of 53 predictions remains broadly consistent with a slower-than-scenario pace, though the picture is uneven across categories.

That is not a simple pass/fail result. Some concrete trends have materialized; others are late, ambiguous, or still not testable. The result is a tracker worth continuing, not a verdict that the scenario has been proven right.

What’s Tracking Ahead

A few areas are moving faster than AI 2027 predicted:

  • Agent capability improvement — METR time horizons are doubling every 3-4 months, compared to the predicted 7-month doubling. This is the single most important metric for the scenario’s core thesis (AI accelerating AI research), and it’s ahead of schedule.
  • Labor market impact — Concern about AI job displacement emerged earlier and more visibly than predicted.
  • Lab competition — The gap between top US labs (0-2 months) is even smaller than the predicted 3-9 months. The race is tighter.
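
As a rough illustration of how much the doubling-time gap matters, the sketch below compares growth over one year under each doubling time. It assumes clean exponential growth, which is a simplification; the function name `growth_factor` and the 12-month window are illustrative choices, not taken from AI 2027 or METR.

```python
def growth_factor(doubling_months: float, elapsed_months: float = 12.0) -> float:
    """Multiplicative growth after `elapsed_months`, assuming a constant doubling time."""
    if doubling_months <= 0:
        raise ValueError("doubling_months must be positive")
    return 2.0 ** (elapsed_months / doubling_months)


# Doubling times quoted in the text: 7 months predicted, 3-4 months observed.
predicted = growth_factor(7.0)
observed_slow = growth_factor(4.0)
observed_fast = growth_factor(3.0)

print(f"predicted (7-month doubling): {predicted:.1f}x per year")
print(f"observed (3-4 month doubling): {observed_slow:.1f}x to {observed_fast:.1f}x per year")
```

Under these assumptions, a 7-month doubling time compounds to about 3.3x over a year, while a 3-4 month doubling time compounds to roughly 8x to 16x.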

What’s Roughly On Track

The bulk of qualitative predictions are tracking as described:

  • Infrastructure investment scale and pace
  • Agent emergence and the “useful but unreliable” dynamic
  • Coding agent adoption and value creation
  • AI-for-AI research focus at major labs
  • Continuous training paradigm shift
  • Export control impact on Chinese AI
  • DOD engagement with AI labs
  • Public skepticism persisting despite rapid progress

What’s Behind

Several quantitative predictions are lagging:

  • Compute scaling — The largest-training-run prediction remains hard to verify publicly and is currently treated as not yet testable rather than cleanly behind.
  • Benchmark targets — SWE-bench-Verified progress was late relative to the mid-2025 target, and comparability between self-reported and independently standardized scores still matters.
  • Financial milestones — Valuations and market performance are behind the scenario’s aggressive predictions.
  • China gap — Chinese labs appear further behind than predicted, partly due to export controls being more effective than expected.

Category-by-Category Breakdown

Model Capability (7 predictions)

Mixed. Qualitative trends such as continuous training are confirmed, cheaper-model dynamics are on track, and benchmark results vary by evaluation. The scenario may have overweighted compute scaling relative to algorithmic and architectural progress.

Agent Autonomy (6 predictions)

Mostly confirmed. The predictions on agent emergence, pricing, long-horizon task struggles, and personal assistant marketing were all accurate. METR time horizons are actually ahead of prediction. This category is the strongest validation of the scenario.

Coding (6 predictions)

Strong but mixed. Coding agents are providing real value and generating meaningful revenue. SWE-bench progress was late relative to the original target, and harder benchmarks still complicate simple claims about fully automated software engineering.

Governance (5 predictions)

Confirmed. DOD contracting, academic/media skepticism, early capability secrecy trends — all on track. Later predictions about nationalization debates and anti-AI protests are still emerging.

Security (6 predictions)

Mixed. Bioweapon assistance capabilities are on track, cyber capability evidence has moved ahead of schedule, and model-theft/security-infrastructure predictions remain partly emerging or not yet fully testable.

Geopolitics (7 predictions)

Partially confirmed. Export control impact confirmed. China compute constraints confirmed. The scenario’s prediction that China would centralize into a single mega-datacenter is harder to verify but directionally plausible. Chinese labs may trail US labs by more than the scenario predicted.

Economic Impact (11 predictions)

Strong but uneven. Infrastructure investment, capex trajectory, datacenter buildouts, and labor-market concern are prominent. Financial valuations and market-performance claims remain the main areas behind.

Takeoff (5 predictions)

Not yet testable. These predict AI automating AI research, with AI R&D progress multipliers reaching 1.5× to 4×. Early signals exist (AI-assisted coding, research acceleration), but the dramatic takeoff dynamics are not predicted until late 2026 through 2027.

The Adjusted Timeline

If progress continues at 70% of the depicted rate, what does that mean for the scenario’s dramatic predictions?

The AI Futures team estimated:

  • Without additional slowdowns: Takeoff shifts from late 2027 → mid-2029
  • With compute/labor growth constraints: Takeoff shifts to mid-2028 to mid-2030
  • Lead author Daniel Kokotajlo’s updated median for full coding automation: 2029
  • Eli Lifland’s updated median for full coding automation: early 2030s

In other words: even the authors now expect the critical milestones 1-3 years later than their original scenario depicted. But they still expect them.
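
A back-of-the-envelope version of this adjustment can be sketched as follows. It assumes the slowdown applies uniformly from the scenario's publication, so the time to a milestone stretches by 1/pace. The fractional-year dates (2025.25 for publication, 2027.9 as a stand-in for "late 2027") and the function name `stretched_year` are illustrative assumptions; the AI Futures team's own estimates come from their model, not from this formula.

```python
def stretched_year(start: float, target: float, pace: float) -> float:
    """Shift a milestone date when progress runs at `pace` times the depicted rate.

    `start` and `target` are fractional years; `pace` is the speed ratio
    (0.70 means reality moves at 70% of the scenario's pace).
    """
    if pace <= 0:
        raise ValueError("pace must be positive")
    return start + (target - start) / pace


PUBLICATION = 2025.25      # assumed: scenario published around April 2025
ORIGINAL_TAKEOFF = 2027.9  # assumed stand-in for "late 2027"

# Speed ratios quoted in the text: the 58-66% range and the ~70% headline figure.
for pace in (0.58, 0.66, 0.70):
    shifted = stretched_year(PUBLICATION, ORIGINAL_TAKEOFF, pace)
    print(f"pace {pace:.0%}: takeoff around {shifted:.1f}")
```

With these inputs the shifted date lands between roughly 2029.0 and 2029.8, in the same neighborhood as the mid-2029 estimate. Applying the slowdown only from today onward, rather than from publication, would give an earlier date.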

What This Means

Three ways to read the evidence:

The “Vindication” Reading

AI 2027 identified several important dynamics early. Its qualitative picture of 2025 captured real developments in agents, coding tools, infrastructure, and institutional response, and some metrics are moving at or ahead of the expected pace. On this reading, the original scenario is directionally useful even if timing shifts.

The “Overconfident” Reading

The quantitative predictions are behind because the scenario overstated how fast compute would scale. The qualitative predictions being right is less impressive because many of those trends were already visible in early 2025. The takeoff thesis remains unproven, and 2029-2030 is far enough away that a lot could change.

The “Both True” Reading

The scenario correctly identified several dynamics and directions of AI progress. It was too aggressive on some timelines and may have been too conservative on some deployment or capability metrics. The honest assessment is: this is one of the most concrete public forecasts available, it remains useful to track, and the remaining uncertainty is genuinely large.

We think the third reading is closest to the truth. The scenario deserves to be taken seriously and tracked rigorously — which is exactly what we’re doing.
