Superhuman coder emerges

Author Johannes Haus
Not Yet Testable · Coding Automation · 60% confidence
Predicted: March 2027 · Adjusted: Late 2027–Mid 2028 · Updated: 2026-06-15 · Source: ai-2027.com, Appendix G (page 50): SC definition
A superhuman coder (SC): an AI system that can do any coding tasks that the best AGI company engineer does.

At a glance

  • Assessment: Not Yet Testable
  • Confidence: 60%
  • Predicted timing: March 2027
  • Primary source: ai-2027.com, Appendix G (page 50): SC definition

What AI 2027 Predicted

The scenario predicts the emergence of a “superhuman coder”: an AI system that surpasses the best human programmers on essentially any coding task, in both quality and speed. This is a key milestone on the path toward broader superintelligence.

How We Track This

We monitor:

  • SWE-bench Verified and SWE-bench Pro scores
  • Terminal-Bench results
  • Real-world coding competitions (Codeforces, etc.) — AI vs human rankings
  • Enterprise reports on code quality from AI vs human developers
  • Novel system-level projects completed entirely by AI

Current Evidence

Coding AI is advancing rapidly, but “superhuman” remains far off. On SWE-bench Verified, Claude Opus 4.6 (Thinking) leads at 79.2%, with GPT-5.4 at 77.2% (vals.ai). On the harder SWE-bench Pro, which targets real-world complexity, the best scores are only 23.3% (GPT-5) and 23.1% (Claude Opus 4.1), per Scale Labs. On Terminal-Bench 2.0, GPT 5.3 Codex scores 65% and Opus 4.6 scores 63%. Sixteen Claude Opus 4.6 agents wrote a C compiler from scratch, and Claude Code went from zero to the #1 coding tool in 8 months. Even so, the gap between a “very useful coding assistant” and a “superhuman coder” (any task, faster and cheaper than the best human) remains large.

Anthropic’s June 2026 Fable 5 launch is a stronger signal for long-horizon autonomous coding than prior public releases. Anthropic reports that, in early customer testing, Fable 5 completed in one day a 50-million-line Ruby migration that Stripe estimated would have taken a team over two months by hand. Anthropic also says Fable 5 led Cognition’s FrontierCode evaluation. This supports the direction of travel toward more autonomous engineering, but the evidence is vendor-reported and does not establish the superhuman-coder threshold across all coding tasks.

Counterevidence & Limitations

  • SWE-bench Pro results (~23%) show that real-world coding is far harder than SWE-bench Verified scores (~79%) suggest
  • “Superhuman” is a high bar — surpassing the best humans on any task is qualitatively different from being a useful assistant
  • Current tools require significant human guidance for complex projects
  • The March 2027 predicted date may be too aggressive by 6–18 months
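
The 6–18 month slip in the last bullet can be turned into a calendar window with a few lines of arithmetic. The sketch below is illustrative only: the helper `add_months` and the choice to work at month granularity (day fixed to the 1st) are assumptions, not part of the tracker's method.

```python
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months, pinning the day to the 1st (assumed granularity)."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, 1)


predicted = date(2027, 3, 1)          # March 2027, the scenario's predicted timing
earliest = add_months(predicted, 6)   # lower end of the 6-18 month slip
latest = add_months(predicted, 18)    # upper end of the 6-18 month slip

print(f"{earliest:%Y-%m} to {latest:%Y-%m}")
```

Under these assumptions the slip spans September 2027 to September 2028, a range that contains the "Late 2027–Mid 2028" adjusted window stated at the top of the page.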

What Would Change Our Assessment

  • Upgrade to “emerging”: SWE-bench Pro scores above 50%; AI consistently winning coding competitions
  • Upgrade to “on-track”: SWE-bench Pro above 70%; credible reports of AI completing complex projects without human guidance
  • Maintain at “not-yet-testable”: Prediction date hasn’t arrived yet
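
The thresholds above can be read as a simple decision rule. The following is a minimal sketch, not the tracker's actual tooling: the function name `assess`, its parameter names, and the reading that each level needs both its score threshold and its qualitative signal are assumptions made for illustration.

```python
from enum import Enum


class Status(Enum):
    NOT_YET_TESTABLE = "not-yet-testable"
    EMERGING = "emerging"
    ON_TRACK = "on-track"


def assess(
    swe_bench_pro_pct: float,
    wins_competitions_consistently: bool = False,
    credible_unguided_projects: bool = False,
) -> Status:
    """Map evidence to a status. Requiring BOTH conditions per level is an assumption."""
    if swe_bench_pro_pct > 70 and credible_unguided_projects:
        return Status.ON_TRACK
    if swe_bench_pro_pct > 50 and wins_competitions_consistently:
        return Status.EMERGING
    # Default: no upgrade criterion is met, so the current status is kept.
    return Status.NOT_YET_TESTABLE


# The best SWE-bench Pro score cited in this page is 23.3%.
print(assess(23.3).value)
```

With the 23.3% score cited under Current Evidence, neither upgrade condition is met and the rule returns the current "not-yet-testable" status.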

Update History

Date | Update
2026-06-15 | Anthropic reported that Claude Fable 5 completed a large Ruby codebase migration in one day in early Stripe testing and led Cognition’s FrontierCode evaluation. This is stronger evidence for long-horizon coding progress, but it remains vendor-reported and does not establish superhuman coding across all tasks. Confidence adjusted 0.55 → 0.60.
2026-04-27 | Anthropic released Claude Opus 4.7, highlighting better performance on difficult and long-running software engineering tasks, including customer reports of hours-long autonomy, self-verification, and stronger async coding workflows (Anthropic). This modestly strengthens the evidence that coding models are advancing toward more autonomous engineering work, but does not establish the superhuman-coder threshold (any task, faster and cheaper than the best human).
2026-04-02 | AI Futures Project Q1 update: Kokotajlo shifted his Automated Coder (AC) median from late 2029 to mid 2028 (1.5 years sooner) and revised the AC time horizon requirement from 3 years to 1 year, citing the impressiveness of Opus 4.6. Eli Lifland’s AC median also moved forward by 1.5 years. The scenario’s own authors are converging faster toward the original timeline. Source: LessWrong
2026-03 | Prediction timeframe (March 2027) not yet reached. AI coding capabilities are advancing rapidly, with SWE-bench scores improving and autonomous coding agents shipping, but superhuman performance across all coding tasks remains distant.
2026-01 | Kokotajlo’s personal median for full coding automation: December 2030.
2025-12 | AI Futures Project places its median for “Superhuman Coder” at December 2031, versus the AI 2027 scenario’s January 2027.
2025-11 | Claude Opus 4.5 reportedly outperforms every human candidate on Anthropic’s internal engineering assessments. Gemini 3 scores 37.4% on Humanity’s Last Exam (world record). The gap is narrowing visibly, but the “superhuman coder” milestone remains contested.
2025-09 | Gemini 2.5 Deep Think achieves gold-medal performance at the 2025 ICPC World Finals (Sep 17), solving 10/12 problems, including one that no human team solved. This is the strongest “superhuman” coding signal to date, though competitive algorithmic programming differs from real-world software engineering.