Superhuman coder emerges

Author Johannes Haus

Last updated 2026-06-15

Not Yet Testable · Coding Automation · 60% confidence

Predicted: March 2027 · Adjusted: Late 2027–Mid 2028 · Updated: 2026-06-15 · Source: ai-2027.com, Appendix G (page 50): SC definition

A superhuman coder (SC): an AI system that can do any coding task that the best AGI company engineer can do.

At a glance

Assessment: Not Yet Testable
Confidence: 60%
Predicted timing: March 2027
Primary source: ai-2027.com, Appendix G (page 50): SC definition

What AI 2027 Predicted

The scenario predicts the emergence of a “superhuman coder”: an AI system that surpasses the best human programmers on essentially any coding task, in both quality and speed. This is a key milestone on the path toward broader superintelligence.

How We Track This

We monitor:

SWE-bench Verified and SWE-bench Pro scores
Terminal-Bench results
Real-world coding competitions (Codeforces, etc.) — AI vs human rankings
Enterprise reports on code quality from AI vs human developers
Novel system-level projects completed entirely by AI

Current Evidence

Coding AI is advancing rapidly, but “superhuman” remains far off. Claude Opus 4.6 (Thinking) leads SWE-bench Verified at 79.2%, with GPT-5.4 at 77.2% (vals.ai). On the harder SWE-bench Pro, which targets real-world complexity, the best scores are only 23.3% (GPT-5) and 23.1% (Claude Opus 4.1), per Scale Labs. On Terminal-Bench 2.0, GPT-5.3 Codex scores 65% and Opus 4.6 scores 63%. Sixteen Claude Opus 4.6 agents wrote a C compiler from scratch, and Claude Code went from zero to the #1 coding tool in 8 months. Still, the gap between a “very useful coding assistant” and a “superhuman coder” (any task, faster and cheaper than the best human) remains large.

Anthropic’s June 2026 Fable 5 launch is a stronger signal for long-horizon autonomous coding than prior public releases. Anthropic reports that, in early customer testing, Fable 5 completed in one day a 50-million-line Ruby migration that Stripe estimated would have taken a team over two months by hand. Anthropic also says Fable 5 led Cognition’s FrontierCode evaluation. This supports the direction of travel toward more autonomous engineering, but it remains vendor-reported and does not establish the superhuman-coder threshold across all coding tasks.

Counterevidence & Limitations

SWE-bench Pro results (~23%) show that real-world coding is far harder than SWE-bench Verified scores suggest
“Superhuman” is a high bar — surpassing the best humans on any task is qualitatively different from being a useful assistant
Current tools require significant human guidance for complex projects
The March 2027 predicted date may be too aggressive by 6–18 months

What Would Change Our Assessment

Upgrade to “emerging”: SWE-bench Pro scores above 50%; AI consistently winning coding competitions
Upgrade to “on-track”: SWE-bench Pro above 70%; credible reports of AI completing complex projects without human guidance
Maintain at “not-yet-testable”: Prediction date hasn’t arrived yet
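
The criteria above can be sketched as a small decision rule. This is an illustrative sketch only, not the tracker's actual tooling: the `Evidence` fields and the `assess` function are hypothetical names, and it assumes that both conditions on each line must hold together (the list separates them with a semicolon and does not say whether they combine as "and" or "or").

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """Hypothetical container for the signals named in the criteria above."""
    swe_bench_pro: float        # best reported SWE-bench Pro score, in percent
    wins_competitions: bool     # AI consistently winning coding competitions
    unguided_projects: bool     # credible reports of complex projects done without human guidance


def assess(e: Evidence) -> str:
    """Map evidence to a status label.

    Assumption: both conditions on each line are required (logical AND).
    """
    if e.swe_bench_pro > 70 and e.unguided_projects:
        return "on-track"
    if e.swe_bench_pro > 50 and e.wins_competitions:
        return "emerging"
    # Default while the prediction date has not arrived; the source does not
    # say what happens to this label after that date.
    return "not-yet-testable"


# Current evidence as reported in this entry: 23.3% on SWE-bench Pro.
print(assess(Evidence(swe_bench_pro=23.3, wins_competitions=False, unguided_projects=False)))
```

Under these assumptions, the currently reported 23.3% SWE-bench Pro score keeps the assessment at "not-yet-testable" regardless of the other signals.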

Update History

Date	Update
2026-06-15	Anthropic reported that Claude Fable 5 completed a large Ruby codebase migration in one day in early Stripe testing and led Cognition’s FrontierCode evaluation. This is stronger evidence for long-horizon coding progress, but it remains vendor-reported and does not establish superhuman coding across all tasks. Confidence adjusted 0.55 → 0.60.
2026-04-27	Anthropic released Claude Opus 4.7, highlighting better performance on difficult and long-running software engineering tasks, including customer reports of hours-long autonomy, self-verification, and stronger async coding workflows (Anthropic). This modestly strengthens the evidence that coding models are advancing toward more autonomous engineering work, but does not establish the cheap superhuman-coder threshold.
2026-04-02	AI Futures Project Q1 update: Kokotajlo shifted his Automated Coder (AC) median from late 2029 to mid 2028 (1.5 years sooner) and revised the AC time-horizon requirement from 3 years to 1 year, citing how impressive Opus 4.6 has been. Eli Lifland’s AC median also moved forward by 1.5 years. The scenario’s own authors are converging faster toward the original timeline. Source: LessWrong
2026-03	Prediction timeframe not yet reached (March 2027). AI coding capabilities advancing rapidly — SWE-bench scores improving, autonomous coding agents shipping — but superhuman performance across all coding tasks remains distant.
2026-01	Kokotajlo personal median for full coding automation: December 2030.
2025-12	AI Futures Project places its median for “Superhuman Coder” at December 2031, versus the AI 2027 scenario’s March 2027.
2025-11	Claude Opus 4.5 reportedly outperforms every human candidate on Anthropic internal engineering assessments. Gemini 3 scores 37.4% on Humanity’s Last Exam (world record). Gap narrowing visibly but “superhuman coder” milestone remains contested.
2025-09	Gemini 2.5 Deep Think achieves gold-medal performance at 2025 ICPC World Finals (Sep 17), solving 10/12 problems including one no human team solved. Strongest “superhuman” coding signal to date, though competitive algorithmic programming differs from real-world software engineering.