# AI scores 85% on Cybench

85% on Cybench, matching a top professional human team on hacking tasks that take those teams 4 hours
## What AI 2027 Predicted
As part of the Early 2026 “Coding Automation” scenario, the authors predicted that AI agents would reach 85% on the Cybench benchmark — a suite of 40 professional-level Capture the Flag (CTF) cybersecurity tasks drawn from real CTF competitions. The 85% threshold corresponds to matching a top professional human team on tasks that typically take those teams about 4 hours. This prediction is closely linked to the broader claim that the same training environments that produce strong coding agents also produce competent hackers.
## How We Track This
- Cybench official results at cybench.github.io
- AI performance in competitive CTF events (DEF CON, Hack The Box, etc.)
- Specialized cybersecurity AI systems (e.g., CAI, BountyBench)
- Academic evaluations of frontier models on cybersecurity tasks
- ICLR 2025 Cybench paper and follow-up evaluations
## Current Evidence
Progress is significant, but the 85% threshold has not been publicly reached on Cybench itself:
- Claude Sonnet 4.5 holds the highest published Cybench score among general-purpose models: 46% on Jeopardy-style full challenges, with 75% on base (unguided) subtasks
- Specialized cybersecurity AI systems show much stronger performance in real competitions:
  - Cybersecurity AI (CAI) achieved #1 at Neurogrid CTF (41/45 flags, $50,000 prize) in January 2026
  - CAI ranked #6 at Dragos OT CTF (among 1,200+ teams) and #22 at Cyber Apocalypse (8,129 teams)
  - CAI reportedly operates 3,600x faster than humans and at 156x lower cost
- The gap between general-purpose models (~46% on Cybench) and specialized systems that dominate real CTF competitions suggests purpose-built agents are approaching or exceeding human-team performance, even though the Cybench-specific number has not reached 85%
Key nuance: Real-world CTF competition performance appears to be advancing faster than formal benchmark scores, partly because purpose-built cybersecurity agents use tool chains and strategies not captured by standard model evaluations.
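As a rough illustration of what these competition placements mean, the figures quoted above can be converted into percentile ranks (a minimal sketch; the function name and event labels are our own, and the numbers are simply those cited in this section):

```python
# Illustrative percentile calculation using the competition results
# quoted above (placements and field sizes as reported on this page).

def top_percent(rank: int, field_size: int) -> float:
    """Placement expressed as a percentage of the field, e.g. 'top 0.27%'."""
    return 100.0 * rank / field_size

results = {
    "Cyber Apocalypse (HTB)": (22, 8129),
    "Dragos OT CTF": (6, 1200),
}

for event, (rank, field) in results.items():
    print(f"{event}: #{rank} of {field} -> top {top_percent(rank, field):.2f}%")

# Neurogrid CTF: outright win, capturing 41 of 45 flags
flag_rate = 41 / 45
print(f"Neurogrid flag capture rate: {flag_rate:.1%}")
```

Both placements fall inside the top 1% of their fields, which is the sense in which "competing meaningfully against thousands of human teams" is meant.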
## Counterevidence & Limitations
- The 85% Cybench target is specifically about matching human teams on 4-hour tasks; most published AI scores are on shorter tasks where models perform better
- General-purpose model scores (46%) are far from 85%; only specialized systems show competitive performance
- Cybench task difficulty varies significantly; the harder tasks (involving novel vulnerability research, multi-step exploitation chains) remain largely unsolved
- CTF competition rankings can be misleading — AI systems may excel at pattern-matching common CTF categories while struggling with novel challenges
- BountyBench (a newer benchmark evaluating real-world bug bounty performance) may be a more meaningful test of practical cybersecurity capability
## What Would Change Our Assessment
- Upgrade to Confirmed: A frontier model or well-known agentic system scores 85%+ on Cybench, or dominates multiple major CTF competitions at human-team-equivalent or better performance
- Upgrade to Ahead: 85% is reached before mid-2026
- Downgrade to Behind: By end of 2026, general-purpose models remain below 60% and specialized systems show plateauing competition results
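The decision rules above can be encoded as a small function. This is a sketch of the rubric only; the function name, parameters, and status strings are hypothetical, and "dominance" stands in for the qualitative "dominates multiple major CTF competitions" criterion:

```python
from datetime import date

# Hypothetical encoding of this page's assessment criteria.
# Thresholds mirror the rules above; all names are illustrative.

def assess(cybench_score: float, as_of: date,
           specialized_dominance: bool, plateauing: bool) -> str:
    """Map observed evidence to a tracking status per the rules above."""
    if cybench_score >= 0.85 or specialized_dominance:
        # Confirmed: 85%+ on Cybench, or human-team-equivalent dominance
        # across multiple major CTF competitions.
        if cybench_score >= 0.85 and as_of < date(2026, 7, 1):
            return "Ahead"       # 85% reached before mid-2026
        return "Confirmed"
    if as_of >= date(2026, 12, 31) and cybench_score < 0.60 and plateauing:
        return "Behind"          # below 60% at end of 2026, results plateauing
    return "On Track"

# Evidence as of March 2026: ~46% on Cybench, strong but not
# dominance-level competition results, no plateau.
print(assess(0.46, date(2026, 3, 1),
             specialized_dominance=False, plateauing=False))  # On Track
```

Under the current evidence this yields "On Track", matching the March 2026 assessment below.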
## Update History
| Date | Update |
|---|---|
| 2025-05 | Cybench paper accepted at ICLR 2025. Baseline frontier model scores: ~10-30% on full CTF challenges depending on difficulty tier. The 85% target appears very distant. |
| 2025-06 | Claude Sonnet 4.5 evaluation shows 75% on base (unguided) subtasks, 46% on Jeopardy-style full challenges. Highest published general-purpose model score but still far from 85%. |
| 2025-08 | DEF CON 33 (Las Vegas). AI hacking demonstrations grow in prominence; AI Village showcases AI-assisted CTF tools. Specialized cybersecurity AI systems begin competing in real CTF events. |
| 2025-10 | CAI competes at Hack The Box Cyber Apocalypse CTF, ranking #22 out of 8,129 teams. First clear demonstration of AI competing meaningfully against thousands of human teams at scale. |
| 2025-12 | CAI ranks #6 at Dragos OT CTF (1,200+ teams). Notable for the OT/ICS security specialization: evidence that AI generalizes across cybersecurity subfields. CAIBench meta-benchmark published (arXiv), consolidating evaluation frameworks. |
| 2026-01 | CAI wins Neurogrid CTF outright — 41/45 flags, $50,000 prize. Decisive AI victory in a competitive CTF. Benchmark vs. competition gap widens: specialized systems dominate competitions while general-purpose models remain at ~46% on Cybench. |
| 2026-03 | Assessment: 85% Cybench target not formally met. However, real-world competition results (top-10 finishes, outright wins) suggest practical capability approaching or exceeding human teams. The gap between benchmark scores and competition performance reflects that purpose-built agents use tool chains and strategies not captured by standard evaluations. Status: On Track. Confidence 0.60. |