Frontier model trained at 10²⁷ FLOP (Agent-0, completes May 2025)
OpenBrain's latest public model—Agent-0—was trained with 10²⁷ FLOP.
What AI 2027 Predicted
The scenario describes “Agent-0,” a frontier model trained with 10²⁷ FLOP (one-tenth of the headline 10²⁸ run, but still a roughly 50× jump from GPT-4’s ~2×10²⁵ FLOP). This model represents the public-facing output of the first wave of next-generation training infrastructure, released in late 2025. Agent-0 is described as impressive in capability but still limited compared to what follows.
How We Track This
We monitor:
- Epoch AI’s estimates of training compute for frontier models (GPT-5, Claude 4, Gemini Ultra 2)
- Lab disclosures about training scale and infrastructure
- Third-party analysis of cluster sizes and training durations (see the estimation sketch after this list)
- Hardware deployment timelines (Blackwell clusters, custom ASICs)
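Those third-party estimates are typically back-of-the-envelope products of chip count, per-chip throughput, utilization, and wall-clock training time. Below is a minimal sketch of that arithmetic; every number in it is an illustrative assumption, not a figure from Epoch AI or any lab.

```python
def estimate_training_flop(num_chips: int, peak_flops_per_chip: float,
                           utilization: float, training_days: float) -> float:
    """Rough training compute: chips x peak throughput x utilization x wall-clock time."""
    seconds = training_days * 24 * 3600
    return num_chips * peak_flops_per_chip * utilization * seconds

# Hypothetical cluster: 25,000 H100-class chips (~1e15 dense FLOP/s each),
# 40% utilization, 90-day run. Values chosen for illustration only.
flop = estimate_training_flop(25_000, 1e15, 0.40, 90)
print(f"{flop:.1e} FLOP")  # ~7.8e25, i.e. just under 10^26
```

Under those illustrative assumptions, a 10²⁷ run would need roughly 13× more chip-time, which is why cluster size and training duration are the observables most worth tracking.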
Current Evidence
Epoch AI estimated GPT-5 pretraining compute at approximately 3×10²⁵ FLOP, in the same order of magnitude as GPT-4 and well short of the 10²⁷ the scenario envisioned. Epoch’s analysis notes that GPT-5 actually used less training compute than GPT-4.5, with OpenAI apparently prioritizing efficiency and post-training over raw scale. Over 30 models have now been trained above the 10²⁵ FLOP threshold, suggesting this scale has become commoditized rather than frontier-pushing.
The gap between actual compute (~3×10²⁵) and predicted compute (10²⁷) is roughly 30×, which is substantial. However, there is significant uncertainty in these estimates, and post-training compute (RL, RLHF) may add meaningfully to total training FLOP in ways that aren’t well-characterized.
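Putting those same figures into linear and log terms makes the shortfall concrete (the values are the estimates cited above and the scenario’s headline number):

```python
import math

predicted = 1e27       # Agent-0 training compute in the AI 2027 scenario
gpt5_estimate = 3e25   # Epoch AI's rough GPT-5 pretraining estimate
gpt4_estimate = 2e25   # widely cited GPT-4 estimate

print(predicted / gpt5_estimate)              # ~33x short of the scenario
print(math.log10(predicted / gpt5_estimate))  # ~1.5 orders of magnitude
print(predicted / gpt4_estimate)              # the scenario implied a ~50x jump over GPT-4
```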
Sources:
- Notes on GPT-5 training compute — Epoch AI
- Why GPT-5 used less training compute than GPT-4.5 — Epoch AI
- Over 30 AI models trained at GPT-4 scale — Epoch AI
- Grading AI 2027’s 2025 Predictions — AI Futures Project
Counterevidence & Limitations
- Labs are increasingly opaque about training compute, and Epoch’s estimates carry wide uncertainty ranges
- The shift toward inference-time compute and RL post-training may mean raw pretraining FLOP is the wrong metric; total effective compute could be substantially higher
- GPT-5 achieved major capability gains despite seemingly modest compute scaling, suggesting algorithmic efficiency improvements partially substituted for raw scale
- Some labs may have completed larger training runs that haven’t been publicly characterized
What Would Change Our Assessment
- Upgrade to “on-track”: Credible evidence that a 2025 model used ≥10²⁶·⁵ total training compute (pretraining + post-training combined; see the conversion sketch after this list)
- Upgrade to “confirmed”: A confirmed training run at or near 10²⁷ FLOP
- Downgrade to “behind”: If Epoch revises estimates downward or the largest 2025 runs are confirmed well below 10²⁶·⁵
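To make the half-order-of-magnitude threshold concrete, here is a minimal conversion of 10²⁶·⁵ into linear terms, compared against Epoch AI’s current GPT-5 estimate:

```python
threshold = 10 ** 26.5   # the "on-track" threshold from the criteria above
gpt5_estimate = 3e25     # Epoch AI's rough GPT-5 pretraining estimate

print(f"{threshold:.2e} FLOP")    # ~3.16e26 FLOP
print(threshold / gpt5_estimate)  # ~10.5x above the current GPT-5 estimate
```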
Update History
| Date | Update |
|---|---|
| 2026-03 | GPT-5 estimated at ~3×10²⁵ FLOP pretraining, well below 10²⁷ target. Significant uncertainty remains around total compute including post-training and inference-time scaling. |