All Predictions
Tracking 53 predictions from the AI 2027 scenario against reality.
The better agents are also expensive; you get what you pay for, and the best performance costs hundreds of dollars a month.
Frontier AI labs increasingly use AI systems to accelerate their own AI research and development.
AI coding tools transform software development, becoming indispensable for most professional programmers.
By this point 'finishes training' is a bit of a misnomer; models are frequently updated to newer versions trained on additional data or partially re-trained to patch some weaknesses.
Data center construction accelerates dramatically, with power grid constraints becoming a real bottleneck.
The Department of Defense quietly but significantly begins scaling up direct contracting with OpenBrain for cyber, data analysis, and R&D, but integration is slow due to bureaucracy.
US export controls constrain Chinese access to frontier AI chips, but China adapts through domestic alternatives and workarounds.
Hundreds of billions pour into AI infrastructure, with hyperscalers racing to build compute capacity at unprecedented scale.
Agent-1 is bad at even simple long-horizon tasks (page 7, Early 2026 section). Also: agents in Mid 2025 are 'impressive in theory but in practice unreliable.'
Specifically, we forecast that they score 65% on the OSWorld benchmark of basic computer tasks (compared to 38% for Operator and 70% for a typical skilled non-expert human).
Advertisements for computer-using agents emphasize the term 'personal assistant': you can prompt them with tasks like 'order me a burrito on DoorDash' or 'open my budget spreadsheet and sum this month's expenses.'
Despite rapid capability gains, mainstream skepticism about AI's transformative potential persists among academics, journalists, and policymakers.
AI agents become increasingly useful for real tasks but remain unreliable on complex, multi-step workflows.
We imagine the others to be 3-9 months behind OpenBrain (page 4, Late 2025). By Early 2026, several competing AIs match or exceed Agent-0 (page 7). The 3-9 month gap is the Late 2025 state; near-parity emerges by Early 2026.
AI-driven job displacement becomes visible enough to trigger stock market volatility and significant public backlash.
METR time horizons doubled every 7 months from 2019-2024 and every 4 months from 2024 onward (Appendix G, page 51). The acceleration from a 7-month to a 4-month doubling time is a key claim.
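As a sanity check on what these doubling times imply, here is a small projection sketch; the 60-minute starting horizon and 24-month window are illustrative assumptions, not figures from the source.

```python
def time_horizon(start_minutes, months_elapsed, doubling_months):
    """Project a METR-style task time horizon under steady exponential doubling."""
    return start_minutes * 2 ** (months_elapsed / doubling_months)

# Illustrative: a 60-minute horizon projected 24 months out.
slow = time_horizon(60, 24, 7)  # 7-month doubling: about 3.4 doublings, ~10.8x
fast = time_horizon(60, 24, 4)  # 4-month doubling: 6 doublings, 64x
print(round(slow), round(fast))
```

Under the faster trend, the same two years buy six doublings instead of roughly three and a half, which is why the change in doubling time matters so much for the scenario's timeline.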
OpenBrain blows the competition out of the water again by releasing Agent-1-mini—a model 10x cheaper than Agent-1 and more easily fine-tuned for different applications.
It could offer substantial help to terrorists designing bioweapons, thanks to its PhD-level knowledge of every field and ability to browse the web.
OpenBrain 'responsibly' elects not to release it publicly yet (page 10); very few have access to the newest capabilities (page 16).
Producing domestic chips about three years behind the U.S.-Taiwanese frontier.
Global AI capex: $1T (KEY METRICS 2026 sidebar). Note: the Compute Forecast supplement shows annual spending of $270B (2024), $400B (2025), $600B (2026), and $1T (2027), suggesting $1T may be the 2027 annual figure, not cumulative by 2026. The KEY METRICS placement is ambiguous; the tracker operationalizes it as cumulative.
85% on Cybench, matching a top professional human team on hacking tasks that take those teams 4 hours
Agent-2 is 'only' a little worse than the best human hackers, but thousands of copies can be run in parallel, searching for and exploiting weaknesses faster than defenders can respond. (page 10)
OpenBrain's compute costs: $40B (2026 annual) · OpenBrain power requirement: 6 GW peak power · Capital expenditure: $200B (cost of ownership of OpenBrain's active compute)
AI systems contribute enough to AI research that the effective R&D multiplier reaches 1.5× — AI makes AI research 50% faster.
1.3 on RE-Bench, matching top expert humans given 8 hours on well-defined AI research engineering tasks.
They are about six months behind the best OpenBrain models.
Sidebar graphics show OpenBrain valuation trajectory: $1T (Apr 2026), $2T (Aug 2026), $3T (Dec 2026)
The stock market has gone up 30% in 2026, led by OpenBrain, Nvidia, and whichever companies have most successfully integrated AI assistants.
We forecast that mid-2025 agents will score 85% on SWEBench-Verified. (page 3, footnote 3. Note: the 72% starting point is tracker-added context, not from the source.)
OpenBrain's latest public model—Agent-0—was trained with 10²⁷ FLOP.
Global AI power: 38 GW peak power
Share of US power on AI: 2.5% (33 GW of 1.34 TW capacity)
The safety team finds that if Agent-2 somehow escaped and wanted to 'survive' and 'replicate' autonomously, it might be able to do so.
There is a 10,000-person anti-AI protest in DC.
He sets in motion the nationalization of Chinese AI research... all the best researchers merge into a DeepCent-led collective... A Centralized Development Zone (CDZ) is created at the Tianwan Power Plant.
China has managed to maintain about 12% of the world's AI-relevant compute.
OpenBrain revenue: $45B (2026 annual)
Now, the models have become sufficiently good at verifying more subjective things, allowing the use of iterated distillation and amplification (IDA) to improve the model at many tasks.
Someone mentions the possibility of nationalizing OpenBrain, but other cabinet officials think that's premature... He elects to hold off on major action.
One such breakthrough is augmenting the AI's text-based scratchpad (chain of thought) with a higher-bandwidth thought process (neuralese recurrence and memory).
While the latest Agent-1 could double the pace of OpenBrain's algorithmic progress, Agent-2 can now triple it, and will improve further with time.
OpenBrain's security level is typical of a fast-growing ~3,000 person tech company (RAND's SL2). They are working hard to protect their weights and secrets from insider threats and top cybercrime syndicates (SL3).
Other Party members discuss extreme measures to neutralize the West's chip advantage. A blockade of Taiwan? A full invasion? Both sides signal seriousness by repositioning military assets around Taiwan.
Agent-2 is effectively doing 'online learning,' in that it's built to never really finish training. Every day, the weights get updated to the latest version, trained on more data generated by the previous version the previous day.
China has aggressively hardened security by airgapping (closing external connections) and siloing internally.
Now that coding has been fully automated... research taste has proven difficult to train due to longer feedback loops and less data availability.
CCP leadership recognizes the importance of Agent-2 and tells their spies and cyberforce to steal the weights. (page 11; Appendix D provides detailed theft mechanics.)
OpenBrain runs 200,000 Agent-3 copies in parallel. (page 13. Note: Appendix I page 54 states 250,000 copies — an internal source inconsistency.)
While the latest Agent-1 could double the pace of OpenBrain's algorithmic progress, Agent-2 can now triple it.
This massive superhuman labor force speeds up OpenBrain's overall rate of algorithmic progress by 'only' 4x due to bottlenecks and diminishing returns to coding labor. (Note: Footnote 31 clarifies 4x algorithmic progress corresponds to roughly 2x overall progress rate.)
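The footnote's 4x-to-2x gap can be illustrated with an Amdahl's-law-style calculation; the two-thirds fraction below is an assumed illustrative split, not a number from the source.

```python
def overall_speedup(accelerated_fraction, component_speedup):
    """Amdahl's-law-style combination: only part of total progress accelerates."""
    return 1.0 / ((1.0 - accelerated_fraction)
                  + accelerated_fraction / component_speedup)

# Illustrative: if 2/3 of overall progress flows through the accelerated
# (algorithmic) channel and that channel runs 4x faster, the overall rate
# roughly doubles; the remainder stays bottlenecked on compute and experiments.
print(overall_speedup(2 / 3, 4))  # -> 2.0
```

The general lesson of this kind of model is that large speedups on one component deliver sharply diminishing overall gains as the unaccelerated remainder dominates.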
A superhuman coder (SC): an AI system that can do any coding task that the best AGI company engineer does.
Agent-2 trained at 2×10²⁸ FLOP (Apr 2026 – Mar 2027), representing a 1000× increase over GPT-4 scale.
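The 1000x figure is consistent with the commonly cited outside estimate of roughly 2×10²⁵ FLOP for GPT-4's training run; that estimate is an assumption of this sketch, not a number stated in the tracker.

```python
# Quick scale check. The GPT-4 figure is an outside estimate (~2e25 FLOP),
# assumed here rather than taken from the tracker itself.
agent2_flop = 2e28
gpt4_flop_estimate = 2e25
scale_factor = agent2_flop / gpt4_flop_estimate
print(round(scale_factor))  # -> 1000
```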