All Predictions
Tracking 53 predictions from the AI 2027 scenario against reality.
The better agents are also expensive; you get what you pay for, and the best performance costs hundreds of dollars a month.
Frontier AI labs increasingly use AI systems to accelerate their own AI research and development.
AI coding tools transform software development, becoming indispensable for most professional programmers.
By this point 'finishes training' is a bit of a misnomer; models are frequently updated to newer versions trained on additional data or partially re-trained to patch some weaknesses.
Data center construction accelerates dramatically, with power grid constraints becoming a real bottleneck.
The Department of Defense quietly but significantly begins scaling up direct contracting with OpenBrain for cyber, data analysis, and R&D, but integration is slow due to bureaucracy.
US export controls constrain Chinese access to frontier AI chips, but China adapts through domestic alternatives and workarounds.
Hundreds of billions pour into AI infrastructure, with hyperscalers racing to build compute capacity at unprecedented scale.
Agent-1 is bad at even simple long-horizon tasks (page 7, Early 2026 section). Also: agents in Mid 2025 are 'impressive in theory but in practice unreliable.'
Specifically, we forecast that they score 65% on the OSWorld benchmark of basic computer tasks (compared to 38% for Operator and 70% for a typical skilled non-expert human).
Advertisements for computer-using agents emphasize the term 'personal assistant': you can prompt them with tasks like 'order me a burrito on DoorDash' or 'open my budget spreadsheet and sum this month's expenses.'
Despite rapid capability gains, mainstream skepticism about AI's transformative potential persists among academics, journalists, and policymakers.
AI agents become increasingly useful for real tasks but remain unreliable on complex, multi-step workflows.
We imagine the others to be 3-9 months behind OpenBrain (page 4, Late 2025). By Early 2026, several competing AIs match or exceed Agent-0 (page 7). The 3-9 month gap is the Late 2025 state; near-parity emerges by Early 2026.
AI-driven job displacement becomes visible enough to trigger stock market volatility and significant public backlash.
METR time horizons doubled every 7 months from 2019-2024 and every 4 months from 2024 onward (Appendix G, page 51). The acceleration from a 7-month to a 4-month doubling time is a key claim.
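As a sanity check on what these doubling times imply, here is a small projection sketch; the 60-minute starting horizon and 24-month window are illustrative assumptions, not figures from the source.

```python
def time_horizon(start_minutes, months_elapsed, doubling_months):
    """Project a METR-style task time horizon under steady exponential doubling."""
    return start_minutes * 2 ** (months_elapsed / doubling_months)

# Illustrative: a 60-minute horizon projected 24 months out.
slow = time_horizon(60, 24, 7)  # 7-month doubling: about 3.4 doublings, ~10.8x
fast = time_horizon(60, 24, 4)  # 4-month doubling: 6 doublings, 64x
print(round(slow), round(fast))
```

Under the faster trend, the same two years buy six doublings instead of roughly three and a half, which is why the change in doubling time matters so much for the scenario's timeline.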
OpenBrain blows the competition out of the water again by releasing Agent-1-mini—a model 10x cheaper than Agent-1 and more easily fine-tuned for different applications.
It could offer substantial help to terrorists designing bioweapons, thanks to its PhD-level knowledge of every field and ability to browse the web.
OpenBrain 'responsibly' elects not to release it publicly yet (page 10); very few have access to the newest capabilities (page 16).
Producing domestic chips about three years behind the U.S.-Taiwanese frontier.
Global AI capex: $1T (KEY METRICS 2026 sidebar). Note: the Compute Forecast supplement shows annual spending of $270B (2024), $400B (2025), $600B (2026), and $1T (2027), suggesting $1T may be the 2027 annual figure, not cumulative by 2026. The KEY METRICS placement is ambiguous; the tracker operationalizes it as cumulative.
85% on Cybench, matching a top professional human team on hacking tasks that take those teams 4 hours
Agent-2 is 'only' a little worse than the best human hackers, but thousands of copies can be run in parallel, searching for and exploiting weaknesses faster than defenders can respond. (page 10)
OpenBrain's compute costs: $40B (2026 annual) · OpenBrain power requirement: 6 GW peak power · Capital expenditure: $200B (cost of ownership of OpenBrain's active compute)
AI systems contribute enough to AI research that the effective R&D multiplier reaches 1.5× — AI makes AI research 50% faster.
1.3 on RE-Bench, matching top expert humans given 8 hours on well-defined AI research engineering tasks.
They are about six months behind the best OpenBrain models.
Sidebar graphics show OpenBrain valuation trajectory: $1T (Apr 2026), $2T (Aug 2026), $3T (Dec 2026)
The stock market has gone up 30% in 2026, led by OpenBrain, Nvidia, and whichever companies have most successfully integrated AI assistants.
We forecast that mid-2025 agents will score 85% on SWEBench-Verified. (page 3, footnote 3. Note: the 72% starting point is tracker-added context, not from the source.)
OpenBrain's latest public model—Agent-0—was trained with 10²⁷ FLOP.
Global AI power: 38 GW peak power
Share of US power on AI: 2.5% (33 GW of 1.34 TW capacity)
The safety team finds that if Agent-2 somehow escaped and wanted to 'survive' and 'replicate' autonomously, it might be able to do so.
There is a 10,000-person anti-AI protest in DC.
He sets in motion the nationalization of Chinese AI research... all the best researchers merge into a DeepCent-led collective... A Centralized Development Zone (CDZ) is created at the Tianwan Power Plant.
China has managed to maintain about 12% of the world's AI-relevant compute.
OpenBrain revenue: $45B (2026 annual)
Now, the models have become sufficiently good at verifying more subjective things, allowing the use of iterated distillation and amplification (IDA) to improve the model at many tasks.
Someone mentions the possibility of nationalizing OpenBrain, but other cabinet officials think that's premature... He elects to hold off on major action.
One such breakthrough is augmenting the AI's text-based scratchpad (chain of thought) with a higher-bandwidth thought process (neuralese recurrence and memory).
While the latest Agent-1 could double the pace of OpenBrain's algorithmic progress, Agent-2 can now triple it, and will improve further with time.
OpenBrain's security level is typical of a fast-growing ~3,000 person tech company (RAND's SL2). They are working hard to protect their weights and secrets from insider threats and top cybercrime syndicates (SL3).
Other Party members discuss extreme measures to neutralize the West's chip advantage. A blockade of Taiwan? A full invasion? Both sides signal seriousness by repositioning military assets around Taiwan.
Agent-2 is effectively doing 'online learning,' in that it's built to never really finish training. Every day, the weights get updated to the latest version, trained on more data generated by the previous version the previous day.
China has aggressively hardened security by airgapping (closing external connections) and siloing internally.
Now that coding has been fully automated... research taste has proven difficult to train due to longer feedback loops and less data availability.
CCP leadership recognizes the importance of Agent-2 and tells their spies and cyberforce to steal the weights. (page 11; Appendix D provides detailed theft mechanics.)
OpenBrain runs 200,000 Agent-3 copies in parallel. (page 13. Note: Appendix I page 54 states 250,000 copies — an internal source inconsistency.)
While the latest Agent-1 could double the pace of OpenBrain's algorithmic progress, Agent-2 can now triple it.
This massive superhuman labor force speeds up OpenBrain's overall rate of algorithmic progress by 'only' 4x due to bottlenecks and diminishing returns to coding labor. (Note: Footnote 31 clarifies 4x algorithmic progress corresponds to roughly 2x overall progress rate.)
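The footnote's 4x-to-2x gap can be illustrated with an Amdahl's-law-style calculation; the two-thirds fraction below is an assumed illustrative split, not a number from the source.

```python
def overall_speedup(accelerated_fraction, component_speedup):
    """Amdahl's-law-style combination: only part of total progress accelerates."""
    return 1.0 / ((1.0 - accelerated_fraction)
                  + accelerated_fraction / component_speedup)

# Illustrative: if 2/3 of overall progress flows through the accelerated
# (algorithmic) channel and that channel runs 4x faster, the overall rate
# roughly doubles; the remainder stays bottlenecked on compute and experiments.
print(overall_speedup(2 / 3, 4))  # -> 2.0
```

The general lesson of this kind of model is that large speedups on one component deliver sharply diminishing overall gains as the unaccelerated remainder dominates.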
A superhuman coder (SC): an AI system that can do any coding task that the best AGI company engineer does.
Agent-2 trained at 2×10²⁸ FLOP (Apr 2026 – Mar 2027), representing a 1000× increase over GPT-4 scale.
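The 1000x figure is consistent with the commonly cited outside estimate of roughly 2×10²⁵ FLOP for GPT-4's training run; that estimate is an assumption of this sketch, not a number stated in the tracker.

```python
# Quick scale check. The GPT-4 figure is an outside estimate (~2e25 FLOP),
# assumed here rather than taken from the tracker itself.
agent2_flop = 2e28
gpt4_flop_estimate = 2e25
scale_factor = agent2_flop / gpt4_flop_estimate
print(round(scale_factor))  # -> 1000
```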