Security
Biosecurity, cyber risk, containment, secrecy, and escalating safety thresholds.
This category contains 6 tracked predictions. Each page includes the original claim, current evidence, counterevidence, and what would change our assessment.
CCP leadership recognizes the importance of Agent-2 and tells their spies and cyberforce to steal the weights. (page 11; Appendix D provides detailed theft mechanics.)
It could offer substantial help to terrorists designing bioweapons, thanks to its PhD-level knowledge of every field and ability to browse the web.
Agent-2 is 'only' a little worse than the best human hackers, but thousands of copies can be run in parallel, searching for and exploiting weaknesses faster than defenders can respond. (page 10)
85% on Cybench, matching a top professional human team on hacking tasks that take those teams 4 hours.
China has aggressively hardened security by airgapping (closing external connections) and siloing internally.
OpenBrain's security level is typical of a fast-growing ~3,000 person tech company (RAND's SL2). They are working hard to protect their weights and secrets from insider threats and top cybercrime syndicates (SL3).