MIRI has been awarded its largest grant to date — $7,703,750 split over two years from Open Philanthropy, in partnership with Ben Delo, co-founder of the cryptocurrency trading platform BitMEX!
We have also been awarded generous grants by the Berkeley Existential Risk Initiative ($300,000) and the Long-Term Future Fund ($100,000). Our thanks to everybody involved!
- Buck Shlegeris of MIRI and Rohin Shah of CHAI discuss Rohin's 2018–2019 overview of technical AI alignment research on the AI Alignment Podcast.
- From MIRI's Abram Demski: “Thinking About Filtered Evidence Is (Very!) Hard” and “Bayesian Evolving-to-Extinction”. And from Evan Hubinger: “Synthesizing Amplification and Debate”.
- From OpenAI's Beth Barnes, Paul Christiano, Long Ouyang, and Geoffrey Irving: “Progress on AI Safety via Debate”.
- “Zoom In: An Introduction to Circuits”: OpenAI's Olah, Cammarata, Schubert, Goh, Petrov, and Carter argue, “Features are the fundamental unit of neural networks. They correspond to directions [in the space of neuron activations]. […] Features are connected by weights, forming circuits. […] Analogous features and circuits form across models and tasks.”
- DeepMind's Agent57 appears to meet one of the AI benchmarks in AI Impacts' 2016 survey, “outperform professional game testers on all Atari games using no game specific knowledge”, earlier than NeurIPS/ICML authors predicted.
- From DeepMind Safety Research: “Specification gaming: the flip side of AI ingenuity”.