Backup Utility Functions: A Fail-Safe AI Technique

Setting up the goal systems of advanced AIs in a way that results in benevolent behavior is expected to be difficult. We should account for the possibility that the goal systems of AIs fail to implement our values as originally intended. In this paper, we propose the idea of backup utility functions: Secondary utility functions that are used in case the primary ones “fail”.

Read more

Identifying Plausible Paths to Impact and their Strategic Implications

FRI’s research seeks to identify the best intervention(s) for suffering reducers to work on. Rather than continuing our research indefinitely, we will eventually have to focus our efforts on an intervention directly targeted at improving the world. This report outlines plausible candidates for FRI’s “path to impact” and distills some advice on how current movement building efforts can best prepare for them.

Read more