Center on Long-Term Risk

Our goal is to address worst-case risks from the development and deployment of advanced AI systems. We are currently focused on conflict scenarios as well as technical and philosophical aspects of cooperation.

We do interdisciplinary research, make and recommend grants, and build a community of professionals and other researchers around our priorities.


Measurement Research Agenda

1 Motivation

The Center on Long-Term Risk aims to reduce risks of astronomical suffering (s-risk) from advanced AI systems. We’re primarily concerned with threat models involving the deliberate creation of suffering. We’ve identified two major risk factors for s-risk: (1) conflict between advanced agentic AI systems, and (2) malevolent systems that terminally value suffering. Such malevolent systems may be created deliberately by malevolent developers or users, or they may be created accidentally.

To mitigate these risks, we are interested in tracking properties of AI systems that make them more likely to be involved in catastrophic conflict or to want to create suffering. Thus, we propose the following research priorities: Identify and describe properties of AI systems that would robustly make them more likely to contribute […]


Cooperation, Conflict, and Transformative Artificial Intelligence: A Research Agenda

Note: This research agenda was published in January 2020. For an update on our work in multi-agent systems as of March 2021, see this post.

Author: Jesse Clifton

The Center on Long-Term Risk's research agenda on Cooperation, Conflict, and Transformative Artificial Intelligence outlines what we think are the most promising avenues for developing technical and governance interventions aimed at avoiding conflict between transformative AI systems. We draw on international relations, game theory, behavioral economics, machine learning, decision theory, and formal epistemology. While our research agenda captures many topics we are interested in, the focus of CLR's research is broader. We appreciate all comments and questions. We're also looking for people to work on the questions we outline. So if you're interested […]


Reducing long-term risks from malevolent actors

Summary

Dictators who exhibited highly narcissistic, psychopathic, or sadistic traits were involved in some of the greatest catastrophes in human history. Malevolent individuals in positions of power could negatively affect humanity’s long-term trajectory by, for example, exacerbating international conflict or other broad risk factors. Malevolent humans with access to advanced technology, such as whole brain emulation or other forms of transformative AI, could cause serious existential risks and suffering risks. We therefore consider interventions to reduce the expected influence of malevolent humans on the long-term future. The development of manipulation-proof measures of malevolence seems valuable, since they could be used to screen for malevolent humans in high-impact settings, such as heads of government or CEOs. We also explore possible future technologies that […]


From our blog

The optimal timing of spending on AGI safety work; why we should probably be spending more now

Tristan Cook & Guillaume Corlouer, October 24th 2022

Summary

When should funders wanting to increase the probability of AGI going well spend their money? We have created a tool to calculate the optimal spending schedule and tentatively conclude that funders collectively should be spending at least 5% of their capital each year on AI risk interventions, and in some cases up to 35%. This is likely higher than the current AI risk community spending rate, which is at most 3%. In most cases, we find that the optimal spending schedule is between 5% and 15% better than the ‘default’ strategy of just spending the interest one accrues and from 15% to 50% better than a naive projection of the community’s spending […]


When is intent alignment sufficient or necessary to reduce AGI conflict?

In this post, we look at conditions under which two claims hold for interventions on AGI systems that aim to reduce the risks of (unendorsed) conflict: Intent Alignment isn't Sufficient and Intent Alignment isn't Necessary for such interventions to be effective. We then conclude this sequence by listing what we currently think are relatively promising directions for technical research and intervention to reduce AGI conflict.

Intent alignment is not sufficient to prevent unendorsed conflict

In the previous post, we outlined possible causes of conflict and directions for intervening on those causes. Many of the causes of conflict seem like they would be addressed by successful AI alignment. For example, if AIs acquire conflict-prone preferences from their training data when we didn’t want them to, that is a clear case of misalignment. […]


When would AGIs engage in conflict?

Here we will look at two of the claims introduced in the previous post: AGIs might not avoid conflict that is costly by their lights (Capabilities aren’t Sufficient) and conflict that is costly by our lights might not be costly by the AGIs’ (Conflict isn’t Costly).

Explaining costly conflict

First we’ll focus on conflict that is costly by the AGIs’ lights. We’ll define “costly conflict” as (ex post) inefficiency: There is an outcome that all of the agents involved in the interaction prefer to the one that obtains. This raises the inefficiency puzzle of war: Why would intelligent, rational actors behave in a way that leaves them all worse off than they could be? We’ll operationalize “rational and intelligent” actors […]
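The definition of "costly conflict" as ex post inefficiency can be illustrated with a minimal Python sketch. It is not taken from the post: the function names, the payoff numbers, and the choice to represent each outcome as a tuple of per-agent payoffs are all assumptions made for illustration, and "prefer" is read here as strict preference by every agent.

```python
from typing import Sequence

# Hypothetical representation: an outcome is one payoff per agent.
Payoffs = Sequence[float]


def preferred_by_all(alternative: Payoffs, realized: Payoffs) -> bool:
    """True if every agent strictly prefers `alternative` to `realized`."""
    if len(alternative) != len(realized):
        raise ValueError("outcomes must list a payoff for the same agents")
    return all(a > r for a, r in zip(alternative, realized))


def is_costly_conflict(realized: Payoffs, feasible: Sequence[Payoffs]) -> bool:
    """True if some feasible outcome is preferred by all agents to `realized`.

    This mirrors the definition above: the realized outcome is (ex post)
    inefficient when such an alternative exists.
    """
    return any(preferred_by_all(outcome, realized) for outcome in feasible)


# Illustrative two-agent payoffs (assumed numbers, not from the source).
war = (2.0, 2.0)
settlement = (5.0, 4.0)
lopsided = (1.0, 6.0)
feasible_outcomes = [war, settlement, lopsided]

print(is_costly_conflict(war, feasible_outcomes))         # True
print(is_costly_conflict(settlement, feasible_outcomes))  # False
```

In this toy example, war counts as costly conflict because both agents would have preferred the settlement, while the settlement itself is not inefficient because no other listed outcome is better for both agents.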
