Center on Long-Term Risk

Our goal is to address worst-case risks from the development and deployment of advanced AI systems. We are currently focused on conflict scenarios as well as technical and philosophical aspects of cooperation.

We do interdisciplinary research, make and recommend grants, and build a community of professionals and other researchers around our priorities.

Cooperation, Conflict, and Transformative Artificial Intelligence: A Research Agenda

Note: This research agenda was published in January 2020. For an update on our work in multi-agent systems as of March 2021, see this post.

Author: Jesse Clifton

The Center on Long-Term Risk's research agenda on Cooperation, Conflict, and Transformative Artificial Intelligence outlines what we think are the most promising avenues for developing technical and governance interventions aimed at avoiding conflict between transformative AI systems. We draw on international relations, game theory, behavioral economics, machine learning, decision theory, and formal epistemology. While our research agenda captures many topics we are interested in, the focus of CLR's research is broader. We appreciate all comments and questions. We're also looking for people to work on the questions we outline. So if you're interested […]

Reducing long-term risks from malevolent actors

Summary: Dictators who exhibited highly narcissistic, psychopathic, or sadistic traits were involved in some of the greatest catastrophes in human history. Malevolent individuals in positions of power could negatively affect humanity’s long-term trajectory by, for example, exacerbating international conflict or other broad risk factors. Malevolent humans with access to advanced technology, such as whole brain emulation or other forms of transformative AI, could cause serious existential risks and suffering risks. We therefore consider interventions to reduce the expected influence of malevolent humans on the long-term future. The development of manipulation-proof measures of malevolence seems valuable, since they could be used to screen for malevolent humans in high-impact settings, such as heads of government or CEOs. We also explore possible future technologies that […]

Taboo "Outside View"

No one has ever seen an AGI takeoff, so any attempt to understand it must use these outside view considerations
-- [Redacted for privacy]

What? That’s exactly backwards. If we had lots of experience with past AGI takeoffs, using the outside view to predict the next one would be a lot more effective.
-- My reaction

Two years ago I wrote a deep-dive summary of Superforecasting and the associated scientific literature. I learned about the “Outside view” / “Inside view” distinction, and the evidence supporting it. At the time I was excited about the concept and wrote: “...I think we should do our best to imitate these best-practices, and that means using the outside view far more than we would naturally be inclined.” Now that I […]

Case studies of self-governance to reduce technology risk

Summary: Self-governance occurs when private actors coordinate to address issues that are not obviously related to profit, with minimal involvement from governments and standards bodies. Historical cases of self-governance to reduce technology risk are rare. I find 6 cases that seem somewhat similar to AI development, including the actions of Leo Szilard and other physicists in 1939 and the 1975 Asilomar conference. The following factors seem to make self-governance efforts more likely to occur: risks are salient; the government looks like it might step in if private actors do nothing; the field or industry is small; support from gatekeepers (like journals and large consumer-facing firms); and support from credentialed scientists. After the initial self-governance effort, governments usually step in to develop […]

Coordination challenges for preventing AI conflict

Summary: In this article, I will sketch arguments for the following claims: Transformative AI scenarios involving multiple systems pose a unique existential risk: catastrophic bargaining failure between multiple AI systems (or joint AI-human systems). This risk is not sufficiently addressed by successfully aligning those systems, and we cannot safely delegate its solution to the AI systems themselves. Developers are better positioned than more far-sighted successor agents to coordinate in a way that solves this problem, but a solution also does not seem guaranteed. Developers intent on solving this problem can choose between developing separate but compatible systems that do not engage in costly conflict or building a single joint system. While the second option seems preferable from an altruistic perspective, […]

