Coordination challenges for preventing AI conflict

Summary

In this article, I will sketch arguments for the following claims: Transformative AI scenarios involving multiple systems pose a unique existential risk: catastrophic bargaining failure between multiple AI systems (or joint AI-human systems). This risk is not sufficiently addressed by successfully aligning those systems, and we cannot safely delegate its solution to the AI systems themselves. Developers are better positioned than more far-sighted successor agents to coordinate in a way that solves this problem, but a solution also does not seem guaranteed. Developers intent on solving this problem can choose between developing separate but compatible systems that do not engage in costly conflict or building a single joint system. While the second option seems preferable from an altruistic perspective, […]
