S-risks: Why they are the worst existential risks, and how to prevent them (EAG Boston 2017)
S-risks are risks of events that bring about suffering in cosmically significant amounts. By “significant”, we mean significant relative to expected future suffering. The talk was aimed at an audience that is new to this concept. For a more in-depth discussion, see our article Reducing Risks of Astronomical Suffering: A Neglected Priority.
I’ll talk about risks of severe suffering in the far future, or s-risks. Reducing these risks is the main focus of the Foundational Research Institute, the EA research group that I represent.
To illustrate what s-risks are about, I’ll use a fictional story.
Imagine that some day it will be possible to upload human minds into virtual environments. This way, sentient beings can be stored and run on very small computing devices, such as the white egg-shaped gadget depicted here.
Behind the computing device you can see Matt. Matt’s job is to convince human uploads to serve as virtual butlers, controlling the smart homes of their owners. In this instance, human upload Greta is unwilling to comply.
To break her will, Matt increases the rate at which time passes for Greta. While Matt waits for just a few seconds, Greta effectively endures many months of solitary confinement.
Fortunately, this did not really happen. In fact, I took this story and screenshots from an episode of the British TV series Black Mirror.
Not only did it not happen, it’s also virtually certain it won’t happen. No future scenario we’re imagining now is likely to happen in precisely this form.
But, I will argue, there are many plausible scenarios which are in relevant ways like, or even worse than, that one. I’ll call these s-risks, where “s” stands for “suffering”.
I’ll first explain what s-risks are and how they may be realized. Next, I’ll talk about why effective altruists may want to focus on preventing s-risk, and through what kinds of work this can be achieved.
The way I’d like to introduce them, s-risks are a subclass of existential risk, often called x-risk. It’ll therefore be useful to recall the concept of x-risk. Nick Bostrom has defined x-risk as follows.
“Existential risk – One where an adverse outcome would either annihilate Earth-originating intelligent life or permanently and drastically curtail its potential.”
Bostrom also suggested that one way to understand how x-risk differs from other risks is to look at two dimensions of risk: scope and severity.
We can use these to map different types of risks in a two-dimensional figure. Along the vertical axis, risks are ordered according to their scope. That is, we ask: how many individuals would be affected? Would it only be one person, everyone in some region, everyone alive on Earth at one point, or even everyone alive plus future generations? Along the horizontal axis, we map risks according to their severity. That is, we ask: how bad would an adverse outcome be for one affected individual?
For example, a single fatal car crash would have terminal severity. In that sense, it’s pretty bad. However, in another sense, it could be worse because it affects only a small number of people – it has personal rather than global or even regional scope. But, there also are risks with a greater severity; for example, being tortured for the rest of your life, with no chance of escape, arguably is worse than a fatal car crash. Or, to give a real life example, consider factory farming. We commonly think that, say, the life of chickens in battery cages is so bad that it’s better not to bring them into existence in the first place. That’s the reason why we think it’s good that the food at this conference is largely vegan.
To come back to the title of my talk, I can now state why s-risks are the worst existential risks. S-risks are the worst existential risks because I’ll define them to have the largest possible scope and the largest possible severity. (I will qualify the claim that s-risks are the worst x-risks later.) That is, I’d like to suggest the following definition.
“S-risk – One where an adverse outcome would bring about severe suffering on a cosmic scale, vastly exceeding all suffering that has existed on Earth so far.”
So, s-risks are roughly as severe as factory farming, but with an even larger scope.
To better understand this definition, let’s zoom in on the part of the map that shows existential risk.
One subclass of risks are those that, with respect to their scope, would affect all future human generations, and, with respect to their severity, would remove everything valuable. One central example of such pan-generational, crushing risks are risks of human extinction.
Risks of extinction have received the most attention so far. But, conceptually, x-risks contain another class of risks. These are risks of outcomes even worse than extinction in two respects. First, with respect to their scope, they not only threaten the future generations of humans or our successors, but all sentient life in the whole universe. Second, with respect to their severity, they not only remove everything that would be valuable but also come with a lot of disvalue – that is, features we’d like to avoid no matter what. Recall the story I told in the beginning, but think of Greta’s solitary confinement being multiplied by many orders of magnitude – for instance, because it affects a very large population of sentient uploads.
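The two-dimensional map described above can be sketched as a small data structure. This is only an illustration in Python: the class names, the numeric ordering, and the severity labels `ENDURABLE` and `HELLISH` are assumptions made for the sketch, not terms defined in the talk, and the sketch follows the framing in which s-risks are a subclass of x-risk.

```python
from enum import IntEnum
from typing import NamedTuple


class Scope(IntEnum):
    """How many individuals are affected (ordering is illustrative)."""
    PERSONAL = 1
    REGIONAL = 2
    GLOBAL = 3
    PAN_GENERATIONAL = 4
    COSMIC = 5


class Severity(IntEnum):
    """How bad the outcome is for one affected individual."""
    ENDURABLE = 1  # assumed label for less-than-terminal outcomes
    TERMINAL = 2   # the talk also calls this level "crushing"
    HELLISH = 3    # assumed label for outcomes worse than terminal


class MappedRisk(NamedTuple):
    name: str
    scope: Scope
    severity: Severity


def is_existential(risk: MappedRisk) -> bool:
    """X-risk region of the map: at least pan-generational and terminal."""
    return (risk.scope >= Scope.PAN_GENERATIONAL
            and risk.severity >= Severity.TERMINAL)


def is_s_risk(risk: MappedRisk) -> bool:
    """S-risk corner of the map: largest scope and largest severity."""
    return risk.scope == Scope.COSMIC and risk.severity == Severity.HELLISH


car_crash = MappedRisk("fatal car crash", Scope.PERSONAL, Severity.TERMINAL)
lifelong_torture = MappedRisk("lifelong torture", Scope.PERSONAL, Severity.HELLISH)
extinction = MappedRisk("human extinction", Scope.PAN_GENERATIONAL, Severity.TERMINAL)
cosmic_suffering = MappedRisk("cosmic-scale suffering", Scope.COSMIC, Severity.HELLISH)
```

In this sketch, extinction counts as existential without being an s-risk, while the cosmic-scale case falls in both regions. Lifelong torture is more severe than the car crash but stays outside both regions because of its personal scope.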
Let’s pause for a moment. So far, I’ve introduced the concept of s-risk. To recap, they are risks of severe suffering on a cosmic scale, which makes them a subclass of existential risk.
(Depending on how you understand the “curtail its potential” case in the definition of x-risks, there actually may be s-risks which aren’t x-risks. This would be true if you think that reaching the full potential of Earth-originating intelligent life could involve suffering on an astronomical scale, i.e., the realisation of an s-risk. Think of a quarter of the universe filled with suffering, and three quarters filled with happiness. Considering such an outcome to be the full potential of humanity seems to require the view that the suffering involved would be outweighed by other, desirable features of reaching this full potential, such as vast amounts of happiness. While all plausible moral views seem to agree that preventing the suffering in this scenario would be valuable, they might disagree on how important it is to do so. While many people find it plausible that ensuring a flourishing future is more important, FRI is committed to a family of different views, which we call suffering-focused ethics. (Note: We updated this section in June 2019.))
Next, I’d like to talk about why and how to prevent s-risks.
All plausible value systems agree that suffering, all else being equal, is undesirable. That is, everyone agrees that we have reasons to avoid suffering. S-risks are risks of massive suffering, so I hope you agree that it’s good to prevent s-risks.
However, you’re probably here because you’re interested in effective altruism. You don’t just want to know whether preventing s-risks is a good thing, because there are a lot of good things you could do. You acknowledge that doing good has opportunity cost, so you’re after the most good you can do. Can preventing s-risks plausibly meet this higher bar?
This is a very complex question. To understand just how complex it is, I first want to introduce a flawed argument for focusing on reducing s-risk. (I’m not claiming that anyone has advanced such an argument about either s-risks or x-risks.) This flawed argument goes as follows.
Premise 1: The best thing to do is to prevent the worst risks
Premise 2: S-risks are the worst risks
Conclusion: The best thing to do is to prevent s-risk
I said that this argument isn’t sound. Why is that?
Before delving into this, let’s get one potential source of ambiguity out of the way. On one reading, premise 1 could be a value judgment. In this sense, it could mean that, whatever you expect to happen in the future, you think there is a specific reason to prioritize averting the worst possible outcomes. There is a lot one could say about the pros and cons as well as about the implications of such views, but this is not the sense of premise 1 I’m going to talk about. In any case, I don’t think any purely value-based reading of premise 1 suffices to get this argument off the ground. More generally, I believe that your values can give you substantial or even decisive reasons to focus on s-risk, but I’ll leave it at that.
What I want to focus on instead is that, (nearly) no matter your values, premise 1 is false. Or at least it’s false if, by “the worst risks”, we understand what we’ve talked about so far, that is, badness along the dimensions of scope and severity.
When trying to find the action with the highest ethical impact there are, of course, more relevant criteria than scope and severity of a risk. What’s missing are a risk’s probability; the tractability of preventing it; and its neglectedness. S-risks are by definition the worst risks in terms of scope and severity, but not necessarily in terms of probability, tractability, and neglectedness.
These additional criteria are clearly relevant. For example, if s-risks turned out to have probability zero, or if reducing them were completely intractable, it wouldn’t make any sense to try to reduce them.
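To see why scope and severity alone cannot settle the question, consider a minimal sketch in Python. It assumes a simple multiplicative scoring model, which is an illustrative assumption rather than a formula the talk commits to. The `Risk` class, the `priority` function, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    """A hypothetical risk profile; every field is an illustrative score."""
    name: str
    badness: float        # combined scope and severity, arbitrary units
    probability: float    # chance the risk is realised, in [0, 1]
    tractability: float   # share of the risk that extra work could remove, in [0, 1]
    neglectedness: float  # multiplier: higher means fewer people already work on it


def priority(risk: Risk) -> float:
    """Toy multiplicative score (an assumption, not an established formula)."""
    return (risk.badness * risk.probability
            * risk.tractability * risk.neglectedness)


worst_but_impossible = Risk("worst case, probability zero", 1000.0, 0.0, 0.5, 2.0)
worst_but_intractable = Risk("worst case, intractable", 1000.0, 0.1, 0.0, 2.0)
milder_but_workable = Risk("milder, workable", 10.0, 0.1, 0.5, 2.0)

# Rank the three hypothetical risks from highest to lowest toy priority.
ranked = sorted(
    [worst_but_impossible, worst_but_intractable, milder_but_workable],
    key=priority,
    reverse=True,
)
```

Under these made-up numbers, the risk with far lower badness ranks first, because a probability or tractability of zero wipes out any amount of scope and severity. That is the sense in which premise 1 fails.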
We must therefore discard the flawed argument. I won’t be able to definitively answer the question under what circumstances we should focus on s-risk, but I’ll offer some initial thoughts on the probability, tractability, and neglectedness of s-risks.
I’ll argue that s-risks are not much more unlikely than AI-related extinction risk. I’ll explain why I think this is true and will address two objections along the way.
You may think: “This is absurd. We can’t even send humans to Mars, so why worry about suffering on cosmic scales?” This was certainly my immediate intuitive reaction when I first encountered related concepts. But as EAs, we should be cautious about taking such intuitive, ‘system 1’ reactions at face value. We are aware that a large body of psychological research in the “heuristics and biases” approach suggests that our intuitive probability estimates are often driven by how easily we can recall a prototypical example of the event we’re considering. For types of events that have no precedent in history, we can’t recall any prototypical example, and so we systematically underestimate the probability of such events if we aren’t careful.
So we should critically examine this intuitive reaction of s-risks being unlikely. If we do this, we should pay attention to two technological developments, which are at least plausible and which we have reason to expect for unrelated reasons. These are artificial sentience and superintelligent AI, the latter unlocking many more technological capabilities such as space colonization.
Artificial sentience refers to the idea that the capacity to have subjective experience – and in particular, the capacity to suffer – is not limited to biological animals. While there is no universal agreement on this, in fact most contemporary views in the philosophy of mind imply that artificial sentience is possible in principle. And for the particular case of brain emulations, researchers have outlined a concrete roadmap, identifying concrete milestones and remaining uncertainties.
As for superintelligent AI, I won’t say more about this because this is a technology that has received a lot of attention from the EA community. I’ll just refer you to Nick Bostrom’s excellent book on the topic, called Superintelligence, and add that s-risks involving artificial sentience and “AI gone wrong” have been discussed by Bostrom under the term mindcrime.
But if you only remember one thing about the probability of s-risk, let it be this: This is not Pascal’s wager! In brief, as you may recall, Pascal lived in the 17th century and asked whether we should observe religious commands. One of the arguments he considered was that, no matter how unlikely we think it is that God exists, it’s not worth risking ending up in hell. In other words, hell is so bad that you should prioritize avoiding it, even if you thought hell was very unlikely.
But that’s not the argument we’re making with respect to s-risk. Pascal’s wager invokes a speculation based on one arbitrarily selected ancient collection of books. Based on this, one cannot defensibly claim that the probability of one type of hell is greater than the probability of competing hypotheses.
By contrast, worries about s-risk are based on our best scientific theories and a lot of implicit empirical knowledge about the world. We consider all the evidence we have, and then articulate a probability distribution over how the future may unfold. Since predicting the future is so hard, the remaining uncertainty will be quite high. But this kind of reasoning could in principle justify concluding that the probability of s-risk is not negligibly small.
OK, but maybe an additional argument comes to your mind: since a universe filled with a lot of suffering is a relatively specific outcome, you may think that it’s extremely unlikely that something like this will happen unless someone or something intentionally optimised for such an outcome. In other words, you may think that s-risks require evil intent, and that such evil intent is very unlikely.
I think one part of this argument is correct: I agree that it’s very unlikely that we’ll create an AI with the terminal goal of creating suffering, or that humans will intentionally create large numbers of suffering AIs. But I think that evil intent accounts only for a tiny part of what we should be worried about, because there are these two other, more plausible routes.
For example, consider the possibility that the first artificially sentient beings we create, potentially in very large numbers, may be “voiceless” — unable to communicate in written language. If we aren’t very careful, we might cause them to suffer without even noticing.
Next, consider the archetypical AI risk story: a superintelligent paperclip maximizer. Again, the point is not that anyone thinks that this particular scenario is very likely. It’s just one example of a broad class of scenarios where we inadvertently create a powerful agent-like system that pursues some goal that’s neither aligned with our values nor actively evil. The point is that this paperclip maximiser may still cause suffering for instrumental reasons. For instance, it may run sentient simulations to find out more about the science of paperclip production, or to assess how likely it is that it will encounter aliens (who may disrupt paperclip production); alternatively, it may spawn sentient “worker” subprograms for which suffering plays a role in guiding action similar to the way humans learn not to touch a hot plate.
A “voiceless first generation” of sentient AIs and a paperclip maximizer that creates suffering for instrumental reasons are two examples of how an s-risk may be realised, not by evil intent, but by accident.
Third, s-risks may arise as part of a conflict.
To understand the significance of the third point, remember the story from the beginning of this talk. The human operator Matt wasn’t evil in the sense that he intrinsically valued Greta’s suffering. He just wanted to make sure that Greta complies with the commands of her owner, whatever these are. More generally, if agents compete for a shared pie of resources, there is a risk that they’ll engage in negative-sum strategic behavior that causes suffering even if everyone disvalues suffering.
The upshot is that risks of severe suffering don’t require rare motivations such as sadism or hatred. There is also plenty of actual evidence for this worrying principle in history; for instance, look at most wars or factory farming. Neither is caused by evil intent.
By the way, in case you’ve wondered, the Black Mirror story wasn’t an s-risk, but we can now see that it illustrates two major points: first, the importance of artificial sentience and, second, severe suffering caused by an agent without evil intent.
To conclude: to be worried about s-risk, we don’t need to posit any new technology or any qualitatively new feature above what is already being considered by the AI risk community. So I’d argue that s-risks are not much more unlikely than AI-related x-risks. Or at the very least, if someone is worried about AI-related x-risk but not s-risk, the burden of proof is on them.
Next, how tractable is reducing s-risk? I acknowledge that this is a challenging question. However, I do think there are some things we can do today to reduce s-risk.
First, there is some overlap with familiar work in the x-risk area. More specifically, some work in technical AI safety and AI policy is effectively addressing both risks of extinction and s-risks. That being said, any specific piece of work in AI safety may be much more relevant for one type of risk than the other. To give you a toy example, if we could make sure that a superintelligent AI by default shuts down in 1000 years, this wouldn’t help much to reduce extinction risk but would prevent long-lasting s-risk. For some more serious thoughts on differential progress within AI safety, I refer you to the Foundational Research Institute’s technical report on Suffering-focused AI Safety.
So the good news is, we’re already doing work that reduces s-risk. But note that this isn’t true for all work on existential risk; for example, building disaster shelters, or making it less likely that we’re all wiped out by a deadly pandemic, may reduce the probability of extinction — but, to a first approximation, it doesn’t change the trajectory of the future conditional on humans surviving.
Next to more targeted work, there are also broad interventions which are plausibly preventing s-risks more indirectly. For instance, strengthening international cooperation could decrease the likelihood of conflicts, and we’ve seen that negative-sum behaviour in conflicts is one potential source of s-risks. Or, going meta, we can do research aimed at identifying just which types of broad intervention are effective at reducing s-risks. The latter is part of what we’re doing at the Foundational Research Institute.
I’ve talked about whether there are interventions that reduce s-risks. There is another aspect of tractability: will there be sufficient support to carry out such interventions? For instance, will there be enough funding? We may worry that talk of cosmic suffering and artificial sentience is well beyond the window of acceptable discourse – or, in other words, that s-risks are just too weird.
I think this is a legitimate worry, but I don’t think we should conclude that preventing s-risk is futile. For remember that 10 years ago, worries about risks from superintelligent AI were nearly universally ridiculed, dismissed or misrepresented as being about The Terminator.
By contrast, today we have Bill Gates blurbing a book that talks about whole brain emulations, paperclip maximizers, and mindcrime. That is, the history of the AI safety field demonstrates that we can raise significant support for seemingly weird cause areas if our efforts are backed up by solid arguments.
Last but not least, how neglected is work on s-risk?
It’s clearly not totally neglected. I said before that AI safety and AI policy can reduce s-risk, so arguably some of the work of, say, the Machine Intelligence Research Institute or the Future of Humanity Institute is effectively addressing s-risk.
However, it seems to me that s-risk gets much less attention than extinction risk. In fact, I’ve seen existential risk being equated with extinction risk.
In any case, I think that s-risk gets less attention than is warranted. This is especially true for interventions specifically targeted at reducing s-risk, that is, interventions that wouldn’t also reduce other classes of existential risk. There may well be low-hanging fruit here, since existing x-risk work is not optimized for reducing s-risk.
As far as I’m aware, the Foundational Research Institute is the only EA organization that explicitly focuses on reducing s-risk.
To summarize: both empirical and value judgments are relevant to answering the question of whether to focus on reducing s-risk. Empirically, the most important questions are: how likely are s-risks? How easy is it to prevent them? Who else is working on this?
Regarding their probability, s-risks may be unlikely, but they are far more than a mere conceptual possibility. We can see just which technologies could plausibly cause severe suffering on a cosmic scale, and overall s-risks don’t seem much more unlikely than AI-related extinction risks.
The most plausible of these s-risk-enabling technologies are artificial sentience and superintelligent AI. Thus, to a first approximation, these cause areas are much more relevant to reducing s-risk than other x-risk cause areas such as biosecurity or asteroid deflection.
Second, reducing s-risk is at least minimally tractable. We probably haven’t yet found the most effective interventions in this space. But we can point to some interventions which reduce s-risk and which people work on today — namely, some current work in AI safety and AI policy.
There are also broad interventions which may indirectly reduce s-risk, but we don’t understand the macrostrategic picture very well yet.
Lastly, s-risk seems to be more neglected than extinction risk. At the same time, reducing s-risk is hard and requires pioneering work. I’d therefore argue that FRI occupies an important niche with a lot of room for others to join.
That all being said, I don’t expect to have convinced every single one of you to focus on reducing s-risk. I think that, realistically, we’ll have a plurality of priorities in the community, with some of us focusing on reducing extinction risk, some of us focusing on reducing s-risk, and so on.
Therefore, I’d like to end this talk with a vision for the far-future shaping community.
Those of us who care about the far future face a long journey. But it’s a misrepresentation to frame this as a binary choice between extinction and utopia.
But in another sense, the metaphor was apt. We do face a long journey, but it’s a journey through hard-to-traverse territory, and on the horizon there is a continuum ranging from a hellish thunderstorm to the most beautiful summer day. Interest in shaping the far future determines who’s (locked) in the vehicle, but not what more precisely to do with the steering wheel. Some of us are most worried about avoiding the thunderstorm, while others are more motivated by the existential hope of reaching the summer day. We can’t keep track of the complicated network of roads far ahead, but we have an easy time seeing who else is in the car, and we can talk to them – so maybe the most effective thing we can do now is to compare our maps of the territory, and to agree on how to handle remaining disagreements without derailing the vehicle.