Hedonistic vs. Preference Utilitarianism

Based on a piece from 2006; major additions: Oct. 2013; last update: 16 Apr. 2017


It's a classic debate among utilitarians: should we care about an organism's happiness and suffering (hedonic wellbeing), or should we ultimately value fulfilling what it wants, whatever that may be (preferences)? In this piece, I discuss intuitions on both sides and explore a hybrid view that gives greater weight to the hedonic subsystems of brains than to other overriding subsystems. I also discuss how seemingly infinite preferences against suffering could lead to a negative-leaning utilitarian perspective. While I have strong intuitions on both sides of the dispute, in the end I may side more with idealized-preference utilitarianism. But even if so, there remain many questions, such as: Which entities count as agents? How should we weigh them? And how do we assess the relative strengths of their preferences? In using preference utilitarianism to resolve moral disagreements, there's a tension between weighting various sides by power vs. numerosity, paralleling the efficiency vs. equity debate in economics.


Jeremy Bentham's original formulation of utilitarianism was based around happiness and suffering. Later formulations generally moved toward a focus on preference fulfilment instead. Kahneman and Sugden (2005) discuss hedonism vs. preferences from the standpoint of psychology.

Economists tend to use preferences because revealed preferences can be measured, and in general, a preference ordering seems more "rigorous" than an arbitrary cardinal numerical assignment for intensities of happiness and suffering. The von Neumann-Morgenstern utility theorem demonstrated that any preference ordering over lotteries satisfying four properties could be represented by maximizing the expected value of a utility function, unique up to a positive affine transformation. This utility function needn't represent the same thing as Bentham's original conception of happiness, although Yew-Kwang Ng argues that it does when finite sensibility is taken into account.
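To illustrate the "unique up to a positive affine transformation" clause, here is a minimal Python sketch. The outcomes, utility numbers, and lotteries are invented purely for illustration; the only point is that rescaling a utility function u into a*u + b with a > 0 leaves the ranking of lotteries by expected utility unchanged.

```python
# Hypothetical outcomes and utilities, invented for illustration only.
utility = {"apple": 1.0, "banana": 2.0, "cherry": 4.0}

# A lottery maps outcomes to probabilities that sum to 1.
lottery_a = {"apple": 0.5, "cherry": 0.5}
lottery_b = {"banana": 1.0}


def expected_utility(lottery, u):
    """Probability-weighted average utility of a lottery."""
    return sum(p * u[outcome] for outcome, p in lottery.items())


def affine(u, a, b):
    """Return the rescaled utility function a*u + b (a must be positive)."""
    if a <= 0:
        raise ValueError("a must be positive to preserve the ordering")
    return {outcome: a * value + b for outcome, value in u.items()}


def prefers_a(u):
    """True if lottery_a has strictly higher expected utility than lottery_b."""
    return expected_utility(lottery_a, u) > expected_utility(lottery_b, u)


rescaled = affine(utility, a=3.0, b=-7.0)
same_ranking = prefers_a(utility) == prefers_a(rescaled)
```

The rescaled numbers look quite different (some are even negative), yet they encode the same preferences over lotteries, which is why the absolute size of a von Neumann-Morgenstern utility carries no meaning by itself.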

Of course, having these numerical utility functions still doesn't necessarily allow for interpersonal comparisons of utilities. The best economists can do in their concepts of efficiency is talk about Pareto or potentially Pareto improvements, but these don't capture all changes that we may wish to say would improve utility. For example, suppose an unempathic billionaire walks past a homeless little girl on the streets. It would not be a Pareto or even potentially Pareto improvement for the billionaire to buy the girl a winter coat against the cold, yet most of us would like for the billionaire to do so (unless he has vastly more cost-effective projects to fund instead).
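As a rough sketch of why the coat purchase fails the strict Pareto test, consider the following Python snippet. The wellbeing levels are hypothetical, and each person's numbers are only meaningful relative to that same person's other numbers. The sketch covers only the strict Pareto criterion, not the compensation test behind "potentially Pareto" improvements.

```python
def is_pareto_improvement(before, after):
    """True if nobody is worse off and at least one person is strictly better off.

    `before` and `after` map each person to an ordinal wellbeing level.
    Only within-person comparisons are made, never comparisons across people.
    """
    nobody_worse = all(after[p] >= before[p] for p in before)
    someone_better = any(after[p] > before[p] for p in before)
    return nobody_worse and someone_better


# Hypothetical levels, invented for illustration.
before = {"billionaire": 10, "girl": 1}
after_coat = {"billionaire": 9, "girl": 5}    # he pays for the coat, she is warmer
after_found = {"billionaire": 10, "girl": 5}  # she finds a coat at no cost to him

coat_is_pareto = is_pareto_improvement(before, after_coat)    # False
found_is_pareto = is_pareto_improvement(before, after_found)  # True
```

Because the billionaire ends up slightly worse off by his own lights, the test rejects the purchase no matter how large the girl's gain is, which is exactly the limitation the example is meant to expose.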

So whether we are hedonistic or preference utilitarians, we may want to make value judgements for interpersonal comparisons that go beyond the "rigorous" preference-oriented framework of economists. What we thereby lose in objectivity, we gain in moral soundness. (Note: Probably there are many attempts to formally ground interpersonal comparisons of von Neumann-Morgenstern utility, though I'm not aware of the details.)

Cases where preferences diverge from hedonic wellbeing

Most of the time, what an organism prefers for himself is what (he thinks) will make him most happy and least in pain. For me personally, my selfish preferences align with my selfish hedonic desires maybe ~90% of the time. When this is true, the distinction between preferences and hedonic satisfaction may not be crucial, although it could affect some of our other intuitions as discussed below.

There are some cases where the two diverge, such as:

  • People preferring not to enter a blissful experience machine
  • A person preferring to maintain a cantankerous attitude rather than adopting a more cheerful one
  • A monk voluntarily fasting for two weeks in pursuit of a higher calling

Preference utilitarianism as a universal morality

In this section I suggest one intuition in favor of preference satisfaction.

Example 1. First consider a universe in which no life exists. There are no feelings, sentiments, or experiences. Only stars and desolate planets fill the void of space. It seems intuitive that nothing matters in this universe. As there are no organisms around to care about anything, ethics does not apply.

Example 2. Consider a second universe that contains exactly one organism, named Chris. In Chris's mind, the only thing that matters is carrying out his ethical obligation to build domino towers. Since this is the only ethical principle that exists, there's some quasi-universal sense in which it's ethical for Chris to stack dominoes.

Example 3. What if we now complicate the situation and consider a universe with two organisms: Chris (from before) and Dorothy? Suppose that Dorothy's only goal is to prevent the construction of domino towers. Thus, Chris can only act in a way that he considers ethical if he abridges Dorothy's ethical belief. The same is true for Dorothy with respect to Chris's ethical belief. How do we resolve the dispute?

Recall that ethics only began to apply in the universe once Chris and Dorothy existed. Suppose Dorothy holds her belief twice as strongly as Chris does. Then, in some sense, Dorothy's belief "exists" twice as much, so the quasi-universal ethical stance is to give Dorothy's belief twice as much consideration. In this particular example, it's best to prevent construction of domino towers.

If we apply the intuition from these examples to any finite number of organisms, all with finitely strong ethical beliefs, the result is preference utilitarianism.
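The aggregation in the Chris and Dorothy example can be sketched in a few lines of Python. The strength values are hypothetical (only their 2:1 ratio comes from the example), and treating strengths as directly comparable across organisms is an assumption of the sketch.

```python
def net_support(beliefs):
    """Sum signed strengths: positive favors the option, negative opposes it."""
    return sum(direction * strength for direction, strength in beliefs.values())


# (direction, strength): +1 means "build domino towers", -1 means "prevent them".
beliefs = {
    "Chris": (+1, 1.0),
    "Dorothy": (-1, 2.0),  # held twice as strongly as Chris's belief
}

net = net_support(beliefs)  # 1.0 - 2.0 = -1.0
build_towers = net > 0      # False: preventing the towers wins
```

Adding more organisms just adds more entries to the dictionary, which is the sense in which the intuition extends to any finite population with finitely strong beliefs.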

What shall we do with organisms that don't explicitly recognize what they care about? For instance, what if the universe consisted entirely of a single mouse that was in pain? We can suppose for the sake of argument that the mouse doesn't conceive of itself as an abstract organism enduring negative sensations. Presumably the mouse doesn't think, "I wish this pain would stop." But the intuition that motivates our concern for the interests of other beings rests not upon the ability of those beings to explicitly state their wishes -- rather, it comes from an empathetic recognition that those wishes exist and matter. Clearly the mouse's pain is a real event that matters to the mouse, even if the mouse can't articulate that fact. So preference utilitarianism does give consideration to implicit preferences -- whether held by human or non-human animals.

Libertarian intuitions for preference utilitarianism

Preference utilitarianism is not the same as libertarianism, because there may be cases in which a person is morally obligated to act against her wishes to better satisfy the wishes of others or potentially even her future self. That said, the preference view does a better job of capturing the sense of individual autonomy than does the happiness view. On the happiness view, one can imagine "dissident emotional primitives being dragged kicking and screaming into the pleasure chambers," but this seems less likely on the preference view.

A main reason I find the preference view plausible is that ultimately what I would want for myself is for my preferences to be satisfied, not always for me to be made happier, so extending the same to others is the nicest way to treat them. In other words, preference utilitarianism is basically the Golden Rule, which is "found in some form in almost every ethical tradition," according to Simon Blackburn's Ethics: A Very Short Introduction (p. 101).

Criticisms of the preference view

Consider some objections:

  1. Irrational decisions: What are we to make of someone who voluntarily has unprotected sex with a stranger and incurs high risk of contracting HIV? Or of a prisoner who shares needles with fellow inmates?
  2. Liking vs. wanting: It may be that wireheads want brain stimulation without enjoying it. More generally, liking and wanting appear to be different brain systems. Isn't it absurd to give people something they want if they don't like it?
  3. Misinformed preferences: What about someone who believes he wants lower taxes but actually would be better off with a society that has higher taxes?
  4. Perverse preferences: Imagine a pig with a single overriding desire -- to be brutally tortured. This preference is not mistaken or transitory; while being tortured, the pig's desire only grows stronger. But neither does the preference stem from great hedonic satisfaction during the process. The pig's desire is a nonaffective one, which perseveres in spite of intense agony.
  5. Infinite preferences: What do we make of a person who would not accept any amount of pleasure in return for someone insulting her reputation? While humans cannot experience infinite suffering or happiness at a given instant, preferences seem to be more open to non-Archimedeanism.

The response to "irrational decisions" is simple: Utilitarianism counts the preferences of all organisms, not just those existing right now, so we need to weigh your current self's preference for quick relief against your future selves' preference to not live with AIDS.

Liking vs. wanting is an important consideration. I agree with the intuition that liking should trump wanting, but my guess is that people who want something without liking it would prefer (meta-want) to not want the thing. For instance, drug addicts who crave an additional hit wish they didn't have those cravings. If meta-preferences can override or at least compete strongly with base-level preferences, the problem should usually go away. In cases where it doesn't go away, the situation reduces to one of "perverse preferences."

The remaining three objections I'll discuss in subsequent sections.

What is a preference, anyway?

Intuitively, when an organism has a preference, it wants the world to be in one state rather than another. For example, an animal in the rain may prefer to be warm and dry rather than wet and cold. Inside its brain, there's a system telling the animal that things would be better if it were inside.

Preferences can also extend beyond the hedonic wellbeing of a person. For example, deep ecologists may prefer that nature is kept untouched, even if no human is around to observe this fact (and, I would add, even if multitudes of animals suffer as a result).

Trivial systems

Consider the following:

  • Does a thermostat set to 22.5 degrees Celsius "prefer" for the room to be at that temperature? Do we violate the thermostat's preference if we open windows to the cold air and keep the temperature below that level?
  • Does a computer "prefer" to run its computations, and is that preference violated when you unplug its power source?
  • Does gravity "prefer" to pull balls toward the Earth, and do you temporarily violate that preference when you throw them in the air?
  • If you tap your knee with a hammer, does it "prefer" to jerk in response?

It seems that when we talk about preferences, we really mean the desires of some sort of agent, especially a conscious agent, rather than an arbitrary system or force of nature. If so, this already suggests some connection between hedonistic and preference utilitarianism: The agents that we count as having preferences tend, especially in our current biological world, also to be agents that have emotional experiences.

If preferences should be imputed mainly to minds that are conscious agents, different people may have different ideas about where to draw the boundaries of what a preference is -- since, indeed, even the boundaries of "conscious" and "agent" are up for dispute. The preference view makes it slightly more plausible that a broader class of agents has preferences than agents to whom we would have attributed pleasure and pain on the hedonistic view, just because it seems like preferences are conceptually simpler sorts of attributes that aren't so narrowly confined to hedonic systems of the type found in animal brains. But exactly how broadly we extend the notion of preference satisfaction is up to our hearts to decide.

As with consciousness in general, these questions are not binary. I might give extremely tiny weight to satisfying a thermostat's preference to have the temperature at 22.5 degrees, but this is so small that it can generally be ignored. Probably better to have ten million thwarted thermostats than one mouse shivering in the cold for 30 seconds.

It may be that micro-scale physical processes exhibit behavior analogous to thermostats, and we might wonder if these would dominate calculations due to their prevalence. This is worth considering, but keep in mind that a digital thermostat is a much more complex system than, say, a covalent bond between atoms. A digital thermostat is not only bigger but includes a small computer, a display, buttons for various settings, and so on. It's plausible that these things add moral weight, just as the extra complexity of animals adds moral weight above a thermostat.

If it seems absurd to give any consideration to thermostats, keep in mind that animals and people can be seen as very complicated thermostats -- using sensors and taking actions to keep themselves in homeostasis. This complexity includes higher-level thoughts, feelings, memories, and so on, and if we choose, we could require some minimum threshold of these characteristics before an agent's preferences counted at all. But if we don't set such a threshold, it's natural to see even the thermostat in your home as having a preference that matters to an extremely tiny degree.

Brain elections

Brains are ensembles of many submodules, which are themselves ensembles of many neurons. Some neurons and submodules push for one action (e.g., go to sleep due to tiredness), while others push for a different action (e.g., stay awake to reply to a comment). The coalition with more supporters wins the election in deciding your action choice. If the election is close, presumably the preference is not very strong compared with a landslide election (e.g., take my hand out of this pot of scalding water).

An interesting question is whether we should count the individual votes in the election separately or just the final outcome. In general this shouldn't much matter, because for example, if the election gave 45% of votes for sleeping and 55% for replying to the comment, the preference for staying up to reply to the comment would be relatively small (55% - 45% = 10%), and satisfying it would matter less than satisfying a landslide election, like removing your hand from hot water. So whether we say the preference is just 10%, or whether we say it's 55% for, 45% against, summing to 10% net votes for, it wouldn't affect our judgement. It's just a matter of whether the aggregation is done in the person's head or by our ethical evaluation.

(Note that the weight of a preference is determined both by the degree of mandate of the winner and the overall size of the populace. The hand-in-hot-water election occurs in a very big country where lots of neurons are sending strong votes, so that election matters a lot even beyond the fact that it had a landslide outcome.)
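A small Python sketch of the election metaphor, separating the winner's mandate from the size of the electorate. The vote counts are hypothetical and chosen only to mirror the two cases discussed (the 55%/45% comment-reply election and the hot-water landslide); real neural "votes" are of course not countable this neatly.

```python
def election_result(votes_for, votes_against):
    """Return (margin, net_votes) for a two-option 'brain election'.

    margin    -- winner's lead as a fraction of all votes cast (the "mandate")
    net_votes -- absolute lead, which also grows with the size of the electorate
    """
    total = votes_for + votes_against
    net_votes = votes_for - votes_against
    margin = net_votes / total
    return margin, net_votes


# Hypothetical vote counts, invented for illustration.
reply_margin, reply_net = election_result(votes_for=55, votes_against=45)
hand_margin, hand_net = election_result(votes_for=9_900, votes_against=100)
```

Under these made-up numbers the hot-water election wins on both counts: its margin is 0.98 rather than 0.10, and its electorate is a hundred times larger, so its net vote is 9,800 rather than 10.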

Perverse preferences

Where this discussion becomes relevant is in the case of "perverse preferences" raised as one objection to preference utilitarianism. Few biological agents exhibit strongly perverse preferences, but they certainly seem possible in principle. For example, imagine an artificial mind where the emotional center produces one output message, and then the sign gets flipped on its way to the motivational center. In this case, if we only look at the final output behavior, we conclude that the preference is to suffer, but if we extend most of our ethical concern to what the electorate actually felt in this "rigged election," we would conclude that the suffering should stop.
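The sign-flip thought experiment can be made concrete with a toy Python model. The signal value and function names are hypothetical; the sketch only shows how judging by output behavior and judging by what the emotional center produced can give opposite verdicts.

```python
def observed_behavior(emotional_signal, sign_flipped):
    """Signal that reaches the motivational center.

    Positive means 'seek more of this'; negative means 'avoid this'.
    """
    return -emotional_signal if sign_flipped else emotional_signal


felt = -8.0  # hypothetical: the emotional center reports strong aversion
behavior = observed_behavior(felt, sign_flipped=True)  # +8.0: the mind seeks torture

verdict_from_behavior = "continue" if behavior > 0 else "stop"
verdict_from_electorate = "continue" if felt > 0 else "stop"
```

Here the behavior-only verdict is "continue" while the electorate-level verdict is "stop", which is the divergence the "rigged election" image is meant to capture.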

Verbal vs. implicit preferences

This subsystem-level view can also help us see why an organism's verbalized preference output is not necessarily the only measure of its underlying preference. The person may be confused, or trying to conform to social convention, or misinformed, or otherwise introspectively inaccurate. While we have very effective brain systems whose goal is to predict how much we'll like or dislike various experiences, these predictions can be off target, and ultimately, the proof is in the pudding. The brain's response to actually experiencing the event should arguably play the strongest role in our assessment of what a person's preference is about that event, not so much his prediction beforehand or even recollection afterward.

In some sense, the neural-level viewpoint is the hedonist's reply to the preference utilitarian's Golden Rule intuition. The preference utilitarian says, "Treat others how you'd want to be treated, which means respecting their preferences." The hedonistic utilitarian replies: "People are not unified entities. There are multiple 'selves' within an organism with different responses at different times. It's true that some win control to decide behavior, but we should still care about the losers' preferences somewhat as well."

Idealized preferences

More generally, what preference utilitarianism actually cares about, in most formulations, are idealized preferences -- what the agent would want if it knew more, were wiser, had improved reflective capacity, had more experiences, and so on. Probably most preferences that appear perverse are actually just not idealized. Of course, idealization introduces a host of new issues, because the idealization procedure is not unique and may lead to significantly different outputs depending on how it's done. This is troubling, but if we believe idealization makes sense, it's best if we pick some plausible idealization procedure rather than avoid idealization altogether.

Neural authoritarianism

This view of considering brain subsystems and neurons rather than just explicit preferences and actual decisions is a sort of blend between hedonistic and preference utilitarianism: It feels a lot like hedonistic utilitarianism, because the agents whose preferences we're counting are the (mainly but not exclusively) hedonic subcomponents of the decision. Of course, if non-hedonic subsystems did override the hedonic ones, as is sometimes the case even in biological organisms, we might choose to favor the non-hedonic subsystems, depending on how much they seem to be genuine members of the neural electorate vs. how much they appear to be just voter fraud.

But what we gain in concordance between the hedonistic and preference views, we lose in autonomy by individual actors. For instance, suppose we could see that your neurons would, on the whole, accept a trade of being kicked in the knee in return for a trip to the amusement park. However, you feel you don't want to be kicked in the knee, and this would violate your right to refuse harm. Should we force you to be kicked against your will? This is a tricky question. My intuition says "No" because of the "violation of liberty" that's involved, but the flip side is to feel sorry for all those powerless neurons that are losing out on the amazing rides they could be enjoying. I might feel the opposite way if the scenario were inverted: If a person wanted to be kicked in order to get a day at the amusement park, even though the neurons would dislike the kicking more than they would like the roller coasters and Ferris wheels, then I'd be more inclined to say the person should not be allowed to get kicked.

In any event, in many cases outside of the toy example of a torture-wanting pig whose output behavior was distorted from the underlying hedonic sensations, neural votes probably don't diverge that much from people's autonomous choices, and even if they did:

  1. How would we know our second-guessing of people's introspection is more accurate than their introspection itself? It could easily be less accurate.
  2. Even if it were more accurate, exerting this kind of control over other people could make them feel resentful and opens the doors to corruption by authoritarian leaders.

On the Felicifia forum, Hedonic Treader rightly observed that we should err on the side of personal freedom in most cases. Of course, as Michael Bitton pointed out to me, we can also nudge people in better directions by using cognitive psychology to influence their choices without eliminating options.

Infinite preferences and negative-leaning utilitarianism

Consider someone who claims he would not accept even one second of torture in return for eternal bliss. If we take this at face value, it would imply that torture is infinitely worse than happiness for this person. Then if we try to combine this person's utility with that of other people, would his negative infinity on torture swamp everyone else?

One approach is to deny that this person actually has an infinite preference. Perhaps the person is misinformed about how bad the torture would be, and probably he's unfairly discounting the future pleasure moments. Taking a neural-level view, we might say that the aversive reactions to torture are not infinitely more powerful than the positive ones to happiness. Yet the person may still maintain his stance against these allegations.

While I think it's not right to let this single person's preference dominate the non-infinite preferences of others, I do think we should take it somewhat seriously and not simply override it on a neural view. We have to strike a balance between overcoming the irrationalities of explicit preferences versus avoiding neural authoritarianism. In this case, I would probably not treat the preference against one second of torture as infinite but as extremely strong and finite, requiring vast amounts of pleasure to be outweighed. Because few people express the reverse sentiment (that "I would accept infinite durations of headaches and nausea in return for one second of this blissful experience"), the existence of people with this anti-torture sentiment pushes somewhat toward a negative-leaning utilitarian view.
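To see why a literally infinite preference would swamp everyone else, and what treating it as "extremely strong and finite" does instead, here is a minimal Python sketch. The population size, the per-person utility of 5.0, and the cap of one million are all arbitrary, hypothetical numbers; choosing the cap is precisely the judgment call described above.

```python
import math


def total_utility(utilities, cap=None):
    """Sum utilities, optionally clamping each one to the range [-cap, +cap]."""
    if cap is not None:
        utilities = [max(-cap, min(cap, u)) for u in utilities]
    return sum(utilities)


others = [5.0] * 1000  # hypothetical: a thousand people each mildly better off
objector = -math.inf   # one person's stated infinite aversion to torture

uncapped = total_utility(others + [objector])          # -inf: one person decides everything
capped = total_utility(others + [objector], cap=1e6)   # 5000.0 - 1000000.0 = -995000.0
```

With the cap in place, the objector's preference still outweighs the thousand others in this example, but enough additional beneficiaries could in principle outweigh it, which is not possible while the value remains infinite.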

Of course, most people would accept a second of torture in return for eternal bliss (or even just very long bliss), but perhaps if the torture were bad enough, they also would change their minds in that moment. This should be taken seriously. It's also a reason why I think small amounts of very bad suffering are far more serious than lots of mild suffering: We're willing to trade mild suffering for mild pleasure even when enduring the mild suffering, but if the suffering becomes intense enough, we might not accept it in return for any amount of pleasure, at least not in the heat of the moment.

Preference utilitarianism is more constrained

Hedonistic utilitarianism allows for a large degree of flexibility in deciding exactly how much happiness and suffering a given experience entails. For example, negative-leaning utilitarians can set the suffering value of a very painful experience as much more negative than a more positive-leaning utilitarian would.

With preference utilitarianism, the utility assignments are more constrained because they should generally respect the observed preferences of the actor, although there are exceptions discussed above in cases like irrationality, time discounting, epistemic error, or major conflict between the brain's high-level output and low-level hedonic reactions. So, for example, when most people say they're glad to be alive rather than temporarily unconscious, we should generally take this at face value and assume their lives are above zero, at least at that moment.

Of course, there remains plenty of wiggle room for preference utilitarians to make judgment calls in deciding when the exceptions apply, as well as through interpersonal-comparison tradeoffs.

Preferences about the external world

In this piece I've mainly discussed selfish preferences: How an actor feels about her own emotions or other affairs regarding herself, such as whether her honor has been tarnished, whether she has been used as a means to an end, or various other concerns that may be more than immediately hedonic but still self-directed.

What about preferences regarding the wider world? One I mentioned already was deep ecologists' preference (which I do not share) for untouched nature, even if no one is around to see it. Various other moral preferences are of a similar type: Wanting to reduce poverty, increase social tolerance, limit human-rights abuses, reduce wild-animal suffering, and so on. In these cases, the actor does not just care about his own experience but actually cares about something "out there" in the world and would continue to care whether or not he was around to see it and whether or not he could fool himself into thinking his goal had been accomplished. I don't want to just feel like I've reduced expected suffering but rather want to actually reduce expected suffering.

A deathbed wish

Suppose a grandfather's dying wish is to leave his fortune to his favorite grandson. You're the only person to hear the grandfather make this request, and the default legal outcome is for some of the money to go to you. Is it wrong to not report the grandfather's wish? After all, if you let the default legal outcome happen, you'll be able to donate your share of the inheritance to important charities.

Well, there are a few instrumental reasons why it would be wrong: Lying is almost always a bad idea, and a society in which people lie, even when they feel doing so is right, would likely be worse than our present society. It's generally good to create a culture in which dying wishes are respected, and doing so here contributes to that goal.

But is there any further sense in which not honoring the grandfather's wishes is wrong? After all, he's already dead and can't feel bad about his wishes not being respected. He also had no idea his wishes wouldn't be carried out, because he assumed you were a trustworthy person.

This question is tough. It feels very counterintuitive to suggest that a person's preferences can be violated after he's dead by something that he'll never know about, and yet, if his preference actually referred to a thing in the world happening, and not just to his subjective perceptions, then in this case his preference would be violated.

I do know that for myself, I actually want my preferences about the world to be carried out, and I would regard it as wrong if they weren't. But is this special to my preferences because they're mine, or do my preferences say it's wrong when others' non-self-directed preferences are violated? I incline toward the latter view, because ethics is fundamentally about others, not about myself. However, I'm not completely sure, and people disagree on this point.

If we do take the view that it matters if preferences are actually fulfilled rather than just whether an organism thinks they are, then this makes sense of Peter Singer's stance that involuntarily killing persons is wrong, even if the persons would never realize they had been killed, because doing so violates their actual preferences to keep living. We might also ask whether animals that don't have a sense of themselves existing over time still have implicit preferences against dying; Singer doesn't think so, but if we count implicit preferences in other domains, why not this one? (Note that historically, I have not found painlessly killing animals to be wrong, so with this last remark I was challenging my own assumption rather than advancing a view I hold confidently.)

Double counting?

Another concern with respecting altruistic and not just selfish preferences is double counting: If someone cares about everyone, then helping others is good both for those others and for the person who cares about those others. If everyone cared about everyone else, then helping everyone would be good for an individual mainly through the effects on those other than herself. This seems weird, but maybe that's just because it doesn't describe the situation of our actual world. In practice, especially when we talk about actual rather than stated preferences, most of us devote a large fraction of our caring budget to ourselves.

Reading a private diary

Suppose you're at a friend's house, and your friend goes away for 15 minutes to take a shower. You're left in the living room, and you see the friend's diary on a shelf. The diary says "Private - Do Not Read," but you're curious, and you think, "It wouldn't hurt anyone to take a peek, right?" Is it wrong to read the diary if you could be sure no one would find out?

A hedonistic act-utilitarian might say reading the diary was okay, if it was really certain no one would find out and if doing so wouldn't have hurt your relationships or future behavior. A hedonistic rule-utilitarian or other meta-utilitarian might object to the harm that such activities tend to cause in general, or even the harm that such a principle would cause to utilitarianism itself. A preference utilitarian who accepts the importance of desires about the external world can object on an even simpler basis: Reading the diary violates your friend's preference even if she never finds out. I visualize these external preferences as being like invisible strings that we step on and break when we violate the wishes of someone not there to witness us doing so.

Most of our preferences involve wanting the configuration of our brain (in particular, our hedonic systems) to be one way rather than another. However, sometimes preferences involve wanting the external world to be one way rather than another (e.g., wanting the diary to be in the state of "not being read by other people"). Is there really a fundamental difference between these two kinds of preferences? The difference seems mainly to be about the degree of abstraction at which we interpret the actions and tendencies of your neural system.

Active vs. passive preferences

Hedonistic utilitarianism has the virtue that it's (relatively) clear when a given hedonic state is engaged, making counting of happiness and suffering (relatively) straightforward. In contrast, a person may have many preferences all at once, most of which aren't being thought about. We might call a preference that's merely latent in someone's connectome a "passive preference", while a desire that's currently being felt can be called an "active preference". Active preferences appear similar to hedonic experiences (and may induce or be induced by hedonic experiences), making them easier to count.

Do active preferences matter more than passive ones? If you have a preference that you never think about (e.g., the preference not to be held upside down by an elephant's trunk), does its satisfaction still count positively toward moral value? Given that a person appears to have infinitely many such passive preferences at any given time, if fulfillment of these preferences does matter, how do we count them? Or do most such preferences boil down to a few basic preferences, like not being injured, frightened, etc.?

The spirit and the flesh

Many of us feel a qualitative difference between different types of motivational states. The more basic ones seem to be pleasures/pains "of the flesh", often corresponding to clearly physical beneficial/harmful stimuli. We also have feelings "of the spirit" that don't inform us of specific somatic events but rather represent more abstract longings for the world to be different and joy upon seeing positive changes. "Soul hurt" may refer to events beyond our immediate lives -- such as, in my case, the existence of huge amounts of suffering in the universe.

In general, fleshly experiences feel more hedonically toned, while spiritual desires feel more oriented around preferences, but there's certainly overlap on each side. For instance, soul hurt does hurt a little bit hedonically, but not as much as the soul thinks it should compared with fleshly experiences. (Adam Smith: "If he was to lose his little finger to-morrow, he would not sleep to-night; but, provided he never saw them, he will snore with the most profound security over the ruin of a hundred millions of his brethren").

How should a moral valuation trade off people's fleshly experiences versus their spiritual desires?

Consider a mind like that of Mr. Spock: It lacks many ordinary accoutrements of emotion (physiological arousal, quick behavioral changes, etc.), but Spock's goal-directed calculations still embody preferences about how the world should be different. Does Spock then mainly experience soul hurt rather than fleshly hurt? Or does broadcasting of bad, negatively reinforcing news throughout Spock's brain still count as aversive hedonic experience to some degree even if it lacks the other processes that accompany emotions in humans?

Hedonic experience as the id's preferences?

Most people have an intuitive sense of what's meant by "raw pleasure" and "raw pain", but actually defining these hedonic experiences is slippery. One plausible definition could be "awareness of reward or punishment signals that trigger reinforcement learning, changes in motivation, evaluative judgments, and so on". This definition is complicated and fuzzy.

It's plausible that motivation should be an important part of hedonic experience. Pain asymbolia "is a condition in which pain is experienced without unpleasantness." This type of pain doesn't seem very morally bad to me, which suggests that the (at least implicit) desire for pain to stop is an important part of what makes pain bad. In other words, a (perhaps merely implicit and low-level) preference for pain to stop seems crucial to the hedonic experience.

In light of this, we might ask whether hedonic experience can be interpreted as a type of preference, perhaps with some extra features as well. In cartoon form, we might picture pleasure as "fulfillment of the preferences of the Freudian id", and suffering as frustration of those preferences. Other parts of us, such as our Freudian superegos, have other preferences, including moral desires. Perhaps when we contrast hedonistic vs. preference utilitarianism, we're mainly contrasting the preferences of the id versus the preferences of the superego?

While it's fashionable to belittle Freud, I think Freud's id/ego/superego distinction remains powerful, and the idea that our minds contain several competing subsystems seems correct. As one example of a more modern theory with similarities to Freud's, the distinction that Christiano (2017) draws between "cesires" and "desires" is reminiscent of the distinction between the id and the ego+superego.

My personal take

This section was written in 2013 and is somewhat out of date.

For a while I had a strong instinct towards hedonistic utilitarianism, and when I saw people violating it (e.g., in the case of Robert Nozick's experience machine), my immediate reaction was to think, "No, you're wrong! You're misassessing the tradeoff." I feel the same way when Toby Ord says he would accept a day of torture for ten years of happy life. I would vehemently refuse this trade myself, but if someone else chose it after careful deliberation, perhaps including trying some torture to see how bad it was, I might let that person do what he wanted.

In some of these cases where I can't believe that people hold the preferences they claim to, there may be biases going on: Discounting the future, factual mistakes, overriding major hedonic subsystems, deferring to social expectations, etc. However, in other cases it may genuinely be the case that other people are wired differently enough from myself that they do rationally prefer something different from what I would choose.

Taking the preference stance requires a higher level of cognitive control over my emotions than the hedonistic stance, because I have to abstract away my empathy from the situation and picture it more as a black-box decision-making process coming to some conclusion. If I look too closely at what that conclusion is, I'm tempted to override it with my own feelings.

Over time I've grown more inclined toward the preference view. It seems more elegant and less arbitrary, it appears to be a universal morality in the sense of resolving competing values and goals, and it encapsulates the Golden Rule, which is probably the most widespread moral principle known to humankind (and maybe even more broadly in the universe, since altruism seems to be an evolutionarily convergent outcome for agents that confront iterated prisoner's dilemmas). At the end of the day, morality is not about what I want; it's about what other people (and organisms in general) want.

Interestingly, Peter Singer -- once a prominent preference utilitarian -- has shifted in the opposite direction. In an episode of the podcast Rationally Speaking, Singer explains that he now aligns more closely with the sophisticated hedonistic view of Henry Sidgwick. Singer believes that only consciously experienced events matter, although we should construe hedonic experience more broadly than just raw pleasure and pain.

Non-conscious agents?

The preference-utilitarian view can lead to perspectives that most people would find strange. For example, suppose we encountered aliens whose overriding goal was to compute as many digits of pi as possible. If this was a clear preference by these conscious agents, a preference utilitarian would care about them achieving their goal, at least to some degree depending on their moral weight. This may seem preposterous to us, but remember that we're fundamentally no different, and if we were those aliens, we would really care about the digits of pi too. (Some values that other humans care about seem no less absurd to me than wanting to compute digits of pi.)

There's a perhaps even more strange implication of this view: Preference utilitarianism might lead us to care about entities like companies, organizations, and nation states. These, too, strategically act like agents optimizing a set of goals. Even though they're composed of conscious elements (the people who run them), their own utility functions may not correspond to conscious desires of anyone in particular.

We can see an interesting parallel between the consumer's utility-maximization problem and the firm's profit-maximization problem. For instance, both consumer utility and firm output are often modeled in economics as Cobb-Douglas functions. This makes sense because, biologically, humans are factories for producing evolutionary fitness, with diminishing marginal product for any given factor holding the others constant.
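As a minimal sketch of what "diminishing marginal product" means for a Cobb-Douglas function, the snippet below holds one factor fixed and adds units of the other. The exponent, the scale, and the input quantities are illustrative assumptions of mine, not estimates of anything biological or economic.

```python
def cobb_douglas(x, y, alpha=0.5, scale=1.0):
    """Cobb-Douglas form: scale * x**alpha * y**(1 - alpha)."""
    return scale * x**alpha * y**(1 - alpha)


def marginal_gain(x, y, step=1.0):
    """Extra output from one more unit of x, holding y fixed."""
    return cobb_douglas(x + step, y) - cobb_douglas(x, y)


# Hold y fixed at 4 (an arbitrary choice) and keep adding units of x.
gains = [marginal_gain(x, 4.0) for x in (1.0, 2.0, 3.0, 4.0)]
print(gains)  # each extra unit of x adds less than the one before
```

The same function can be read as a consumer's utility over two goods or as a firm's output from two inputs, which is the parallel the paragraph above points to.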

But are we really going to care about corporations apart from the people that comprise them? There's already a lot of backlash against legal corporate personhood, much less ethical personhood. It also seems slightly odd that a corporation could be "double counted" as mattering itself and also by the welfare of the people who constitute it. Of course, if there were a real, conscious China brain, even a hedonistic utilitarian would face double counting in some cases.

Right now I feel like I don't care about corporations or governments separately from the people who comprise them, but I'm not sure if this view would hold up on further reflection. I'm more inclined to care about non-conscious alien agents. Maybe this distinction comes down to being able to imagine myself as the alien better than as the corporation.

After writing this section, I discovered an important essay by Eric Schwitzgebel: "If Materialism Is True, the United States Is Probably Conscious." It provides some nice thought experiments for extending our intuitions about which computations we care about. My response is that asking "Is the United States really conscious?" is a confused question, analogous to "If a tree falls in the forest with no one there to hear it, does it really make a sound?" People have an easier time dissolving the latter question than the former, but they're structurally identical confusions.

Is any sufficiently advanced intelligent agent conscious?

Above I've been assuming it's possible for there to exist a sophisticated agent whom we don't consider to have phenomenal experience, according to our understanding of what phenomenal experience is like. Is this even possible? Obviously it depends on how we define the boundaries of phenomenal experience. Here's one argument why our ordinary conceptions of phenomenal experience may encompass any sufficiently intelligent agent.

What is conscious emotion for us? Many people have many theories, and I don't claim to have the answer. But it seems that emotion consists in systems that represent changes in expected reward/punishment, expression of drives and motivations, and then thinking, self-reflecting, planning, imagining, and idea-synthesizing about how to accomplish our goals. There are many other bells and whistles that come along as well. It seems a sufficiently intelligent agent would have all of these properties. Maybe the bells and whistles would look different, and the thinking might happen in different ways, but the fundamental process of having drives and exploring how to execute them seems common. If there is more to conscious emotion in animals than what I described, my guess is that what I've left out is like missing pieces in a jigsaw puzzle, rather than a completely fundamental feature without which all the other characteristics are completely valueless.

With the example of the United States as an agent, it's not clear if it meets the criteria for being "sufficiently intelligent" or "agent-like," especially since its boundaries as a unified agent are themselves unclear. For instance, suppose you ask a question to the United States. The answer would be written by some person within the country, using that person's brain. Why should we say the United States as a whole wrote the reply rather than that particular person did? I guess the United States would seem a more compelling agent if the country collectively wrote the reply.

Alternatively, we might prefer to define phenomenal experience more narrowly, to only encompass agent intelligence constructed in ways more similar to those found in animal brains. For instance, while the United States may have forms of self-reflection (e.g., public-opinion polling) that extend beyond what a single individual does, it's not clear that it has the same kind of emotional self-reflection as I do. The choice of how parochially we want to define consciousness is up to us.

Murray Shanahan presents an interesting argument in Chapters 4-5 of Embodiment and the Inner Life: Cognition and consciousness in the space of possible minds, summarized in his talk at the AGI 2011 conference. He suggests that general intelligence actually requires consciousness. Shanahan's main idea is that responding to completely new situations entails recruiting coalitions of many brain regions, working together in new ways rather than repeating old stereotypes. The most prominent coalitions are broadcast widely, which is what gives rise to consciousness according to global workspace theory, one of the leading theories of consciousness with good empirical support. This jibes with our experience, in which routine, automatic tasks (including even driving a car, walking, or brushing our teeth) can be done without being very conscious of them. Consciousness is required when the context is new, and you need to marshal help from many brain regions at once in an "all hands on deck"1 sort of way in order to respond to a novel challenge as best as possible. This kind of global recruitment of brain components may be common to many cognitive architectures to various degrees.

Ward (2011), p. 465:

the evidence is overwhelming that consciousness is functionally integrative and that this is the dominant fitness advantage it provides for conscious organisms. In other words, the role of this integrative processing is to provide internal representations ("models") of the niche-relevant causal structure of the environment, including objects and their surroundings and the events that take place there. Baars (2002) reviews extensive evidence that integrative conscious processing involves more widespread cortical activity than does unconscious processing.

Marvin Minsky views consciousness as coming along with intelligent computation:

I don't [see] consciousness as holding one great, big, wonderful mystery. Instead it's a large collection of useful schemes that enable our resourcefulness. Any machine that can think effectively will need access to descriptions of what it's done recently, and how these relate to its various goals. For example, you'd need these to keep from getting stuck in a loop whenever you fail to solve a problem. You have to remember what you did -- first so you won't just repeat it again, and then so that you can figure out just what went wrong -- and accordingly alter your next attempt.

One review of Minsky's The Society of Mind says: "The question is posed as to whether machines can be intelligent without any emotions. The author seems to be arguing, and plausibly I think, that emotions serve as a defense against competing interests when a goal is set. Emotional responses occur when the most important goal(s) are disrupted by other influences. Intelligent machines then will need to have the many complex checks and balances."

In "Philosophers & Futurists, Catch Up!" Jürgen Schmidhuber offers another account of why consciousness may be convergent within certain human-like learning architectures (pp. 179-180):

we have pretty good ideas where the symbols and self-symbols underlying consciousness and sentience come from (Schmidhuber, 2009a; 2010). They may be viewed as simple by-products of data compression and problem solving. As we interact with the world to achieve goals, we are constructing internal models of the world, predicting and thus partially compressing the data histories we are observing. If the predictor/compressor is an artificial recurrent neural network (RNN), it will create feature hierarchies, lower level neurons corresponding to simple feature detectors similar to those found in human brains, higher layer neurons typically corresponding to more abstract features, but fine-grained where necessary. Like any good compressor the RNN will learn to identify shared regularities among different already existing internal data structures, and generate prototype encodings (across neuron populations) or symbols for frequently occurring observation sub-sequences, to shrink the storage space needed for the whole. Self-symbols may be viewed as a by-product of this, since there is one thing that is involved in all actions and sensory inputs of the agent, namely, the agent itself. To efficiently encode the entire data history, it will profit from creating some sort of internal prototype symbol or code (e. g. a neural activity pattern) representing itself (Schmidhuber, 2009a; 2010). Whenever this representation becomes activated above a certain threshold, say, by activating the corresponding neurons through new incoming sensory inputs or an internal 'search light' or otherwise, the agent could be called self-aware. No need to see this as a mysterious process -- it is just a natural by-product of partially compressing the observation history by efficiently encoding frequent observations.

However, Schmidhuber goes on to suggest that there exist theoretical intelligent agents that are not conscious in any familiar sense of that term:

Note that the mathematically optimal general problem solvers and universal AIs discussed above do not at all require something like an explicit concept of consciousness. This is one more reason to consider consciousness a possible but non-essential by-product of general intelligence, as opposed to a pre-condition.

So maybe we would regard theoretical optimal problem solvers as unconscious. Or maybe we would consider our primate-based notions of conscious agency too narrow and expand our sphere of concern to encompass any sort of powerful intelligence for ethical calculations.

Hedonic happiness, preference-based suffering

If one holds the hedonistic-utilitarian view and believes that not all preferences matter, then one might encourage creating organisms that are motivated positively by pleasure but negatively by non-hedonic preferences against harm. They would then enjoy the good moments but robotically act to avoid bad moments without "feeling" them. I think this perspective is based on an overly parochial view of what we care about morally, and I think any violated preference is bad to some degree. But maybe I would care slightly more about familiar hedonic suffering, in which case agents of this type might be at least better than the default.

David Pearce proposes the idea of equipping ourselves with robotic prostheses (incorporating a manual override to ensure individual autonomy) that would catch people before they made decisions that would cause harm, such as touching a hot stove or stepping off a cliff. Insofar as such devices would implement prevention against harm (similar to neuronal reflex responses that don't reach the brain) rather than negative reinforcement in response to harm, it's not clear how much they would pose an ethical issue even to a preference utilitarian. That said, I have doubts about the ability of such systems to generalize to many preventative contexts without a higher-level intelligence of their own. Perhaps they could be useful in limited circumstances, in which case they would be a rather natural extension of existing protective devices like seat belts and safety goggles.

Idealized preferences and agent-moments

Every young boy wants to become a paleontologist when he grows up. Then, as he matures, he realizes that other careers might in fact be more rewarding. When he takes a job other than digging for dinosaur bones, is he violating the preferences of his younger self?

I think the answer is plausibly "no," and the reason is that, as mentioned previously with reference to misinformed preferences, the boy's preference to be a paleontologist lacks insight into what his life will actually be like at a later stage. His stance is more of a prediction: I expect that when I'm older, I'll then want to be a paleontologist. His idealized preference -- to do what makes him happy -- was really the same the whole time, and it's just his assessment of what he would enjoy that changed.2

So we have just one idealized preference -- do what makes me happy -- for both Young Self and Old Self. Does this mean that when we count preferences, this counts just once? In contrast, if Young Self had genuinely different values on an issue that Old Self did not share even upon idealized reflection, each of these would count separately, making two total preferences?

Or take a more extreme example: If a being has a consistent idealized preference over its billion years of life, is satisfying this preference no more intrinsically important than satisfying a similar preference by a being that pops into existence for a millisecond and then disappears immediately after the preference is fulfilled? Shouldn't the extra duration count for something?

Yes, I think extra duration should count. In particular, rather than asking whether a given idealized preference was held, I would count the number of agent-moments for which a given idealized preference was held. After all, in an infinite multiverse, all physically possible idealized preferences will be held with some measure by some random configurations of matter. What we really care about is how many agent-moments hold a given idealized preference. The billion-year-old agent had astronomically more agent-moments holding its preference, so fulfillment of its preference counts astronomically more.
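To make the counting proposal concrete, here is a rough sketch that weights a preference by the number of agent-moments holding it. The length of a "moment" (one second here) is an arbitrary assumption, and the sketch treats every moment as counting equally, which ignores questions about varying preference strength.

```python
from dataclasses import dataclass


@dataclass
class HeldPreference:
    holder: str
    duration_seconds: float  # how long the idealized preference is held


def agent_moment_weight(pref, moment_seconds=1.0):
    """Number of agent-moments holding the preference (hypothetical unit)."""
    return pref.duration_seconds / moment_seconds


SECONDS_PER_YEAR = 365.25 * 24 * 3600
long_lived = HeldPreference("billion-year being", 1e9 * SECONDS_PER_YEAR)
fleeting = HeldPreference("millisecond being", 1e-3)

ratio = agent_moment_weight(long_lived) / agent_moment_weight(fleeting)
print(f"{ratio:.2e}")  # how many times more weight the long-lived being gets
```

Because the moment length cancels out of the ratio, the relative weights don't depend on which unit of time we pick.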

This proposal may actually be counterintuitive. It suggests, for instance, that helping elderly people might matter more than helping young people if the elderly people's preferences (for instance, that they be happy in their old age) were held longer than those of the young people. Or maybe we would anticipate the future preferences of the young people to have lived happily and try to satisfy those in advance.

A further question is whether preferences held over time matter for all agent-moments that would idealize to that preference or only when the agent is thinking about that particular preference. At the very least, it seems like preference strength may vary in time? I always implicitly prefer for all my past, present, and future selves not to be beaten up, but if I were being beaten up, I would really prefer not to be beaten up then.

In general, the project of intertemporal preference satisfaction is tricky! Ordinarily preference utilitarians sweep it under the rug, because the current self has all the power, and past and future selves are at its mercy. Someone who firmly committed last month to exercise 3 times a week might grow lazy and stop caring about what seemed to his past self a crucial New Year's resolution.

Hedonistic utilitarianism avoids the mess that intertemporal preference utilitarianism seems to generate because

  1. whereas past, present, and future agents can all simultaneously have preferences about their past, present, and future selves, hedonic emotions are only counted when they're experienced
  2. whereas organisms may have implicit idealized preferences about everything at every moment, they only experience certain specific feelings at a given moment.

Could we make preference utilitarianism more like hedonistic utilitarianism in these regards? We could require that only preferences about one's current state count, and then only when one is actively thinking about that preference. This seems to me to go too far towards a hedonistic view, because it doesn't allow the thwarting of preferences like "I want to actually reduce suffering rather than be tricked into thinking I'm reducing suffering" to be intrinsically bad, apart from the fact that, from the perspective of the preference-utilitarian framework, not actually reducing suffering is bad.

Other remaining questions for preference utilitarianism

Some of these questions were touched on in this piece, but my own opinions on them are not fully resolved.

How do we compare utility across organisms?

  • Economists would translate to "willingness to pay" and then make these comparisons in dollars. However, this gives the most weight to the most wealthy people.
  • Political realists would think in terms of bargaining leverage and say that an organism's weight is just the share that it's able to negotiate during the deal-making process. Like the economist approach, this gives most weight to those with most power and gives no intrinsic weight to animals, babies, unborn people, etc. except via the caring of those who have power now.
  • Common-sense egalitarian intuitions suggest that every individual should count the same, but this leaves underspecified what "the same" looks like, since there is no objectively comparable measure of utility. Of course, we can make approximations. It would be unreasonable to say that a billionaire derives more utility from an extra $5 spent on his yacht than would the person out in the cold for whom he could buy a coat.
  • Neuroscience intuitions suggest that we may need to apply some differential weighting based on brain capabilities, although I'm unsure how strong this weighting should be. If we apply none at all, then plants and bacteria might dominate expected-value calculations.

How do we compare utility within an organism over time?

This is a special case of comparing utility across organisms if you think of an organism as a collection of organism-moments. The question is important because people may make foolish decisions that they later regret, or they may heartlessly morally ignore the horrible suffering that their past selves endured because it's a sunk cost to them now. In particular, how do we deal with torture victims who temporarily wished they were dead but may not feel the same in retrospect?

Which organisms have preferences?

Two plausible ways of approaching this question are (1) degree of consciousness (phenomenal stance) and (2) degree of agency (intentional stance). It seems plausible to me that either of these may qualify an organism for moral consideration. Obviously agency is relevant for strategic compromise; I don't know if I'm inadmissibly giving it intrinsic value as well.

Suppose there were an intelligent but non-conscious agent. Would it be weird to care about it? One of the main reasons humans feel empathy is reciprocal altruism, so it's almost more natural to extend concern to other high-level agents than it is to extend it to low-level hedonic creatures like small animals. However, the small animals benefit from our being better able to put ourselves in their place and feel what they feel.

In The Age of Spiritual Machines, Ray Kurzweil deflects issues about whether robots will be conscious by saying that we'll interact with them, develop personal relationships with them, etc., and this will eventually "convince people that they are conscious." When I first read the book in 2005, I thought this response was inadequate; I thought we needed to know whether these robots would be actually conscious. Now I see a kind of sophisticated wisdom to Kurzweil's point.

In A Cow at My Table, Tom Regan says of veganism:

I think everybody has that capacity to stop and think and say, "If I knew you, I wouldn't eat you."

And in some ways, it really is that simple.

By analogy, once we started getting to know an advanced artificial agent, even a "non-conscious" one, we would begin to sympathize with its dreams and fears.

What if neural subsystems strongly disagree with the final stated preference?

See the discussion of "perverse preferences" above.

Population ethics

Is creating a new unfulfilled preference bad? Clearly yes. Is creating a new fulfilled preference good? This is less clear to many people, including myself. How do we trade off preference fulfillment vs. frustration? This is not an obvious question because different person-moments may disagree on the tradeoff. As mentioned earlier, someone being tortured sufficiently terribly would not accept any compensation in return for it continuing.

Power weighting vs. equality

In economics, there's a classic distinction between Pareto efficiency and distributive justice: Pareto transactions can get you to an efficient outcome, but this is sensitive to initial endowments, i.e., how much bargaining leverage various parties have. A person or group with basically no power will get very little in the final efficient allocation. As an Athenian says in the Melian dialogue by Thucydides: "the strong do what they can and the weak suffer what they must".

Our utilitarian intuitions push us toward feeling that everyone's interests should count equally, that we shouldn't give favoritism to some just because they're more mighty or wealthy. On the other hand, these equality intuitions face problems of their own: How do we weigh different brain sizes and types? How do we even count brains? Presumably Chinese citizens and a whole China brain both count? How much each? Where do we draw the boundaries of these minds? How do we compare the strengths of conflicting preferences for two different minds? This becomes very complicated, and one might be tempted to throw up one's hands and say that this equality business is just too hard and arbitrary. The only thing that's not arbitrary is the outcome of bargaining given the universe's chosen initial endowments. Should we just accept that? My intuition says no, because, for instance, it leaves vast numbers of powerless animals and future generations in the dust except insofar as certain powerful humans happen to care about them.

When we encounter moral disagreement, one intuitive response is to say, "Ok, you care about X, I care about Y, so let's adopt a compromise morality of (X+Y)/2." We can then keep doing this across enough people and get basically a preference utilitarianism that resolves moral disagreement. However, the problem is, Over what set of individuals do we aggregate these preferences? Do we include animals? China brains? And with what weights? Including these other agents seems more fair in an equality sense, but if we do so, other powerful people tend to object and say that we're inflating the weight of our utilitarian morality vis-a-vis their non-utilitarian morality by including these extra preferences in the calculations. They might push for something closer to efficiency based on status-quo endowments of power.

One example where this comes up is with debates over nature preservation. I claim that reducing primary productivity is probably net good for wild animals in the long run. (This ignores potentially detrimental effects on global stability. Therefore, I'd recommend curbing ecosystems in ways that have low negative or even positive impacts on international cooperation and peace in the long run.) Others claim that the intrinsic value of nature outweighs the pain of individual animals. Just considering these two viewpoints, one might try to adopt a stance somewhere in the middle. However, when we also consider the preferences of all the animals who actually have to be eaten alive in the wild on equal footing with the (weak) personal preferences of the conservationists, the balance would shift to a stance of opposing wilderness so strongly that the pro-wilderness views would be negligible in the calculation. But the pro-wilderness people don't like this; they may object that this framework of dispute resolution is somehow unfairly biased in my favor. Or maybe they would assert that nature itself has some sort of preference not to be perturbed, and this preference is strong enough to outweigh quintillions of suffering animals.

Veil of ignorance

One argument for equality weighting rather than power weighting is a "veil of ignorance" approach: Imagine that you are to become a random agent. Then you'd aim to maximize expected preference fulfillment, with equal weighting across all agents. Of course, this thought experiment ultimately begs the question: Why is the random distribution uniform? It could just as well have been a brain-size-weighted random distribution or a power-weighted random distribution. And it still leaves the messy problem of carving out what physical processes count as agents, what their preferences are, how strong those preferences are relative to one another, etc. The veil of ignorance thus doesn't provide any answers; it just furnishes some boost to our egalitarian intuitions.
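The dependence on the sampling distribution can be shown with a toy calculation. In the sketch below, the agents, their power levels, and their fulfillment scores under two policies are all made-up numbers; the only point is that the policy favored behind the veil flips when the "random agent" is drawn by power rather than uniformly.

```python
agents = {
    # name: (power, fulfillment under policy A, fulfillment under policy B)
    "powerful": (9.0, 0.9, 0.6),
    "weak_1": (0.5, 0.2, 0.7),
    "weak_2": (0.5, 0.2, 0.7),
}


def expected_fulfillment(policy_index, weight_fn):
    """Expected fulfillment if you become an agent drawn with
    probability proportional to weight_fn(agent)."""
    total = sum(weight_fn(a) for a in agents.values())
    return sum(weight_fn(a) / total * a[1 + policy_index]
               for a in agents.values())


def uniform(agent):
    return 1.0


def by_power(agent):
    return agent[0]


for label, weight_fn in (("uniform", uniform), ("power-weighted", by_power)):
    a = expected_fulfillment(0, weight_fn)
    b = expected_fulfillment(1, weight_fn)
    print(label, "veil favors policy", "A" if a > b else "B")
```

A brain-size-weighted veil would just be a third choice of weight_fn, which is why the thought experiment alone can't tell us which weighting is right.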

There's an extensive literature on fair division, with various procedures for achieving splits of goods among agents with different preferences. An efficient division is not necessarily fair, because as the Wikipedia article notes, giving everything to one agent would be Pareto optimal but not equitable.
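The gap between efficiency and fairness can be checked by brute force in a tiny example. The item values below are hypothetical, utilities are assumed to be additive, and I use proportionality (each agent gets at least 1/n of the total by its own valuation) as one fairness criterion among several in that literature.

```python
from itertools import product

values = [  # values[agent][item]; hypothetical numbers, all positive
    [4, 3, 3],
    [2, 5, 3],
]
n_agents, n_items = len(values), len(values[0])


def utilities(allocation):
    """allocation[item] is the index of the agent who receives that item."""
    return [sum(values[a][i] for i in range(n_items) if allocation[i] == a)
            for a in range(n_agents)]


def is_pareto_optimal(allocation):
    base = utilities(allocation)
    for other in product(range(n_agents), repeat=n_items):
        u = utilities(other)
        if (all(x >= y for x, y in zip(u, base))
                and any(x > y for x, y in zip(u, base))):
            return False  # someone gains and nobody loses
    return True


def is_proportional(allocation):
    """Each agent gets at least 1/n of the total value by its own lights."""
    return all(u >= sum(values[a]) / n_agents
               for a, u in enumerate(utilities(allocation)))


everything_to_agent_0 = (0, 0, 0)
print(is_pareto_optimal(everything_to_agent_0))  # efficient...
print(is_proportional(everything_to_agent_0))    # ...but not fair
```

With these particular numbers, the allocation (0, 1, 0) passes both checks, so efficiency and this notion of fairness are compatible here; the point is only that the first doesn't imply the second.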

Intensity weighting?

If we do equality weighting, do we normalize the utility of each organism to be on the same scale (e.g., between 0 and 100 for the worst and best possible outcomes, respectively)? Or do we weight by some sense of "intensity" of the feelings? A normalization approach is methodologically cleaner and is also more appropriate for game-theoretic calculations, but it intuitively feels like we should weigh by intensity. For instance, suppose one agent's life consists in only two possible outcomes: Either it gets to eat 62 cookies, or it gets to eat 63 cookies. Assuming it prefers the latter, eating 63 cookies would have scaled utility of 100, compared with scaled utility of 0 for eating 62 cookies. Yet compare this with someone who might either enjoy a fulfilling life (100) or be tortured for many days on end and then killed (0). Are we really going to count these on comparable footing in the equality-weighted calculus?

My personal answer here is that in terms of intrinsic value, I would apply intensity weighting. For strategic calculations, game theory dictates using an organism's own utility function, whatever that may be. Actually, with game theory, only relative comparisons matter; normalization just makes calculations easier.
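Here is a small sketch of the contrast, using the cookie example. The raw "intensity" numbers are invented, and putting two minds on one shared intensity scale is exactly the hard step that the sketch assumes away.

```python
def normalize(raw, worst, best):
    """Rescale a raw utility so that worst maps to 0 and best maps to 100."""
    return 100.0 * (raw - worst) / (best - worst)


# Hypothetical raw intensities on an assumed shared scale.
outcomes = {
    "cookie_agent": {"worst": 62.0, "best": 63.0},     # 62 vs. 63 cookies
    "human": {"worst": -10_000.0, "best": 1_000.0},    # torture vs. good life
}

# Stakes = gap between an agent's best and worst outcomes.
normalized_stakes = {
    name: normalize(o["best"], o["worst"], o["best"])
    - normalize(o["worst"], o["worst"], o["best"])
    for name, o in outcomes.items()
}
intensity_stakes = {
    name: o["best"] - o["worst"] for name, o in outcomes.items()
}

print(normalized_stakes)  # both agents have the same stakes after scaling
print(intensity_stakes)   # the human's stakes dwarf the cookie agent's
```

Normalization erases the difference by construction, while intensity weighting preserves it at the cost of needing those cross-agent numbers in the first place.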

That game theory doesn't care about intensity of emotions was encapsulated nicely by Ken Binmore in Natural Justice (p. 27):

Players can't alter their bargaining power by changing the scale they choose to measure their utility, any more than a physicist can change how warm a room is by switching from degrees Celsius to degrees Fahrenheit.

Hat tip to John Danaher's excellent blog post, "Egalitarian and Utilitarian Social Contracts," for the quote.
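Binmore's point about scale can be checked numerically. The sketch below uses the Nash bargaining solution over a finite set of outcomes as one standard example of a bargaining rule; that choice, the outcome names, and the payoff numbers are all assumptions for illustration rather than anything specified in the discussion above.

```python
def nash_bargain(outcomes, u1, u2, disagreement):
    """Pick the outcome maximizing the product of each player's gain over no deal."""
    d1, d2 = u1[disagreement], u2[disagreement]
    # Only outcomes that leave neither player worse off than no deal are candidates.
    feasible = [o for o in outcomes if u1[o] >= d1 and u2[o] >= d2]
    return max(feasible, key=lambda o: (u1[o] - d1) * (u2[o] - d2))

def rescale(utilities, factor, shift):
    """Apply a positive affine transformation (factor > 0) to a utility function."""
    return {o: factor * u + shift for o, u in utilities.items()}

outcomes = ["no deal", "split A", "split B", "split C"]
# Hypothetical payoffs for two players.
u1 = {"no deal": 0, "split A": 2, "split B": 5, "split C": 8}
u2 = {"no deal": 0, "split A": 9, "split B": 6, "split C": 1}

original = nash_bargain(outcomes, u1, u2, "no deal")
# Player 1 now reports the same preferences on a scale 1000 times larger, shifted by 7.
inflated = nash_bargain(outcomes, rescale(u1, 1000, 7), u2, "no deal")
print(original, inflated)
```

Both calls select the same outcome: multiplying one player's utilities by a positive constant multiplies every candidate product by that constant, and the shift cancels against the disagreement point, so the ranking of outcomes is unchanged.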

Is creating new satisfied preferences good?

Hedonistic utilitarianism values good experiences, i.e., when a brain realizes that it's receiving rewards. The value of pleasure is not intrinsically dependent on external outcomes. Since pleasure itself is valued, creating more organisms to feel happy is good according to regular (non-negative) utilitarianism. (Note: I'm a negative utilitarian.)

What about when it comes to preferences? Does the preference utilitarian value

  1. the state of an organism having a preference satisfied? or
  2. the actual content of the preferences that current organisms have?

The first of these cases is similar to hedonistic utilitarianism, in that what's valued is either an experience of preference satisfaction by the organism or at least the satisfaction of the organism's preference as an event that happens, even if the organism isn't aware of the preference being satisfied. This is the lens through which I've been interpreting preference utilitarianism in most of this piece. According to this valuation system, it's good to create new preferences that get satisfied. For instance, if you could give everyone the additional preference that 2+2=4, then since this preference is satisfied, doing so increases moral value. (Thanks to Lukas Gloor for inspiring this example.)

The second case -- valuing the actual content of preferences -- is very different from hedonistic utilitarianism. It amounts to treating as morally right the weighted average of the wishes of all existing organisms. The goal is not to create a state in which organisms have satisfied preferences; it's, rather, to achieve whatever goals the preferring organisms had. Hence, giving everyone additional preferences would be irrelevant or even harmful, because those new preferences wouldn't conduce to satisfying the content of the existing preferences. This implies a view closer to negative utilitarianism -- one that Christoph Fehige calls "anti-frustrationism": "we have obligations to make preferrers satisfied, but no obligations to make satisfied preferrers" (see p. 16).

Questions remain about how to treat future organisms. Certain kinds of future organisms "exist" according to eternalism, so we should presumably count their preferences even now. But whether and which kinds of future organisms exist is affected by our actions, making calculations trickier. The expected values of our choices depend on which beings get created, but which beings get created depends on the expected values of our choices. I guess the solution is just to consider each choice in turn and evaluate it against the aggregated morality of whatever organisms would exist if that choice were taken. For instance, if you take option A, it creates a being who wishes it hadn't been created, while if you take option B, it creates a being indifferent to being created. If all else is equal, the world is better according to the aggregate morality of all past, present, and future organisms if you choose option B.
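The procedure of considering each choice in turn can be sketched as follows. This is only an illustration under strong assumptions: the agent names and satisfaction scores are hypothetical, every agent is weighted equally, and scores are simply summed.

```python
def evaluate_choices(choices, agents_if, score):
    """Score each choice against the agents who would exist if it were taken.

    agents_if(choice) returns the past, present, and future agents under that choice.
    score(agent, choice) returns how well the choice satisfies that agent's preferences.
    """
    totals = {c: sum(score(a, c) for a in agents_if(c)) for c in choices}
    best = max(totals, key=totals.get)
    return best, totals

# Hypothetical setup: two existing agents, plus one new being per option.
existing = ["alice", "bob"]
created = {"A": "regretter", "B": "indifferent"}

# Existing agents are equally satisfied either way ("all else is equal").
satisfaction = {
    ("alice", "A"): 1, ("alice", "B"): 1,
    ("bob", "A"): 2, ("bob", "B"): 2,
    ("regretter", "A"): -1,    # wishes it hadn't been created
    ("indifferent", "B"): 0,   # indifferent to being created
}

best, totals = evaluate_choices(
    ["A", "B"],
    agents_if=lambda c: existing + [created[c]],
    score=lambda agent, c: satisfaction[(agent, c)],
)
print(best, totals)
```

Because the set of agents is computed per choice, the circularity is sidestepped: each option is judged only by the beings that option would bring about, and here option B comes out ahead.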


The view about neural-subsystem voting was inspired by conversations with Anna Salamon and Carl Shulman. Some other parts of this piece originated from a discussion with Ben West. Sasha Cooper inspired the section on idealized preferences and agent-moments.

Postscript: Dialogue on non-hedonic preferences

On 15 Dec. 2013, I talked with a friend about whether non-hedonic preferences deserve moral weight. Below is an edited and reworded version of that conversation.

Friend: Why would something non-sentient be of any value?

Brian: Golden Rule intuition: I would ask another agent to do what I care about most, not what makes me personally happiest. If it did what would make me personally happiest, it would wipe my brain of concern for suffering and fill me with drugs. Likewise, if another organism really cares about something unrelated to hedonics, it seems like the ethical thing is to do what it wants, not what I want. Do you not like the Golden Rule? Even if you want to follow the Golden Rule, it's not totally clear how to do it. But it would be a preference view of some sort.

Friend: I find the arguments against preference utilitarianism very compelling: (1) The non-experiential nature of preferences. (2) The thought experiment involving an organism who has a preference for suffering.

Brian: (1) Aren't preferences more fundamental, though? Your experiences matter because of how you care about them. (2) Yes, even preference utilitarianism gets into tricky issues about which parts of the brain have which preferences, how we weigh them, etc. I argued in the above piece that preference utilitarianism probably would not endorse torturing a pig that wants it, depending on what subcomponents of the pig's brain are counted as having preferences and how robust the torture-desiring system was.

Friend: (1) This is interesting, but I can't see how something that produces no feeling is any more or less valuable than a rock. (2) Yeah.

Brian: As far as (1), it's not valuable to you, but it's valuable to the other agent. That's what makes the Golden Rule more difficult than it seems. I wonder if some people find regular human altruism similarly challenging to motivate, if they feel less intrinsic empathy.

Friend: Hmm, to some extent my ethics are also an extension of my own hedonism. I just recognize there are cheaper places to buy hedons than inside my body, so maybe that's another reason I like the hedonistic view.

Brian: Right, that's the form of ethics I used to endorse. One friend called it something like "selfish altruism."

Friend: What's your view on population ethics regarding preference utilitarianism?

Brian: Instinctively I would incline toward negative, but that's again "doing what I want." A meta-level preference-utilitarian assessment of population ethics based on the desires of existing agents would look different. I don't necessarily endorse such a meta-level view, but I'm toying with it. In practice, my view on this might be like normal people's views about regular altruism: I might give 25% of my time/effort/tradeoffs to the thoroughgoing preference-utilitarian view, and then the rest can be what I want (i.e., negative-leaning population ethics).

Friend: When I was first learning about ethics I thought something like you're currently describing might be the closest you could get to a kind of objective "should": "We should do what it is collectively believed we should do." Like, people thinking it should be actually means that it should be. Luckily I decided against that, since I don't think moral realism makes sense.

Brian: In my case, it's not moral realism. It's more like "being nice." Or, as Eliezer Yudkowsky says regarding his proposal of coherent extrapolated volition (CEV): "I'm an individual, and I have my own moral philosophy, which may or may not pay any attention to what our extrapolated volition thinks of the subject. Implementing CEV is just my attempt not to be a jerk" (p. 30).

Since having this conversation, I've realized that the Golden Rule argument for preference utilitarianism is slightly circular, because the Golden Rule itself presupposes that preferences are the currency of value: "do unto others as you would have them [i.e., as you prefer them to] do unto you". We could instead formulate a hedonic golden rule: "Do unto others so as to change their hedonic well-being in analogy with what would increase your hedonic well-being." This sounds somewhat weird but could be shortened to "Improve others' hedonic well-being." That said, I think the preference-oriented golden rule is more general and simple: It says to respect others' preferences regardless of the content of those preferences, whereas the hedonistic golden rule presupposes particular content.


  1. Daniel Dennett also mentions this simile in Consciousness Explained, attributing it to Odmar Neumann.
  2. I also have sympathy with the opposite view: Yes, the older self does violate the preference of the younger self. A new "ruling coalition" in the person's brain has overridden the old one, just as strong groups of humans sometimes mercilessly destroy weak ones.