Expected Value and Normative Gettier Problems
I want to make sense of the following case:
You want to believe something because of reason A, BUT
Reason A is not a convincing principle, BUT
Reason B, which is a convincing principle in general, also supports this belief, BUT
In the case of this particular belief, Reason B is not the reason why you believe the thing, and if it weren’t for reason A, reason B would not convince you of this at all
This sounds strange and abstract, but actually I have run into a number of cases which seem to look something like this, and I am not totally sure what to do with them. On its face, it seems to make the supported belief overdetermined (generally I will be discussing this in the context of “normative” beliefs specifically, beliefs about what one ought to do). It is something that you want to believe, and it is something that you have a good principled reason to believe. For all this, however, it strikes me as more like a sort of “Gettier Problem”. For the unfamiliar, a “Gettier Problem” is a situation in which someone has a justified true belief, but because of how independent the truth and justification of this belief are from each other, it nonetheless seems like that person doesn’t actually know the thing they believe. A classic example is depicted here.
I have many problems with this recent article on longtermism which I won’t get into in detail, but among other things, it seems to depict longtermist thinkers as extremist for defending activism that can only shift the probability of an outcome by a very very very small amount, for sufficiently high stakes possibilities like AGI extinction scenarios. I have not read some of the cited sources (and based on Bostrom’s past work on reductios against extreme expected value reasoning, I suspect it is interacting with many of these sources misleadingly), but it is true that extreme expected value arguments of this sort are sometimes brought up in defense of work on existential risks – indeed, this is not the first time I’ve run into a critique of AGI research that says something like this. I have occasionally seen longtermist philosophers refer to this type of expected value edge case as the problem of “fanaticism”, roughly meaning that some possibilities are so high stakes that they dominate expected value calculations even when their probabilities are absurdly tiny. This type of case will show up a great deal in this post, and I will occasionally refer to it by this name.
The trouble is, I have not run into any thinkers who both think we should pour a good deal of energy and resources into a longtermist intervention, and also believe that this will only move risks by one chance in a trillion or something. The people who point out the expected value case for caring about these risks even if they are extremely unlikely, in my experience, tend to estimate pretty scary odds that these risks will be realized. Toby Ord for instance estimates a one in ten chance that AGI will cause an existential disaster in the next century. Bringing up the extreme expected value case works as a sort of hedging – the odds of AGI extinction risks are very uncertain, and it is hard to bring people onboard to the plausibility of odds like one in ten, so you can also point out that there’s a principled case even if the odds are extremely off from this estimate. This is my interpretation of most arguments I’ve seen of this sort anyway 1.
In a way, I find these sorts of arguments disingenuous for the simple reason that regardless of principled strength, they clearly would not convince anyone. Indeed it seems much easier to convince someone that there is a one in ten, or for that matter a nine in ten chance, that AGI will drive us extinct in the next century, than it is to convince someone that they should care even if the odds are only one in a trillion. On the other hand, setting that floor maybe allows people to draw their own line in the sand for the point when the expected value becomes unconvincing – maybe the expectation is that someone will not buy either that there is a one in ten chance of AGI catastrophe, or that they should care even if there is a one in a trillion chance, but rather that if they can be convinced that there is at least a one in a hundred chance, this is still the type of risk they would normally take seriously enough to invest much more into research and mitigation. This argument just shows that the expected value cost/benefit has already been demonstrated to work out for pretty much any such plausible probability threshold.
This is a case of mixing distinct empirical and normative arguments because they serve different purposes. Another example like this is one I’ve recently discussed, Pascal’s Wager. Let us say that you are a Catholic priest who thinks that there is a nine in ten chance that Catholicism is correct, but because you know this is a hard sell, you make the case that being Catholic is more prudent than being an atheist, in principle at least, down to an infinitesimal probability. (This is complicated by the fact that, as I mentioned, it is not clear what specific beliefs would be most favored by Pascal’s Wager, and it is unlikely to be pure, straightforward Catholicism, so the priest is still in a bit of a pickle if they want to convince the atheist that they should convert to Catholicism rather than something else, but to simplify the example, just assume the argument is whether it is more prudent to be an atheist or a Catholic).
Now, in a sense, the Pascal’s Wager part of this argument is disingenuous, because if someone really did think converting to Catholicism only improved their odds of avoiding hell by something like one over Graham’s Number, they would not be convinced, and the priest probably wouldn’t be either in their shoes. Instead, the priest can only hope that the atheist settles on some intermediate probability that they find compelling, while being aware in the background that the expected value cost/benefit goes through at this credence 2.
These mixed cases are not quite the ones that interest me yet. They are cases where someone makes an argument that didn’t convince them of their position, but they actually do have a belief that is enough to justify their position on its own (unusually high credence in Catholicism/AGI disaster), it is just hard to bring someone all the way to that belief, so it is paired with an instrumental argument that casts a much much broader net. The sort of case that most interests me is one where it is questionable whether someone’s belief can be persuasively justified based on their personal reason for believing it, and as a result they don’t appeal to their real personal reason at all.
A different example from the priest might be of the lapsed Catholic. Imagine someone who used to be Catholic, but eventually became convinced of atheism. In fact, looking back at all the specifics of Catholicism, they come to believe that there is only a one in a billion chance that Catholicism is correct. Nonetheless, they are terrified of Catholic hell. The idea that, if Catholicism is correct as they once believed, they will now spend eternity in this horrible place, keeps them up at night. When they reflect, they still only think that this is a one in a billion chance, but their anxiety about this remote possibility causes them to wish that they were Catholic once again, and to wonder if they should try to reconvert themself. Now this hypothetical lapsed Catholic does not take any other one in a billion risks this seriously, even infinitely bad ones like hell. Although they in fact believe that there is an expected value argument that being Catholic is safer than being an atheist, that is not the reason they want to be Catholic again. That has to do with unrelated specifics of their personal history. Could this person, nonetheless, argue that their reconversion is reasonable because of the expected value arguments?
This is closer to the sort of normative Gettier Problem that concerns me most. This person thinks that there is a good principled argument for the thing they, unrelatedly, want anyway. Are they fully rationally justified in going forward with this? In reconverting to Catholicism? I think there is an impersonal, trivial sense of the argument in which this is rational, in particular if your definition of rationality is following a certain rule. Just as it would be rational for the lapsed Catholic to reconvert (or convert to some other, maximally hell-avoiding religion) even if they found this entirely unintuitive, the fact that they do find it intuitive doesn’t change that reconverting is still the rational choice.
I have mentioned in the past that it doesn’t seem as though it is bad in some special, non-trivial way, to break with one’s principles when they are not sufficiently convincing to dissuade one of a strong intuition. Or at the very least it makes more sense to accept a genuinely compelling principle you will not always follow than to support an unconvincing principle you will always be able to follow. In a way, I find compelling the argument that the lapsed Catholic is an odd case of this: a case in which the person will act in some way regardless of principle, but as it happens their choice is complemented by principle as well. There are, though, three problems I have with closing the books on these cases with this account.
The first problem is that I think bringing up both intuition and abstract principle can be used as a way for an argument to patch itself, and look better than it is. If you have two different arguments that prove your point, it can look superficially like the argument is stronger, even if a different version of both would be necessary for either to be particularly persuasive. In this case for instance, a single argument that properly connected the intuitive and principled factors should be more convincing than two arguments that separately demonstrate each.
The second problem is a related, more conscious version of this. Someone might be unwilling to do something for just one of these reasons, but might be willing to for both. As an example, maybe a normal atheist who thinks Pascal’s Wager is dumb will just muscle through their fear of hell, finding it too silly to act on, even if they still want to believe in hell. The hypothetical lapsed Catholic, on the other hand, might feel relief at seeing an excuse to believe in hell, if they decide that Pascal’s Wager does (sort of) work. Is it appropriate to feel relief when we run into an unconvincing argument for the thing we already wanted to believe? I think even if the argument “goes through” in some important, principled sense, it is doubtful that this relief is appropriate. If it were, it would give us more discretion to selectively believe whatever we want, arguments be damned. Once again, it seems as though there are some things we are so resistant to believing that we won’t believe them no matter the arguments, but now we could add onto the pile things we want to believe, and can come up with a good excuse for believing.
This feels strangely adversarial to me, like principles are some independent agent one can negotiate with. For instance maybe pure expected value favors something other than Catholicism, but it still says that Catholicism is more prudent than atheism, so the lapsed Catholic, who wants to believe in Catholicism but not the alternate, maximally hell-avoiding religion, will meet principle halfway by becoming Catholic. But this is a bullet someone could bite, that it is not odd to treat conflicts between principle and intuition like two agents negotiating, it just strikes me as questionable. At the very least one couldn’t say that they listened to any specific reasoned argument in this case, they would have to present the existence of this weird internal negotiation as their justification. 3
The third problem is that, unlike the cases where one chooses to act despite a principle, it seems like it may often be difficult to tell when one is in a situation like this one. That is, if you think there is a compelling principled argument for something, and you believe in that thing, then without more careful investigation, it may be easy to dismiss as something you believe in for the stated reason. I think utilitarians can be especially susceptible to this when responding to critics.
A classic criticism of utilitarianism 4 is that a utilitarian surgeon will dissect one healthy patient to get lifesaving transplants for five others. An equally classic utilitarian rejoinder, when confronted with this case, is to say that the negative effects of this are obviously huge, and in particular it would collapse trust in the healthcare system if this sort of thing happened. An even more sophisticated rejoinder I am quite sympathetic to is that, although the utilitarian is horrified by this case impulsively, beyond the level of social impacts, a good utilitarian has reason to nurture and feed this impulsive reaction, because it is the type of intuition that will generally be valuable.
And yet… all of this can be controlled for. We can imagine a surgeon with any psychology we want, in any position arbitrarily favorable to covering up the transplants' circumstances. The defense that appeals to the usefulness of the intuition against this is probably the hardest to control for, because it tangles up being a good utilitarian so much with the reaction. Nonetheless, this is a genuinely counterintuitive implication for many people, including, I would expect, some utilitarians, even after they control for all of this. I have actually become fairly comfortable with this implication myself, it is something I can accept when all of the controls are in place if I look at it right 5. However there are plenty of other utilitarian implications I am deeply uncomfortable with, like torturing one person to entertain millions (not an outcome I can impartially wish would come about if I removed all agency). This intuition remains even when I control for the obvious utilitarian retort that no good society could work like that.
Here’s another case involving expected value which seems to have similar properties to the lapsed Catholic case, but which I am much more invested in rescuing: elections. A popular point among cynics is that voting doesn’t make a difference. The odds of affecting the outcome with your vote are negligible, so why consider it worthwhile? I have seen critics of democracy, such as the anarcho-capitalists Bryan Caplan, Jason Brennan, and Michael Huemer in this interview, cite it as a devastating issue with voting – people vote carelessly, because they know that individually their vote won’t matter.
This argument is curiously in tension with common sense. In order for this low probability of impact to be a key issue with democracy, it seems as though people must be expected to vote for the candidate they do not actually want to win much of the time. Perhaps this happens sometimes, but strain as I might, I can’t imagine this being a particularly large-scale issue. I celebrate when the person I voted for wins, and I’m upset when they lose. If I knew I was the deciding vote in the last two general elections I voted in (I have only voted in two so far), I would have voted the same way, and so, I think, would pretty much everyone else I know. A more significant possible criticism is that people often wish for certain election outcomes for bad reasons, like spite for the other side, rather than because of careful moral reasoning or even self-interest. These sorts of problems seem to relate, in my opinion, to the preference and information synthesis issues existing democracies have, problems I have written about in the past and which make me excited about projects like RadicalXChange even though I criticize them.
The issue does not seem to be accounted for by the low probability of influence, because I believe we almost all do behave as though our vote will decide the election (alright, we need to control for strategic voting here to make that statement, but I think it at least applies to two-candidate elections). And yet… the cynics and anarcho-capitalists are right, the odds of influencing an election are unbelievably small. How do we account for the intuitive value of voting? When first reading “Reasons and Persons”, I ran into an explanation that has since interested and bothered me. One of Parfit’s “Five Mistakes of Moral Mathematics” was the view that we can ignore very very low probabilities. To counter the “mistake”, he gave examples of cases where expected value made actions with very very low probabilities of leading to the desired outcomes look reasonable. One example was elections, which he estimated, in the case of close elections, had an expected value of the impact of the election’s outcome on two average voters (minus the cost of voting itself of course).
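The arithmetic behind an estimate of this kind can be sketched in a few lines. This is only an illustration, not Parfit’s own calculation: the function name and every number below are hypothetical placeholders, chosen so that the expected benefit comes out near the “two average voters” figure mentioned above.

```python
def expected_value_of_voting(one_in_n_decisive, people_affected,
                             benefit_per_person, cost_of_voting):
    """Expected net benefit of casting one vote.

    All inputs are hypothetical. Benefit and cost are measured in the
    same arbitrary unit (here: "the stake of one average voter").
    """
    # A tiny probability (1 / one_in_n_decisive) times a huge payoff
    # (people_affected * benefit_per_person).
    expected_benefit = people_affected * benefit_per_person / one_in_n_decisive
    return expected_benefit - cost_of_voting


# Assumed: a 1-in-100-million chance of being decisive, 200 million
# people affected, and a cost of voting worth a tenth of one voter's stake.
net = expected_value_of_voting(
    one_in_n_decisive=100_000_000,
    people_affected=200_000_000,
    benefit_per_person=1.0,
    cost_of_voting=0.1,
)
print(f"expected net benefit: {net:.2f} voter-stakes")
```

The result is only as good as the probability fed into it, and that probability is exactly the number that is hard to estimate and easy to dispute.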
I shared this page on the RIT Effective Altruism Discord, supplemented with the point that, since lots of affected parties can’t vote at all in the election, such as non-humans, future generations, and people from other countries, the overall expected value of voting is actually a great deal higher than this. One member actually commented on the server that this argument had convinced them that voting was in fact rational. More recently, 80,000 Hours expanded this argument into a post that estimated the expected impact of voting in different situations. And yet… isn’t this type of reasoning the sort of thing that many Effective Altruists will reflexively call a “Pascal’s Mugging”? Textbook fanaticism? Is similar expected value reasoning taken seriously in other contexts? Wouldn’t we be doing really weird stuff if we were normally convinced by this reasoning? I think this may, in a way, be like the lapsed Catholic example, in that we already believe that voting is worthwhile, but this explanation doesn’t convince us of it.
I don’t think we decided that voting was important because we multiplied a big number by a small number and got a respectably medium number. This leaves me with two possibilities. Either there is something about this case that makes something like expected value reasoning convincing, in a way it isn’t in comparable risk/reward situations, or else this is a weird edge case of a normally convincing principle that happens to flatter our intuition.
There are cases where even extremes of expected value seem convincing because of some other feature of the situation. An example is that if we are engaging in risk/reward situations a number of times greater than or equal to the reciprocal of the lowest probability event 6, it seems intuitively reasonable to make choices based on pure expected value. That is, expected value is the well-known solution to long-term success in gambling, and if you gamble enough, good expected values are going to be compelling even for very very low odds per bet 7. This is, in fact, one of the foundational arguments in defense of expected value, and the reason expected value is not always compelling is that we are often in situations that feel importantly different. If there is a one in a billion chance that a choice will give everyone a perfect Heaven forever, but a 999,999,999 in a billion chance it will drive us extinct, it doesn’t seem like we are almost guaranteed to win in the long run. Whatever the expected value is, we are almost guaranteed to lose in the long run. If you only have one choice ever, you will make it much differently than if you have a million more similar choices. Can we find a differentiating feature, like the long-term gambling case, in the voting case, which makes the expected value argument at least part of the reason voting seems worthwhile, even if it isn’t enough alone?
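The difference between the one-shot case and the repeated case can be made concrete with a small calculation. The sketch below assumes the bets are independent and identical, which is my simplification rather than anything argued for above, and the function name is made up for illustration. It computes the chance of winning at least once in n bets, 1 - (1 - p)^n.

```python
import math


def chance_of_at_least_one_win(p, n):
    """Probability of winning at least once in n independent bets,
    each won with probability p, i.e. 1 - (1 - p)**n.

    log1p/expm1 keep the result accurate when p is tiny.
    """
    return -math.expm1(n * math.log1p(-p))


p = 1e-9  # a hypothetical one-in-a-billion bet
for n in (1, 1_000_000, 1_000_000_000, 10_000_000_000):
    print(f"{n:>14,} bets -> {chance_of_at_least_one_win(p, n):.6f}")
```

With a single bet the chance of ever winning is just p. Once the number of bets reaches the reciprocal of p, the chance of at least one win is about 63 percent (1 - 1/e), and it approaches certainty as the number of bets grows past that. This is the sense in which long-run gambling feels different from a single irreversible choice.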
I really hope so, because things get even worse than just voting. After all, what are the odds that your activism or money or research will be the crucial factor in mitigating an extinction risk? In making politics more effective and just? In accomplishing anything important on the large scale? The expected value might work out, but is that really what motivates you? Maybe we could rescue the rationality of donating to causes that fairly directly help people with their donors' money, but, at least for those that require large staffs, we still have the problem that none of the people working for the charity have a good rational reason to do so considering that they are probably, individually, replaceable.
I mentioned in an earlier footnote that although I don’t think there are any longtermists who literally believe we should devote tons of resources to something like AGI if it’s a one in a billion risk, that would not stop some from saying that it is worth taking a one in a billion chance on working to help with something like AGI. There is a sort of partial recognition of the generality of this problem here, activists in all areas devote a good deal of energy to activism that has extremely low chances of changing things on the margin, in all sorts of cases. Taking extreme risk/reward reasoning seriously is totally normal, haven’t you heard of the five mistakes of moral mathematics? This is ultimately my key, or one of my key problems with the fanaticism objections to longtermist reasoning. It isn’t that longtermists don’t endorse fanaticism in justifying cause priorities, it is that I think, when they do, they are making a Gettierish mistake about why they actually take an intervention seriously. Fanaticism is, again, a known problem among longtermist philosophers, it is just that entirely giving up on fanatical reasoning also seems unacceptably limiting, and not just when it comes to justifying longtermist cause areas.
The crucial problem set aside is that fanaticism isn’t the only way to distinguish different types of expected value cases. Shouldn’t devoting your career to a one in a billion risk you can certainly mitigate count about as much as taking a position that will give you a one in a billion marginal chance of personally mitigating an extremely probable risk? Not only does it count as much in expected value terms, but even if you decompose the expected value back into the risk and its reward, it will look about the same. Again, the crucial difference isn’t the small number multiplied by a big number.
A critic who is close to the AGI folks, and so I think reflexively charitable to them despite reservations, is Peter Singer. Like Torres in his piece, Singer seems to think AGI should be a lot lower, and climate change a lot higher, on longtermists’ priorities. The issue, which Torres seems to ignore, and which Singer seems more sympathetic to when it is pointed out, is marginal value. It is not that AGI is more important than climate change (though many longtermists think so); the most common consideration cited by Effective Altruists in my experience is that AGI safety is extremely neglected, and climate change is much much less neglected. Effective Altruism could pretty much single-handedly turn AGI safety research into a real field, but can have relatively little impact on climate change efforts on its own. Singer thinks that AGI risk, or more precisely the likelihood of current research mitigating AGI risk, is quite small compared to climate change 8.
Although I am not as pessimistic about AGI safety research as I think Singer is, this is certainly true to a degree. We have lots of promising leads for directions to help mitigate climate change, and as a risk, climate change is an ongoing crisis, with almost certain much much worse effects on the horizon. Let’s assume for a minute that it turned out that the marginal, expected impact of AGI research and climate change research was the same. This could turn out to be the case because the risk of one is higher, but the reward of work on the other is higher, but it could also be the case that, on a per individual basis, the risk is similar, and the reward is similar. I think Singer’s reaction could still be relatable.
Looking at these examples, an alternative to the expected value/fanaticism accounts suggests itself. What is intuitive about participating in activism isn’t that your contribution has a high expected value, it’s that the total impact of all people engaging in similar activism has good chances of paying off, not just a large impact if it does. I am not sure why this feature is intuitive, but it predicts the difference in intuitions about extreme expected value cases very well. It is why AGI research looks crazy to some people even when they account for neglectedness, it’s why voting seems to make sense.
I think there are a few possible reasons for this, if you want to justify this intuition in a deeper way. The most principled one is a sort of categorical imperative (which for semi-unrelated reasons I think is a poor guide for action, and bad at leading to unique conclusions on its own). You might think that we can safely view our individual decision as like a collective decision if we expect a large collective of people to follow similar reasoning and be motivated to perform similar actions in service of this goal. You may not make much difference on the margin, but you could think that it would suck if everyone in a similar position to you thought that way, and if you think that way and they don’t, you would wind up being a sort of moral free-rider.
Another possibility is that there is a natural way to look at these cases of collective action as a scaled version of intuitively reasonable smaller-scale activities. Take the voting example, if you vote in a nation of five citizens, you have a large amount of marginal voice, but there are a lot fewer people you could have an impact on. The expected value works out in favor of voting, and on this scale, it is non-fanatical. Each step you take towards the “fanatical” version, voting in a large national election, is a step that scales both impact variables up and changes little else. There are more people diluting your voice, at the same time that there are more people being impacted by it. We could probably tell some version of this story to convert any intuitive choice into a fanatical one by taking minor steps, but there is something hard to explain that is more reasonable sounding in the type of story you tell where things like voting and activism are concerned. I think this story is somewhat less principled than the first possible explanation, but the least principled one is just that we don’t have much of an alternative if we are trying to have an impact on large events.
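The scaling story can be illustrated with a toy model. The key assumption, which is mine and not something established above, is that the chance of casting the deciding vote shrinks in proportion to one over the population, while the number of people affected grows in proportion to the population. The function and parameter names are hypothetical.

```python
def vote_expected_value(population, benefit_per_person=1.0, closeness=1.0):
    """Toy model of one vote's expected value.

    Assumes the chance of being decisive is closeness / population
    (capped at 1) and that the outcome affects everyone in the
    population equally. Both are modelling assumptions.
    """
    probability_decisive = min(1.0, closeness / population)
    return probability_decisive * population * benefit_per_person


# From a nation of five citizens up to a large national election.
for population in (5, 5_000, 5_000_000, 500_000_000):
    print(f"{population:>11,} citizens -> EV {vote_expected_value(population):.3f}")
```

Under this assumption the expected value stays roughly constant at every scale: each step dilutes your voice by the same factor by which it multiplies the people affected. If the chance of being decisive actually falls faster or slower than one over the population, the picture changes, so this shows the structure of the story rather than supporting it.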
If there is some large, morally significant event that may happen, we want to influence it. When we are alone in our attempts to act, it is impossible to deceive ourselves about the type of impact we expect our actions to have. When we are part of a group of likeminded people working towards the goal, we have the psychological option to identify our actions with the collective, rather than with our personal contribution to the cause. This seems more like explanation than justification, so I tend to think that this is one of the more pessimistic possible explanations. If we accept it as the key motivator, then we really are in the normative Gettier Problem situation I’ve been discussing when we engage in nearly any action meant to help with important causes.
I want to add, however, that although expected value doesn’t seem to be enough on its own to explain how people are convinced to act collectively for causes, that doesn’t mean it is sufficiently irrelevant that we have to assume it isn’t the “real” motive in any sense. The most obvious way is that expected value is a convincing tie breaker or impact translator.
What I mean by tie breaker is that it seems as though if you control for collective impact, the odds of personal influence still make a difference in large-scale collective action. For instance, compare voting on a bill as a member of Congress with voting on the same bill in a ballot initiative as a civilian. The collective whose voice you are contributing to will have the same impact in both cases, but your marginal influence is much greater in Congress than on the ballot. The intuitive difference between the two can’t be accounted for by the impact of the collective, but can be accounted for by the difference in expected values.
What I mean by impact translator is that you can compare cases where you will have some personal, almost certain impact, with ones where you can engage in activism as part of a large collective. As an example, let’s say you have a choice between voting on a foreign aid bill, or personally going overseas to help those in extreme need (and let’s say your absentee ballot will not be able to arrive on time for you to do both). Expected value could help you make a choice between the two. If you solely compared impact of the collective with impact of individual action, you would always choose to vote, because the collective action will pretty much always have much more impact. You will choose to take the collective action even when this seems absurd based on your marginal influence on it. When the choice is to vote for a bill to invest more in the fire department, you will not step out of the voting line to save the lives of a family trapped in a burning building yourself. Here is where expected value seems valuable once again, to answer whether you should go help the poor overseas yourself, or vote on a bill to provide foreign aid, you should figure out your marginal expected impact in both cases, and compare them.
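The contrast between ranking options by collective impact and ranking them by marginal expected impact can be sketched as follows. The numbers are invented for the fire department example and carry no empirical weight, and the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Option:
    name: str
    collective_impact: float     # good done if the whole effort succeeds
    marginal_probability: float  # chance your own act makes the difference

    @property
    def marginal_expected_impact(self) -> float:
        return self.collective_impact * self.marginal_probability


# Hypothetical figures, in lives saved.
options = [
    Option("vote for fire department funding", 1_000, 1e-6),
    Option("rescue the family yourself", 4, 0.5),
]

by_collective = max(options, key=lambda o: o.collective_impact)
by_marginal = max(options, key=lambda o: o.marginal_expected_impact)

print("ranked by collective impact:", by_collective.name)
print("ranked by marginal expected impact:", by_marginal.name)
```

Ranking by collective impact alone keeps you in the voting line, while ranking by marginal expected impact sends you into the building. In this role expected value acts as a common currency between the two kinds of action.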
The trickier question is the degree to which you can trade off between collective expected value situations and “fanatical” ones. Basically, suppose you have two scenarios, each with very low odds of personal impact but high impact if successful. In one, your act is part of a more impactful overall collective; in the other, the individual expected value of your act is higher. We can think of the climate change/AGI question again. If we suppose that the collective impact of activism on climate change today will be greater than that of activism on AGI safety, but your marginal impact will be higher in AGI safety, because you will have more personal influence on the effects of the collective, then we are making a tradeoff like this to some degree: raising the expected value of the action we might take, in exchange for making it more and more similar to the classical fanaticism case, where we take an individual action with very very low chances of very very large impact.
Some degree of this tradeoff seems reasonable to me, for instance I am more optimistic about the possible impact of AGI research than people like Singer, and think the tradeoff involved in AGI research is at least comparable in intuitive value to getting involved with climate change research, when all facets of both cause areas are properly estimated. However, I am not sure where the crucial point is, at what point the low expected impact of the collective outweighs the overall higher expected value. At what point do you stop voting in a very large election to give your wallet to the Pascalian mugger? Given my concern that people may not have a principled way to differentiate their intuitions in these cases, the scary thing is there may literally be nothing you can really appeal to in deciding this threshold. Overall, problems in expected value in general and these types of cases of expected value in particular occupy my mind a great deal. I want to say that when I say I vote because of the high expected value, I am telling the truth, but on the other hand I am pretty sure that I am not telling the whole truth. I think that more thought on the rationality of cases like this needs to be done, but I am not ready to give any satisfying accounts of how to deal with the existence of cases like the ones I have discussed in this post. It just seems that they give a reason that we cannot assume that our preferred behavior is rational just because it is supported by a generally compelling principle.
To be clear, I am not arguing that longtermist thinkers don’t believe we should take individual actions that change the chances of these risks by a very very small amount. Most do, but this is actually a complicated issue that will figure as one of the main interests of this piece later on. ↩︎
As it happens, I think any religious belief changes one’s odds of not winding up in hell by so little, that it has not reached my threshold for an argument I would normally take seriously, but the priest might hope someone’s threshold and credences will be more favorable than mine. ↩︎
If I imagine a version of this case that removes the agent and only asks for a comparison of outcomes, I find the answer intuitive. Say in one possible outcome the five patients die while the one healthy patient lives, and in another the one healthy person trips, and their organs go flying all over, landing coincidentally into the patients who need them (at what point do thought experiments like these become entirely too silly?). I would consider the second outcome better, and when I consider all of the factors that seem to me to differ between this case and the surgeon personally facilitating this outcome, my problems come from the social and psychological elements we are trying to control for. ↩︎
Another possible objection I leave aside is the possibility that AGI research actually makes the relevant risks worse. This for instance seems to be one of Glen Weyl’s criticisms, and Alex Berger has highlighted this concern as well. I think this is unlikely, and it would make me extremely pessimistic if it was true (it would suggest that in the apparently fairly likely event that AGI is eventually developed, our best case scenario is the one in which no one has worked to make it safer), but it is a possibility worth taking very seriously. A comparable point could be made in some of these other cases, for instance a common criticism of my Pascal’s Wager post was raising the possibility that worship of any kind will actually increase the overall odds of damnation, which I again do not think is either more likely or very much less strange and disturbing an implication. ↩︎
Ed. Note: I can’t find where I read this (a Gwern article?), but there was a paragraph somewhere about how Google’s infrastructure gets “one-in-a-million” and “one-in-a-billion” glitches and errors all the time, due to its sheer scale. ↩︎