This Thing of Darkness I Acknowledge Mine: Chapter on Responsibility and Implicit Bias

Here’s a draft of the chapter of my moral psychology textbook. It’s on implicit bias and responsibility.  This one was much more depressing to write than the one on preferences.  As always, questions, comments, suggestions, and criticisms are most welcome.


“This thing of darkness I acknowledge mine.”

~ William Shakespeare, The Tempest, 5.1.289-290

1 Some incidents

At 12:40 AM, February 4th, 1999, Amadou Diallo, a student, entrepreneur, and African immigrant, was standing outside his apartment building in the southeast Bronx. In the gloom, four passing police officers in street clothes mistook him for Isaac Jones, a serial rapist who had been terrorizing the neighborhood. Shouting commands, they approached Diallo. He headed towards the front door of his building. Diallo stopped on the dimly lit stoop and took his wallet out of his jacket. Perhaps he thought they were cops and was trying to show them his ID; maybe he thought they were violent thieves and was trying to hand over his cash and credit cards. We will never know. One of them, Sean Carroll, mistook the wallet for a gun. Alerting his fellow officers, Richard Murphy, Edward McMellon, and Kenneth Boss, to the perceived threat, he triggered a firestorm: together, they fired 41 shots at Diallo, 19 of which found their mark. He died on the spot. He was unarmed. All four officers were ruled by the New York Police Department to have acted as a “reasonable” police officer would have acted in the circumstances. Subsequently indicted for second-degree murder and reckless endangerment, they were acquitted on all charges.

Like so many others, Sean Bell, a black resident of Queens, had some drinks with his friends at a club the night before his wedding, which was scheduled for November 25th, 2006. As they were leaving the club, though, something less typical happened: five members of the New York City Police Department fired about fifty bullets at them, killing Bell and permanently wounding his friends, Trent Benefield and Joseph Guzman. The first officer to shoot, Gescard Isnora, claimed afterward that he’d seen Guzman reach for a gun. Detective Paul Headley fired one shot; officer Michael Carey fired three bullets; officer Marc Cooper shot four times; officer Isnora fired eleven shots. Officer Michael Oliver emptied an entire magazine of his 9 mm handgun into Bell’s car, paused to reload, then emptied another magazine. Bell, Benefield, and Guzman were unarmed. In part because Benefield’s and Guzman’s testimony was confused (understandably, given that they’d had a few drinks and then been shot), all of the police officers were acquitted. New York City agreed to pay Benefield, Guzman, and Bell’s fiancée just over seven million dollars (roughly £4,000,000) in damages, which prompted Michael Paladino, the head of the New York City Detectives Endowment Association, to complain, “I think the settlement is a joke. The detectives were exonerated… and now the taxpayer is on the hook for $7 million and the attorneys are in line to get $2 million without suffering a scratch.”

In 1979, Lilly Ledbetter was hired as a supervisor by Goodyear Tire & Rubber Company. Initially, her salary roughly matched those of her peers, the vast majority of whom were men. Over the next two decades, her raises and those of her peers, which when awarded were a percentage of current salary, were contingent on periodic performance evaluations. In some cases, Ledbetter received raises. In many, she was denied. By the time she retired in 1997, her monthly salary was $3727. The other supervisors – all men – were then being paid between $4286 and $5236. Over the years, her compensation had lagged further and further behind that of men performing substantially similar work; by the time she retired, she was making between 71% and 87% of what her male counterparts earned. Just after retiring, Ledbetter filed charges of discrimination, alleging that Goodyear had violated Title VII of the Civil Rights Act, which prohibits, among other things, discrimination with respect to compensation because of the target’s sex. Although a jury of her peers found in her favor, Ledbetter’s case was appealed all the way to the United States Supreme Court, which ruled 5-4 against her. Writing for the majority, Justice Samuel Alito argued that Ledbetter’s case was unsound because the alleged acts of discrimination occurred more than 180 days before she filed suit, putting them beyond the pale of the statute of limitations and effectively immunizing Goodyear. In 2009, Congress passed the Lilly Ledbetter Fair Pay Act, loosening such temporal restrictions to make suits like hers easier to prosecute.

Though appalling, Ledbetter’s example is actually unremarkable. On average in the United States, women earn 77% of what their male counterparts earn for comparable work. A longitudinal study of the careers of men and women in business indicates that Ledbetter’s case fits a general pattern. Although no gender differences were found early in their careers, by mid-career women reported lower salaries, less career satisfaction, and a weaker sense of being appreciated by their bosses (Schneer & Reitman 1994). Over the long term, many small, subtle, but systematic biases often snowball into an unfair and dissatisfying career experience.

Why consider these cases together? What – other than their repugnance – unites them? The exact motives of the people involved are opaque to us, but we can speculate and consider what we should think about the responsibility of those involved, given plausible interpretations of their behavior and motives. This lets us evaluate related cases and think systematically about responsibility, regardless of how we judge the historical examples used as models. In particular, in this chapter I’ll consider the question whether and to what extent someone who acts out of bias is responsible for their behavior. The police seem to have been in some way biased against Diallo and Bell; Ledbetter’s supervisors seem to have been in some way biased against her. To explore the extent to which they were morally responsible for acting from these biases, I’ll first discuss philosophical approaches to the question of responsibility. Next, I’ll explain some of the relevant psychological research on bias. I’ll then consider how this research should inform our understanding of the moral psychology of responsibility. Finally, I’ll point to opportunities for further philosophical and psychological research.



Here’s a short conceptual analysis of bragging….


The fact that Wikipedia lists me as a notable alumnus of my college speaks ill of the reliability of crowd sourced information.

~ Tweet by @johnmoe

1. Aim to impress


The speech act of bragging has never been subjected to conceptual analysis.  This paper fills that lacuna.[1]  The most-studied speech act is assertion.  Less attention has been paid to other speech acts, such as requests, promises, declarations, and apologies.  We argue that bragging is a special form of asserting.[2]  Specifically, a speaker brags just in case she aims to impress her addressee with something about herself by asserting something about herself.

Many speech acts characteristically aim at generating a particular type of mental state in the addressee.  Assertion aims to generate belief.  Promising aims to generate trust or reliance.  Commands aim to generate intentions.  We contend that bragging aims to generate the state of being impressed.  It suffices for present purposes to characterize being impressed as a distinctive mental state, which we think is best construed as an emotion akin to awe, wonder, and admiration. Our first claim, then, is that someone doesn’t count as bragging if she isn’t trying to impress her addressee.

Consider a case: your interlocutor tells you, “I used to play fly-half for the Oxford rugby team.”  Let’s contextualize this conversational gambit.  If you, like the speaker, are a rugby aficionado and realize that the fly-half position is arguably the most important on the team, then you are likely to be impressed.  Intuitively, if the speaker makes this assertion to another sports fan, he is bragging.  However, if you’ve just told him that you feel nothing but contempt for sports and sportsmen, then unless he’s simply clueless it would hardly seem that he’s bragging.  After all, he can’t intend to do what he takes to be impossible, and it’s likely that he thinks it’s not possible to impress you with his sporting prowess.  Perhaps he’s telling you something about himself to test whether you can be friends.  Perhaps he’s purposefully outing himself to end the conversation.  Perhaps he’s engaged in special pleading on the part of his favorite sport.  But one thing he’s clearly not doing is bragging.  In each case, he’s asserting that he’s accomplished something.  In the original case but not the variants, he’s also bragging.  We think the best explanation of this difference is that bragging aims to impress.

Does he need to be impressed with himself?  We think not.  Suppose, for instance, that he thinks the competitive level of university rugby is embarrassingly poor, such that nothing one does in that context could be impressive.  Still, if he thinks you don’t know that, he would be bragging.

Does he need to think that a fully-informed, disinterested observer would be impressed?  Once again, no.  A fully-informed, disinterested observer would also realize that the competitive level of university rugby is embarrassingly poor.  Nevertheless, if he thinks that you have some investment in rugby or sports more generally, he could boast by asserting that he used to play fly-half.

Does he need to think that the thing that will impress his addressee is or will be seen as good (morally, prudentially, or in some other way)?  A third time, no. Consider Cool Hand Luke’s claim that he can eat fifty eggs.  Is it morally, prudentially, epistemically, or aesthetically good to have this capacity?  Hardly.  Nevertheless, it is a feat.  His claim to be able to eat fifty eggs is a boast.  One can even brag about something that is or is likely to be perceived as negative (morally, prudentially, or in some other way).  Imagine a university professor who preens about the fact that she’s never, in her career, given an undergraduate paper a grade of A, let alone A+, because she is only willing to award such grades to papers that are publishable without revisions.  She knows that her colleagues find this standard appalling but impressive.  She is boasting.  This provides an opportunity to distinguish between bragging and self-praise.  They overlap extensively, but they doubly dissociate.  You can engage in self-praise that isn’t bragging if you don’t intend your audience to be impressed with you.  You can brag without engaging in self-praise if you don’t intend your addressee to attribute responsibility to you.  As Aristotle points out in the Nicomachean Ethics (3.5), praiseworthiness presupposes responsibility.

These considerations suggest that in bragging a speaker aims to produce in the addressee (and not necessarily in anyone else) the state of being impressed.


2. Impress by asserting


An obvious objection is that if bragging is aimed at producing the emotion of being impressed, then we are wrong to classify it as a kind of assertion.  This objection fails because, on our account, bragging aims at producing both a belief and the state of being impressed.  Specifically, we think that a speaker brags iff she intends by making an utterance:

(1)  to produce in the addressee the belief that p,

(2)  that the addressee should recognize the speaker’s intention (1),

(3)  that the addressee should base her belief that p on her recognition of (1), and

(4)  that the addressee’s belief that p lead her to be impressed with the speaker.

The first three conditions will be familiar from Grice (1957).  The fourth distinguishes bragging as a special kind of assertion.  One might wonder why we don’t include a fifth condition to the effect that A recognizes (4) and a sixth condition to the effect that A should base her being impressed with S on her recognition of (4).  We take up this issue below.  In this section, we defend the assertion conditions (1-3).
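The four conditions can be restated as a small predicate over the speaker’s intentions.  What follows is only a minimal sketch: the class and field names (`Intentions`, `produce_belief_p`, and so on) are illustrative assumptions rather than part of the analysis, and treating each intention as a simple yes/no flag is a deliberate simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intentions:
    """What the speaker intends by an utterance (names are hypothetical)."""
    produce_belief_p: bool            # (1) addressee comes to believe that p
    recognize_intention_1: bool       # (2) addressee recognizes intention (1)
    base_belief_on_recognition: bool  # (3) belief is based on that recognition
    belief_leads_to_impressed: bool   # (4) belief leads addressee to be impressed


def is_assertion(i: Intentions) -> bool:
    """Conditions (1)-(3): the assertion conditions."""
    return (i.produce_belief_p
            and i.recognize_intention_1
            and i.base_belief_on_recognition)


def is_brag(i: Intentions) -> bool:
    """A brag is an assertion that also satisfies condition (4)."""
    return is_assertion(i) and i.belief_leads_to_impressed


# The fly-half case: same assertion, different addressees.
to_rugby_fan = Intentions(True, True, True, True)
to_sports_hater = Intentions(True, True, True, False)

print(is_brag(to_rugby_fan))     # True
print(is_brag(to_sports_hater))  # False
```

On this sketch, the two fly-half utterances differ only in the fourth flag, which mirrors the claim that condition (4) is what distinguishes bragging from mere assertion.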

Does boasting really have to piggy-back on assertion?  Can one boast by asking a question, by issuing a directive, by apologizing, and so on?  Consider this case: an audience-member at an academic talk asks a devastating question then smiles smugly to herself.  Let’s stipulate that she aimed to impress the rest of the audience.  Does her question count as a boast?  We think the answer depends on how exactly she aims to impress the rest of the audience.  Presumably, she intends to get them to think that she’s very clever.  On our account, if she also intends them to recognize this intention and to base their belief on it, then she is indirectly bragging because she’s indirectly asserting that she is clever (in much the same way that someone can indirectly command you to get off his foot by asserting that you’re standing on it).  If she doesn’t have these further intentions, then our account says she isn’t bragging.  This seems right, or at least not clearly wrong.

One might think, though, that only condition (4) is truly necessary: as long as the addressee ends up being impressed with the speaker, the precise pathway is irrelevant.  We think that cases one might be inclined to describe as non-assertive brags fall into just two categories: indirect assertions (and hence indirect brags susceptible to the same analysis as the question case above) and non-brags.

For example: “I want to compete for another Iron Chef trophy, but my chances this time are terrible.”  Instead of asserting that she’s already won one Iron Chef trophy, the speaker presupposes it.  Is she bragging?  If by presupposing she indirectly asserts that she’s won and intends to impress, our account says that she indirectly brags.  If she doesn’t indirectly assert (perhaps she thinks her addressee already knows that she’s won once), she isn’t. If expressing the desire to compete, regardless of whether she’s won already, seems like bragging (who would want to compete if they didn’t think they were very good indeed?), we give the same analysis.  Either there’s an indirect assertion involved, or it isn’t a brag at all.

One might demur, claiming that in some cases the speaker intends to impress her addressee directly, without any mediating belief or other mental state.  How, we ask, is it possible to end up in a state of being impressed with X without taking some predicate to be true of X?  You might not be able to articulate what you’re impressed by.  You might get it wrong.  But it seems to us preposterous that you can be in such an emotional state without some belief-like attitude implicitly grounding it.  “I don’t know what it is about X, but I find X impressive.”  That sounds fine.  “Nothing about X is impressive, but X is impressive.”  This strikes us as absurd.


3. If you’ve got it, flaunt it


We have now argued for two necessary conditions on bragging.  First, the bragger must aim to produce in her addressee the emotional state of being impressed.  Second, she must aim to produce this emotional state via the belief produced by asserting.  We now argue that both the belief and the emotion must involve being impressed with something about the speaker.  This is a natural extension of our previous argument that one is never simply impressed with X; one is always impressed with something about X.

Consider two cases of bragging and non-bragging that both aim to produce the emotion of being impressed by way of belief.  In the first, an Oxbridge philosopher by the name of Petro Ungero claims to be smarter than almost all of his own colleagues, as well as the Nobel Laureate psychologist Daniel Kahneman.  In the second, Pyotr Ungerovich claims that David Lewis was the smartest philosopher of the twentieth century.  What distinguishes Ungero from Ungerovich?  It seems clear that the former is bragging while the latter is not.  Both are trying to impress their addressees by getting them to believe something.  The crucial difference is that Ungero is trying to get his addressee to believe something about Ungero which will in turn lead the addressee to be impressed with Ungero.  By contrast, Ungerovich is trying to get his addressee to believe something about Lewis, which will in turn lead the addressee to be impressed with Lewis.  More precisely, the structure of bragging is to make an assertion aimed at getting the addressee to believe that the speaker has property P, and thereby to be impressed by the speaker’s having P.

So far, we have rested content with an intuitive notion of what counts as being about the speaker.  We are not in a position to give a full account of this concept, but we can say that we understand it capaciously.  You can clearly brag about your traits and skills.  “I’m courageous,” would traditionally count as a boast, as would, “I’m a chess grandmaster.”  You can also brag about your achievements.  “I’ve summited Annapurna,” is a boast.  It’s also clear that people can and do brag about their group identities.  “I’m a Rothschild,” can be a boast, as can “Canada is the world’s greatest hockey power,” when spoken by a Canadian.  This might seem odd, since it’s no achievement to be born into a particular family or nation, but people clearly do brag about these things.  An analysis of bragging fails if it doesn’t recognize this fact.

You can brag about your traits, skills, and group identities; it’s clear that you can also brag about your possessions.  “I own a Bugatti,” is a boast, as is, “I’m all about conspicuous consumption.”  Again, it might be distasteful, bourgeois, philistine, or immoral to boast in this way, but the question whether it’s permissible to boast is distinct from the question whether it’s possible.

It might seem at this point that, on our account, there’s nothing you can’t in principle brag about.  In fact, we are sympathetic to this idea.  We want to suggest that it doesn’t matter whether the thing bragged about is in any fundamental way associated with the speaker.  Instead, what matters is that the speaker takes the addressee to associate the bragged-about thing with the speaker (and potentially be impressed by it).  If I think that you think that the identity of my great-great-grandfather is sufficiently associated with me, I can brag about my ancestry.  If I think that you think my astrological sign is sufficiently associated with me, I can brag about my zodiac.  If I think that you think the accomplishments of my acquaintances are sufficiently associated with me, I can brag by name-dropping about whom I’ve met.  What matters is the speaker’s construal of what the addressee associates with the speaker.  Given sociological facts about what people tend to associate with each other, traits, skills, achievements, group identities, and possessions can all conventionally be bragged about.  Were these sociological facts to change, the opportunities to brag would also change.


4. I don’t mean to brag, but…


Thus far, we’ve argued that a speaker brags when and only when she makes an assertion about herself in order to produce in her addressee a belief that will in turn lead the addressee to be impressed with something about the speaker.  Something needn’t be in any way good to be impressive to the addressee, nor need it be impressive to anyone else.  Its connection with the speaker can be tenuous, provided that the speaker takes the addressee to associate it with her.  In the remainder of this paper, we discuss the conditions under which it’s possible to cancel a brag while still making the related assertion, which leads us to conclude with a few remarks on the recent neologism ‘humblebrag’.

It’s of course possible to make a non-bragging assertion that would, in some contexts, constitute a brag. “I used to play fly-half for Oxford,” is an example we’ve already seen.  What makes the difference, on our account, is condition (4): whether the speaker also intends her addressee to be impressed with something about her because they come to believe something about her.  The speaker’s communicative intentions are determinative.  If this is right, it’s not possible to brag by accident, since – even if you end up impressing your addressee unintentionally – you wouldn’t meet the necessary conditions for bragging.  Nevertheless, simply denying that you meant to brag after engaging in egregious self-aggrandizement seems suspect – the braggart’s version of Moore’s paradox.  Compare the more familiar example of an indirect speech act (Searle 1975) in which the speaker performs one speech act by performing another: I can request a beer by asking whether you have any beer.  But I can cancel the implied request by prefacing my question with, “I don’t want a beer, but….”  Canceling the brag while making the assertion doesn’t seem to work so well.  “I’m not trying to impress you by saying this, but I am a genius.”  Yeah right.

Why is it especially hard to cancel a boast?  This question can be answered by distinguishing between two distinct but interlocking aspects of communication: meaning, which is determined by the speaker who must nevertheless take into account how the addressee is likely to interpret her utterance, and interpretation, which is determined by the addressee who must nevertheless take into account what the speaker is likely to have meant by her utterance (Neale 2004).  An utterance succeeds to the extent that what the speaker means is identical to what the addressee interprets.  What’s odoriferous about at least some attempts to assert-without-bragging is that, even if the speaker really doesn’t aim to impress, she makes bizarre if not quite inconsistent demands on the addressee’s interpretation of her utterance.  On the one hand, the addressee is meant to believe something impressive about the speaker.  On the other hand, the addressee is not meant to be impressed – indeed, is meant not to be impressed.  On top of that, the speaker draws attention to the fact that the content of her assertion could be considered impressive.

To see why such cancellations are so difficult, we revert to the familiar point that you can’t intend what you take to be impossible.  The question, then, is whether it’s possible to intend your audience to believe that you’re a genius because you say so, to pay attention to the fact that this would ordinarily be impressive, and yet not to be impressed.  There are bizarre cases in which this is possible, but the vast majority of the time it’s not.  With something less conventionally impressive than genius, the cancellation is more likely to work.  What the speaker needs is an “out.”  She needs to be able to point to some aim other than impressing her addressee that she thinks the addressee will consider plausible.  For instance, the speaker is on an airliner with the addressee, and the pilots have been incapacitated.  She says, “Trust me.  I’m a retired fighter pilot.”  She’s trying to get her addressee to believe that she’s competent to fly the airliner, but she doesn’t care whether the addressee is impressed with her credentials and experience.  She cares whether he trusts her.

Thus, one way to cancel the brag that would otherwise piggy-back on an assertion is to cancel the attempt to impress the addressee by providing an alternative purpose to the utterance (“Trust me; don’t be impressed by me.”)  Another way to cancel the brag is to sever the connection between the impressive thing and the speaker.  For instance, “I’m a multi-millionaire, but all of my wealth is inherited.”  Or, “I’m a descendant of Charlemagne, not that that means anything about me.”  In many cases, canceling the emotional component and canceling the connection to self are patently impossible, so any attempt to do either is doomed.

If the speaker knows that the addressee won’t accept the disclaimer, then she can’t cancel the brag.  Consider the tweet we used as an epigraph, “The fact that Wikipedia lists me as a notable alumnus of my college speaks ill of the reliability of crowd sourced information.”  This is a paradigmatic humblebrag.  What distinguishes it from straightforward bragging?  The humblebragger, in addition to saying something about themselves with the aim of getting their addressee to be impressed with them, tries to do so in such a way that the addressee doesn’t realize that the speaker is trying to impress.  This is usually done by saying something self-deprecating while bragging.  For instance, “I’m not notable,” is paired with, “I’m described as notable on Wikipedia.”

Here’s another example, this one a tweet by American stage actor Steve Kazee responding to the Daily News comparing his appearance to that of Ricky Martin: “Who wore it better?  I mean it’s @ricky_martin for gods sake.  Of course he wears it better!  I can’t compete with that.”  Kazee is bragging: he’s drawing attention to the facts that he is starring in a Broadway show, that his appearance was remarked on positively in a major newspaper, and that he was compared to the heartthrob Ricky Martin.  But he’s trying to brag in such a way that his addressees don’t realize that he aims to impress.

Humblebrags always do this.  They’re especially annoying because they implicitly challenge the addressee’s competence.  For a humblebrag to succeed, the addressee can’t recognize that the speaker aims to impress.  Thus, humblebragging always suggests or presupposes that the addressee isn’t intelligent, sensitive, or savvy enough to see through the self-deprecation to the intention to impress.

We’re finally in a position to return to our decision not to include in our analysis of bragging conditions requiring (5) the speaker to intend that the addressee recognize (4) and (6) be impressed with the speaker based on her recognition of (4).  Condition (6) is a non-starter.  Unless the speaker is embroiled in a boasting contest, she presumably wants her addressees to be impressed not because she means to impress them but because the content of her boast is impressive.  “Don’t be impressed with me because I say so,” she’d say, “Be impressed because I’m impressive!”[3]

What about condition (5)?  If this reflexive intention were necessary for bragging, then humblebragging as we’ve analyzed it would be impossible, since the humblebragger would intend both that her addressee recognize that she intends to impress and that her addressee fail to recognize that she intends to impress.  But maybe our account is wrong.  Perhaps instead humblebragging isn’t really bragging.  Alternatively, perhaps humblebragging doesn’t involve hiding one’s intent to impress; perhaps the humblebragger intends to impress but also intends the addressee to make a character-level judgment that she isn’t a bragger.

Neither of these suggestions strikes us as more plausible than our original theory.  We suggest instead a three-way taxonomy of brags: (a) brazen brags, where the speaker intends the addressee to recognize that she’s trying to impress, (b) humblebrags, where the speaker intends the addressee to fail to recognize that she’s trying to impress, and (c) indifferent brags, where the speaker doesn’t intend one way or the other.
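The three-way taxonomy turns on a single variable: the speaker’s stance toward the addressee’s recognizing her aim to impress.  As a rough illustration only, it can be written as a classifier over a three-valued input.  The names `BragKind` and `classify_brag` are hypothetical, and the sketch assumes the utterance already counts as a brag.

```python
from enum import Enum
from typing import Optional


class BragKind(Enum):
    BRAZEN = "brazen brag"
    HUMBLE = "humblebrag"
    INDIFFERENT = "indifferent brag"


def classify_brag(intends_recognition: Optional[bool]) -> BragKind:
    """Classify an utterance that is already assumed to be a brag.

    True  -> speaker intends the addressee to recognize the aim to impress
    False -> speaker intends the addressee to fail to recognize it
    None  -> speaker has no intention one way or the other
    """
    if intends_recognition is True:
        return BragKind.BRAZEN
    if intends_recognition is False:
        return BragKind.HUMBLE
    return BragKind.INDIFFERENT


print(classify_brag(True).value)   # brazen brag
print(classify_brag(False).value)  # humblebrag
print(classify_brag(None).value)   # indifferent brag
```

Using `None` for the indifferent case keeps "no intention either way" distinct from "intends non-recognition", which is the distinction the taxonomy relies on.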

We leave for future research the paradox apparently generated by saying, “I’m so humble.”[4]



Grice, H. P. (1957). Meaning. Philosophical Review, 66:3, 377-88.

Neale, S. (2004). This, that, and the other.  In M. Reimer & A. Bezuidenhout (eds.), Descriptions and Beyond, pp. 68-182. Oxford University Press.

Searle, J. (1975). Indirect speech acts. In Cole & Morgan (eds.), Syntax and Semantics, 3: Speech Acts, pp. 59-82. New York: Academic Press.

Searle, J. (1976). A classification of illocutionary acts. Language in Society, 5:1, 1-23.




[1] We will use ‘brag’ and ‘boast’ synonymously.

[2] This claim is consistent with Searle’s (1976) taxonomy, which counts boasting as a kind of representative speech act.

[3] This argument is connected to our earlier point that one is never simply impressed with X; one is always impressed with the fact that X has some property or other.

[4] We are indebted to the following people for helpful discussion of this paper: Carl Sachs, Daniel Harris, David Pereplyotchik, J. Adam Carter, Adam Morton, Julia Staffel, Luke Maring, and John Greco.

Ramsifying virtue theory

Draft of a paper to be published in Current Controversies in Virtue Theory.  My controversy is over the question “Can people be virtuous?”  My respondent is James Montmarquet.  Other contributors to the volume include Heather Battaly, Liezl van Zyl, Jason Baehr, Ernie Sosa, Dan Russell, Christian Miller, Bob Roberts, and Nancy Snow.


Can people be virtuous? This is a hard question, both because of its form and because of its content.

In terms of content, the proposition in question is at once normative and descriptive. Virtue-terms have empirical content. Attributions of virtues figure in the description, prediction, explanation, and control of behavior. If you know that someone is temperate, you can predict with some confidence that he won’t go on a bender this weekend. Someone’s investigating a mysterious phenomenon can be partly explained by (correctly) attributing curiosity to her. Character witnesses are called in trials to help determine how severely a convicted defendant will be punished. Virtue-terms also have normative content. Attributions of virtues are a manifestation of high regard and admiration; they are intrinsically rewarding to their targets; they’re a form of praise. The semantics of purely normative terms is hard enough on its own; the semantics of “thick” terms that have both normative and descriptive content is especially difficult.

Formally, the proposition in question (“people are virtuous”) is a generic, which adds a further wrinkle to its evaluation. It is notoriously difficult to give truth conditions for generics (Leslie 2008). A generic entails its existentially quantified counterpart, but is not entailed by it. For instance, tigers are four-legged, so some tigers are four-legged; but even though some deformed tigers are three-legged, it doesn’t follow that tigers are three-legged. A generic typically is entailed by its universally quantified counterpart, but does not entail it. Furthermore, a generic neither entails nor is entailed by its counterpart “most” statement. Tigers give live birth, but most tigers do not give live birth; after all, only about half of all tigers are female, and not all of them give birth. Most mosquitoes do not carry West Nile virus, but mosquitoes carry West Nile virus. Given the trickiness of generics, it’s helpful to clarify them to the extent possible with more precise non-generic statements.
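The entailment pattern just described can be summarized schematically. The notation below is an assumption made for illustration: Gen is a generic operator borrowed from the formal semantics literature, F and G are placeholder predicates (e.g., "is a tiger", "is four-legged"), and Most stands for the counterpart "most" statement. The block assumes the amsmath and amssymb packages.

```latex
\begin{align*}
\forall x\,(Fx \rightarrow Gx) \;&\Rightarrow\; \mathrm{Gen}\,x\,[Fx]\,[Gx]
  && \text{(typically)} \\
\mathrm{Gen}\,x\,[Fx]\,[Gx] \;&\Rightarrow\; \exists x\,(Fx \wedge Gx) \\
\exists x\,(Fx \wedge Gx) \;&\nRightarrow\; \mathrm{Gen}\,x\,[Fx]\,[Gx]
  && \text{(three-legged tigers)} \\
\mathrm{Gen}\,x\,[Fx]\,[Gx] \;&\nRightarrow\; \forall x\,(Fx \rightarrow Gx) \\
\mathrm{Gen}\,x\,[Fx]\,[Gx] \;&\nRightarrow\; \mathrm{Most}\,x\,[Fx]\,[Gx]
  && \text{(tigers give live birth)} \\
\mathrm{Most}\,x\,[Fx]\,[Gx] \;&\nRightarrow\; \mathrm{Gen}\,x\,[Fx]\,[Gx]
  && \text{(most mosquitoes do not carry West Nile virus)}
\end{align*}
```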

Moreover, the proposition in question is modally qualified, which redoubles the difficulty of confirming or disconfirming it. What’s being asked is not simply whether people are virtuous, but whether they can be virtuous. It could turn out that even though no one is virtuous, it’s possible for people to become virtuous. This would, however, be extremely surprising. Unlike other unrealized possibilities, virtue is almost universally sought after, so if it isn’t widely actualized despite all that seeking, we have fairly strong evidence that it’s not there to be had.

In this paper, I propose a method for adjudicating the question whether people can be virtuous. This method, if sound, would help to resolve what’s come to be known as the situationist challenge to virtue theory, which over the last few decades has threatened both virtue ethics (Alfano 2013a, Doris 2002, Harman 1999) and virtue epistemology (Alfano 2011, 2013a, Olin & Doris 2014). The method is an application of David Lewis’s (1966, 1970, 1972) development of Frank Ramsey’s (1931) approach to the implicit definition of theoretical terms. The method needs to be tweaked in various ways to handle the difficulties canvassed above, but, when it is, an interesting answer to our question emerges: we face a theoretical tradeoff between, on the one hand, insisting that virtue is a robust property of an individual agent that’s rarely attained and perhaps even unattainable and, on the other hand, allowing that one person’s virtue might inhere partly in other people, making virtue at once more easily attained and more fragile.

The basic principle underlying the Ramsey-Lewis approach to implicit definition (often referred to as ‘Ramsification’) can be illustrated with a well-known story:

And the Lord sent Nathan unto David. And he came unto him, and said unto him, “There were two men in one city; the one rich, and the other poor. The rich man had exceeding many flocks and herds: But the poor man had nothing, save one little ewe lamb, which he had bought and nourished up: and it grew up together with him, and with his children; it did eat of his own meat, and drank of his own cup, and lay in his bosom, and was unto him as a daughter. And there came a traveler unto the rich man, and he spared to take of his own flock and of his own herd, to dress for the wayfaring man that was come unto him; but took the poor man’s lamb, and dressed it for the man that was come to him.” And David’s anger was greatly kindled against the man; and he said to Nathan, “As the Lord liveth, the man that hath done this thing shall surely die: And he shall restore the lamb fourfold, because he did this thing, and because he had no pity.” And Nathan said to David, “Thou art the man.”

Nathan uses Ramsification to drive home a point. He tells a story about an ordered triple of objects (two people and an animal) that are interrelated in various ways. Some of the first object’s properties (e.g., wealth) are monadic; some of the second object’s properties (e.g., poverty) are monadic; some of the first object’s properties are relational (e.g., he steals the third object from the second object); some of the second object’s properties are relational (e.g., the third object is stolen from him by the first object); and so on. Even though the first object is not explicitly defined as the X such that …, it is nevertheless implicitly defined as the first element of the ordered triple such that …. The big reveal happens when Nathan announces that the first element of the ordered triple, about whom his interlocutor has already made some pretty serious pronouncements, is the very person he’s addressing (the other two, for those unfamiliar with 2 Samuel 12, are Uriah and Bathsheba[1]).

The story is Biblical, but the method is modern. To implicitly define a set of theoretical terms (henceforth ‘T-terms’), one formulates a theory T in those terms and any other terms (henceforth ‘O-terms’) one already understands or has an independent theory of. Next, one writes T as a single sentence, such as a long conjunction, in which the T-terms t1, …, tn occur (henceforth ‘T[t1, …, tn]’ or ‘the postulate of T’). The T-terms are replaced by unbound variables x1, …, xn, which are then existentially quantified over to generate the Ramsey sentence of T. The Ramsey sentence states that T is realized, i.e., that there are objects x1, …, xn that satisfy the open formula T[x1, …, xn]. An ordered n-tuple that satisfies this formula is then said to be a realizer of the theory.
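The construction just described can be written out schematically. What follows is only a sketch in standard first-order notation: T, the T-terms and the variables come from the text above, while the tuple of objects a_1, …, a_n is notation assumed here for illustration.

```latex
% Postulate of T: the theory written as a single sentence containing the T-terms
T[t_1, \ldots, t_n]

% Ramsey sentence of T: replace each T-term with a variable, then
% existentially quantify over those variables
\exists x_1 \cdots \exists x_n \; T[x_1, \ldots, x_n]

% Realizer (a_1, ..., a_n is assumed notation): an ordered n-tuple
% that satisfies the open formula
\langle a_1, \ldots, a_n \rangle \text{ is a realizer of } T
  \iff T[a_1, \ldots, a_n] \text{ holds}
```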

Lewis (1966) famously applied this method to folk psychology to argue for the mind-brain identity theory. Somewhat roughly, he argued that folk psychology can be treated as a theory in which mental-state terms are the T-terms. The postulate of folk psychology is identified as the conjunction of all folk-psychological platitudes (commonsense psychological truths that everyone knows, and everyone knows that everyone knows, and everyone knows that everyone knows that everyone knows, and so on). The Ramsey sentence of folk psychology is formed in the usual way, by replacing all mental-state terms (e.g., ‘belief’, ‘desire’, ‘pain’, etc.) with variables and existentially quantifying over those variables. Finally, one goes on to determine what, in the actual world, satisfies the Ramsey sentence; that is, one investigates what, if anything, is a realizer of the Ramsey sentence. If there is a realizer, then that’s what the T-terms refer to; if there is no realizer, then the T-terms do not refer. Lewis claims that brain states are such realizers, and hence that mental states are identical with brain states.

Lewis’s Ramsification method is attractive for a number of reasons.[2] First, it ensures that we don’t simply change the topic when we try to give a philosophical account of some phenomenon. If your account of the mind is wildly inconsistent with the postulate of folk psychology, then – though you may be giving an account of something interesting – you’re not doing what you think you’re doing. Second, it enables us to distinguish between the meaning of the T-terms and whether they refer. The T-terms mean what they would refer to, if there were such a thing. Whether they in fact refer is a distinct question. Third, and perhaps most importantly, Ramsification is holistic. The first half of the twentieth century bore witness to the fact that it’s impossible to give an independent account of almost any psychological phenomenon (belief, desire, emotion, perception) because what it means to have one belief is essentially bound up with what it means to have a whole host of other beliefs, as well as (at least potentially) a whole host of desires, emotions, and perceptions. Ramsification gets around this problem by giving an account of all of the relevant phenomena at once, rather than trying to chip away at them piecemeal.

Virtue theory stands to benefit from the application of Ramsification for all of these reasons. We want an account of virtue, not an account of some other interesting phenomenon (though we might want that too). We want an account that recognizes that talk of virtue is meaningful, even if there aren’t virtues. Most importantly, we want an account of virtue that recognizes the complexity of virtue and character – the fact that virtues are interrelated in a whole host of ways with occurrent and dispositional mental states, with other virtues, with character more broadly, and so on.

Whether Lewis is right about brains is irrelevant to our question, but his methodology is crucial. What I want to do now is to show how the same method, suitably modified, can be used to implicitly define virtue-terms, which in turn will help us to answer the question whether people can be virtuous. For reasons that will become clear as we proceed, the T-terms of virtue theory as I construe it here are ‘person’, ‘virtue’, ‘vice’, the names of the various virtues (e.g., ‘courage’, ‘generosity’, ‘curiosity’), the names of their congruent affects (e.g., ‘feeling courageous’, ‘feeling generous’, ‘feeling curious’), the names of the various vices (e.g., ‘cowardice’, ‘greed’, ‘intellectual laziness’), and the names of their congruent affects (e.g., ‘feeling cowardly’, ‘feeling greedy’, ‘feeling intellectually lazy’). The O-terms are all other terms, importantly including terms that refer to attitudes (e.g., ‘belief’, ‘desire’, ‘anger’, ‘resentment’, ‘disgust’, ‘contempt’, ‘respect’), mental processes (e.g., ‘deliberation’), perceptions and perceptual sensitivities, behaviors, reasons, situational features (e.g., ‘being alone’, ‘being in a crowd’, ‘being monitored’), and evaluations (e.g., ‘praise’ and ‘blame’).

Elsewhere (Alfano 2013), I have argued for an intuitive distinction between high-fidelity and low-fidelity virtues. High-fidelity virtues, such as honesty, chastity, and loyalty, require near-perfect manifestation in undisrupted conditions. Someone only counts as chaste if he never cheats on his partner when cheating is a temptation. Low-fidelity virtues, such as generosity, tact, and tenacity, are not so demanding. Someone might count as generous if she were more disposed to give than not to give when there was sufficient reason to do so; someone might count as tenacious if she were more disposed to persist than not to persist in the face of adversity. If this is on the right track, the postulate of virtue theory will recognize the distinction. For instance, it seems to me at least that almost everyone would say that helpfulness is a low-fidelity virtue whereas loyalty is a high-fidelity virtue. Here, then, are some families of platitudes about character that are candidates for the postulate of virtue theory:


(A) The Virtue / Affect Family

(a1) If a person has courage, then she will typically feel courageous when there is sufficient reason to do so.

(a2) If a person has generosity, then she will typically feel generous when there is sufficient reason to do so.

(a3) If a person has curiosity, then she will typically feel curious when there is sufficient reason to do so.




(an) ….


(C) The Virtue / Cognition Family

(c1) If a person has courage, then she will typically want to overcome threats.

(c2) If a person has courage, then she will typically deliberate well about how to overcome threats and reliably form beliefs about how to do so.




(cn) ….


(S) The Virtue / Situation Family

(s1) If a person has courage, then she will typically be unaffected by situational factors that are neither reasons for nor reasons against overcoming a threat.

(s2) If a person has generosity, then she will typically be unaffected by situational factors that are neither reasons for nor reasons against giving resources to someone.

(s3) If a person has curiosity, then she will typically be unaffected by situational factors that are neither reasons for nor reasons against investigating a problem.






(E) The Virtue / Evaluation Family

(e1) If a person has courage, then she will typically react to threats in ways that merit praise.

(e2) If a person has generosity, then she will typically react to others’ needs and wants in ways that merit praise.

(e3) If a person has curiosity, then she will typically react to intellectual problems in ways that merit praise.






(B) The Virtue / Behavior Family

(b1) If a person has courage, then she will typically act so as to overcome threats when there is sufficient reason to do so.

(b2) If a person has generosity, then she will typically act so as to benefit another person when there is sufficient reason to do so.

(b3) If a person has curiosity, then she will typically act so as to solve intellectual problems when there is sufficient reason to do so.






(P) The Virtue Prevalence Family

(p1) Many people commit acts of courage.

(p2) Many people commit acts of generosity.

(p3) Many people commit acts of curiosity.

(p4) Many people are courageous.

(p5) Many people are generous.

(p6) Many people are curious.






(I) The Cardinality / Integration Family

(i1) Typically, a person who has modesty also has humility.

(i2) Typically, a person who has magnanimity also has generosity.

(i3) Typically, a person who has curiosity also has open-mindedness.






(D) The Desire / Virtue Family

(d1) Typically, a person desires to have courage.

(d2) Typically, a person desires to have generosity.

(d3) Typically, a person desires to have curiosity.






(F) The Fidelity Family

(f1) Chastity is high-fidelity.

(f2) Honesty is high-fidelity.

(f3) Creativity is low-fidelity.






Each platitude in each family is meant to be merely illustrative. Presumably they could all be improved somewhat, and there are many more such platitudes. Moreover, each family is itself just an example. There are many further families describing the relations among vice, affect, cognition, situation, evaluation, and behavior, as well as families that make three-way rather than two-way connections (e.g., “If a person is courageous, then she will typically act so as to overcome threats when there is sufficient reason to do so and because she feels courageous.”). For the sake of simplicity, though, let’s assume that the families identified above contain all and only the platitudes relevant to the implicit definition of virtues. Ramsification can now be performed in the usual way. First, create a big conjunction (henceforth, simply the ‘postulate of virtue theory’). Next, replace each of the T-terms in the postulate of virtue theory with an unbound variable, then existentially quantify over those variables to generate the Ramsey sentence of virtue theory. Finally, check whether the Ramsey sentence of virtue theory is true and – if it is – what its realizers are.
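The first two steps can be mimicked at the level of plain strings. The sketch below is a toy, not a piece of formal logic: the function name `ramsify`, the two paraphrased platitudes, and the choice to treat only ‘courage’ and ‘generosity’ as T-terms are all assumptions made for illustration, and the third step (checking for realizers) is an empirical matter that no such script could settle.

```python
import re


def ramsify(platitudes, t_terms):
    """Toy string-level Ramsification (illustrative assumption, not a prover).

    Returns the postulate (one big conjunction) and a Ramsey-style sentence
    in which each T-term is replaced by a variable bound by an existential
    quantifier.
    """
    # Step 1: conjoin the platitudes into a single postulate.
    postulate = " & ".join(f"({p})" for p in platitudes)

    # Step 2: replace each T-term with a fresh variable x1, x2, ...
    open_formula = postulate
    variables = []
    for i, term in enumerate(t_terms, start=1):
        var = f"x{i}"
        variables.append(var)
        # Whole-word match only, so 'courage' does not match 'courageous'.
        open_formula = re.sub(rf"\b{re.escape(term)}\b", var, open_formula)

    # Step 3: existentially quantify over the new variables.
    prefix = " ".join(f"∃{v}" for v in variables)
    return postulate, f"{prefix} [{open_formula}]"


# Hypothetical, simplified paraphrases of two platitudes from the families above.
platitudes = [
    "if a person has courage, then she typically wants to overcome threats",
    "if a person has generosity, then she typically acts so as to benefit others",
]
postulate, ramsey_sentence = ramsify(platitudes, ["courage", "generosity"])
print(ramsey_sentence)
```

In the full proposal ‘person’, ‘virtue’, ‘vice’ and the affect terms would also be replaced by variables; they are left in place here only to keep the output readable.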

After this preliminary work has been done, we’re in a position to see more clearly the problem raised by the situationist challenge to virtue theory. Situationists argue that there is no realizer of the Ramsey sentence of virtue theory. Moreover, this is not for lack of effort. Indeed, one family of platitudes in the Ramsey sentence specifically states that, typically, people desire to be virtuous; it’s not as if no one has yet tried to be or become courageous, generous, or curious.[3] In this paper, I don’t have space to canvass the relevant empirical evidence; interested readers should see my (2013a and 2013b). Nevertheless, the crucial claim – that the Ramsey sentence of virtue theory is not realized – is not an object of serious dispute in the philosophical literature.

One very common response to the situationist challenge from defenders of virtue theory (and virtue ethics in particular) is to claim that virtues are actually quite rare, directly contradicting the statements in the virtue prevalence family. I do not think this is the best response to the problem, as I explain below, but the point remains that all serious disputants agree that the Ramsey sentence is not realized.

As described above, Ramsification looks like a simple, formal exercise. Collect the platitudes, put them into a big conjunction, perform the appropriate substitutions, existentially quantify, and check the truth-value of the resulting Ramsey sentence (and the referents of its bound variables, if any). But there are several opportunities for a critic to object as the exercise unfolds.

One difficulty that arises for some families, such as the desire / virtue family, is that they involve T-terms within the scope of intentional attitude verbs.[4] Since existential quantification into such contexts is blocked by opacity, such families cannot be relied on to define the T-terms, though they can be used to double-check the validity of the implicit definition once the T-terms are defined.[5]

Another difficulty is that this methodology presupposes that we have an adequate understanding of the O-terms, which in this case include terms that refer to attitudes, mental processes, perceptions and perceptual sensitivities, behaviors, reasons, situational features, and evaluations. One might be dubious about this presupposition. I certainly am. However, the fact that philosophy of mind and metaethics are works-in-progress should not be interpreted as a problem specifically for my approach to virtue theory. Any normative theory that relies on other branches of philosophy to figure out what mental states and processes are, and what reasons are, can be criticized in the same way.

A third worry is that the list of platitudes contains gaps (e.g., a virtue acquisition family about how various traits are acquired). Conversely, one might think that it has gluts (e.g., unmotivated commitment to virtue prevalence). To overcome this pair of worries, we need a way of determining what the platitudes are. Perhaps surprisingly, there is no precedent for this in the philosophy of mind, despite the fact that Ramsification is often invoked as a framework there.[6] This may be because it’s supposed to be obvious what the platitudes are. Here’s Frank Jackson’s flippant response to the worry: “I am sometimes asked—in a tone that suggests that the question is a major objection—why, if conceptual analysis is concerned to elucidate what governs our classificatory practice, don’t I advocate doing serious opinion polls on people’s responses to various cases? My answer is that I do—when it is necessary. Everyone who presents the Gettier cases to a class of students is doing their own bit of fieldwork, and we all know the answer they get in the vast majority of cases” (1998, 36–37). After all, according to Lewis, everyone knows the platitudes, and everyone knows that everyone knows them, and everyone knows that everyone knows that everyone knows them, and so on. Sometimes, however, the most obvious things are the hardest to spot. It thus behooves us to at least sketch a method for carrying out the first step of Ramsification: identifying the platitudes. Call this pre-Ramsification.

Here’s an attempt at spelling out how pre-Ramsification should work: start by listing off a large number of candidate platitudes. These can be all of the statements one would, in a less-responsible, Jacksonian mood, have merely asserted were platitudes. It can also include statements that seem highly likely but perhaps not quite platitudes. Add to the pool of statements some that seem, intuitively, to be controversial, as well as some that seem obviously false; these serve as anchors in the ensuing investigation. Next, collect people’s responses to these statements. Several sorts of responses would be useful, including subjective agreement, social agreement, and reaction time. For instance, prompt people with the statement, “Many people are honest,” and ask to what extent they agree and to what extent they think others would agree. Measure their reaction times as they answer both questions. High subjective and social agreement, paired with fast reaction times, is strong but defeasible evidence that a statement is a platitude. This is a bit vague, since I haven’t specified what counts as “high” agreement or “fast” reaction times, but there are precedents in psychology for setting these thresholds. Moreover, this kind of pre-Ramsification wouldn’t establish dispositively what the platitudes are, but then, dispositive proof only happens in mathematics.
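The decision rule sketched in this paragraph could be operationalized along the following lines. Everything specific here is an assumption for illustration: the 1-to-7 agreement scale, the cut-offs of 6.0 and 1500 ms, the names `Response` and `is_candidate_platitude`, and the toy responses, which are invented placeholders rather than data. Real thresholds would have to be taken from the psychological precedents mentioned above.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Response:
    """One participant's reaction to one statement (hypothetical format)."""
    subjective_agreement: float  # "To what extent do you agree?" (assumed 1-7 scale)
    social_agreement: float      # "To what extent would others agree?" (assumed 1-7 scale)
    reaction_time_ms: float      # mean time taken to answer the two questions


def is_candidate_platitude(responses, min_agreement=6.0, max_rt_ms=1500.0):
    """Defeasible test: high subjective and social agreement plus fast responses.

    The thresholds are placeholders, not empirically motivated values.
    """
    if not responses:
        return False
    subjective = mean(r.subjective_agreement for r in responses)
    social = mean(r.social_agreement for r in responses)
    reaction_time = mean(r.reaction_time_ms for r in responses)
    return (subjective >= min_agreement
            and social >= min_agreement
            and reaction_time <= max_rt_ms)


# Invented placeholder responses, for illustration only.
likely_platitude = [Response(7, 6, 900), Response(6, 6, 1100), Response(7, 7, 1000)]
controversial_anchor = [Response(4, 3, 2500), Response(6, 5, 1800)]

print(is_candidate_platitude(likely_platitude))
print(is_candidate_platitude(controversial_anchor))
```

A positive result counts only as strong but defeasible evidence; the controversial and obviously false anchor statements give a baseline against which the thresholds can be calibrated.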

It’s far beyond the scope of this short paper to show that pre-Ramsification works in the way I suggest, or that it verifies all and only the families identified above. For now, let’s suppose that it does, i.e., that all of the families proposed above were validated by pre-Ramsification. Let’s also suppose that we have strong evidence that the Ramsey sentence of virtue theory is not realized (a point that, as I mentioned above, is not seriously contested). How should we then proceed?

Lewis foresaw that, in some cases, the Ramsey sentence for a given field would be unrealized, so he built in a way of fudging things: instead of generating the postulate by taking the conjunction of all of the platitudes, one can generate a weaker postulate by taking the disjunction of each of the conjunctions of most of the platitudes. For example, if there were only five platitudes, p, q, r, s, and t, then instead of the postulate’s being p&q&r&s&t, it would be (p&q&r&s)v(p&q&r&t)v…v(q&r&s&t). In the case of virtue theory, we could take the disjunction of each of the conjunctions of all but one of the families of platitudes. Alternatively, we could exclude a few of the platitudes from within each family.
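The five-platitude example can be spelled out mechanically. The sketch below reads “most” as “all but one” (so k = 4 of 5), which is an assumption suggested by the example rather than something Lewis fixes; the function names are likewise hypothetical. It builds the weakened postulate and checks by brute force over truth assignments that the full conjunction entails it but not conversely.

```python
from itertools import combinations, product

PLATITUDES = ["p", "q", "r", "s", "t"]


def full_postulate(assignment):
    """The un-fudged postulate: the conjunction of all platitudes."""
    return all(assignment[name] for name in PLATITUDES)


def fudged_postulate(assignment, k=4):
    """The disjunction of every conjunction of k of the platitudes."""
    return any(
        all(assignment[name] for name in subset)
        for subset in combinations(PLATITUDES, k)
    )


def fudged_formula(k=4):
    """Write the fudged postulate out as a string."""
    disjuncts = ["(" + "&".join(subset) + ")"
                 for subset in combinations(PLATITUDES, k)]
    return " v ".join(disjuncts)


def all_assignments():
    """Every assignment of truth values to the five platitudes."""
    for values in product([True, False], repeat=len(PLATITUDES)):
        yield dict(zip(PLATITUDES, values))


# Whatever satisfies the full conjunction also satisfies the fudged postulate...
entails = all(fudged_postulate(a) for a in all_assignments() if full_postulate(a))
# ...but some assignments satisfy only the fudged one, so it is strictly weaker.
witnesses = [a for a in all_assignments()
             if fudged_postulate(a) and not full_postulate(a)]

print(fudged_formula())
print(entails, len(witnesses))
```

The witnesses are exactly the assignments on which a single platitude fails, which is the sense in which the fudged postulate tolerates one false platitude.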

Fudging in this way makes it easier for the Ramsey sentence to be realized, since the disjunction of conjunctions of most of the platitudes is logically weaker than the straightforward conjunction of all of them. Fudging may end up making it too easy, though, such that there are multiple realizers of the Ramsey sentence. When this happens, it’s up to the theorist to figure out how to strengthen things back up in such a way that there is a unique realizer.

The various responses to the situationist challenge can be seen as different ways of doing this. Everyone recognizes that the un-fudged Ramsey sentence of virtue theory is unrealized. But a sufficiently fudged Ramsey sentence is bound to be multiply realized. It’s a theoretical choice exactly how to play things at this point. More traditional virtue theorists such as Joel Kupperman (2009) favor a fudged version of the Ramsey sentence wherein the virtue prevalence family has been dropped. John Doris (2002) favors a fudged version wherein the virtue / situation and cardinality / integration families have been dropped. I (2013) favor a fudged version wherein the virtue / situation family has been dropped and a virtue / social construction family has been added in its place. The statements in the latter family have to do with the ways in which (signals of) social expectations implicitly and explicitly influence behavior. The main idea is that having a virtue is more like having a title or social role (e.g., you’re curious because people signal to you their expectations of curiosity) than like having a basic physical or biological property (e.g., being over six feet tall). Christian Miller (2013, 2014) drops the virtue prevalence family and adds a mixed-trait prevalence family in its place, which states that many people possess traits that are neither virtues nor vices, such as the disposition to help others in order to improve one’s mood or avoid sliding into a bad mood.

In this short paper, I don’t have the space to argue against all alternatives to my own proposal. Instead, I want to make two main claims. First, the “virtue is rare” dodge advocated by Kupperman and others who drop the virtue prevalence family has costs associated with it. Second, those costs may be steeper than the costs associated with my own way of responding to the situationist challenge.

Researchers in personality and social psychology have documented for decades the tendency of just about everybody to make spontaneous trait inferences, attributing robust character traits on the basis of scant evidence (Ross 1977; Uleman et al. 1996). This indicates that people think that character traits (virtues, vices, and neutral traits, such as extroversion) are prevalent. Furthermore, in a forthcoming paper (Alfano, Higgins, & Levernier forthcoming), I show that the vast majority of obituaries attribute multiple virtues to the deceased. Not everyone is eulogized in an obituary, of course, but most are (about 55% of Americans, by my calculations). Not all obituaries are sincere, but presumably many are. Absent reason to think that people about whom obituaries are written differ greatly from people about whom they are not written, we can treat this as evidence that most people think that the people they know have multiple virtues. But of course, if most relations of most people are virtuous, it follows that most people are virtuous. In other words, the virtue prevalence family is deeply ingrained in folk psychology and folk morality.

Social psychologists think that people are quick to attribute virtues. My own work on obituaries suggests the same. What do philosophers say? Though there are some (Russell 2009) who claim with a shrug that virtue is rare or even non-existent, this is not the predominant opinion. Alasdair MacIntyre (1984, p. 199) claims that “without allusion to the place that justice and injustice, courage and cowardice play in human life very little will be genuinely explicable.” Philippa Foot (2001), following Peter Geach (1977), argues that certain generic statements characterize the human form of life, and that from these generic statements we can infer what humans need and hence will typically have. For the sake of comparison, consider what she says about a different life form, the deer. Foot first points out that the deer’s form of defense is flight. Next, she claims that a certain normative statement follows, namely, that deer are naturally or by nature swift. This is not to say that every deer is swift; some are slow. Instead, it’s a generic statement that characterizes the nature of the deer. Finally, she says that any deer that fails to be swift – that fails to live up to its nature – is “so far forth defective” (p. 34). The same line of reasoning that she here applies to non-human animals is meant to apply to human animals as well. As she puts it, “Men and women need to be industrious and tenacious of purpose not only so as to be able to house, clothe, and feed themselves, but also to pursue human ends having to do with love and friendship. They need the ability to form family ties, friendships, and special relations with neighbors. They also need codes of conduct. And how could they have all these things without virtues such as loyalty, fairness, kindness, and in certain circumstances obedience?” (pp. 44-5, emphasis mine).

In light of these sorts of claims, let’s consider again the defense offered by some virtue ethicists that virtue is rare, or even impossible to achieve. If virtues are what humans need, but the vast majority of people don’t have them, one would have thought that our species would have died out long ago. Consider the analogous claim for deer: although deer need to be swift, the vast majority of deer are galumphers. Were that the case, presumably they’d be hunted down and devoured like a bunch of tasty venison treats. Or consider another example of Foot’s: she agrees with Geach (1977) that people need virtues like honeybees need stingers. Does it make sense for someone with this attitude to say that most people lack virtues? That would be like saying that, even though bees need stingers, most lack stingers. It’s certainly odd to claim that the majority – even the vast majority – of a species fails to fulfill its own nature. That’s not a contradiction, but it is a cost to be borne by anyone who responds to the situationist challenge by dropping the virtue prevalence family.

One might respond on Foot’s behalf that human animals are special: unlike the other species, we have natures that are typically unfulfilled. That would be an interesting claim to make, but I am not aware of anyone who has defended it in print.[7] I conclude, then, that dropping the virtue prevalence family is a significant cost to revising the postulate.

But is it a more significant cost than the one imposed on me by replacing the virtue / situation family with a virtue / social construction family? I think it is. This comparative claim is of course hard to adjudicate, so I will rest content merely to emphasize the strength of the virtue prevalence family.

What would it look like to fudge things in the way I recommend? Essentially, one would end up committed to a version of the hypothesis of extended cognition, a variety of active externalism in the family of the extended mind hypothesis. Clark & Chalmers (1998) argued that the vehicles (not just the contents) of some mental states and processes extend beyond the nervous system and even the skin of the agent whose states they are.[8] If my arguments are on the right track, virtues and vices sometimes extend in the same way: the bearers of someone’s moral and intellectual virtues sometimes include asocial aspects of the environment and (more frequently) other people’s normative and descriptive expectations. What it takes (among other things) for you to be, for instance, open-minded, on this view is that others think of you as open-minded and signal those thoughts to you. When they do, they prompt you to revise your self-concept, to want to live up to their expectations, to expect them to reward open-mindedness and punish closed-mindedness, to reciprocate displays of open-mindedness, and so on. These are all inducements to conduct yourself in an open-minded way, which they will typically notice. When they do, their initial attribution will be corroborated, leading them to strengthen their commitment to it and perhaps to signal that strengthening to you, which in turn is likely to further induce you to conduct yourself in open-minded ways, which will again corroborate their judgment of you, and so on. Such feedback loops are, on my view, partly constitutive of what it means to have a virtue.[9] The realizer of the fudged Ramsey sentence isn’t just what’s inside the person who has the virtue but also further things outside that person.

So, can people be virtuous? I hope it isn’t too disappointing to answer with, “It depends on what you mean by ‘can’, ‘people’, and ‘virtuous’.” If we’re concerned only with abstract possibility, perhaps the answer is affirmative. If we are concerned more with the proximal possibility that figures in people’s current deliberations, plans, and hopes, we have reason to worry. If we only care whether more than zero people can be virtuous, the existing, statistical, empirical evidence is pretty much useless. If we instead treat ‘people’ as a generic referring to human animals (perhaps a majority of them, but at least a substantial plurality), such evidence becomes both important and (again) worrisome. If we insist that being virtuous is something that must inhere entirely within the agent who has the virtue, then evidence from social psychology is damning. If instead we allow for the possibility of external character, there is room for hope.[10]


[1] Nathan is also using an extended metaphor. My point is clear nevertheless.

[2] An alternative is the “psycho-functionalist” method, which disregards common sense in favor of (solely) highly corroborated scientific claims. See Kim (2011) for an overview. For my purposes, psycho-functionalism is less appropriate, since (among other things) it is more in danger of changing the topic.

[3] I seem to be in disagreement on this point with Christian Miller (this volume), who worries that people may not be motivated to be or become virtuous. In general, I’m even more skeptical than Miller about the prospects of virtue theory, but in this case I find myself playing the part of the optimist.

[4] I am here indebted to Gideon Rosen.

[5] It might also be possible to circumvent this difficulty, which anyway troubles Lewis’s application of Ramsification to the mind-brain identity theory, by using only de re formulations of the relevant statements. See Fitting & Mendelsohn (1999) for a discussion of how to do so.

[6] Experimental philosophers have started to fill this gap, but not in any systematic or consensus-based way.

[7] Micah Lott (personal communication) has told me that he endorses this claim, though he has a related worry. In short, his concern is to explain how, given the alleged rarity of virtue, most people manage to live decent enough lives.

[8] For an overview of the varieties of externalism, see Carter et al. (forthcoming).

[9] I spell out this view in more detail in Alfano & Skorburg (forthcoming). For a treatment of the feedback-loops model in the context of the extended mind rather than the character debate, see Palermos (forthcoming).

[10] I am grateful to J. Adam Carter, Orestis Palermos, and Micah Lott for comments on a draft of this paper.

The semantic neighborhood of intellectual humility

Here’s a draft of a paper (co-authored with Markus Christen and Brian Robinson) on the semantic neighborhood of intellectual humility. We are replicating in German and Mandarin, so those who are familiar with Wilfrid Sellars should think of this as the first step in a seriously scientific dot-quotation research programme.

1. Introduction

The study of personality and conceptions of personality has been pursued by psychologists and other researchers in various ways, including, among others, observations in laboratory settings and field experiments, correlational studies of survey responses, and psycholexical analyses. The present research embodies the last of these methodologies, and is informed by both philosophical theory and mathematical modeling tools developed in physical science.

Psycholexical analysis dates back to Francis Galton’s Measurement of Character (1884). The basic idea is that, all else being equal, a natural language is more likely to include a predicate for a property to the extent that the property is important to those who speak the language. This is not to say that every phrase or term refers. There are no unicorns despite the existence of the term ‘unicorn’. Nor is it to say that everything worth talking about is already represented by a phrase or singular term. Words are sometimes coined because new phenomena come into existence or become important; words are also sometimes coined because extant phenomena could not otherwise be parsimoniously described and explained. Sometimes a speaker coins words to describe or explain phenomena for which a word already exists, but of which the coiner is ignorant. So words that are synonyms (or nearly so) emerge, further emphasizing the importance of the phenomena referred to. Regardless, the rough generalization that there is a strong positive correlation between the importance of phenomena in the lives of the speakers of a language and the probability of the existence of a term in the language that refers to those phenomena is hard to deny. If this is on the right track, studying psychological language is an indirect way of studying the psychological properties people care about.

Psychologists in the psycholexical tradition don’t stop there, though. They also typically argue that the semantic structure of a language reflects to some extent the perceived structure of the phenomena described by the language. In personality psychology, this insight was famously used by Allport & Odbert (1936) to create a semantic taxonomy of thousands of personality-relevant terms, which they argued represents how people conceive of personality. Of course, the step from language to people’s conception of personality is not identical to the step from their conception of personality to actual personality, but it’s natural to think that there will be at least a positive correlation – if only a weak one – between how we think about personality and how personality actually is. This two-step connection (from language about personality to conceptions of personality, from conceptions of personality to actual personality) has been empirically validated by personality models such as the Big Five (Peabody & Goldberg 1989) and Big Six (Ashton et al. 2004; Saucier 1997).

The Big Six includes an H factor that represents facets of personality related to honesty and humility. Intellectual humility seems to involve a consciousness of the limits of one’s knowledge, including sensitivity to circumstances in which one’s native egocentrism is likely to function self-deceptively (Roberts & Wood 2007), though others regard it as more of a “second-order” open-mindedness (Spiegel 2012). In our age of information, intellectual humility has grown all the more relevant. However, little conceptual or empirical work has explored this trait. We think that the psycholexical approach is especially promising in the investigation of intellectual humility because questionnaires are likely to be especially unreliable as measures of this construct. Someone who is genuinely humble is unlikely to report being humble, and someone who reports being humble is unlikely to be humble. Humility – whether intellectual, moral, or otherwise – seems to involve a paradox of self-reference.

Additionally, our investigation is motivated by Aristotle’s insight, reiterated in contemporary philosophy by Roberts & Wood (2007), that a virtue (i.e., a positive value-laden personality disposition or dimension of individual difference) is often best understood in the context of related virtues and the vices they oppose. Put a different way, by contextualizing a term for a virtue in the constellation of its near-synonyms and its near-antonyms, we can create a perspicuous representation of the meaning of the term.

For these reasons, we propose to investigate the trait of intellectual humility psycholexically by comparing ‘intellectual humility’ with both its antonyms and synonyms.

2. Method

Our analysis is based on the assumption that the practice of language is precipitated in dictionaries, lexicons, and other wordbooks. Of particular interest is the thesaurus – a language reference book or database organized to help its users find words related to a concept but having slightly different shades of meaning or connotation. Thesauruses reflect what people in their daily use of language – in particular when writing text – consider semantically similar to a given term. In other words, a thesaurus lists synonyms in a broad sense. Modern thesauruses also list antonyms, which are then again related to a set of their own synonyms.

The present research explores the semantic space of intellectual humility by first identifying the most common synonyms and antonyms of ‘intellectual humility’. Next, by referring to the database (the largest online thesaurus for American English), we associate each identified term with a word-bag, which is the set of synonyms listed for that term. The semantic constellation of a term t is thus an ordered pair (t, {t_syn_1, t_syn_2, t_syn_3, …, t_syn_n}), whose first element is t itself and whose second element is t’s word-bag, i.e., the set of synonyms of t (including t itself). By comparing semantic constellations, we then create a similarity metric by calculating the relative overlap of each pair of word-bags. The similarities calculated in this way are then used in a novel clustering and visualization tool that generates a semantic map of the terms involved.

More specifically:

1)    We identified potential synonyms and antonyms for ‘intellectual humility’ in three ways:

  1. We searched philosophy and psychology journals for articles that discuss intellectual humility; we found 24 papers or related texts (such as calls for proposals, abstracts, and papers).
  2. We performed an Internet search for entries on ‘intellectual humility’ and found 20 entries that dealt in a significant way with the concept.
  3. We identified scales that are used in psychology for constructs that have some similarity to intellectual humility (e.g., the H factor of the Big Six personality inventory).

In all these texts, we identified terms that are used to represent the meaning of ‘intellectual humility’ or its relevant vices.

2)    Four raters who have experience with the philosophical topic of intellectual humility assessed all terms collected in step 1 to determine whether they could be used to express the concept of intellectual humility or a related vice. A term was kept on the list if three out of four raters agreed that it should be kept. In this way, we identified 52 synonyms and 69 antonyms for ‘intellectual humility’. Each term was represented at least in noun form and usually in adjective form also: for example, {tolerance, tolerant}.

3)    We identified all entries for each term generated in step 2 in the database to generate word-bags for each synonym and antonym. For example, the word-bag for ‘tolerance’ included all entries in the database for the term set {tolerance, tolerant}.

4)    Next, we calculated the similarity in overlap between every pairwise combination of word-bags. For example, the word-bag of ‘tolerance’ contains 55 terms and the word-bag of ‘broadmindedness’ contains 40 terms. 12 terms are contained in both word-bags. Hence, the similarity between ‘tolerance’ and ‘broadmindedness’ is 12/40 = 0.3, that is, the number of shared terms divided by the size of the smaller word-bag. In this way, the similarity measures are always between 0 (no similarity) and 1 (one word-bag is completely contained in the other word-bag).

5)    We checked for highly similar terms (overlaps > 0.5).[1] We collapsed the word-bags of these terms into a single word-bag to reduce the number of synonyms/antonyms. Conceptually, it’s unclear whether terms that share more than half of their semantic constellations represent genuinely distinct constructs. In this way, we reduced the number of synonyms from 52 to 39 and the number of antonyms from 69 to 46. When two terms were collapsed, our raters kept the term that in their estimation was better known. A new word-bag was created combining those of the two collapsed terms. In cases where the word-bag of term X overlapped by > 0.5 with two or more terms whose mutual overlap was, however, below the cutoff value, the raters determined collapsing based on the highest mutual overlaps. This occurred 2 times for the synonyms and 8 times for the antonyms. For all condensed word-bags, the similarities were re-calculated. Step 5 was not iterated.

6)    The similarity measures obtained in this way were then used as inputs in a visualization algorithm called superparamagnetic agent mapping, which employs self-organizing agents governed by the dynamics of a clustering algorithm inspired by spin physics to generate denoised low-dimensional representations. To conceptualize this mapping, imagine each term as a particle that naturally repels all other particles. However, as overlap between two terms increases, they become more attracted to each other. Thus, superparamagnetic agent mapping typically produces clumping, where several particles clump together (connoting similarity) while collectively repelling a different cluster (connoting collective difference between the two clusters). It has been shown (Ott et al. 2014) that this method is superior to standard methods such as factor analysis, principal components analysis, and multidimensional scaling in preserving the topology of the data space with clustered data. Since such a map will never precisely display the real topology of the original, high-dimensional space, we calculated for each point on the map the sum of the differences between the point and all its neighbors both in the map and in the original space (normalized to the longest distance in either case). The lower this sum, the better the map displays the real distance distribution of a point from its neighbors in the original space, so this number is a proxy for the quality of the map. To increase the heuristic value of the maps, we rescaled the sizes of the points themselves so that larger points indicate greater topological certainty.

7)    Finally, using the same clustering paradigm in a version adapted from Ott et al. (2005), we identified clusters on the map generated in step 6.

Step 7 generates the maps below that are then used to inform our reasoning about intellectual humility.
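
The overlap calculation in step 4 and the screening in step 5 can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used for the study. It assumes that similarity is the number of shared terms divided by the size of the smaller word-bag, which is what the 12/40 example and the stated 0-to-1 range suggest. The function names and the toy word-bags are hypothetical, and the rater judgments about which terms to collapse are not modeled.

```python
from itertools import combinations


def overlap_similarity(bag_a: set[str], bag_b: set[str]) -> float:
    """Shared terms divided by the size of the smaller word-bag (assumed reading of step 4)."""
    if not bag_a or not bag_b:
        return 0.0
    return len(bag_a & bag_b) / min(len(bag_a), len(bag_b))


def pairs_above_cutoff(word_bags: dict[str, set[str]], cutoff: float = 0.5):
    """List (term_a, term_b, similarity) for every pair whose overlap exceeds the cutoff."""
    flagged = []
    for a, b in combinations(sorted(word_bags), 2):
        sim = overlap_similarity(word_bags[a], word_bags[b])
        if sim > cutoff:
            flagged.append((a, b, sim))
    return flagged


# Toy word-bags sized to match the worked example: 55 and 40 terms, 12 shared.
# The member strings are placeholders, not real thesaurus entries.
shared = {f"shared_{i}" for i in range(12)}
tolerance = shared | {f"tol_{i}" for i in range(43)}
broadmindedness = shared | {f"broad_{i}" for i in range(28)}

similarity = overlap_similarity(tolerance, broadmindedness)
print(round(similarity, 2))  # 0.3
```

With the 0.5 cutoff from step 5, this pair would not be flagged for collapsing, since 0.3 falls below the threshold.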

3. Results

We produced three maps to convey our results. Figure 1 is the synonym map, showing the degree of overlap among intellectual humility’s 39 synonyms. The terms predominantly cluster into three groups. The first group (displayed in green) we have labeled the Sensible Self and is exemplified by terms such as ‘comprehension’, ‘responsiveness’, and ‘mindfulness’. We take this cluster to be representative of the notion that an intellectually humble person will be open and responsive to new ideas and information. The second (pink) cluster we call the Inquisitive Self; it is illustrated by terms such as ‘curiosity’, ‘exploration’, and ‘learning’. The difference between the Sensible Self and the Inquisitive Self indicates that there is some difference between seeking new information or ideas and being open to them when they are presented. Third, we have named the blue cluster the Discreet Self, which is typified by ‘humility’, ‘decency’, and ‘unpretentiousness’. Finally, some terms (shown in black) have intermediate positions among these groups (e.g., ‘flexibility’ and ‘tolerance’) and do not fit neatly within any cluster.


Figure 1: IH Synonym map.

Figure 2 shows the results of the antonym map, displaying the degree of overlap between intellectual humility’s 46 antonyms. The first result to notice is that almost all the terms are aligned along one dimension and cluster at each endpoint. We take this to represent the distinction between underrating and overrating. The larger, red cluster can be thought of as the Overrated Self, and includes terms such as ‘vanity’, ‘pride’, and ‘arrogance’. This cluster suggests that one way not to be intellectually humble is to be overly focused on one’s own high status. Overrating oneself is not, however, the only way to fail to be intellectually humble. The opposite endpoint has two closely related clusters that indicate two other ways. There is the Underrated Other in purple (typified by terms such as ‘bias’, ‘prejudice’, and ‘unfairness’) and the Underrated Self cluster in orange, which is similar in that it involves underrating, but the object of underrating is oneself. This cluster is characterized by terms such as ‘diffidence’, ‘timidity’, and ‘acquiescence’. This cluster suggests that there is such a thing as being too humble, such that one’s lack of pride ceases to have any positive value. It is worth noting how close the two (orange and purple) underrated clusters are relative to the (red) overrated cluster. This indicates that there is a higher degree of similarity based on the nature of the rating (over or under) than on who is being evaluated (self or other). Finally, we again see several terms (such as ‘hubris’, ‘chutzpah’, and ‘aloofness’) in white circles in the middle of the line, indicating that these terms do not fit within any cluster. This result should not be surprising since one can be aloof by either overrating oneself or underrating others (or both).


Figure 2: IH Antonym map.

Finally, we mapped all synonyms and antonyms together. We have preserved the colors from the two previous maps. The resulting map preserves many of the structural features of the previous maps, but with a few significant changes. First, it reveals that for the antonyms the linear structure along the poles of the Overrated Self and the Underrated Other is mainly preserved, whereas the terms on the Underrated Self (orange) are in the same region as the terms for the Discreet Self (blue) from the synonym set. Additionally, the distinction between the terms for the Sensible Self (green) and Inquisitive Self (pink) is no longer discernible. This second merger merely indicates that the difference between the Inquisitive Self and the Sensible Self is large enough to be significant when compared to the Discreet Self, but small enough not to be significant when compared to intellectual humility’s antonyms.


Figure 3: Unified synonyms and antonyms map.

4. Discussion & Conclusion

From these results, there are three points we wish to draw out for discussion. First, there is the matter of what the clusters represent. In the antonyms map, we take each cluster to represent a distinct vice, i.e., a different way one can fail to be intellectually humble. For the synonyms, however, two possibilities exist. It might be that each cluster represents a distinct trait, all three of which go by the same name of ‘intellectual humility’. Opposing this semantic diversity thesis is the alternate interpretation that sees each cluster representing a different facet of the single trait of intellectual humility.

Second, consider the merging of the synonym-based Discreet Self and antonym-based Underrated Self in the combined map. We see two possible interpretations. It might be that the discreet aspect of intellectual humility is essentially akin to underrating oneself. Snow (1995) and Taylor (1985) both argue that humility essentially involves recognizing one’s low status or personal faults. If this is right, then either the discreet aspect of humility is more of a vice than a virtue, or the underrated aspect of humility’s antonyms is more of a virtue than a vice. Either way, the valence of one or both of these semantic clusters may need to change. Alternatively, there might be two different traits picked out by these clusters – one a virtue and the other a vice – that are behaviorally similar enough that they are easily conflated. Someone who underrates herself will behave very similarly to a discreet person. They will both not regularly speak up about controversial topics, in praise of themselves, or for their own rights and entitlements, making it difficult to differentiate them behaviorally. There could, however, be an underlying psychological difference that typically goes unobserved. The discreet person may not often attend to evaluating herself, but when she does so, she does it accurately. One who underrates herself, however, may pay significant attention to her own merits, but regularly devalue them. Further research on the behavioral and psychological aspects of intellectual humility and its contraries may help to answer this question.

The final point relates back to the Big Six personality inventory (Ashton et al. 2004; Saucier 1997). As mentioned earlier, the H factor is meant to represent facets of personality related to honesty and humility. The 100-item revised version measures the participant’s humility (specifically her modesty) by having her indicate (dis)agreement with statements such as “I am an ordinary person who is no better than others.” We worry that the Big Six therefore includes in its H dimension items that are better understood as contrary to humility, not allied with or constitutive of it.



Allport, G. & Odbert, H. (1936). Trait-names: A Psycho-lexical Study.

Aristotle. Nicomachean Ethics.

Ashton, M., Lee, K., Perugini, M., Szarota, P., de Vries, R., Di Blas, L., Boies, K., De Raad, B. (2004). A six-factor structure of personality-descriptive adjectives: Solutions from psycholexical studies in seven languages. Journal of Personality and Social Psychology, 86:2, 356-366.

Galton, F. (1884). Measurement of Character.

Ott, T., Eggel, T., Christen, M. (2014). Generating Low-Dimensional Denoised Embeddings of Nonlinear Data with Superparamagnetic Agents. Proceedings of the 2014 International Symposium on Nonlinear Theory and its Applications (NOLTA), Lucerne, Switzerland, September 14-18.

Ott, T., Kern, A., Steeb, W.-H., Stoop, R. (2005). Sequential Clustering: Tracking Down the Most Natural Clusters. Journal of Statistical Mechanics: Theory and Experiment, P11014.

Peabody, D. & Goldberg, L. (1989). Some determinants of factor structures from personality-trait descriptors. Journal of Personality and Social Psychology, 57:3, 552-567.

Roberts, R. & Wood, J. (2007). Intellectual Virtues: An Essay in Regulative Epistemology. Oxford University Press.

Saucier, G. (1997). Effects of variable selection on the factor structure of person descriptors. Journal of Personality and Social Psychology, 73:6, 1298-1312.

Spiegel, J.G. (2012). Open-mindedness and intellectual humility. Theory and Research in Education. 10:27-38


[1]This cut-off value was chosen based on a logarithmic count of the long-tailed distance distribution such that the tail was cut off before the beginning of the main mode of the distribution (i.e., the largest mode in a multi-modal distribution).

the recognition heuristic and epistemic injustice

Now for the poet, he nothing affirmeth, and therefore never lieth.

The Defense of Poesy, by Sir Philip Sidney

It’s easy, especially for a white man like me, to take for granted my capability to assert.  If I want to say something — in person, on a blog, to a reporter, to an administrator at my university — all I have to do is open my mouth or start typing.  What could be simpler?

But any particular act of asserting, like any speech act at all, is possible only because it originates in a complex linguistic, social, and cultural matrix.  Some elements of this matrix are obvious and uncontroversial when pointed out.  I can’t say something to you if we don’t speak the same language and have no way of translating from my language to yours.  Likewise, I can’t make an assertion if I’ve established a reputation, like the boy who cried ‘wolf!’, as unreliable: in that case, any intelligent interlocutor would treat the probability of p given that I said ‘p’ as equivalent to the prior probability of p:

P(p | Mark says ‘p’) = P(p)

P(wolf | boy cries ‘wolf!’) = P(wolf)

My word would carry no weight one way or the other.  It’s unclear whether I’ve even made an assertion when my word has no weight — especially if I know in advance that I’m so distrusted.
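
To see why a discredited speaker’s word leaves the probability untouched, here is a small sketch using Bayes’ theorem. The prior and the likelihoods are made-up numbers chosen purely for illustration, and the function name is hypothetical. The point is only that when a speaker is just as likely to say ‘p’ whether or not p is true, the posterior equals the prior.

```python
def posterior(prior: float, p_say_if_true: float, p_say_if_false: float) -> float:
    """P(p | speaker says 'p'), computed with Bayes' theorem."""
    numerator = p_say_if_true * prior
    evidence = numerator + p_say_if_false * (1.0 - prior)
    return numerator / evidence


prior_wolf = 0.05  # hypothetical prior probability that a wolf is present

# The boy who cried 'wolf!': equally likely to cry wolf either way.
discredited = posterior(prior_wolf, p_say_if_true=0.5, p_say_if_false=0.5)

# A speaker whose word tracks the truth: likely to say it only when it is so.
trusted = posterior(prior_wolf, p_say_if_true=0.9, p_say_if_false=0.1)

print(discredited, trusted)
```

With these illustrative numbers, the discredited speaker’s cry leaves the listener at the prior, while the trusted speaker’s cry raises the probability well above it.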

What if I’ve established no reputation one way or another?  You might think that, in such a scenario, the default should be to trust me, to give my word some, though of course not dispositive, weight.  Call this default assertoric empowerment: an epistemic agent S is default-empowered to assert that p for a range R of propositions just in case S’s saying that p (when p is in R) typically carries some evidentiary weight even with strangers. (I’m drawing here on Searle’s idea of empowerment in The Construction of Social Reality.)

For other kinds of speech acts, it’s obvious that constraints are placed on empowerment.  Not just anyone can issue me directives.  “Eat your vegetables” carries some force when my wife says it to me, but not when the bus boy at a restaurant says it to me.  “Class dismissed” will end my class when I say it, but it won’t end my class when you say it or your class when I say it.  I can’t promise to give you the Grand Canyon for your birthday because I don’t own, and have no way of acquiring, the Grand Canyon.  One needs to be suitably empowered to give people orders, to declare X to be Y, or to promise to Z.

For “pushy” speech acts such as directives and declaratives, default empowerment is highly circumscribed.  There are very few things that any given person is assumed by default to be able to command others to do.  “Stop harming me” is probably one, though that presupposes that the speaker is in fact being harmed.  “Don’t harm me” might work a little better.  Likewise, there are very few things that any given person is assumed to be able to declare.  I can’t declare myself President, declare myself tenured, or name your baby.  Most default declarative empowerments seem to have to do with voluntary affiliations.  I can declare myself a Christian, or an atheist, or a socialist, or gay.  Historically, though, even these kinds of affiliations couldn’t be declared by default.  After the Peace of Westphalia, a German peasant couldn’t declare his own religious affiliation: it was declared for him by his prince.  Until very recently, it was impossible to self-identify as homosexual because there was no concept or word for the category.  Even after the words and concepts were forged, self-declaring as gay was not default-empowered: someone who tried might, instead of being acknowledged, face electroshock therapy.  In 2013, Bangladesh recognized a third gender category of hijras, who are neither men nor women.

Not so, one might think, with assertions.  Unless one is explicitly disempowered because one is severely mentally ill, a very young child, or a notorious liar, one is default-empowered to assert that p for a very wide range R.  I want to challenge this assumption.  Just for starters, consider the fact that in ancient Greece the testimony of a slave was admissible as evidence in a trial only if it was acquired under torture.  This shows that belonging to a certain social category has been enough, historically, to disempower someone from making an assertion unless very special steps were taken.  Surely, though, things have improved in the ensuing centuries.  But how much?  Even in progressive Sweden, a woman’s “no” still means “yes.”  In the USA, a black man’s saying “I’m not resisting arrest” can still lead to charges of… resisting arrest.  Sad to say, default assertoric empowerment does not characterize the epistemic lives of many, many people: whether you’re empowered to say that p depends on which social category you belong to.  In this post, I’ll just assume that it’s clear that the examples of assertoric disempowerment I’ve mentioned are repugnant.  Those who share my sensibilities will agree that women should be default-empowered to say (and mean) no, that black people should be default-empowered to say (and mean) that they’re not resisting arrest, and that it should never be a condition on someone’s assertoric empowerment that s/he first be tortured.

It’s useful, then, to distinguish normative assertoric empowerment from descriptive assertoric empowerment.  On the one hand, default assertoric empowerment shouldn’t depend on the social category the speaker belongs to.  On the other hand, it often does.  What seems to happen all too often can be captured by a relativized version of the empowerment schema:

An epistemic agent S of socio-cultural category C is default-empowered to assert that p for a range R of propositions just in case S/C’s saying that p (when p is in R) typically carries some evidentiary weight.

When descriptive default assertoric empowerment diverges from normative default assertoric empowerment because of the role of the C-variable, we have an instance of social-category-based epistemic injustice.  In other words, if your belonging to a social category that should be irrelevant to whether you are empowered to say that p disempowers you from saying that p, you have been wronged.  (On the other side of the coin, if you are unfairly privileged to say that p only because you belong to a particular social category, a different sort of epistemic injustice has been committed.)  I won’t even attempt to lay out a general account of when people of a given category should or should not be default-empowered to assert that p.  For one thing, I don’t have the space here.  For another, I have no idea how to do so.  What I do want to try in the balance of this post is to convince you that a particularly pernicious form of social-category-based epistemic injustice, in which people’s capacity to make assertions is undermined, is rife in the news, particularly in the coverage of violent ongoing conflicts.

People don’t have time to travel the world in search of everything worth knowing.  We rely on reporters and newspapers to tell us what’s worth knowing.  We expect that, if we’ve chosen an epistemically responsible paper to read, then if it systematically ignores something, that thing isn’t worth knowing about.  One way in which epistemic injustice can crop up, then, is that people who have important assertions to make are systematically ignored because of where they’re from.  If you won’t be heard — and you know that you won’t be heard — then you cannot speak.  If you cannot speak even though you have something important to say, and your silence is determined by the social group you belong to, then epistemic injustice has occurred.

In decades of research, Gerd Gigerenzer and his collaborators have shown that the degree to which something is covered in the news is highly predictive of whether people in other countries recognize that thing.  Moreover, people seem to use the fact that they recognize something to decide whether it is large on some important dimension.  This “recognition heuristic” can be a powerful epistemic tool when the importance of something correlates with how much it gets covered in the news, and hence how many people recognize it and think it’s important.  For instance, Americans are surprisingly good (and better than Germans) at saying which of two German cities is bigger because they tend to recognize only some of them, and almost always say that the one they recognize is bigger.  Likewise, Germans are surprisingly good (and better than Americans) at saying which of two American cities is bigger because they tend to recognize only some of them, and almost always say that the one they recognize is bigger.

Population is an important dimension of a city, so it reflects well on major newspapers that their coverage (and hence our recognition and decision-making) tracks city population pretty well.  Indeed, correlations between population, news coverage, and proportion of people recognizing a city tend to be at least .60 and as high as .86.  On the plausible assumption that people from different cities have roughly as much of note to say as one another, high correlations like this indicate that epistemic justice is being served.  In other research, however, I’ve started to document problems with this model when cities outside of the US and Europe are thrown into the mix (see this post and follow-ups on my blog).  Although the correlation between population and coverage is .83 for the New York Times’s coverage of German cities and .77 for Argentine cities, it’s a measly .41 for Turkish cities and drops to .19 when cities from Germany, Argentina, Turkey, Thailand, and Nigeria are considered together.  Ignoring for the sake of brevity a lot of important caveats, the reason for the international discrepancy is that cities outside of Europe are covered much, much less than those in Europe.  Here’s a graph that represents the correlations between ordinal population ranking and ordinal NYT coverage ranking for Germany and the rest of the world:

[Figure: ordinal population ranking plotted against ordinal NYT coverage ranking, for Germany and the rest of the world]


Note the many cities, some of which are quite large, tied for last place with 0 mentions in the NYT.  If you lived in one of those cities between 2000 and 2010 (the dates covered by my analysis), you could not speak to the world — at least, not through the NYT.  Geography determines communicative destiny.

One might think that I’m overstating the case.  After all, maybe nothing important is going on in cities outside of Europe.  Maybe entire cities have lost their default assertoric empowerment because they have nothing worth saying.  Surely, though, you’d admit that whether people are meeting violent deaths in a given area would make that area remarkable.  If a newspaper fails to cover large-scale violence, then it is committing epistemic injustice against the survivors and victims, who presumably want to say something worth hearing about their plight.  The number of people killed in armed conflict is an important dimension of such a horrific event.  One would hope, then, that the amount of news coverage would correlate well with the severity of the horror.  Sadly, this is not so.  To show this, I correlated the number of violent deaths in 2013 in a given area with the number of articles in the NYT that mentioned killing in the area in question.  There were 17 conflicts in which at least 100 people were killed (an arbitrary cutoff I imposed before looking at any correlations).  The correlation between the number of deaths in 2013 and the number of articles mentioning those deaths in 2013 was a paltry .28.  Here’s a scatterplot:

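[Figure: scatterplot of violent deaths in 2013 against NYT articles mentioning those deaths, with regression line and 95% confidence band]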


The blue line is a regression line for the data.  It’s got a shadow around it indicating the 95% confidence interval.  Basically, what this means is that we can be 95% certain that the true regression line lies somewhere in the shaded area.  Notably, this means that, although the point-estimate of the correlation is .28, the real correlation could be positive, negative, or zero.  In other words, for all we know from this data, there is no correlation between the number of people killed in a violent conflict and the number of times that conflict is mentioned in the NYT.
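
A rough numerical check on that last claim: the sketch below computes an approximate 95% confidence interval for a correlation of .28 based on 17 conflicts. It uses the Fisher z-transformation, which is a different technique from the regression band in the scatterplot, and it assumes a Pearson correlation and approximately bivariate-normal data. Neither assumption is confirmed by the analysis above, and the function name is hypothetical.

```python
import math


def fisher_ci(r: float, n: int, z_crit: float = 1.96) -> tuple[float, float]:
    """Approximate confidence interval for a correlation r from n pairs (Fisher z)."""
    z = math.atanh(r)                 # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)       # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


low, high = fisher_ci(0.28, 17)
print(f"{low:.2f} to {high:.2f}")
```

Under these assumptions the interval runs from roughly -.23 to .67, so it comfortably includes zero, which fits the point that the data cannot rule out there being no correlation at all.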

draft of Moral Psychology, Chapter 1: preferences

As always, comments, suggestions, questions, criticisms, etc. are most welcome….

“We are strangers to ourselves.”

~ Friedrich Nietzsche, On the Genealogy of Morals, Preface, section 1


1 The function of preferences: prediction, explanation, planning, and evaluation


Among our diverse mental states, some are best understood as representing how the world is. If I know that wine is made from grapes, I correctly represent the world as being a certain way. If I think that Toronto is the capital of Canada, I incorrectly represent the world as being a certain way (it’s actually Ottawa). Other mental states are best understood as moving us to act, react, or forbear in various ways. I want to see the Grand Canyon before I die. I desire to know how to speak Spanish. I prefer to use chopsticks rather than a fork to eat sushi. I intend to keep my promises. I aim to be fair. I love to hear New Orleans-style brass band music. Depending on their longevity, their intensity, their specificity, their malleability, and their idiosyncrasy, we use different words to describe these mental states: values, drives, choices, appraisals, volitions, cravings, goals, reasons, purposes, passions, sentiments, longings, appetites, aspirations, attractions, motives, urges, needs, acts of will. Such mental states are sometimes referred to as pro-attitudes, and related states that move someone to avoid, escape, or prevent a particular state of affairs are correspondingly called con-attitudes.

If you put together an agent’s representations of how the world is and the mental states that move her to act, you have some hope of predicting and explaining her actions. Suppose, for instance, that you know that I have a free weekend, that I deeply yearn to see the Grand Canyon, and that I have some spare cash. What am I going to do? It’s not unreasonable to predict that I will purchase a plane ticket (or rent a car) and go to Arizona. Now suppose that you know that my comprehension of geography is pretty weak. I still want to see the Grand Canyon, but I mistakenly think that it’s in Chihuahua. (Oops – nobody’s perfect). What do you think I’ll do now? It’s not unreasonable to predict that I’ll still purchase a plane ticket or rent a car, but that instead of going to Arizona I’ll end up in Mexico (and pretty frustrated!). Someone’s representations and purposes combine to lead them to act. If you know what someone’s representations and purposes are, you can to some extent predict what they’ll do.

In the same vein, knowing what someone’s representations and purposes are puts you in a position to explain their actions. Suppose you see me stand up, walk across the room, open a door, and walk through the doorway. On the door, you notice the following icon:

Figure 1


Why did I do what I did? A plausible explanation isn’t too hard to assemble. If you saw the sign indicating that the door led to the men’s bathroom, then presumably I did too: so I probably had a relevant representation of what was on the other side of the door. What desire (preference, goal, intention, need) might I have that would rationalize my behavior? The most obvious suggestion is that I wanted to relieve myself. Of course, it’s possible that I went to the men’s bathroom to participate in a drug deal, to conceal myself while I had a good long cry, or for some other reason. But if you’re right in thinking that I wanted to urinate, then you’ve successfully explained my action. If you know what someone’s representations and purposes are, you can to some extent explain what they’ve done.

To predict and explain other people’s actions, we need some idea of what they prefer (want, desire, value, need). But that’s not all that preferences are for. Preferences also figure in planning and evaluation, and when they’re structured appropriately, they contribute to the agent’s autonomy. Think about your best friend. Imagine that her birthday is in a week. You love your friend, and want to do something special for her birthday. You don’t need to predict your own action here, nor do you need to explain it. Your task now is to plan: in the next week, what can you do for your friend that will simultaneously please and surprise her without emptying your bank account? To give your friend a special birthday present, you need to know what she enjoys (or would enjoy, if she hasn’t experienced it yet). To be motivated to give your friend a special birthday present in the first place, you need to want to do something that she wants. In philosophical jargon, you must have a higher-order desire – a desire about another desire (hers). You want to give her something that she wants.

It’s remarkable how adept people can be at solving this sort of problem, which involves the sort of recursively embedded agent-patient relations discussed in the introduction. Think about it. To plan a good gift, you need to know now not just what your friend currently wants but what she will want in the future. You can’t just give her what you yourself want or what you will want in a week. You can’t give her what she wants now but won’t want in a week. To successfully give your friend a good present, you have to figure out in advance what she’ll want in a week.

The same constraints apply when you plan for yourself. Think about choosing your major in college. What do you want to specialize in? Musicology is interesting, but will you still be interested in it three years from now? Will it set you up to earn a decent living (something you’ll presumably want in five, ten, and twenty years)? Marketing might earn you a decent living, but will you find it boring (not want to do it, or even want not to do it) after a few years? Are you going to want to have children? In that case, you may need more income than you would if you didn’t want (and didn’t have) children. Living a sensible life requires planning. You need to make plans that affect your friends, your family, your colleagues, and your rivals. You also need to make plans for yourself. Doing this successfully requires intimate knowledge of (or at least some pretty good guesses about) your own and others’ future desires, needs, and preferences.

Thus, preferences figure in the prediction, explanation, and planning of action. They’re also important when we morally evaluate action. I reach out violently and knock you over, causing you some pain and surprising you more than a little. What should you think of my action? It depends in part on what moved me to do it. If I’ve shoved you because I want to hurt you, if I’m engaged in an assault, you’re going to think I’m doing something wrong. If I’m not depraved, I’ll also feel guilty. If I’m just clumsily gesturing at a pretty tree over there, I should probably know better, but you’ll temper your anger. I may not feel guilty, but I’ll probably be embarrassed or even ashamed. If I’m knocking you out of the way of a biker who’s zooming down the sidewalk towards you, perhaps you’ll feel grateful, while I’ll feel relieved or even proud.

What marks the difference between your reactions to my action? What marks the difference between my own assessments of it after the fact? It’s not that my shoving you and your falling hurts more or less in one case or the other. Instead, what leads you to evaluate my action as wrong, misguided, or benevolent is the pro- (or con-)attitude that moves me to act. Likewise, what leads me to feel guilt, embarrassment, or relief is the pro- (or con-)attitude that moved me to act. If I want to hurt you, if I want to do something to you that you prefer not to happen, you’ll say that I’ve acted wrongly. If my aim is to do something relatively harmless (something you neither prefer nor disprefer) like pointing out a feature of the environment, you’ll perhaps think I’m a klutz, but you won’t think I’ve done something morally wrong. If I’m trying to prevent you from being run down by an out-of-control cyclist, if I want to do something to you that (once you understand it) you prefer that I do, you’ll presumably think I’ve done something morally good.

Preferences are important and versatile. They help us predict and explain actions. They help us exercise agency on our own behalf and for those we care about. They help us evaluate the actions of others and ourselves. In the context of moral psychology, there’s one last thing that preferences are good for: autonomy. According to many philosophers, such as Harry Frankfurt (1971, 1992), a person is autonomous or free to the extent that she wants what she wants to want, or at least does not want what she would prefer not to want. An autonomous agent is someone whose will has a characteristic structure. This idea is discussed in more depth in chapter 2.

As I mentioned above, we have dozens of terms to refer to pro- and con-attitudes. But the title of this chapter is ‘Preferences’. Why? Preferences are sufficiently fine-grained to help in the prediction, explanation, and evaluation of action in the face of tradeoffs. Other motivating attitudes lack this specificity. Consider, for instance, values.[1] At a high enough level of abstraction, everyone values the same ten things: power, achievement, pleasure, stimulation, self-direction, universalism, benevolence, tradition, conformity, and security (Schwartz 2012). If you want to know what someone will do, why someone did something, or whether someone deserves praise or blame for acting as they did, knowing that they accept these values gives you no purchase. Qualitatively weighting values doesn’t improve things much. Consider someone who values pleasure “somewhat,” stimulation “a lot,” and security “quite a bit.” What will she do? It’s hard to say. Why’d she go to the punk rock show? It’s hard to say. Does she merit some praise for engaging in a pleasant conversation with a stranger at the coffee shop? It’s hard to say.

Preferences set up a rank ordering of states of affairs. This is easiest to see in the case of tradeoffs. Suppose two desires are moving you to act. You’re exhausted after a long day, so you want to take a nap. But your friend just texted to suggest meeting up for a drink at a local bar, and you want to join her. We can represent this tradeoff with the following table:


                    Nap    Don’t nap
Join friend         A      B
Don’t join friend   C      D

Table 1: Choice matrix


In this simplified choice matrix, there are four ways things could turn out. You could take a nap and join your friend (A); you could join your friend without taking a nap (B); you could take a nap without joining your friend (C); and you could neither nap nor join your friend (D). If you have a complete set of preferences over these options, one of them is optimal for you, another is in second place, another is in third place, and the final one is in last place. Presumably A is your top outcome and D is your bottom outcome. Unfortunately, although you most prefer A (i.e., you prefer it to B, C, and D), it’s impossible. So you’re in a position where you need to weigh a tradeoff. This is where preferences become important. If you simply value the nap and value socializing with your friend, there’s no saying whether you’ll go with B or C. But if you prefer socializing to napping, we can predict that you’ll opt for B over C. By the same token, if you prefer napping to socializing, we can predict that you’ll opt for C over B.

So preferences are especially helpful in predicting behavior. They’re also great for explaining and evaluating behavior. A useful rule of thumb for explaining behavior is that people act in such a way as to bring about the highest-ranked outcome they think they can achieve. Imagine someone who prefers A to B, B to C, C to D, D to E, E to F, F to G, and G to H. She acts in such a way as to produce C. How can we explain this? Well, if we posit that she believes that A and B are out of the question (perhaps she takes them to be impossible or at least extremely difficult to achieve), then we can explain her behavior by saying that she went with the best outcome available to her.
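The rule of thumb can be put in a few lines of Python. This is only a sketch under my own simplifying assumptions: the function name predict_choice is made up for illustration, a preference ranking is represented as a list running from best to worst, and the agent's beliefs about what she can achieve are represented as a set of outcomes.

```python
def predict_choice(ranking, believed_feasible):
    """Return the highest-ranked outcome the agent believes she can achieve.

    ranking: outcomes ordered from most to least preferred (an assumption
        of this sketch, not a claim about how preferences are stored).
    believed_feasible: the outcomes the agent takes to be achievable.
    """
    for outcome in ranking:
        if outcome in believed_feasible:
            return outcome
    return None  # nothing in the ranking is believed to be achievable


# The nap-versus-friend case: A is impossible, and socializing beats napping.
nap_ranking = ["A", "B", "C", "D"]
print(predict_choice(nap_ranking, {"B", "C", "D"}))  # B

# The eight-outcome case: she believes A and B are out of the question.
long_ranking = ["A", "B", "C", "D", "E", "F", "G", "H"]
print(predict_choice(long_ranking, set("CDEFGH")))  # C
```

The same function does double duty. Run forwards, it predicts what someone will do; run backwards, it explains a choice by asking which beliefs about feasibility would make that choice the best available one.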


2 The role of preferences in moral psychology


We’re now in a position to see how preferences relate to the five core concepts of moral psychology (patiency, agency, sociality, reflexivity, and temporality).


2.1 The role of preferences in patiency


Even if no one else is involved, even if you’re not exercising agency, your preferences matter for your patiency. According to one attractive theory of personal well-being, what it means for your life to go well is that your preferences are satisfied (Brandt 1972, 1983; Heathwood 2006). Your preferences might be satisfied through your own agency. You might prefer, among other things, to exercise agency in pursuit of some goal or other. Your preferences might be satisfied because you are involved in social relations with other people. Even so, there will be cases in which what you prefer happens or fails to happen simply by luck, accident, or unanticipated causal necessity. Fundamentally, then, well-being is associated with patiency, with what happens to you.

The preference-satisfaction theory of well-being is attractive for several reasons. It explains why one aspect of morality is intrinsically motivating. If my well-being is a matter of whether my preferences are satisfied, then I can’t help caring about my well-being. Preferences are a way of caring about things. Of course I care about what I care about. The preference-satisfaction theory of well-being also accounts for cases in which hedonic (pleasure-based) theories of well-being fail. Sometimes, it seems like my life goes no better, and may even go worse, when I experience some pleasures. I struggle with alcohol dependency and end up drinking to excess. While I enjoy the drinks, I prefer to stop. Arguably, I’m worse rather than better off because, even though I experience pleasure, my preferences are frustrated. Similarly, sometimes it seems like your life goes no worse, and may even go better, when you experience some pains. You exercise vigorously at the gym. You force yourself to study extra hard for an exam. You watch a frightening or depressing or horrifying movie. You eat a meal spiced with more than a little wasabi. These are painful experiences, but in each case you prefer to suffer through the pain. Arguably, you’re better rather than worse off because, even though you experience pain, your preferences are satisfied.

The preference-satisfaction theory of well-being also provides a way to understand well-being comparatively. People don’t just have good or bad lives. They have better or worse lives. Someone whose life is going poorly could be even worse off. Someone whose life is going well could be even better off. This distinction maps nicely onto the idea of a preference ranking. Since preferences can in principle put all the ways the world could be in order from best to worst, it’s possible to identify someone’s well-being with how far up their ranking things actually are. If you prefer A to B, B to C, C to D, D to E, E to F, F to G, and G to H, and the actual state of affairs is C, then your level of well-being is better than many ways it could be but not maximal. If things change to B, your well-being improves one notch; if things change to D, your well-being goes down a notch.
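To make the "notches" concrete, here is a minimal sketch. The function name wellbeing_level and the idea of counting how many ranked outcomes are worse than the actual one are my own illustrative choices; the resulting numbers are purely ordinal and say nothing about how much better one outcome is than another.

```python
def wellbeing_level(ranking, actual):
    """Count the ranked outcomes that are worse than the actual one.

    ranking: outcomes ordered from most to least preferred.
    actual: the outcome that actually obtains (must appear in ranking).
    """
    return len(ranking) - 1 - ranking.index(actual)


ranking = ["A", "B", "C", "D", "E", "F", "G", "H"]
print(wellbeing_level(ranking, "C"))  # 5: better than five alternatives, not maximal
print(wellbeing_level(ranking, "B"))  # 6: one notch up
print(wellbeing_level(ranking, "D"))  # 4: one notch down
```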

The most plausible version of the preference-satisfaction theory of well-being claims that what really contributes to your well-being is not the extent to which your actual preferences are satisfied but the extent to which your better-informed preferences are satisfied. Why? And what does it mean for preferences to be informed? Imagine that you’re about to take a bite of a delicious chile relleno. It’s your favorite dish. The cheese is perfectly melted. The poblanos are fresh. The tomatoes are local. Everything is perfect, with one exception: unbeknownst to you, the cook accidentally used rat poison rather than salt. If you eat these chiles, you’re going to end up in the hospital. But you don’t know this; in fact, you have no clue. It won’t improve your life to eat those chiles. It’ll make your life (much!) worse.

Philosophers recognize this, and that’s why they say that your well-being is a function not of what you want but of what you would want if you were better informed. If you knew that the chiles rellenos were poisoned, you would prefer quite strongly not to eat them, so even though you currently prefer to eat them, doing so would detract from rather than contribute to your well-being.

Knowledge of potential poisons is clearly not the only thing you need to have informed preferences, so philosophers of well-being argue that your better-informed preferences are your fully-informed preferences. According to this approach, the preferences that determine someone’s well-being are not the preferences that person actually has, but the ones they would have if they were fully informed. Specifying what full information means in a way that doesn’t collapse into omniscience is tricky, but one attractive suggestion is to take into account “all those knowable facts which, if [you] thought about them, would make a difference to [your] tendency to act” (Brandt 1972, p. 682) or “everything that might make [you] change [your] desires” (Brandt 1983, p. 40) – a process Richard Brandt dubbed cognitive psychotherapy.[2]


2.2 The role of preferences in agency, reflexivity, and temporality


I briefly mentioned the role of preferences in agency, reflexivity, and temporality above. Several points are relevant. First, to act at all, you must have pro-attitudes like preferences. Without states that move you to act, you’d never act in the first place, never exercise agency at all. Second, to act in the face of tradeoffs, you must have some way of ranking potential outcomes. That’s what preferences do: they put potential outcomes in a rank order. Third, to be the sort of agent that the vast majority of adult humans are, you need to engage in long-term plans and projects. This involves having some idea in advance what your future self’s preferences will or might be. It involves having temporally extended preferences, so that you want now for your future preferences, whatever they end up being, to be satisfied. It involves thinking of that future person as yourself and therefore having a special regard for him or her. If your future self mattered to you no more or less than some random stranger, long-term projects would be pretty foolish.

To be a recognizably human agent, your preferences must not violate certain constraints. Put less dramatically, your agency is undermined to the extent that your preferences violate certain constraints. You’ll fail to act successfully to the extent that you suffer from preference reversals (preferring A to B one moment and B to A the next moment). You’ll fail to act successfully if you have cyclical preferences (preferring A to B, B to C, but C to A). You’ll fail to act successfully over time if you cannot rely on your current representation of your future preferences to be largely accurate (thinking that you’ll prefer A to B when in fact you’ll prefer B to A).
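The acyclicity constraint can be checked mechanically. The following sketch assumes, purely for illustration, that strict preferences are given as a set of (better, worse) pairs; the function name has_cycle and this representation are mine. It treats the preferences as arrows in a graph and looks for a path that leads back to where it started.

```python
def has_cycle(prefers):
    """Report whether a set of strict preferences contains a cycle.

    prefers: set of (x, y) pairs meaning 'x is strictly preferred to y'.
    """
    graph = {}
    for better, worse in prefers:
        graph.setdefault(better, set()).add(worse)
        graph.setdefault(worse, set())

    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # we have come back to an option on the current path
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph[node])
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in graph)


print(has_cycle({("A", "B"), ("B", "C"), ("C", "A")}))  # True
print(has_cycle({("A", "B"), ("B", "C"), ("A", "C")}))  # False
```

An agent with the first set of preferences can never settle on a best option among A, B, and C, because each one is beaten by another.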


2.3 The role of preferences in sociality


We tend to think that people deserve praise and blame only, or at least primarily, for their motivated actions. As I pointed out above, if someone inadvertently brings about a consequence, we tend to withhold or at least temper praise (even if the consequence was good) and blame (even if it was bad). Moral good luck is nice, but not particularly praiseworthy. Negligence is blameworthy, but less so than malice.

The role of preferences in sociality is most directly comprehensible from a utilitarian (or other consequentialist) framework, but does not depend essentially on the truth of utilitarianism. Utilitarians such as Brandt analyze right action in terms of preference-satisfaction. According to Brandt (1983, p. 37), an action is permissible if (and only if) “it would be as beneficial to have a moral code permitting that act as to have any moral code that is similar but prohibits the act.” Obligatory and forbidden actions can then be defined in terms of permissibility using well-known equivalences in deontic logic: an obligatory action is one that it’s not permissible not to do, and a forbidden action is one that it’s not permissible to do. The connection with preferences is that benefit (and harm) are understood on this account in terms of well-being. In other words, according to Brandt, an action is permissible if (and only if) it would satisfy as many fully-informed preferences, across all people, to have a moral code permitting that act as to have any moral code that is similar but prohibits the act.
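For readers who like symbols, the two equivalences can be written out as follows. The operator letters are the conventional ones in deontic logic rather than Brandt's own notation: P for "it is permissible that", O for "it is obligatory that", and F for "it is forbidden that".

```latex
\begin{align*}
O\varphi &\leftrightarrow \neg P\,\neg\varphi && \text{(obligatory: not permissible not to do)} \\
F\varphi &\leftrightarrow \neg P\,\varphi     && \text{(forbidden: not permissible to do)}
\end{align*}
```

Once permissibility has been fixed by the comparison of moral codes, nothing further is needed to settle what is obligatory and what is forbidden.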

Brandt’s theory is a rule utilitarian approach to right action. One could instead adopt an act utilitarian theory, according to which an action is permissible if and only if performing it in the circumstances would be as beneficial as performing any alternative action (Smart 1956). Or one could adopt a motive utilitarian theory, according to which an action is permissible if and only if it’s what a person with an ideal motivational set (i.e., a psychologically possible motivational set that, over the course of a lifetime, is as beneficial as any alternative psychologically possible motivational set) would perform in the circumstances (Adams 1976). Regardless of the precise flavor of utilitarianism one adopts, then, it’s clear that, for utilitarians, preferences are immensely important on the dimension of sociality. To act in such a way as to satisfy the most preferences, you must take into account the effects of your action not just on yourself but on everyone else. In other words, you need to take into account how your agency affects others’ patiency. Nested agent-patient relations also play a role here. What you do (or fail to do) to one person will often have some effect on what they do (or fail to do) to another person, which will have an effect on what the second person does (or fails to do) to a third person, and so on.

As I mentioned above, the relevance of preferences to sociality is easiest to see from a utilitarian perspective, but it doesn’t rely on such a perspective. Virtue ethicists and care ethicists (though perhaps not Kantians) all accept the centrality of preferences in their approaches to sociality. For instance, one nearly universally recognized virtue is benevolence, the disposition both to want to benefit other people and to often succeed in doing so. Even if a virtue ethicist thinks that there are benefits other than preference-satisfaction, they admit that preference-satisfaction is one kind of benefit. In the same vein, Aristotle and other ancient virtue ethicists gave pride of place to friendship. Friends aim, among other things, to benefit each other (and typically succeed), which again involves (perhaps among other things) preference-satisfaction. Similarly, in the care tradition, the one-caring aims among other things to benefit the cared-for. This typically involves not only satisfying the cared-for’s informed preferences but actively helping the cared-for to get their actual preferences to approximate their idealized preferences.


3 Preference reversals and choice blindness


Thus, preferences matter in multiple ways to the core concepts of moral psychology. What does the scientific literature on preferences tell us about these important mental states? Two convergent lines of evidence suggest that preferences are neither determinate nor stable: the heuristics and biases research on preference reversals, and the psychological research on choice blindness.

Preferences are dispositions to choose one option over another. You strictly prefer a to b only if, if you were offered a choice between them, then ceteris paribus you would choose a. If your preferences are stable, then what you would choose now is identical to what you would choose in the future. If your preferences are determinate, then there is some fact of the matter about how you would choose. That is to say, exactly one of the following subjunctive conditionals is true: if you were offered a choice, then ceteris paribus you would choose a; if you were offered a choice, then ceteris paribus you would choose b; if you were offered a choice, then ceteris paribus you would be willing to flip a coin and accept a if heads and b if tails (or you would be willing to let someone else – even your worst enemy – choose for you). The kind of indeterminacy and instability I argue for in this section is modest rather than radical. I want to claim that preferences are unstable in the sense of sometimes changing in the face of seemingly trivial and normatively irrelevant situational influences, not in the sense of constantly changing. Similarly, I want to claim that preferences are indeterminate in the sense of there sometimes being no fact of the matter how someone would choose, not in the sense of there always being no fact of the matter how someone would choose.


3.1 Preference reversals


Two distinctions are worth making regarding the types of possible preference reversals. In a chain-type reversal, you prefer a to b, prefer b to c, and prefer c to a; such reversals are sometimes labeled failures of acyclicity. In a waffle-type reversal, you prefer a to b, but also prefer b to a. The other distinction has to do with temporal scale. Preference reversals can be synchronic, in which case you would have the inconsistent preferences all at the same time. More commonly, they are diachronic, in which case you might now prefer a to b and b to c, and then later come to prefer c to a (and perhaps give up your preference for a over b). Or you might now prefer a to b, but later prefer b to a (and perhaps give up your preference for a over b). In my (2012) paper, I call diachronic waffle-type reversals the result of Rum Tum Tugger preferences, after the character in T. S. Eliot’s Old Possum’s Book of Practical Cats who is “always on the wrong side of every door.”
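The synchronic/diachronic distinction for waffle-type reversals can be illustrated with a small sketch. Everything here is an illustrative assumption on my part: preferences at a time are represented as a set of (better, worse) pairs, and the helper waffle_pairs is a made-up name.

```python
def waffle_pairs(first, second):
    """Pairs preferred one way in `first` and the opposite way in `second`.

    Each argument is a set of (x, y) pairs meaning 'x is preferred to y'.
    Passing the same set twice checks for a synchronic reversal; passing
    an earlier and a later set checks for a diachronic one.
    """
    return {(a, b) for (a, b) in first if (b, a) in second}


now = {("a", "b")}
later = {("b", "a")}

print(waffle_pairs(now, now))    # set(): no synchronic reversal
print(waffle_pairs(now, later))  # {('a', 'b')}: a diachronic waffle-type reversal
```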

Preference reversals were first systematically studied by Daniel Kahneman, Sarah Lichtenstein, Paul Slovic, and Amos Tversky as part of the heuristics and biases research program.[3] In study after study, they and others showed that people’s cardinal preferences could be reversed by strategically framing the choice situation. When faced with a high-risk / high-reward gamble and a low-risk / low-reward gamble, most people choose the latter but assign a higher monetary value to the former. These investigations focused on choices between lotteries or gambles rather than choices between outcomes because the researchers were attempting to engage with theories of rational choice and strategic interaction, which – in order to generate representation theorems – employ preferences over probability-weighted outcomes. While this research is fascinating, its complexity makes it hard to interpret confidently. In particular, whenever the interpreter encounters a phenomenon like this, it’s always possible to say that the problem lies not in people’s preferences but in their credences or subjective probabilities. Since evaluating a gamble always involves weighting an outcome by its probability, one can never be sure whether anomalies are attributable to the value attached to the outcome or the process of weighting. And since we have independent reason to think that people’s ability to think clearly about probability is limited and unreliable (Alfano 2013), it’s tempting to hope that preferences can be insulated from this line of critique.

For this reason, I will focus on more recent research on preference reversals in the context of choices between outcomes rather than choices between lotteries (or, if you like, degenerate lotteries with probabilities of only 0 and 1). A choice of outcome a over outcome b can only reveal someone’s ordinal preferences; it can only tell us that she prefers a to b, not by how much she prefers a to b. This limitation is worth the price, however, because looking at choices between outcomes lets us rule out the possibility that any preference reversal might be attributable to the agent’s credences rather than her preferences.

Some of the most striking investigations of preference reversals in this paradigm have been conducted by Dan Ariely and his colleagues. For instance, Ariely, Loewenstein, and Prelec (2006) used an arbitrary anchoring paradigm to show that preferences ranging over baskets of goods and money are susceptible to diachronic waffle-type reversals.[4] In this paradigm, a participant first writes down the final two digits of her social security number (henceforth SSN-truncation[5]), then puts a ‘$’ in front of it. Next, the experimenters showcase some consumer goods, such as chocolate, books, wine, and computer peripherals. The participant is instructed to record whether, hypothetically speaking, she would pay her SSN-truncation for the goods. Finally, the goods are auctioned off for real money. The surprising result is that participants with high SSN-truncations bid 57% to 107% more than those with low SSN-truncations.

To better understand this phenomenon, consider a fictional participant whose SSN-truncation was 89. She ended up bidding $50 for the goods, so, at the moment of bidding, she preferred the goods to the money; otherwise, she would have entered a lower bid. However, one natural interpretation of the experiment is that, prior to the anchoring intervention, she would or at least might have chosen that amount of money over the goods (i.e., she would have bid lower); in other words, prior to the anchoring intervention, she preferred the money to the goods. Anchoring on her high SSN-truncation induced a diachronic waffle-type reversal in her preferences. Prior to the intervention, she preferred the money to the goods, but after, she preferred the goods to the money. This way of explaining the experiment entails that her preferences were unstable: they changed in response to the seemingly trivial and normatively irrelevant framing of the choice.

Another way to explain the same result is to say that, prior to the anchoring intervention, there was no fact of the matter whether she preferred the goods to the money or the money to the goods. In other words, it was false that, given a choice, she would have chosen the goods, but it was equally false that, given a choice, she would have chosen the money or been willing to accept a coin flip. Only in the face of the choice in all its messy situational details did she construct a preference ordering, and the process of construction was modulated by her anchoring on her SSN-truncation. This alternative explanation entails that her preferences were indeterminate.

Furthermore, these potential explanations are mutually compatible. It could be, for instance, that her preferences were partially indeterminate, and that they became determinate in the face of the choice situation. Perhaps she definitely did not prefer the money to the goods prior to the anchoring intervention, but there was no fact of the matter regarding whether she was indifferent or preferred the goods to the money. Then, in the face of the hypothetical choice, this local indeterminacy was resolved in favor of preference rather than indifference. Finally, her newly-crystallized preference was expressed when she entered her bid.

Such a robust effect calls for explanation. My own suspicion is that a hybrid of indeterminacy and instability is the right theory of what happens in these cases, but it’s difficult to find evidence that points one way or the other. In any event, for present purposes, I’m satisfied with the inclusive disjunction of indeterminacy and instability.


3.2 Choice blindness


There are many other – often amusing and sometimes depressing – studies of preference reversals, but the gist of them should be clear, so I’d like to turn now to the phenomenon of choice blindness, a field of research pioneered in the last decade by Petter Johansson and his colleagues. As I mentioned above, preferences are dispositions to choose. You prefer a to b only if, were you given the choice between them, then ceteris paribus you would choose a. Preferences are also dispositions to make characteristic assertions and offer characteristic reasons. While it’s certainly possible for someone to prefer a to b but not to say so when asked, the linguistic disposition is closely connected to the preference. Someone might be embarrassed by her preferences. She might worry that her interlocutor could use them against her in a bargaining context. She could be self-deceived about her own preferences. In such cases, we wouldn’t necessarily expect her to say what she wants, or to give reasons that support her actual preferences. But in the case of garden-variety preferences, it’s natural to assume that when someone says she prefers a to b, she really does, and it’s natural to assume that when someone gives reasons that support choosing a over b, she herself prefers a to b. Research on choice blindness challenges these assumptions.

Imagine that someone shows you two pictures, each a snapshot of a woman’s face. He asks you to say which you prefer on the basis of attractiveness. You point to the face on the left. He then asks you to explain why, displaying the chosen photograph a second time. Would you notice that the faces had been surreptitiously switched, so that the face you hadn’t pointed at is now the one you’re being asked about? Or would you give a reason for choosing the face that you’d initially dispreferred? Johansson et al. (2005) found that participants detected the ruse in fewer than 20% of trials. Moreover, when asked for reasons, many of the participants who had not detected the manipulation gave reasons that were inconsistent with their original choice. For instance, some said that they preferred blondes even though they had originally chosen a brunette.

This original study of choice blindness has been supplemented with experiments in other domains. For instance, Hall et al. (2010) found that people exhibited choice blindness in more than two thirds of all trials when the choice was between two kinds of jam or two kinds of tea. After tasting both, participants indicated which of the two they preferred, then were asked to explain their choice while sampling their preferred option “again.” Even when the phenomenological contrast between the items was especially large (cinnamon apple versus grapefruit for jam, Pernod versus mango for tea), fewer than half the participants detected the switch.

Choice blindness in the domain of aesthetic evaluations of faces and comestibles might not seem weighty enough to support the argument that preferences are often indeterminate and unstable. But perhaps choice blindness in the domain of political preferences and moral judgments would be. Johansson, Hall, and Chater (2011) used the choice blindness paradigm to flip Swedish participants’ political preferences across the conservative-socialist gap.[6] Participants filled in a series of scales on their political preferences for policies such as taxes on fuel. Some of these scales were then surreptitiously reversed, so that, for example, a very conservative answer was now a very socialist answer. Participants were then asked to indicate whether they wanted to change any of their choices, and to give reasons for their positions. Fewer than 20% of the reversals were detected, and only one in every ten of the participants detected enough reversals to keep their aggregate position from switching from conservative to socialist (or conversely). In a similar study, Hall, Johansson, and Strandberg (2012) used a self-transforming survey to flip participants’ moral judgments on both socially contentious issues, such as the permissibility of prostitution, and broad normative principles, such as the permissibility of large-scale government surveillance and illegal immigration. For instance, an answer indicating that prostitution was sometimes morally permissible would be flipped to say that prostitution was never morally permissible, and an answer indicating that illegal immigration was morally permissible would be flipped to say that illegal immigration was morally impermissible. Detection rates for individual questions ranged between 33% and 50%. Almost 7 out of every 10 of the participants failed to detect at least one reversal.

As with the behavioral evidence for preference reversals, the evidence for choice blindness suggests that people’s preferences are unstable, indeterminate, or both. The choices people make can fairly easily be made to diverge from the reasons they give. If preferring a to b is a disposition both to choose a over b and to offer reasons that support the choice of a over b (or at least not to offer reasons that support the choice of b over a), then it would appear that many people lack preferences, or that their preferences do exist but are extremely labile. Not only is there sometimes no fact of the matter about what we prefer, but also our preferences are often seemingly constructed on the fly in choice situations, and their ordering is shaped by seemingly trivial and normatively irrelevant factors.


3.3 A descriptive preference model


While it is of course possible to dispute the ecological validity of these experiments or my interpretation of them, I want to proceed by considering some of the philosophical implications of that interpretation, assuming for the sake of argument that it is sound. I’ve already explored some of the implications of this perspective in Alfano (2012), where I argue that the indeterminacy and instability of preferences infirm our ability to explain and predict behavior. Predictions of behavior often refer to the preferences of the target agent. If you know that Karen prefers vanilla ice cream to chocolate, then you can predict that, ceteris paribus, when offered a choice between them she will go with vanilla. Likewise for explanations: you can base an explanation of Karen’s choice of vanilla on the fact that she prefers vanilla. But if there’s no fact of the matter about what Karen prefers, you cannot so easily predict what she will do, nor can you so easily explain why she did what she did. A related problem arises when considering instability. If Karen prefers vanilla to chocolate now, but her preference is unstable, then the prediction that she will choose vanilla in the future – even the near future – is on shaky ground. For all you know, by the time the choice is presented, her preferences will have reversed. Similarly for explanation: if Karen’s preferences are unstable, you might be able to say that she chose vanilla because she preferred it at that very moment, but you gain little purchase on her longitudinal preferences from such an attribution.

I’ve responded to these problems by proposing a model in which preferences are interval-valued rather than point-valued. A traditional valuation function v maps from outcomes to points. The binary preference relation is then defined in terms of these points: a is strictly preferred to b just in case v(a) > v(b), b is strictly preferred to a just in case v(a) < v(b), and the agent is indifferent as between a and b just in case v(a) = v(b).


Figure 2: a preferred to b because 1 = v(a) > 0 = v(b)


In the looser model I propose, by contrast, the valuation function maps from outcomes to closed intervals, such that a is strictly preferred to b just in case min(v(a)) > max(v(b)) and the agent is indifferent as between a and b just in case there is some overlap in the intervals assigned to a and b.


Figure 3: indifference because neither min(v(a)) > max(v(b)) nor max(v(a)) < min(v(b))



Though this model preserves the transitivity of strict preference, it does not preserve the transitivity of indifference. This, however, may be a feature rather than a bug, since ordinary preferences as exhibited in choice behavior themselves seem not to preserve the transitivity of indifference.
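The interval-valued model can be illustrated with a minimal sketch. The names `Interval`, `strictly_prefers`, and `indifferent`, and the particular numbers, are assumptions introduced here for illustration; they are not drawn from Alfano (2012). The sketch encodes strict preference as min(v(a)) > max(v(b)) and indifference as overlap of closed intervals, then exhibits a case in which indifference fails to be transitive.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """A closed interval [lo, hi] assigned to an outcome by the valuation function."""
    lo: float
    hi: float


def strictly_prefers(a: Interval, b: Interval) -> bool:
    # a is strictly preferred to b just in case min(v(a)) > max(v(b)).
    return a.lo > b.hi


def indifferent(a: Interval, b: Interval) -> bool:
    # Indifference holds when neither outcome is strictly preferred,
    # i.e. when the two closed intervals overlap (even at a single point).
    return not strictly_prefers(a, b) and not strictly_prefers(b, a)


# Hypothetical valuations, chosen only to illustrate the structure.
a = Interval(0.0, 2.0)
b = Interval(1.5, 3.5)
c = Interval(3.0, 5.0)

print(indifferent(a, b))       # a and b overlap
print(indifferent(b, c))       # b and c overlap
print(strictly_prefers(c, a))  # yet c lies wholly above a
```

With these numbers the agent is indifferent between a and b and between b and c, but strictly prefers c to a, so indifference is not transitive. Strict preference remains transitive, because if one interval lies wholly above a second and the second lies wholly above a third, the first lies wholly above the third.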


4 Philosophical implications of the indeterminacy and instability of preferences


In this section, I consider some possible philosophical implications of the indeterminacy and instability of preferences, drawing on the descriptive model outlined in the previous section. Moving from the descriptive to the normative domain is always fraught, but, as I argued in the introduction, the two need to be explored in tandem, with mutual theoretical adjustments made on each side. Moral psychology without normative structure is a baggy monster. Normative theory without empirical support is a castle in the sky.


4.1 Implications for patiency


The primary worry raised for the theory of personal well-being by the indeterminacy and instability of preferences is that, if the extent to which your life is going well depends on or is a function of the extent to which you’re getting what you want, then well-being inherits the indeterminacy and instability of preferences. In other words, there might be no fact of the matter concerning how good a life you’re living at this very moment, and if there is such a fact, it might fluctuate from moment to moment in response to seemingly trivial and normatively irrelevant situational factors.

By way of example, consider someone who is eating toast with apple cinnamon jam. Is his life as good as it would be if he were eating toast with grapefruit jam? If he is like the people in the choice blindness studies mentioned above, there might be no answer to this question. If he’s told that he prefers apple cinnamon, he will prefer the present state of affairs, but if he is told that he prefers grapefruit, he’ll be less pleased with the present state of affairs than he would be with the world in which he is eating grapefruit jam. Whether his life is better in the apple cinnamon jam-world or the grapefruit jam-world is indeterminate until his preferences crystallize in one ordering or the other.

Or consider someone who has a brand new hardbound copy of Moby Dick, for which she just paid $50 when it was marked down from $70. Is her life going better now that she has the book, or was it going better before, when she had the money? If she is like the participants in Ariely’s preference reversal study, the answer may be “yes” to both disjuncts. Before she bought the book, she preferred the money to the book. But then she anchored on the manufacturer’s suggested retail price of $70, raised her valuation of the book, and ended up preferring it to $50. Her unstable preferences mean that she was better off with the money than the book, and that she is better off with the book than the money. It’s not a contradiction, but it makes her well-being a pain in the neck to evaluate.

Fortunately, though, there is a ready response to this worry, which begins by pointing out that the indeterminacy and instability of preferences is not radical but modest, a feature captured by the descriptive model sketched above. Although there may be no fact of the matter whether the life of the consumer of cinnamon apple jam is better than the life of the consumer of grapefruit jam, there is a fact of the matter whether either of these lives is better than that of someone who, instead of eating jam, is enduring irritable bowel syndrome. Although preference orderings may fluctuate between owning a book and having $50, they do not fluctuate between owning the same book and having $50,000. These observations are consistent with the interval-valued preferences of the descriptive model outlined in the previous section. In the first example, the intervals for cinnamon apple jam and for grapefruit jam overlap with each other, but neither overlaps with the interval for irritable bowel syndrome. Thus, we can still make a whole host of judgments about the quality of various possible lives, even if, when we “zoom in,” such judgments cannot always be made. In the second example, the intervals for having $50 and having the book overlap with each other, but neither overlaps with the interval for having $50,000.

For the price of this local indeterminacy and instability, the theoretician of well-being can purchase an answer to an objection to the preference-satisfaction theory of well-being. The objection goes like this: when assessing whether it would be better to have the life of a successful lawyer or a successful artist, it seems trivial or even perverse to ask whether the artist’s life would involve slightly more ice cream, even if the agent considering what to do with her life likes ice cream. Slight preferences shouldn’t bear normative weight in this context.

However, if we assume, as seems reasonable in light of the evidence, that her preference for a little more ice cream is weak enough that it could be shifted by preference reversal or choice blindness, then its normative irrelevance is unmasked. The life of the ice cream-deprived artist and the life of the ice cream-enjoying artist are assigned nearly identical intervals on the scale of preference – intervals that differ less from each other than from that assigned to the life of the lawyer. Hence, if we are willing to put up with a little indeterminacy and instability, we can avoid more serious objections to the theory of personal well-being.


4.2 Implications for sociality


The main worry raised by the indeterminacy and instability of preferences in the context of sociality is that, if right action depends on preference-satisfaction (perhaps among other things), then it inherits the indeterminacy and instability of preferences. It might turn out that there’s just no fact of the matter what it would be right to do, or that that fact is in constant flux. This worry is perhaps most pressing for preference-utilitarians, such as Brandt and Singer (1993), but it casts a long shadow. Even if you don’t think that right action is a function of preferences and only preferences, it’s hard to deny that preferences matter at all. For instance, as I pointed out above, virtue ethicists typically countenance benevolence as an important virtue. If, as I argued in the previous section, well-being is affected by the indeterminacy and instability of preferences, then benevolence is too. And even if one thinks that benevolence is not a virtue, virtually any tolerable theory of right action is going to say that maleficence is a vice and that there is a duty – whether perfect or imperfect – of non-maleficence.

In the remainder of this section, I will concentrate on the normative implications of indeterminacy and instability for preference-utilitarianism, but it should be clear that these are just some of the more straightforward implications, and that there are others.

Before considering some responses I find attractive, I should point out that the problem we face here is not the one that is solved by distinguishing between a decision procedure and a standard of value. An objection to utilitarianism that was lodged early and often is that it’s either impossible or at least extremely computationally complex to know what would satisfy the most preferences. This knowledge could only be acquired by eliciting the preference orderings of every living person – or perhaps even every past, present, and future person. The correct response to this objection is that utilitarianism is meant to be a standard of value, not a decision procedure.[7] It identifies (if it is the correct theory of right action) what it would be right to do, but that doesn’t mean that we can use it to find out what it would be right to do every time we make a moral decision. The distinction is meant to parallel other general theories: Newtonian mechanics would have identified, if it had been the correct physical theory, what a projectile will do in any circumstances whatsoever, even if people were unable to apply the theory in a given instance.

This response is unavailable in the present context. There are two ways in which it might be impossible to know what would satisfy someone’s preferences: epistemic and metaphysical. You would be unable to know what someone wants if there was a fact of the matter about what that person wants, but you couldn’t find out what that fact is. This would be a merely epistemic problem, and the distinction between a decision procedure and a standard of value handles it nicely. But you would also be unable to know what someone wants if there simply was no fact of the matter concerning what that person wants. If I am right that preferences are indeterminate, then this is the problem we now face, and it does no good to have recourse to the distinction between a decision procedure and a standard of value.

Preference-utilitarianism is not without resources, however. As in the case of well-being, one attractive response is to point out that preferences are only modestly indeterminate and unstable. Although there may be no uniquely most-preferred outcome for a given individual (or indeed for any individual), there will be many genuinely dispreferred outcomes, and hopefully a manageably constrained subset of preferred outcomes, than which nothing is more preferred. They are all outcomes than which nothing is determinately and stably better, but there is no unique best outcome.
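The idea of a subset of outcomes “than which nothing is more preferred” can be sketched as the set of undominated outcomes under interval-valued preferences. The function name `undominated` and all of the numbers below are assumptions made for illustration; nothing here is an empirical estimate, and the jam example is borrowed from the discussion of well-being above.

```python
def undominated(valuations: dict[str, tuple[float, float]]) -> set[str]:
    """Return the outcomes to which no other outcome is strictly preferred.

    Each outcome maps to a closed interval (lo, hi). An outcome y is strictly
    preferred to x just in case min(v(y)) > max(v(x)).
    """
    return {
        x
        for x, (_, hi_x) in valuations.items()
        if not any(lo_y > hi_x for y, (lo_y, _) in valuations.items() if y != x)
    }


# Hypothetical intervals: the two jams overlap, and both lie wholly above
# the interval for enduring irritable bowel syndrome.
valuations = {
    "cinnamon apple jam": (4.5, 6.5),
    "grapefruit jam": (4.0, 6.0),
    "irritable bowel syndrome": (0.0, 1.0),
}

print(undominated(valuations))
```

On these assumed numbers the undominated set contains both jams and excludes irritable bowel syndrome: there is no unique best outcome, but there is a determinately dispreferred one.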

Furthermore, from among this subset of alternatives it might be possible to winnow out those that satisfy preferences which we have independent normative grounds to reject – preferences that are silly, ignorant, perverse, or malevolent. As I pointed out above, it’s commonly argued in the context of right action that brute preferences carry less weight than fully-informed preferences. According to those who argue in this way, whether it’s right to do something depends less on whether it would satisfy people’s actual preferences than on whether it would satisfy their fully-informed preferences. It might be hoped that idealizing preferences would cut down or even eliminate their indeterminacy and instability.

Here’s what that might look like. Suppose that Jake’s actual preferences are captured by my interval-valued model. As such, they present two problems: they fail to uniquely determine how it would be right to treat Jake, and they may even rule out the genuinely right way to treat him because his actual preferences are normatively objectionable. It might be possible to kill these two birds with the single stone of idealization if idealization leads to unique, point-valued preferences that are no longer normatively objectionable. Perhaps there is only one way that Jake’s preferences could turn out after he undergoes cognitive psychotherapy. This is a big ‘perhaps,’ but it is worth considering. What evidence we have, however, suggests that idealizing in this way would not lead to determinate, stable preferences. When Kahneman, Lichtenstein, Slovic, and Tversky began to investigate preference reversals, many economists saw the phenomenon as a threat, since it challenged some of the most fundamental assumptions of their field. Accordingly, they tried to show that preference reversals could be removed root and branch if participants were given sufficient information about the choices they were making. Years of attempts to eliminate the effect proved fruitless.[8]

The burden is then on the idealizer to say what information participants lack in the relevant experiments. What does someone who bids high on a bottle of wine after considering her SSN-truncation not know, or not know fully enough? Perhaps she should be allowed first to drink some of the wine. While Ariely et al. (2006) did not investigate whether this would eliminate the anchoring on SSN-truncation, they did conduct other experiments in which participants sampled their options and thus had the relevant information. In one, participants first listened to an annoying sound over headphones, then bid for the right not to listen to the sound again. As in the consumer goods experiment, before bidding, participants first considered whether they would pay their SSN-truncation in cents to avoid listening to the sound again. And as expected, those with higher SSN-truncations entered higher bids, while those with lower SSN-truncations entered lower bids. It’s unclear what further information they could have acquired to inform their preferences. It seems more plausible that they had too much information, not too little. If they hadn’t first considered whether to bid their SSN-truncation, they would not have anchored on it and would therefore have had “uncontaminated” preferences. But cognitive psychotherapy says to take into account “everything that might make [one] change [one’s] desires” (Brandt 1983, p. 40). Anchoring changed their desires, so it counts as part of cognitive psychotherapy. Perhaps the process can be revised by saying that one should take into account everything that might correctly or relevantly change one’s desires, but then the problem is to come up with an account of what makes an influence on one’s desires correct or relevant that doesn’t involve either a vicious regress or a vicious circle. No one has managed to do this, perhaps because it can’t be done.

Another response, which I find more attractive, is to embrace rather than reject the indeterminacy and instability of preferences. There are several ways to do this. One is to figure out which preferences are wildly indeterminate or unstable and disqualify their normative standing completely. Just as it makes sense to ignore the Rum Tum Tugger’s begging to be let inside because you know he’ll just beg to get back out again, perhaps it makes sense to hive off Jake’s indeterminate and unstable preferences, leaving a kernel of normatively respectable ones behind. Only these would matter when considering what it would be right to do by Jake, or what would promote his well-being.

A second way to embrace indeterminacy and instability is to make a less heroic assumption about the effect of cognitive psychotherapy. Instead of taking it for granted that this process is bound to converge on unique, point-valued preferences, perhaps it will merely shrink the width of Jake’s interval-valued preferences. In that case, even after idealization, there would be no unique characterization of what it would be right to do by Jake or what would most promote his well-being. As I’ve argued in the context of prediction and explanation (Alfano 2012), however, this might be a feature rather than a bug. Suppose that idealization yields a preference ordering that rules out most actions as wrong and condemns many outcomes as detrimental to Jake’s well-being, but does not adjudicate among many others. The remaining actions would then all be considered morally right in the weak sense of being permissible but not obligatory, and the remaining outcomes would all be vindicated as conducive to well-being. This strategy might help to solve the so-called demandingness problem by expanding what James Fishkin calls “the zone of indifference or permissibly free personal choice” (1982, p. 23; see also 1986). Thus, while it is possible to try to resist the evidence for indeterminacy and instability, or to acknowledge the evidence while denying its normative import, it may be better instead to embrace these features of preferences and use them to respond to existing problems.


5 Future directions in the moral psychology of preferences


Because preferences are involved in multiple ways in patiency, agency, sociality, temporality, and reflexivity, there are many avenues for further research. In this closing section, I list just a few of them.

First, further conceptual work by philosophers and theoretically-minded psychologists and behavioral economists may reveal or clarify relevant distinctions, such as a contemporary version of Mill’s distinction between higher and lower pleasures. Perhaps a useful distinction can be made between satisfaction of higher and lower preferences. According to Mill, one pleasure is higher than another if an expert who was acquainted with both would choose any amount of the former over any amount of the latter. This maps fairly directly onto the idea of lexicographic preferences: one good or value is lexicographically preferred to another if (and only if) any amount of the former would be chosen over any amount of the latter. Such values would be in principle immune to preference reversals. Jeremy Ginges and Scott Atran (2013) have found that when a value is “sacralized,” it becomes lexicographically preferred in this way. Moral values seem to be the only values that are capable of becoming sacred. However, tradeoffs have only been studied in one direction (giving up a sacred value to gain a secular value).

Second, further empirical research would help to determine whether the hiving off strategy succeeds. Is there some identifiable class of preferences that are especially susceptible to reversals and choice blindness? We currently lack sufficient evidence to say. It seems that effects may be stronger in business and gambling domains, weaker in social and health domains (Kuhberger 1998), but these distinctions are neither mutually exclusive nor exhaustive. This is yet another area in which collaboration between philosophers, who are specially trained in making this sort of distinction, and psychologists would be useful.

Third, to what extent do preference reversals and choice blindness disappear when people are informed about them? Are psychologists who know all about these effects less susceptible to them? More susceptible? The same as other people?

Fourth, are there some people who are congenitally more susceptible to preference reversals and choice blindness than others? There is very little research on this, though one study suggests that roughly a quarter of the population is highly susceptible and another quarter is immune (Bostic, Herrnstein, & Duncan 1990). Perhaps the preferences of people who are clear on what they want deserve more normative weight than the preferences of people who don’t know what they want. Perhaps the second group would benefit not so much from getting what they (think they) want (for the moment) but from having their preferences shaped in more or less subtle ways.

Finally, on a related note, perhaps public policy should sometimes aim not so much to satisfy existing preferences, but to shape people’s preferences in such a way that they are (more easily) satisfiable. The idea here is to take advantage of the instability of preferences, cultivating them in such a way that the people who have them will be most able to satisfy their own wants. If you’re not getting what you want, either change what you’re getting, or change what you want. Of course, this proposal may seem objectionably paternalistic, but I tend to agree with Richard Thaler and Cass Sunstein (2008) in thinking that in some cases such policies may be permissible. In fact, it’s a striking asymmetry that almost no one objects to the shaping of beliefs, provided they are made to accord with (what we take to be) the truth, whereas it’s hard to find someone who doesn’t object to the shaping of desires and preferences. However, I would argue that the choice we often face is not whether to mould preferences but how. Given how easily preferences are influenced, it’s highly likely that they are constantly being socially shaped without our realizing it. If this is right, existing policies already shape preferences; we just don’t know how. The choice is therefore between inadvertently influencing preferences and doing so strategically. I tend to think that society has not just a right but an obligation to help people develop appropriate preferences – a point with which feminists such as Serene Khader (2011) concur. The worry that such interventions might be objectionably paternalistic can be assuaged somewhat by insisting, as Khader does, that the very people whose preferences are the targets of policy intervention participate in designing the interventions.


[1] Preferences are causally influenced by values, but values on their own don’t do all the work (Homer & Kahle 1988).

[2] A version of this idea was first formulated by Sidgwick (1981). Rosati (1995) argues persuasively that mere information without imaginative awareness and engagement with that information is not enough.

[3] See Lichtenstein & Slovic (1971); Slovic (1995); Slovic & Lichtenstein (1968, 1983); Tversky & Kahneman (1981); Tversky, Slovic, & Kahneman (1990).

[4] See also Ariely & Norton (2008), Green et al. (1998), Hoeffler & Ariely (1999), Hoeffler et al. (2006), Johnson and Schkade (1989), and Lichtenstein and Slovic (1971).

[5] A social security number is a kind of national identification code: it associates each citizen of the United States with a unique, quasi-random number.

[6] In the United States, this would be equivalent to flipping preferences across the conservative-liberal gap; in the United Kingdom, it would be equivalent to flipping preferences across the Conservative-Labour gap.

[7] Bentham (1789/1961, p. 31), Mill (1861/1998, 26), and Sidgwick (1907, p. 413) all deal with the objection in this way.

[8] See Berg, Dickhaut, & O’Brien (1985); Pommerehne, Schneider, & Zweifel (1982); and Reilly (1982).