The Early Debates



(i) Cardinality
(ii) The Independence Axiom
(iii) Allais's Paradox and the "Fanning Out" Hypothesis


(i) Cardinality

Since the Paretian revolution (or at least since its 1930s "resurrection"), conventional, non-stochastic utility functions u: X → R are generally assumed to be ordinal, i.e. they are order-preserving indexes of preferences. By this we mean that the numerical magnitudes we give to u are irrelevant, as long as they preserve preference orderings. However, when facing the von Neumann-Morgenstern expected utility decomposition U(p) = Σ_x p(x)u(x), it is common to fall into the misleading conclusion that utility is cardinal, i.e. that utility here is a measure of preferences.

Of course, the elementary utility function u: X → R within U(p) is cardinal. Why this is so is clear enough: if expected utility is obtained by adding up probabilities multiplied by elementary utilities, then the precise measure of the elementary utilities matters very much indeed. More explicitly, if u is the elementary utility function in an expected utility representation, then it is unique up to a positive linear (affine) transformation, i.e. if v also serves as the elementary utility function for the same preferences, then there is a b > 0 and a c such that v = bu + c. This can be regarded as the "definition" of cardinality. Consequently, many early commentators - such as Paul Samuelson (1950) and William J. Baumol (1951) - condemned the expected utility construction because, with its cardinality, it seemed to turn the clock back to pre-Paretian times. For the subsequent debate and the search for clarification that ensued, see Milton Friedman and Leonard J. Savage (1948, 1952), Herman Wold (1952), Armen Alchian (1953), R.H. Strotz (1953), Daniel Ellsberg (1954), Nicholas Georgescu-Roegen (1954) and, finally, Baumol (1958).

What the early disputants did not realize, and what is extremely important to remember, is that even though the elementary utility function u is a cardinal utility measure on outcomes, the utility function over lotteries U is not a cardinal utility function. This is because, and here is the important point, the elementary utilities on outcomes are not primitives in this model; rather, as we insisted earlier, the preferences on lotteries are the primitives. The utility function over lotteries U thus logically precedes the elementary utility function: the latter is derived from the former.

Consequently, the utility function U: Δ(X) → R itself is an ordinal utility function, since any increasing transformation of U will preserve the ordering on the lotteries. In other words, if U represents preferences on lotteries, then so does V = f(U), where f is an increasing monotonic transformation of U. For example, if U(p) is a (nonnegative) representation of the preference ordering on lotteries Δ(X), then V(p) = [U(p)]^2 = [Σ_x p(x)u(x)]^2 is also a representation of that ordering. However, there is one sense in which U still carries an element of cardinality. Namely, given U(p) = Σ_x p(x)u(x), then v: X → R will generate an ordinally equivalent V(p) = Σ_x p(x)v(x) if and only if v = bu + c for b > 0 - and thus V = bU + c. Thus, in von Neumann-Morgenstern theory, we have a "cardinal utility which is ordinal", to use Baumol's (1958) felicitous phrase.
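The distinction between the ordinal U and the cardinal u can be checked numerically. The following is a minimal sketch, assuming three outcomes with made-up elementary utilities and three made-up lotteries (none of these numbers come from the text): an increasing transformation of U and a positive affine transformation of u both leave the ranking of lotteries intact, while a non-affine increasing transformation of u can change it.

```python
def expected_utility(p, u):
    """U(p) = sum over outcomes x of p(x) * u(x)."""
    return sum(prob * u[x] for x, prob in p.items())

def ranking(lotteries, score):
    """Lottery names sorted from most to least preferred under `score`."""
    return sorted(lotteries, key=lambda name: score(lotteries[name]), reverse=True)

# Hypothetical elementary utilities and lotteries, chosen only for illustration.
u = {"x1": 0.0, "x2": 6.0, "x3": 10.0}
lotteries = {
    "p": {"x1": 0.5, "x2": 0.0, "x3": 0.5},  # U = 5.0
    "q": {"x1": 0.0, "x2": 1.0, "x3": 0.0},  # U = 6.0
    "r": {"x1": 0.2, "x2": 0.2, "x3": 0.6},  # U = 7.2
}

base = ranking(lotteries, lambda p: expected_utility(p, u))

# (a) An increasing transformation of U itself keeps the ordering (U is ordinal).
#     Squaring is increasing here because every U value is nonnegative.
squared = ranking(lotteries, lambda p: expected_utility(p, u) ** 2)

# (b) A positive affine transformation v = b*u + c (b > 0) keeps the ordering.
v_affine = {x: 3.0 * ux + 2.0 for x, ux in u.items()}
affine = ranking(lotteries, lambda p: expected_utility(p, v_affine))

# (c) A non-affine increasing transformation of u need not (u is cardinal).
v_cubed = {x: ux ** 3 for x, ux in u.items()}
cubed = ranking(lotteries, lambda p: expected_utility(p, v_cubed))

print(base, squared, affine, cubed)
```

In this particular example, cubing u swaps the positions of p and q, which is the sense in which the magnitudes of u matter even though those of U do not.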

(ii) The Independence Axiom

A second, more serious issue confronted the early commentators - namely, the absence of the Independence Axiom (A.4) in the original von Neumann-Morgenstern (1944) construction. The axiom was first proposed by Jacob Marschak (1950) and, independently, Paul A. Samuelson (1952). It is now understood, following Edmond Malinvaud's (1952) demonstration, that the independence axiom was implied by the original axioms of von Neumann and Morgenstern (1944).

Almost from the outset, then, the Independence Axiom promised trouble. To understand why, it is best to be clear about its meaning and significance. For instance, it was said that the Independence Axiom ruled out "complementarity", which some economists considered an unreasonable restriction. However, it is important to understand what that means. Consider the following example. Suppose there are two travel agencies, p and q, competing for your attention. If you go to travel agency p, you get a free ticket to Paris with 100% probability; if you go to agency q, you get a free ticket to London with 100% probability. Thus, considering the set of outcomes to be X = (ticket to Paris, ticket to London), then p = (1, 0) and q = (0, 1). Suppose you prefer Paris to London; then you will choose p over q, i.e. p >h q.

To understand what the independence axiom does not say, consider the following situation. Suppose now that both travel agencies also give out a book of free vouchers for London theatres with 20% probability and give out their old prizes with 80% probability, so now p = (0.8, 0, 0.2) and q = (0, 0.8, 0.2). Now, it may seem that the independence axiom says that in this case one still prefers p to q, which seems to go against common sense, as a ticket to London and vouchers for London theatres are natural complements - and thus the introduction of these vouchers ought to change one's preference for agency p over agency q.

However, this reasoning is wrong - the independence axiom does not claim this at all. This is because the offer of London theatre vouchers is not in addition to the regular plane tickets; rather, it is offered instead of the regular tickets. Ticket to Paris, ticket to London and theatre vouchers are mutually exclusive outcomes, i.e. X = (ticket to Paris, ticket to London, theatre vouchers). If one prefers Paris to London, one will still go to travel agency p rather than q, regardless of whether or not there is a possibility of getting vouchers for London theatres in both places instead of the plane ticket one was hoping for. Thus, the independence axiom does not rule out complementarity of outcomes, but rather rules out complementarity of lotteries.
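The travel agency argument can be illustrated with a small sketch. It assumes hypothetical utility numbers (only the ordering Paris over London is taken from the text) and shows that, under expected utility, the preference for p over q survives the 20% chance of vouchers whatever the vouchers are worth, because the vouchers replace the ticket rather than accompany it.

```python
def expected_utility(probs, utils):
    """Expected utility of a simple lottery given as a probability vector."""
    return sum(pr * ut for pr, ut in zip(probs, utils))

def mix(beta, p, r):
    """Compound lottery beta*p + (1-beta)*r, reduced to a simple lottery."""
    return tuple(beta * pi + (1 - beta) * ri for pi, ri in zip(p, r))

# Outcomes: (ticket to Paris, ticket to London, theatre vouchers)
p = (1.0, 0.0, 0.0)         # agency p: Paris for sure
q = (0.0, 1.0, 0.0)         # agency q: London for sure
vouchers = (0.0, 0.0, 1.0)  # degenerate lottery paying the vouchers

p_new = mix(0.8, p, vouchers)  # approximately (0.8, 0, 0.2)
q_new = mix(0.8, q, vouchers)  # approximately (0, 0.8, 0.2)

def still_prefers_p(u_vouchers):
    # Hypothetical utilities: Paris (10) preferred to London (7).
    utils = (10.0, 7.0, u_vouchers)
    return expected_utility(p_new, utils) > expected_utility(q_new, utils)

# The voucher utility cancels out of the comparison, so its size is irrelevant.
results = [still_prefers_p(uv) for uv in (0.0, 5.0, 50.0)]
print(results)
```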

Yet there are cases when the independence axiom can be counterintuitive. Suppose we have our travel agencies again and suppose now that a new random factor comes in and there is now a 20% probability that instead of the original prize, you will get a free viewing of a new movie that is set in Paris. Thus, now our outcome space is X = (ticket to Paris, ticket to London, movie about Paris) and p = (0.8, 0, 0.2) and q = (0, 0.8, 0.2).

Now, the independence axiom says that you will still prefer going to travel agency p and trying for the Paris ticket rather than switching to agency q for the London ticket. However, it may be argued that there might be a reversal of choice in the real world. If you choose agency p and win the movie prize, you may sit through it cursing your bad luck for missing out on your possible trip and not enjoy it at all. Watching a movie about Paris when you had the possibility of actually going there (agency p) may be worse than watching the movie when the possibility was not there (agency q). In this case, one might prefer going to agency q rather than p. Thus, the sudden emergence of the possibility of the movie has led to a reversal in your choice of agencies. This sort of situation is what the independence axiom rules out.

How necessary is the independence axiom? To understand its importance, it is useful to attempt a diagrammatic representation of the von Neumann-Morgenstern theory. A diagram due to Jacob Marschak (1950) is depicted in Figure 3, where X = {x1, x2, x3} and thus Δ(X) = {(p1, p2, p3)} is the set of all probability measures on X. This set is depicted in Figure 3 as the area in the triangle, with p1 measured along the horizontal axis and p3 along the vertical axis. The corners represent the certainty cases, i.e. p1 = 1, p2 = 1 and p3 = 1. A particular lottery, p = (p1, p2, p3), is represented as a point in the triangle in Figure 3. Note that as Σ_{i=1}^{n} pi = 1 with n = 3, then p2 = 1 - (p1 + p3). The case p2 = 1 will represent the origin. Thus, a point on the horizontal axis will represent a case where p1 > 0, p2 > 0 and p3 = 0, while a point on the vertical axis will represent a case where p1 = 0, p2 > 0 and p3 > 0 and, finally, a point on the hypotenuse will represent the case where p1 > 0, p2 = 0 and p3 > 0. A point anywhere within the triangle represents the case where p1 > 0, p2 > 0, p3 > 0.


Figure 3 - Marschak Triangle

Now, suppose we begin in Figure 3 at point p = (p1, p2, p3). Obviously, as p2 = 1 - p1 - p3, horizontal movements from p to the right represent increases in p1 at the expense of p2 (while p3 remains constant), while vertical movements from p upwards represent increases in p3 at the expense of p2 (while p1 remains constant). Changes in all probabilities are represented by diagonal movements. For example, suppose that we attempt to increase p3 to p3′; in this case, either p1 or p2 or both must decline. As we see in Figure 3, p1 goes to p1′ while p2 changes by the residual amount from 1 - p1 - p3 to 1 - p1′ - p3′. Suppose, on the other hand, that beginning from p we seek to increase p2. In this case the movement will be something like p to p″, i.e. p1 declines to p1″ and p3 falls to p3″, so the residual p2″ = 1 - p1″ - p3″ is higher.

Compound lotteries are easily represented. Consider two simple lotteries in Figure 3, p = (p1, p2, p3) and p′ = (p1′, p2′, p3′). Note that we can take a convex combination of p and p′ with α ∈ [0, 1], yielding αp + (1-α)p′, as depicted in Figure 3 by the lottery on the line connecting p and p′. We thus have point αp + (1-α)p′ = (αp1 + (1-α)p1′, αp2 + (1-α)p2′, αp3 + (1-α)p3′) as our compound lottery. In Figure 3, αp1 + (1-α)p1′ = p1α and αp3 + (1-α)p3′ = p3α, thus the new simple lottery representing our compound lottery will be (p1α, 1 - p1α - p3α, p3α).
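The reduction of a compound lottery to a point in the triangle can be sketched as follows. The two lotteries and the mixing weight below are hypothetical values picked for illustration; the triangle coordinates are taken to be (p1, p3), as in the description of the diagram.

```python
def mix(alpha, p, p_prime):
    """alpha*p + (1-alpha)*p_prime, reduced to a simple lottery."""
    return tuple(alpha * a + (1 - alpha) * b for a, b in zip(p, p_prime))

def triangle_point(p):
    """Coordinates in the Marschak triangle: (p1, p3); p2 is the residual."""
    return (p[0], p[2])

# Hypothetical lotteries over (x1, x2, x3) and a hypothetical weight.
p = (0.2, 0.5, 0.3)
p_prime = (0.1, 0.3, 0.6)
alpha = 0.25

p_alpha = mix(alpha, p, p_prime)  # approximately (0.125, 0.35, 0.525)
point = triangle_point(p_alpha)

# The mixture lies on the segment joining p and p_prime: the cross product
# of (p_alpha - p) and (p_prime - p) in triangle coordinates is zero.
ax, ay = triangle_point(p)
bx, by = triangle_point(p_prime)
cross = (point[0] - ax) * (by - ay) - (point[1] - ay) * (bx - ax)

print(p_alpha, point, cross)
```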

Now, let us turn to preferences. The straight lines U, Uα and U′ are indifference curves. The arrow indicates the direction of increasing preference, so U′ > Uα > U, which implies, of course, that p′ >h αp + (1-α)p′ >h p ~ p″ are the preferences between the four lotteries in Figure 3. Obviously, given the direction of increasing preference, the extreme lottery where p3 = 1 (i.e. x3 with certainty) is the most desirable lottery, thus by implication x3 >h x2 >h x1. The reason indifference curves are straight lines comes from the linearity of the von Neumann-Morgenstern utility function which, in this case, can be written as:

U = p1u(x1) + (1-p1-p3)u(x2) + p3u(x3)

thus, totally differentiating with respect to p1 and p3 and setting the result to zero:

dU = - [u(x2) - u(x1)]dp1 + [u(x3) - u(x2)]dp3 = 0

thus, solving:

dp3/dp1|U = [u(x2) - u(x1)]/[u(x3) - u(x2)] > 0

which is the slope of the indifference curves in Figure 3. The positivity comes from the assumption that x3 >h x2 >h x1, or u(x3) > u(x2) > u(x1), thus the indifference curve in the Marschak triangle is upward sloping. Note that as neither p1 nor p3 appears in this expression, the slopes are unaffected by changes in p, and thus the indifference curves are parallel straight lines increasing in value to the northwest.
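As a quick numerical check of the slope formula, the sketch below assumes hypothetical elementary utilities with u(x3) > u(x2) > u(x1) and confirms that moving along the direction dp3/dp1 = [u(x2) - u(x1)]/[u(x3) - u(x2)] leaves U unchanged from any starting point, which is what parallel straight indifference curves require.

```python
# Hypothetical elementary utilities with u(x3) > u(x2) > u(x1).
u1, u2, u3 = 0.0, 6.0, 10.0

def U(p1, p3):
    """Expected utility in triangle coordinates, with p2 = 1 - p1 - p3."""
    return p1 * u1 + (1 - p1 - p3) * u2 + p3 * u3

# Slope of every indifference curve: does not depend on (p1, p3).
slope = (u2 - u1) / (u3 - u2)  # 1.5 with these numbers

def move_along_curve(p1, p3, dp1):
    """Shift p1 by dp1 and p3 by slope*dp1, staying on the same curve."""
    return p1 + dp1, p3 + slope * dp1

start_a = (0.2, 0.3)
start_b = (0.4, 0.1)
end_a = move_along_curve(*start_a, dp1=0.1)
end_b = move_along_curve(*start_b, dp1=0.1)

print(U(*start_a), U(*end_a))  # same utility level
print(U(*start_b), U(*end_b))  # same (different) utility level
```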

We can detect a few more things in this diagram regarding the von Neumann-Morgenstern axioms. In particular, notice the depiction of unique solvability. Consider three distributions, p, p′ and q in Figure 3. Notice that as U′ > Uα > U, then p′ >h q >h p. Then, unique solvability claims that there is an α ∈ [0, 1] such that q ~ αp + (1-α)p′. This is exactly what is depicted in Figure 3 via point pα, as U(pα) = Uα = U(q). Thus, for any point in the area between U and U′, we can make a convex combination of p and p′ such that the convex combination is equivalent in utility to that point. Also recall that the independence axiom claims that for any β ∈ (0, 1], p′ >h p if and only if βp′ + (1-β)q >h βp + (1-β)q. This is depicted in Figure 3, where βp′ + (1-β)q is given by a point on the line segment connecting p′ and q, and βp + (1-β)q is given by a point on the line segment connecting p and q. Obviously, the first of these points is preferred to the second if and only if p′ >h p.

However, Figure 3 does not really illustrate the centrality of the Independence Axiom. To see it more clearly, examine Figure 4. It is easy to demonstrate that the independence axiom requires that the indifference curves be linear and parallel. Consider lotteries p and q in Figure 4, where p ~h q as both p and q lie on the same indifference curve U. Now, by the independence axiom, it must be that βp + (1-β)q ~h βq + (1-β)q = q, so any convex combination of two equivalent points must be on the same indifference curve. This is exactly what we obtain with the linear indifference curve U in Figure 4, where r = βp + (1-β)q. However, notice that if we have instead a non-linear indifference curve that passes through both p and q, then it would not necessarily be the case that βp + (1-β)q ~h q. As we see in Figure 4, when r = βp + (1-β)q, then r >h q as r lies above the non-linear curve. Thus, non-linear indifference curves violate the independence axiom.


Figure 4 - The Independence Axiom

The independence axiom also imposes another restriction: namely, that the indifference curves are parallel to each other. To see this, examine Figure 4 again and consider two linear but non-parallel indifference curves: U and a second curve drawn through qα. Now, obviously, q ≥h p (or, more precisely, q ~h p) as they lie on the same indifference curve U. However, note that when taking the same convex combination with r, so that qα = αq + (1-α)r and pα = αp + (1-α)r, we find qα <h pα as pα lies above the non-parallel indifference curve through qα. This is a violation of the independence axiom, as q ≥h p does not imply qα = αq + (1-α)r ≥h αp + (1-α)r = pα. To restore the independence axiom, we would need the indifference curve passing through qα to be parallel to the previous curve U so that pα ~h qα. This is what we see when we impose the parallel Uα instead of the non-parallel curve.

Finally, we should note the impact of risk-aversion on this diagram (we discuss risk-aversion and measures of risk in detail elsewhere). Suppose that x3 > x2 > x1. Then, for any p, we can define a particular amount of expected return E = p1x1 + p2x2 + p3x3 = p1x1 + (1-p1-p3)x2 + p3x3. Thus a mean-preserving change in spread would maintain the same E, i.e.

0 = x1dp1 + x2dp2 + x3dp3 = x1dp1 - x2(dp1 + dp3) + x3dp3

as dp2 = -dp1 - dp3. Thus, rearranging:

(x1 - x2)dp1 + (x3 - x2)dp3 = 0

so that:


dp3/dp1|E = (x2 -x1)/(x3-x2) > 0

because of the proposed x3 > x2 > x1. Thus, we can define an expected return "curve" E in the Marschak triangle with slope (x2-x1)/(x3-x2). However, recall that the slope of the indifference curve is dp3/dp1|U = [u(x2) - u(x1)]/[u(x3) - u(x2)]. Because risk-aversion implies a concave utility function, then this means that:

dp3/dp1|U = [u(x2) - u(x1)]/[u(x3) - u(x2)] > (x2 -x1)/(x3-x2) = dp3/dp1|E

i.e. the indifference curves are steeper than the expected return curve E. Risk-neutrality would imply that the slopes are equal, while risk-loving implies that the indifference curves are flatter than E. Obviously, the greater the degree of risk-aversion, the steeper the indifference curves become. Note that a stochastically dominant shift in distribution from some initial p will be a movement to any distribution to the northwest of it. The reasoning for this, of course, is that northwesterly movements lead to increases in p3 or p2 at the expense of p1, and as x3 >h x2 >h x1, such a shift constitutes a stochastically dominant shift.
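The comparison of slopes can be made concrete with a short sketch. The monetary outcomes and the three utility functions below (square root for risk-aversion, identity for risk-neutrality, square for risk-loving) are assumptions chosen for illustration only.

```python
import math

def indifference_slope(u, x1, x2, x3):
    """dp3/dp1 along an indifference curve: [u(x2)-u(x1)] / [u(x3)-u(x2)]."""
    return (u(x2) - u(x1)) / (u(x3) - u(x2))

def iso_mean_slope(x1, x2, x3):
    """dp3/dp1 along a constant expected return line: (x2-x1) / (x3-x2)."""
    return (x2 - x1) / (x3 - x2)

# Hypothetical monetary outcomes with x3 > x2 > x1.
x1, x2, x3 = 0.0, 100.0, 500.0

slope_E = iso_mean_slope(x1, x2, x3)                            # 0.25
slope_averse = indifference_slope(math.sqrt, x1, x2, x3)        # concave u
slope_neutral = indifference_slope(lambda x: x, x1, x2, x3)     # linear u
slope_loving = indifference_slope(lambda x: x ** 2, x1, x2, x3)  # convex u

print(slope_E, slope_averse, slope_neutral, slope_loving)
```

With these assumed functions, the concave case gives a slope above that of E, the linear case matches it, and the convex case falls below it.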

In sum, of all the von Neumann-Morgenstern axioms, it appears that the independence axiom is the great workhorse that pushes the results through so smoothly. As we shall see in the next section, it is also the one that is most liable to fail empirically. As we shall see even later, some have also claimed that there is sufficient evidence of violations of the transitivity axiom. There have also been attempted assassinations of other axioms, e.g. of the Archimedean axiom by Georgescu-Roegen (1958), but there is very little empirical evidence for this.

(iii) Allais's Paradox and the "Fanning Out" Hypothesis

An early challenge to the Independence Axiom was set forth by Maurice Allais (1953) in the now-famous "Allais Paradox". To understand it, it is best to proceed via an example. Consider the quartet of distributions (p1, p2, q1, q2) depicted in Figure 5 which, when connected, form a parallelogram. These points represent the following sets of lotteries. Let outcomes be x1 = $0, x2 = $100 and x3 = $500. Let us start first with the pair of lotteries p1 and p2:

p1: $100 with certainty

p2: $0 with 1% chance, $100 with 89% chance, $500 with 10% chance,

so p1 = (0, 1, 0) (and thus is at the origin) and p2 = (0.01, 0.89, 0.10) (and thus is in the interior of the triangle). As it happens, agents usually choose p1 in this case, so p1 >h p2 or, as shown in Figure 5, there is an indifference curve Up such that p1 lies above it and p2 lies below it. In contrast, consider now the following pair of lotteries:

q1: $0 with 89% chance and $100 with 11% chance

q2: $0 with 90% chance and $500 with 10% chance

so q1 = (0.89, 0.11, 0) (and thus is on the horizontal axis) and q2 = (0.90, 0, 0.10) (and thus is on the hypotenuse). Now, if indifference curves are parallel to each other, then it should be that q1 >h q2. We can see this diagrammatically in Figure 5 by comparing Up which divides p1 from p2 and Uq which divides q1 from q2. Obviously, as q1 lies above Uq and q2 below it, then q1 >h q2.

Recall that it was the independence axiom that guaranteed this. To see this clearly for this example, we shall show that if the independence axiom is fulfilled, then it is indeed true that p1 >h p2 ⇒ q1 >h q2, i.e. there is a Uq that divides q1 from q2 in the manner of Figure 5. As p1 >h p2, then by the von Neumann-Morgenstern expected utility representation, there is some elementary utility function u such that:

u($100) > 0.1u($500) + 0.89u($100) + 0.01u($0)

But as we can decompose u($100) = 0.1u($100) + 0.89u($100) + 0.01u($100), then subtracting 0.89u($100) from both sides, this implies:

0.1u($100) + 0.01u($100) > 0.1u($500) + 0.01u($0)

But now adding 0.89u($0) to both sides:

0.1u($100) + 0.01u($100) + 0.89u($0) > 0.1u($500) + 0.01u($0) + 0.89u($0)

where, as we added the same amount to both sides, then the independence axiom claims that the inequality does not change sign. But combining the similar terms together, this means:

0.11u($100) + 0.89u($0) > 0.1u($500) + 0.90u($0)

which implies that q1 >h q2, which is what we sought.
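The algebra above amounts to saying that, for any elementary utility function, the expected utility gap between p1 and p2 equals the gap between q1 and q2. A minimal sketch checking this for a few candidate utility functions (the functions themselves are arbitrary illustrations, not taken from the text):

```python
import math

OUTCOMES = (0.0, 100.0, 500.0)  # x1 = $0, x2 = $100, x3 = $500

def expected_utility(p, u):
    """Expected utility of lottery p = (p1, p2, p3) under elementary utility u."""
    return sum(pi * u(x) for pi, x in zip(p, OUTCOMES))

p1 = (0.00, 1.00, 0.00)
p2 = (0.01, 0.89, 0.10)
q1 = (0.89, 0.11, 0.00)
q2 = (0.90, 0.00, 0.10)

# Arbitrary increasing utility functions, used only to illustrate the identity.
candidates = {
    "linear": lambda x: x,
    "sqrt": math.sqrt,
    "log": lambda x: math.log(1.0 + x),
}

gaps = {}
for name, u in candidates.items():
    gap_p = expected_utility(p1, u) - expected_utility(p2, u)
    gap_q = expected_utility(q1, u) - expected_utility(q2, u)
    gaps[name] = (gap_p, gap_q)

for name, (gap_p, gap_q) in gaps.items():
    print(name, round(gap_p, 6), round(gap_q, 6))
```

Because the two gaps coincide, an expected utility maximizer who picks p1 over p2 must pick q1 over q2, whichever u is used.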


Figure 5 - Allais's Paradoxes and the "Fanning Out" Hypothesis

However, as Maurice Allais (1953) insisted (and later experimental evidence has apparently confirmed), when confronted with these sets of lotteries, people tend to choose p1 over p2 in the first case and then choose q2 over q1 in the second case - thereby contradicting what we have just claimed. Anecdotal evidence has it that even Leonard Savage, when confronted by Allais's example, made this contradictory choice. Thus, they must be violating the independence axiom of expected utility. This is the "Allais Paradox".

It has been hypothesized that these contradictory choices imply what is called a "fanning out" of indifference curves. Specifically, assume the indifference curves are linear but not parallel, so that they "fan out" as in the sequence of dashed indifference curves (Up among them) in Figure 5. Notice that even if we let Up govern the relationship between p1 and p2 (so predicting that p1 >h p2), it is a flatter dashed curve (and not the parallel Uq) that governs the relationship between q1 and q2 - and thus, as q1 lies below this flatter curve and q2 above it, then q2 >h q1. If we can somehow allow the fanning of indifference curves in this manner, then Allais's Paradox would no longer be that paradoxical.

However, what guarantees this "fanning out"? Maurice Allais's (1953, 1979) suggestion, further developed by Ole Hagen (1972, 1979), was that the expected utility decomposition was incorrect. The utility of a particular lottery p is not U(p) = E(u; p) = Σ_{x∈X} p(x)u(x). Rather, Allais proposed that U(p) = f[E(u; p), var(u; p)], so that the utility of a lottery is not only a function of the expected utility E(u; p) but also incorporates the variance of the elementary utilities var(u; p). (Hagen incorporates the third moment as well.)
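To see that a functional depending on both the mean and the variance of elementary utilities can accommodate the Allais choices, consider the following sketch. Everything specific in it is an assumption made for illustration and is not Allais's or Hagen's own specification: the elementary utility is taken to be linear (u = dollars/100), and the functional is taken to be the mean minus a penalty c times the square root of the variance, with c = 0.325 picked by hand from the narrow range in which the pattern appears.

```python
import math

# Assumed elementary utilities for ($0, $100, $500): linear, in hundreds.
UTILS = (0.0, 1.0, 5.0)

def mean_and_variance(p):
    """E(u; p) and var(u; p) for lottery p = (p1, p2, p3)."""
    mean = sum(pi * ui for pi, ui in zip(p, UTILS))
    var = sum(pi * (ui - mean) ** 2 for pi, ui in zip(p, UTILS))
    return mean, var

def V(p, c=0.325):
    """Hypothetical functional f[E, var] = E - c*sqrt(var); c is hand-picked."""
    mean, var = mean_and_variance(p)
    return mean - c * math.sqrt(var)

p1 = (0.00, 1.00, 0.00)
p2 = (0.01, 0.89, 0.10)
q1 = (0.89, 0.11, 0.00)
q2 = (0.90, 0.00, 0.10)

prefers_p1 = V(p1) > V(p2)  # safe option chosen in the first pair
prefers_q2 = V(q2) > V(q1)  # riskier option chosen in the second pair

print(V(p1), V(p2), V(q1), V(q2))
print(prefers_p1, prefers_q2)
```

The point is only that such a functional is not linear in the probabilities, so its indifference curves in the triangle need not be parallel; other penalty weights or functional forms may not reproduce the pattern.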

Allais's "fanning out" hypothesis would also yield what Kahneman and Tversky (1979) have called the "common consequence" effect. The common consequence effect can be understood by appealing to the independence axiom which, recall, claims that if p >h q, then for any β ∈ (0, 1] and r ∈ Δ(X), βp + (1-β)r >h βq + (1-β)r. In short, the possibility of a new lottery r should not affect preferences between the old lotteries p and q. However, the common consequence effect argues that the inclusion of r will affect one's preferences between p and q. Intuitively, p and q now become "consolation prizes" if r does not happen. The short way of describing the common consequence effect, then, is that if the prize in r is great, then the agent becomes "more" risk-averse and thus modifies his preferences between p and q so that he takes less risky choices. The idea is that if r indeed offers a great prize, then one will be very disappointed if one does not get it ("cursing one's bad luck") - and the greater the prize r offered, the greater the disappointment in the case one does not get it. Intuitively, the common consequence effect argues that getting $50 as a consolation prize in a multi-million dollar lottery one has lost is probably less exhilarating than finding $50 on the street. Consequently, in order to compensate for the potential disappointment, an agent will be less willing to take on risks as an alternative - as that would only worsen the burden. In contrast, if r is not that good, then one might be more willing to take on risks.

We can see this in the context of the example for Allais's Paradox. Decomposing our expected utilities:

E(u; p1) = 0.1u($100) + 0.89u($100) + 0.01u($100)

E(u; p2) = 0.1u($500) + 0.89u($100) + 0.01u($0)

E(u; q1) = 0.1u($100) + 0.01u($100) + 0.89u($0)

E(u; q2) = 0.1u($500) + 0.01u($0) + 0.89u($0)

Notice that the "common part" between p1 and p2 is 0.89u($100) whereas the common part between q1 and q2 is 0.89u($0), thus the common prize for the p1/p2 trade-off is rather high while the common prize for the q1/q2 trade-off is rather low. Notice, also, by omitting the "common parts" that p1 is less risky than p2 and, similarly, q1 is less risky than q2. Thus, the common consequence effect would imply that in the case of the high-prize pair (p1/p2) the agent should be rather risk-averse and thus prefer the low-risk p1 to the high-risk p2, while in the low-prize case (q1/q2), the agent will not be very risk-averse at all and thus might take the riskier q2 rather than q1. Thus, p1 >h p2 and q1 <h q2, as Allais suggests, can be explained by this common consequence effect.

Another of Allais's (1953) paradoxical examples exhibits what is called a "common ratio" effect. This is also depicted in Figure 5 when we take the quartet of lotteries (s1, s2, q1, q2). Notice that the line s1s2 is parallel to q1q2, i.e. they have a "common ratio". Now, by effectively the same argument as before, we can see that by the parallel linear indifference curves Us, Uq that s1 >h s2 and q1 >h q2. However, by the "fanning out" indifference curves, we see that s1 >h s2 but q1 <h q2. The structure of a common ratio situation would be akin to the following:

s1: p chance of $X and (1-p) chance of $0

s2: p′ chance of $Y and (1-p′) chance of $0

q1: kp chance of $X and (1-kp) chance of $0

q2: kp′ chance of $Y and (1-kp′) chance of $0

where p > p′, $Y > $X > 0 and k ∈ (0, 1). Although we only have two outcomes within each lottery, in terms of Figure 5 we can pretend we have outcomes (x1, x2, x3) = ($0, $X, $Y) and for each lottery set the probability of the unavailable outcome to zero. As we see, in each case, we hug the axes and the hypotenuse in Figure 5. Notice also that p/p′ = kp/kp′ (the "common ratio"), thus the line connecting s1 and s2 is parallel to the line connecting q1 and q2. Now, the independence axiom argues that if s1 >h s2 then it should be that q1 >h q2. To see why, note that:

E(u; s1) = pu($X) + (1-p)u($0)

E(u; s2) = p′u($Y) + (1-p′)u($0)

E(u; q1) = kpu($X) + (1-kp)u($0)

E(u; q2) = kp′u($Y) + (1-kp′)u($0)

Now, the independence axiom claims that if s1 >h s2, then a convex combination of these lotteries with the same third lottery r implies ks1 + (1-k)r >h ks2 + (1-k)r, where k ∈ (0, 1). However, let r be a degenerate lottery which yields $0. In this case, the expected utility of the compound lottery ks1 + (1-k)r is:

E(ks1 + (1-k)r) = k[pu($X) + (1-p)u($0)] + (1-k)u($0)

= kpu($X) + (1-kp)u($0)

while the expected utility of the compound lottery ks2 + (1-k)r is:

E(ks2 + (1-k)r) = k[p′u($Y) + (1-p′)u($0)] + (1-k)u($0)

= kp′u($Y) + (1-kp′)u($0)

Thus, notice that actually ks1 + (1-k)r = q1 and ks2 + (1-k)r = q2. So, the independence axiom claims that if s1 >h s2, then it must be that q1 >h q2. This is shown in Figure 5 by the dividing parallel indifference curves Us and Uq. However, as experimental evidence has shown (e.g. Kahneman and Tversky, 1979), these are not the usual choices people make. Rather, people usually exhibit s1 >h s2 but then q1 <h q2. As we can see in Figure 5, the "fanning out" hypothesis would explain such contradictory choices.
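The common ratio construction can be verified numerically. In the sketch below the probabilities, prizes, scaling factor and utility function are hypothetical values chosen for illustration; outcomes are ordered ($0, $X, $Y) so that s1 and q1 sit on the horizontal axis and s2 and q2 on the hypotenuse of the triangle.

```python
import math

def mix(k, s, r):
    """Compound lottery k*s + (1-k)*r, reduced to a simple lottery."""
    return tuple(k * si + (1 - k) * ri for si, ri in zip(s, r))

def expected_utility(lottery, utils):
    return sum(pi * ui for pi, ui in zip(lottery, utils))

# Hypothetical parameters: first probability larger, $Y > $X > 0, 0 < k < 1.
p_hi, p_lo, k = 0.9, 0.8, 0.25
X, Y = 3000.0, 4000.0

# Lotteries over ($0, $X, $Y).
s1 = (1 - p_hi, p_hi, 0.0)
s2 = (1 - p_lo, 0.0, p_lo)
q1 = (1 - k * p_hi, k * p_hi, 0.0)
q2 = (1 - k * p_lo, 0.0, k * p_lo)
r = (1.0, 0.0, 0.0)  # degenerate lottery paying $0

mixed_1 = mix(k, s1, r)  # should coincide with q1
mixed_2 = mix(k, s2, r)  # should coincide with q2

# Under expected utility the q-gap is just k times the s-gap, so the
# ranking cannot flip. The square-root utility is an arbitrary choice.
utils = tuple(math.sqrt(x) for x in (0.0, X, Y))
gap_s = expected_utility(s1, utils) - expected_utility(s2, utils)
gap_q = expected_utility(q1, utils) - expected_utility(q2, utils)

print(mixed_1, q1)
print(mixed_2, q2)
print(gap_s, gap_q)
```

The observed pattern (s1 over s2 but q2 over q1) therefore cannot come from any expected utility maximizer, whatever elementary utility function is assumed.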
