
Contents

(A) Expected Utility with Univariate Payoffs

(B) Risk Aversion, Neutrality and Proclivity

(C) Arrow-Pratt Measures of Risk-Aversion

(D) Application: Portfolio Allocation and Arrow's Hypothesis

(E) Ross's Stronger Risk-Aversion Measurement

**(A) Expected Utility with Univariate Payoffs**

The von Neumann-Morgenstern expected utility
hypothesis claimed that the utility of a lottery could be written as U = ∑_{x∈supp(p)} p(x)u(x), where
we referred to U: Δ(X) → R as the expected utility function and u:
X → R as the implied elementary utility function. At the risk
of confusion with the literature, we shall refer to the utility function on outcomes, u: X
→ R, as the *elementary* utility function (what we
sometimes referred to earlier as the "Bernoulli utility
function") and reserve the term "von Neumann-Morgenstern utility
function" for U(p), the utility of a lottery.

After the axiomatization of the expected utility hypothesis by John von Neumann and Oskar Morgenstern (1944), economists immediately began
seeing the potential applications of expected utility to economic issues like portfolio
choice, insurance, etc. These applications tended to use simple models where
outcomes were expressed as a single commodity, "wealth", thus the set of
outcomes X became merely the real line, R. As a result, a "lottery" is now
conceived as a random variable z taking values in R. Consequently, preferences over
lotteries can be thought of as preferences over alternative probability distributions.
Thus, letting F_{z} denote the cumulative probability distribution associated with
random variable z, where F_{z}(x) = prob{z ≤ x}, we can think of agents making choices over different F_{z}. Accordingly, the
preferences over lotteries, ≥_{h}, are now defined
over the space of cumulative distribution functions. Thus, letting the von
Neumann-Morgenstern utility function U represent preferences over distributions,
lottery F_{z} is preferred to F_{y}, F_{z} ≥_{h} F_{y}, if and only if U(F_{z}) ≥ U(F_{y}). Consequently, the expected utility decomposition of U(F_{z}) is
now:

U(F_{z}) = ∫_{R} u(x) dF_{z}(x)

where u: R → R is the elementary utility
function over outcomes. Naturally, if z takes only a finite number of values, and thus
there are only a finite number of probabilities, then this becomes the more familiar U(F_{z}) = ∑_{x} p(x)u(x).
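The finite case is just a probability-weighted sum, which is easy to compute directly. A minimal sketch in Python - the particular payoffs, probabilities and square-root elementary utility below are illustrative assumptions, not from the text:

```python
import math

def expected_utility(payoffs, probs, u):
    """von Neumann-Morgenstern utility of a finite lottery:
    U(F_z) = sum over x of p(x) * u(x)."""
    assert abs(sum(probs) - 1.0) < 1e-12, "probabilities must sum to one"
    return sum(p * u(x) for x, p in zip(payoffs, probs))

# Illustrative lottery: wealth of 100 or 400 with equal odds,
# evaluated with a concave elementary utility u(x) = sqrt(x).
U = expected_utility([100.0, 400.0], [0.5, 0.5], math.sqrt)
print(U)   # 0.5*10 + 0.5*20 = 15.0
```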

**(B) Risk Aversion, Neutrality and Proclivity**

We first turn to the concept of univariate "risk aversion"
which, intuitively, implies that when facing choices with comparable returns, agents tend
to choose the less-risky alternative, a construction we owe largely to Milton Friedman and Leonard J. Savage (1948). We can visualize the problem as in
Figure 1 below. Let z be a random variable which can take on two values, {z_{1}, z_{2}},
and let p be the probability that z_{1} happens and (1-p) the probability that z_{2}
happens. Consequently, the expected outcome is E(z) = pz_{1} + (1-p)z_{2},
which is shown in Figure 1 on the horizontal axis as the convex combination of z_{1}
and z_{2}. Let u: R → R be the elementary utility
function depicted in Figure 1 as concave. Thus, expected utility is E(u) = pu(z_{1})
+ (1-p)u(z_{2}), as shown in Figure 1 by point E on the chord connecting A = {z_{1},
u(z_{1})} and B = {z_{2}, u(z_{2})}. The position of E on the
chord depends, of course, on the probabilities p, (1-p).

Figure 1 - Risk-Aversion and Certainty Equivalence

Notice by comparing points D and E in Figure 1 that the concavity of the
elementary utility function implies that the utility of expected income, u[E(z)] is
greater than expected utility E(u), i.e. u[pz_{1} + (1-p)z_{2}] > pu(z_{1})
+ (1-p)u(z_{2}). This represents the utility-decreasing aspects of pure
risk-bearing. We can think of it this way. Suppose there are two lotteries, one that pays
E(z) with certainty and another that pays z_{1} or z_{2} with
probabilities (p, 1-p) respectively. Reverting to our von Neumann-Morgenstern notation,
the utility of the first lottery is U(E(z)) = u(E(z)) as E(z) is received with certainty;
the utility of the second lottery is U(z_{1}, z_{2}; p, 1-p) = pu(z_{1})
+ (1-p)u(z_{2}). Now, the expected income in both lotteries is the same, yet it is
obvious that if an agent is generally averse to risk he would *prefer* E(z) with
certainty to E(z) with uncertainty, i.e. he would choose the first lottery over the
second. This is what is captured in Figure 1 as u[E(z)] > E(u).

Another way to capture this effect is by finding a "*certainty-equivalent*"
allocation. In other words, consider a third lottery which yields the income C(z) with
certainty. As is obvious from Figure 1, the utility of this allocation is equal to the
expected utility of the random prospect, i.e. u(C(z)) = E(u). Thus, lottery C(z) with
certainty is known as the certainty-equivalent lottery, i.e. the sure-thing lottery which
yields the same utility as the random lottery. However, notice that the income C(z) is *less*
than the expected income, C(z) < E(z). Yet we know that an agent would be indifferent
between receiving C(z) with certainty and E(z) with uncertainty. This difference, which we
denote π(z) = E(z) - C(z), is known as the *risk-premium*,
i.e. the maximum amount of income that an agent is willing to forego in order to obtain an
allocation without risk (Pratt, 1964).
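These objects are straightforward to compute: C(z) = u^{-1}(E(u)) and π(z) = E(z) - C(z). A small sketch under assumed numbers (a two-point lottery and u(x) = sqrt(x), whose inverse is t²):

```python
import math

# Two-point lottery: z1 with probability p, z2 with probability 1-p.
p, z1, z2 = 0.5, 100.0, 400.0
u, u_inv = math.sqrt, lambda t: t ** 2   # concave utility and its inverse

Ez = p * z1 + (1 - p) * z2               # expected income E(z) = 250
Eu = p * u(z1) + (1 - p) * u(z2)         # expected utility E(u) = 15
C = u_inv(Eu)                            # certainty equivalent C(z) = 225
premium = Ez - C                         # risk premium pi(z) = 25

print(C, premium)
# For a risk-averse agent C(z) < E(z), so the premium is positive.
```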

Turning to generalities, let u: R → R be an
elementary utility function and z a random variable with cumulative distribution function
F_{z}, so F_{z}(x) = P{z ≤ x}. We denote by M
the set of all random variables. For a particular random variable z ∈
M, the expected value is E(z) = ∫_{R} x dF_{z}(x)
and the expected utility is E(u(z)) = ∫_{R} u(x) dF_{z}(x).
Let C^{u}(z) denote the certainty-equivalent allocation, i.e. C^{u}(z) ~_{h}
z, and the risk premium be π^{u}(z) = E(z) - C^{u}(z)
- where the superscript "u" reminds us that certainty equivalence and
risk-premium are dependent on the form of the elementary utility function. Then we can
define risk-aversion as follows:

Risk-Aversion: an agent is "risk-averse" if C^{u}(z) ≤ E(z) or π^{u}(z) ≥ 0 for all z ∈ M.

This just formalizes the notion that we had in Figure 1. Of course, we can
easily visualize that if an agent is *not* risk averse, for instance if he does not care
about risk, then we should expect that receiving E(z) with certainty or uncertainty should
not matter to him, thus u(E(z)) = E(u). In terms of Figure 1, this would require that the
elementary utility function u(z) be a straight line so that points D and E coincide. It is
obvious in this case that C^{u}(z) = E(z) and π^{u}(z) = 0. Thus:

Risk-Neutral: an agent is "risk-neutral" if C^{u}(z) = E(z) or π^{u}(z) = 0 for all z ∈ M.

Finally, if we have a risk-loving agent, we should expect that he would *prefer*
receiving E(z) with *uncertainty* to receiving it with certainty, thus u(E(z)) <
E(u). In this case, his utility function would have to be one where point E lies above
point D. This will be the case if the elementary utility function u: R →
R is a convex function. It is easy to visualize that, in such a case, he would pay a
premium to take *on* the risk or, equivalently, one would have to pay *him* to
move to a certainty-equivalent allocation, thus C^{u}(z) > E(z) and π^{u}(z) < 0. Thus:

Risk-Proclivity: an agent has "risk-proclivity" (or is "risk-loving") if C^{u}(z) > E(z) or π^{u}(z) < 0 for all z ∈ M.

Now, we have appealed to the ideas of concave, linear and convex utility functions to represent risk-aversion, risk-neutrality and risk-proclivity. Consequently, let us state the following theorem:

Theorem: Let u: R → R be a monotonically increasing elementary utility function representing preferences ≥_{h} over M. Then: (i) u is concave if and only if ≥_{h} displays risk-aversion; (ii) u is convex if and only if ≥_{h} displays risk-proclivity; (iii) u is linear if and only if ≥_{h} is risk-neutral.

Proof: (i) Let u be concave. Then, by definition of concavity, u(αx + (1-α)y) ≥
αu(x) + (1-α)u(y) for all x, y ∈ R and α ∈
(0, 1). But for a two-outcome lottery, E(z) = αx + (1-α)y and
E(u) = αu(x) + (1-α)u(y). Thus,
this inequality implies u(E(z)) ≥ E(u). As by definition E(u)
= u(C^{u}(z)), then u(E(z)) ≥ u(C^{u}(z)). As
u is monotonically increasing, then E(z) ≥ C^{u}(z),
which is the definition of risk-aversion. (ii) and (iii) follow analogously.§
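The three cases of the theorem can be illustrated numerically: for the same two-point lottery, a concave u gives u(E(z)) > E(u), a linear u gives equality, and a convex u reverses the inequality. A sketch with assumed utilities and payoffs:

```python
import math

p, z1, z2 = 0.4, 4.0, 16.0
Ez = p * z1 + (1 - p) * z2   # expected outcome

def gap(u):
    """u(E(z)) - E(u): positive for risk-aversion, zero for
    risk-neutrality, negative for risk-proclivity."""
    return u(Ez) - (p * u(z1) + (1 - p) * u(z2))

print(gap(math.sqrt) > 0)                 # concave -> risk-averse
print(abs(gap(lambda x: 3 * x)) < 1e-9)   # linear  -> risk-neutral
print(gap(lambda x: x ** 2) < 0)          # convex  -> risk-loving
```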

Of course, as Milton Friedman
and Leonard J. Savage (1948) indicated, it is not
necessarily true that an individual's utility function has the same kind of curvature
everywhere: there may be levels of wealth, for instance, at which he is a risk-lover and
levels of wealth at which he is risk-averse. We can see this in the famous Friedman-Savage
double inflection utility function in Figure 2. Obviously, u(z) is concave up until
inflection point B and then becomes convex until inflection C after which it becomes
concave again. Thus, at low income levels (between the origin and z_{B}) agents
exhibit risk-averse behavior; similarly, they are also risk-averse at very high income
levels (above z_{C}). However, between the inflection points B and C, agents are
risk-loving.

Figure 2 - Friedman-Savage Double-Inflection Utility Function

Friedman and Savage (1948) tried to use this to explain why people may
take low-probability, high-payoff risks (e.g. lottery tickets) while at the same time insuring
against mild risks with mild payoffs (e.g. flight insurance). To see this, presume one is
at point B, on the inflection between risk-aversion and risk-loving. Suppose one faces two
lotteries, one yielding A or B, another yielding B or C. These lotteries are captured by
the solid-line chords between the respective payoffs AB and BC. The expected utility of the
first gamble is E(u) and is depicted in Figure 2 at point E - where,
obviously, E(u) is less than the utility of the expected outcome of the first gamble,
u(E(z)). Consequently, a risk-averse agent would pay a premium to avoid it. The second
gamble yields expected utility E(u′) - at point E′ on the BC chord - which is greater than the utility of the
expected outcome u(E(z′)). A risk-loving agent would *pay*
a premium to undertake this gamble. Thus, we can view risk-averse behavior with regard to
AB as a case of insurance against small losses and the risk-loving behavior with regard to
BC as a case of purchasing lottery tickets.

Harry Markowitz (1952), however,
disputed the Friedman-Savage conjecture that people, or at least the population in
aggregate, have such doubly inflected utility curves. Specifically, Markowitz noted that a
person at point F would also accept a gamble that might take them to F′. Conversely, a person at F′ or
slightly below it will *not* pay a premium against being taken down to F, for
instance, i.e. he will *not* take insurance against situations of huge losses with
low probability. Finally, of course, people above F′, i.e.
the *very* rich, will *never* take a fair bet - a phenomenon that does not seem
compatible with empirical phenomena such as, well, Monte Carlo casinos. What Markowitz
(1952) proposed, instead, was that the z's be considered not "income levels", as
Friedman and Savage proposed, but rather "changes in income", and he added an
additional inflection point at the bottom. People's "normal" income - whether
rich, poor or moderate, and controlling for the "utility" derived from the
recreational pleasure of gambling - would all be at a point such as B and the rest would
reflect deviations from this average income. In this manner, the apparent
lottery-insurance paradox is resolved without invoking the strange implications of the
original Friedman-Savage hypothesis.

**(C) Arrow-Pratt Measures of Risk-Aversion**

How does one measure the "degree" of risk aversion of an agent? Our first instinct may be to appeal immediately to the concavity of the elementary utility function. However, as utility functions are not unique, second derivatives of utility functions are not unique, and thus will not serve to compare the degrees of risk aversion of any pair of utility functions. However, the risk premium, expressed in terms of "wealth", might be a better magnitude. If this can then be connected to the "concavity" of utility curves - adjusted to control for non-uniqueness - so much the better. The most famous measures of risk-aversion were introduced by John W. Pratt (1964) and Kenneth J. Arrow (1965).

Let u: R → R and v: R → R be two elementary utility functions over wealth representing preferences ≥_{u} and ≥_{v} over M
respectively. Consider the following definition:

Premium Measure: ≥_{u} has more risk-aversion than ≥_{v} if π^{u}(z) ≥ π^{v}(z) for all z ∈ M.

Now consider the following theorem due to J.W. Pratt (1964):

Theorem: (Pratt) Let u, v be elementary utility functions over wealth which are continuous, monotonically increasing and twice-differentiable. Then the following are equivalent:

(1) -u″(x)/u′(x) ≥ -v″(x)/v′(x) for every x ∈ R

(2) u(x) = T(v(x)) where T is a concave function.

(3) π^{u}(z) ≥ π^{v}(z) for all z ∈ M.

Proof: We shall go (1) ⇒ (2) ⇒ (3) ⇒ (1).

(1) ⇒ (2): First we must establish that T
exists. Let v(x) = t. As v is monotonic and continuous, the inverse v^{-1}
exists and so v^{-1}(t) = x. Thus, u(x) = u(v^{-1}(t)). Let us define T =
u∘v^{-1}, so that u(x) = T(t). But recall that t = v(x),
thus u(x) = T(v(x)). Thus T exists. Now, differentiating, u′(x) = T′(v(x))v′(x). Thus:

u′(x)/v′(x) = T′(v(x)) > 0

as u′(x), v′(x) > 0 by assumption of monotonicity. Differentiating again:

u″(x) = T″(v(x))v′(x)^{2} + T′(v(x))v″(x)

or, substituting in for T′(v(x)):

u″(x) = T″(v(x))v′(x)^{2} + u′(x)v″(x)/v′(x)

Thus, dividing through by u′(x) and rearranging:

v″(x)/v′(x) - u″(x)/u′(x) = -T″(v(x))v′(x)^{2}/u′(x)

Now, by assumption v′ > 0 and u′ > 0, thus let v′(x)^{2}/u′(x) = a > 0 so:

v″(x)/v′(x) - u″(x)/u′(x) = -T″(v(x))·a

But by (1), v″(x)/v′(x) - u″(x)/u′(x) ≥ 0. Thus, as a > 0, it must be that -T″(v(x)) ≥ 0, or simply, T″(v(x)) ≤ 0. Thus, for all x ∈ R, T′ > 0 and T″ ≤ 0, thus T is concave. Q.E.D.

(2) ⇒ (3): Recall that u(C^{u}(z)) =
E(u(z)). Thus, by (2), as u(x) = T(v(x)), then u(C^{u}(z)) = E(T(v(z))). As T is
concave, then by Jensen's inequality:

E(T(v(z))) ≤ T(E(v(z)))

but as E(T(v(z))) = u(C^{u}(z)) and E(v(z)) = v(C^{v}(z))
by definition, this implies:

u(C^{u}(z)) ≤ T(v(C^{v}(z)))

or, by (2), as u(·) = T(v(·)), this becomes u(C^{u}(z)) ≤ u(C^{v}(z)). Thus, by monotonicity of u, C^{u}(z) ≤ C^{v}(z), which implies, by definition, that π^{u}(z) ≥ π^{v}(z), which is (3). Q.E.D.

(3) ⇒ (1): Let us show, equivalently, that "not (1)" ⇒ "not (3)". Thus, if "not (1)", then:

-u″(x°)/u′(x°) < -v″(x°)/v′(x°) for some x° ∈ R.

By continuity, there is a neighborhood N_{ε}(x°) on which this is true. Let z be a random variable which
takes values *only* in N_{ε}(x°). Now, on this neighborhood N_{ε}(x°), T(·) is convex. Why? Well, recall that in our earlier proof of
(1) ⇒ (2), we obtained v″(x)/v′(x) - u″(x)/u′(x) = -T″(v(x))·a. Well, for x ∈ N_{ε}(x°), -u″(x)/u′(x) < -v″(x)/v′(x), thus -T″(v(x))·a < 0 for x ∈ N_{ε}(x°), thus T″ > 0, and thus T(·) is convex. But, by the earlier step (2) ⇒ (3), we can see that T″ > 0
implies that π^{v}(z) ≥ π^{u}(z), i.e. "not (3)". Thus, "not
(1)" ⇒ "not (3)" or, equivalently, (3) ⇒ (1). Q.E.D.

A more direct alternative proof of (3) ⇒ (1)
would proceed as follows. Let z = w + ε, where w is some initial level of wealth and ε is some random variable representing pure risk. Without loss of
generality, we assume E(ε) = 0 and thus E(z) = w. By definition of the risk-premium, u(w - π^{u}(z)) = E[u(z)]. Taking a Taylor approximation on the right side (thus we are
"scaling" the risk "in the small"):

E[u(z)] = E[u(w)] + E[u′(w)ε] + E[u″(w)ε^{2}/2] + o(ε^{3}) + ...

where o denotes a negligible order of magnitude, i.e. lim_{ε→0} o(ε^{3})/ε^{3} = 0. As E[u(w)] = u(w) (as w is
certain), E[u′(w)ε] = 0 (as E(ε) = 0) and E[u″(w)ε^{2}] = u″(w)σ^{2}_{ε} (where E(ε^{2}) = σ^{2}_{ε}),
then omitting remainders:

E[u(z)] = u(w) + u″(w)σ^{2}_{ε}/2

Now, taking a Taylor approximation of u(w - π(z)), the left side of our earlier equation, we have:

u(w - π(z)) = u(w) - u′(w)π(z) + o(π^{2}) + ...

then, equating both sides and omitting remainders:

u(w) - u′(w)π(z) = u(w) + u″(w)σ^{2}_{ε}/2

or:

π(z) = -[u″(w)/u′(w)]σ^{2}_{ε}/2

As σ^{2}_{ε} > 0 and is faced by all
agents, we can see that if π^{u}(z) ≥ π^{v}(z), then -[u″(w)/u′(w)] ≥ -[v″(w)/v′(w)]. Thus, (3) ⇒ (1).
Q.E.D.

Thus, (1) ⇒ (2) ⇒ (3) ⇒ (1). §
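The small-risk approximation π(z) ≈ -[u″(w)/u′(w)]·σ²_ε/2 can be checked against the exact risk premium. A numerical sketch under assumed numbers (log utility, for which -u″(w)/u′(w) = 1/w, and a small symmetric two-point risk):

```python
import math

w, eps = 100.0, 5.0            # wealth and a small symmetric risk
# z = w - eps or w + eps with probability 1/2 each, so E(z) = w
# and the variance of the risk is eps^2.
u, u_inv = math.log, math.exp  # log utility: -u''(w)/u'(w) = 1/w

Eu = 0.5 * u(w - eps) + 0.5 * u(w + eps)
exact_premium = w - u_inv(Eu)              # pi = E(z) - C(z)
approx_premium = (1.0 / w) * eps ** 2 / 2  # (1/2) * r_u(w) * sigma^2

print(exact_premium, approx_premium)
# The two agree up to a remainder of higher order in eps.
```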

Consequently, we can note that the term -u″(x)/u′(x) is another measure of risk-aversion. Thus the term r_{u}(x) = -u″(x)/u′(x) is known as the *Arrow-Pratt Measure of Absolute Risk-Aversion*, or ARA.
Notice that r_{u}(x) > 0 if u is monotonically increasing and strictly concave
(as in Figure 1 for the risk-averse individual). Naturally, r_{u}(x) = 0 for the
risk-neutral individual with a linear utility function and r_{u}(x) < 0 for the
risk-loving individual with a strictly convex utility function.

As we can see, the Arrow-Pratt measure of absolute risk aversion cannot
capture a situation as in Figure 2, where the agent switches from risk-aversion to
risk-loving and then back to risk-aversion. Thus, an alternative would be to weight the
measure of risk aversion by the level of wealth, x. In this case we obtain the *Arrow-Pratt
Measure of Relative Risk-Aversion*, or RRA, which is defined as R_{u}(x) = x·r_{u}(x)
= -x·u″(x)/u′(x).
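Both measures are easy to evaluate by finite differences, which sidesteps analytic derivatives. A sketch - the sample utilities (log and square root) are standard illustrations chosen here, not taken from the text:

```python
import math

def ara(u, x, h=1e-3):
    """Arrow-Pratt absolute risk aversion r_u(x) = -u''(x)/u'(x),
    estimated via central finite differences."""
    du = (u(x + h) - u(x - h)) / (2 * h)
    d2u = (u(x + h) - 2 * u(x) + u(x - h)) / h ** 2
    return -d2u / du

def rra(u, x, h=1e-3):
    """Arrow-Pratt relative risk aversion R_u(x) = x * r_u(x)."""
    return x * ara(u, x, h)

# log utility: r_u(x) = 1/x (declining in x), R_u(x) = 1 (constant)
print(ara(math.log, 50.0))    # ~ 0.02
print(rra(math.log, 50.0))    # ~ 1.0
# sqrt utility: r_u(x) = 1/(2x), R_u(x) = 1/2
print(rra(math.sqrt, 50.0))   # ~ 0.5
```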

Of course, even without inflections, increases and decreases in wealth can change the degree of risk-aversion. Specifically, recall that risk-aversion is defined via the risk premium. Thus, let us define the following, where w is wealth and x is a risk with E(x) = 0:

Decreasing Absolute Risk Aversion: u displays decreasing absolute risk-aversion (DARA) if π^{u}(w, x) > π^{u}(w + a, x) for all a > 0.

Constant absolute risk aversion (CARA) and increasing absolute risk aversion (IARA) are defined analogously by replacing the appropriate inequalities. Let us now consider the following:

Theorem: The following three conditions are equivalent:

(i) u(w) displays decreasing absolute risk aversion (DARA);

(ii) u(w) is a concave transformation in the level of wealth: u(w) = T_{a}(u(w+a)), where T′_{a} > 0 and T″_{a} < 0 for all w and all given a > 0;

(iii) -u″(w+a)/u′(w+a) ≤ -u″(w)/u′(w) for all w and a > 0.

Proof: Let v_{a}(w) = u(w+a) and the rest follows by the Pratt
theorem. Note that DARA is equivalent to d(-u″(w)/u′(w))/dw < 0.§

We can also have changing rates of relative risk aversion. The definition in this case is as follows:

Decreasing Relative Risk Aversion: u displays decreasing relative risk aversion (DRRA) if d(-w·u″(w)/u′(w))/dw ≤ 0.

Of course, constant relative risk aversion (CRRA) and increasing relative risk aversion (IRRA) follow analogously. Consider the following famous examples of functions with different degrees of risk-aversion:

*Example*: (DARA/CRRA): u(x) = x^{α} where α ∈ (0, 1). This displays decreasing
absolute risk aversion (DARA) and constant relative risk-aversion (CRRA). To see this,
note that u′(x) = αx^{α-1} and u″(x) = α(α-1)x^{α-2}, thus r_{u}(x) = -α(α-1)x^{α-2}/αx^{α-1} = (1-α)/x > 0, so there is absolute
risk-aversion. Then dr_{u}(x)/dx = -(1-α)/x^{2}
< 0 as α < 1, thus we have declining absolute risk
aversion. In contrast, R_{u}(x) = -xα(α-1)x^{α-2}/αx^{α-1} = 1-α > 0, thus dR_{u}(x)/dx = 0, so there is constant relative
risk aversion. Notice that the famous CRRA utility function used in macroeconomic
consumption theory, u(c) = c^{1-ρ}/(1-ρ) where ρ ∈
(0, 1), is merely a special case of this (with α = 1-ρ) which yields R_{u}(c) = ρ; thus ρ is the "coefficient"
of relative risk-aversion.

*Example*: (CARA/IRRA): u(x) = -e^{-x}. This displays constant
absolute risk aversion and increasing relative risk aversion. Notice that u′(x) = e^{-x} and u″(x) = -e^{-x}. Thus, r_{u}(x) = e^{-x}/e^{-x}
= 1 > 0 and dr_{u}/dx = 0, thus there is constant
absolute risk-aversion. Similarly, R_{u}(x) = x·r_{u}(x) = x, so dR_{u}/dx = 1 >
0 and there is increasing relative risk-aversion. Again, the famous CARA utility function
used in macroeconomics, u(c) = -(1/a)e^{-ac} where a > 0, is also a special
case of this. Obviously, r_{u}(c) = a, thus a is also known as the coefficient of absolute risk-aversion.

*Example*: Quadratic utility function u(x) = α
+ βx - γx^{2} where, for
concavity, γ > 0. Notice that now we have u′(x) = β - 2γx and u″(x) = -2γ. Thus r_{u}(x) = 2γ/(β - 2γx). Notice that for r_{u}(x) ≥ 0 we need β ≥
2γx, thus it only applies for a limited range of x. Notice
that dr_{u}(x)/dx ≥ 0 *up* to where x = β/2γ. Beyond that, marginal utility is
negative - i.e. beyond this level of wealth, utility *declines*. Notice another
implication: namely, that the unwillingness to take risks *increases* as wealth
increases, i.e. richer people are more unwilling to take risks. Thus, the quadratic
utility function exhibits IARA. John Hicks (1962) and Kenneth Arrow (1965) have assaulted
the quadratic utility function on this basis.

**(D) Application: Portfolio Allocation and Arrow's
Hypothesis**

Kenneth J. Arrow (1965)
hypothesized that individuals ought to display decreasing absolute risk aversion (DARA)
and increasing relative risk aversion (IRRA) with respect to wealth, i.e. dr_{u}(x)/dx
≤ 0 and dR_{u}(x)/dx ≥ 0.
The reasoning for DARA is, recall, to argue that wealthy individuals are not more
risk-averse than poorer ones with regard to the *same* risk. Thus, as Arrow notes,
DARA is necessary if risky assets are to be "normal goods", i.e. if a rise in
wealth is to lead to an increase in demand for them - whereas IARA implies they are an inferior
good. The reasoning for hypothesizing IRRA is that as wealth increases *and* the size
of the risk increases, the willingness to accept the risk should decline.
Alternatively stated, IRRA implies that the wealth elasticity of demand for risky assets
is less than unity. We will verify Arrow's hypothesis below.

The basic problem can be set out as a portfolio allocation problem, similar in spirit to Harry Markowitz (1952, 1958) and James Tobin (1958). Let w be initial wealth. Suppose there are two assets: a risk-free asset with zero return and a risky asset whose rate of return x is a random variable with positive mean, i.e. E(x) > 0. Thus, letting α ∈ [0, 1] be the proportion of initial wealth invested in the risky asset, the expected return on a portfolio of size w is E(w + αwx). Obviously, the higher α, the higher the expected return on the portfolio.

The portfolio allocation decision is as follows: choose α ∈ [0, 1] such that the expected utility
of the portfolio, E[u(w + αwx)], is maximized. To visualize the
problem, we can examine it geometrically in the space of random variables in Figure 3.
Suppose the random variable x, the return on the risky asset, takes only two values, x =
{x_{1}, x_{2}} where x_{1} > 0 > x_{2}, where the
probability of x_{1} occurring is p and of x_{2} is (1-p), and we assume
E(x) = px_{1} + (1-p)x_{2} > 0. In Figure 3, the horizontal axis
measures wealth in state 1 and the vertical axis measures wealth in state 2. Now, initial
wealth is w, which remains the same regardless of which state occurs. This is shown in
Figure 3 by point W which lies on the 45° "certainty
line": whether state 1 or state 2 happens, wealth remains w. This is equivalent to
the situation where all wealth is held in the form of the riskless asset, i.e. α = 0. If, in contrast, all wealth is held in the form of the risky
asset, so α = 1, then the agent receives wealth w + wx_{1}
in state 1 and w + wx_{2} in state 2, which is represented as point W+X in Figure 3
and which is, of course, off the certainty line. Thus, if 0 < α
< 1, then the wealth allocation is represented somewhere on the chord between W and W+X,
such as at point α* where we have an allocation which yields
wealth w + α*wx_{1} in state 1 and w + α*wx_{2} in state 2. Increasing α moves us away from
point W and towards point W+X.


Figure 3 - Portfolio Allocation Decision

Now, expected utility is:

E[u(w+αwx)] = pu(w + αwx_{1}) + (1-p)u(w + αwx_{2})

If α = α*, then this is represented by the indifference curve U(w+α*wx) which passes through the point α* in Figure 3. If all wealth is held in the form of the riskless asset, α = 0, then E(u) = u(w), which is represented by the U(w) indifference curve passing through W. Totally differentiating, we see that the slope of the indifference curves in Figure 3 is:

dx_{2}/dx_{1}|_{U} = -[p/(1-p)]·[u′(w+αwx_{1})/u′(w+αwx_{2})]

Now, when the income in both states is the same, then u′(w+αwx_{1}) = u′(w+αwx_{2}), thus
on the 45° line the slope of the indifference curve is
-p/(1-p). This is shown in Figure 3 by the dashed line with slope -p/(1-p) lying tangent
to U(w). Notice that the slope of U(w+α*wx) is *also*
-p/(1-p) on the 45° line, as shown by point A.

An agent is thus faced with an indifference curve map on the space of random variables depicted in Figure 3. Maximizing expected utility, then, we can see that the agent will choose α* where the highest indifference curve is tangent to the line between W and W+X. This is his optimal allocation of wealth between risky and riskless assets. Notice that the first order condition for a maximum implies:

dE[u(w+αwx)]/dα = E[u′(w+αwx)·wx] = 0

Notice that if the agent places everything in riskless assets, so α* = 0, then this implies E[u′(w)·wx]
= 0 = u′(w)wE(x) (as w is certain). But as E(x) > 0 by
assumption, this cannot hold. Thus, by implication, it cannot be that a risk-averse
agent will take a completely riskless position, i.e. α* ≠ 0. As long as the risky asset's expected return is positive (i.e. better than actuarially fair), then no matter how
risk-averse, an agent will always take *some* positive amount of the risky asset.
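This implication of the first-order condition - that a better-than-fair risky asset is never entirely avoided - can be checked by a brute-force grid search over α. A sketch under assumed numbers (log utility and a two-state return with positive mean, both illustrative choices):

```python
import math

def best_share(u, w, x1, x2, p, steps=10000):
    """Grid-search the share a in [0,1] maximizing the expected utility
    E[u(w + a*w*x)] = p*u(w + a*w*x1) + (1-p)*u(w + a*w*x2)."""
    best_a, best_eu = 0.0, -float("inf")
    for i in range(steps + 1):
        a = i / steps
        eu = p * u(w + a * w * x1) + (1 - p) * u(w + a * w * x2)
        if eu > best_eu:
            best_a, best_eu = a, eu
    return best_a

# Risky asset returns +20% or -15% with equal odds: E(x) = 0.025 > 0.
a_star = best_share(math.log, 100.0, 0.20, -0.15, 0.5)
print(a_star)   # strictly positive: some of the risky asset is held
```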

How might risk-aversion affect portfolio allocation decisions? Recall that
the Arrow-Pratt measure of risk-aversion implies there is a relationship between the
degree of concavity of the utility function and the degree of risk-aversion. In the space
of random variables, this implies there is a relationship between the degree of *convexity*
of the indifference curves and the degree of risk-aversion - with *more* risk-averse
agents having *more* convex indifference curves. We see this in Figure 4 below, where
we have two sets of indifference curves - for agent U (solid lines) and agent V (dashed
lines). Notice from the (badly drawn) diagram that the indifference curves of agent U are *more*
convex than those of agent V, thus we would argue that U is *more* risk-averse than
V.

Figure 4 - Portfolio Allocation and Risk-Aversion

Another implication can be immediately drawn from Figure 4. Notice that,
facing the same portfolio decision, agent V chooses optimal portfolio allocation α^{v} whereas agent U chooses the optimal portfolio
allocation α^{u} where, obviously, α^{u} < α^{v}. This seems to arise because
of the relative convexity of the indifference curves. Does this imply that, in general, *more*
risk-averse agents will take a *smaller* proportion of risky assets in their
portfolio? Let us see:

Theorem: Let u, v be twice-differentiable elementary utility functions. If u is more risk-averse than v, then α^{u} ≤ α^{v} for the same initial wealth, w.

Proof: Suppose both agents have the same initial wealth, w. Thus, the agents maximize E[u(w + αwx)] and E[v(w + αwx)] respectively. For v, the first order condition for a maximum is that:

∂E[·]/∂α = E[v′(w+αwx)·wx] = 0

while the second order condition is:

∂^{2}E[·]/∂α^{2} = E[v″(w+αwx)·w^{2}x^{2}] ≤ 0

Let α^{v} be the optimal portfolio
allocation for agent v. What we are going to do can be illustrated by Figure 5, where we
have the expected utility functions of agents u and v over the range of different
portfolio allocations, α. Notice that α^{v} yields the maximum expected utility for agent v. Now, suppose agent u is *forced*
to hold agent v's optimal portfolio allocation, α^{v};
then agent u's expected utility will be at point A in Figure 5. *If* agent u's
expected utility is declining at that level, we know that he could improve his expected
utility by decreasing the amount of the risky asset he holds, i.e. reducing α. Thus, the implication is that agent u's optimal portfolio, α^{u}, must lie below agent v's, α^{v}, as we see depicted in Figure 5.

Figure 5 - Expected utilities over different portfolios.

Let us proceed with the proof. Recall from the Pratt
theorem that if u is more risk-averse than v, then there is a concave T such that u =
T(v). Now, *if* agent u holds agent v's portfolio, α^{v},
then u has expected utility E[u(w+α^{v}wx)] = E[T(v(w+α^{v}wx))]. Thus, differentiating with respect to α:

∂E[·]/∂α|_{α=α^{v}} = E[T′(v(w+α^{v}wx))·v′(w+α^{v}wx)·wx]

Let us now conduct a little plastic surgery. Effectively, we are going to
subtract zero from this equation. Recall, from the first order condition of agent v, that E[v′(w+α^{v}wx)·wx] = 0. Thus,
pre-multiplying this by T′(v(w)) implies that still E[T′(v(w))·v′(w+α^{v}wx)·wx] = 0 (we can place it within the expectations operator because T′(v(w)) is *not* random). Thus, let us subtract this from the
term in the above equation for agent u:

∂E[·]/∂α|_{α=α^{v}} = E[T′(v(w+α^{v}wx))·v′(w+α^{v}wx)·wx] - E[T′(v(w))·v′(w+α^{v}wx)·wx]

where, because the subtracted term has value zero, there is no change. But then, combining terms, we obtain:

∂E[·]/∂α|_{α=α^{v}} = E[{T′(v(w+α^{v}wx)) - T′(v(w))}·v′(w+α^{v}wx)·wx]

Now examine this closely: if x > 0, then w + α^{v}wx > w, which implies that v(w+α^{v}wx)
> v(w), which implies, in turn, that T′(v(w+α^{v}wx)) < T′(v(w)) by the
concavity of T. Thus, {T′(v(w+α^{v}wx)) - T′(v(w))} < 0 while wx > 0, so the term inside the expectation is negative; if x < 0, the brace is positive and wx negative, so the term is again negative. Hence ∂E[·]/∂α|_{α=α^{v}} ≤ 0.
In other words, as shown in Figure 5, the slope of agent u's expected utility *at* α = α^{v} is negative, thus agent
u's expected utility E[u(w + αwx)] is decreasing
if he is forced to hold agent v's optimal portfolio allocation. If so, then the
implication is that his own optimal allocation lies *below* agent v's optimal
portfolio allocation, i.e. α^{u} ≤ α^{v}. Thus, the more risk-averse agent holds a
smaller proportion of his wealth in the risky asset.§
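The theorem can be illustrated with two CRRA agents: the one with the higher coefficient of relative risk aversion (hence uniformly more risk-averse) should choose a smaller α. A sketch reusing a grid search, under assumed numbers (the coefficients 0.9 and 0.5 and the two-state return are illustrative):

```python
def crra(rho):
    """CRRA elementary utility u(c) = c^(1-rho)/(1-rho), rho in (0,1)."""
    return lambda c: c ** (1 - rho) / (1 - rho)

def best_share(u, w, x1, x2, p, steps=10000):
    """Grid-search a in [0,1] maximizing p*u(w+a*w*x1) + (1-p)*u(w+a*w*x2)."""
    return max((i / steps for i in range(steps + 1)),
               key=lambda a: p * u(w + a * w * x1) + (1 - p) * u(w + a * w * x2))

# Same two-state risky asset for both agents, with E(x) = 0.01 > 0;
# agent U (rho = 0.9) is more risk-averse than agent V (rho = 0.5).
a_u = best_share(crra(0.9), 100.0, 0.20, -0.18, 0.5)
a_v = best_share(crra(0.5), 100.0, 0.20, -0.18, 0.5)
print(a_u, a_v)   # a_u < a_v: more risk-aversion, smaller risky share
```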

Let us now turn to verifying Arrow's (1965) hypothesis, namely that if we have decreasing absolute risk aversion (DARA) and increasing relative risk aversion (IRRA) with respect to wealth, then optimal holdings of risky assets increase with wealth but the proportion of wealth invested in risky assets declines.

Theorem: (Arrow's Hypothesis) Suppose u is a twice-differentiable elementary utility function which exhibits decreasing absolute risk aversion and increasing relative risk aversion. Suppose we are faced with the two-asset portfolio allocation problem outlined above. Then the optimal holding of the risky asset increases when wealth increases, but less than proportionally to the increase in wealth (i.e. the optimal share α declines).

Proof: Recall that the agent maximizes E[u(w+αwx)]. Thus, the first order condition is that ∂E[·]/∂α = E[u′(w+αwx)·wx] = 0 while the second order condition is ∂²E[·]/∂α² = E[u″(w+αwx)·w²x²] ≤ 0. Let α* be the solution to this problem. Letting FOC denote "first order condition" and SOC denote "second order condition", then totally differentiating the FOC:

(dFOC/dw)dw + (dFOC/dα*)dα* = 0

but as dFOC/dα* = SOC, then this reduces simply to:

dα*/dw = -(dFOC/dw)/SOC

As we know, SOC ≤ 0. What about the numerator?
Now, dFOC/dw = E[u″(w+α*wx)·wx(1+α*x)] + E[u′(w+α*wx)·x]. The second term falls
out because the first order condition E[u′(·)·wx] = 0 implies E[u′(·)·x] = 0 (as w > 0), thus we know that the sign of dα*/dw will be the same as the sign of E[u″(w+α*wx)·wx(1+α*x)]. Let w_{0} = w(1+α*x),
then:

E[u″(w+α*wx)·wx(1+α*x)] = E[u″(w_{0})·xw_{0}]

thus:

sgn dα*/dw = sgn E[u″(w_{0})·xw_{0}]

Now, multiplying and dividing E[u″(w_{0})·xw_{0}] by u′(w_{0}):

E[u″(w_{0})·xw_{0}] = E[[u″(w_{0})/u′(w_{0})]·xw_{0}u′(w_{0})]

Now, let us conduct some more plastic surgery. Recall that the FOC states
that E[u′(w+α*wx)·wx] = E[u′(w_{0})·wx] = 0. Thus, multiplying by the constant u″(w)/u′(w), the FOC says that E[[u″(w)/u′(w)]u′(w_{0})·wx] = 0. Thus, subtracting this from our equation above:

E[u″(w_{0})·xw_{0}] = E[[u″(w_{0})/u′(w_{0})]·xw_{0}u′(w_{0})] - E[[u″(w)/u′(w)]u′(w_{0})·wx]

which does not change the value because the subtracted term is zero. Thus, rearranging:

E[u″(w_{0})·xw_{0}] = E[{[u″(w_{0})/u′(w_{0})]·w_{0} - [u″(w)/u′(w)]·w}u′(w_{0})x]

Note that the term [u″(w_{0})/u′(w_{0})]·w_{0} is
effectively the (negative of the) rate of relative risk aversion at w_{0} = (1+α*x)w, while the term [u″(w)/u′(w)]·w is effectively the (negative of the) rate of
relative risk aversion at w. Now, if x > 0, then w_{0} > w. Consequently, by
Arrow's hypothesis of increasing relative risk aversion (IRRA), we should expect [u″(w_{0})/u′(w_{0})]·w_{0} < [u″(w)/u′(w)]·w. Or, in other words, and recalling that x > 0:

E[u″(w_{0})·xw_{0}] = E[{[u″(w_{0})/u′(w_{0})]·w_{0} - [u″(w)/u′(w)]·w}u′(w_{0})x] < 0.

If we now let x < 0, then w_{0} < w, so the same logic works
itself backwards and we now have [u″(w_{0})/u′(w_{0})]·w_{0} > [u″(w)/u′(w)]·w, but as x < 0, we obtain the same result that the whole term is negative.
As a result, as sgn E[u″(w_{0})·xw_{0}] = sgn dα*/dw, then dα*/dw < 0.
Thus, as wealth increases, the proportion of wealth held in risky assets declines - which
is exactly Arrow's hypothesis.

For the rest of the hypothesis, i.e. DARA, let us proceed as follows. Let
w_{1} < w_{2} and define u_{1}(z) = u(w_{1} + z) and u_{2}(z) = u(w_{2} + z). Thus, by DARA, u_{1}(·) is a concave transformation
of u_{2}(·). As a result, we can think of u_{1}(·) as the elementary
utility function of a more risk-averse agent and u_{2}(·) as that of a less
risk-averse agent. By our previous theorem, we saw that a more risk-averse agent will
invest a smaller absolute amount in the risky asset than the less risk-averse one. Thus, u_{1}
will invest less than u_{2}, but as w_{2} > w_{1}, this implies
that the amount invested in risky assets increases as wealth increases. Thus, both of
Arrow's hypotheses are confirmed.■
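Both halves of the argument can be illustrated numerically. The sketch below is an assumption-laden illustration (the marginal utility u′(w) = e^{-w}/w and the two-point asset are invented for the example, not taken from Arrow): this u′ implies A(w) = 1 + 1/w, which is decreasing (DARA), and R(w) = 1 + w, which is increasing (IRRA), and we solve the first order condition E[u′(w(1+αx))·wx] = 0 by bisection at several wealth levels.

```python
import math

def mu(w):
    # marginal utility u'(w) = exp(-w)/w, so A(w) = 1 + 1/w (decreasing: DARA)
    # and R(w) = w*A(w) = 1 + w (increasing: IRRA), as Arrow's hypothesis requires
    return math.exp(-w) / w

# hypothetical risky asset: return x = +0.5 or -0.2 with equal probability (E[x] > 0)
X = [(0.5, 0.5), (0.5, -0.2)]

def foc(alpha, w):
    # dE[u(w + alpha*w*x)]/d(alpha) = E[u'(w(1 + alpha*x)) * w * x]
    return sum(p * mu(w * (1 + alpha * x)) * w * x for p, x in X)

def optimal_alpha(w, lo=1e-9, hi=1.0):
    # for these parameters the FOC is positive near alpha = 0 and negative at
    # alpha = 1, so the root can be found by bisection
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if foc(mid, w) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

wealths = [1.0, 2.0, 3.0]
alphas = [optimal_alpha(w) for w in wealths]
amounts = [w * a for w, a in zip(wealths, alphas)]
print(alphas)   # proportion in the risky asset: decreasing in wealth (IRRA)
print(amounts)  # amount in the risky asset: increasing in wealth (DARA)
```

The printed proportions fall with wealth while the printed amounts rise, as the theorem predicts.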

**(E) Ross's Stronger Risk Aversion Measurement**

The implicit assumption in the previous section on portfolio allocation
was the existence of a riskless asset. Suppose now that we have *two* risky assets
and no riskless asset. Let α be the proportion of wealth held
in the form of the riskier asset and thus (1-α) is the
proportion of wealth held in the form of the less risky (but not riskless) asset. Do we
still obtain the conclusion that if u is more risk-averse than v, then α^{u} ≤ α^{v}?
Stephen A. Ross (1981) has demonstrated that this is
no longer true. To see this, consider his example:

*Ross's Counterexample*: Let there be two assets, x and y. Let y pay 1
or 0 with equal probability, i.e. 0.5 each. In contrast, x is much riskier: it pays 3, 2,
0 or -1, each with probability 0.25. Now, we can decompose x into a combination of
y and another random variable z. Specifically, letting z = x - y, then x = y + z. Thus, we
can think of x as being "riskier" than y: we take the payoffs of y and add *another*
stochastic component z to the already risky payoffs of y. In order for this to be true,
z has to pay 2 or -1 with probability 0.5 each, so E(z|y) > 0. We can visualize
this decomposition of x heuristically in Figure 6 below where, note, the compound lottery
y+z *is* x. Thus, x is merely y's payoffs plus some more "noise" from z -
this is what we mean by x being "riskier" than y (see our section on the
definition of "riskiness").

Figure 6- Random variable x "riskier" than y by addition of z

Let α be the proportion of wealth w invested in
asset x and (1-α) the proportion invested in asset y, so w_{0} = w(αx + (1-α)y) = w(y + α(x-y)) or, normalizing w = 1, then w_{0} = y + α(x-y). As z = x-y, then w_{0} = y + αz, thus we need only consider the two random variables y and z. We are assuming that z and y
are stochastically independent. Notice that when α = 1, the final compound payoff y + αz is merely x. If α = 0, then we are reduced merely to y.

Now, expected utility for the less risk-averse agent v is E[v(αx + (1-α)y)] = E[v(y + αz)], thus differentiating with respect to α, we obtain the first-order condition for a maximum for agent v:

∂E[v(y + αz)]/∂α = E[v′(y + αz)·z] = 0

or, substituting our numbers:

E[v′(y + αz)·z] = 0.25v′(1+2α^{v})·2 + 0.25v′(1-α^{v})·(-1) + 0.25v′(0+2α^{v})·2 + 0.25v′(0-α^{v})·(-1) = 0.

Assume that v is such that, optimally, α^{v} = 0.25; then:

E[v′(y + αz)·z] = 0.5v′(1.5) - 0.25v′(0.75) + 0.5v′(0.5) - 0.25v′(-0.25) = 0.

Assume also, for the sake of argument, that v is such that we attain the following numbers: v′(1.5) = 0, v′(0.75) = 2, v′(0.5) = 3, v′(-0.25) = 4, which satisfy diminishing marginal utility of wealth. Notice that these numbers fulfill the first order condition above, i.e.:

E[v′(y + αz)·z] = 0.5(0) - 0.25(2) + 0.5(3) - 0.25(4) = 0.

The question now turns to u. As u is more risk-averse than v, then u =
T(v) where T is an increasing concave function. Thereby, forcing u to hold the portfolio α^{v}, we have:

E[u′(y + α^{v}z)·z] = E[T′(v(y+α^{v}z))·v′(y+α^{v}z)·z]

= 0.5T′(v(1.5))·v′(1.5) - 0.25T′(v(0.75))·v′(0.75) + 0.5T′(v(0.5))·v′(0.5) - 0.25T′(v(-0.25))·v′(-0.25)

= T′(v(1.5))·0 + T′(v(0.75))·(-0.5) + T′(v(0.5))·(1.5) + T′(v(-0.25))·(-1)

But as T is concave, T′ is decreasing in v (and hence in wealth), so it is perfectly legitimate to assign the following numbers: T′(v(1.5)) = 1, T′(v(0.75)) = 1, T′(v(0.5)) = 10 and T′(v(-0.25)) = 10. Thus:

E[u′(y + α^{v}z)·z] = 1·0 + 1·(-0.5) + 10·(1.5) + 10·(-1)

= 0 - 0.5 + 15 - 10 = 4.5 > 0.

This is the slope of the expected utility function of agent u when
forced to hold the portfolio of agent v. As the slope of u's expected utility at α^{v} is *positive*, this implies that his own
optimal allocation α^{u} is at a higher level. Thus, α^{u} > α^{v}.
However, we posited that u was *more* risk-averse than v. Thus, from this example, it
does *not* follow that if u is more risk-averse than v, then α^{u} ≤ α^{v};
rather, a *more* risk-averse agent can take a *riskier* position.
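The arithmetic of Ross's counterexample is easy to verify mechanically. The script below simply encodes the four equally likely (y, z) states together with the marginal-utility and T′ numbers assumed in the text, then recomputes both expectations exactly:

```python
from fractions import Fraction as F

# four equally likely states (y, z); with alpha_v = 1/4 wealth is w0 = y + alpha_v*z
alpha_v = F(1, 4)
states = [(F(1), F(2)), (F(1), F(-1)), (F(0), F(2)), (F(0), F(-1))]

# marginal utilities v'(w0) assumed in the text (decreasing in wealth)
v_prime = {F(3, 2): F(0), F(3, 4): F(2), F(1, 2): F(3), F(-1, 4): F(4)}
# assumed values of T'(v(w0)), keyed by w0; T concave => T' decreasing in wealth
T_prime = {F(3, 2): F(1), F(3, 4): F(1), F(1, 2): F(10), F(-1, 4): F(10)}

# v's first order condition: E[v'(y + alpha_v*z) * z] = 0
foc_v = sum(F(1, 4) * v_prime[y + alpha_v * z] * z for y, z in states)

# u = T(v) forced to hold alpha_v: E[T'(v(w0)) * v'(w0) * z]
foc_u = sum(F(1, 4) * T_prime[y + alpha_v * z] * v_prime[y + alpha_v * z] * z
            for y, z in states)

print(foc_v)  # 0
print(foc_u)  # 9/2 (> 0, so the more risk-averse u wants a *larger* alpha)
```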

What exactly is going on in this counterexample? With perfectly legitimate
values for v and T, this example contradicts the Arrow-Pratt hypothesis
and has the more risk-averse agent choosing the riskier position. Notice that if α = 0, then we have merely the risk of y to contend with and the *extra*
risk of random variable z is avoided; if α = 1, then we take
the full risk of x. So the question that imposes itself now is the following: *given*
that the agent cannot avoid the risk of y, how much is he willing to pay to avoid the
additional risk of adding z to that? In other words, we want to compute the risk premium
of the extra risk of z.

To understand the essence of the problem, examine the somewhat cluttered
Figure 7. Suppose y is an asset which pays y_{1} = 1 or y_{2} = 0, each
with some probability (p, 1-p). In Figure 7, the risky asset y is represented by the chord
connecting points A and B on the utility curve, and E[u(y)] is the expected utility of asset
y. Suppose now that we add a little more risk, as we did when we added z to y
to obtain x. However, let us have it that y_{2} = 0 remains the same, but that y_{1} = 1 now has a *little* extra noise: thus let us add z_{1} = ε and z_{2} = -ε to y_{1}
alone. Assume z_{1} has probability q = 0.5 and z_{2} has probability
(1-q) = 0.5, so that E(z) = qz_{1} + (1-q)z_{2} = (0.5)ε + (0.5)(-ε) = 0. As we see in Figure 7, the *extra* risk
z makes y_{1} + z range from 1+ε to 1-ε while y_{2} remains constant. In other words, our payoffs
when we add the extra variable z to our existing random variable y are now y_{1} + z_{1} = 1+ε, y_{1} + z_{2} = 1-ε and y_{2} = 0. The extra randomness is represented by the
chord connecting points ε and -ε in
Figure 7. Thus, the expected utility of the whole combination y+z is now:

E[u(y+z)] = p[qu(y_{1}+z_{1}) + (1-q)u(y_{1}+z_{2})] + (1-p)u(y_{2})

= p[0.5u(1+ε) + 0.5u(1-ε)] + (1-p)u(0)

which is shown heuristically in Figure 7 as a point on the chord connecting A and C.

Figure 7- Ross's Paradox

What is the risk premium to get rid of the extra risk z? The Arrow-Pratt
risk-premium is computed now using the chord (-ε, ε). Recall that E(z) = 0, thus the expected utility of y_{1} + z *alone* is shown in Figure 7 as E[u(y_{1}+z)], which is obtained from
point C on the chord (-ε, ε). In
order to get rid of the extra risk z, the agent would pay a premium π
that yielded the "certainty-equivalent" allocation B′,
which thus results in (1-π) and yields the same
"certain" utility u(1-π) as the old expected utility
E[u(y_{1} + z)]. Thus, paying premium π, we move to
point B′.

*However*, because y_{1} is only *one* state of a risky
asset, we do *not* obtain (1-π) with
"certainty"; rather, (1-π) becomes merely *one*
of the possible payoffs. Our analysis does not end here, then. A premium is a
premium, and it will be paid before one knows whether y_{1} or y_{2} has
been realized. Consequently, we must *also* reduce y_{2} by the amount of
the premium, to -π. This is point A′. Thus, by paying the premium π, we have changed the returns
in *both* cases y_{1} and y_{2} from 1 and 0 to 1-π
and -π respectively, thus our chord AB representing risky asset
y now moves to A′B′ to represent the
risky asset minus the premium, y - π, in Figure 7. Notice that
expected utility is now reduced from the uncertain E[u(y+z)] to the uncertain E[u(y-π)].

The idea of the Arrow-Pratt measure was that it was a *local* measure
and not a global one. As the premium paid to get rid of z affects payoffs *elsewhere*,
the shape of the utility curve *elsewhere* (i.e. around A and A′) begins to matter, because the impact of the risk is now spread
over two states. In other words, the measure of risk-aversion should not be exclusively
concentrated around B; one should also take into account the curvature around A. In
short, we want a measure of *average* risk-aversion. Algebraically, for the original
expected utility E[u(y+z)] we had:

E[u(y+z)] = p[0.5u(y_{1}+ε) + 0.5u(y_{1}-ε)] + (1-p)u(y_{2})

so taking a Taylor expansion around ε = 0:

E[u(y+z)] = (1-p)u(y_{2}) + pu(y_{1}) + p[0.5u′(y_{1})ε - 0.5u′(y_{1})ε] + 0.5pu″(y_{1})ε² + o(ε²)

= (1-p)u(y_{2}) + pu(y_{1}) + 0.5pu″(y_{1})ε² + o(ε²)

where o(ε²) denotes a remainder term negligible relative to ε². In contrast, once the premium is paid to get rid of z we have:

E[u(y-π)] = pu(y_{1}-π) + (1-p)u(y_{2}-π)

= (1-p)u(y_{2}) + pu(y_{1}) - [(1-p)u′(y_{2}) + pu′(y_{1})]π + o(π)

Thus, as E[u(y+z)] = E[u(y-π)], so that the
premium is calculated to equate the expected utility of the risky situation y+z and the
expected utility of the second situation y-π (which is *not*
a certainty-equivalent because y is not riskless!), then:

0.5pu″(y_{1})ε² = -[(1-p)u′(y_{2}) + pu′(y_{1})]π

(ignoring the remainder terms), which solving for π yields:

π = -0.5pu″(y_{1})ε²/[(1-p)u′(y_{2}) + pu′(y_{1})]

or:

π = -0.5[u″(y_{1})/u′(y_{1})]ε²/[1 + (1-p)u′(y_{2})/pu′(y_{1})]

thus the risk premium is calculated using the measure of risk aversion
around y_{1}, -u″(y_{1})/u′(y_{1}), *and* the marginal utility around y_{2}, u′(y_{2}), which enters the averaged denominator; thus we compute the premium from a kind of *average*
risk-aversion across the two states.
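To see how well this approximation works, the sketch below uses an invented example (a CARA utility u(w) = -e^{-2w} with y_{1} = 1, y_{2} = 0, p = 0.5 and ε = 0.1; none of these numbers come from the text): it finds the exact premium π solving E[u(y+z)] = E[u(y-π)] by bisection and compares it with the second-order formula above.

```python
import math

a, p, eps = 2.0, 0.5, 0.1       # assumed CARA coefficient, probability, noise size
y1, y2 = 1.0, 0.0

u = lambda w: -math.exp(-a * w)
u1 = lambda w: a * math.exp(-a * w)        # u'
u2 = lambda w: -a * a * math.exp(-a * w)   # u''

# expected utility with the extra noise z = +/-eps attached to state y1 alone
eu_risky = p * (0.5 * u(y1 + eps) + 0.5 * u(y1 - eps)) + (1 - p) * u(y2)

# exact premium: solve E[u(y - pi)] = eu_risky by bisection
# (the left-hand side is decreasing in pi, so bisection over [0, 0.5] works here)
lo, hi = 0.0, 0.5
for _ in range(80):
    mid = 0.5 * (lo + hi)
    if p * u(y1 - mid) + (1 - p) * u(y2 - mid) > eu_risky:
        lo = mid
    else:
        hi = mid
pi_exact = 0.5 * (lo + hi)

# second-order approximation from the text
pi_approx = -0.5 * p * u2(y1) * eps**2 / ((1 - p) * u1(y2) + p * u1(y1))

print(pi_exact, pi_approx)  # both approximately 0.0012
```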

To understand why we obtained the paradoxical result in Ross that the *less*
risk-averse agent pays a *higher* premium than the *more* risk-averse agent, we
can appeal to (imperfectly drawn) Figure 8. Here we have two individuals u and v with
utility functions which are identical around 1 but different around 0 - specifically,
assume that u has a more concave utility function than v around 0 - thus agent u is *more*
risk-averse (around 0 and thus on average) than agent v. We originally start with y, which
yields the chord A_{v}B for agent v and A_{u}B for agent u. Again, we
presume we go through the same exercise as before in Figure 7, where we have a chord
representing z, etc., so that following the same story, we get rid of z by swinging
the chord from A_{v}B to A_{v′}B′ and from A_{u}B to A_{u′}B′
(notice that the upper parts of the chords, B and B′, remain
the same for both agents as they have the same utility functions at higher levels of
wealth).

The important point to note in Figure 8 is that as the upper parts of the
utility curves are the same, they would, *ceteris paribus*, pay the same premium
π to get rid of z, and thus for *both* agents the lower
return would decline from 0 to 0-π. In paying this premium,
agent u's utility around here falls from u(0) to u(-π) and v's
utility falls from v(0) to v(-π). Thus, as we see in Figure 8,
in paying the same premium π, the *reduction* in utility
of agent u around A_{u}, A_{u′} is *more*
than the reduction in utility of agent v around A_{v}, A_{v′}. This means that in terms of utility lost, the cost of paying π
is *greater* for agent u than it is for agent v. Notice that the loss in utility
around 1 would be the same but the loss in utility around 0 would weigh more heavily on u.
Consequently, if we allowed them to choose their optimal risk premia rather than imposing
them, then u would pay *less* than agent v to get rid of z. Thus, the individual with
the steeper utility curve at 0 would want to pay *less* of a premium to avoid the same risk.
But recall that u was more risk-averse than v. Thus, the more risk-averse individual u
pays a *smaller* premium than the less risk-averse individual v for the same risk.
The paradox of α^{u} > α^{v} we obtained in the Ross counterexample stems from something like this.

Figure 8- Detail of Ross's Paradox

In sum, what Ross (1981) has
illustrated is that the risk-premium is *not* a good measure of risk-aversion because
it does not take into account the "global" behavior of the utility function. The Arrow-Pratt risk-premium we paid to avoid the extra risk in state 1 is
only a *local* risk-premium and is not a good measure of risk-aversion. In order to
compensate for this, Stephen Ross (1981) suggested a *stronger risk-aversion measurement*
(SRAM) that takes these global differences into account. This is defined as follows:

Ross's Stronger Risk-Aversion Measurement: Let u and v be elementary utility functions. Then u is said to display higher risk aversion than v if there exists λ > 0 such that for all w, w′, we have u″(w)/v″(w) ≥ λ ≥ u′(w′)/v′(w′).

Another way of saying this is that inf_{w} u″(w)/v″(w) ≥ sup_{w′} u′(w′)/v′(w′). Note that there are two *different*
wealth levels at which we compute the ratios of second and first derivatives, w and w′. Thus, *if* w = w′, then
we have the old Arrow-Pratt measure of risk-aversion, and this inequality can be
written -u″(w)/u′(w) ≥ -v″(w)/v′(w), so agent u is more
risk-averse than agent v in the Arrow-Pratt sense. Thus, SRAM implies Arrow-Pratt, but
Arrow-Pratt does *not* imply SRAM.
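The definition can be checked numerically for a concrete pair. The construction below is purely illustrative (the functions are invented): take v(w) = -e^{-w} and let u = v + G with G(w) = -0.1e^{0.2w}, a decreasing concave function, so that u should display higher risk aversion than v in Ross's sense; we then verify inf u″/v″ ≥ sup u′/v′ on a grid of wealth levels on which u′ > 0.

```python
import math

# assumed pair: v(w) = -exp(-w); u(w) = v(w) + G(w) with G(w) = -0.1*exp(0.2*w),
# where G' < 0 and G'' < 0
v1 = lambda w: math.exp(-w)                              # v'
v2 = lambda w: -math.exp(-w)                             # v''
u1 = lambda w: math.exp(-w) - 0.02 * math.exp(0.2 * w)   # u' = v' + G'
u2 = lambda w: -math.exp(-w) - 0.004 * math.exp(0.2 * w) # u'' = v'' + G''

grid = [-1 + 0.01 * i for i in range(401)]  # wealth levels in [-1, 3]

inf_second = min(u2(w) / v2(w) for w in grid)  # inf over w of u''(w)/v''(w)
sup_first = max(u1(w) / v1(w) for w in grid)   # sup over w' of u'(w')/v'(w')

# SRAM: a single lambda separates the two ratios at *different* wealth levels
print(inf_second >= sup_first)  # True; any lambda in [sup_first, inf_second] works

# and SRAM implies Arrow-Pratt pointwise: -u''/u' >= -v''/v'
print(all(-u2(w) / u1(w) >= -v2(w) / v1(w) for w in grid))  # True
```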

Theorem: (Ross) Let u and v be twice continuously differentiable elementary utility functions. Then the following are equivalent:

(1) there is a λ > 0 such that u″(w)/v″(w) ≥ λ ≥ u′(w′)/v′(w′) for all w, w′.

(2) there is a λ > 0 and a decreasing concave function G: R → R (i.e. G′ ≤ 0 and G″ ≤ 0) such that u(w) = λv(w) + G(w) for all w ∈ R.

(3) π^{u}(w, z) ≥ π^{v}(w, z) for all w ∈ R and all z with E(z) = 0.

Proof: We go (1) ⇒ (2) ⇒ (3) ⇒ (1).

(1) ⇒ (2): Pick λ as in (1) and let G(w) = u(w) - λv(w), so that u(w) = λv(w) + G(w). Then G′(w) = u′(w) - λv′(w). By (1), u′(w)/v′(w) ≤ λ and, as v′(w) > 0, then G′(w) ≤ 0. Differentiating again, G″(w) = u″(w) - λv″(w). By (1), u″(w)/v″(w) ≥ λ but v″(w) < 0, thus u″(w) ≤ λv″(w) and so G″(w) ≤ 0. Thus, G(w) is a decreasing concave function.

(2) ⇒ (3): By definition, u(w-π^{u}) = E[u(w+z)] = E[λv(w+z) + G(w+z)] = λE[v(w+z)] + E[G(w+z)] by (2). Now, as E[v(w+z)] = v(w-π^{v}) by
definition and, by Jensen's inequality, E[G(w+z)] ≤ G(w+E(z)) = G(w) since G is concave and E(z) = 0, then:

u(w-π^{u}) ≤ λv(w-π^{v}) + G(w)

but as G is decreasing and π^{v} ≥ 0 (v is concave), then G(w) ≤ G(w-π^{v}). Thus:

u(w-π^{u}) ≤ λv(w-π^{v}) + G(w-π^{v}) = u(w-π^{v})

by (2). Or, as u is an increasing utility function, it must be that π^{u} ≥ π^{v}.

(3) ⇒ (1): Suppose wealth is w_{1} with probability p and w_{2} with probability (1-p), and suppose there is an extra "noise" term attached to w_{1} but not to w_{2}, which pays +ε and -ε with equal probability, so its mean is zero. The premium π^{u}(ε) that agent u will pay in *both* states to remove this noise is defined by:

p[0.5u(w_{1}+ε) + 0.5u(w_{1}-ε)] + (1-p)u(w_{2}) = pu(w_{1}-π^{u}) + (1-p)u(w_{2}-π^{u})

Notice that π^{u}(0) = 0 and, differentiating once with respect to ε and evaluating at ε = 0, that dπ^{u}/dε|_{ε=0} = 0. Differentiating twice with respect to ε and evaluating at ε = 0, the left-hand side yields pu″(w_{1}) while the right-hand side yields -[pu′(w_{1}) + (1-p)u′(w_{2})]·d²π^{u}/dε²|_{ε=0} (all other terms vanish because dπ^{u}/dε|_{ε=0} = 0). Thus:

d²π^{u}/dε²|_{ε=0} = -pu″(w_{1})/[pu′(w_{1}) + (1-p)u′(w_{2})] > 0

so π^{u} is a convex function of ε centered at 0. Now, if π^{u}(ε) ≥ π^{v}(ε) as (3) demands, then, since both premia and their first derivatives vanish at ε = 0, π^{u}(ε) must be at least as "convex" at ε = 0 as π^{v}(ε), or:

-pu″(w_{1})/[pu′(w_{1}) + (1-p)u′(w_{2})] ≥ -pv″(w_{1})/[pv′(w_{1}) + (1-p)v′(w_{2})]

which can be rewritten:

u″(w_{1})/v″(w_{1}) ≥ [pu′(w_{1}) + (1-p)u′(w_{2})]/[pv′(w_{1}) + (1-p)v′(w_{2})]

for every w_{1}, w_{2}. Letting p → 0, we see that this implies:

u″(w_{1})/v″(w_{1}) ≥ u′(w_{2})/v′(w_{2})

for every w_{1}, w_{2}. This is Ross's condition for SRAM. Thus (3) ⇒ (1). ■

To understand the implications of Ross's measure, let us turn to the portfolio
problem again. Let us again have our standard structure with y being a risky asset and z
the extra "noise" on y, which is independent of y and has E(z|y) > 0. Normalizing
wealth w = 1, the expected utility of agent u is E[u(y + α^{u}z)]
and the expected utility of agent v is E[v(y + α^{v}z)],
where α^{u} and α^{v}
are the optimal proportions of wealth held in the riskier asset. Thus the optimization
problem for agent u is to choose α to maximize expected
utility. As a consequence, we obtain the first order condition that:

E[u′(y+αz)·z] = 0

Let us force agent u to hold agent v's optimal portfolio; then the expected
marginal utility of agent u is, in this case, E[u′(y+α^{v}z)·z]. Suppose that u is more risk-averse than v in
the sense of Ross: in this case, we know that u(y+αz) = λv(y+αz) + G(y+αz), as implied by Ross's measure, where G is a decreasing concave function. Then, when
forcing agent u to hold α^{v}, we have:

E[u′(y+α^{v}z)·z] = λE[v′(y+α^{v}z)·z] + E[G′(y+α^{v}z)·z]

or, as E[v′(y+α^{v}z)·z] = 0 by v's first order condition, this reduces to:

E[u′(y+α^{v}z)·z] = E[G′(y+α^{v}z)·z]

Now, let θ(z) = G′(y+α^{v}z). As G is a decreasing function, then θ(z) < 0. As G is concave, then dθ/dz = G″(y+α^{v}z)·α^{v} < 0 (taking α^{v} > 0), i.e. θ is decreasing in z. Conditioning on y, E[θ(z)·z] = E[cov(θ(z), z | y) + E(θ(z)|y)·E(z|y)]. Now, as E(z|y) > 0 and θ(z) < 0, then E(θ(z)|y)·E(z|y) < 0, so E[θ(z)·z] ≤ E[cov(θ(z), z | y)]. Consequently, as θ is decreasing in z, cov(θ(z), z | y) < 0. Thus:

E[u′(y+α^{v}z)·z] = E[θ(z)·z] ≤ E[cov(θ(z), z | y)] < 0

The result that E[u′(y+α^{v}z)·z] < 0 implies that the expected utility of agent u is
declining when he is forced to hold α^{v}. Thus,
following our earlier logic, this implies necessarily that α^{u} ≤ α^{v}, i.e. the more
risk-averse the agent in Ross's sense, the less he invests in the riskier asset. Thus, Ross's stronger
measure of risk-aversion gives the right result.
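This result can be sketched numerically with Ross's own two assets. In the illustration below the utility pair is invented (it is not Ross's): v(w) = -e^{-w} and u = λv + G with λ = 1 and G(w) = -0.1e^{0.2w}, which is decreasing and concave, so u is more risk-averse than v in Ross's sense; each agent's first order condition E[marginal utility·z] = 0 is then solved by bisection over the four (y, z) states:

```python
import math

# Ross's assets: the four (y, z) pairs each have probability 1/4;
# wealth at proportion alpha is y + alpha*z
states = [(1, 2), (1, -1), (0, 2), (0, -1)]

v1 = lambda w: math.exp(-w)                             # v'
u1 = lambda w: math.exp(-w) - 0.02 * math.exp(0.2 * w)  # u' = v' + G'

def foc(mu, alpha):
    # dE[utility]/d(alpha) = E[marginal_utility(y + alpha*z) * z]
    return sum(0.25 * mu(y + alpha * z) * z for y, z in states)

def optimal_alpha(mu, lo=0.0, hi=1.0):
    # for both agents here the FOC is positive at alpha = 0 and negative at
    # alpha = 1, so the optimum can be found by bisection
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if foc(mu, mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

a_v = optimal_alpha(v1)
a_u = optimal_alpha(u1)
print(a_u, a_v)      # a_u < a_v: the Ross-more-risk-averse agent takes less of x
print(foc(u1, a_v))  # negative: u's expected utility is declining at v's allocation
```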

A final thing needs to be mentioned at this point. Although we have shown
that a more risk-averse agent holds a less risky position, it does *not* follow that
someone who holds a less risky position is *more* risk-averse. That the
converse does not hold was shown by Machina and
Neilson (1987) when comparing a risk-averse u and a risk-loving v. However, they argued
that if we *restrict* our comparison to risk-averse agents, *or* restrict it
entirely to risk-lovers, so that we do not have cross-comparisons, then indeed the converse
holds.

Finally, one thing we have not touched upon has been measures of risk-aversion in a multi-variable context. This was first broached by Menachem Yaari (1969) and is examined in the section on state-preference, within which it seems to fit a bit more appropriately.

**References**

K.J. Arrow (1965) *Aspects of the Theory of Risk-Bearing*. Helsinki:
Yrjö Jahnsson Foundation.

M. Friedman and L.P. Savage (1948) "The Utility Analysis of Choices
involving Risk", *Journal of Political Economy*, Vol. 56, p.279-304.

J. Hicks (1962) "Liquidity", *Economic Journal*, Vol. 72,
p.787-802.

M.J. Machina and W.S. Neilson (1987) "The Ross Characterization of
Risk Aversion: Strengthening and extension", *Econometrica*, Vol. 55 (5),
p.1139-50.

H. Markowitz (1952) "The Utility of Wealth", *Journal of
Political Economy*, Vol. 60, p.151-8.

H. Markowitz (1952) "Portfolio Selection", *Journal of Finance*,
Vol. 7, p.77-91.

H. Markowitz (1958) *Portfolio Selection: Efficient diversification of
investment*. New Haven, Conn: Yale University Press.

J.W. Pratt (1964) "Risk Aversion in the Small and in the Large",
*Econometrica*, Vol. 32, p.122-36.

S.A. Ross (1981) "Some Stronger Measures of Risk Aversion in the
Small and in the Large with Applications", *Econometrica*, Vol. 49 (3),
p.621-39.

J. Tobin (1958) "Liquidity Preference as a Behavior toward
Risk", *Review of Economic Studies*, Vol. 25, p.65-86.