
## 4.4 Applications

### Averages

In the story of Gauss's problem of adding up the numbers from 1 to 100, one interpretation of the result, 5,050, is that the average of all the numbers from 1 to 100 is 50.5. This is the ordinary definition of an average: add up all the things you have, and divide by the number of things. (The result in this example makes sense, because half the numbers are from 1 to 50, and half are from 51 to 100, so the average is half-way between 50 and 51.)

Similarly, a definite integral can also be thought of as a kind of average. In general, if *y* is
a function of *x*, then the average, or mean, value of *y* on the interval from *x*=*a* to *b* can be
defined as

$$\bar{y} = \frac{1}{b-a}\int_a^b y\,dx .$$

In the continuous case, dividing by *b*-*a* accomplishes the same thing as dividing by the
number of things in the discrete case.
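As a rough numerical illustration of this definition, the sketch below approximates the integral with a midpoint sum and then divides by *b*-*a*. The function name `average_value` and the number of slices `n` are choices made for this example only, not part of the text.

```python
def average_value(f, a, b, n=100_000):
    """Approximate (1/(b-a)) * integral of f from a to b using the midpoint rule."""
    dx = (b - a) / n
    # Sum f at the midpoint of each of the n slices, times the slice width.
    integral = sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx
    return integral / (b - a)


# Discrete average of the integers 1..100 (Gauss's sum divided by the count).
print(sum(range(1, 101)) / 100)  # 50.5

# Continuous analogue: the average of y = x on the interval from 1 to 100.
print(average_value(lambda x: x, 1, 100))
```

Dividing by `b - a` plays the same role here as dividing by 100 in the discrete sum.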

##### Example 6

◊ Show that the definition of the average makes sense in the case where the function is a constant.

◊ If *y* is a constant, then we can take it outside of the integral, so

$$\bar{y} = \frac{1}{b-a}\int_a^b y\,dx = \frac{y}{b-a}\int_a^b dx = \frac{y(b-a)}{b-a} = y .$$

##### Example 7

◊ Find the average value of the function *y*=*x*^{2} for values of *x* ranging from 0 to 1.

◊

$$\bar{y} = \frac{1}{1-0}\int_0^1 x^2\,dx = \left.\frac{x^3}{3}\right|_0^1 = \frac{1}{3}$$

##### The mean value theorem

If the continuous function *y*(*x*) has the average value $\bar{y}$ on the interval from *x*=*a* to *b*, then *y* attains its average value at least once in that interval, i.e., there exists ξ with *a*<ξ<*b* such that *y*(ξ)=$\bar{y}$.

The mean value theorem is proved on page 161.
The special case in which $\bar{y}=0$ is known as Rolle's theorem.

##### Example 8

◊ Verify the mean value theorem for *y*=*x*^{2} on the interval from 0 to 1.

◊ The mean value is 1/3, as shown in example 7. This value is achieved at $x=\sqrt{1/3}=1/\sqrt{3}\approx 0.58$, which lies between 0 and 1.
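The same check can be sketched numerically: estimate the average of *y*=*x*^{2} on the interval from 0 to 1, then search for a point ξ where the function equals that average. The helper names `average_value` and `bisect_root` are illustrative, and bisection is just one convenient root-finding method assumed for this example.

```python
import math


def average_value(f, a, b, n=100_000):
    """Approximate (1/(b-a)) * integral of f from a to b using the midpoint rule."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx / (b - a)


def bisect_root(g, lo, hi, tol=1e-12):
    """Find a root of g in [lo, hi], assuming g(lo) and g(hi) have opposite signs."""
    g_lo = g(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid  # root lies in the upper half
        else:
            hi = mid               # root lies in the lower half
    return 0.5 * (lo + hi)


def y(x):
    return x ** 2


y_bar = average_value(y, 0.0, 1.0)
xi = bisect_root(lambda x: y(x) - y_bar, 0.0, 1.0)

print("average value:", y_bar)
print("xi where y(xi) equals the average:", xi)
print("1/sqrt(3) for comparison:", 1 / math.sqrt(3))
```

Bisection needs only continuity and a sign change, which is the same ingredient the theorem itself relies on.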

### Work

In physics, work is a measure of the amount of energy transferred by a force; for example, if a horse
sets a wagon in motion, the horse's force on the wagon is putting some energy of motion into the wagon.
When a force *F* acts on an object that moves in the direction of the force by an infinitesimal
distance *dx*, the infinitesimal work done is *dW*=*F* *dx*. Integrating both sides,
we have $W=\int_{a}^{b} F\,dx$, where the force may depend on *x*, and *a* and *b* represent the
initial and final positions of the object.

##### Example 9

◊ A spring compressed by an amount *x* relative to its relaxed length provides
a force *F*=*kx*. Find the amount of work that must be done in order to compress the spring
from *x*=0 to *x*=*a*. (This is the amount of energy stored in the spring, and that energy
will later be released into the toy bullet.)

◊

$$W = \int_0^a kx\,dx = \frac{1}{2}ka^2$$

The reason *W* grows like *a*^{2}, not just like *a*, is that as the spring is compressed
more, more and more effort is required in order to compress it.
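To see the quadratic growth numerically, the sketch below integrates the spring force *F*=*kx* with a midpoint sum and compares the result with the closed form *ka*^{2}/2. The spring constant of 200 N/m and the compression distances are made-up values chosen only for illustration.

```python
def work(force, a, b, n=100_000):
    """Approximate W = integral of force(x) dx from a to b using the midpoint rule."""
    dx = (b - a) / n
    return sum(force(a + (i + 0.5) * dx) for i in range(n)) * dx


k = 200.0  # hypothetical spring constant, in N/m


def spring_force(x):
    return k * x


for a in (0.05, 0.10, 0.20):  # hypothetical compressions, in meters
    numeric = work(spring_force, 0.0, a)
    exact = 0.5 * k * a ** 2
    print(f"a = {a:.2f} m  numeric W = {numeric:.6f} J  (1/2)ka^2 = {exact:.6f} J")
```

Each doubling of the compression multiplies the work by four, as expected for a quantity proportional to *a*^{2}.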

### Probability

Mathematically, the probability that something will happen can be specified with a number
ranging from 0 to 1, with 0 representing impossibility and 1 representing certainty.
If you flip a coin, heads and tails both have probabilities of 1/2.
The probabilities of all the possible outcomes have to add up to 1.
This is called *normalization*.

So far we've discussed random processes having only two possible outcomes: yes or no, win or lose, on or off. More generally, a random process could have a result that is a number. Some processes yield integers, as when you roll a die and get a result from one to six, but some are not restricted to whole numbers, e.g., the height of a human being, or the amount of time that a uranium-238 atom will exist before undergoing radioactive decay. The key to handling these continuous random variables is the concept of the area under a curve, i.e., an integral.

Consider a throw of a die. If the die is “honest,” then we expect all six values to be equally likely. Since all six probabilities must add up to 1, the probability of any particular value coming up must be 1/6. We can summarize this in a graph, f. Areas under the curve can be interpreted as total probabilities. For instance, the area under the curve from 1 to 3 is 1/6+1/6+1/6=1/2, so the probability of getting a result from 1 to 3 is 1/2. The function shown on the graph is called the probability distribution.

Figure g shows the probabilities of various results obtained by rolling two dice and adding them together, as in the game of craps. The probabilities are not all the same. There is a small probability of getting a two, for example, because there is only one way to do it, by rolling a one and then another one. The probability of rolling a seven is high because there are six different ways to do it: 1+6, 2+5, etc.
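The two-dice probabilities can be tabulated by brute force, as in the sketch below, which counts every equally likely combination of faces. The function name `sum_distribution` and its parameters are chosen for this example; exact fractions are used so that normalization can be checked without rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def sum_distribution(num_dice=2, sides=6):
    """Return {total: probability} for the sum of num_dice fair dice."""
    rolls = product(range(1, sides + 1), repeat=num_dice)
    counts = Counter(sum(roll) for roll in rolls)
    n_outcomes = sides ** num_dice
    return {total: Fraction(c, n_outcomes) for total, c in sorted(counts.items())}


dist = sum_distribution()
for total, prob in dist.items():
    print(total, prob)

# Normalization: the probabilities of all outcomes must add up to 1.
print("sum of probabilities:", sum(dist.values()))
```

Calling `sum_distribution(num_dice=1)` gives the flat distribution for a single honest die.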

If the number of possible outcomes is large but finite, for example the number of hairs on a dog, the graph would start to look like a smooth curve rather than a ziggurat.

What about probability distributions for random numbers that
are not integers? We can no longer make a graph with
probability on the *y* axis, because the probability of
getting a given exact number is typically zero. For
instance, there is zero probability that a person will be
*exactly* 200 cm tall, since there are
infinitely many possible results that are close to 200 but not
exactly 200, for example 199.99999999687687658766.
It doesn't usually make sense, therefore, to talk
about the probability of a single numerical result, but it
does make sense to talk about the probability of a certain
range of results. For instance, the probability that a randomly
chosen person will be more than 170 cm and less than 200 cm
in height is a perfectly
reasonable thing to discuss. We can still summarize the
probability information on a graph, and we can still
interpret areas under the curve as probabilities.

But the *y* axis can no longer be a unitless probability
scale. In the example of human height, we want the *x*
axis to have units of centimeters, and we want areas under the
curve to be unitless probabilities. The area of a single
square on the graph paper is then

$$(\text{unitless area of a square}) = (\text{width of square, with distance units}) \times (\text{height of square}) .$$

If the units are to cancel out, then the height of the
square must evidently be a quantity with units of inverse
centimeters. In other words, the *y* axis of the graph is to be
interpreted as probability per unit height, not probability.

Another way of looking at it is that the *y* axis on the graph
gives a derivative, *dP*/*dx*: the infinitesimally small
probability *dP* that *x* will lie in the infinitesimally small
range covered by *dx*, divided by the width *dx* of that range.

##### Example 10

◊ A computer language will typically have a built-in subroutine that produces a fairly random number that is equally likely to take on any value in the range from 0 to 1. If you take the absolute value of the difference between two such numbers, the probability distribution is of the form *dP*/*dx*=*k*(1-*x*). Find the value of the constant *k* that is required by normalization.

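◊ Normalization requires the total probability to be 1, so

$$\int_0^1 k(1-x)\,dx = k\left.\left(x-\frac{x^2}{2}\right)\right|_0^1 = \frac{k}{2} = 1 ,$$

which gives *k*=2.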
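The normalization constant can be checked with a quick simulation. The sketch below draws pairs of random numbers from Python's standard `random` module, takes the absolute value of their difference, and compares a histogram-based estimate of *dP*/*dx* with *k*(1-*x*) for *k*=2. The sample size, seed, and number of bins are arbitrary choices for this illustration.

```python
import random


def sample_abs_difference(n, seed=0):
    """Draw n samples of |u1 - u2| with u1, u2 uniform on [0, 1)."""
    rng = random.Random(seed)
    return [abs(rng.random() - rng.random()) for _ in range(n)]


def estimate_density(samples, bins=10):
    """Estimate dP/dx on [0, 1] as (fraction of samples in a bin) / (bin width)."""
    counts = [0] * bins
    for x in samples:
        counts[min(int(x * bins), bins - 1)] += 1
    width = 1.0 / bins
    return [c / (len(samples) * width) for c in counts]


samples = sample_abs_difference(200_000)
density = estimate_density(samples)

for i, d in enumerate(density):
    center = (i + 0.5) / len(density)
    print(f"x = {center:.2f}  simulated dP/dx = {d:.3f}  2(1-x) = {2 * (1 - center):.3f}")
```

Dividing each bin's fraction by the bin width is what turns a unitless probability into a probability per unit *x*.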

*self-check:*

Compare the number of people with heights in the range of 130-135 cm to the number in the range 135-140.

(answer in the back of the PDF version of the book)

When one random variable is related to another in some mathematical way, the chain rule can be used to relate their probability distributions.

##### Example 11

◊ A laser is placed one meter away from a wall, and spun on the ground to give it a random direction, but if the angle *u* shown in figure j doesn't come out in the range from 0 to π/2, the laser is spun again until an angle in the desired range is obtained. Find the probability distribution of the distance *x* shown in the figure. The derivative $d(\tan^{-1} z)/dz = 1/(1+z^2)$ will be required (see example 66, page 88).

◊ Since any angle between 0 and π/2 is equally likely, the
probability distribution *dP*/*du* must be a constant, and normalization
tells us that the constant must be *dP*/*du*=2/π.

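The laser is one meter from the wall, so the distance *x*, measured in
meters, is given by *x*=tan *u*. For the probability distribution of *x*, we have

$$\frac{dP}{dx} = \frac{dP}{du}\,\frac{du}{dx} = \frac{2}{\pi}\,\frac{d\tan^{-1}x}{dx} = \frac{2}{\pi(1+x^2)} .$$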

Note that the range of possible values of *x* theoretically extends from 0 to
infinity. Problem 7 on page 104 deals with this.
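A simulation offers a rough check on the chain-rule result. Applying the chain rule to *x*=tan *u* with *dP*/*du*=2/π gives *dP*/*dx*=2/(π(1+*x*^{2})), whose integral from 0 to *X* is (2/π)tan^{-1}*X*. The sketch below picks angles uniformly between 0 and π/2 and compares the fraction of distances below a few limits with that prediction. The function names, sample size, and seed are assumptions made for this example.

```python
import math
import random


def sample_distances(n, seed=0):
    """Simulate n laser spins: u uniform on [0, pi/2), distance x = tan(u) in meters."""
    rng = random.Random(seed)
    return [math.tan(rng.random() * math.pi / 2) for _ in range(n)]


def fraction_below(samples, limit):
    """Fraction of simulated distances that are smaller than limit."""
    return sum(1 for x in samples if x < limit) / len(samples)


def predicted_fraction_below(limit):
    """Integral of 2 / (pi * (1 + x^2)) from 0 to limit."""
    return (2 / math.pi) * math.atan(limit)


distances = sample_distances(200_000)
for limit in (0.5, 1.0, 2.0, 10.0):
    simulated = fraction_below(distances, limit)
    predicted = predicted_fraction_below(limit)
    print(f"x < {limit:>4}: simulated {simulated:.4f}  predicted {predicted:.4f}")
```

Half of the spins land within one meter along the wall, since tan(π/4)=1, while a small fraction land very far away, reflecting the long tail of the distribution.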

If the next Martian you meet asks you, “How tall is an
adult human?,” you will probably reply with a statement
about the average human height, such as “Oh, about 5 feet 6
inches.” If you wanted to explain a little more, you could
say, “But that's only an average. Most people are somewhere
between 5 feet and 6 feet tall.” Without bothering to draw
the relevant bell curve for your new extraterrestrial
acquaintance, you've summarized the relevant information by
giving an average and a typical range of variation.
The average of a probability distribution can be defined
geometrically as the horizontal position at which it could
be balanced if it was constructed out of cardboard, as in figure i.
This is a different way of working with averages than the one
we used earlier. Before, when we had a graph of *y* versus *x*, we implicitly
assumed that all values of *x* were equally likely, and we found an
average value of *y*. In this new method using probability distributions,
the variable we're averaging is on the *x* axis, and the *y* axis
tells us the relative probabilities of the various *x* values.

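For a discrete-valued variable with *n* possible values $x_1,\ldots,x_n$, each occurring with probability $P(x_i)$, the average would be

$$\bar{x} = \sum_{i=1}^{n} x_i\,P(x_i) ,$$

and in the case of a continuous variable, this becomes an integral over all possible values of *x*,

$$\bar{x} = \int x\,\frac{dP}{dx}\,dx .$$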

##### Example 12

◊ For the situation described in example 10,
find the average value of *x*.

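◊ Normalization requires *k*=2, so

$$\bar{x} = \int_0^1 x\,\frac{dP}{dx}\,dx = \int_0^1 2x(1-x)\,dx = \left.\left(x^2-\frac{2x^3}{3}\right)\right|_0^1 = \frac{1}{3}$$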

Sometimes we don't just want to know the average value of a certain variable, we
also want to have some idea of the amount of variation above and below the average.
The most common way of measuring this is the *standard deviation*,
defined by

$$\sigma = \sqrt{\int (x-\bar{x})^2\,\frac{dP}{dx}\,dx} .$$

The idea here is that if there were no variation at all above or below the average,
then the quantity $(x-\bar{x})$ would be zero whenever *dP*/*dx* was nonzero, and
the standard deviation would be zero. The reason for taking the square root of the whole
thing is so that the result will have the same units as *x*.

##### Example 13

◊ For the situation described in example 10,
find the standard deviation of *x*.

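◊ Using *k*=2 and $\bar{x}=1/3$, the square of the standard deviation is

$$\sigma^2 = \int_0^1 (x-\bar{x})^2\,\frac{dP}{dx}\,dx = \int_0^1 \left(x-\frac{1}{3}\right)^2 2(1-x)\,dx = \frac{1}{18} ,$$

so the standard deviation is $\sigma = 1/\sqrt{18} \approx 0.236$.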
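The average and standard deviation of this distribution can also be estimated numerically. The sketch below assumes the normalized distribution *dP*/*dx*=2(1-*x*) on the interval from 0 to 1 (that is, *k*=2) and evaluates the defining integrals with a midpoint sum. The helper names `integrate` and `mean_and_std` are illustrative only.

```python
import math


def integrate(f, a, b, n=100_000):
    """Approximate the integral of f from a to b using the midpoint rule."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx


def mean_and_std(density, a, b):
    """Return (total probability, average, standard deviation) for dP/dx = density."""
    total = integrate(density, a, b)
    mean = integrate(lambda x: x * density(x), a, b)
    variance = integrate(lambda x: (x - mean) ** 2 * density(x), a, b)
    return total, mean, math.sqrt(variance)


def density(x):
    return 2 * (1 - x)  # assumes k = 2, the value fixed by normalization


total, mean, std = mean_and_std(density, 0.0, 1.0)
print("total probability:", total)
print("average:", mean, " compare 1/3 =", 1 / 3)
print("standard deviation:", std, " compare 1/sqrt(18) =", 1 / math.sqrt(18))
```

The standard deviation comes out with the same units as *x* because of the square root in its definition.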