
# Chapter 6. Random Variables and Probabilities

## 6.1. Random Variables and Probabilities^{*}

### Introduction

Probability associates with an event a number which indicates the likelihood of the occurrence of that event on any trial. An event is modeled as the set of those possible outcomes of an experiment which satisfy a property or proposition characterizing the event.

Often, each outcome is characterized by a number. The experiment is performed. If the
outcome is observed as a physical quantity, the size of that quantity (in prescribed units)
is the entity actually observed. In many nonnumerical cases, it is convenient to assign
a number to each outcome. For example, in a coin flipping experiment, a “head” may be
represented by a 1 and a “tail” by a 0. In a Bernoulli trial, a success may be represented
by a 1 and a failure by a 0. In a sequence of trials, we may be interested in the number
of successes in a sequence of *n* component trials. One could assign a distinct number to
each card in a deck of playing cards. Observations of the result of selecting a card
could be recorded in terms of individual numbers. In each case, the associated number
becomes a property of the outcome.

### Random variables as functions

We consider in this chapter *real* random variables (i.e., real-valued random variables).
In the chapter "Random Vectors and Joint Distributions", we extend the notion to vector-valued random quantities. The fundamental idea of
a real *random variable* is the assignment of a real number to
each elementary outcome *ω* in the basic space *Ω*. Such an assignment
amounts to determining a *function* *X*, whose domain is *Ω* and whose range is
a subset of the real line **R**. Recall that a real-valued function on a domain (say an
interval *I* on the real line) is characterized by the assignment of a real number *y* to
each element *x* (argument) in the domain. For a real-valued function of a real variable,
it is often possible to write a formula or otherwise state a rule describing the assignment
of the value to each argument. Except in special cases, we cannot write a formula for a random
variable *X*. However, random variables share some important general properties of
functions which play an essential role in determining their usefulness.

**Mappings and inverse mappings**

There are various ways of characterizing a function. Probably the most useful for our
purposes is as a *mapping* from the *domain Ω* to the *codomain* **R**. We find the mapping
diagram of Figure 1 extremely useful in visualizing the essential patterns. Random
variable *X*, as a mapping from basic space *Ω* to the real line **R**, assigns to
each element *ω* a value *t*=*X*(*ω*). The object point *ω* is mapped, or carried,
into the image point *t*. Each *ω* is mapped into exactly one *t*, although
several *ω* may have the same image point.

Associated with a function *X* as a mapping are the *inverse mapping* *X*^{–1} and the
*inverse images* it produces. Let *M* be a set of numbers on the real line. By the
inverse image of *M* under the mapping *X*, we mean the *set* of *all* those *ω*∈*Ω* which are mapped into *M* by *X* (see Figure 2). If *X* does not take a value
in *M*, the inverse image is the empty set (impossible event). If *M* includes the range of
*X*, (the set of all possible values of *X*), the inverse image is the entire basic space *Ω*.
Formally we write

(6.1)  *X*^{–1}(*M*) = {*ω* ∈ *Ω* : *X*(*ω*) ∈ *M*}

Now we assume the set *X*^{–1}(*M*), a subset of *Ω*, is an event for each *M*. A
detailed examination of that assertion is a topic in *measure theory*. Fortunately,
the results of measure theory ensure that we may make the assumption for any *X* and any
subset *M* of the real line likely to be encountered in practice. The set *X*^{–1}(*M*) is
the event that *X* takes a value in *M*. As an event, it may be assigned a probability.
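To make the mapping picture concrete, here is a small illustrative sketch in Python (the coin-flip assignment from the introduction); the function name `inverse_image` is ours, not part of the text:

```python
# Sketch: inverse images for a finite basic space (illustrative example).
# Omega is the basic space; X assigns a real number to each outcome.

def inverse_image(X, Omega, M):
    """Return the set of all outcomes omega with X(omega) in M."""
    return {omega for omega in Omega if X(omega) in M}

# Coin flip: "head" -> 1, "tail" -> 0, as in the introduction.
Omega = {"head", "tail"}
X = lambda omega: 1 if omega == "head" else 0

print(inverse_image(X, Omega, {1}))      # the event "head"
print(inverse_image(X, Omega, {5}))      # empty set: X never takes the value 5
print(inverse_image(X, Omega, {0, 1}))   # the sure event, all of Omega
```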

Example 6.1.

*X* = *I*_{E}, where *E* is an event with probability *p*. Now *X* takes on only two values, 0 and 1. The event that *X* takes on the value 1 is the set

(6.2)  {*ω* : *X*(*ω*) = 1} = *X*^{–1}({1}) = *E*

so that *P*({*ω* : *X*(*ω*) = 1}) = *p*. This rather ungainly notation is shortened to *P*(*X* = 1) = *p*. Similarly, *P*(*X* = 0) = 1 – *p*. Consider any set *M*. If neither 1 nor 0 is in *M*, then *X*^{–1}(*M*) = *∅*. If 0 is in *M*, but 1 is not, then *X*^{–1}(*M*) = *E*^{c}. If 1 is in *M*, but 0 is not, then *X*^{–1}(*M*) = *E*. If both 1 and 0 are in *M*, then *X*^{–1}(*M*) = *Ω*. In this case the class of all events *X*^{–1}(*M*) consists of event *E*, its complement *E*^{c}, the impossible event *∅*, and the sure event *Ω*.

Example 6.2.

Consider a sequence of *n* Bernoulli trials, with probability *p* of success. Let *S*_{n} be the random variable whose value is the number of successes in the sequence of *n* component trials. Then, according to the analysis in the section "Bernoulli Trials and the Binomial Distribution",

(6.3)  *P*(*S*_{n} = *k*) = *C*(*n*, *k*) *p*^{k} (1 – *p*)^{n–k},  0 ≤ *k* ≤ *n*

Before considering further examples, we note a general property of inverse images. We state it
in terms of a random variable, which maps *Ω* to the real line (see Figure 3).

**Preservation of set operations**

Let *X* be a mapping from *Ω* to the real line **R**. If *M*, *M*_{i}, *i* ∈ *J*, are sets of real numbers, with respective inverse images *E*, *E*_{i}, then

*X*^{–1}(*M*^{c}) = *E*^{c},  *X*^{–1}(⋃_{i∈J} *M*_{i}) = ⋃_{i∈J} *E*_{i},  *X*^{–1}(⋂_{i∈J} *M*_{i}) = ⋂_{i∈J} *E*_{i}
Examination of simple graphical examples exhibits the plausibility of these patterns. Formal
proofs amount to careful reading of the notation. Central to the structure are the facts that
each element *ω* is mapped into only one image point *t* and that the inverse image of *M* is the set
of *all* those *ω* which are mapped into image points in *M*.

An easy, but important, consequence of the general patterns is that the inverse images of disjoint *M*, *N*
are also disjoint. This implies that the inverse image of a disjoint union of the *M*_{i} is a disjoint
union of the separate inverse images.
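These patterns can be checked on a small finite example; the following Python sketch is ours (illustrative, not from the text):

```python
# Sketch: checking preservation of set operations by inverse images
# on a small finite example (illustrative values).

def inverse_image(X, Omega, M):
    return {omega for omega in Omega if X(omega) in M}

Omega = set(range(6))
X = lambda omega: omega % 3          # maps 0..5 into the values {0, 1, 2}

M, N = {0}, {1}
E = inverse_image(X, Omega, M)       # {0, 3}
F = inverse_image(X, Omega, N)       # {1, 4}

# Inverse image of a union is the union of the inverse images ...
assert inverse_image(X, Omega, M | N) == E | F
# ... and likewise for intersections
assert inverse_image(X, Omega, M & N) == E & F
# Disjoint M, N have disjoint inverse images
assert E & F == set()
print("set operations preserved")
```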

Consider, again, the random variable *S*_{n} which counts the number of successes
in a sequence of *n* Bernoulli trials. Let *n* = 10 and *p* = 0.33. Suppose we want
to determine the probability *P*(2 < *S*_{10} ≤ 8). Let *A*_{k} = {*ω* : *S*_{10}(*ω*) = *k*},
which we usually shorten to *A*_{k} = {*S*_{10} = *k*}. Now the *A*_{k} form a partition,
since we cannot have *ω* ∈ *A*_{k} and *ω* ∈ *A*_{j}, *j* ≠ *k* (i.e., for any *ω*,
we cannot have two values for *S*_{n}(*ω*)). Now,

{2 < *S*_{10} ≤ 8} = *A*_{3} ⋁ *A*_{4} ⋁ *A*_{5} ⋁ *A*_{6} ⋁ *A*_{7} ⋁ *A*_{8}

since *S*_{10} takes on a value greater than 2 but no greater than 8 iff it takes one
of the integer values from 3 to 8. By the additivity of probability,

*P*(2 < *S*_{10} ≤ 8) = ∑_{k=3}^{8} *P*(*S*_{10} = *k*)
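This additivity computation can be checked numerically. The following Python sketch is ours (the text itself uses MATLAB); it sums the binomial probabilities from Example 6.2:

```python
# Sketch: P(2 < S_10 <= 8) for Bernoulli trials with n = 10, p = 0.33,
# computed by summing binomial probabilities P(S_n = k) for k = 3..8.
from math import comb

n, p = 10, 0.33

def binom_pmf(k):
    """P(S_n = k) = C(n,k) p^k (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

prob = sum(binom_pmf(k) for k in range(3, 9))
print(round(prob, 4))   # roughly 0.69
```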

### Mass transfer and induced probability distribution

Because of the abstract nature of the basic space and the class of events, we are limited
in the kinds of calculations that can be performed meaningfully with the probabilities
on the basic space. We represent probability as mass distributed
on the basic space and visualize this with the aid of general Venn diagrams and minterm
maps. We now think of the mapping from *Ω* to **R** as producing a point-by-point
*transfer of the probability mass* to the real line. This may be done as follows:

To any set *M* on the real line assign probability mass

*P*_{X}(*M*) = *P*(*X*^{–1}(*M*))
It is apparent that *P*_{X}(*M*) ≥ 0 and *P*_{X}(**R**) = *P*(*Ω*) = 1. And because of the
preservation of set operations by the inverse mapping, for a countable disjoint class
{*M*_{i} : *i* ∈ *J*} with inverse images {*E*_{i}},

*P*_{X}(⋁_{i∈J} *M*_{i}) = *P*(⋁_{i∈J} *E*_{i}) = ∑_{i∈J} *P*(*E*_{i}) = ∑_{i∈J} *P*_{X}(*M*_{i})
This means that *P*_{X} has the properties of a probability measure defined on the subsets of
the real line. Some results of measure theory show that this probability is defined uniquely
on a class of subsets of **R** that includes any set normally encountered in applications.
We have achieved a point-by-point transfer of the probability apparatus to the real line
in such a manner that we can make calculations about the random variable *X*. We call
*P*_{X} the *probability measure induced by* *X*. Its importance lies in the fact that
*P*(*X* ∈ *M*) = *P*_{X}(*M*). Thus, *to determine the likelihood that random quantity X
will take on a value in set M, we determine how much induced probability mass is in the
set M*. This transfer produces what is called the *probability distribution* for *X*.
In the chapter "Distribution and Density Functions", we consider useful ways to describe
the probability distribution induced by a random variable. We turn first to a special
class of random variables.
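The mass-transfer idea can be sketched for a finite basic space; this Python fragment is illustrative only (the probabilities and mapping below are invented for the sketch):

```python
# Sketch: the induced probability measure P_X as a "mass transfer"
# from a finite basic space to the real line (illustrative numbers).
from collections import defaultdict

# Probability mass on the basic space Omega
P = {"w1": 0.2, "w2": 0.3, "w3": 0.5}
X = {"w1": 1.0, "w2": 1.0, "w3": 2.5}   # the random variable as a mapping

# Transfer the mass point by point to the real line
PX = defaultdict(float)
for omega, mass in P.items():
    PX[X[omega]] += mass

def P_X(M):
    """Induced probability of a set M of real numbers: P(X in M)."""
    return sum(mass for t, mass in PX.items() if t in M)

print(dict(PX))        # {1.0: 0.5, 2.5: 0.5}
print(P_X({1.0}))      # 0.5
```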

### Simple random variables

We consider, in some detail, random variables which have only a finite set of possible
values. These are called *simple* random variables. Thus the term “simple” is used
in a special, technical sense. The importance of simple random variables rests on two facts.
For one thing, in practice we can distinguish only a finite set of possible values for any random variable. In
addition, any random variable may be approximated as closely as desired by a simple
random variable. When the structure and properties of simple random variables have been examined,
we turn to more general cases. Many properties of simple random variables extend to
the general case via the approximation procedure.

**Representation with the aid of indicator functions**

In order to deal with simple random variables clearly and precisely, we must find suitable ways to express them analytically. We do this with the aid of indicator functions. Three basic forms of representation are encountered. These are not mutually exclusive representations.

**Standard or canonical form**, which displays the possible values and the corresponding events. If *X* takes on distinct values

(6.8)  *t*_{1}, *t*_{2}, ⋯, *t*_{n}  with respective probabilities  *p*_{1}, *p*_{2}, ⋯, *p*_{n}

and if *A*_{i} = {*X* = *t*_{i}}, for 1 ≤ *i* ≤ *n*, then {*A*_{i} : 1 ≤ *i* ≤ *n*} is a partition (i.e., on any trial, exactly one of these events occurs). We call this the *partition determined by* (or, generated by) *X*. We may write

(6.9)  *X* = *t*_{1} *I*_{A1} + *t*_{2} *I*_{A2} + ⋯ + *t*_{n} *I*_{An} = ∑_{i=1}^{n} *t*_{i} *I*_{Ai}

If *X*(*ω*) = *t*_{i}, then *ω* ∈ *A*_{i}, so that *I*_{Ai}(*ω*) = 1 and all the other indicator functions have value zero. The summation expression thus picks out the correct value *t*_{i}. This is true for any *t*_{i}, so the expression represents *X*(*ω*) for all *ω*. The distinct set of values and the corresponding probabilities constitute the *distribution* for *X*. Probability calculations for *X* are made in terms of its distribution. One of the advantages of the canonical form is that it displays the range (set of values), and if the probabilities are known, the distribution is determined. *Note* that in canonical form, if one of the *t*_{i} has value zero, we include that term. For some probability distributions it may be that *P*(*A*_{i}) = 0 for one or more of the *t*_{i}. In that case, we call these values *null values*, for they can only occur with probability zero, and hence are practically impossible. In the general formulation, we include possible null values, since they do not affect any probability calculations.

Example 6.3. Successes in Bernoulli trials

As the analysis of Bernoulli trials and the binomial distribution shows (see Section 4.8), canonical form must be

(6.10)  *S*_{n} = 0 *I*_{A0} + 1 *I*_{A1} + ⋯ + *n* *I*_{An},  with  *P*(*A*_{k}) = *P*(*S*_{n} = *k*) = *C*(*n*, *k*) *p*^{k} (1 – *p*)^{n–k}

For many purposes, both theoretical and practical, canonical form is desirable. For one thing, it displays directly the range (i.e., set of values) of the random variable. The *distribution* consists of the set of values {*t*_{i} : 1 ≤ *i* ≤ *n*} paired with the corresponding set of probabilities {*p*_{i} : 1 ≤ *i* ≤ *n*}, where *p*_{i} = *P*(*A*_{i}) = *P*(*X* = *t*_{i}).

**Primitive form.** Simple random variable *X* may be represented by a *primitive form*

(6.11)  *X* = *c*_{1} *I*_{C1} + *c*_{2} *I*_{C2} + ⋯ + *c*_{m} *I*_{Cm},  where {*C*_{j} : 1 ≤ *j* ≤ *m*} is a partition

*Remarks*. If {*C*_{j} : 1 ≤ *j* ≤ *m*} is a disjoint class, but ⋁_{j} *C*_{j} ≠ *Ω*, we may append the event *C*_{m+1} = [⋁_{j} *C*_{j}]^{c} and assign value zero to it.

We say *a* primitive form, since the representation is not unique. Any of the *C*_{i} may be partitioned, with the same value *c*_{i} associated with each subset formed. Canonical form is a special primitive form. Canonical form is unique, and in many ways normative.

Example 6.4. Simple random variables in primitive form

A wheel is spun yielding, on an equally likely basis, the integers 1 through 10. Let *C*_{i} be the event the wheel stops at *i*, 1 ≤ *i* ≤ 10. Each *P*(*C*_{i}) = 0.1. If the numbers 1, 4, or 7 turn up, the player loses ten dollars; if the numbers 2, 5, or 8 turn up, the player gains nothing; if the numbers 3, 6, or 9 turn up, the player gains ten dollars; if the number 10 turns up, the player loses one dollar. The random variable expressing the results may be expressed in primitive form as

(6.12)  *X* = –10 *I*_{C1} + 0 *I*_{C2} + 10 *I*_{C3} – 10 *I*_{C4} + 0 *I*_{C5} + 10 *I*_{C6} – 10 *I*_{C7} + 0 *I*_{C8} + 10 *I*_{C9} – *I*_{C10}

A store has eight items for sale. The prices are $3.50, $5.00, $3.50, $7.50, $5.00, $5.00, $3.50, and $7.50, respectively. A customer comes in. She purchases one of the items with probabilities 0.10, 0.15, 0.15, 0.20, 0.10, 0.05, 0.10, 0.15. The random variable expressing the amount of her purchase may be written

(6.13)  *X* = 3.5 *I*_{C1} + 5.0 *I*_{C2} + 3.5 *I*_{C3} + 7.5 *I*_{C4} + 5.0 *I*_{C5} + 5.0 *I*_{C6} + 3.5 *I*_{C7} + 7.5 *I*_{C8}
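A distribution can be read off a primitive form by summing the probabilities attached to each distinct value. As an illustrative cross-check (in Python rather than the text's MATLAB), here is that consolidation for the store example:

```python
# Sketch: consolidating the store example's primitive form (6.13) into a
# distribution, by summing the probabilities attached to each distinct value.
from collections import defaultdict

values = [3.5, 5.0, 3.5, 7.5, 5.0, 5.0, 3.5, 7.5]           # c_j
probs  = [0.10, 0.15, 0.15, 0.20, 0.10, 0.05, 0.10, 0.15]   # P(C_j)

dist = defaultdict(float)
for c, pc in zip(values, probs):
    dist[c] += pc

for t in sorted(dist):
    print(t, round(dist[t], 2))
# 3.5 0.35
# 5.0 0.3
# 7.5 0.35
```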

**Affine form.** We commonly have *X* represented in *affine form*, in which the random variable is represented as an affine combination of indicator functions (i.e., a linear combination of the indicator functions plus a constant, which may be zero):

(6.14)  *X* = *c*_{0} + *c*_{1} *I*_{E1} + *c*_{2} *I*_{E2} + ⋯ + *c*_{m} *I*_{Em}

In this form, the class {*E*_{j} : 1 ≤ *j* ≤ *m*} is not necessarily mutually exclusive, and the coefficients do not display directly the set of possible values. In fact, the *E*_{i} often form an independent class. *Remark*. Any primitive form is a special affine form in which *c*_{0} = 0 and the *E*_{i} form a partition.

Example 6.5.

Consider, again, the random variable *S*_{n} which counts the number of successes in a sequence of *n* Bernoulli trials. If *E*_{i} is the event of a success on the *i*th trial, then one natural way to express the count is

(6.15)  *S*_{n} = ∑_{i=1}^{n} *I*_{Ei}

This is affine form, with *c*_{0} = 0 and *c*_{i} = 1 for 1 ≤ *i* ≤ *n*. In this case, the *E*_{i} cannot form a mutually exclusive class, since they form an independent class.

**Events generated by a simple random variable: canonical form**

We may characterize the class of all inverse images formed by a simple random variable *X* in terms of the partition it determines. Consider any set *M* of real numbers. If *t*_{i} in the range of *X* is in *M*, then every point *ω* ∈ *A*_{i} maps into *t*_{i}, hence into *M*. If the set *J* is the set of indices *i* such that *t*_{i} ∈ *M*, then

*X*^{–1}(*M*) = ⋁_{i∈J} *A*_{i}

Only those points *ω* in ⋁_{i∈J} *A*_{i} map into *M*. Hence, the class of events (i.e., inverse images) determined by *X* consists of the impossible event *∅*, the sure event *Ω*, and the union of any subclass of the *A*_{i} in the partition determined by *X*.

Example 6.6. Events determined by a simple random variable

Suppose simple random variable *X* is represented in canonical form by

(6.16)  *X* = –2 *I*_{A} – *I*_{B} + 0 *I*_{C} + 3 *I*_{D}

Then the class {*A*, *B*, *C*, *D*} is the partition determined by *X* and the range of *X* is {–2, –1, 0, 3}.

If *M* is the interval [–2, 1], then the values –2, –1, and 0 are in *M* and *X*^{–1}(*M*) = *A* ⋁ *B* ⋁ *C*.

If *M* is the set (–2, –1] ∪ [1, 5], then the values –1, 3 are in *M* and *X*^{–1}(*M*) = *B* ⋁ *D*.

The event {*X* ≤ 1} = *X*^{–1}(*M*), where *M* = (–∞, 1]. Since values –2, –1, 0 are in *M*, the event {*X* ≤ 1} = *A* ⋁ *B* ⋁ *C*.

### Determination of the distribution

Determining the partition generated by a simple random variable amounts to determining the canonical form. The distribution is then completed by determining the probabilities of each event *A*_{i} = {*X* = *t*_{i}}.

**From a primitive form**

Before writing down the general pattern, we consider an illustrative example.

Suppose one item is selected at random from a group of ten items. The values (in dollars) and respective probabilities are

| *c*_{j} | 2.00 | 1.50 | 2.00 | 2.50 | 1.50 | 1.50 | 1.00 | 2.50 | 2.00 | 1.50 |
|---|---|---|---|---|---|---|---|---|---|---|
| *P*(*C*_{j}) | 0.08 | 0.11 | 0.07 | 0.15 | 0.10 | 0.09 | 0.14 | 0.08 | 0.08 | 0.10 |

By inspection, we find four distinct values: *t*_{1} = 1.00, *t*_{2} = 1.50, *t*_{3} = 2.00,
and *t*_{4} = 2.50. The value 1.00 is taken on for *ω* ∈ *C*_{7}, so that *A*_{1} = *C*_{7} and
*P*(*A*_{1}) = *P*(*C*_{7}) = 0.14. Value 1.50 is taken on for *ω* ∈ *C*_{2}, *C*_{5}, *C*_{6}, *C*_{10},
so that

*A*_{2} = *C*_{2} ⋁ *C*_{5} ⋁ *C*_{6} ⋁ *C*_{10}  and  *P*(*A*_{2}) = *P*(*C*_{2}) + *P*(*C*_{5}) + *P*(*C*_{6}) + *P*(*C*_{10}) = 0.40

Similarly

*P*(*A*_{3}) = *P*(*C*_{1}) + *P*(*C*_{3}) + *P*(*C*_{9}) = 0.23  and  *P*(*A*_{4}) = *P*(*C*_{4}) + *P*(*C*_{8}) = 0.23
The distribution for *X* is thus

| *t*_{k} | 1.00 | 1.50 | 2.00 | 2.50 |
|---|---|---|---|---|
| *P*(*X* = *t*_{k}) | 0.14 | 0.40 | 0.23 | 0.23 |

*The general procedure* may be formulated as follows:

If *X* = ∑_{j=1}^{m} *c*_{j} *I*_{Cj}, we identify the set of distinct values
in the set {*c*_{j} : 1 ≤ *j* ≤ *m*}. Suppose these are *t*_{1} < *t*_{2} < ⋯ < *t*_{n}.
For any possible value *t*_{i} in the range, identify the index set *J*_{i} of those *j*
such that *c*_{j} = *t*_{i}. Then the terms

∑_{j∈Ji} *c*_{j} *I*_{Cj} = *t*_{i} ∑_{j∈Ji} *I*_{Cj} = *t*_{i} *I*_{Ai},  where  *A*_{i} = ⋁_{j∈Ji} *C*_{j},

and

*P*(*A*_{i}) = ∑_{j∈Ji} *P*(*C*_{j})

Examination of this procedure shows that there are two phases:

- Select and sort the distinct values *t*_{1}, *t*_{2}, ⋯, *t*_{n}
- Add all probabilities associated with each value *t*_{i} to determine *P*(*X* = *t*_{i})

We use the m-function *csort* which performs these two
operations (see Example 4 from "Minterms and MATLAB Calculations").

```matlab
>> C = [2.00 1.50 2.00 2.50 1.50 1.50 1.00 2.50 2.00 1.50];  % Matrix of c_j
>> pc = [0.08 0.11 0.07 0.15 0.10 0.09 0.14 0.08 0.08 0.10]; % Matrix of P(C_j)
>> [X,PX] = csort(C,pc);         % The sorting and consolidating operation
>> disp([X;PX]')                 % Display of results
    1.0000    0.1400
    1.5000    0.4000
    2.0000    0.2300
    2.5000    0.2300
```

For a problem this small, use of a tool such as csort is not really needed. But in many problems with large sets of data the m-function csort is very useful.
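The m-function csort belongs to the author's MATLAB toolbox. For readers without it, an equivalent consolidation is easy to sketch in Python (the name `csort` below is our stand-in, not the original code):

```python
# Sketch: a Python stand-in for the m-function csort, which sorts the
# distinct values and consolidates (sums) the probabilities for each.
from collections import defaultdict

def csort(values, probs):
    acc = defaultdict(float)
    for v, p in zip(values, probs):
        acc[v] += p
    xs = sorted(acc)
    return xs, [acc[x] for x in xs]

C  = [2.00, 1.50, 2.00, 2.50, 1.50, 1.50, 1.00, 2.50, 2.00, 1.50]
pc = [0.08, 0.11, 0.07, 0.15, 0.10, 0.09, 0.14, 0.08, 0.08, 0.10]
X, PX = csort(C, pc)
for x, px in zip(X, PX):
    print(x, round(px, 4))
# 1.0 0.14
# 1.5 0.4
# 2.0 0.23
# 2.5 0.23
```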

**From affine form**

Suppose *X* is in affine form,

*X* = *c*_{0} + *c*_{1} *I*_{E1} + *c*_{2} *I*_{E2} + ⋯ + *c*_{m} *I*_{Em}
We determine a particular primitive form by determining the value of *X* on each
minterm generated by the class . We do this in
a systematic way by utilizing minterm vectors and properties of indicator functions.

*X* is constant on each minterm generated by the class {*E*_{1}, *E*_{2}, ⋯, *E*_{m}} since, as noted in the treatment of the minterm expansion, each indicator function *I*_{Ei} is constant on each minterm. We determine the value *s*_{i} of *X* on each minterm *M*_{i}. This describes *X* in a special primitive form

(6.22)  *X* = ∑_{i=0}^{2^m – 1} *s*_{i} *I*_{Mi}

We apply the csort operation to the matrices of values and minterm probabilities to determine the distribution for *X*.

We illustrate with a simple example. Extension to the general case should be quite evident. First, we do the problem “by hand” in tabular form. Then we use the m-procedures to carry out the desired operations.

Example 6.9. A mail order house is featuring three items (limit one of each kind per customer). Let

- *E*_{1} = the event the customer orders item 1, at a price of 10 dollars.
- *E*_{2} = the event the customer orders item 2, at a price of 18 dollars.
- *E*_{3} = the event the customer orders item 3, at a price of 10 dollars.

There is a mailing charge of 3 dollars per order.

We suppose {*E*_{1}, *E*_{2}, *E*_{3}} is independent with probabilities 0.6, 0.3, 0.5,
respectively. Let *X* be the amount a customer who orders the special items spends on
them plus mailing cost. Then, in affine form,

(6.23)  *X* = 10 *I*_{E1} + 18 *I*_{E2} + 10 *I*_{E3} + 3

We seek first the primitive form, using the minterm probabilities, which may be calculated in this case by using the m-function minprob.

To obtain the value of *X* on each minterm we

- Multiply the minterm vector for each generating event by the coefficient for that event
- Sum the values on each minterm and add the constant

To complete the table, list the corresponding minterm probabilities.

| *i* | 10 *I*_{E1} | 18 *I*_{E2} | 10 *I*_{E3} | *c* | *s*_{i} | *pm*_{i} |
|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 3 | 3 | 0.14 |
| 1 | 0 | 0 | 10 | 3 | 13 | 0.14 |
| 2 | 0 | 18 | 0 | 3 | 21 | 0.06 |
| 3 | 0 | 18 | 10 | 3 | 31 | 0.06 |
| 4 | 10 | 0 | 0 | 3 | 13 | 0.21 |
| 5 | 10 | 0 | 10 | 3 | 23 | 0.21 |
| 6 | 10 | 18 | 0 | 3 | 31 | 0.09 |
| 7 | 10 | 18 | 10 | 3 | 41 | 0.09 |

We then sort on the *s*_{i}, the values on the various *M*_{i}, to expose more clearly the primitive form for *X*.

| *i* | *s*_{i} | *pm*_{i} |
|---|---|---|
| 0 | 3 | 0.14 |
| 1 | 13 | 0.14 |
| 4 | 13 | 0.21 |
| 2 | 21 | 0.06 |
| 5 | 23 | 0.21 |
| 3 | 31 | 0.06 |
| 6 | 31 | 0.09 |
| 7 | 41 | 0.09 |

The primitive form of *X* is thus

(6.24)  *X* = 3 *I*_{M0} + 13 *I*_{M1} + 21 *I*_{M2} + 31 *I*_{M3} + 13 *I*_{M4} + 23 *I*_{M5} + 31 *I*_{M6} + 41 *I*_{M7}

We note that the value 13 is taken on on minterms *M*_{1} and *M*_{4}. The probability *X* has the value 13 is thus *p*(1) + *p*(4). Similarly, *X* has value 31 on minterms *M*_{3} and *M*_{6}. To complete the process of determining the distribution, we list the sorted values and consolidate by adding together the probabilities of the minterms on which each value is taken, as follows:

| *k* | *t*_{k} | *p*_{k} |
|---|---|---|
| 1 | 3 | 0.14 |
| 2 | 13 | 0.14 + 0.21 = 0.35 |
| 3 | 21 | 0.06 |
| 4 | 23 | 0.21 |
| 5 | 31 | 0.06 + 0.09 = 0.15 |
| 6 | 41 | 0.09 |

The results may be put in a matrix **X** of possible values and a corresponding matrix **PX** of probabilities that *X* takes on each of these values. Examination of the table shows that

(6.25)  **X** = [3 13 21 23 31 41]  and  **PX** = [0.14 0.35 0.06 0.21 0.15 0.09]

Matrices **X** and **PX** describe the distribution for *X*.
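As an illustrative cross-check of the minterm enumeration for the mail-order example (in Python rather than the text's MATLAB; the variable names are ours):

```python
# Sketch: distribution from affine form by enumerating minterms,
# as in the mail-order example (a Python stand-in for the hand calculation).
from itertools import product
from collections import defaultdict

coef = [10, 18, 10]       # coefficients of I_E1, I_E2, I_E3
c0 = 3                    # constant term (mailing charge)
pE = [0.6, 0.3, 0.5]      # P(E_i), with {E_i} independent

dist = defaultdict(float)
for bits in product([0, 1], repeat=3):               # one tuple per minterm
    s = c0 + sum(c * b for c, b in zip(coef, bits))  # value s_i on the minterm
    pm = 1.0
    for p, b in zip(pE, bits):                       # minterm probability
        pm *= p if b else (1 - p)
    dist[s] += pm                                    # consolidate by value

for t in sorted(dist):
    print(t, round(dist[t], 4))
# 3 0.14
# 13 0.35
# 21 0.06
# 23 0.21
# 31 0.15
# 41 0.09
```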

### An m-procedure for determining the distribution from affine form

We now consider suitable MATLAB steps in determining the distribution from affine form, then
incorporate these in the m-procedure *canonic* for carrying out the transformation.
We start with the random variable in affine form, and suppose we have available, or
can calculate, the minterm probabilities.

The procedure uses *mintable* to set the basic minterm vector patterns, then uses a matrix of coefficients, including the constant term (set to zero if absent), to obtain the values on each minterm. The minterm probabilities are included in a row matrix. Having obtained the values on each minterm, the procedure performs the desired consolidation by using the m-function csort.

Example 6.10. Steps in the determination of the distribution for *X* in Example 6.9

```matlab
>> c = [10 18 10 3];             % Constant term is listed last
>> pm = minprob(0.1*[6 3 5]);
>> M = mintable(3)               % Minterm vector pattern
M =
     0     0     0     0     1     1     1     1
     0     0     1     1     0     0     1     1
     0     1     0     1     0     1     0     1
% - - - - - - - - - - - -        % An approach mimicking "hand" calculation
>> C = colcopy(c(1:3),8)         % Coefficients in position
C =
    10    10    10    10    10    10    10    10
    18    18    18    18    18    18    18    18
    10    10    10    10    10    10    10    10
>> CM = C.*M                     % Minterm vector values
CM =
     0     0     0     0    10    10    10    10
     0     0    18    18     0     0    18    18
     0    10     0    10     0    10     0    10
>> cM = sum(CM) + c(4)           % Values on minterms
cM =   3    13    21    31    13    23    31    41
% - - - - - - - - - - - -        % Practical MATLAB procedure
>> s = c(1:3)*M + c(4)
s  =   3    13    21    31    13    23    31    41
>> pm
pm =  0.14  0.14  0.06  0.06  0.21  0.21  0.09  0.09   % Extra zeros deleted
>> const = c(4)*ones(1,8);
>> disp([CM;const;s;pm]')        % Display of primitive form
     0     0     0     3     3    0.14   % MATLAB gives four decimals
     0     0    10     3    13    0.14
     0    18     0     3    21    0.06
     0    18    10     3    31    0.06
    10     0     0     3    13    0.21
    10     0    10     3    23    0.21
    10    18     0     3    31    0.09
    10    18    10     3    41    0.09
>> [X,PX] = csort(s,pm);         % Sorting on s, consolidation of pm
>> disp([X;PX]')                 % Display of final result
     3   0.14
    13   0.35
    21   0.06
    23   0.21
    31   0.15
    41   0.09
```

The two basic steps are combined in the m-procedure *canonic*, which we use
to solve the previous problem.

```matlab
>> c = [10 18 10 3];     % Note that the constant term 3 must be included last
>> pm = minprob([0.6 0.3 0.5]);
>> canonic
 Enter row vector of coefficients  c
 Enter row vector of minterm probabilities  pm
 Use row matrices X and PX for calculations
 Call for XDBN to view the distribution
>> disp(XDBN)
    3.0000    0.1400
   13.0000    0.3500
   21.0000    0.0600
   23.0000    0.2100
   31.0000    0.1500
   41.0000    0.0900
```

With the distribution available in the matrices *X* (set of values) and **PX** (set of
probabilities), we may calculate a wide variety of quantities associated with the
random variable.

We use two key devices:

- Use relational and logical operations on the matrix of values *X* to determine a matrix *M* which has ones for those values which meet a prescribed condition. Then *P*(*X* ∈ *M*) is obtained as PM = M*PX'.
- Determine *G* = *g*(*X*) by using array operations on matrix *X*. We have two alternatives:
  - Use the matrix *G*, which has values *g*(*t*_{i}) for each possible value *t*_{i} for *X*, or,
  - Apply csort to the pair (*G*, *PX*) to get the distribution for *Z* = *g*(*X*). This distribution (in value and probability matrices) may be used in exactly the same manner as that for the original random variable *X*.

Suppose for the random variable *X* in Example 6.11 it is desired to determine the probabilities

*P*(15≤*X*≤35), *P*(|*X*–20|≤7), and *P*((*X*–10)(*X*–25)>0).

```matlab
>> M = (X>=15)&(X<=35);
M  =   0     0     1     1     1     0   % Ones for values with 15 <= X <= 35
>> PM = M*PX'                    % Picks out and sums those probabilities
PM =   0.4200
>> N = abs(X-20)<=7;
N  =   0     1     1     1     0     0   % Ones for values with |X - 20| <= 7
>> PN = N*PX'                    % Picks out and sums those probabilities
PN =   0.6200
>> G = (X - 10).*(X - 25)        % Value of g(t_i) for each possible value
G  =  154   -36   -44   -26   126   496
>> P1 = (G>0)*PX'                % Total probability for those t_i such that
P1 =   0.3800                    % g(t_i) > 0
>> [Z,PZ] = csort(G,PX)          % Distribution for Z = g(X)
Z  =  -44   -36   -26   126   154   496
PZ =   0.0600  0.3500  0.2100  0.1500  0.1400  0.0900
>> P2 = (Z>0)*PZ'                % Calculation using distribution for Z
P2 =   0.3800
```
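A Python sketch of the same two devices, as an illustrative cross-check (the text's actual tools are the MATLAB operations above):

```python
# Sketch: probability calculations with the distribution (X, PX)
# from the mail-order example, mirroring the MATLAB session.
X  = [3, 13, 21, 23, 31, 41]
PX = [0.14, 0.35, 0.06, 0.21, 0.15, 0.09]

PM = sum(p for x, p in zip(X, PX) if 15 <= x <= 35)      # P(15 <= X <= 35)
PN = sum(p for x, p in zip(X, PX) if abs(x - 20) <= 7)   # P(|X - 20| <= 7)
G  = [(x - 10) * (x - 25) for x in X]                    # g(t_i) for each value
P1 = sum(p for g, p in zip(G, PX) if g > 0)              # P((X-10)(X-25) > 0)

print(round(PM, 2), round(PN, 2), round(P1, 2))   # 0.42 0.62 0.38
```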

Ten race cars are involved in time trials to determine pole positions for an
upcoming race. To qualify, they must post an average speed of 125 mph or more on a
trial run. Let *E*_{i} be the event the *i*th car makes qualifying speed. It seems
reasonable to suppose the class {*E*_{i} : 1 ≤ *i* ≤ 10} is independent. If the
respective probabilities for success are 0.90, 0.88, 0.93, 0.77, 0.85, 0.96, 0.72,
0.83, 0.91, 0.84, what is the probability that *k* or more will qualify
(*k* = 6, 7, 8, 9, 10)?

**SOLUTION**

Let *X* = ∑_{i=1}^{10} *I*_{Ei}, the number of cars that qualify.

```matlab
>> c = [ones(1,10) 0];
>> P = [0.90, 0.88, 0.93, 0.77, 0.85, 0.96, 0.72, 0.83, 0.91, 0.84];
>> canonic
 Enter row vector of coefficients  c
 Enter row vector of minterm probabilities  minprob(P)
 Use row matrices X and PX for calculations
 Call for XDBN to view the distribution
>> k = 6:10;
>> for i = 1:length(k)
     Pk(i) = (X>=k(i))*PX';
   end
>> disp(Pk)
    0.9938    0.9628    0.8472    0.5756    0.2114
```

This solution is not as convenient to write out. However, with the distribution for
*X* as defined, a great many other probabilities can be determined. This is particularly
the case when it is desired to compare the results of two independent races or “heats.”
We consider such problems in the study of Independent Classes of Random
Variables.
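As an illustrative cross-check in Python (not the text's method), the same "k or more" probabilities can be obtained by a dynamic program that folds in the ten independent qualification events one at a time:

```python
# Sketch: P(k or more qualify) for the race-car example, via a
# dynamic program over the ten independent qualification events.
probs = [0.90, 0.88, 0.93, 0.77, 0.85, 0.96, 0.72, 0.83, 0.91, 0.84]

# dist[j] = probability that exactly j of the cars processed so far qualify
dist = [1.0]
for p in probs:
    new = [0.0] * (len(dist) + 1)
    for j, d in enumerate(dist):
        new[j] += d * (1 - p)      # this car fails to qualify
        new[j + 1] += d * p        # this car qualifies
    dist = new

for k in range(6, 11):
    print(k, round(sum(dist[k:]), 4))
# 6 0.9938
# 7 0.9628
# 8 0.8472
# 9 0.5756
# 10 0.2114
```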

**A function form for canonic**

One disadvantage of the procedure canonic is that it always names the output
*X* and **PX**. While these can easily be renamed, frequently it is desirable to use some other
name for the random variable from the start. A function form, which we call **canonicf**,
is useful in this case.

```matlab
>> c = [10 18 10 3];
>> pm = minprob(0.1*[6 3 5]);
>> [Z,PZ] = canonicf(c,pm);
>> disp([Z;PZ]')                 % Numbers as before, but the distribution
    3.0000    0.1400             % matrices are now named Z and PZ
   13.0000    0.3500
   21.0000    0.0600
   23.0000    0.2100
   31.0000    0.1500
   41.0000    0.0900
```

### General random variables

The distribution for a simple random variable is easily visualized as point mass
concentrations at the various values in the range, and the class
of events determined by a simple random variable is described in terms of the
partition generated by *X* (i.e., the class of those events of the form *A*_{i} = {*X* = *t*_{i}}
for each *t*_{i} in the range). The situation is conceptually the same for the
general case, but the details are more complicated. If the random variable takes on
a continuum of values, then the probability mass distribution may be spread smoothly
on the line. Or, the distribution may be a mixture of point mass concentrations and
smooth distributions on some intervals. The class of events determined by *X* is the
set of all inverse images *X*^{–1}(*M*) for *M* any member of a general class of subsets
of the real line known in the mathematical literature as the *Borel sets*. There are
technical mathematical reasons for not saying *M* is *any* subset, but the class of
Borel sets is general enough to include any set likely to be encountered in
applications—certainly at the level of this treatment. The Borel sets include any
interval and any set that can be formed by complements, countable unions, and countable
intersections of Borel sets. This is a type of class known as a *sigma algebra* of
events. Because of the preservation of set operations by the inverse image, the class
of events determined by random variable *X* is also a sigma algebra, and is often
designated *σ*(*X*). There are some technical questions concerning the probability
measure *P*_{X} induced by *X*, hence the distribution. These also are settled in such a
manner that there is no need for concern at this level of analysis. However, some of
these questions become important in dealing with random processes and other advanced
notions increasingly used in applications. Two facts provide the freedom we need to
proceed with little concern for the technical details.

- *X*^{–1}(*M*) is an event for every Borel set *M* iff *X*^{–1}((–∞, *t*]) is an event for every semi-infinite interval (–∞, *t*] on the real line.
- The induced probability distribution is determined uniquely by its assignment to all intervals of the form (–∞, *t*].

These facts point to the importance of the distribution function introduced in the next chapter.

Another fact, alluded to above and discussed in some detail in the next chapter, is that any general random variable can be approximated as closely as desired by a simple random variable. We turn in the next chapter to a description of certain commonly encountered probability distributions and ways to describe them analytically.

## 6.2. Problems on Random Variables and Probabilities^{*}

**Exercise 1.** The following simple random variable is in canonical form:

*X*=–3.75*I*_{A}–1.13*I*_{B}+0*I*_{C}+2.6*I*_{D}.

Express the events {*X*∈(–4,2]},{*X*∈(0,3]},{*X*∈(–∞,1]},
{|*X*–1|≥1}, and {*X*≥0} in terms of *A*,*B*,*C*, and *D*.

{*X* ∈ (–4, 2]} = *A* ⋁ *B* ⋁ *C*,  {*X* ∈ (0, 3]} = *D*,  {*X* ∈ (–∞, 1]} = *A* ⋁ *B* ⋁ *C*,  {|*X* – 1| ≥ 1} = *A* ⋁ *B* ⋁ *C* ⋁ *D* = *Ω*,  {*X* ≥ 0} = *C* ⋁ *D*

**Exercise 2.** Random variable *X*, in canonical form, is given by *X* = –2*I*_{A} – *I*_{B} + *I*_{C} + 2*I*_{D} + 5*I*_{E}.

Express the events {*X* ∈ [2, 3)}, {*X* ≤ 0}, {*X* < 0}, {|*X* – 2| ≤ 3},
and {*X*^{2} ≥ 4}, in terms of *A*, *B*, *C*, *D*, and *E*.

{*X* ∈ [2, 3)} = *D*,  {*X* ≤ 0} = *A* ⋁ *B*,  {*X* < 0} = *A* ⋁ *B*,  {|*X* – 2| ≤ 3} = *B* ⋁ *C* ⋁ *D* ⋁ *E*,  {*X*^{2} ≥ 4} = *A* ⋁ *D* ⋁ *E*

**Exercise 3.** The class {*C*_{j} : 1 ≤ *j* ≤ 10} is a partition. Random variable *X*
has values {1, 3, 2, 3, 4, 2, 1, 3, 5, 2} on *C*_{1} through *C*_{10}, respectively.
Express *X* in canonical form.

```matlab
T = [1 3 2 3 4 2 1 3 5 2];
[X,I] = sort(T)
X =  1     1     2     2     2     3     3     3     4     5
I =  1     7     3     6    10     2     4     8     5     9
```

*X* = *I*_{A} + 2 *I*_{B} + 3 *I*_{C} + 4 *I*_{D} + 5 *I*_{E},  where  *A* = *C*_{1} ⋁ *C*_{7},  *B* = *C*_{3} ⋁ *C*_{6} ⋁ *C*_{10},  *C* = *C*_{2} ⋁ *C*_{4} ⋁ *C*_{8},  *D* = *C*_{5},  *E* = *C*_{9}

**Exercise 4.** The class {*C*_{j} : 1 ≤ *j* ≤ 10} in Exercise 3 has respective
probabilities 0.08, 0.13, 0.06, 0.09, 0.14, 0.11, 0.12, 0.07, 0.11, 0.09. Determine
the distribution for *X*.

```matlab
T = [1 3 2 3 4 2 1 3 5 2];
pc = 0.01*[8 13 6 9 14 11 12 7 11 9];
[X,PX] = csort(T,pc);
disp([X;PX]')
    1.0000    0.2000
    2.0000    0.2600
    3.0000    0.2900
    4.0000    0.1400
    5.0000    0.1100
```

**Exercise 5.** A wheel is spun yielding on an equally likely basis the
integers 1 through 10. Let *C*_{i} be the event the wheel stops at *i*, 1 ≤ *i* ≤ 10.
Each *P*(*C*_{i}) = 0.1. If the numbers 1, 4, or 7 turn up, the player loses ten dollars;
if the numbers 2, 5, or 8 turn up, the player gains nothing; if the numbers 3, 6, or 9
turn up, the player gains ten dollars; if the number 10 turns up, the player loses one
dollar. The random variable expressing the results may be expressed in primitive form as

*X* = –10 *I*_{C1} + 0 *I*_{C2} + 10 *I*_{C3} – 10 *I*_{C4} + 0 *I*_{C5} + 10 *I*_{C6} – 10 *I*_{C7} + 0 *I*_{C8} + 10 *I*_{C9} – *I*_{C10}

- Determine the distribution for *X*, (a) by hand, (b) using MATLAB.
- Determine *P*(*X* < 0) and *P*(*X* > 0).

```matlab
p = 0.1*ones(1,10);
c = [-10 0 10 -10 0 10 -10 0 10 -1];
[X,PX] = csort(c,p);
disp([X;PX]')
  -10.0000    0.3000
   -1.0000    0.1000
         0    0.3000
   10.0000    0.3000
Pneg = (X<0)*PX'
Pneg = 0.4000
Ppos = (X>0)*PX'
Ppos = 0.3000
```

**Exercise 6.** A store has eight items for sale. The prices are $3.50, $5.00, $3.50, $7.50, $5.00, $5.00, $3.50, and $7.50, respectively. A customer comes in. She purchases one of the items with probabilities 0.10, 0.15, 0.15, 0.20, 0.10, 0.05, 0.10, 0.15. The random variable expressing the amount of her purchase may be written

*X* = 3.5 *I*_{C1} + 5.0 *I*_{C2} + 3.5 *I*_{C3} + 7.5 *I*_{C4} + 5.0 *I*_{C5} + 5.0 *I*_{C6} + 3.5 *I*_{C7} + 7.5 *I*_{C8}

Determine the distribution for *X* (a) by hand, (b) using MATLAB.

```matlab
p = 0.01*[10 15 15 20 10 5 10 15];
c = [3.5 5 3.5 7.5 5 5 3.5 7.5];
[X,PX] = csort(c,p);
disp([X;PX]')
    3.5000    0.3500
    5.0000    0.3000
    7.5000    0.3500
```

**Exercise 7.** Suppose *X* and *Y* in canonical form are

*X* = 2 *I*_{A1} + 3 *I*_{A2} + 5 *I*_{A3}   *Y* = *I*_{B1} + 2 *I*_{B2} + 3 *I*_{B3}

The *P*(*A*_{i}) are 0.3, 0.6, 0.1, respectively, and the *P*(*B*_{j}) are 0.2, 0.6, 0.2. Each
pair {*A*_{i}, *B*_{j}} is independent. Consider the random variable *Z* = *X* + *Y*. Then
*Z* = 2 + 1 on *A*_{1}*B*_{1}, *Z* = 3 + 3 on *A*_{2}*B*_{3}, etc. Determine the value of
*Z* on each *A*_{i}*B*_{j} and determine the corresponding *P*(*A*_{i}*B*_{j}). From this,
determine the distribution for *Z*.

```matlab
A = [2 3 5];
B = [1 2 3];
a = rowcopy(A,3);
b = colcopy(B,3);
Z = a + b              % Possible values of sum Z = X + Y
Z =
     3     4     6
     4     5     7
     5     6     8
PA = [0.3 0.6 0.1];
PB = [0.2 0.6 0.2];
pa = rowcopy(PA,3);
pb = colcopy(PB,3);
P = pa.*pb             % Probabilities for various values
P =
    0.0600    0.1200    0.0200
    0.1800    0.3600    0.0600
    0.0600    0.1200    0.0200
[Z,PZ] = csort(Z,P);
disp([Z;PZ]')          % Distribution for Z = X + Y
    3.0000    0.0600
    4.0000    0.3000
    5.0000    0.4200
    6.0000    0.1400
    7.0000    0.0600
    8.0000    0.0200
```

**Exercise 8.** For the random variables in Exercise 7, let *W* = *XY*. Determine
the value of *W* on each *A*_{i}*B*_{j} and determine the distribution of *W*.

```matlab
XY = a.*b              % XY values
XY =
     2     3     5
     4     6    10
     6     9    15
[W,PW] = csort(XY,P);
disp([W;PW]')          % Distribution for W = XY
    2.0000    0.0600
    3.0000    0.1200
    4.0000    0.1800
    5.0000    0.0200
    6.0000    0.4200
    9.0000    0.1200
   10.0000    0.0600
   15.0000    0.0200
```

A pair of dice is rolled.

- Let *X* be the minimum of the two numbers which turn up. Determine the distribution for *X*.
- Let *Y* be the maximum of the two numbers. Determine the distribution for *Y*.
- Let *Z* be the sum of the two numbers. Determine the distribution for *Z*.
- Let *W* be the absolute value of the difference. Determine its distribution.

```matlab
t = 1:6;
c = ones(6,6);
[x,y] = meshgrid(t,t)
x =   1   2   3   4   5   6    % x-values in each position
      1   2   3   4   5   6
      1   2   3   4   5   6
      1   2   3   4   5   6
      1   2   3   4   5   6
      1   2   3   4   5   6
y =   1   1   1   1   1   1    % y-values in each position
      2   2   2   2   2   2
      3   3   3   3   3   3
      4   4   4   4   4   4
      5   5   5   5   5   5
      6   6   6   6   6   6
m = min(x,y);        % min in each position
M = max(x,y);        % max in each position
s = x + y;           % sum x+y in each position
d = abs(x - y);      % |x - y| in each position
[X,fX] = csort(m,c)  % sorts values and counts occurrences
X =   1   2   3   4   5   6
fX = 11   9   7   5   3   1    % PX = fX/36
[Y,fY] = csort(M,c)
Y =   1   2   3   4   5   6
fY =  1   3   5   7   9  11    % PY = fY/36
[Z,fZ] = csort(s,c)
Z =   2   3   4   5   6   7   8   9  10  11  12
fZ =  1   2   3   4   5   6   5   4   3   2   1    % PZ = fZ/36
[W,fW] = csort(d,c)
W =   0   1   2   3   4   5
fW =  6  10   8   6   4   2    % PW = fW/36
```
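The same counts can be obtained by direct enumeration of the 36 equally likely outcomes; a Python sketch using `collections.Counter`:

```python
from collections import Counter
from itertools import product

# Enumerate the 36 equally likely outcomes of rolling two dice
rolls = list(product(range(1, 7), repeat=2))
fX = Counter(min(a, b) for a, b in rolls)       # minimum
fY = Counter(max(a, b) for a, b in rolls)       # maximum
fZ = Counter(a + b for a, b in rolls)           # sum
fW = Counter(abs(a - b) for a, b in rolls)      # |difference|

print([fX[k] for k in range(1, 7)])    # [11, 9, 7, 5, 3, 1]; PX = fX/36
print([fZ[k] for k in range(2, 13)])   # [1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1]; PZ = fZ/36
print([fW[k] for k in range(0, 6)])    # [6, 10, 8, 6, 4, 2]; PW = fW/36
```

Dividing each count by 36 gives the four distributions found above.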

Minterm probabilities *p*(0) through *p*(15) for the class {*A*, *B*, *C*, *D*}
are, in order,

0.072, 0.048, 0.018, 0.012, 0.168, 0.112, 0.042, 0.028, 0.062, 0.048, 0.028, 0.010, 0.170, 0.110, 0.040, 0.032

Determine the distribution for the random variable

*X* = – 5.3*I*_{A} – 2.5*I*_{B} + 2.3*I*_{C} + 4.2*I*_{D} – 3.7

```matlab
% file npr06_10.m
% Data for Exercise 10.
pm = [0.072 0.048 0.018 0.012 0.168 0.112 0.042 0.028 ...
      0.062 0.048 0.028 0.010 0.170 0.110 0.040 0.032];
c = [-5.3 -2.5 2.3 4.2 -3.7];
disp('Minterm probabilities are in pm, coefficients in c')

npr06_10
Minterm probabilities are in pm, coefficients in c
canonic
 Enter row vector of coefficients  c
 Enter row vector of minterm probabilities  pm
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
XDBN
XDBN =
  -11.5000    0.1700
   -9.2000    0.0400
   -9.0000    0.0620
   -7.3000    0.1100
   -6.7000    0.0280
   -6.2000    0.1680
   -5.0000    0.0320
   -4.8000    0.0480
   -3.9000    0.0420
   -3.7000    0.0720
   -2.5000    0.0100
   -2.0000    0.1120
   -1.4000    0.0180
    0.3000    0.0280
    0.5000    0.0480
    2.8000    0.0120
```
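What `canonic` does can be sketched directly: each minterm *k* corresponds to a bit pattern over {*A*, *B*, *C*, *D*}, and *X* takes on that minterm the sum of the coefficients of the events which occur, plus the constant term. A Python sketch, assuming the usual minterm ordering (minterm 0 = no events occur, leftmost event = highest-order bit):

```python
from collections import defaultdict

# Minterm probabilities p(0)..p(15) and coefficients from the exercise;
# the last MATLAB coefficient is the constant term
pm = [0.072, 0.048, 0.018, 0.012, 0.168, 0.112, 0.042, 0.028,
      0.062, 0.048, 0.028, 0.010, 0.170, 0.110, 0.040, 0.032]
c = [-5.3, -2.5, 2.3, 4.2]
const = -3.7

dist = defaultdict(float)
for k, p in enumerate(pm):
    # In minterm k, event j occurs iff bit (3 - j) of k is set
    x = const + sum(cj for j, cj in enumerate(c) if k & (1 << (3 - j)))
    dist[round(x, 4)] += p    # round to avoid float-key mismatches

for x in sorted(dist):
    print(x, round(dist[x], 4))
```

This reproduces XDBN, e.g. P(X = -11.5) = *p*(12) = 0.17 (minterm *AB* *C*^{c}*D*^{c}).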

On a Tuesday evening, the Houston Rockets, the Orlando Magic, and the
Chicago Bulls all have games (but not with one another). Let *A* be the event the
Rockets win, *B* be the event the Magic win, and *C* be the
event the Bulls win. Suppose the class is independent, with respective
probabilities 0.75, 0.70, 0.80. Ellen's boyfriend is a rabid Rockets fan, who does not like
the Magic. He wants to bet on the games. She decides to take him up on his bets as follows:

$10 to 5 on the Rockets --- i.e., she loses five if the Rockets win and gains ten if they lose

$10 to 5 against the Magic

even $5 to 5 on the Bulls.

Ellen's winning may be expressed as the random variable

*X* = – 5*I*_{A} + 10*I*_{Ac} + 10*I*_{B} – 5*I*_{Bc} – 5*I*_{C} + 5*I*_{Cc} = – 15*I*_{A} + 15*I*_{B} – 10*I*_{C} + 10

Determine the distribution for *X*. What are the probabilities Ellen loses money,
breaks even, or comes out ahead?

```matlab
P = 0.01*[75 70 80];
c = [-15 15 -10 10];
canonic
 Enter row vector of coefficients  c
 Enter row vector of minterm probabilities  minprob(P)
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
disp(XDBN)
  -15.0000    0.1800
   -5.0000    0.0450
         0    0.4800
   10.0000    0.1200
   15.0000    0.1400
   25.0000    0.0350
PXneg = (X<0)*PX'
PXneg =  0.2250
PX0 = (X==0)*PX'
PX0 =  0.4800
PXpos = (X>0)*PX'
PXpos =  0.2950
```
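The `minprob(P)` step converts the three individual probabilities into the 2³ minterm probabilities under independence. A Python sketch of that conversion applied to Ellen's bets (the helper name `minprob` mirrors the MATLAB tool and is not a standard Python function):

```python
from itertools import product

def minprob(P):
    """Minterm probabilities for independent events (sketch of MATLAB minprob).
    Order: all-fail first, all-occur last; leftmost event = highest-order bit."""
    pm = []
    for bits in product((0, 1), repeat=len(P)):
        q = 1.0
        for b, p in zip(bits, P):
            q *= p if b else 1 - p
        pm.append(q)
    return pm

# Ellen's winnings: X = -15*I_A + 15*I_B - 10*I_C + 10
P = [0.75, 0.70, 0.80]
coef, const = [-15, 15, -10], 10
dist = {}
for bits, p in zip(product((0, 1), repeat=3), minprob(P)):
    x = const + sum(c for b, c in zip(bits, coef) if b)
    dist[x] = dist.get(x, 0.0) + p

Pneg = sum(p for x, p in dist.items() if x < 0)   # Ellen loses money
P0   = dist.get(0, 0.0)                           # breaks even
Ppos = sum(p for x, p in dist.items() if x > 0)   # comes out ahead
print(Pneg, P0, Ppos)
```

The totals agree with the MATLAB run: 0.225, 0.48, and 0.295.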

The class {*A*, *B*, *C*, *D*} has minterm probabilities given in the m-file npr06_12.m.

Determine whether or not the class is independent.

The random variable *X* = *I*_{A} + *I*_{B} + *I*_{C} + *I*_{D} counts the number of these events which occur on a trial.

- Find the distribution for *X* and determine the probability that two or more occur on a trial.
- Find the probability that one or three of these occur on a trial.

```matlab
npr06_12
Minterm probabilities in pm, coefficients in c
a = imintest(pm)
The class is NOT independent
Minterms for which the product rule fails
a =
     1     1     1     1
     1     1     1     1
     1     1     1     1
     1     1     1     1
canonic
 Enter row vector of coefficients  c
 Enter row vector of minterm probabilities  pm
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
XDBN =
         0    0.0050
    1.0000    0.0430
    2.0000    0.2120
    3.0000    0.4380
    4.0000    0.3020
P2 = (X>=2)*PX'
P2 =  0.9520
P13 = ((X==1)|(X==3))*PX'
P13 =  0.4810
```

James is expecting three checks in the mail, for $20, $26, and $33. Their arrivals are the events *A*, *B*, and *C*. Assume the class is independent, with respective probabilities 0.90, 0.75, 0.80. Then

*X* = 20*I*_{A} + 26*I*_{B} + 33*I*_{C}

represents the total amount received. Determine the distribution for *X*. What is the
probability he receives at least $50? Less than $30?

```matlab
c = [20 26 33 0];
P = 0.01*[90 75 80];
canonic
 Enter row vector of coefficients  c
 Enter row vector of minterm probabilities  minprob(P)
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
disp(XDBN)
         0    0.0050
   20.0000    0.0450
   26.0000    0.0150
   33.0000    0.0200
   46.0000    0.1350
   53.0000    0.1800
   59.0000    0.0600
   79.0000    0.5400
P50 = (X>=50)*PX'
P50 =  0.7800
P30 = (X<30)*PX'
P30 =  0.0650
```

A gambler places three bets. He puts down two dollars for each bet. He picks up three dollars (his original bet plus one dollar) if he wins the first bet, four dollars if he wins the second bet, and six dollars if he wins the third. His net winning can be represented by the random variable

*X* = 3*I*_{A} + 4*I*_{B} + 6*I*_{C} – 6

where *A*, *B*, *C* are the events he wins the first, second, and third bets, with respective probabilities 0.5, 0.4, 0.3. Assume the results of the games are independent. Determine the distribution for *X*.

```matlab
c = [3 4 6 -6];
P = 0.1*[5 4 3];
canonic
 Enter row vector of coefficients  c
 Enter row vector of minterm probabilities  minprob(P)
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
disp(XDBN)
   -6.0000    0.2100
   -3.0000    0.2100
   -2.0000    0.1400
         0    0.0900
    1.0000    0.1400
    3.0000    0.0900
    4.0000    0.0600
    7.0000    0.0600
```

Henry goes to a hardware store. He considers a power drill at $35, a socket
wrench set at $56, a set of screwdrivers at $18, a vise at $24, and a hammer at $8. He
decides independently on the purchases of the individual items, with respective probabilities
0.5, 0.6, 0.7, 0.4, 0.9. Let *X* be the amount of his total purchases. Determine the
distribution for *X*.

```matlab
c = [35 56 18 24 8 0];
P = 0.1*[5 6 7 4 9];
canonic
 Enter row vector of coefficients  c
 Enter row vector of minterm probabilities  minprob(P)
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
disp(XDBN)
         0    0.0036
    8.0000    0.0324
   18.0000    0.0084
   24.0000    0.0024
   26.0000    0.0756
   32.0000    0.0216
   35.0000    0.0036
   42.0000    0.0056
   43.0000    0.0324
   50.0000    0.0504
   53.0000    0.0084
   56.0000    0.0054
   59.0000    0.0024
   61.0000    0.0756
   64.0000    0.0486
   67.0000    0.0216
   74.0000    0.0126
   77.0000    0.0056
   80.0000    0.0036
   82.0000    0.1134
   85.0000    0.0504
   88.0000    0.0324
   91.0000    0.0054
   98.0000    0.0084
   99.0000    0.0486
  106.0000    0.0756
  109.0000    0.0126
  115.0000    0.0036
  117.0000    0.1134
  123.0000    0.0324
  133.0000    0.0084
  141.0000    0.0756
```

A sequence of trials (not necessarily independent) is performed. Let *E*_{i}
be the event of success on the *i*th component trial. We associate with each trial a
“payoff function” *X*_{i} = *a**I*_{Ei} + *b**I*_{Eic}. Thus, an amount *a* is earned
if there is a success on the trial and an amount *b* (usually negative) if there is a
failure. Let *S*_{n} be the number of successes in the *n* trials and *W* be the net
payoff. Show that *W* = (*a* – *b*)*S*_{n} + *bn*.
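One way to obtain the identity (a sketch of the argument): rewrite each payoff in terms of the success indicator alone, using *I*_{Eic} = 1 – *I*_{Ei},

$$X_i = a I_{E_i} + b I_{E_i^c} = a I_{E_i} + b\,(1 - I_{E_i}) = (a-b)\, I_{E_i} + b$$

then sum over the $n$ trials and use $S_n = \sum_{i=1}^{n} I_{E_i}$:

$$W = \sum_{i=1}^{n} X_i = (a-b)\sum_{i=1}^{n} I_{E_i} + bn = (a-b)\,S_n + bn$$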

A marker is placed at a reference position on a line (taken to be the origin); a coin is tossed repeatedly. If a head turns up, the marker is moved one unit to the right; if a tail turns up, the marker is moved one unit to the left.

Show that the position at the end of ten tosses is given by the random variable

*X* = Σ_{i=1}^{10} *I*_{Ei} – Σ_{i=1}^{10} *I*_{Eic} = 2*S*_{10} – 10 (6.39)

where *E*_{i} is the event of a head on the *i*th toss and *S*_{10} is the number of heads in ten trials.

After ten tosses, what are the possible positions and the probabilities of being in each?

```matlab
S = 0:10;
PS = ibinom(10,0.5,0:10);
X = 2*S - 10;
disp([X;PS]')
  -10.0000    0.0010
   -8.0000    0.0098
   -6.0000    0.0439
   -4.0000    0.1172
   -2.0000    0.2051
         0    0.2461
    2.0000    0.2051
    4.0000    0.1172
    6.0000    0.0439
    8.0000    0.0098
   10.0000    0.0010
```
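The `ibinom` probabilities are just binomial(10, 0.5) terms; a Python cross-check using `math.comb` in place of the MATLAB helper:

```python
from math import comb

# P(S10 = k) for a fair coin; position X = 2*S10 - 10
n = 10
for k in range(n + 1):
    ps = comb(n, k) / 2**n      # binomial(10, 0.5) probability
    x = 2*k - n                 # position after ten tosses
    print(f"{x:4d}  {ps:.4f}")
```

For example, the probability of returning to the origin is C(10, 5)/2¹⁰ ≈ 0.2461, matching the table.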

Margaret considers five purchases in the amounts 5, 17, 21, 8, 15 dollars with respective probabilities 0.37, 0.22, 0.38, 0.81, 0.63. Anne contemplates six purchases in the amounts 8, 15, 12, 18, 15, 12 dollars, with respective probabilities 0.77, 0.52, 0.23, 0.41, 0.83, 0.58. Assume that all eleven possible purchases form an independent class.

- Determine the distribution for *X*, the amount purchased by Margaret.
- Determine the distribution for *Y*, the amount purchased by Anne.
- Determine the distribution for *Z* = *X* + *Y*, the total amount the two purchase.

*Suggestion* for part (c). Let MATLAB perform the calculations.

```matlab
[r,s] = ndgrid(X,Y);
[t,u] = ndgrid(PX,PY);
z = r + s;
pz = t.*u;
[Z,PZ] = csort(z,pz);
```

```matlab
% file npr06_18.m
cx = [5 17 21 8 15 0];
cy = [8 15 12 18 15 12 0];
pmx = minprob(0.01*[37 22 38 81 63]);
pmy = minprob(0.01*[77 52 23 41 83 58]);

npr06_18
[X,PX] = canonicf(cx,pmx);
[Y,PY] = canonicf(cy,pmy);
[r,s] = ndgrid(X,Y);
[t,u] = ndgrid(PX,PY);
z = r + s;
pz = t.*u;
[Z,PZ] = csort(z,pz);
a = length(Z)
a = 125                   % 125 different values
plot(Z,cumsum(PZ))        % See figure; plotting details omitted
```
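The whole pipeline — `canonicf` for each shopper, then the `ndgrid`/`csort` combination — can be sketched in Python. The helper `indep_dist` below is a hypothetical stand-in for `canonicf`, building the distribution of a sum of independent "amount × indicator" terms one event at a time:

```python
from itertools import product

def indep_dist(amounts, probs):
    """Distribution of a sum of independent amount*indicator terms
    (sketch of what canonicf computes for independent events)."""
    dist = {0: 1.0}
    for a, p in zip(amounts, probs):
        new = {}
        for v, q in dist.items():
            new[v + a] = new.get(v + a, 0.0) + q * p   # purchase made
            new[v] = new.get(v, 0.0) + q * (1 - p)     # purchase not made
        dist = new
    return dist

# Margaret's and Anne's purchase amounts and probabilities (from the exercise)
X = indep_dist([5, 17, 21, 8, 15], [0.37, 0.22, 0.38, 0.81, 0.63])
Y = indep_dist([8, 15, 12, 18, 15, 12], [0.77, 0.52, 0.23, 0.41, 0.83, 0.58])

# Z = X + Y: combine the two independent distributions (the ndgrid/csort step)
Z = {}
for (x, px), (y, py) in product(X.items(), Y.items()):
    Z[x + y] = Z.get(x + y, 0.0) + px * py

print(len(Z))             # number of distinct values of Z
```

Since all amounts are integers, the consolidation is exact, and the count of distinct values matches the MATLAB result of 125.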