# Chapter 6. Random Variables and Probabilities

## 6.1. Random Variables and Probabilities

### Introduction

Probability associates with an event a number which indicates the likelihood of the occurrence of that event on any trial. An event is modeled as the set of those possible outcomes of an experiment which satisfy a property or proposition characterizing the event.

Often, each outcome is characterized by a number. The experiment is performed. If the outcome is observed as a physical quantity, the size of that quantity (in prescribed units) is the entity actually observed. In many nonnumerical cases, it is convenient to assign a number to each outcome. For example, in a coin flipping experiment, a “head” may be represented by a 1 and a “tail” by a 0. In a Bernoulli trial, a success may be represented by a 1 and a failure by a 0. In a sequence of trials, we may be interested in the number of successes in a sequence of n component trials. One could assign a distinct number to each card in a deck of playing cards. Observations of the result of selecting a card could be recorded in terms of individual numbers. In each case, the associated number becomes a property of the outcome.

### Random variables as functions

We consider in this chapter real random variables (i.e., real-valued random variables). In the chapter "Random Vectors and Joint Distributions", we extend the notion to vector-valued random quantities. The fundamental idea of a real random variable is the assignment of a real number to each elementary outcome ω in the basic space Ω. Such an assignment amounts to determining a function X, whose domain is Ω and whose range is a subset of the real line R. Recall that a real-valued function on a domain (say an interval I on the real line) is characterized by the assignment of a real number y to each element x (argument) in the domain. For a real-valued function of a real variable, it is often possible to write a formula or otherwise state a rule describing the assignment of the value to each argument. Except in special cases, we cannot write a formula for a random variable X. However, random variables share some important general properties of functions which play an essential role in determining their usefulness.

Mappings and inverse mappings

There are various ways of characterizing a function. Probably the most useful for our purposes is as a mapping from the domain Ω to the codomain R. We find the mapping diagram of Figure 1 extremely useful in visualizing the essential patterns. Random variable X, as a mapping from basic space Ω to the real line R, assigns to each element ω a value t=X(ω). The object point ω is mapped, or carried, into the image point t. Each ω is mapped into exactly one t, although several ω may have the same image point.

Associated with a function X as a mapping are the inverse mapping $X^{-1}$ and the inverse images it produces. Let M be a set of numbers on the real line. By the inverse image of M under the mapping X, we mean the set of all those $\omega \in \Omega$ which are mapped into M by X (see Figure 2). If X does not take a value in M, the inverse image is the empty set (impossible event). If M includes the range of X (the set of all possible values of X), the inverse image is the entire basic space Ω. Formally we write

$$X^{-1}(M) = \{\omega : X(\omega) \in M\} \tag{6.1}$$

Now we assume the set $X^{-1}(M)$, a subset of Ω, is an event for each M. A detailed examination of that assertion is a topic in measure theory. Fortunately, the results of measure theory ensure that we may make the assumption for any X and any subset M of the real line likely to be encountered in practice. The set $X^{-1}(M)$ is the event that X takes a value in M. As an event, it may be assigned a probability.

Example 6.1 Some illustrative examples

1. $X = I_E$, where E is an event with probability p. Now X takes on only two values, 0 and 1. The event that X takes on the value 1 is the set

$$\{\omega : X(\omega) = 1\} = X^{-1}(\{1\}) = E \tag{6.2}$$

so that $P(\{\omega : X(\omega) = 1\}) = p$. This rather ungainly notation is shortened to $P(X = 1) = p$. Similarly, $P(X = 0) = 1 - p$. Consider any set M.

• If neither 1 nor 0 is in M, then $X^{-1}(M) = \emptyset$.

• If 0 is in M, but 1 is not, then $X^{-1}(M) = E^c$.

• If 1 is in M, but 0 is not, then $X^{-1}(M) = E$.

• If both 1 and 0 are in M, then $X^{-1}(M) = \Omega$.

In this case the class of all events $X^{-1}(M)$ consists of event E, its complement $E^c$, the impossible event $\emptyset$, and the sure event Ω.

2. Consider a sequence of n Bernoulli trials, with probability p of success. Let $S_n$ be the random variable whose value is the number of successes in the sequence of n component trials. Then, according to the analysis in the section "Bernoulli Trials and the Binomial Distribution",

$$P(S_n = k) = C(n,k)\, p^k (1-p)^{n-k}, \quad 0 \le k \le n \tag{6.3}$$

Before considering further examples, we note a general property of inverse images. We state it in terms of a random variable, which maps Ω to the real line (see Figure 3).

Preservation of set operations

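Let X be a mapping from Ω to the real line R. If $M$ and $M_i$, $i \in J$, are sets of real numbers, with respective inverse images $E$ and $E_i$, then

$$X^{-1}(M^c) = E^c, \quad X^{-1}\Bigl(\bigcup_{i \in J} M_i\Bigr) = \bigcup_{i \in J} E_i, \quad \text{and} \quad X^{-1}\Bigl(\bigcap_{i \in J} M_i\Bigr) = \bigcap_{i \in J} E_i \tag{6.4}$$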

Examination of simple graphical examples exhibits the plausibility of these patterns. Formal proofs amount to careful reading of the notation. Central to the structure are the facts that each element ω is mapped into only one image point t and that the inverse image of M is the set of all those ω which are mapped into image points in M.

An easy, but important, consequence of the general patterns is that the inverse images of disjoint M,N are also disjoint. This implies that the inverse of a disjoint union of Mi is a disjoint union of the separate inverse images.
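The disjointness claim follows directly from the definition of the inverse image. The short derivation below is a sketch that uses only the notation already introduced, with M and N any two sets of real numbers.

```latex
\begin{align*}
\omega \in X^{-1}(M) \cap X^{-1}(N)
  &\iff X(\omega) \in M \ \text{and}\ X(\omega) \in N \\
  &\iff X(\omega) \in M \cap N \\
  &\iff \omega \in X^{-1}(M \cap N).
\end{align*}
% If M \cap N = \emptyset, no \omega satisfies the last line, so
% X^{-1}(M) \cap X^{-1}(N) = X^{-1}(\emptyset) = \emptyset.
```

The argument rests on the fact that each ω has exactly one image point X(ω), so that point cannot lie in two disjoint sets at once.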

Example 6.2 Events determined by a random variable

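Consider, again, the random variable $S_n$ which counts the number of successes in a sequence of n Bernoulli trials. Let n=10 and p=0.33. Suppose we want to determine the probability $P(2 < S_{10} \le 8)$. Let $A_k = \{\omega : S_{10}(\omega) = k\}$, which we usually shorten to $A_k = \{S_{10} = k\}$. Now the $A_k$ form a partition, since we cannot have $\omega \in A_k$ and $\omega \in A_j$ for $j \ne k$ (i.e., for any ω, we cannot have two values for $S_n(\omega)$). Now,

$$\{2 < S_{10} \le 8\} = A_3 \cup A_4 \cup A_5 \cup A_6 \cup A_7 \cup A_8 \tag{6.5}$$

since $S_{10}$ takes on a value greater than 2 but no greater than 8 iff it takes one of the integer values from 3 to 8. The union is disjoint, so by the additivity of probability,

$$P(2 < S_{10} \le 8) = \sum_{k=3}^{8} P(S_{10} = k) = 0.6927 \tag{6.6}$$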
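As a numerical check of the additivity step, the sum over the integer values 3 through 8 can be evaluated directly from the binomial formula. The following is a minimal sketch using only the built-in function nchoosek; the variable names are chosen here for illustration and are not part of the author's m-functions.

```matlab
% Sketch: P(2 < S10 <= 8) as the sum of P(S10 = k) for k = 3,...,8
n = 10;                      % number of Bernoulli trials
p = 0.33;                    % probability of success on each trial
k = 3:8;                     % integer values in the interval (2, 8]
pk = zeros(size(k));         % pk(i) will hold P(S10 = k(i))
for i = 1:length(k)
    pk(i) = nchoosek(n, k(i)) * p^(k(i)) * (1 - p)^(n - k(i));
end
P = sum(pk);                 % additivity over the disjoint events A_k
fprintf('P(2 < S10 <= 8) = %.4f\n', P)
```

The same pattern works for any interval: list the integer values it contains and add the corresponding binomial probabilities.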

### Mass transfer and induced probability distribution

Because of the abstract nature of the basic space and the class of events, we are limited in the kinds of calculations that can be performed meaningfully with the probabilities on the basic space. We represent probability as mass distributed on the basic space and visualize this with the aid of general Venn diagrams and minterm maps. We now think of the mapping from Ω to R as producing a point-by-point transfer of the probability mass to the real line. This may be done as follows:

To any set M on the real line assign probability mass $P_X(M) = P(X^{-1}(M))$.

It is apparent that $P_X(M) \ge 0$ and $P_X(R) = P(\Omega) = 1$. And because of the preservation of set operations by the inverse mapping, for any disjoint sequence of sets $M_i$,

$$P_X\Bigl(\bigcup_{i=1}^{\infty} M_i\Bigr) = P\Bigl(X^{-1}\Bigl(\bigcup_{i=1}^{\infty} M_i\Bigr)\Bigr) = P\Bigl(\bigcup_{i=1}^{\infty} X^{-1}(M_i)\Bigr) = \sum_{i=1}^{\infty} P\bigl(X^{-1}(M_i)\bigr) = \sum_{i=1}^{\infty} P_X(M_i) \tag{6.7}$$

This means that $P_X$ has the properties of a probability measure defined on the subsets of the real line. Some results of measure theory show that this probability is defined uniquely on a class of subsets of R that includes any set normally encountered in applications. We have achieved a point-by-point transfer of the probability apparatus to the real line in such a manner that we can make calculations about the random variable X. We call $P_X$ the probability measure induced by X. Its importance lies in the fact that $P(X \in M) = P_X(M)$. Thus, to determine the likelihood that random quantity X will take on a value in set M, we determine how much induced probability mass is in the set M. This transfer produces what is called the probability distribution for X. In the chapter "Distribution and Density Functions", we consider useful ways to describe the probability distribution induced by a random variable. We turn first to a special class of random variables.

### Simple random variables

We consider, in some detail, random variables which have only a finite set of possible values. These are called simple random variables. Thus the term “simple” is used in a special, technical sense. The importance of simple random variables rests on two facts. For one thing, in practice we can distinguish only a finite set of possible values for any random variable. In addition, any random variable may be approximated as closely as pleased by a simple random variable. When the structure and properties of simple random variables have been examined, we turn to more general cases. Many properties of simple random variables extend to the general case via the approximation procedure.

Representation with the aid of indicator functions

In order to deal with simple random variables clearly and precisely, we must find suitable ways to express them analytically. We do this with the aid of indicator functions. Three basic forms of representation are encountered. These are not mutually exclusive representations.

1. Standard or canonical form, which displays the possible values and the corresponding events. If X takes on distinct values

$$\{t_1, t_2, \cdots, t_n\} \quad \text{with respective probabilities} \quad \{p_1, p_2, \cdots, p_n\} \tag{6.8}$$

and if $A_i = \{X = t_i\}$, for $1 \le i \le n$, then $\{A_1, A_2, \cdots, A_n\}$ is a partition (i.e., on any trial, exactly one of these events occurs). We call this the partition determined by (or, generated by) X. We may write

$$X = t_1 I_{A_1} + t_2 I_{A_2} + \cdots + t_n I_{A_n} = \sum_{i=1}^{n} t_i I_{A_i} \tag{6.9}$$

If $X(\omega) = t_i$, then $\omega \in A_i$, so that $I_{A_i}(\omega) = 1$ and all the other indicator functions have value zero. The summation expression thus picks out the correct value $t_i$. This is true for any $t_i$, so the expression represents $X(\omega)$ for all ω. The distinct set of the values and the corresponding probabilities constitute the distribution for X. Probability calculations for X are made in terms of its distribution. One of the advantages of the canonical form is that it displays the range (set of values), and if the probabilities are known, the distribution is determined. Note that in canonical form, if one of the $t_i$ has value zero, we include that term. For some probability distributions it may be that $P(A_i) = 0$ for one or more of the $t_i$. In that case, we call these values null values, for they can only occur with probability zero, and hence are practically impossible. In the general formulation, we include possible null values, since they do not affect any probability calculations.

Example 6.3 Successes in Bernoulli trials

As the analysis of Bernoulli trials and the binomial distribution shows (see Section 4.8), canonical form must be

$$S_n = \sum_{k=0}^{n} k\, I_{A_k} \quad \text{with} \quad P(A_k) = C(n,k)\, p^k (1-p)^{n-k}, \quad 0 \le k \le n \tag{6.10}$$

For many purposes, both theoretical and practical, canonical form is desirable. For one thing, it displays directly the range (i.e., set of values) of the random variable. The distribution consists of the set of values $\{t_k : 1 \le k \le n\}$ paired with the corresponding set of probabilities $\{p_k : 1 \le k \le n\}$, where $p_k = P(A_k) = P(X = t_k)$.

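2. Simple random variable X may be represented by a primitive form

$$X = c_1 I_{C_1} + c_2 I_{C_2} + \cdots + c_m I_{C_m}, \quad \text{where } \{C_j : 1 \le j \le m\} \text{ is a partition} \tag{6.11}$$

Remarks

• If $\{C_j : 1 \le j \le m\}$ is a disjoint class, but $\bigcup_{j=1}^{m} C_j \ne \Omega$, we may append the event $C_{m+1} = \bigl[\bigcup_{j=1}^{m} C_j\bigr]^c$ and assign value zero to it.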

• We say a primitive form, since the representation is not unique. Any of the Ci may be partitioned, with the same value ci associated with each subset formed.

• Canonical form is a special primitive form. Canonical form is unique, and in many ways normative.

Example 6.4 Simple random variables in primitive form

• A wheel is spun yielding, on an equally likely basis, the integers 1 through 10. Let $C_i$ be the event the wheel stops at i, $1 \le i \le 10$. Each $P(C_i) = 0.1$. If the numbers 1, 4, or 7 turn up, the player loses ten dollars; if the numbers 2, 5, or 8 turn up, the player gains nothing; if the numbers 3, 6, or 9 turn up, the player gains ten dollars; if the number 10 turns up, the player loses one dollar. The random variable expressing the results may be expressed in primitive form as

$$X = -10 I_{C_1} + 0 I_{C_2} + 10 I_{C_3} - 10 I_{C_4} + 0 I_{C_5} + 10 I_{C_6} - 10 I_{C_7} + 0 I_{C_8} + 10 I_{C_9} - I_{C_{10}} \tag{6.12}$$

• A store has eight items for sale. The prices are \$3.50, \$5.00, \$3.50, \$7.50, \$5.00, \$5.00, \$3.50, and \$7.50, respectively. A customer comes in. She purchases one of the items with probabilities 0.10, 0.15, 0.15, 0.20, 0.10, 0.05, 0.10, 0.15. The random variable expressing the amount of her purchase may be written

$$X = 3.5 I_{C_1} + 5.0 I_{C_2} + 3.5 I_{C_3} + 7.5 I_{C_4} + 5.0 I_{C_5} + 5.0 I_{C_6} + 3.5 I_{C_7} + 7.5 I_{C_8} \tag{6.13}$$

3. We commonly have X represented in affine form, in which the random variable is represented as an affine combination of indicator functions (i.e., a linear combination of the indicator functions plus a constant, which may be zero).

$$X = c_0 + c_1 I_{E_1} + c_2 I_{E_2} + \cdots + c_m I_{E_m} = c_0 + \sum_{j=1}^{m} c_j I_{E_j} \tag{6.14}$$

In this form, the class $\{E_1, E_2, \cdots, E_m\}$ is not necessarily mutually exclusive, and the coefficients do not display directly the set of possible values. In fact, the $E_i$ often form an independent class.

Remark. Any primitive form is a special affine form in which $c_0 = 0$ and the $E_i$ form a partition.

Example 6.5

Consider, again, the random variable Sn which counts the number of successes in a sequence of n Bernoulli trials. If Ei is the event of a success on the ith trial, then one natural way to express the count is

$$S_n = \sum_{i=1}^{n} I_{E_i} \tag{6.15}$$

This is affine form, with $c_0 = 0$ and $c_i = 1$ for $1 \le i \le n$. In this case, the $E_i$ cannot form a mutually exclusive class, since they form an independent class.

Events generated by a simple random variable: canonical form

We may characterize the class of all inverse images formed by a simple random variable X in terms of the partition it determines. Consider any set M of real numbers. If $t_i$ in the range of X is in M, then every point $\omega \in A_i$ maps into $t_i$, hence into M. If the set J is the set of indices i such that $t_i \in M$, then only those points ω in $A_M = \bigcup_{i \in J} A_i$ map into M.

Hence, the class of events (i.e., inverse images) determined by X consists of the impossible event $\emptyset$, the sure event Ω, and the union of any subclass of the $A_i$ in the partition determined by X.

Example 6.6 Events determined by a simple random variable

Suppose simple random variable X is represented in canonical form by

$$X = -2 I_A - I_B + 0 I_C + 3 I_D \tag{6.16}$$

Then the class {A,B,C,D} is the partition determined by X and the range of X is {–2,–1,0,3}.

1. If M is the interval $[-2, 1]$, then the values -2, -1, and 0 are in M and $X^{-1}(M) = A \cup B \cup C$.

2. If M is the set (–2,–1]∪[1,5], then the values -1, 3 are in M and $X^{-1}(M) = B \cup D$.

3. The event $\{X \le 1\} = \{X \in (-\infty, 1]\} = X^{-1}(M)$, where $M = (-\infty, 1]$. Since values -2, -1, 0 are in M, the event $\{X \le 1\} = A \cup B \cup C$.
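For a simple random variable in canonical form, finding an inverse image reduces to testing which of the values $t_i$ lie in M. The sketch below does this for the sets M = (–2,–1]∪[1,5] and M = (–∞,1] with the values of Example 6.6. The variable names (t, names, in2, in3) are illustrative choices, not part of the author's toolbox.

```matlab
% Sketch: which events of the partition {A,B,C,D} make up the inverse image of M
t = [-2 -1 0 3];                 % values of X on A, B, C, D (Example 6.6)
names = 'ABCD';                  % labels for the partition events, in order

% M = (-2,-1] U [1,5]: test each value against both intervals
in2 = ((t > -2) & (t <= -1)) | ((t >= 1) & (t <= 5));
% M = (-inf, 1]
in3 = (t <= 1);

disp(names(in2))                 % labels of the events whose union is the inverse image
disp(names(in3))
```

Each displayed string lists the partition events whose union is the inverse image, so the logical vector plays the role of the index set J.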

### Determination of the distribution

Determining the partition generated by a simple random variable amounts to determining the canonical form. The distribution is then completed by determining the probabilities of each event $A_k = \{X = t_k\}$.

From a primitive form

Before writing down the general pattern, we consider an illustrative example.

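Example 6.7 The distribution from a primitive form

Suppose one item is selected at random from a group of ten items. The values (in dollars) and respective probabilities are

| $j$ | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| $c_j$ | 2.00 | 1.50 | 2.00 | 2.50 | 1.50 | 1.50 | 1.00 | 2.50 | 2.00 | 1.50 |
| $P(C_j)$ | 0.08 | 0.11 | 0.07 | 0.15 | 0.10 | 0.09 | 0.14 | 0.08 | 0.08 | 0.10 |

By inspection, we find four distinct values: $t_1 = 1.00$, $t_2 = 1.50$, $t_3 = 2.00$, and $t_4 = 2.50$. The value 1.00 is taken on for $\omega \in C_7$, so that $A_1 = C_7$ and $P(A_1) = P(C_7) = 0.14$. Value 1.50 is taken on for $\omega \in C_2, C_5, C_6, C_{10}$ so that

$$A_2 = C_2 \cup C_5 \cup C_6 \cup C_{10} \quad \text{and} \quad P(A_2) = P(C_2) + P(C_5) + P(C_6) + P(C_{10}) = 0.40 \tag{6.17}$$

Similarly

$$P(A_3) = P(C_1) + P(C_3) + P(C_9) = 0.23 \quad \text{and} \quad P(A_4) = P(C_4) + P(C_8) = 0.23 \tag{6.18}$$

The distribution for X is thus

| $k$ | 1.00 | 1.50 | 2.00 | 2.50 |
|---|---|---|---|---|
| $P(X = k)$ | 0.14 | 0.40 | 0.23 | 0.23 |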

The general procedure may be formulated as follows:

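If $X = \sum_{j=1}^{m} c_j I_{C_j}$, we identify the set of distinct values in the set $\{c_j : 1 \le j \le m\}$. Suppose these are $t_1 < t_2 < \cdots < t_n$. For any possible value $t_i$ in the range, identify the index set $J_i$ of those j such that $c_j = t_i$. Then the terms

$$\sum_{j \in J_i} c_j I_{C_j} = t_i \sum_{j \in J_i} I_{C_j} = t_i I_{A_i}, \quad \text{where } A_i = \bigcup_{j \in J_i} C_j \tag{6.19}$$

and

$$P(A_i) = P(X = t_i) = \sum_{j \in J_i} P(C_j) \tag{6.20}$$

Examination of this procedure shows that there are two phases:

• Select and sort the distinct values $t_1, t_2, \cdots, t_n$

• Add all probabilities associated with each value $t_i$ to determine $P(X = t_i)$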

We use the m-function csort which performs these two operations (see Example 4 from "Minterms and MATLAB Calculations").

Example 6.8 Use of csort on Example 6.7
>> C = [2.00 1.50 2.00 2.50 1.50 1.50 1.00 2.50 2.00 1.50];  % Matrix of c_j
>> pc = [0.08 0.11 0.07 0.15 0.10 0.09 0.14 0.08 0.08 0.10]; % Matrix of P(C_j)
>> [X,PX] = csort(C,pc);     % The sorting and consolidating operation
>> disp([X;PX]')             % Display of results
1.0000    0.1400
1.5000    0.4000
2.0000    0.2300
2.5000    0.2300


For a problem this small, use of a tool such as csort is not really needed. But in many problems with large sets of data the m-function csort is very useful.
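For readers without the m-function csort at hand, the two phases (select and sort the distinct values, then add the probabilities attached to each) can be imitated with the built-in functions unique and accumarray. This is a stand-in sketch under that assumption, not the author's csort; the names t, idx, and pt are chosen here for illustration.

```matlab
% Sketch: sort distinct values and add the probabilities attached to each one
c  = [2.00 1.50 2.00 2.50 1.50 1.50 1.00 2.50 2.00 1.50];  % values c_j
pc = [0.08 0.11 0.07 0.15 0.10 0.09 0.14 0.08 0.08 0.10];  % probabilities P(C_j)

[t, ~, idx] = unique(c);            % t: sorted distinct values; idx(j): position of c(j) in t
pt = accumarray(idx(:), pc(:))';    % add the P(C_j) that share each distinct value

disp([t; pt]')                      % each row: value t_i, then P(X = t_i)
```

The index vector idx is the computational counterpart of the index sets $J_i$: all j with the same idx(j) belong to the same $J_i$.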

From affine form

Suppose X is in affine form,

$$X = c_0 + c_1 I_{E_1} + c_2 I_{E_2} + \cdots + c_m I_{E_m} = c_0 + \sum_{j=1}^{m} c_j I_{E_j} \tag{6.21}$$

We determine a particular primitive form by determining the value of X on each minterm generated by the class $\{E_j : 1 \le j \le m\}$. We do this in a systematic way by utilizing minterm vectors and properties of indicator functions.

1. X is constant on each minterm generated by the class $\{E_1, E_2, \cdots, E_m\}$ since, as noted in the treatment of the minterm expansion, each indicator function $I_{E_i}$ is constant on each minterm. We determine the value $s_i$ of X on each minterm $M_i$. This describes X in a special primitive form

$$X = \sum_{i=0}^{2^m - 1} s_i I_{M_i}, \quad \text{with } P(M_i) = p_i, \quad 0 \le i \le 2^m - 1 \tag{6.22}$$

2. We apply the csort operation to the matrices of values and minterm probabilities to determine the distribution for X.

We illustrate with a simple example. Extension to the general case should be quite evident. First, we do the problem “by hand” in tabular form. Then we use the m-procedures to carry out the desired operations.

Example 6.9 Finding the distribution from affine form

A mail order house is featuring three items (limit one of each kind per customer). Let

• E1= the event the customer orders item 1, at a price of 10 dollars.

• E2= the event the customer orders item 2, at a price of 18 dollars.

• E3= the event the customer orders item 3, at a price of 10 dollars.

There is a mailing charge of 3 dollars per order.

We suppose $\{E_1, E_2, E_3\}$ is independent with probabilities 0.6, 0.3, 0.5, respectively. Let X be the amount a customer who orders the special items spends on them plus mailing cost. Then, in affine form,

$$X = 10 I_{E_1} + 18 I_{E_2} + 10 I_{E_3} + 3 \tag{6.23}$$

We seek first the primitive form, using the minterm probabilities, which may be calculated in this case by using the m-function minprob.

1. To obtain the value of X on each minterm we

• Multiply the minterm vector for each generating event by the coefficient for that event

• Sum the values on each minterm and add the constant

To complete the table, list the corresponding minterm probabilities.

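| $i$ | $10 I_{E_1}$ | $18 I_{E_2}$ | $10 I_{E_3}$ | $c$ | $s_i$ | $pm_i$ |
|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 3 | 3 | 0.14 |
| 1 | 0 | 0 | 10 | 3 | 13 | 0.14 |
| 2 | 0 | 18 | 0 | 3 | 21 | 0.06 |
| 3 | 0 | 18 | 10 | 3 | 31 | 0.06 |
| 4 | 10 | 0 | 0 | 3 | 13 | 0.21 |
| 5 | 10 | 0 | 10 | 3 | 23 | 0.21 |
| 6 | 10 | 18 | 0 | 3 | 31 | 0.09 |
| 7 | 10 | 18 | 10 | 3 | 41 | 0.09 |

We then sort on the $s_i$, the values on the various $M_i$, to expose more clearly the primitive form for X.

| $i$ | $s_i$ | $pm_i$ |
|---|---|---|
| 0 | 3 | 0.14 |
| 1 | 13 | 0.14 |
| 4 | 13 | 0.21 |
| 2 | 21 | 0.06 |
| 5 | 23 | 0.21 |
| 3 | 31 | 0.06 |
| 6 | 31 | 0.09 |
| 7 | 41 | 0.09 |

The primitive form of X is thus

$$X = 3 I_{M_0} + 13 I_{M_1} + 13 I_{M_4} + 21 I_{M_2} + 23 I_{M_5} + 31 I_{M_3} + 31 I_{M_6} + 41 I_{M_7} \tag{6.24}$$

We note that the value 13 is taken on minterms $M_1$ and $M_4$. The probability X has the value 13 is thus p(1)+p(4). Similarly, X has value 31 on minterms $M_3$ and $M_6$.

2. To complete the process of determining the distribution, we list the sorted values and consolidate by adding together the probabilities of the minterms on which each value is taken, as follows:

| $k$ | $t_k$ | $p_k$ |
|---|---|---|
| 1 | 3 | 0.14 |
| 2 | 13 | 0.14 + 0.21 = 0.35 |
| 3 | 21 | 0.06 |
| 4 | 23 | 0.21 |
| 5 | 31 | 0.06 + 0.09 = 0.15 |
| 6 | 41 | 0.09 |

The results may be put in a matrix X of possible values and a corresponding matrix PX of probabilities that X takes on each of these values. Examination of the table shows that

$$X = [3 \;\; 13 \;\; 21 \;\; 23 \;\; 31 \;\; 41] \quad \text{and} \quad PX = [0.14 \;\; 0.35 \;\; 0.06 \;\; 0.21 \;\; 0.15 \;\; 0.09] \tag{6.25}$$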

Matrices X and PX describe the distribution for X.
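The hand calculation above can be sketched with built-in functions only. The block below is an illustration, not the author's procedure: it uses dec2bin in place of mintable to generate the minterm patterns, computes the minterm probabilities directly from the independence assumption in place of minprob, and consolidates with unique and accumarray in place of csort. All variable names are chosen here for illustration.

```matlab
% Sketch: distribution of X = 10 I_E1 + 18 I_E2 + 10 I_E3 + 3 for independent E1, E2, E3
c  = [10 18 10];                 % coefficients of the indicator functions
c0 = 3;                          % constant term (mailing charge)
pE = [0.6 0.3 0.5];              % P(E1), P(E2), P(E3)

m = length(c);
B = dec2bin(0:2^m - 1) - '0';    % row i+1 holds the binary digits of minterm number i
s = (B * c')' + c0;              % value of X on each minterm
% Independence: each minterm probability is a product of P(Ei) or 1 - P(Ei)
P  = repmat(pE, 2^m, 1);         % one row of event probabilities per minterm
pm = prod(B .* P + (1 - B) .* (1 - P), 2)';

[t, ~, idx] = unique(s);         % sorted distinct values of X
pt = accumarray(idx(:), pm(:))'; % consolidate the minterm probabilities
disp([t; pt]')                   % each row: value t_k, then P(X = t_k)
```

The rows of B are the transposes of the columns a minterm table would hold, which is why the coefficient vector multiplies on the right here.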

### An m-procedure for determining the distribution from affine form

We now consider suitable MATLAB steps in determining the distribution from affine form, then incorporate these in the m-procedure canonic for carrying out the transformation. We start with the random variable in affine form, and suppose we have available, or can calculate, the minterm probabilities.

1. The procedure uses mintable to set the basic minterm vector patterns, then uses a matrix of coefficients, including the constant term (set to zero if absent), to obtain the values on each minterm. The minterm probabilities are included in a row matrix.

2. Having obtained the values on each minterm, the procedure performs the desired consolidation by using the m-function csort.

Example 6.10 Steps in determining the distribution for X in Example 6.9
>> c = [10 18 10 3];                 % Constant term is listed last
>> pm = minprob(0.1*[6 3 5]);
>> M  = mintable(3)                  % Minterm vector pattern
M =
0     0     0     0     1     1     1     1
0     0     1     1     0     0     1     1
0     1     0     1     0     1     0     1
% - - - - - - - - - - - - - -        % An approach mimicking "hand" calculation
>> C = colcopy(c(1:3),8)             % Coefficients in position
C =
10    10    10    10    10    10    10    10
18    18    18    18    18    18    18    18
10    10    10    10    10    10    10    10
>> CM = C.*M                         % Minterm vector values
CM =
0     0     0     0    10    10    10    10
0     0    18    18     0     0    18    18
0    10     0    10     0    10     0    10
>> cM = sum(CM) + c(4)               % Values on minterms
cM =
3    13    21    31    13    23    31    41
% - - - - - - - - - - - -  -         % Practical MATLAB procedure
>> s = c(1:3)*M + c(4)
s =
3    13    21    31    13    23    31    41
>> pm                                % Minterm probabilities (extra zeros deleted)
pm =
0.14  0.14  0.06  0.06  0.21  0.21  0.09  0.09
>> const = c(4)*ones(1,8);

>> disp([CM;const;s;pm]')            % Display of primitive form
0     0     0   3    3    0.14  % MATLAB gives four decimals
0     0    10   3   13    0.14
0    18     0   3   21    0.06
0    18    10   3   31    0.06
10     0     0   3   13    0.21
10     0    10   3   23    0.21
10    18     0   3   31    0.09
10    18    10   3   41    0.09
>> [X,PX] = csort(s,pm);              % Sorting on s, consolidation of  pm
>> disp([X;PX]')                      % Display of final result
3    0.14
13    0.35
21    0.06
23    0.21
31    0.15
41    0.09


The two basic steps are combined in the m-procedure canonic, which we use to solve the previous problem.

Example 6.11 Use of canonic on the variables of Example 6.10
>> c = [10 18 10 3]; % Note that the constant term 3 must be included last
>> pm = minprob([0.6 0.3 0.5]);
>> canonic
Enter row vector of coefficients  c
Enter row vector of minterm probabilities  pm
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
>> disp(XDBN)
3.0000    0.1400
13.0000    0.3500
21.0000    0.0600
23.0000    0.2100
31.0000    0.1500
41.0000    0.0900


With the distribution available in the matrices X (set of values) and PX (set of probabilities), we may calculate a wide variety of quantities associated with the random variable.

We use two key devices:

1. Use relational and logical operations on the matrix of values X to determine a matrix M which has ones for those values which meet a prescribed condition. Then $P(X \in M)$ is obtained as PM = M*PX'

2. Determine $G = g(X) = [g(t_1) \;\; g(t_2) \;\; \cdots \;\; g(t_n)]$ by using array operations on matrix X. We have two alternatives:

a. Use the matrix G, which has values $g(t_i)$ for each possible value $t_i$ for X, or,

b. Apply csort to the pair (G, PX) to get the distribution for Z=g(X). This distribution (in value and probability matrices) may be used in exactly the same manner as that for the original random variable X.

Example 6.12 Continuation of Example 6.11

Suppose for the random variable X in Example 6.11 it is desired to determine the probabilities

P(15≤X≤35), P(|X–20|≤7), and P((X–10)(X–25)>0).

>> M = (X>=15)&(X<=35);
M = 0   0    1    1    1    0    % Ones for minterms on which 15 <= X <= 35
>> PM = M*PX'                    % Picks out and sums those minterm probs
PM =  0.4200
>> N = abs(X-20)<=7;
N = 0    1    1    1    0    0   % Ones for minterms on which |X - 20| <= 7
>> PN = N*PX'                    % Picks out and sums those minterm probs
PN =  0.6200
>> G = (X - 10).*(X - 25)
G = 154 -36 -44 -26 126 496      % Value of g(t_i) for each possible value
>> P1 = (G>0)*PX'                % Total probability for those t_i such that
P1 =  0.3800                     % g(t_i) > 0
>> [Z,PZ] = csort(G,PX)          % Distribution for Z = g(X)
Z =  -44   -36   -26   126   154   496
PZ =  0.0600    0.3500    0.2100    0.1500    0.1400    0.0900
>> P2 = (Z>0)*PZ'                % Calculation using distribution for Z
P2 =  0.3800

Example 6.13 Alternate formulation of Example 3 from "Composite Trials"

Ten race cars are involved in time trials to determine pole positions for an upcoming race. To qualify, they must post an average speed of 125 mph or more on a trial run. Let $E_i$ be the event the ith car makes qualifying speed. It seems reasonable to suppose the class $\{E_i : 1 \le i \le 10\}$ is independent. If the respective probabilities for success are 0.90, 0.88, 0.93, 0.77, 0.85, 0.96, 0.72, 0.83, 0.91, 0.84, what is the probability that k or more will qualify (k=6,7,8,9,10)?

SOLUTION

Let $X = \sum_{i=1}^{10} I_{E_i}$.

>> c = [ones(1,10) 0];
>> P = [0.90, 0.88, 0.93, 0.77, 0.85, 0.96, 0.72, 0.83, 0.91, 0.84];
>> canonic
Enter row vector of coefficients  c
Enter row vector of minterm probabilities  minprob(P)
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
>> k = 6:10;
>> for i = 1:length(k)
Pk(i) = (X>=k(i))*PX';
end
>> disp(Pk)
0.9938    0.9628    0.8472    0.5756    0.2114


This solution is not as convenient to write out. However, with the distribution for X as defined, a great many other probabilities can be determined. This is particularly the case when it is desired to compare the results of two independent races or “heats.” We consider such problems in the study of Independent Classes of Random Variables.
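Because X here is a sum of independent indicator functions, its distribution can also be built up one car at a time by convolution, without forming all $2^{10}$ minterms. The sketch below swaps in this convolution technique, using the built-in conv, as an alternative to the minterm approach of canonic; it is an illustration under the same independence assumption, and the variable names follow the example above.

```matlab
% Sketch: distribution of the number of qualifying cars, built by repeated convolution
P = [0.90 0.88 0.93 0.77 0.85 0.96 0.72 0.83 0.91 0.84];  % P(Ei) for each car

PX = 1;                          % distribution of an empty sum: value 0 with probability 1
for i = 1:length(P)
    PX = conv(PX, [1 - P(i), P(i)]);   % add the independent indicator of Ei
end
X = 0:length(P);                 % possible numbers of qualifiers

k  = 6:10;
Pk = zeros(size(k));
for i = 1:length(k)
    Pk(i) = (X >= k(i)) * PX';   % P(X >= k(i))
end
disp(Pk)
```

The values displayed should agree with the Pk obtained from canonic above; the last entry is simply the product of all ten success probabilities.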

A function form for canonic

One disadvantage of the procedure canonic is that it always names the output X and PX. While these can easily be renamed, frequently it is desirable to use some other name for the random variable from the start. A function form, which we call canonicf, is useful in this case.

Example 6.14 Alternate solution of Example 6.11, using canonicf
>> c = [10 18 10 3];
>> pm = minprob(0.1*[6 3 5]);
>> [Z,PZ] = canonicf(c,pm);
>> disp([Z;PZ]')                % Numbers as before, but the distribution
3.0000    0.1400            % matrices are now named Z and PZ
13.0000    0.3500
21.0000    0.0600
23.0000    0.2100
31.0000    0.1500
41.0000    0.0900


### General random variables

The distribution for a simple random variable is easily visualized as point mass concentrations at the various values in the range, and the class of events determined by a simple random variable is described in terms of the partition generated by X (i.e., the class of those events of the form $A_i = \{X = t_i\}$ for each $t_i$ in the range). The situation is conceptually the same for the general case, but the details are more complicated. If the random variable takes on a continuum of values, then the probability mass distribution may be spread smoothly on the line. Or, the distribution may be a mixture of point mass concentrations and smooth distributions on some intervals.

The class of events determined by X is the set of all inverse images $X^{-1}(M)$ for M any member of a general class of subsets of the real line known in the mathematical literature as the Borel sets. There are technical mathematical reasons for not saying M is any subset, but the class of Borel sets is general enough to include any set likely to be encountered in applications, certainly at the level of this treatment. The Borel sets include any interval and any set that can be formed by complements, countable unions, and countable intersections of Borel sets. This is a type of class known as a sigma algebra of events. Because of the preservation of set operations by the inverse image, the class of events determined by random variable X is also a sigma algebra, and is often designated σ(X).

There are some technical questions concerning the probability measure $P_X$ induced by X, hence the distribution. These also are settled in such a manner that there is no need for concern at this level of analysis. However, some of these questions become important in dealing with random processes and other advanced notions increasingly used in applications. Two facts provide the freedom we need to proceed with little concern for the technical details.

1. $X^{-1}(M)$ is an event for every Borel set M iff for every semi-infinite interval $(-\infty, t]$ on the real line $X^{-1}((-\infty, t])$ is an event.

2. The induced probability distribution is determined uniquely by its assignment to all intervals of the form $(-\infty, t]$.

These facts point to the importance of the distribution function introduced in the next chapter.

Another fact, alluded to above and discussed in some detail in the next chapter, is that any general random variable can be approximated as closely as pleased by a simple random variable. We turn in the next chapter to a description of certain commonly encountered probability distributions and ways to describe them analytically.

## 6.2. Problems on Random Variables and Probabilities

The following simple random variable is in canonical form:

$X = -3.75 I_A - 1.13 I_B + 0 I_C + 2.6 I_D$.

Express the events {X∈(–4,2]}, {X∈(0,3]}, {X∈(–∞,1]}, {|X–1|≥1}, and {X≥0} in terms of A, B, C, and D.

• $A \cup B \cup C$

• D

• $A \cup B \cup C$

• $A \cup B \cup C \cup D = \Omega$, since every value of X satisfies $|X - 1| \ge 1$

• $C \cup D$

Random variable X, in canonical form, is given by $X = -2 I_A - I_B + I_C + 2 I_D + 5 I_E$.

Express the events {X∈[2,3)},{X≤0},{X<0}, {|X–2|≤3}, and , in terms of A,B,C,D, and E.

• D

• $A \cup B$

• $A \cup B$

• $B \cup C \cup D \cup E$

The class $\{C_j : 1 \le j \le 10\}$ is a partition. Random variable X has values {1,3,2,3,4,2,1,3,5,2} on $C_1$ through $C_{10}$, respectively. Express X in canonical form.

T = [1 3 2 3 4 2 1 3 5 2];
[X,I] = sort(T)
X =   1   1   2   2   2   3   3   3   4   5
I =   1   7   3   6  10   2   4   8   5   9

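$$X = I_A + 2 I_B + 3 I_C + 4 I_D + 5 I_E \tag{6.26}$$

$$A = C_1 \cup C_7, \quad B = C_3 \cup C_6 \cup C_{10}, \quad C = C_2 \cup C_4 \cup C_8, \quad D = C_5, \quad E = C_9 \tag{6.27}$$

The class $\{C_j : 1 \le j \le 10\}$ in Exercise 3 has respective probabilities 0.08, 0.13, 0.06, 0.09, 0.14, 0.11, 0.12, 0.07, 0.11, 0.09. Determine the distribution for X.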

T = [1 3 2 3 4 2 1 3 5 2];
pc = 0.01*[8 13 6 9 14 11 12 7 11 9];
[X,PX] = csort(T,pc);
disp([X;PX]')
1.0000    0.2000
2.0000    0.2600
3.0000    0.2900
4.0000    0.1400
5.0000    0.1100


A wheel is spun yielding, on an equally likely basis, the integers 1 through 10. Let $C_i$ be the event the wheel stops at i, $1 \le i \le 10$. Each $P(C_i) = 0.1$. If the numbers 1, 4, or 7 turn up, the player loses ten dollars; if the numbers 2, 5, or 8 turn up, the player gains nothing; if the numbers 3, 6, or 9 turn up, the player gains ten dollars; if the number 10 turns up, the player loses one dollar. The random variable expressing the results may be expressed in primitive form as

$$X = -10 I_{C_1} + 0 I_{C_2} + 10 I_{C_3} - 10 I_{C_4} + 0 I_{C_5} + 10 I_{C_6} - 10 I_{C_7} + 0 I_{C_8} + 10 I_{C_9} - I_{C_{10}} \tag{6.28}$$
• Determine the distribution for X, (a) by hand, (b) using MATLAB.

• Determine P(X<0), P(X>0).

p = 0.1*ones(1,10);
c = [-10 0 10 -10 0 10 -10 0 10 -1];
[X,PX] = csort(c,p);
disp([X;PX]')
-10.0000    0.3000
-1.0000    0.1000
0    0.3000
10.0000    0.3000
Pneg = (X<0)*PX'
Pneg =  0.4000
Ppos = (X>0)*PX'
Ppos =  0.3000


A store has eight items for sale. The prices are \$3.50, \$5.00, \$3.50, \$7.50, \$5.00, \$5.00, \$3.50, and \$7.50, respectively. A customer comes in. She purchases one of the items with probabilities 0.10, 0.15, 0.15, 0.20, 0.10, 0.05, 0.10, 0.15. The random variable expressing the amount of her purchase may be written

$$X = 3.5 I_{C_1} + 5.0 I_{C_2} + 3.5 I_{C_3} + 7.5 I_{C_4} + 5.0 I_{C_5} + 5.0 I_{C_6} + 3.5 I_{C_7} + 7.5 I_{C_8} \tag{6.29}$$

Determine the distribution for X (a) by hand, (b) using MATLAB.

p = 0.01*[10 15 15 20 10  5 10 15];
c = [3.5 5 3.5 7.5 5 5 3.5 7.5];
[X,PX] = csort(c,p);
disp([X;PX]')
3.5000    0.3500
5.0000    0.3000
7.5000    0.3500


Suppose X, Y in canonical form are

$$X = 2 I_{A_1} + 3 I_{A_2} + 5 I_{A_3} \qquad Y = I_{B_1} + 2 I_{B_2} + 3 I_{B_3} \tag{6.30}$$

The $P(A_i)$ are 0.3, 0.6, 0.1, respectively, and the $P(B_j)$ are 0.2, 0.6, 0.2. Each pair $\{A_i, B_j\}$ is independent. Consider the random variable Z=X+Y. Then Z=2+1 on $A_1 B_1$, Z=3+3 on $A_2 B_3$, etc. Determine the value of Z on each $A_i B_j$ and determine the corresponding $P(A_i B_j)$. From this, determine the distribution for Z.

A = [2 3 5];
B = [1 2 3];
a = rowcopy(A,3);
b = colcopy(B,3);
Z =a + b               % Possible values of sum Z = X + Y
Z = 3     4     6
4     5     7
5     6     8
PA = [0.3 0.6 0.1];
PB = [0.2 0.6 0.2];
pa= rowcopy(PA,3);
pb = colcopy(PB,3);
P = pa.*pb            % Probabilities for various values
P =  0.0600    0.1200    0.0200
0.1800    0.3600    0.0600
0.0600    0.1200    0.0200
[Z,PZ] = csort(Z,P);
disp([Z;PZ]')         % Distribution for Z = X + Y
3.0000    0.0600
4.0000    0.3000
5.0000    0.4200
6.0000    0.1400
7.0000    0.0600
8.0000    0.0200
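
The same calculation can be sketched with built-in functions only, for readers without rowcopy, colcopy, and csort. This is an illustrative stand-in, not the author's solution: meshgrid lays out the value and probability grids, and unique with accumarray performs the consolidation. The names Zv, Pv, and idx are chosen here for illustration.

```matlab
% Sketch: distribution of Z = X + Y for independent simple random variables
A  = [2 3 5];   PA = [0.3 0.6 0.1];     % values and probabilities for X
B  = [1 2 3];   PB = [0.2 0.6 0.2];     % values and probabilities for Y

[a, b]   = meshgrid(A, B);              % a(i,j) = A(j), b(i,j) = B(i)
[pa, pb] = meshgrid(PA, PB);            % matching grids of probabilities
Zv = a + b;                             % value of Z on each joint event
Pv = pa .* pb;                          % joint probabilities, using independence

[Z, ~, idx] = unique(Zv(:)');           % sorted distinct values of Z
PZ = accumarray(idx(:), Pv(:))';        % add probabilities for repeated values
disp([Z; PZ]')                          % each row: value of Z, then its probability
```

Replacing `a + b` by `a .* b` gives the corresponding grid of values for a product of the two random variables.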


For the random variables in Exercise 7, let W=XY. Determine the value of W on each $A_i B_j$ and determine the distribution of W.

XY = a.*b
XY = 2     3     5               % XY values
4     6    10
6     9    15

[W,PW] = csort(XY,P);
disp([W;PW]')                    % Distribution for W = XY
2.0000    0.0600
3.0000    0.1200
4.0000    0.1800
5.0000    0.0200
6.0000    0.4200
9.0000    0.1200
10.0000    0.0600
15.0000    0.0200


A pair of dice is rolled.

1. Let X be the minimum of the two numbers which turn up. Determine the distribution for X.

2. Let Y be the maximum of the two numbers. Determine the distribution for Y.

3. Let Z be the sum of the two numbers. Determine the distribution for Z.

4. Let W be the absolute value of the difference. Determine its distribution.

t = 1:6;
c = ones(6,6);
[x,y] = meshgrid(t,t)
x =  1     2     3     4     5     6     % x-values in each position
1     2     3     4     5     6
1     2     3     4     5     6
1     2     3     4     5     6
1     2     3     4     5     6
1     2     3     4     5     6
y =  1     1     1     1     1     1     % y-values in each position
2     2     2     2     2     2
3     3     3     3     3     3
4     4     4     4     4     4
5     5     5     5     5     5
6     6     6     6     6     6
m = min(x,y);                         % min in each position
M = max(x,y);                         % max in each position
s = x + y;                            % sum x+y in each position
d = abs(x - y);                       % |x - y| in each position
[X,fX] = csort(m,c)                   % sorts values and counts occurrences
X =   1     2     3     4     5     6
fX = 11     9     7     5     3     1    % PX = fX/36
[Y,fY] = csort(M,c)
Y =   1     2     3     4     5     6
fY =  1     3     5     7     9    11    % PY = fY/36
[Z,fZ] = csort(s,c)
Z =   2     3     4     5     6     7     8     9    10    11    12
fZ =  1     2     3     4     5     6     5     4     3     2     1  %PZ = fZ/36
[W,fW] = csort(d,c)
W =   0     1     2     3     4     5
fW =  6    10     8     6     4     2    % PW = fW/36
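
The counts can also be checked by hand. Counting the ordered pairs among the 36 equally likely outcomes gives the following closed forms, which reproduce the frequencies above:

```latex
\begin{align*}
\#\{X = k\} &= 2(6 - k) + 1 = 13 - 2k, & k &= 1, \dots, 6 \\
\#\{Y = k\} &= 2(k - 1) + 1 = 2k - 1, & k &= 1, \dots, 6 \\
\#\{Z = z\} &= 6 - |z - 7|, & z &= 2, \dots, 12 \\
\#\{W = 0\} &= 6, \qquad \#\{W = w\} = 2(6 - w), & w &= 1, \dots, 5
\end{align*}
```

Dividing each count by 36 gives the corresponding probability.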


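Minterm probabilities p(0) through p(15) for the class $\{A, B, C, D\}$ are, in order,

(6.31) 0.072 0.048 0.018 0.012 0.168 0.112 0.042 0.028 0.062 0.048 0.028 0.010 0.170 0.110 0.040 0.032

Determine the distribution for random variable

(6.32) $X = -5.3 I_A - 2.5 I_B + 2.3 I_C + 4.2 I_D - 3.7$
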
% file npr06_10.m
% Data for Exercise 10.
pm = [ 0.072 0.048 0.018 0.012 0.168 0.112 0.042 0.028 ...
0.062 0.048 0.028 0.010 0.170 0.110 0.040 0.032];
c  = [-5.3 -2.5 2.3 4.2 -3.7];
disp('Minterm probabilities are in pm, coefficients in c')
npr06_10
Minterm probabilities are in pm, coefficients in c
canonic
Enter row vector of coefficients  c
Enter row vector of minterm probabilities  pm
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
XDBN
XDBN =
-11.5000    0.1700
-9.2000    0.0400
-9.0000    0.0620
-7.3000    0.1100
-6.7000    0.0280
-6.2000    0.1680
-5.0000    0.0320
-4.8000    0.0480
-3.9000    0.0420
-3.7000    0.0720
-2.5000    0.0100
-2.0000    0.1120
-1.4000    0.0180
0.3000    0.0280
0.5000    0.0480
2.8000    0.0120


On a Tuesday evening, the Houston Rockets, the Orlando Magic, and the Chicago Bulls all have games (but not with one another). Let A be the event the Rockets win, B be the event the Magic win, and C be the event the Bulls win. Suppose the class $\{A, B, C\}$ is independent, with respective probabilities 0.75, 0.70, 0.80. Ellen's boyfriend is a rabid Rockets fan who does not like the Magic. He wants to bet on the games. She decides to take him up on his bets as follows:

• $10 to 5 on the Rockets, i.e., she loses five if the Rockets win and gains ten if they lose

• $10 to 5 against the Magic

• even $5 to 5 on the Bulls.

Ellen's winning may be expressed as the random variable

(6.33) $X = -5 I_A + 10 I_{A^c} + 10 I_B - 5 I_{B^c} - 5 I_C + 5 I_{C^c} = -15 I_A + 15 I_B - 10 I_C + 10$

Determine the distribution for X. What are the probabilities Ellen loses money, breaks even, or comes out ahead?

P = 0.01*[75 70 80];
c = [-15 15 -10 10];
canonic
Enter row vector of coefficients  c
Enter row vector of minterm probabilities  minprob(P)
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
disp(XDBN)
-15.0000    0.1800
-5.0000    0.0450
0    0.4800
10.0000    0.1200
15.0000    0.1400
25.0000    0.0350
PXneg = (X<0)*PX'
PXneg =  0.2250
PX0 = (X==0)*PX'
PX0 =  0.4800
PXpos = (X>0)*PX'
PXpos =  0.2950


The class $\{A, B, C, D\}$ has minterm probabilities

(6.34)

• Determine whether or not the class is independent.

• The random variable $X = I_A + I_B + I_C + I_D$ counts the number of the events which occur on a trial. Find the distribution for X and determine the probability that two or more occur on a trial. Find the probability that one or three of these occur on a trial.

npr06_12
Minterm probabilities in pm, coefficients in c
a = imintest(pm)
The class is NOT independent
Minterms for which the product rule fails
a = 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
canonic
Enter row vector of coefficients  c
Enter row vector of minterm probabilities  pm
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
XDBN =
0    0.0050
1.0000    0.0430
2.0000    0.2120
3.0000    0.4380
4.0000    0.3020
P2 = (X>=2)*PX'
P2 =  0.9520
P13 = ((X==1)|(X==3))*PX'
P13 =  0.4810


James is expecting three checks in the mail, for $20, $26, and $33. Their arrivals are the events $A, B, C$. Assume the class is independent, with respective probabilities 0.90, 0.75, 0.80. Then

(6.35) $X = 20 I_A + 26 I_B + 33 I_C$

represents the total amount received. Determine the distribution for X. What is the probability he receives at least $50? Less than $30?

c = [20 26 33 0];
P = 0.01*[90 75 80];
canonic
Enter row vector of coefficients  c
Enter row vector of minterm probabilities  minprob(P)
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
disp(XDBN)
0    0.0050
20.0000    0.0450
26.0000    0.0150
33.0000    0.0200
46.0000    0.1350
53.0000    0.1800
59.0000    0.0600
79.0000    0.5400
P50 = (X>=50)*PX'
P50 =  0.7800
P30 = (X <30)*PX'
P30 =  0.0650
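
In the transcript above, minprob(P) supplies the minterm probabilities of an independent class. A sketch of that step with built-in functions, under the assumption that minterms are numbered with A as the leading binary digit, might look as follows. The variable names M and Pr are introduced here for illustration.

```matlab
% Sketch: minterm probabilities by the product rule, then the distribution
% of X = 20 I_A + 26 I_B + 33 I_C (built-ins only).
P = [0.90 0.75 0.80];                  % P(A), P(B), P(C)
c = [20 26 33 0];                      % coefficients, then constant term
n = length(P);
M  = dec2bin(0:2^n-1, n) - '0';        % 0/1 pattern of each minterm
Pr = repmat(P, 2^n, 1);                % P repeated in every row
pm = prod(M.*Pr + (1-M).*(1-Pr), 2);   % product rule on each minterm
x  = M*c(1:n)' + c(n+1);               % value of X on each minterm
[X,~,idx] = unique(x);                 % sorted distinct values (column)
PX = accumarray(idx(:), pm(:));        % probability of each value
disp([X PX])
P50 = (X>=50)'*PX                      % P(X >= 50)
P30 = (X<30)'*PX                       % P(X < 30)
```

Here X and PX are column vectors, so the indicator is transposed rather than PX.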


A gambler places three bets. He puts down two dollars for each bet. He picks up three dollars (his original bet plus one dollar) if he wins the first bet, four dollars if he wins the second bet, and six dollars if he wins the third. His net winning can be represented by the random variable

(6.36) $X = 3 I_A + 4 I_B + 6 I_C - 6$, with $P(A) = 0.5$, $P(B) = 0.4$, $P(C) = 0.3$

Assume the results of the games are independent. Determine the distribution for X.

c = [3 4 6 -6];
P = 0.1*[5 4 3];
canonic
Enter row vector of coefficients  c
Enter row vector of minterm probabilities  minprob(P)
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
disp(XDBN)
-6.0000    0.2100
-3.0000    0.2100
-2.0000    0.1400
0    0.0900
1.0000    0.1400
3.0000    0.0900
4.0000    0.0600
7.0000    0.0600


Henry goes to a hardware store. He considers a power drill at $35, a socket wrench set at $56, a set of screwdrivers at $18, a vise at $24, and a hammer at $8. He decides independently on the purchases of the individual items, with respective probabilities 0.5, 0.6, 0.7, 0.4, 0.9. Let X be the amount of his total purchases. Determine the distribution for X.

c = [35 56 18 24 8 0];
P = 0.1*[5 6 7 4 9];
canonic
Enter row vector of coefficients  c
Enter row vector of minterm probabilities  minprob(P)
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
disp(XDBN)
0    0.0036
8.0000    0.0324
18.0000    0.0084
24.0000    0.0024
26.0000    0.0756
32.0000    0.0216
35.0000    0.0036
42.0000    0.0056
43.0000    0.0324
50.0000    0.0504
53.0000    0.0084
56.0000    0.0054
59.0000    0.0024
61.0000    0.0756
64.0000    0.0486
67.0000    0.0216
74.0000    0.0126
77.0000    0.0056
80.0000    0.0036
82.0000    0.1134
85.0000    0.0504
88.0000    0.0324
91.0000    0.0054
98.0000    0.0084
99.0000    0.0486
106.0000    0.0756
109.0000    0.0126
115.0000    0.0036
117.0000    0.1134
123.0000    0.0324
133.0000    0.0084
141.0000    0.0756


A sequence of trials (not necessarily independent) is performed. Let $E_i$ be the event of success on the $i$th component trial. We associate with each trial a “payoff function” $X_i = a I_{E_i} + b I_{E_i^c}$. Thus, an amount $a$ is earned if there is a success on the trial and an amount $b$ (usually negative) if there is a failure. Let $S_n$ be the number of successes in the $n$ trials and $W$ be the net payoff. Show that $W = (a - b) S_n + bn$.

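(6.37) $X_i = a I_{E_i} + b (1 - I_{E_i}) = b + (a - b) I_{E_i}$
(6.38) $W = \sum_{i=1}^{n} X_i = bn + (a - b) \sum_{i=1}^{n} I_{E_i} = bn + (a - b) S_n$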
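
As a quick numerical illustration of the identity, the net payoff can be computed trial by trial and compared with the closed form. The payoffs a, b and the outcome sequence E below are hypothetical values chosen only for this check.

```matlab
% Sketch: check W = (a - b)*Sn + b*n on one assumed outcome sequence.
a = 3;  b = -2;                        % assumed payoffs for success, failure
E = [1 0 0 1 1 0 1 0];                 % assumed indicator values I_{E_i}
n  = length(E);                        % number of trials
Xi = a*E + b*(1 - E);                  % payoff on each trial
Sn = sum(E);                           % number of successes
W_direct  = sum(Xi)                    % net payoff summed trial by trial
W_formula = (a - b)*Sn + b*n           % net payoff from the identity
```

No independence assumption enters, since the identity holds outcome by outcome.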

A marker is placed at a reference position on a line (taken to be the origin); a coin is tossed repeatedly. If a head turns up, the marker is moved one unit to the right; if a tail turns up, the marker is moved one unit to the left.

1. Show that the position at the end of ten tosses is given by the random variable

(6.39) $X = \sum_{i=1}^{10} I_{E_i} - \sum_{i=1}^{10} I_{E_i^c} = 2 S_{10} - 10$

where Ei is the event of a head on the ith toss and S10 is the number of heads in ten trials.

2. After ten tosses, what are the possible positions and the probabilities of being in each?

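The displacement on the $i$th toss is

(6.40) $X_i = I_{E_i} - I_{E_i^c} = I_{E_i} - (1 - I_{E_i}) = 2 I_{E_i} - 1$

so that

(6.41) $X = \sum_{i=1}^{10} X_i = 2 \sum_{i=1}^{10} I_{E_i} - 10 = 2 S_{10} - 10$
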
S = 0:10;
PS = ibinom(10,0.5,0:10);
X = 2*S - 10;
disp([X;PS]')
-10.0000    0.0010
-8.0000    0.0098
-6.0000    0.0439
-4.0000    0.1172
-2.0000    0.2051
0    0.2461
2.0000    0.2051
4.0000    0.1172
6.0000    0.0439
8.0000    0.0098
10.0000    0.0010
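
The function ibinom is supplied by the text's toolbox. A sketch of the same table using the built-in nchoosek, assuming a fair coin and independent tosses as in the solution above:

```matlab
% Sketch: positions after n tosses and their probabilities, built-ins only.
n = 10;  p = 0.5;                      % number of tosses, P(head)
S  = 0:n;                              % possible numbers of heads
C  = arrayfun(@(k) nchoosek(n,k), S);  % binomial coefficients C(n,k)
PS = C .* p.^S .* (1-p).^(n-S);        % P(S10 = k)
X  = 2*S - n;                          % position of the marker
disp([X;PS]')
```

Only even positions from -10 to 10 are possible, since each head in place of a tail shifts the final position by two units.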


Margaret considers five purchases in the amounts 5, 17, 21, 8, 15 dollars with respective probabilities 0.37, 0.22, 0.38, 0.81, 0.63. Anne contemplates six purchases in the amounts 8, 15, 12, 18, 15, 12 dollars, with respective probabilities 0.77, 0.52, 0.23, 0.41, 0.83, 0.58. Assume that all eleven possible purchases form an independent class.

1. Determine the distribution for X, the amount purchased by Margaret.

2. Determine the distribution for Y, the amount purchased by Anne.

3. Determine the distribution for Z=X+Y, the total amount the two purchase.

Suggestion for part 3. Let MATLAB perform the calculations.

[r,s] = ndgrid(X,Y);
[t,u] = ndgrid(PX,PY);
z = r + s;
pz = t.*u;
[Z,PZ] = csort(z,pz);

% file npr06_18.m
cx = [5 17 21 8 15 0];
cy = [8 15 12 18 15 12 0];
pmx = minprob(0.01*[37 22 38 81 63]);
pmy = minprob(0.01*[77 52 23 41 83 58]);
npr06_18
[X,PX] = canonicf(cx,pmx);  [Y,PY] = canonicf(cy,pmy);
[r,s] = ndgrid(X,Y);   [t,u] = ndgrid(PX,PY);
z = r + s;   pz = t.*u;
[Z,PZ] = csort(z,pz);
a = length(Z)
a  =  125              % 125 different values
plot(Z,cumsum(PZ))  % See figure; plotting details omitted

Solutions