# Chapter 7. Distribution and Density Functions

## 7.1. Distribution and Density Functions

### Introduction

In the unit on Random Variables and Probability we introduce real random variables as mappings from the basic space Ω to the real line. The mapping induces a transfer of the probability mass on the basic space to subsets of the real line in such a way that the probability that X takes a value in a set M is exactly the mass assigned to that set by the transfer. To perform probability calculations, we need to describe analytically the distribution on the line. For simple random variables this is easy. We have at each possible value of X a point mass equal to the probability X takes that value. For more general cases, we need a more useful description than that provided by the induced probability measure PX.

### The distribution function

In the theoretical discussion on Random Variables and Probability, we note that the probability distribution induced by a random variable X is determined uniquely by a consistent assignment of mass to semi-infinite intervals of the form (–∞,t] for each real t. This suggests that a natural description is provided by the following.

Definition

The distribution function FX for random variable X is given by

$$F_X(t) = P(X \le t) = P\bigl(X \in (-\infty, t]\bigr) \quad \forall\, t \in \mathbb{R} \tag{7.1}$$

In terms of the mass distribution on the line, this is the probability mass at or to the left of the point t. As a consequence, FX has the following properties:

(F1) : FX must be a nondecreasing function, for if t>s there must be at least as much probability mass at or to the left of t as there is for s.

(F2) : FX is continuous from the right, with a jump in the amount p0 at t0 iff P(X=t0)=p0. If the point t approaches t0 from the left, the interval (–∞,t] does not include the probability mass at t0 until t reaches that value, at which point the amount at or to the left of t increases ("jumps") by amount p0; on the other hand, if t approaches t0 from the right, the interval includes the mass p0 all the way to and including t0, but drops immediately as t moves to the left of t0.

(F3) : Except in very unusual cases involving random variables which may take “infinite” values, the probability mass included in (–∞,t] must increase to one as t moves to the right; as t moves to the left, the probability mass included must decrease to zero, so that

$$F_X(-\infty) = \lim_{t \to -\infty} F_X(t) = 0 \quad \text{and} \quad F_X(\infty) = \lim_{t \to \infty} F_X(t) = 1 \tag{7.2}$$

A distribution function determines the probability mass in each semi-infinite interval (–∞,t]. According to the discussion referred to above, this determines uniquely the induced distribution.

The distribution function FX for a simple random variable is easily visualized. The distribution consists of point mass pi at each point ti in the range. To the left of the smallest value in the range, FX(t)=0; as t increases to the smallest value t1, FX(t) remains constant at zero until it jumps by the amount p1. FX(t) remains constant at p1 until t increases to t2, where it jumps by an amount p2 to the value p1+p2. This continues until the value of FX(t) reaches 1 at the largest value tn. The graph of FX is thus a step function, continuous from the right, with a jump in the amount pi at the corresponding point ti in the range. A similar situation exists for a discrete-valued random variable which may take on an infinity of values (e.g., the geometric distribution or the Poisson distribution considered below). In this case, there is always some probability at points to the right of any ti, but this must become vanishingly small as t increases, since the total probability mass is one.
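
The step-function description can be sketched in a few lines of Python. This is an independent illustration, not the m-procedure ddbn; the function name `make_distribution_function` and the sample distribution are assumptions made only for this sketch.

```python
from bisect import bisect_right
from itertools import accumulate

def make_distribution_function(values, probs):
    """Return F with F(t) = P(X <= t) for a simple random variable.

    values: the distinct possible values t_i; probs: the matching masses p_i.
    """
    pairs = sorted(zip(values, probs))
    ts = [t for t, _ in pairs]
    cum = list(accumulate(p for _, p in pairs))  # p1, p1+p2, ...

    def F(t):
        # Count the values t_i <= t; bisect_right makes F continuous from the right.
        k = bisect_right(ts, t)
        return cum[k - 1] if k > 0 else 0.0

    return F

# Hypothetical distribution, chosen only for illustration.
F = make_distribution_function([1, 2, 3, 5], [0.2, 0.3, 0.4, 0.1])
print(F(0.5), F(1), F(2.9), F(3), F(10))
```

The value at a jump point already includes the mass at that point, which is exactly the right-continuity stated in (F2).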

The procedure ddbn may be used to plot the distribution function for a simple random variable from a matrix X of values and a corresponding matrix PX of probabilities.

Example 7.1 Graph of FX for a simple random variable
>> c = [10 18 10 3];             % Distribution for X in Example 6.5.1
>> pm = minprob(0.1*[6 3 5]);
>> canonic
Enter row vector of coefficients  c
Enter row vector of minterm probabilities  pm
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
>> ddbn                          % Circles show values at jumps
Enter row matrix of VALUES  X
Enter row matrix of PROBABILITIES  PX
%  Printing details   See Figure 7.1


### Description of some common discrete distributions

We make repeated use of a number of common distributions which are used in many practical situations. This collection includes several distributions which are studied in the chapter "Random Variables and Probabilities".

1. Indicator function. X=IE, with P(X=1)=P(E)=p and P(X=0)=q=1–p. The distribution function has a jump in the amount q at t=0 and an additional jump of p to the value 1 at t=1.

2. Simple random variable (canonical form)

$$X = \sum_{i=1}^{n} t_i I_{E_i}, \qquad P(X = t_i) = P(E_i) = p_i \tag{7.3}$$

The distribution function is a step function, continuous from the right, with a jump of pi at t=ti (see Figure 7.1 for Example 7.1).

3. Binomial (n,p). This random variable appears as the number of successes in a sequence of n Bernoulli trials with probability p of success. In its simplest form

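$$X = \sum_{i=1}^{n} I_{E_i} \quad \text{with } \{E_i : 1 \le i \le n\} \text{ independent} \tag{7.4}$$
$$P(E_i) = p, \qquad P(X = k) = \binom{n}{k} p^k q^{n-k}, \quad 0 \le k \le n \tag{7.5}$$

As pointed out in the study of Bernoulli sequences in the unit on Composite Trials, two m-functions ibinom and cbinom are available for computing the individual and cumulative binomial probabilities.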

4. Geometric (p). There are two related distributions, both arising in the study of continuing Bernoulli sequences. The first counts the number of failures before the first success. This is sometimes called the “waiting time.” The event {X=k} consists of a sequence of k failures, then a success. Thus

$$P(X = k) = q^k p, \quad 0 \le k \tag{7.6}$$

The second designates the component trial on which the first success occurs. The event {Y=k} consists of k–1 failures, then a success on the kth component trial. We have

$$P(Y = k) = q^{k-1} p, \quad 1 \le k \tag{7.7}$$

We say X has the geometric distribution with parameter (p), which we often designate by X ∼ geometric (p). Now Y=X+1 or Y–1=X. For this reason, it is customary to refer to the distribution for the number of the trial for the first success by saying Y–1 ∼ geometric (p). The probability of k or more failures before the first success is $P(X \ge k) = q^k$. Also

$$P(X \ge n+k \mid X \ge n) = \frac{P(X \ge n+k)}{P(X \ge n)} = \frac{q^{n+k}}{q^n} = q^k = P(X \ge k) \tag{7.8}$$

This suggests that a Bernoulli sequence essentially "starts over" on each trial. If it has failed n times, the probability of failing an additional k or more times before the next success is the same as the initial probability of failing k or more times before the first success.

Example 7.2 The geometric distribution

A statistician is taking a random sample from a population in which two percent of the members own a BMW automobile. She takes a sample of size 100. What is the probability of finding no BMW owners in the sample?

SOLUTION

The sampling process may be viewed as a sequence of Bernoulli trials with probability p=0.02 of success. The probability of 100 or more failures before the first success is $0.98^{100} = 0.1326$, or about 1/7.5.
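
As a quick numerical check, the following Python sketch evaluates the tail probability used in the solution and verifies the "starts over" property for one arbitrary pair of values. The helper name `tail` and the values of n and k are illustrative assumptions, not part of the text.

```python
p = 0.02
q = 1 - p

def tail(k):
    """P(X >= k) = q**k for X ~ geometric(p), X = failures before the first success."""
    return q ** k

no_owner_in_100 = tail(100)
print(round(no_owner_in_100, 4))  # 0.1326

# "Starts over" property: P(X >= n+k | X >= n) equals P(X >= k).
n, k = 30, 12                     # arbitrary illustrative values
conditional = tail(n + k) / tail(n)
print(abs(conditional - tail(k)) < 1e-12)
```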

5. Negative binomial (m,p). X is the number of failures before the mth success. It is generally more convenient to work with Y=X+m, the number of the trial on which the mth success occurs. An examination of the possible patterns and elementary combinatorics show that

$$P(Y = k) = \binom{k-1}{m-1} p^m q^{k-m}, \quad m \le k \tag{7.9}$$

There are m–1 successes in the first k–1 trials, then a success. Each combination has probability $p^m q^{k-m}$. We have an m-function nbinom to calculate these probabilities.

Example 7.3 A game of chance

A player throws a single six-sided die repeatedly. He scores if he throws a 1 or a 6. What is the probability he scores five times in ten or fewer throws?

>> p = sum(nbinom(5,1/3,5:10))
p  =  0.2131


An alternate solution is possible with the use of the binomial distribution. The mth success comes not later than the kth trial iff the number of successes in k trials is greater than or equal to m.

>> P = cbinom(10,1/3,5)
P  =  0.2131
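
The equivalence of the two solutions can also be checked outside MATLAB. The Python sketch below is a stand-alone illustration; `nbinom_pmf` and `binom_tail` are names assumed here and are not the m-functions nbinom and cbinom.

```python
from math import comb

def nbinom_pmf(m, p, k):
    """P(Y = k): the m-th success occurs on trial k (k >= m)."""
    return comb(k - 1, m - 1) * p**m * (1 - p)**(k - m)

def binom_tail(n, p, m):
    """P(at least m successes in n Bernoulli trials)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(m, n + 1))

p = 1 / 3
via_negative_binomial = sum(nbinom_pmf(5, p, k) for k in range(5, 11))
via_binomial = binom_tail(10, p, 5)
print(round(via_negative_binomial, 4), round(via_binomial, 4))  # 0.2131 0.2131
```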


6. Poisson (μ). This distribution is assumed in a wide variety of applications. It appears as a counting variable for items arriving with exponential interarrival times (see the relationship to the gamma distribution below). For large n and small p (which may not be a value found in a table), the binomial distribution is approximately Poisson (np). Use of the generating function (see Transform Methods) shows the sum of independent Poisson random variables is Poisson. The Poisson distribution is integer valued, with

$$P(X = k) = e^{-\mu} \frac{\mu^k}{k!}, \quad 0 \le k \tag{7.10}$$

Although Poisson probabilities are usually easier to calculate with scientific calculators than binomial probabilities, the use of tables is often quite helpful. As in the case of the binomial distribution, we have two m-functions for calculating Poisson probabilities. These have advantages of speed and parameter range similar to those for ibinom and cbinom.

P(X=k) is calculated by P = ipoisson(mu,k), where k is a row or column vector of integers and the result P is a row matrix of the probabilities.

P(X≥k) is calculated by P = cpoisson(mu,k), where k is a row or column vector of integers and the result P is a row matrix of the probabilities.

Example 7.4 Poisson counting random variable

The number of messages arriving in a one minute period at a communications network junction is a random variable N ∼ Poisson (130). What is the probability the number of arrivals is greater than or equal to 110, 120, 130, 140, 150, 160?

>> p = cpoisson(130,110:10:160)
p  =  0.9666  0.8209  0.5117  0.2011  0.0461  0.0060
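
For readers without the m-functions, the same tail probabilities can be approximated with a short Python sketch. The names `poisson_pmf` and `poisson_tail` are assumptions for this illustration; the individual terms are computed in log space because μ^k and k! are both very large for μ=130.

```python
from math import exp, lgamma, log

def poisson_pmf(mu, k):
    """P(X = k), computed in log space to avoid overflow for large mu."""
    return exp(k * log(mu) - mu - lgamma(k + 1))

def poisson_tail(mu, k):
    """P(X >= k) = 1 - sum of P(X = j) for j < k."""
    return 1.0 - sum(poisson_pmf(mu, j) for j in range(k))

tails = [poisson_tail(130, k) for k in range(110, 161, 10)]
print([round(t, 4) for t in tails])
```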


The descriptions of these distributions, along with a number of other facts, are summarized in the table DATA ON SOME COMMON DISTRIBUTIONS in Appendix C.

### The density function

If the probability mass in the induced distribution is spread smoothly along the real line, with no point mass concentrations, there is a probability density function fX which satisfies

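$$P(X \in M) = P_X(M) = \int_M f_X(t)\, dt \quad \text{(area under the graph of } f_X \text{ over } M\text{)} \tag{7.11}$$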

At each t, fX(t) is the mass per unit length in the probability distribution. The density function has three characteristic properties:

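$$\text{(f1)}\ f_X \ge 0 \qquad \text{(f2)}\ \int_{\mathbb{R}} f_X = 1 \qquad \text{(f3)}\ F_X(t) = \int_{-\infty}^{t} f_X \tag{7.12}$$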

A random variable (or distribution) which has a density is called absolutely continuous. This term comes from measure theory. We often simply abbreviate as continuous distribution.

Remarks
1. There is a technical mathematical description of the condition “spread smoothly with no point mass concentrations.” And strictly speaking the integrals are Lebesgue integrals rather than the ordinary Riemann kind. But for practical cases, the two agree, so that we are free to use ordinary integration techniques.

2. By the fundamental theorem of calculus

$$f_X(t) = F_X'(t) \quad \text{at every point of continuity of } f_X \tag{7.13}$$

3. Any integrable, nonnegative function f with ∫f=1 determines a distribution function F, which in turn determines a probability distribution. If ∫f≠1, multiplication by the appropriate positive constant gives a suitable f. An argument based on the Quantile Function shows the existence of a random variable with that distribution.

4. In the literature on probability, it is customary to omit the indication of the region of integration when integrating over the whole line. Thus

$$\int f_X(t)\, dt = \int_{-\infty}^{\infty} f_X(t)\, dt \tag{7.14}$$

The first expression is not an indefinite integral. In many situations, fX will be zero outside an interval. Thus, the integrand effectively determines the region of integration.

### Some common absolutely continuous distributions

1. Uniform (a,b).
Mass is spread uniformly on the interval [a,b]. It is immaterial whether or not the end points are included, since probability associated with each individual point is zero. The probability of any subinterval is proportional to the length of the subinterval. The probability of being in any two subintervals of the same length is the same. This distribution is used to model situations in which it is known that X takes on values in [a,b] but is equally likely to be in any subinterval of a given length. The density must be constant over the interval (zero outside), and the distribution function increases linearly with t in the interval. Thus,

$$f_X(t) = \frac{1}{b-a}, \quad a < t < b \ \text{(zero elsewhere)}, \qquad F_X(t) = \frac{t-a}{b-a}, \quad a \le t \le b \tag{7.15}$$

The graph of FX rises linearly, with slope 1/(b–a), from zero at t=a to one at t=b.

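2. Symmetric triangular (–a,a). The density is $f_X(t) = (a+t)/a^2$ for $-a \le t < 0$ and $f_X(t) = (a-t)/a^2$ for $0 \le t \le a$ (zero elsewhere).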
This distribution is used frequently in instructional numerical examples because probabilities can be obtained geometrically. It can be shifted, with a shift of the graph, to different sets of values. It appears naturally (in shifted form) as the distribution for the sum or difference of two independent random variables uniformly distributed on intervals of the same length. This fact is established with the use of the moment generating function (see Transform Methods). More generally, the density may have a triangular graph which is not symmetric.

Example 7.5 Use of a triangular distribution

Suppose X ∼ symmetric triangular (100,300). Determine P(120<X≤250).

Remark. Note that in the continuous case, it is immaterial whether the end points of the intervals are included or not.

SOLUTION

To get the area under the triangle between 120 and 250, we take one minus the area of the right triangles between 100 and 120 and between 250 and 300. Using the fact that areas of similar triangles are proportional to the square of any side, we have

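$$P = 1 - \frac{1}{2}\left[\left(\frac{20}{100}\right)^2 + \left(\frac{50}{100}\right)^2\right] = 1 - 0.145 = 0.855 \tag{7.16}$$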
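
The geometric argument in the solution can be written as a small Python function. This is a sketch under the assumption, taken from the numbers in the solution, that the distribution is symmetric triangular on the interval from 100 to 300; the name `tri_cdf` is hypothetical.

```python
def tri_cdf(t, a, b):
    """Distribution function for the symmetric triangular distribution on (a, b)."""
    m = (a + b) / 2          # location of the peak of the density
    h = (b - a) / 2          # half-width of the base
    if t <= a:
        return 0.0
    if t >= b:
        return 1.0
    if t <= m:
        return 0.5 * ((t - a) / h) ** 2      # area of the small left triangle
    return 1.0 - 0.5 * ((b - t) / h) ** 2    # one minus the small right triangle

P = tri_cdf(250, 100, 300) - tri_cdf(120, 100, 300)
print(round(P, 4))  # 0.855
```

Each half of the density is a triangle of area 1/2, so a similar triangle whose side is a fraction r of the half-width has area r²/2, which is what the two branches compute.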

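3. Exponential (λ). $f_X(t) = \lambda e^{-\lambda t}$, $t \ge 0$ (zero elsewhere).
Integration shows $F_X(t) = 1 - e^{-\lambda t}$, $t \ge 0$ (zero elsewhere). We note that $P(X > t) = 1 - F_X(t) = e^{-\lambda t}$, $t \ge 0$. This leads to an extremely important property of the exponential distribution. Since X>t+h, h>0, implies X>t, we have

$$P(X > t+h \mid X > t) = \frac{P(X > t+h)}{P(X > t)} = \frac{e^{-\lambda(t+h)}}{e^{-\lambda t}} = e^{-\lambda h} = P(X > h) \tag{7.17}$$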

Because of this property, the exponential distribution is often used in reliability problems. Suppose X represents the time to failure (i.e., the life duration) of a device put into service at t=0. If the distribution is exponential, this property says that if the device survives to time t (i.e., X>t) then the (conditional) probability it will survive h more units of time is the same as the original probability of surviving for h units of time. Many devices have the property that they do not wear out. Failure is due to some stress of external origin. Many solid state electronic devices behave essentially in this way, once initial “burn in” tests have removed defective units. Use of Cauchy's equation (Appendix B) shows that the exponential distribution is the only continuous distribution with this property.

4. Gamma distribution (α,λ). $f_X(t) = \dfrac{\lambda^{\alpha} t^{\alpha-1} e^{-\lambda t}}{\Gamma(\alpha)}$, $t \ge 0$ (zero elsewhere).
We have an m-function gammadbn to determine values of the distribution function for X ∼ gamma (α,λ). Use of moment generating functions shows that for α=n, a random variable X ∼ gamma (n,λ) has the same distribution as the sum of n independent random variables, each exponential (λ). A relation to the Poisson distribution is described in Sec 7.5.

Example 7.6 An arrival problem

On a Saturday night, the times (in hours) between arrivals in a hospital emergency unit may be represented by a random quantity which is exponential (λ=3). As we show in the chapter Mathematical Expectation, this means that the average interarrival time is 1/3 hour or 20 minutes. What is the probability of ten or more arrivals in four hours? In six hours?

SOLUTION

The time for ten arrivals is the sum of ten interarrival times. If we suppose these are independent, as is usually the case, then the time for ten arrivals is gamma (10,3).

>> p = gammadbn(10,3,[4 6])
p  =  0.7576    0.9846


5. Normal, or Gaussian (μ,σ²). $f_X(t) = \dfrac{1}{\sigma\sqrt{2\pi}} \exp\left(-\dfrac{1}{2}\left(\dfrac{t-\mu}{\sigma}\right)^2\right)$ for all t.
We generally indicate that a random variable X has the normal or gaussian distribution by writing X ∼ N(μ,σ²), putting in the actual values for the parameters. The gaussian distribution plays a central role in many aspects of applied probability theory, particularly in the area of statistics. Much of its importance comes from the central limit theorem (CLT), which is a term applied to a number of theorems in analysis. Essentially, the CLT shows that the distribution for the sum of a sufficiently large number of independent random variables has approximately the gaussian distribution. Thus, the gaussian distribution appears naturally in such topics as theory of errors or theory of noise, where the quantity observed is an additive combination of a large number of essentially independent quantities. Examination of the expression shows that the graph for fX(t) is symmetric about its maximum at t=μ. The greater the parameter σ², the smaller the maximum value and the more slowly the curve decreases with distance from μ. Thus parameter μ locates the center of the mass distribution and σ² is a measure of the spread of mass about μ. The parameter μ is called the mean value and σ² is the variance. The parameter σ, the positive square root of the variance, is called the standard deviation. While we have an explicit formula for the density function, it is known that the distribution function, as the integral of the density function, cannot be expressed in terms of elementary functions. The usual procedure is to use tables obtained by numerical integration.
Since there are two parameters, this raises the question whether a separate table is needed for each pair of parameters. It is a remarkable fact that this is not the case. We need only have a table of the distribution function for X ∼ N(0,1). This is referred to as the standardized normal distribution. We use φ and Φ for the standardized normal density and distribution functions, respectively.
Standardized normal: $\varphi(t) = \dfrac{1}{\sqrt{2\pi}} e^{-t^2/2}$, so that the distribution function is $\Phi(t) = \displaystyle\int_{-\infty}^{t} \varphi(u)\, du$.
The graph of the density function is the well known bell shaped curve, symmetrical about the origin (see Figure 7.4). The symmetry about the origin contributes to its usefulness.
(7.18)

Note that the area to the left of t=–1.5 is the same as the area to the right of t=1.5, so that Φ(–1.5)=1–Φ(1.5). The same is true for any t, so that we have
$$\Phi(-t) = 1 - \Phi(t) \quad \forall\, t \tag{7.19}$$

This indicates that we need only a table of values of Φ(t) for t>0 to be able to determine Φ(t) for any t. We may use the symmetry for any case. Note that Φ(0)=1/2.

Example 7.7 Standardized normal calculations

Suppose X ∼ N(0,1). Determine P(–1≤X≤2) and P(|X|>1).

SOLUTION

1. P(–1≤X≤2)=Φ(2)–Φ(–1)=Φ(2)–[1–Φ(1)]=Φ(2)+Φ(1)–1

2. P(|X|>1)=P(X>1)+P(X<–1)=1–Φ(1)+Φ(–1)=2[1–Φ(1)]

From a table of standardized normal distribution function (see Appendix D), we find

Φ(2)=0.9772 and Φ(1)=0.8413 which gives P(–1≤X≤2)=0.8185 and P(|X|>1)=0.3174

General gaussian distribution
For X ∼ N(μ,σ²), the density maintains the bell shape, but is shifted with different spread and height. Figure 7.5 shows the distribution function and density function for X ∼ N(2,0.1). The density is centered about t=2. It has height 1.2616 as compared with 0.3989 for the standardized normal density. Inspection shows that the graph is narrower than that for the standardized normal. The distribution function reaches 0.5 at the mean value 2.

A change of variables in the integral shows that the table for standardized normal distribution function can be used for any case.

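$$F_X(t) = \int_{-\infty}^{t} \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right) dx = \int_{-\infty}^{t} \varphi\left(\frac{x-\mu}{\sigma}\right) \frac{1}{\sigma}\, dx \tag{7.20}$$

Make the change of variable and corresponding formal changes

$$u = \frac{x-\mu}{\sigma}, \qquad du = \frac{1}{\sigma}\, dx, \qquad x = t \ \text{corresponds to} \ u = \frac{t-\mu}{\sigma} \tag{7.21}$$

to get

$$F_X(t) = \int_{-\infty}^{(t-\mu)/\sigma} \varphi(u)\, du = \Phi\left(\frac{t-\mu}{\sigma}\right) \tag{7.22}$$

Example 7.8 General gaussian calculation

Suppose X ∼ N(3,16) (i.e., μ=3 and σ²=16). Determine P(–1≤X≤11) and P(|X–3|>4).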

SOLUTION

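In each case the problem reduces to that in Example 7.7, since σ=4:

P(–1≤X≤11)=Φ((11–3)/4)–Φ((–1–3)/4)=Φ(2)–Φ(–1)=0.8185

P(|X–3|>4)=P(|X–3|/4>1)=2[1–Φ(1)]=0.3174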

We have m-functions gaussian and gaussdensity to calculate values of the distribution and density function for any reasonable value of the parameters.
The following are solutions of Example 7.7 and Example 7.8, using the m-function gaussian.
Example 7.9 Example 7.7 and Example 7.8 (continued)
>> P1 = gaussian(0,1,2) - gaussian(0,1,-1)
P1 =  0.8186
>> P2 = 2*(1 - gaussian(0,1,1))
P2 =  0.3173
>> P1 = gaussian(3,16,11) - gaussian(3,16,-1)
P1 =  0.8186
>> P2 = gaussian(3,16,-1) + 1 - gaussian(3,16,7)
P2 =  0.3173


The differences in these results and those above (which used tables) are due to the roundoff to four places in the tables.
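
The same calculations can be reproduced without tables or the m-functions by using the error function, since Φ(t) = [1 + erf(t/√2)]/2. The sketch below is an independent Python illustration; `gaussian_cdf` is a name assumed here, with the argument order (mean, variance, t) chosen to mirror the calls above.

```python
from math import erf, sqrt

def gaussian_cdf(mu, var, t):
    """P(X <= t) for X ~ N(mu, var); the second argument is the variance."""
    return 0.5 * (1.0 + erf((t - mu) / sqrt(2.0 * var)))

# Example 7.7: X ~ N(0, 1)
P1 = gaussian_cdf(0, 1, 2) - gaussian_cdf(0, 1, -1)
P2 = 2 * (1 - gaussian_cdf(0, 1, 1))
# Example 7.8: X ~ N(3, 16)
Q1 = gaussian_cdf(3, 16, 11) - gaussian_cdf(3, 16, -1)
Q2 = gaussian_cdf(3, 16, -1) + 1 - gaussian_cdf(3, 16, 7)
print(round(P1, 4), round(P2, 4), round(Q1, 4), round(Q2, 4))
```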

6. Beta (r,s), r>0, s>0. $f_X(t) = \dfrac{\Gamma(r+s)}{\Gamma(r)\Gamma(s)}\, t^{r-1}(1-t)^{s-1}$, $0 < t < 1$ (zero elsewhere).
Analysis is based on the integrals

$$\int_0^1 u^{r-1}(1-u)^{s-1}\, du = \frac{\Gamma(r)\Gamma(s)}{\Gamma(r+s)}, \quad r > 0,\ s > 0 \tag{7.23}$$

Figure 7.6 and Figure 7.7 show graphs of the densities for various values of r, s. The usefulness comes in approximating densities on the unit interval. By using scaling and shifting, these can be extended to other intervals. The special case r=s=1 gives the uniform distribution on the unit interval. The Beta distribution is quite useful in developing the Bayesian statistics for the problem of sampling to determine a population proportion. If r, s are integers, the density function is a polynomial. For the general case we have two m-functions, beta and betadbn, to perform the calculations.

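7. Weibull (α,λ,ν). $F_X(t) = 1 - e^{-\lambda (t-\nu)^{\alpha}}$, $\alpha > 0$, $\lambda > 0$, $\nu \ge 0$, $t \ge \nu$.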
The parameter ν is a shift parameter. Usually we assume ν=0. Examination shows that for α=1 the distribution is exponential (λ). The parameter α provides a distortion of the time scale for the exponential distribution. Figure 7.6 and Figure 7.7 show graphs of the Weibull density for some representative values of α and λ (ν=0). The distribution is used in reliability theory. We do not make much use of it. However, we have m-functions weibull (density) and weibulld (distribution function) for shift parameter ν=0 only. The shift can be obtained by subtracting a constant from the t values.

## 7.2. Distribution Approximations

### Binomial, Poisson, gamma, and Gaussian distributions

The Poisson approximation to the binomial distribution

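The following approximation is a classical one. We wish to show that for small p and sufficiently large n

$$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k} \approx e^{-np} \frac{(np)^k}{k!} \tag{7.24}$$

Suppose p=μ/n with n large and μ/n<1. Then,

$$P(X = k) = \binom{n}{k} \left(\frac{\mu}{n}\right)^k \left(1 - \frac{\mu}{n}\right)^{n-k} = \frac{n(n-1)\cdots(n-k+1)}{n^k} \left(1 - \frac{\mu}{n}\right)^{-k} \left(1 - \frac{\mu}{n}\right)^{n} \frac{\mu^k}{k!} \tag{7.25}$$

The first factor in the last expression is the ratio of polynomials in n of the same degree k, which must approach one as n becomes large. The second factor approaches one as n becomes large. According to a well known property of the exponential

$$\left(1 - \frac{\mu}{n}\right)^{n} \to e^{-\mu} \quad \text{as } n \to \infty \tag{7.26}$$

The result is that for large n, $P(X = k) \approx e^{-\mu} \dfrac{\mu^k}{k!}$, where μ=np.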
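
A short numerical experiment makes the limit concrete. The Python sketch below holds μ=np fixed and lets n grow, recording the largest pointwise difference between the binomial and Poisson probabilities over k=0,...,10. The value μ=3 and the range of k are illustrative assumptions, not taken from the text.

```python
from math import comb, exp, factorial

def binom_pmf(n, p, k):
    """P(X = k) for X ~ binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(mu, k):
    """P(X = k) for X ~ Poisson(mu)."""
    return exp(-mu) * mu**k / factorial(k)

mu = 3.0                      # illustrative value
errors = []
for n in (10, 100, 1000, 10000):
    p = mu / n                # keep np = mu fixed while n grows
    worst = max(abs(binom_pmf(n, p, k) - poisson_pmf(mu, k)) for k in range(11))
    errors.append(worst)
    print(n, f"{worst:.2e}")
```

The printed differences shrink roughly in proportion to p=μ/n, in line with the requirement of small p and large n.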

The Poisson and gamma distributions

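Suppose Y ∼ Poisson (λt). Now X ∼ gamma (α,λ) iff

$$P(X \le t) = \frac{\lambda^{\alpha}}{\Gamma(\alpha)} \int_0^t x^{\alpha-1} e^{-\lambda x}\, dx = \frac{1}{\Gamma(\alpha)} \int_0^t (\lambda x)^{\alpha-1} e^{-\lambda x}\, d(\lambda x) \tag{7.27}$$
$$= \frac{1}{\Gamma(\alpha)} \int_0^{\lambda t} u^{\alpha-1} e^{-u}\, du \tag{7.28}$$

A well known definite integral, obtained by integration by parts, is

$$\int_a^{\infty} t^{n-1} e^{-t}\, dt = \Gamma(n)\, e^{-a} \sum_{k=0}^{n-1} \frac{a^k}{k!} \quad \text{with } \Gamma(n) = (n-1)! \tag{7.29}$$

Noting that $1 = e^{-a} e^{a} = e^{-a} \sum_{k=0}^{\infty} \dfrac{a^k}{k!}$, we find after some simple algebra that

$$\frac{1}{\Gamma(n)} \int_0^{a} t^{n-1} e^{-t}\, dt = e^{-a} \sum_{k=n}^{\infty} \frac{a^k}{k!} \tag{7.30}$$

For a=λt and α=n, we have the following equality iff X ∼ gamma (n,λ).

$$P(X \le t) = \frac{1}{\Gamma(n)} \int_0^{\lambda t} u^{n-1} e^{-u}\, du = e^{-\lambda t} \sum_{k=n}^{\infty} \frac{(\lambda t)^k}{k!} \tag{7.31}$$

Now

$$P(Y \ge n) = e^{-\lambda t} \sum_{k=n}^{\infty} \frac{(\lambda t)^k}{k!} \quad \text{iff } Y \sim \text{Poisson } (\lambda t) \tag{7.32}$$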
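
The relationship can be checked numerically with the parameters of Example 7.6 (ten arrivals, λ=3, t=4 and t=6 hours). The Python sketch below evaluates the gamma distribution function in two ways: as a Poisson tail, and by direct midpoint-rule integration of the gamma density. The function names and the number of integration steps are assumptions made for this illustration.

```python
from math import exp, factorial

def gamma_cdf_via_poisson(n, lam, t):
    """P(X <= t) for X ~ gamma(n, lam), n a positive integer, as P(Y >= n), Y ~ Poisson(lam*t)."""
    a = lam * t
    return 1.0 - exp(-a) * sum(a**k / factorial(k) for k in range(n))

def gamma_cdf_by_integration(n, lam, t, steps=20000):
    """Midpoint-rule integral of the gamma(n, lam) density over [0, t]."""
    dx = t / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += lam**n * x**(n - 1) * exp(-lam * x) / factorial(n - 1)
    return total * dx

results = [(t, gamma_cdf_via_poisson(10, 3, t), gamma_cdf_by_integration(10, 3, t))
           for t in (4, 6)]
for t, by_sum, by_integral in results:
    print(t, round(by_sum, 4), round(by_integral, 4))
```

Both columns should agree with the values 0.7576 and 0.9846 reported by gammadbn in Example 7.6.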

The gaussian (normal) approximation

The central limit theorem, referred to in the discussion of the gaussian or normal distribution above, suggests that the binomial and Poisson distributions should be approximated by the gaussian. The number of successes in n trials has the binomial (n,p) distribution. This random variable may be expressed

$$X = \sum_{i=1}^{n} I_{E_i} \quad \text{where the } I_{E_i} \text{ constitute an independent class} \tag{7.33}$$

Since the mean value of X is np and the variance is npq, the distribution should be approximately N(np,npq).

Use of the generating function shows that the sum of independent Poisson random variables is Poisson. Now if X ∼ Poisson (μ), then X may be considered the sum of n independent random variables, each Poisson (μ/n). Since the mean value and the variance are both μ, it is reasonable to suppose that X is approximately N(μ,μ).

It is generally best to compare distribution functions. Since the binomial and Poisson distributions are integer-valued, it turns out that the best gaussian approximation is obtained by making a “continuity correction.” To get an approximation to a density for an integer-valued random variable, the probability at t=k is represented by a rectangle of height pk and unit width, with k as the midpoint. Figure 1 shows a plot of the “density” and the corresponding gaussian density for n=300, p=0.1. It is apparent that the gaussian density is offset by approximately 1/2. To approximate the probability P(X≤k), take the area under the gaussian curve to the left of k+1/2; this is called the continuity correction.
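
The effect of the continuity correction can be seen numerically for the case n=300, p=0.1. The Python sketch below compares the exact binomial distribution function with the gaussian value at k and at k+1/2; the helper names and the choice of k values are assumptions for this illustration.

```python
from math import comb, erf, sqrt

def binom_cdf(n, p, k):
    """Exact P(X <= k) for X ~ binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def normal_cdf(mu, var, t):
    """P(Z <= t) for Z ~ N(mu, var)."""
    return 0.5 * (1.0 + erf((t - mu) / sqrt(2.0 * var)))

n, p = 300, 0.1
mu, var = n * p, n * p * (1 - p)      # mean np and variance npq
rows = []
for k in (25, 30, 35):
    exact = binom_cdf(n, p, k)
    plain = normal_cdf(mu, var, k)            # no correction
    corrected = normal_cdf(mu, var, k + 0.5)  # continuity correction
    rows.append((k, exact, plain, corrected))
    print(k, round(exact, 4), round(plain, 4), round(corrected, 4))
```

For each k shown, the corrected value should lie closer to the exact binomial probability than the uncorrected one.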

Use of m-procedures to compare

We have two m-procedures to make the comparisons. First, we consider approximation of the Poisson (μ) distribution. The m-procedure poissapp calls for a value of μ, selects a suitable range about k=μ and plots the distribution function for the Poisson distribution (stairs) and the normal (gaussian) distribution (dash dot) for N(μ,μ). In addition, the continuity correction is applied to the gaussian distribution at integer values (circles). Figure 7.10 shows plots for μ=10. It is clear that the continuity correction provides a much better approximation. The plots in Figure 7.11 are for μ=100. Here the continuity correction provides the better approximation, but not by as much as for the smaller μ.

The m-procedure bincomp compares the binomial, gaussian, and Poisson distributions. It calls for values of n and p, selects suitable k values, and plots the distribution function for the binomial, a continuous approximation to the distribution function for the Poisson, and continuity adjusted values of the gaussian distribution function at the integer values. Figure 7.11 shows plots for n=1000, p=0.03. The good agreement of all three distribution functions is evident. Figure 7.12 shows plots for n=50, p=0.6. There is still good agreement of the binomial and adjusted gaussian. However, the Poisson distribution does not track very well. The difficulty, as we see in the unit Variance, is the difference in variances—npq for the binomial as compared with np for the Poisson.

### Approximation of a real random variable by simple random variables

Simple random variables play a significant role, both in theory and applications. In the unit Random Variables, we show how a simple random variable is determined by the set of points on the real line representing the possible values and the corresponding set of probabilities that each of these values is taken on. This describes the distribution of the random variable and makes possible calculations of event probabilities and parameters for the distribution.

A continuous random variable is characterized by a set of possible values spread continuously over an interval or collection of intervals. In this case, the probability is also spread smoothly. The distribution is described by a probability density function, whose value at any point indicates "the probability per unit length" near the point. A simple approximation is obtained by subdividing an interval which includes the range (the set of possible values) into small enough subintervals that the density is approximately constant over each subinterval. A point in each subinterval is selected and is assigned the probability mass in its subinterval. The combination of the selected points and the corresponding probabilities describes the distribution of an approximating simple random variable. Calculations based on this distribution approximate corresponding calculations on the continuous distribution.

Before examining a general approximation procedure which has significant consequences for later treatments, we consider some illustrative examples.

Example 7.10 Simple approximation to Poisson

A random variable with the Poisson distribution is unbounded. However, for a given parameter value μ, the probability for k≥n, n sufficiently large, is negligible. Experiment indicates n=μ+6√μ (i.e., six standard deviations beyond the mean) is a reasonable value for 5≤μ≤200.

>> mu = [5 10 20 30 40 50 70 100 150 200];
>> K = zeros(1,length(mu));
>> p = zeros(1,length(mu));
>> for i = 1:length(mu)
K(i) = floor(mu(i)+ 6*sqrt(mu(i)));
p(i) = cpoisson(mu(i),K(i));
end
>> disp([mu;K;p*1e6]')
5.0000   18.0000    5.4163  % Residual probabilities are 0.000001
10.0000   28.0000    2.2535  % times the numbers in the last column.
20.0000   46.0000    0.4540  % K is the value of k needed to achieve
30.0000   62.0000    0.2140  % the residual shown.
40.0000   77.0000    0.1354
50.0000   92.0000    0.0668
70.0000  120.0000    0.0359
100.0000  160.0000    0.0205
150.0000  223.0000    0.0159
200.0000  284.0000    0.0133


An m-procedure for discrete approximation

If X is bounded, absolutely continuous with density function fX, the m-procedure tappr sets up the distribution for an approximating simple random variable. An interval containing the range of X is divided into a specified number of equal subdivisions. The probability mass for each subinterval is assigned to the midpoint. If dx is the length of the subintervals, then the integral of the density function over the subinterval is approximated by fX(ti)dx, where ti is the midpoint. In effect, the graph of the density over the subinterval is approximated by a rectangle of length dx and height fX(ti). Once the approximating simple distribution is established, calculations are carried out as for simple random variables.

Example 7.11 A numerical example

Suppose fX(t)=3t², 0≤t≤1. Determine P(0.2≤X≤0.9).

SOLUTION

In this case, an analytical solution is easy. FX(t)=t³ on the interval [0,1], so

P=0.9³–0.2³=0.7210. We use tappr as follows:

>> tappr
Enter matrix [a b] of x-range endpoints  [0 1]
Enter number of x approximation points  200
Enter density as a function of t  3*t.^2
Use row matrices X and PX as in the simple case
>> M = (X >= 0.2)&(X <= 0.9);
>> p = M*PX'
p  =  0.7210


Because of the regularity of the density and the number of approximation points, the result agrees quite well with the theoretical value.
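
The midpoint approximation described above can be sketched in Python as follows. This is not the tappr m-procedure itself: the function name `simple_approximation` is hypothetical, and normalizing the masses so that they sum to one is an assumption about how such a procedure would reasonably behave.

```python
def simple_approximation(density, a, b, n):
    """Midpoint discretization of a density on [a, b] into n point masses.

    Returns (X, PX): the subinterval midpoints and the normalized masses.
    """
    dx = (b - a) / n
    X = [a + (i + 0.5) * dx for i in range(n)]   # midpoints t_i
    PX = [density(t) * dx for t in X]            # approximate mass f(t_i) * dx
    total = sum(PX)
    PX = [mass / total for mass in PX]           # assumption: normalize to total mass one
    return X, PX

X, PX = simple_approximation(lambda t: 3 * t**2, 0.0, 1.0, 200)
p = sum(mass for x, mass in zip(X, PX) if 0.2 <= x <= 0.9)
print(round(p, 4))
```

With 200 subintervals the endpoints 0.2 and 0.9 fall on subdivision points, so no midpoint sits on the boundary of the event and the result is very close to the analytical value.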

The next example is a more complex one. In particular, the distribution is not bounded. However, it is easy to determine a bound beyond which the probability is negligible.

The life (in miles) of a certain brand of radial tires may be represented by a random variable X with density

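$$f_X(t) = \begin{cases} t^2/a^3 & 0 \le t < 40{,}000 \\ (b/a)\, e^{-k(t - 40{,}000)} & 40{,}000 \le t \end{cases} \tag{7.34}$$

where a=40,000, b=20/3, and k=1/4000. Determine P(X≥45,000).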

>> a = 40000;
>> b = 20/3;
>> k = 1/4000;
>> % Test shows cutoff point of 80000 should be satisfactory
>> tappr
Enter matrix [a b] of x-range endpoints  [0 80000]
Enter number of x approximation points  80000/20
Enter density as a function of t  (t.^2/a^3).*(t < 40000) + ...
(b/a)*exp(k*(a-t)).*(t >= 40000)
Use row matrices X and PX as in the simple case
>> P = (X >= 45000)*PX'
P   =  0.1910             % Theoretical value = (2/3)exp(-5/4) = 0.191003
>> cdbn
Enter row matrix of VALUES  X
Enter row matrix of PROBABILITIES  PX  % See Figure 7.14 for plot


In this case, we use a rather large number of approximation points. As a consequence, the results are quite accurate. In the single-variable case, designating a large number of approximating points usually causes no computer memory problem.

The general approximation procedure

We show now that any bounded real random variable may be approximated as closely as desired by a simple random variable (i.e., one having a finite set of possible values). For the unbounded case, the approximation is close except in a portion of the range having arbitrarily small total probability.

We limit our discussion to the bounded case, in which the range of X is limited to a bounded interval I=[a,b]. Suppose I is partitioned into n subintervals by points ti, 1≤i≤n–1, with a=t0 and b=tn. Let $M_i = [t_{i-1}, t_i)$ be the ith subinterval, 1≤i≤n–1, and $M_n = [t_{n-1}, t_n]$ (see Figure 7.14). Now random variable X may map into any point in the interval, and hence into any point in each subinterval Mi. Let $E_i = X^{-1}(M_i)$ be the set of points mapped into Mi by X. Then the Ei form a partition of the basic space Ω. For the given subdivision, we form a simple random variable Xs as follows. In each subinterval, pick a point si, $t_{i-1} \le s_i < t_i$. Consider the simple random variable $X_s = \sum_{i=1}^{n} s_i I_{E_i}$.

This random variable is in canonical form. If ω∈Ei, then X(ω)∈Mi and Xs(ω)=si. Now the absolute value of the difference satisfies

$$|X(\omega) - X_s(\omega)| < t_i - t_{i-1}, \quad \text{the length of subinterval } M_i \tag{7.35}$$

Since this is true for each ω and the corresponding subinterval, we have the important fact

$$|X(\omega) - X_s(\omega)| < \text{maximum length of the } M_i \tag{7.36}$$

By making the subintervals small enough by increasing the number of subdivision points, we can make the difference as small as we please.

While the choice of the si is arbitrary in each Mi, the selection of si=ti–1 (the left-hand endpoint) leads to the property Xs(ω)≤X(ω)∀ω. In this case, if we add subdivision points to decrease the size of some or all of the Mi, the new simple approximation Ys satisfies

$$X_s(\omega) \le Y_s(\omega) \le X(\omega) \quad \forall\, \omega \tag{7.37}$$

To see this, consider ti*∈Mi (see Figure 7.15). Mi is partitioned into Mi' and Mi'', and Ei is partitioned into Ei' and Ei''. X maps Ei' into Mi' and Ei'' into Mi''. Ys maps Ei' into ti and maps Ei'' into ti*>ti. Xs maps both Ei' and Ei'' into ti. Thus, the asserted inequality must hold for each ω. By taking a sequence of partitions in which each succeeding partition refines the previous (i.e., adds subdivision points) in such a way that the maximum length of subinterval goes to zero, we may form a nondecreasing sequence of simple random variables Xn which increase to X for each ω.
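
The refinement argument can be illustrated with a small Python experiment. The sketch assumes equal subintervals of [0,1] and uses random numbers as stand-ins for the values X(ω); the function name and the partition sizes 8 and 16 are choices made only for this illustration.

```python
import random

def left_endpoint_value(x, a, b, n):
    """Value of the simple approximation when X(omega) = x: the left-hand endpoint
    of the subinterval containing x, for n equal subintervals of [a, b]."""
    width = (b - a) / n
    i = min(int((x - a) / width), n - 1)   # x = b is placed in the last subinterval
    return a + i * width

random.seed(0)                             # fixed seed so the run is repeatable
a, b = 0.0, 1.0
xs = [random.uniform(a, b) for _ in range(1000)]          # stand-ins for X(omega)

coarse = [left_endpoint_value(x, a, b, 8) for x in xs]    # plays the role of Xs
fine = [left_endpoint_value(x, a, b, 16) for x in xs]     # plays the role of Ys

ordered = all(c <= f <= x for c, f, x in zip(coarse, fine, xs))
max_err_coarse = max(x - c for x, c in zip(xs, coarse))
max_err_fine = max(x - f for x, f in zip(xs, fine))
print(ordered, max_err_coarse <= 1 / 8, max_err_fine <= 1 / 16)
```

Because the 16-point partition keeps every subdivision point of the 8-point partition, the finer approximation lies between the coarser one and X, and each error is bounded by the corresponding subinterval length.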

The latter result may be extended to random variables unbounded above. Simply let the Nth set of subdivision points extend from a to N, making the last subinterval [N,∞). Subintervals from a to N are made increasingly shorter. The result is a nondecreasing sequence of simple random variables, with XN(ω)→X(ω) as N→∞, for each ω∈Ω.

For probability calculations, we simply select an interval I large enough that the probability outside I is negligible and use a simple approximation over I.
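The left-endpoint construction can be checked numerically. The following is a minimal Python sketch, not part of the m-function toolbox used in this chapter; the name `left_endpoint_approx`, the equal-width partition, and the sample values are hypothetical choices made only for illustration. It replaces each value of X by the left endpoint of the subinterval containing it, then confirms that the error never exceeds the subinterval length and that doubling the number of subintervals never lowers the approximation.

```python
import math

def left_endpoint_approx(x, a, b, n):
    """Value of the simple approximation X_s at a point where X = x.

    [a, b] is cut into n equal subintervals; x is replaced by the
    left endpoint of the subinterval that contains it.
    """
    width = (b - a) / n
    # Index of the subinterval; x == b is assigned to the last one.
    i = min(int(math.floor((x - a) / width)), n - 1)
    return a + i * width

a, b = 0.0, 2.0
samples = [0.0, 0.3, 0.999, 1.0, 1.7, 2.0]   # hypothetical values X(omega)

for n in (4, 8, 16, 32):
    coarse = [left_endpoint_approx(x, a, b, n) for x in samples]
    fine = [left_endpoint_approx(x, a, b, 2 * n) for x in samples]
    width = (b - a) / n
    # Refinement gives X_s <= Y_s <= X at every sample point.
    assert all(c <= f <= x for c, f, x in zip(coarse, fine, samples))
    # The error is bounded by the subinterval length.
    assert all(x - c <= width for c, x in zip(coarse, samples))
    print(n, max(x - c for c, x in zip(coarse, samples)))
```

The printed maximum error halves each time n doubles, which is the numerical counterpart of making the difference "as small as we please."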

## 7.3. Problems on Distribution and Density Functions*

(See Exercises 3 and 4 from "Problems on Random Variables and Probabilities"). The class $\{C_j : 1 \le j \le 10\}$ is a partition. Random variable X has values {1,3,2,3,4,2,1,3,5,2} on C1 through C10, respectively, with probabilities 0.08, 0.13, 0.06, 0.09, 0.14, 0.11, 0.12, 0.07, 0.11, 0.09. Determine and plot the distribution function FX.

T = [1 3 2 3 4 2 1 3 5 2];
pc = 0.01*[8 13 6 9 14 11 12 7 11 9];
[X,PX] = csort(T,pc);
ddbn
Enter row matrix of VALUES  X
Enter row matrix of PROBABILITIES  PX    % See MATLAB plot
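For readers without the m-functions, the consolidation step can be sketched in Python. This assumes that csort sorts the distinct values and adds the probabilities of repeated values, and that the plotted distribution function is the running total of those masses; the names `consolidate` and `distribution_function` are hypothetical stand-ins, not toolbox functions.

```python
from collections import defaultdict

def consolidate(values, probs):
    """Sort the distinct values and add up the probabilities of repeats."""
    mass = defaultdict(float)
    for v, p in zip(values, probs):
        mass[v] += p
    xs = sorted(mass)
    return xs, [mass[x] for x in xs]

def distribution_function(xs, px, t):
    """F_X(t): total probability mass at or to the left of t."""
    return sum(p for x, p in zip(xs, px) if x <= t)

T = [1, 3, 2, 3, 4, 2, 1, 3, 5, 2]
pc = [0.08, 0.13, 0.06, 0.09, 0.14, 0.11, 0.12, 0.07, 0.11, 0.09]

X, PX = consolidate(T, pc)
# One row per distinct value: value, point mass, F_X at that value.
for x, p in zip(X, PX):
    print(x, round(p, 4), round(distribution_function(X, PX, x), 4))
```

Each printed row gives a jump point of FX, the size of the jump, and the value of FX at that point.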


(See Exercise 6 from "Problems on Random Variables and Probabilities"). A store has eight items for sale. The prices are $3.50, $5.00, $3.50, $7.50, $5.00, $5.00, $3.50, and $7.50, respectively. A customer comes in. She purchases one of the items with probabilities 0.10, 0.15, 0.15, 0.20, 0.10, 0.05, 0.10, 0.15. The random variable expressing the amount of her purchase may be written

(7.38) $X = 3.5\,I_{C_1} + 5.0\,I_{C_2} + 3.5\,I_{C_3} + 7.5\,I_{C_4} + 5.0\,I_{C_5} + 5.0\,I_{C_6} + 3.5\,I_{C_7} + 7.5\,I_{C_8}$

Determine and plot the distribution function for X.

T = [3.5 5 3.5 7.5 5 5 3.5 7.5];
pc = 0.01*[10 15 15 20 10 5 10 15];
[X,PX] = csort(T,pc);
ddbn
Enter row matrix of VALUES  X
Enter row matrix of PROBABILITIES  PX    % See MATLAB plot


(See Exercise 12 from "Problems on Random Variables and Probabilities"). The class $\{A, B, C, D\}$ has minterm probabilities

(7.39)

Determine and plot the distribution function for the random variable X=IA+IB+IC+ID, which counts the number of the events which occur on a trial.

npr06_12
Minterm probabilities in pm, coefficients in c
T = sum(mintable(4)); % Alternate solution.  See Exercise 12 from "Problems on Random Variables and Probabilities"
[X,PX] = csort(T,pm);
ddbn
Enter row matrix of VALUES  X
Enter row matrix of PROBABILITIES  PX    % See MATLAB plot


Suppose a is a ten digit number. A wheel turns up the digits 0 through 9 with equal probability on each spin. On ten spins what is the probability of matching, in order, k or more of the ten digits in a, 0≤k≤10? Assume the initial digit may be zero.

P = cbinom(10,0.1,0:10)

In a thunderstorm in a national park there are 127 lightning strikes. Experience shows that the probability of a lightning strike starting a fire is about 0.0083. What is the probability that k fires are started, k=0,1,2,3?

P = ibinom(127,0.0083,0:3)
P = 0.3470 0.3688 0.1945 0.0678
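The binomial solutions in this section can be cross-checked without the m-functions. The sketch below assumes, from the way they are used here, that ibinom(n,p,k) returns P(X = k) and cbinom(n,p,k) returns P(X ≥ k) for X binomial (n, p). The Python functions of the same names are hypothetical stand-ins written for this check, not the toolbox code.

```python
from math import comb

def ibinom(n, p, k):
    """P(X = k) for X ~ binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def cbinom(n, p, k):
    """P(X >= k) for X ~ binomial(n, p)."""
    return sum(ibinom(n, p, j) for j in range(k, n + 1))

# Lightning strikes: probability of exactly k fires, k = 0, 1, 2, 3.
print([round(ibinom(127, 0.0083, k), 4) for k in range(4)])

# Ten-digit match: probability of k or more matches, k = 0, ..., 10.
print([round(cbinom(10, 0.1, k), 4) for k in range(11)])
```

A "k or fewer" probability, as in the lamp and drawing exercises, is then `1 - cbinom(n, p, k + 1)`.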

A manufacturing plant has 350 special lamps on its production lines. On any day, each lamp could fail with probability p=0.0017. These lamps are critical, and must be replaced as quickly as possible. It takes about one hour to replace a lamp, once it has failed. What is the probability that on any day the loss of production time due to lamp failures is k or fewer hours, k=0,1,2,3,4,5?

P = 1 - cbinom(350,0.0017,1:6)

=  0.5513    0.8799    0.9775    0.9968    0.9996    1.0000


Two hundred persons buy tickets for a drawing. Each ticket has probability 0.008 of winning. What is the probability of k or fewer winners, k=2,3,4?

P = 1 - cbinom(200,0.008,3:5) = 0.7838 0.9220 0.9768

Two coins are flipped twenty times. What is the probability the results match (both heads or both tails) k times, 0≤k≤20?

P = ibinom(20,1/2,0:20)

Thirty members of a class each flip a coin ten times. What is the probability that at least five of them get seven or more heads?

p = cbinom(10,0.5,7) = 0.1719

P = cbinom(30,p,5) = 0.6052


For the system in Exercise 6, call a day in which one or more failures occur among the 350 lamps a “service day.” Since a Bernoulli sequence “starts over” at any time, the sequence of service/nonservice days may be considered a Bernoulli sequence with probability p1, the probability of one or more lamp failures in a day.

1. Beginning on a Monday morning, what is the probability the first service day is the first, second, third, fourth, fifth day of the week?

2. What is the probability of no service days in a seven day week?

p1 = 1 - (1 - 0.0017)^350 = 0.4487   % probability a given day is a service day
k = 1:5;

1. P = p1*(1 - p1).^(k-1) = 0.4487  0.2474  0.1364  0.0752  0.0414

2. P0 = (1 - p1)^7 = 0.0155
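The same calculation can be sketched in Python as a check on the arithmetic. The variable names are hypothetical; the formulas are the ones used in the solution above, namely the geometric pattern of k – 1 nonservice days followed by a service day.

```python
p_lamp = 0.0017      # daily failure probability of one lamp
n_lamps = 350

# Probability that at least one of the lamps fails on a given day.
p1 = 1 - (1 - p_lamp) ** n_lamps

# First service day falls on day k: k-1 nonservice days, then a service day.
first_on_day = [p1 * (1 - p1) ** (k - 1) for k in range(1, 6)]

# No service day in a seven-day week.
none_in_week = (1 - p1) ** 7

print(round(p1, 4))
print([round(q, 4) for q in first_on_day])
print(round(none_in_week, 4))
```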


For the system in Exercise 6 and Exercise 10, assume the plant works seven days a week. What is the probability the third service day occurs by the end of 10 days? Solve using the negative binomial distribution; repeat using the binomial distribution.

p1 = 1 - (1 - 0.0017)^350 = 0.4487

• P = sum(nbinom(3,p1,3:10)) = 0.8990

• Pa = cbinom(10,p1,3) = 0.8990
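The agreement of the two answers reflects a general identity: the mth success occurs by trial n if and only if there are m or more successes in the first n trials. A short Python sketch of both calculations follows. It assumes nbinom(m,p,k) is the probability that the mth success occurs on trial k, which is how the solution above uses it; the Python functions are hypothetical stand-ins for the m-functions.

```python
from math import comb

def nbinom(m, p, k):
    """P(the m-th success occurs on trial k) in a Bernoulli sequence."""
    return comb(k - 1, m - 1) * p**m * (1 - p)**(k - m)

def cbinom(n, p, k):
    """P(k or more successes in n trials)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

p1 = 1 - (1 - 0.0017) ** 350                      # probability of a service day
P = sum(nbinom(3, p1, k) for k in range(3, 11))   # third service day by day 10
Pa = cbinom(10, p1, 3)                            # three or more in ten days
print(round(P, 4), round(Pa, 4))
```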

A residential College plans to raise money by selling “chances” on a board. Fifty chances are sold. A player pays $10 to play; he or she wins $30 with probability p=0.2. The profit to the College is

(7.40) $X = 50 \cdot 10 - 30N$, where N is the number of winners

Determine the distribution for X and calculate P(X>0), P(X≥200), and P(X≥300).

N = 0:50;
PN = ibinom(50,0.2,0:50);
X = 500 - 30*N;
Ppos = (X>0)*PN'
Ppos =  0.9856
P200 = (X>=200)*PN'
P200 =  0.5836
P300 = (X>=300)*PN'
P300 =  0.1034
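The same event probabilities can be sketched in Python. The helper `prob` is a hypothetical name; it plays the role of the MATLAB expression `(X>0)*PN'`, which adds the probabilities of those values of N for which the profit satisfies the condition.

```python
from math import comb

n, p = 50, 0.2
# Profit for each possible number of winners N = 0, 1, ..., 50.
profit = [10 * n - 30 * k for k in range(n + 1)]
# P(N = k) for N ~ binomial(50, 0.2).
pn = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def prob(event):
    """Total probability of the outcomes whose profit satisfies `event`."""
    return sum(q for x, q in zip(profit, pn) if event(x))

print(round(prob(lambda x: x > 0), 4))
print(round(prob(lambda x: x >= 200), 4))
print(round(prob(lambda x: x >= 300), 4))
```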


A single six-sided die is rolled repeatedly until either a one or a six turns up. What is the probability that the first appearance of either of these numbers is achieved by the fifth trial or sooner?

P = 1 - (2/3)^5 = 0.8683

Consider a Bernoulli sequence with probability p=0.53 of success on any component trial.

1. The probability the fourth success will occur no later than the tenth trial is determined by the negative binomial distribution. Use the procedure nbinom to calculate this probability.

2. Calculate this probability using the binomial distribution.

1. P = sum(nbinom(4,0.53,4:10)) = 0.8729

2.  Pa = cbinom(10,0.53,4) = 0.8729 

Fifty percent of the components coming off an assembly line fail to meet specifications for a special job. It is desired to select three units which meet the stringent specifications. Items are selected and tested in succession. Under the usual assumptions for Bernoulli trials, what is the probability the third satisfactory unit will be found on six or fewer trials?

P = cbinom(6,0.5,3) = 0.6562

The number of cars passing a certain traffic count position in an hour has Poisson (53) distribution. What is the probability the number of cars passing in an hour lies between 45 and 55 (inclusive)? What is the probability of more than 55?

P1 = cpoisson(53,45) - cpoisson(53,56) = 0.5224

P2 = cpoisson(53,56) = 0.3581
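The Poisson calculations can be reproduced with a few lines of Python. The sketch assumes, from its use in these solutions, that cpoisson(mu,k) returns P(Y ≥ k) for Y Poisson (mu); the Python function below is a hypothetical stand-in. It builds each term from the previous one so that no large factorials are formed.

```python
from math import exp

def cpoisson(mu, k):
    """P(Y >= k) for Y ~ Poisson(mu), k a nonnegative integer."""
    term = exp(-mu)            # P(Y = 0)
    below = 0.0                # accumulates P(Y <= k - 1)
    for j in range(k):
        below += term
        term *= mu / (j + 1)   # P(Y = j + 1) from P(Y = j)
    return 1.0 - below

P1 = cpoisson(53, 45) - cpoisson(53, 56)   # P(45 <= Y <= 55)
P2 = cpoisson(53, 56)                      # P(Y > 55)
print(round(P1, 4), round(P2, 4))
```

An inclusive range a ≤ Y ≤ b is always the difference `cpoisson(mu, a) - cpoisson(mu, b + 1)`, and "at most b" is `1 - cpoisson(mu, b + 1)`.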


Compare P(X≤k) and P(Y≤k) for X ~ binomial (5000, 0.001) and Y ~ Poisson (5), for 0≤k≤10. Do this directly with ibinom and ipoisson. Then use the m-procedure bincomp to obtain graphical results (including a comparison with the normal distribution).

k = 0:10;
Pb = 1 - cbinom(5000,0.001,k+1);
Pp = 1 - cpoisson(5,k+1);
disp([k;Pb;Pp]')
0    0.0067    0.0067
1.0000    0.0404    0.0404
2.0000    0.1245    0.1247
3.0000    0.2649    0.2650
4.0000    0.4404    0.4405
5.0000    0.6160    0.6160
6.0000    0.7623    0.7622
7.0000    0.8667    0.8666
8.0000    0.9320    0.9319
9.0000    0.9682    0.9682
10.0000    0.9864    0.9863

bincomp
Enter the parameter n  5000
Enter the parameter p  0.001
Binomial-- stairs
Poisson--  -.-.
gtext('Exercise 17')
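The numerical part of this comparison can be sketched in Python without the m-functions. The function names are hypothetical; the sketch tabulates P(X ≤ k) for the binomial (5000, 0.001) distribution next to P(Y ≤ k) for the Poisson distribution with the same mean, np = 5. It does not reproduce the plots from bincomp.

```python
from math import comb, exp

n, p = 5000, 0.001
mu = n * p     # matching Poisson parameter

def binom_cdf(k):
    """P(X <= k) for X ~ binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def poisson_cdf(k):
    """P(Y <= k) for Y ~ Poisson(mu)."""
    term, total = exp(-mu), 0.0
    for j in range(k + 1):
        total += term
        term *= mu / (j + 1)
    return total

for k in range(11):
    print(k, round(binom_cdf(k), 4), round(poisson_cdf(k), 4))
```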


Suppose X ~ binomial (12, 0.375), Y ~ Poisson (4.5), and Z ~ exponential (1/4.5). For each random variable, calculate and tabulate the probability of a value at least k, for integer values 3≤k≤8.

k = 3:8;
Px = cbinom(12,0.375,k);
Py = cpoisson(4.5,k);
Pz = exp(-k/4.5);
disp([k;Px;Py;Pz]')
3.0000    0.8865    0.8264    0.5134
4.0000    0.7176    0.6577    0.4111
5.0000    0.4897    0.4679    0.3292
6.0000    0.2709    0.2971    0.2636
7.0000    0.1178    0.1689    0.2111
8.0000    0.0390    0.0866    0.1690


The number of noise pulses arriving on a power circuit in an hour is a random quantity having Poisson (7) distribution. What is the probability of having at least 10 pulses in an hour? What is the probability of having at most 15 pulses in an hour?

P1 = cpoisson(7,10) = 0.1695

P2 = 1 - cpoisson(7,16) = 0.9976

The number of customers arriving in a small specialty store in an hour is a random quantity having Poisson (5) distribution. What is the probability the number arriving in an hour will be between three and seven, inclusive? What is the probability of no more than ten?

P1 = cpoisson(5,3) - cpoisson(5,8) = 0.7420

P2 = 1 - cpoisson(5,11) = 0.9863


Random variable X ~ binomial (1000, 0.1).

1. Determine P(X≥80), P(X≥100), P(X≥120).

2. Use the appropriate Poisson distribution to approximate these values.

k = [80 100 120];
P = cbinom(1000,0.1,k)
P  =  0.9867    0.5154    0.0220
P1 = cpoisson(100,k)
P1 =  0.9825    0.5133    0.0282


The time to failure, in hours of operating time, of a television set subject to random voltage surges has the exponential (0.002) distribution. Suppose the unit has operated successfully for 500 hours. What is the (conditional) probability it will operate for another 500 hours?

$P(X > 500+500 \mid X > 500) = P(X > 500) = e^{-0.002 \cdot 500} = 0.3679$

For X ~ exponential (λ), determine P(X≥1/λ), P(X≥2/λ).

$P(X > k/\lambda) = e^{-\lambda k/\lambda} = e^{-k}$
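Both results rest on the survival function P(X > t) = exp(–λt) of the exponential distribution. A minimal Python sketch (the name `survival` is a hypothetical helper) checks the "lack of memory" computation for the television set and the fact that P(X > k/λ) does not depend on λ.

```python
from math import exp

def survival(lam, t):
    """P(X > t) for X ~ exponential(lam)."""
    return exp(-lam * t)

lam = 0.002
# Conditional probability of lasting 500 more hours, given 500 hours so far:
# P(X > 1000) / P(X > 500), which reduces to P(X > 500).
conditional = survival(lam, 1000) / survival(lam, 500)
print(round(conditional, 4), round(survival(lam, 500), 4))

# P(X > k / lam) = exp(-k), whatever the value of lam.
for k in (1, 2):
    print(k, round(survival(lam, k / lam), 4), round(exp(-k), 4))
```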

Twenty “identical” units are put into operation. They fail independently. The times to failure (in hours) form an iid class, exponential (0.0002). This means the “expected” life is 5000 hours. Determine the probabilities that at least k, for k=5,8,10,12,15, will survive for 5000 hours.

p = exp(-0.0002*5000)
p = 0.3679
k = [5 8 10 12 15];
P = cbinom(20,p,k)
P = 0.9110  0.4655  0.1601  0.0294  0.0006


Let T ~ gamma (20, 0.0002) be the total operating time for the units described in Exercise 24.

1. Use the m-function for the gamma distribution to determine P(T≤100,000).

2. Use the Poisson distribution to determine P(T≤100,000).

P1 = gammadbn(20,0.0002,100000) = 0.5297

P2 = cpoisson(0.0002*100000,20) = 0.5297
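The agreement between the two answers is the gamma-Poisson relation: for X gamma (n, λ) with integer n, P(X ≤ t) = P(Y ≥ n) where Y is Poisson (λt). The Python sketch below uses hypothetical helper names. It evaluates the relation and, for a small case, compares it with a direct midpoint-rule integration of the gamma density; the step count is an arbitrary choice.

```python
from math import exp, factorial

def cpoisson(mu, k):
    """P(Y >= k) for Y ~ Poisson(mu)."""
    term, below = exp(-mu), 0.0
    for j in range(k):
        below += term
        term *= mu / (j + 1)
    return 1.0 - below

def gamma_cdf(n, lam, t):
    """P(X <= t) for X ~ gamma(n, lam), integer n, via the Poisson relation."""
    return cpoisson(lam * t, n)

def gamma_cdf_numeric(n, lam, t, steps=20000):
    """Same probability by a midpoint Riemann sum of the gamma density."""
    h = t / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += lam**n * u**(n - 1) * exp(-lam * u) / factorial(n - 1)
    return total * h

print(round(gamma_cdf(20, 0.0002, 100000), 4))
print(round(gamma_cdf(3, 3, 2), 4), round(gamma_cdf_numeric(3, 3, 2), 4))
```

The last line treats the case gamma (3, 3) at t = 2 both ways; the two printed values should agree to the displayed precision.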

The sum of the times to failure for five independent units is a random variable X gamma . Without using tables or m-programs, determine P(X≤25).

(7.41)
(7.42)

Interarrival times (in minutes) for fax messages on a terminal are independent, exponential (λ=0.1). This means the time X for the arrival of the fourth message is gamma(4, 0.1). Without using tables or m-programs, utilize the relation of the gamma to the Poisson distribution to determine P(X≤30).

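(7.43) $P(X \le 30) = P(Y \ge 4)$, $Y \sim$ Poisson $(0.1 \cdot 30 = 3)$
(7.44) $P(Y \ge 4) = 1 - P(Y \le 3) = 1 - e^{-3}(1 + 3 + 9/2 + 27/6) = 0.3528$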

Customers arrive at a service center with independent interarrival times in hours, which have exponential (3) distribution. The time X for the third arrival is thus gamma (3, 3). Without using tables or m-programs, determine P(X≤2).

(7.45) $P(X \le 2) = P(Y \ge 3)$, $Y \sim$ Poisson $(3 \cdot 2 = 6)$
(7.46) $P(Y \ge 3) = 1 - P(Y \le 2) = 1 - e^{-6}(1 + 6 + 36/2) = 0.9380$

Five people wait to use a telephone, currently in use by a sixth person. Suppose time for the six calls (in minutes) are iid, exponential (1/3). What is the distribution for the total time Z from the present for the six calls? Use an appropriate Poisson distribution to determine P(Z≤20).

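Z ~ gamma (6, 1/3).

(7.47) $P(Z \le 20) = P(Y \ge 6)$, $Y \sim$ Poisson $(1/3 \cdot 20)$
(7.48) P(Y ≥ 6) = cpoisson(20/3,6) = 0.6547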

A random number generator produces a sequence of numbers between 0 and 1. Each of these can be considered an observed value of a random variable uniformly distributed on the interval [0, 1]. They assume their values independently. A sequence of 35 numbers is generated. What is the probability 25 or more are less than or equal to 0.71? (Assume continuity. Do not make a discrete adjustment.)

p = cbinom(35,0.71,25) = 0.5620

Five “identical” electronic devices are installed at one time. The units fail independently, and the time to failure, in days, of each is a random variable exponential (1/30). A maintenance check is made each fifteen days. What is the probability that at least four are still operating at the maintenance check?

p = exp(-15/30) = 0.6065

P = cbinom(5,p,4) = 0.3483

Suppose $X \sim N(4, 81)$. That is, X has gaussian distribution with mean μ=4 and variance σ²=81.

1. Use a table of standardized normal distribution to determine P(2<X<8) and P(|X–4|≤5).

2. Calculate the probabilities in part 1 with the m-function gaussian.

1. (7.49) $P(2 < X < 8) = \Phi((8-4)/9) - \Phi((2-4)/9) =$

(7.50) $\Phi(4/9) + \Phi(2/9) - 1 = 0.6712 + 0.5875 - 1 = 0.2587$

(7.51) $P(|X-4| \le 5) = 2\Phi(5/9) - 1 = 1.4212 - 1 = 0.4212$
2. P1 = gaussian(4,81,8) - gaussian(4,81,2)
P1 = 0.2596
P2 = gaussian(4,81,9) - gaussian(4,81,-1)
P2 = 0.4215


Suppose $X \sim N(5, 81)$. That is, X has gaussian distribution with μ=5 and σ²=81. Use a table of standardized normal distribution to determine P(3<X<9) and P(|X–5|≤5). Check your results using the m-function gaussian.

(7.52) $P(3 < X < 9) = \Phi((9-5)/9) - \Phi((3-5)/9) = \Phi(4/9) + \Phi(2/9) - 1 = 0.6712 + 0.5875 - 1 = 0.2587$
(7.53) $P(|X-5| \le 5) = 2\Phi(5/9) - 1 = 1.4212 - 1 = 0.4212$
P1 = gaussian(5,81,9) - gaussian(5,81,3)
P1 = 0.2596
P2 = gaussian(5,81,10) - gaussian(5,81,0)
P2 = 0.4215


Suppose $X \sim N(3, 64)$. That is, X has gaussian distribution with μ=3 and σ²=64. Use a table of standardized normal distribution to determine P(1<X<9) and P(|X–3|≤4). Check your results with the m-function gaussian.

(7.54) $P(1 < X < 9) = \Phi((9-3)/8) - \Phi((1-3)/8) =$
(7.55) $\Phi(0.75) + \Phi(0.25) - 1 = 0.7734 + 0.5987 - 1 = 0.3721$
(7.56) $P(|X-3| \le 4) = 2\Phi(4/8) - 1 = 1.3829 - 1 = 0.3829$
P1 = gaussian(3,64,9) - gaussian(3,64,1)
P1 = 0.3721
P2 = gaussian(3,64,7) - gaussian(3,64,-1)
P2 = 0.3829
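The gaussian checks can be reproduced with the error function from the Python standard library. The sketch assumes, as the calls above indicate, that gaussian(mu,var,t) takes the variance (not the standard deviation) as its second argument and returns P(X ≤ t). The Python function is a hypothetical stand-in for the m-function.

```python
from math import erf, sqrt

def gaussian(mu, var, t):
    """P(X <= t) for X ~ N(mu, var); var is the variance, not sigma."""
    return 0.5 * (1.0 + erf((t - mu) / sqrt(2.0 * var)))

# X ~ N(3, 64): P(1 < X < 9) and P(|X - 3| <= 4).
P1 = gaussian(3, 64, 9) - gaussian(3, 64, 1)
P2 = gaussian(3, 64, 7) - gaussian(3, 64, -1)
print(round(P1, 4), round(P2, 4))
```

Passing the variance consistently in every call matters: changing it in only one of the two terms of a difference silently shifts the answer in the third decimal place.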


Items coming off an assembly line have a critical dimension which is represented by a random variable N(10, 0.01). Ten items are selected at random. What is the probability that three or more are within 0.05 of the mean value μ?

p = gaussian(10,0.01,10.05) - gaussian(10,0.01,9.95)
p =  0.3829
P = cbinom(10,p,3)
P =  0.8036


The result of extensive quality control sampling shows that a certain model of digital watch coming off a production line has accuracy, in seconds per month, that is normally distributed with μ=5 and σ²=300. To achieve a top grade, a watch must have an accuracy within the range of -5 to +10 seconds per month. What is the probability a watch taken from the production line to be tested will achieve top grade? Calculate, using a standardized normal table. Check with the m-function gaussian.

(7.57) $P(-5 \le X \le 10) = \Phi(5/\sqrt{300}) + \Phi(10/\sqrt{300}) - 1 = \Phi(0.289) + \Phi(0.577) - 1 = 0.614 + 0.718 - 1 = 0.332$

Use the m-procedure bincomp with various values of n from 10 to 500 and p from 0.01 to 0.7, to observe the approximation of the binomial distribution by the Poisson.

Experiment with the m-procedure bincomp.

Use the m-procedure poissapp to compare the Poisson and gaussian distributions. Use various values of μ from 10 to 500.

Experiment with the m-procedure poissapp.

Random variable X has density $f_X(t) = \frac{3}{2}t^2$, $-1 \le t \le 1$ (and zero elsewhere).

1. Determine P(–0.5≤X<0.8), P(|X|>0.5), P(|X–0.25|≤0.5).

2. Determine an expression for the distribution function.

3. Use the m-procedures tappr and cdbn to plot an approximation to the distribution function.

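(7.58) $\int \frac{3}{2}t^2\,dt = t^3/2$
1. (7.59) $P(-0.5 \le X < 0.8) = 0.5\,[0.8^3 - (-0.5)^3] = 0.3185$, $P(|X| > 0.5) = 2\int_{0.5}^{1} \frac{3}{2}t^2\,dt = 1 - 0.5^3 = 7/8$

(7.60) $P(|X - 0.25| \le 0.5) = P(-0.25 \le X \le 0.75) = 0.5\,[(3/4)^3 - (-1/4)^3] = 7/32$

2. $F_X(t) = \int_{-1}^{t} \frac{3}{2}u^2\,du = \frac{1}{2}(t^3 + 1)$, $-1 \le t \le 1$

3. tappr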
Enter matrix [a b] of x-range endpoints  [-1 1]
Enter number of x approximation points  200
Enter density as a function of t  1.5*t.^2
Use row matrices X and PX as in the simple case
cdbn
Enter row matrix of VALUES  X
Enter row matrix of PROBABILITIES  PX    % See MATLAB plot
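The idea behind the tappr and cdbn steps can be sketched in Python. This is only an illustration under stated assumptions: it supposes that the approximation places a point mass proportional to the density at the midpoint of each of n equal subintervals and then accumulates those masses, which may differ in detail from what the m-procedures do. The names `discretize` and `cdf_from_masses` are hypothetical. The exact distribution function (t³ + 1)/2, obtained by integrating the density 1.5t² from –1 to t, is printed alongside for comparison.

```python
def discretize(density, a, b, n):
    """Midpoint approximation: n points on [a, b] with normalized masses."""
    dx = (b - a) / n
    xs = [a + (i + 0.5) * dx for i in range(n)]
    weights = [density(x) * dx for x in xs]
    total = sum(weights)
    return xs, [w / total for w in weights]

def cdf_from_masses(xs, px, t):
    """Approximate F_X(t) as the total mass at or to the left of t."""
    return sum(p for x, p in zip(xs, px) if x <= t)

X, PX = discretize(lambda t: 1.5 * t**2, -1.0, 1.0, 200)
for t in (-0.5, 0.0, 0.8):
    print(t, round(cdf_from_masses(X, PX, t), 4), round((t**3 + 1) / 2, 4))
```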


Random variable X has density function $f_X(t) = t - \frac{3}{8}t^2$, $0 \le t \le 2$ (and zero elsewhere).

1. Determine P(X≤0.5), P(0.5≤X<1.5), P(|X–1|<1/4).

2. Determine an expression for the distribution function.

3. Use the m-procedures tappr and cdbn to plot an approximation to the distribution function.

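(7.61) $\int \left(t - \frac{3}{8}t^2\right)dt = \frac{t^2}{2} - \frac{t^3}{8}$
1. (7.62) $P(X \le 0.5) = F_X(0.5) = 7/64$, $P(0.5 \le X < 1.5) = F_X(1.5) - F_X(0.5) = 19/32$, $P(|X-1| < 1/4) = F_X(1.25) - F_X(0.75) = 79/256$

2. $F_X(t) = \frac{t^2}{2} - \frac{t^3}{8}$, $0 \le t \le 2$

3. tappr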
Enter matrix [a b] of x-range endpoints  [0 2]
Enter number of x approximation points  200
Enter density as a function of t  t - (3/8)*t.^2
Use row matrices X and PX as in the simple case
cdbn
Enter row matrix of VALUES  X
Enter row matrix of PROBABILITIES  PX    % See MATLAB plot


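Random variable X has density function

(7.63) $f_X(t) = \frac{6}{5}t^2$ for $0 \le t \le 1$, and $f_X(t) = \frac{6}{5}(2 - t)$ for $1 < t \le 2$ (zero elsewhere)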
1. Determine P(X≤0.5), P(0.5≤X<1.5), P(|X–1|<1/4).

2. Determine an expression for the distribution function.

3. Use the m-procedures tappr and cdbn to plot an approximation to the distribution function.

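1. (7.64) $P(X \le 0.5) = \int_0^{0.5} \frac{6}{5}t^2\,dt = \frac{2}{5}(0.5)^3 = 1/20$, $P(0.5 \le X < 1.5) = \int_{0.5}^{1} \frac{6}{5}t^2\,dt + \int_{1}^{1.5} \frac{6}{5}(2-t)\,dt = 4/5$

(7.65) $P(|X-1| < 1/4) = \int_{0.75}^{1} \frac{6}{5}t^2\,dt + \int_{1}^{1.25} \frac{6}{5}(2-t)\,dt = 79/160$

2. (7.66) $F_X(t) = \frac{2}{5}t^3$ for $0 \le t \le 1$, and $F_X(t) = -\frac{7}{5} + \frac{6}{5}\left(2t - \frac{t^2}{2}\right)$ for $1 < t \le 2$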

3. tappr
Enter matrix [a b] of x-range endpoints  [0 2]
Enter number of x approximation points  400
Enter density as a function of t  (6/5)*(t<=1).*t.^2 + ...
(6/5)*(t>1).*(2 - t)
Use row matrices X and PX as in the simple case
cdbn
Enter row matrix of VALUES  X
Enter row matrix of PROBABILITIES  PX    % See MATLAB plot


Solutions