Chapter 7. Distribution and Density Functions
7.1. Distribution and Density Functions^{*}
Introduction
In the unit on Random Variables and Probability we introduce real random variables as mappings from the basic space Ω to the real line. The mapping induces a transfer of the probability mass on the basic space to subsets of the real line in such a way that the probability that X takes a value in a set M is exactly the mass assigned to that set by the transfer. To perform probability calculations, we need to describe analytically the distribution on the line. For simple random variables this is easy. We have at each possible value of X a point mass equal to the probability X takes that value. For more general cases, we need a more useful description than that provided by the induced probability measure P_{X}.
The distribution function
In the theoretical discussion on Random Variables and Probability, we note that the probability distribution induced by a random variable X is determined uniquely by a consistent assignment of mass to semi-infinite intervals of the form (–∞,t] for each real t. This suggests that a natural description is provided by the following.
Definition
The distribution function F_{X} for random variable X is given by
(7.1) F_{X}(t) = P(X≤t) = P(X∈(–∞,t]) for each real t
In terms of the mass distribution on the line, this is the probability mass at or to the left of the point t. As a consequence, F_{X} has the following properties:
(F1) : F_{X} must be a nondecreasing function, for if t>s there must be at least as much probability mass at or to the left of t as there is for s.
(F2) : F_{X} is continuous from the right, with a jump in the amount p_{0} at t_{0} iff P(X=t_{0})=p_{0}>0. If the point t approaches t_{0} from the left, the interval (–∞,t] does not include the probability mass at t_{0} until t reaches that value, at which point the amount at or to the left of t increases ("jumps") by the amount p_{0}; on the other hand, if t approaches t_{0} from the right, the interval (–∞,t] includes the mass p_{0} all the way down to and including t_{0}, but drops by p_{0} as soon as t moves to the left of t_{0}.
(F3) : Except in very unusual cases involving random variables which may take “infinite” values, the probability mass included in (–∞,t] must increase to one as t moves to the right; as t moves to the left, the probability mass included must decrease to zero, so that
(7.2) lim_{t→–∞} F_{X}(t) = 0 and lim_{t→∞} F_{X}(t) = 1
A distribution function determines the probability mass in each semi-infinite interval (–∞,t]. According to the discussion referred to above, this determines uniquely the induced distribution.
The distribution function F_{X} for a simple random variable is easily visualized. The distribution consists of point mass p_{i} at each point t_{i} in the range. To the left of the smallest value in the range, F_{X}(t)=0; as t increases to the smallest value t_{1}, F_{X}(t) remains constant at zero until it jumps by the amount p_{1}. F_{X}(t) remains constant at p_{1} until t increases to t_{2}, where it jumps by an amount p_{2} to the value p_{1}+p_{2}. This continues until the value of F_{X}(t) reaches 1 at the largest value t_{n}. The graph of F_{X} is thus a step function, continuous from the right, with a jump in the amount p_{i} at the corresponding point t_{i} in the range. A similar situation exists for a discrete-valued random variable which may take on an infinity of values (e.g., the geometric distribution or the Poisson distribution considered below). In this case, there is always some probability at points to the right of any t_{i}, but this must become vanishingly small as t increases, since the total probability mass is one.
The procedure ddbn may be used to plot the distribution function for a simple random variable from a matrix X of values and a corresponding matrix PX of probabilities.
>> c = [10 18 10 3];            % Distribution for X in Example 6.5.1
>> pm = minprob(0.1*[6 3 5]);
>> canonic
 Enter row vector of coefficients  c
 Enter row vector of minterm probabilities  pm
Use row matrices X and PX for calculations
Call for XDBN to view the distribution
>> ddbn                         % Circles show values at jumps
Enter row matrix of VALUES  X
Enter row matrix of PROBABILITIES  PX
% Printing details   See Figure 7.1
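For readers without MATLAB, the step-function construction of F_{X} for a simple random variable can be sketched in Python (numpy assumed; this is an illustrative stand-in for ddbn, not the text's procedure):

```python
import numpy as np

def simple_cdf(X, PX):
    """Distribution function of a simple random variable.

    X  : distinct values, assumed sorted ascending
    PX : corresponding probabilities, summing to one
    Returns a function t -> F_X(t) = P(X <= t).
    """
    X = np.asarray(X, dtype=float)
    F = np.cumsum(PX)                            # value of F_X on [X[i], X[i+1])
    def FX(t):
        i = np.searchsorted(X, t, side="right")  # number of values X[j] <= t
        return 0.0 if i == 0 else float(F[i - 1])
    return FX

# A small simple distribution: point masses 0.25, 0.5, 0.25 at 1, 2, 3
FX = simple_cdf([1, 2, 3], [0.25, 0.5, 0.25])
print(FX(0.5), FX(1), FX(2.7), FX(3))   # 0.0 0.25 0.75 1.0
```

The cumulative sums give the heights of the steps; the right-continuity of F_{X} corresponds to the `side="right"` search.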
Description of some common discrete distributions
We make repeated use of a number of common distributions which are used in many practical situations. This collection includes several distributions which are studied in the chapter "Random Variables and Probabilities".
Indicator function. X=I_{E}, so that P(X=1)=P(E)=p and P(X=0)=q=1–p. The distribution function has a jump in the amount q at t=0 and an additional jump of p to the value 1 at t=1.
Simple random variable (canonical form)
(7.3) X = Σ_{i=1}^{n} t_{i}I_{E_i}, with P(X=t_{i}) = p_{i}
The distribution function is a step function, continuous from the right, with jump of p_{i} at t=t_{i} (See Figure 7.1 for Example 7.1)
Binomial (n,p). This random variable appears as the number of successes in a sequence of n Bernoulli trials with probability p of success. In its simplest form
(7.4) X = Σ_{i=1}^{n} I_{E_i}, with {E_{i} : 1≤i≤n} independent and P(E_{i}) = p
(7.5) P(X=k) = C(n,k) p^{k} q^{n–k}, 0≤k≤n, where q = 1–p
As pointed out in the study of Bernoulli sequences in the unit on Composite Trials, two m-functions ibinom and cbinom are available for computing the individual and cumulative binomial probabilities.
Geometric (p). There are two related distributions, both arising in the study of continuing Bernoulli sequences. The first counts the number of failures before the first success. This is sometimes called the “waiting time.” The event {X=k} consists of a sequence of k failures, then a success. Thus
(7.6) P(X=k) = q^{k}p, 0≤k
The second designates the component trial on which the first success occurs. The event {Y=k} consists of k–1 failures, then a success on the kth component trial. We have
(7.7) P(Y=k) = q^{k–1}p, 1≤k
(7.7)We say X has the geometric distribution with parameter (p), which we often designate by X∼ geometric (p). Now Y=X+1 or Y–1=X. For this reason, it is customary to refer to the distribution for the number of the trial for the first success by saying Y–1∼ geometric (p). The probability of k or more failures before the first success is P(X≥k)=q^{k}. Also
(7.8) P(X≥n+k|X≥n) = P(X≥n+k)/P(X≥n) = q^{n+k}/q^{n} = q^{k} = P(X≥k)
This suggests that a Bernoulli sequence essentially "starts over" on each trial. If it has failed n times, the probability of failing an additional k or more times before the next success is the same as the initial probability of failing k or more times before the first success.
Example 7.2. The geometric distribution
A statistician is taking a random sample from a population in which two percent of the members own a BMW automobile. She takes a sample of size 100. What is the probability of finding no BMW owners in the sample?
SOLUTION
The sampling process may be viewed as a sequence of Bernoulli trials with probability p=0.02 of success. The probability of 100 or more failures before the first success is 0.98^{100}=0.1326, or about 1/7.5.
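The computation can be checked directly; a Python sketch using scipy.stats (an assumed stand-in for the text's m-functions) confirms that q^{100} gives the same value:

```python
from scipy.stats import geom

p, q = 0.02, 0.98
# scipy's geom gives the trial number Y of the first success (pmf q^(k-1) p, k >= 1),
# so "100 or more failures before the first success" is P(Y > 100) = geom.sf(100, p)
p_no_owner = geom.sf(100, p)
print(round(p_no_owner, 4))   # 0.1326, in agreement with 0.98^100
```

Note that scipy parameterizes the geometric distribution by the trial number Y of the text, not by the failure count X.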
Negative binomial (m,p). X is the number of failures before the mth success. It is generally more convenient to work with Y=X+m, the number of the trial on which the mth success occurs. An examination of the possible patterns and elementary combinatorics show that
(7.9) P(Y=k) = C(k–1, m–1) p^{m} q^{k–m}, k≥m
There are m–1 successes in the first k–1 trials, then a success. Each combination has probability p^{m}q^{k–m}. We have an m-function nbinom to calculate these probabilities.
Example 7.3. A game of chance
A player throws a single six-sided die repeatedly. He scores if he throws a 1 or a 6. What is the probability he scores five times in ten or fewer throws?
>> p = sum(nbinom(5,1/3,5:10))
p = 0.2131
An alternate solution is possible with the use of the binomial distribution. The mth success comes not later than the kth trial iff the number of successes in k trials is greater than or equal to m.
>> P = cbinom(10,1/3,5)
P = 0.2131
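The two routes can be compared in Python with scipy.stats (an assumed stand-in for nbinom and cbinom; note that scipy's nbinom counts failures before the m-th success, the text's X rather than Y):

```python
from scipy.stats import nbinom, binom

m, p = 5, 1/3
# First route: P(Y <= 10) = sum of P(Y = k), k = 5,...,10, with Y = X + m
p1 = sum(nbinom.pmf(k - m, m, p) for k in range(5, 11))
# Second route: at least m successes in the first 10 trials
p2 = binom.sf(m - 1, 10, p)
print(round(p1, 4), round(p2, 4))   # 0.2131 0.2131
```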
Poisson (μ). This distribution is assumed in a wide variety of applications. It appears as a counting variable for items arriving with exponential interarrival times (see the relationship to the gamma distribution below). For large n and small p (which may not be a value found in a table), the binomial distribution is approximately Poisson (np). Use of the generating function (see Transform Methods) shows the sum of independent Poisson random variables is Poisson. The Poisson distribution is integer valued, with
(7.10) P(X=k) = e^{–μ} μ^{k}/k!, 0≤k
Although Poisson probabilities are usually easier to calculate with scientific calculators than binomial probabilities, the use of tables is often quite helpful. As in the case of the binomial distribution, we have two m-functions for calculating Poisson probabilities. These have advantages of speed and parameter range similar to those for ibinom and cbinom.
P(X=k) is calculated by P = ipoisson(mu,k), where k is a row or column vector of integers and the result P is a row matrix of the probabilities.
P(X≥k) is calculated by P = cpoisson(mu,k), where k is a row or column vector of integers and the result P is a row matrix of the probabilities.
Example 7.4. Poisson counting random variable
The number of messages arriving in a one minute period at a communications network junction is a random variable N∼ Poisson (130). What is the probability the number of arrivals is greater than or equal to 110, 120, 130, 140, 150, 160?
>> p = cpoisson(130,110:10:160)
p = 0.9666  0.8209  0.5117  0.2011  0.0461  0.0060
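A scipy equivalent (assumed correspondence: cpoisson(mu,k) is poisson.sf(k-1, mu), since sf gives P(N > k-1) = P(N ≥ k)) reproduces these tail probabilities:

```python
from scipy.stats import poisson

mu = 130
probs = [poisson.sf(k - 1, mu) for k in range(110, 161, 10)]
print([round(q, 4) for q in probs])   # matches the cpoisson values above
```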
The descriptions of these distributions, along with a number of other facts, are summarized in the table DATA ON SOME COMMON DISTRIBUTIONS in Appendix C.
The density function
If the probability mass in the induced distribution is spread smoothly along the real line, with no point mass concentrations, there is a probability density function f_{X} which satisfies
(7.11) P(X∈M) = ∫_{M} f_{X}(t) dt  (the area under the graph of f_{X} over M)
At each t, f_{X}(t) is the mass per unit length in the probability distribution. The density function has three characteristic properties:
(7.12) (f1) f_{X} ≥ 0  (f2) ∫ f_{X} = 1  (f3) F_{X}(t) = ∫_{–∞}^{t} f_{X}
A random variable (or distribution) which has a density is called absolutely continuous. This term comes from measure theory. We often simply abbreviate as continuous distribution.
There is a technical mathematical description of the condition “spread smoothly with no point mass concentrations.” And strictly speaking the integrals are Lebesgue integrals rather than the ordinary Riemann kind. But for practical cases, the two agree, so that we are free to use ordinary integration techniques.
By the fundamental theorem of calculus
(7.13) f_{X}(t) = F_{X}'(t) at every point of continuity of f_{X}
Any integrable, nonnegative function f with ∫f=1 determines a distribution function F, which in turn determines a probability distribution. If ∫f≠1, multiplication by the appropriate positive constant gives a suitable f. An argument based on the Quantile Function shows the existence of a random variable with that distribution.
In the literature on probability, it is customary to omit the indication of the region of integration when integrating over the whole line. Thus
(7.14) ∫ f_{X}(t) dt = ∫_{–∞}^{∞} f_{X}(t) dt
The first expression is not an indefinite integral. In many situations, f_{X} will be zero outside an interval. Thus, the integrand effectively determines the region of integration.
Some common absolutely continuous distributions
Uniform (a,b).
Mass is spread uniformly on the interval [a,b]. It is immaterial whether or not the end points are included, since probability associated with each individual point is zero. The probability of any subinterval is proportional to the length of the subinterval. The probability of being in any two subintervals of the same length is the same. This distribution is used to model situations in which it is known that X takes on values in [a,b] but is equally likely to be in any subinterval of a given length. The density must be constant over the interval (zero outside), and the distribution function increases linearly with t in the interval. Thus,
(7.15) f_{X}(t) = 1/(b–a) for a≤t≤b (zero outside the interval)
The graph of F_{X} rises linearly, with slope 1/(b–a), from zero at t=a to one at t=b.
Symmetric triangular (a,b).
f_{X}(t) = 4(t–a)/(b–a)^{2} for a≤t≤(a+b)/2, and f_{X}(t) = 4(b–t)/(b–a)^{2} for (a+b)/2<t≤b (zero elsewhere)
This distribution is used frequently in instructional numerical examples because probabilities can be obtained geometrically. It can be shifted, with a shift of the graph, to different sets of values. It appears naturally (in shifted form) as the distribution for the sum or difference of two independent random variables uniformly distributed on intervals of the same length. This fact is established with the use of the moment generating function (see Transform Methods). More generally, the density may have a triangular graph which is not symmetric.
Example 7.5. Use of a triangular distribution
Suppose X∼ symmetric triangular (100,300). Determine P(120<X≤250).
Remark. Note that in the continuous case, it is immaterial whether the end points of the intervals are included or not.
SOLUTION
To get the area under the triangle between 120 and 250, we take one minus the area of the right triangles between 100 and 120 and between 250 and 300. Using the fact that areas of similar triangles are proportional to the square of any side, we have
(7.16) P = 1 – (1/2)(20/100)^{2} – (1/2)(50/100)^{2} = 0.855
Exponential (λ). f_{X}(t) = λe^{–λt} for t≥0 (zero elsewhere).
Integration shows F_{X}(t) = 1 – e^{–λt} for t≥0 (zero elsewhere). We note that P(X>t) = 1 – F_{X}(t) = e^{–λt}. This leads to an extremely important property of the exponential distribution. Since X>t+h, h>0, implies X>t, we have
(7.17) P(X>t+h|X>t)=P(X>t+h)/P(X>t)=e^{–λ(t+h)}/e^{–λt}=e^{–λh}=P(X>h)
Because of this property, the exponential distribution is often used in reliability problems. Suppose X represents the time to failure (i.e., the life duration) of a device put into service at t=0. If the distribution is exponential, this property says that if the device survives to time t (i.e., X>t) then the (conditional) probability it will survive h more units of time is the same as the original probability of surviving for h units of time. Many devices have the property that they do not wear out. Failure is due to some stress of external origin. Many solid state electronic devices behave essentially in this way, once initial “burn in” tests have removed defective units. Use of Cauchy's equation (Appendix B) shows that the exponential distribution is the only continuous distribution with this property.
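The memoryless identity is easy to verify numerically; a minimal Python check (the values of λ, t, and h are chosen arbitrarily):

```python
import math

lam, t, h = 3.0, 0.7, 0.4
surv = lambda x: math.exp(-lam * x)    # P(X > x) for X ~ exponential(lam)
cond = surv(t + h) / surv(t)           # P(X > t + h | X > t)
print(abs(cond - surv(h)) < 1e-12)     # True: equals P(X > h)
```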
Gamma distribution (α,λ). f_{X}(t) = λ^{α} t^{α–1} e^{–λt} / Γ(α) for t≥0 (zero elsewhere)
We have an m-function gammadbn to determine values of the distribution function for X∼ gamma (α,λ). Use of moment generating functions shows that for α=n, a random variable X∼ gamma (n,λ) has the same distribution as the sum of n independent random variables, each exponential (λ). A relation to the Poisson distribution is described in Sec 7.5.
Example 7.6. An arrival problem
On a Saturday night, the times (in hours) between arrivals in a hospital emergency unit may be represented by a random quantity which is exponential (λ=3). As we show in the chapter Mathematical Expectation, this means that the average interarrival time is 1/3 hour or 20 minutes. What is the probability of ten or more arrivals in four hours? In six hours?
SOLUTION
The time for ten arrivals is the sum of ten interarrival times. If we suppose these are independent, as is usually the case, then the time for ten arrivals is gamma (10,3).
>> p = gammadbn(10,3,[4 6])
p = 0.7576  0.9846
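A scipy check (assuming gammadbn(alpha,lambda,t) corresponds to gamma.cdf(t, a=alpha, scale=1/lambda), which is consistent with the text's parameterization):

```python
from scipy.stats import gamma

# Gamma(10, 3): shape a = 10, rate lambda = 3, i.e. scale = 1/3
p4, p6 = gamma.cdf([4, 6], a=10, scale=1/3)
print(round(p4, 4), round(p6, 4))   # 0.7576 0.9846
```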
Normal, or Gaussian (μ,σ^{2}). f_{X}(t) = (1/(σ√(2π))) exp(–(t–μ)^{2}/(2σ^{2})) for all t
We generally indicate that a random variable X has the normal or gaussian distribution by writing X∼N(μ,σ^{2}), putting in the actual values for the parameters. The gaussian distribution plays a central role in many aspects of applied probability theory, particularly in the area of statistics. Much of its importance comes from the central limit theorem (CLT), which is a term applied to a number of theorems in analysis. Essentially, the CLT shows that the distribution for the sum of a sufficiently large number of independent random variables has approximately the gaussian distribution. Thus, the gaussian distribution appears naturally in such topics as theory of errors or theory of noise, where the quantity observed is an additive combination of a large number of essentially independent quantities. Examination of the expression for the density shows that the graph for f_{X}(t) is symmetric about its maximum at t=μ. The greater the parameter σ^{2}, the smaller the maximum value and the more slowly the curve decreases with distance from μ. Thus parameter μ locates the center of the mass distribution and σ^{2} is a measure of the spread of mass about μ. The parameter μ is called the mean value and σ^{2} is the variance. The parameter σ, the positive square root of the variance, is called the standard deviation. While we have an explicit formula for the density function, it is known that the distribution function, as the integral of the density function, cannot be expressed in terms of elementary functions. The usual procedure is to use tables obtained by numerical integration.
Since there are two parameters, this raises the question whether a separate table is needed for each pair of parameters. It is a remarkable fact that this is not the case. We need only have a table of the distribution function for X∼N(0,1). This is referred to as the standardized normal distribution. We use φ and Φ for the standardized normal density and distribution functions, respectively.
Standardized normal. φ(t) = (1/√(2π)) e^{–t^{2}/2}, so that the distribution function is Φ(t) = ∫_{–∞}^{t} φ(u) du.
The graph of the density function is the well known bell shaped curve, symmetrical about the origin (see Figure 7.4). The symmetry about the origin contributes to its usefulness.
(7.18) P(X≤t) = Φ(t)  (the area under the curve to the left of t)
Note that the area to the left of t=–1.5 is the same as the area to the right of t=1.5, so that Φ(–1.5)=1–Φ(1.5). The same is true for any t, so that we have
(7.19) Φ(–t) = 1 – Φ(t) for all t
This indicates that we need only a table of values of Φ(t) for t>0 to be able to determine Φ(t) for any t. We may use the symmetry for any case. Note that Φ(0)=1/2.
Figure 7.4. General gaussian distribution
Example 7.7. Standardized normal calculations
Suppose X∼N(0,1). Determine P(–1≤X≤2) and P(|X|>1).
SOLUTION
1. P(–1≤X≤2)=Φ(2)–Φ(–1)=Φ(2)–[1–Φ(1)]=Φ(2)+Φ(1)–1
2. P(|X|>1)=P(X>1)+P(X<–1)=1–Φ(1)+Φ(–1)=2[1–Φ(1)]
From a table of standardized normal distribution function (see Appendix D), we find
Φ(2)=0.9772 and Φ(1)=0.8413 which gives P(–1≤X≤2)=0.8185 and P(|X|>1)=0.3174
For X∼N(μ,σ^{2}), the density maintains the bell shape, but is shifted with different spread and height. Figure 7.5 shows the distribution function and density function for X∼N(2,0.1). The density is centered about t=2. It has height 1.2616 as compared with 0.3989 for the standardized normal density. Inspection shows that the graph is narrower than that for the standardized normal. The distribution function reaches 0.5 at the mean value 2.
Figure 7.5.
A change of variables in the integral shows that the table for standardized normal distribution function can be used for any case.
(7.20) F_{X}(t) = (1/(σ√(2π))) ∫_{–∞}^{t} exp(–(x–μ)^{2}/(2σ^{2})) dx
Make the change of variable and corresponding formal changes
(7.21) u = (x–μ)/σ, du = dx/σ, with u = (t–μ)/σ when x = t
to get
(7.22) F_{X}(t) = (1/√(2π)) ∫_{–∞}^{(t–μ)/σ} e^{–u^{2}/2} du = Φ((t–μ)/σ)
We have m-functions gaussian and gaussdensity to calculate values of the distribution and density function for any reasonable value of the parameters.
Example 7.8. General gaussian calculation
Suppose X∼N(3,16) (i.e., μ=3 and σ^{2}=16). Determine P(–1≤X≤11) and P(|X–3|>4).
SOLUTION
In each case the problem reduces to that in Example 7.7, since (X–3)/4 ∼ N(0,1):
1. P(–1≤X≤11) = Φ((11–3)/4) – Φ((–1–3)/4) = Φ(2) – Φ(–1)
2. P(|X–3|>4) = P(X>7) + P(X<–1) = 1 – Φ(1) + Φ(–1) = 2[1–Φ(1)]
The following are solutions of Example 7.7 and Example 7.8, using the m-function gaussian.
Example 7.9. Example 7.7 and Example 7.8 (continued)
>> P1 = gaussian(0,1,2) - gaussian(0,1,-1)
P1 = 0.8186
>> P2 = 2*(1 - gaussian(0,1,1))
P2 = 0.3173
>> P1 = gaussian(3,16,11) - gaussian(3,16,-1)
P1 = 0.8186
>> P2 = gaussian(3,16,-1) + 1 - gaussian(3,16,7)
P2 = 0.3173
The differences in these results and those above (which used tables) are due to the roundoff to four places in the tables.
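The same four probabilities can be obtained with scipy.stats.norm (note that scipy takes the standard deviation σ as its scale parameter, not the variance σ^{2}):

```python
from scipy.stats import norm

# Example 7.7: X ~ N(0,1)
P1 = norm.cdf(2) - norm.cdf(-1)
P2 = 2 * (1 - norm.cdf(1))
# Example 7.8: X ~ N(3,16), i.e. mu = 3 and sigma = 4
P3 = norm.cdf(11, loc=3, scale=4) - norm.cdf(-1, loc=3, scale=4)
P4 = norm.cdf(-1, loc=3, scale=4) + 1 - norm.cdf(7, loc=3, scale=4)
print(round(P1, 4), round(P2, 4))   # 0.8186 0.3173
```

Standardization makes P3 equal to P1 and P4 equal to P2, as the change of variable in (7.20)-(7.22) predicts.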
Beta (r,s), r>0, s>0. f_{X}(t) = [Γ(r+s)/(Γ(r)Γ(s))] t^{r–1}(1–t)^{s–1} for 0<t<1 (zero elsewhere)
Analysis is based on the integrals
(7.23) ∫_{0}^{1} t^{r–1}(1–t)^{s–1} dt = Γ(r)Γ(s)/Γ(r+s)
Figure 7.6 and Figure 7.7 show graphs of the densities for various values of r,s. The usefulness comes in approximating densities on the unit interval. By using scaling and shifting, these can be extended to other intervals. The special case r=s=1 gives the uniform distribution on the unit interval. The Beta distribution is quite useful in developing the Bayesian statistics for the problem of sampling to determine a population proportion. If r,s are integers, the density function is a polynomial. For the general case we have two m-functions, beta and betadbn, to perform the calculations.
Figure 7.6. Figure 7.7.
Weibull (α,λ,ν). F_{X}(t) = 1 – e^{–λ(t–ν)^{α}} for t≥ν (zero elsewhere)
The parameter ν is a shift parameter. Usually we assume ν=0. Examination shows that for α=1 the distribution is exponential (λ). The parameter α provides a distortion of the time scale for the exponential distribution. Figure 7.6 and Figure 7.7 show graphs of the Weibull density for some representative values of α and λ (ν=0). The distribution is used in reliability theory. We do not make much use of it. However, we have m-functions weibull (density) and weibulld (distribution function) for shift parameter ν=0 only. The shift can be obtained by subtracting a constant from the t values.
7.2. Distribution Approximations^{*}
Binomial, Poisson, gamma, and Gaussian distributions
The Poisson approximation to the binomial distribution
The following approximation is a classical one. We wish to show that for small p and sufficiently large n
P(X=k) = C(n,k) p^{k}(1–p)^{n–k} ≈ e^{–np} (np)^{k}/k!
Suppose p=μ/n with n large and μ/n<1. Then,
P(X=k) = C(n,k)(μ/n)^{k}(1–μ/n)^{n–k} = [n(n–1)⋯(n–k+1)/n^{k}] (1–μ/n)^{–k} (μ^{k}/k!) (1–μ/n)^{n}
The first factor in the last expression is the ratio of polynomials in n of the same degree k, which must approach one as n becomes large. The second factor approaches one as n becomes large. According to a well known property of the exponential,
(1–μ/n)^{n} → e^{–μ} as n→∞
The result is that for large n, P(X=k) ≈ e^{–μ} μ^{k}/k!, where μ=np.
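The convergence is easy to observe numerically; a short scipy sketch (the values of μ and k are chosen arbitrarily):

```python
from scipy.stats import binom, poisson

mu, k = 4.0, 3
target = poisson.pmf(k, mu)
# Hold mu = np fixed and let n grow; the binomial pmf approaches the Poisson pmf
errs = [abs(binom.pmf(k, n, mu / n) - target) for n in (10, 100, 10000)]
print(errs)   # decreasing toward zero as n grows
```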
The Poisson and gamma distributions
Suppose Y∼ Poisson (λt). Now X∼ gamma (α,λ) iff
P(X≤t) = [λ^{α}/Γ(α)] ∫_{0}^{t} x^{α–1} e^{–λx} dx = [1/Γ(α)] ∫_{0}^{λt} u^{α–1} e^{–u} du
A well known definite integral, obtained by integration by parts, is
∫_{a}^{∞} t^{n–1} e^{–t} dt = Γ(n) e^{–a} Σ_{k=0}^{n–1} a^{k}/k!
Noting that Γ(n)=(n–1)! we find after some simple algebra that
[1/Γ(n)] ∫_{0}^{a} t^{n–1} e^{–t} dt = 1 – e^{–a} Σ_{k=0}^{n–1} a^{k}/k!
For a=λt and α=n, we have the following equality iff X∼ gamma (n,λ):
P(X≤t) = 1 – e^{–λt} Σ_{k=0}^{n–1} (λt)^{k}/k!
Now
P(Y≥n) = 1 – P(Y≤n–1) = 1 – e^{–λt} Σ_{k=0}^{n–1} (λt)^{k}/k! = P(X≤t)
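This identity between the gamma distribution function and the Poisson tail can be confirmed numerically (the parameter values below are arbitrary):

```python
from scipy.stats import gamma, poisson

lam, n, t = 2.0, 4, 1.5
lhs = gamma.cdf(t, a=n, scale=1/lam)   # P(X <= t), X ~ gamma(n, lam)
rhs = poisson.sf(n - 1, lam * t)       # P(Y >= n), Y ~ Poisson(lam * t)
print(abs(lhs - rhs) < 1e-12)          # True
```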
The gaussian (normal) approximation
The central limit theorem, referred to in the discussion of the gaussian or normal distribution above, suggests that the binomial and Poisson distributions should be approximated by the gaussian. The number of successes in n trials has the binomial (n,p) distribution. This random variable may be expressed
X = Σ_{i=1}^{n} I_{E_i}, with the I_{E_i} independent and P(E_{i}) = p
Since the mean value of X is np and the variance is npq, the distribution should be approximately N(np, npq).
Use of the generating function shows that the sum of independent Poisson random variables is Poisson. Now if X∼ Poisson (μ), then X may be considered the sum of n independent random variables, each Poisson (μ/n). Since the mean value and the variance are both μ, it is reasonable to suppose that X is approximately N(μ,μ).
It is generally best to compare distribution functions. Since the binomial and Poisson distributions are integer-valued, it turns out that the best gaussian approximation is obtained by making a “continuity correction.” To get an approximation to a density for an integer-valued random variable, the probability at t=k is represented by a rectangle of height p_{k} and unit width, with k as the midpoint. Figure 1 shows a plot of the “density” and the corresponding gaussian density for n=300, p=0.1. It is apparent that the gaussian density is offset by approximately 1/2. To approximate the probability X≤k, take the area under the curve to the left of k+1/2; this is called the continuity correction.
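The effect of the continuity correction can be checked with scipy for the n=300, p=0.1 case mentioned above (the test point k is arbitrary):

```python
import math
from scipy.stats import binom, norm

n, p = 300, 0.1
mu, sigma = n * p, math.sqrt(n * p * (1 - p))
k = 32
exact = binom.cdf(k, n, p)
plain = norm.cdf(k, mu, sigma)              # no correction
corrected = norm.cdf(k + 0.5, mu, sigma)    # continuity correction
print(abs(corrected - exact) < abs(plain - exact))   # True: correction helps
```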
Use of m-procedures to compare
We have two m-procedures to make the comparisons. First, we consider approximation of the
Poisson (μ) distribution. The m-procedure poissapp calls for a value of μ, selects a suitable range about k=μ and plots the distribution function for the Poisson distribution (stairs) and the normal (gaussian) distribution (dash dot) for N(μ,μ). In addition, the continuity correction is applied to the gaussian distribution at integer values (circles). Figure 7.10 shows plots for μ=10. It is clear that the continuity correction provides a much better approximation. The plots in Figure 7.11 are for μ=100. Here the continuity correction provides the better approximation, but not by as much as for the smaller μ.
The m-procedure bincomp compares the binomial, gaussian, and Poisson distributions. It calls for values of n and p, selects suitable k values, and plots the distribution function for the binomial, a continuous approximation to the distribution function for the Poisson, and continuity adjusted values of the gaussian distribution function at the integer values. Figure 7.11 shows plots for n=1000, p=0.03. The good agreement of all three distribution functions is evident. Figure 7.12 shows plots for n=50, p=0.6. There is still good agreement of the binomial and adjusted gaussian. However, the Poisson distribution does not track very well. The difficulty, as we see in the unit Variance, is the difference in variances—npq for the binomial as compared with np for the Poisson.
Approximation of a real random variable by simple random variables
Simple random variables play a significant role, both in theory and applications. In the unit Random Variables, we show how a simple random variable is determined by the set of points on the real line representing the possible values and the corresponding set of probabilities that each of these values is taken on. This describes the distribution of the random variable and makes possible calculations of event probabilities and parameters for the distribution.
A continuous random variable is characterized by a set of possible values spread continuously over an interval or collection of intervals. In this case, the probability is also spread smoothly. The distribution is described by a probability density function, whose value at any point indicates "the probability per unit length" near the point. A simple approximation is obtained by subdividing an interval which includes the range (the set of possible values) into small enough subintervals that the density is approximately constant over each subinterval. A point in each subinterval is selected and is assigned the probability mass in its subinterval. The combination of the selected points and the corresponding probabilities describes the distribution of an approximating simple random variable. Calculations based on this distribution approximate corresponding calculations on the continuous distribution.
Before examining a general approximation procedure which has significant consequences for later treatments, we consider some illustrative examples.
A random variable with the Poisson distribution is unbounded. However, for a given parameter value μ, the probability for k≥n, n sufficiently large, is negligible. Experiment indicates n = μ + 6√μ (i.e., six standard deviations beyond the mean) is a reasonable value for 5≤μ≤200.
>> mu = [5 10 20 30 40 50 70 100 150 200];
>> K = zeros(1,length(mu));
>> p = zeros(1,length(mu));
>> for i = 1:length(mu)
     K(i) = floor(mu(i)+ 6*sqrt(mu(i)));
     p(i) = cpoisson(mu(i),K(i));
   end
>> disp([mu;K;p*1e6]')
    5.0000   18.0000    5.4163   % Residual probabilities are 0.000001
   10.0000   28.0000    2.2535   % times the numbers in the last column.
   20.0000   46.0000    0.4540   % K is the value of k needed to achieve
   30.0000   62.0000    0.2140   % the residual shown.
   40.0000   77.0000    0.1354
   50.0000   92.0000    0.0668
   70.0000  120.0000    0.0359
  100.0000  160.0000    0.0205
  150.0000  223.0000    0.0159
  200.0000  284.0000    0.0133
An m-procedure for discrete approximation
If X is bounded, absolutely continuous with density function f_{X}, the m-procedure tappr sets up the distribution for an approximating simple random variable. An interval containing the range of X is divided into a specified number of equal subdivisions. The probability mass for each subinterval is assigned to the midpoint. If dx is the length of the subintervals, then the integral of the density function over the subinterval is approximated by f_{X}(t_{i}) dx, where t_{i} is the midpoint. In effect, the graph of the density over the subinterval is approximated by a rectangle of length dx and height f_{X}(t_{i}). Once the approximating simple distribution is established, calculations are carried out as for simple random variables.
Suppose f_{X}(t)=3t^{2},0≤t≤1. Determine P(0.2≤X≤0.9).
SOLUTION
In this case, an analytical solution is easy. F_{X}(t)=t^{3} on the interval [0,1], so
P=0.9^{3}–0.2^{3}=0.7210. We use tappr as follows:
>> tappr
Enter matrix [a b] of x-range endpoints  [0 1]
Enter number of x approximation points  200
Enter density as a function of t  3*t.^2
Use row matrices X and PX as in the simple case
>> M = (X >= 0.2)&(X <= 0.9);
>> p = M*PX'
p = 0.7210
Because of the regularity of the density and the number of approximation points, the result agrees quite well with the theoretical value.
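A Python sketch of the same midpoint scheme (an illustrative stand-in for tappr, numpy assumed) reproduces the result:

```python
import numpy as np

def tappr(a, b, n, f):
    """Midpoint discretization of a density f on [a, b]."""
    dx = (b - a) / n
    X = a + dx * (np.arange(n) + 0.5)   # subinterval midpoints
    PX = f(X) * dx                      # mass assigned to each midpoint
    return X, PX

X, PX = tappr(0, 1, 200, lambda t: 3 * t**2)
M = (X >= 0.2) & (X <= 0.9)
p = PX[M].sum()
print(round(p, 4))   # 0.7210
```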
The next example is a more complex one. In particular, the distribution is not bounded. However, it is easy to determine a bound beyond which the probability is negligible.
The life (in miles) of a certain brand of radial tires may be represented by a random variable X with density
f_{X}(t) = t^{2}/a^{3} for 0≤t<a, and f_{X}(t) = (b/a)e^{k(a–t)} for t≥a
where a=40,000, b=20/3, and k=1/4000. Determine P(X≥45,000).
>> a = 40000;
>> b = 20/3;
>> k = 1/4000;
>> % Test shows cutoff point of 80000 should be satisfactory
>> tappr
Enter matrix [a b] of x-range endpoints  [0 80000]
Enter number of x approximation points  80000/20
Enter density as a function of t  (t.^2/a^3).*(t < 40000) + ...
        (b/a)*exp(k*(a-t)).*(t >= 40000)
Use row matrices X and PX as in the simple case
>> P = (X >= 45000)*PX'
P = 0.1910          % Theoretical value = (2/3)exp(-5/4) = 0.191003
>> cdbn
Enter row matrix of VALUES  X
Enter row matrix of PROBABILITIES  PX   % See Figure 7.14 for plot
In this case, we use a rather large number of approximation points. As a consequence, the results are quite accurate. In the single-variable case, designating a large number of approximating points usually causes no computer memory problem.
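The theoretical value can also be checked by direct numerical integration of the density (scipy assumed; the cutoff 200000 is arbitrary but lies far out in the negligible tail):

```python
import math
from scipy.integrate import quad

a, b, k = 40000, 20/3, 1/4000
# Piecewise density from the example; only the exponential branch matters past 45000
f = lambda t: (t**2 / a**3) if t < a else (b / a) * math.exp(k * (a - t))
p, _ = quad(f, 45000, 200000)
print(round(p, 6))   # about 0.191003 = (2/3)exp(-5/4)
```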
The general approximation procedure
We show now that any bounded real random variable may be approximated as closely as desired by a simple random variable (i.e., one having a finite set of possible values). For the unbounded case, the approximation is close except in a portion of the range having arbitrarily small total probability.
We limit our discussion to the bounded case, in which the range of X is limited to a bounded interval I=[a,b]. Suppose I is partitioned into n subintervals by points t_{i}, 1≤i≤n–1, with a=t_{0} and b=t_{n}. Let M_{i}=[t_{i–1},t_{i}) be the ith subinterval, 1≤i≤n–1, and M_{n}=[t_{n–1},t_{n}] (see Figure 7.14). Now random variable X may map into any point in the interval, and hence into any point in each subinterval M_{i}. Let E_{i}=X^{–1}(M_{i}) be the set of points mapped into M_{i} by X. Then the E_{i} form a partition of the basic space Ω. For the given subdivision, we form a simple random variable X_{s} as follows. In each subinterval, pick a point s_{i}, t_{i–1}≤s_{i}<t_{i}. Consider the simple random variable X_{s} = Σ_{i=1}^{n} s_{i}I_{E_i}.
This random variable is in canonical form. If ω∈E_{i}, then X(ω)∈M_{i} and X_{s}(ω)=s_{i}. Now the absolute value of the difference satisfies
|X(ω) – X_{s}(ω)| < t_{i} – t_{i–1}, the length of the subinterval M_{i}
Since this is true for each ω and the corresponding subinterval, we have the important fact
|X(ω) – X_{s}(ω)| < maximum length of the M_{i}
By making the subintervals small enough by increasing the number of subdivision points, we can make the difference as small as we please.
While the choice of the s_{i} is arbitrary in each M_{i}, the selection of s_{i}=t_{i–1} (the left-hand endpoint) leads to the property X_{s}(ω)≤X(ω) ∀ω. In this case, if we add subdivision points to decrease the size of some or all of the M_{i}, the new simple approximation Y_{s} satisfies
X_{s}(ω) ≤ Y_{s}(ω) ≤ X(ω) ∀ω
To see this, consider t_{i}^{*}∈M_{i} (see Figure 7.15). M_{i} is partitioned into M_{i}^{'}⋁M_{i}^{''} and E_{i} is partitioned into E_{i}^{'}⋁E_{i}^{''}. X maps E_{i}^{'} into M_{i}^{'} and E_{i}^{''} into M_{i}^{''}. Y_{s} maps E_{i}^{'} into t_{i–1} and maps E_{i}^{''} into t_{i}^{*}>t_{i–1}. X_{s} maps both E_{i}^{'} and E_{i}^{''} into t_{i–1}. Thus, the asserted inequality must hold for each ω. By taking a sequence of partitions in which each succeeding partition refines the previous (i.e., adds subdivision points) in such a way that the maximum length of subinterval goes to zero, we may form a nondecreasing sequence of simple random variables X_{n} which increase to X for each ω.
The latter result may be extended to random variables unbounded above. Simply let the Nth set of subdivision points extend from a to N, making the last subinterval [N,∞). Subintervals from a to N are made increasingly shorter. The result is a nondecreasing sequence of simple random variables, with X_{N}(ω)→X(ω) as N→∞, for each ω∈Ω.
For probability calculations, we simply select an interval I large enough that the probability outside I is negligible and use a simple approximation over I.
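The inequalities above can be illustrated numerically. The sketch below (numpy assumed; the choice X = ω^{2} on [0,1] is arbitrary) uses left endpoints, so refining the partition produces a larger simple approximation that is still below X:

```python
import numpy as np

rng = np.random.default_rng(7)
omega = rng.uniform(0, 1, 1000)        # sample points of the basic space
X = omega**2                           # a bounded random variable with range [0, 1]

def left_endpoint_approx(x, n):
    """Simple approximation on n equal subintervals of [0, 1], left endpoints."""
    edges = np.linspace(0, 1, n + 1)
    idx = np.minimum(np.searchsorted(edges, x, side="right") - 1, n - 1)
    return edges[idx]                   # s_i = t_{i-1}, so Xs <= X everywhere

X4 = left_endpoint_approx(X, 4)
X8 = left_endpoint_approx(X, 8)        # refinement of the 4-interval partition
print(np.all(X4 <= X8), np.all(X8 <= X), np.max(X - X8) < 1/8)   # True True True
```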
7.3. Problems on Distribution and Density Functions^{*}
(See Exercises 3 and 4 from "Problems on Random Variables and Probabilities"). The class {C_{j}:1≤j≤10} is a partition. Random variable X has values {1,3,2,3,4,2,1,3,5,2} on C_{1} through C_{10}, respectively, with probabilities 0.08, 0.13, 0.06, 0.09, 0.14, 0.11, 0.12, 0.07, 0.11, 0.09. Determine and plot the distribution function F_{X}.
T = [1 3 2 3 4 2 1 3 5 2]; pc = 0.01*[8 13 6 9 14 11 12 7 11 9]; [X,PX] = csort(T,pc); ddbn Enter row matrix of VALUES X Enter row matrix of PROBABILITIES PX % See MATLAB plot
(See Exercise 6 from "Problems on Random Variables and Probabilities"). A store has eight items for sale. The prices are $3.50, $5.00, $3.50, $7.50, $5.00, $5.00, $3.50, and $7.50, respectively. A customer comes in. She purchases one of the items with probabilities 0.10, 0.15, 0.15, 0.20, 0.10, 0.05, 0.10, 0.15. The random variable expressing the amount of her purchase may be written
X = 3.5I_{C_{1}} + 5.0I_{C_{2}} + 3.5I_{C_{3}} + 7.5I_{C_{4}} + 5.0I_{C_{5}} + 5.0I_{C_{6}} + 3.5I_{C_{7}} + 7.5I_{C_{8}},
where C_{i} is the event she purchases item i.
Determine and plot the distribution function for X.
T = [3.5 5 3.5 7.5 5 5 3.5 7.5]; pc = 0.01*[10 15 15 20 10 5 10 15]; [X,PX] = csort(T,pc); ddbn Enter row matrix of VALUES X Enter row matrix of PROBABILITIES PX % See MATLAB plot
(See Exercise 12 from "Problems on Random Variables and Probabilities"). The class {A,B,C,D} has minterm probabilities
Determine and plot the distribution function for the random variable X=I_{A}+I_{B}+I_{C}+I_{D}, which counts the number of the events which occur on a trial.
npr06_12 Minterm probabilities in pm, coefficients in c T = sum(mintable(4)); % Alternate solution. See Exercise 12 from "Problems on Random Variables and Probabilities" [X,PX] = csort(T,pm); ddbn Enter row matrix of VALUES X Enter row matrix of PROBABILITIES PX % See MATLAB plot
Suppose a is a ten digit number. A wheel turns up the digits 0 through 9 with equal probability on each spin. On ten spins what is the probability of matching, in order, k or more of the ten digits in a, 0≤k≤10? Assume the initial digit may be zero.
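Since each spin independently matches the corresponding digit of a with probability 1/10, the number of in-order matches is binomial (10, 0.1), and the answer is a binomial tail (the text's cbinom(10,0.1,k)). A stdlib Python cross-check, offered as an illustration:

```python
from math import comb

def binomial_tail(n, p, k):
    """P(X >= k) for X ~ binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

# number of in-order digit matches in ten spins is binomial(10, 0.1)
P = [binomial_tail(10, 0.1, k) for k in range(11)]
```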
In a thunderstorm in a national park there are 127 lightning strikes. Experience shows that the probability of a lightning strike starting a fire is about 0.0083. What is the probability that k fires are started, k=0,1,2,3?
P = ibinom(127,0.0083,0:3) P = 0.3470 0.3688 0.1945 0.0678
A manufacturing plant has 350 special lamps on its production lines. On any day, each lamp could fail with probability p=0.0017. These lamps are critical, and must be replaced as quickly as possible. It takes about one hour to replace a lamp, once it has failed. What is the probability that on any day the loss of production time due to lamp failures is k or fewer hours, k=0,1,2,3,4,5?
P = 1 - cbinom(350,0.0017,1:6)
= 0.5513 0.8799 0.9775 0.9968 0.9996 1.0000
Two hundred persons buy tickets for a drawing. Each ticket has probability 0.008 of winning. What is the probability of k or fewer winners,
Two coins are flipped twenty times. What is the probability the results match (both heads or both tails) k times, 0≤k≤20?
P = ibinom(20,1/2,0:20)
Thirty members of a class each flip a coin ten times. What is the probability that at least five of them get seven or more heads?
p = cbinom(10,0.5,7) = 0.1719
P = cbinom(30,p,5) = 0.6052
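A stdlib Python cross-check of the two values above, assuming only that cbinom(n,p,k) returns P(X≥k) for X∼ binomial (n,p):

```python
from math import comb

def tail(n, p, k):
    """P(X >= k) for X ~ binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

p = tail(10, 0.5, 7)   # one student gets seven or more heads in ten flips
P = tail(30, p, 5)     # at least five of the thirty students do so
```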
For the system in Exercise 6., call a day in which one or more failures occur among the 350 lamps a “service day.” Since a Bernoulli sequence “starts over” at any time, the sequence of service/nonservice days may be considered a Bernoulli sequence with probability p_{1}, the probability of one or more lamp failures in a day.
Beginning on a Monday morning, what is the probability the first service day is the first, second, third, fourth, fifth day of the week?
What is the probability of no service days in a seven day week?
p1 = 1 - (1 - 0.0017)^350 = 0.4487 k = 1:5; (prob given day is a service day)
P = p1*(1 - p1).^(k-1) = 0.4487 0.2474 0.1364 0.0752 0.0414
P0 = (1 - p1)^7 = 0.0155
For the system in Exercise 6. and Exercise 10. assume the plant works seven days a week. What is the probability the third service day occurs by the end of 10 days? Solve using the negative binomial distribution; repeat using the binomial distribution.
p1 = 1 - (1 - 0.0017)^350 = 0.4487
P = sum(nbinom(3,p1,3:10)) = 0.8990
Pa = cbinom(10,p1,3) = 0.8990
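The two computations above agree because the third service day occurs by day 10 exactly when ten days contain at least three service days. A stdlib Python sketch of both sides:

```python
from math import comb

p1 = 1 - (1 - 0.0017)**350   # probability a given day is a service day

# negative binomial: P(third service day falls on day k), summed over k = 3..10
Pnb = sum(comb(k - 1, 2) * p1**3 * (1 - p1)**(k - 3) for k in range(3, 11))

# binomial: P(at least 3 service days among 10 days)
Pb = sum(comb(10, j) * p1**j * (1 - p1)**(10 - j) for j in range(3, 11))
```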
A residential College plans to raise money by selling “chances” on a board. Fifty chances are sold. A player pays $10 to play; he or she wins $30 with probability p=0.2. The profit to the College is X = 50·10 – 30N = 500 – 30N, where N is the number of winners.
Determine the distribution for X and calculate P(X>0), P(X≥200), and P(X≥300).
N = 0:50; PN = ibinom(50,0.2,0:50); X = 500 - 30*N; Ppos = (X>0)*PN' Ppos = 0.9856 P200 = (X>=200)*PN' P200 = 0.5836 P300 = (X>=300)*PN' P300 = 0.1034
A single six-sided die is rolled repeatedly until either a one or a six turns up. What is the probability that the first appearance of either of these numbers is achieved by the fifth trial or sooner?
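On each roll, a one or a six appears with probability 1/3, so the waiting time for the first appearance is geometric and P(first appearance by trial 5) = 1 – (2/3)^5. A quick stdlib Python check:

```python
# geometric waiting time: first 1-or-6 by the fifth roll
p = 1 / 3                 # P(one or six on a single roll)
P = 1 - (1 - p)**5        # 1 - (2/3)^5 = 211/243
```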
Consider a Bernoulli sequence with probability p=0.53 of success on any component trial.
The probability the fourth success will occur no later than the tenth trial is determined by the negative binomial distribution. Use the procedure nbinom to calculate this probability.
Calculate this probability using the binomial distribution.
P = sum(nbinom(4,0.53,4:10)) = 0.8729
Pa = cbinom(10,0.53,4) = 0.8729
Fifty percent of the components coming off an assembly line fail to meet specifications for a special job. It is desired to select three units which meet the stringent specifications. Items are selected and tested in succession. Under the usual assumptions for Bernoulli trials, what is the probability the third satisfactory unit will be found on six or fewer trials?
P = cbinom(6,0.5,3) = 0.6562
The number of cars passing a certain traffic count position in an hour has Poisson (53) distribution. What is the probability the number of cars passing in an hour lies between 45 and 55 (inclusive)? What is the probability of more than 55?
P1 = cpoisson(53,45) - cpoisson(53,56) = 0.5224
P2 = cpoisson(53,56) = 0.3581
Compare P(X≤k) and P(Y≤k) for X∼ binomial(5000, 0.001) and Y∼ Poisson (5), for 0≤k≤10. Do this directly with ibinom and ipoisson. Then use the m-procedure bincomp to obtain graphical results (including a comparison with the normal distribution).
k = 0:10; Pb = 1 - cbinom(5000,0.001,k+1); Pp = 1 - cpoisson(5,k+1); disp([k;Pb;Pp]') 0 0.0067 0.0067 1.0000 0.0404 0.0404 2.0000 0.1245 0.1247 3.0000 0.2649 0.2650 4.0000 0.4404 0.4405 5.0000 0.6160 0.6160 6.0000 0.7623 0.7622 7.0000 0.8667 0.8666 8.0000 0.9320 0.9319 9.0000 0.9682 0.9682 10.0000 0.9864 0.9863
bincomp Enter the parameter n 5000 Enter the parameter p 0.001 Binomial-- stairs Poisson-- -.-. Adjusted Gaussian-- o o o gtext('Exercise 17')
Suppose X∼ binomial (12, 0.375), Y∼ Poisson (4.5), and Z∼ exponential (1/4.5). For each random variable, calculate and tabulate the probability of a value at least k, for integer values 3≤k≤8.
k = 3:8; Px = cbinom(12,0.375,k); Py = cpoisson(4.5,k); Pz = exp(-k/4.5); disp([k;Px;Py;Pz]') 3.0000 0.8865 0.8264 0.5134 4.0000 0.7176 0.6577 0.4111 5.0000 0.4897 0.4679 0.3292 6.0000 0.2709 0.2971 0.2636 7.0000 0.1178 0.1689 0.2111 8.0000 0.0390 0.0866 0.1690
The number of noise pulses arriving on a power circuit in an hour is a random quantity having Poisson (7) distribution. What is the probability of having at least 10 pulses in an hour? What is the probability of having at most 15 pulses in an hour?
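Both answers are Poisson (7) tail computations (the text's cpoisson gives P(X≥k)). A stdlib Python cross-check, summing the Poisson pmf directly:

```python
from math import exp, factorial

def poisson_cdf(lam, k):
    """P(X <= k) for X ~ Poisson(lam)."""
    return exp(-lam) * sum(lam**j / factorial(j) for j in range(k + 1))

P_at_least_10 = 1 - poisson_cdf(7, 9)   # at least 10 pulses in an hour
P_at_most_15 = poisson_cdf(7, 15)       # at most 15 pulses in an hour
```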
The number of customers arriving in a small specialty store in an hour is a random quantity having Poisson (5) distribution. What is the probability the number arriving in an hour will be between three and seven, inclusive? What is the probability of no more than ten?
P1 = cpoisson(5,3) - cpoisson(5,8) = 0.7420
P2 = 1 - cpoisson(5,11) = 0.9863
Random variable X∼ binomial (1000, 0.1).
Determine P(X≥80), P(X≥100), and P(X≥120).
Use the appropriate Poisson distribution to approximate these values.
k = [80 100 120]; P = cbinom(1000,0.1,k) P = 0.9867 0.5154 0.0220 P1 = cpoisson(100,k) P1 = 0.9825 0.5133 0.0282
The time to failure, in hours of operating time, of a television set subject to random voltage surges has the exponential (0.002) distribution. Suppose the unit has operated successfully for 500 hours. What is the (conditional) probability it will operate for another 500 hours?
For X∼ exponential (λ), determine P(X≥1/λ), P(X≥2/λ).
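Both probabilities are free of λ: P(X≥k/λ) = e^{–k}. The same exponential tail, via the memoryless property, answers the television-set problem above. A stdlib Python sketch:

```python
from math import exp

lam = 0.002                     # any positive rate gives the same answers
P1 = exp(-lam * (1 / lam))      # P(X >= 1/lambda) = e^{-1}
P2 = exp(-lam * (2 / lam))      # P(X >= 2/lambda) = e^{-2}

# memoryless property for the television set:
# P(X >= 1000 | X >= 500) = P(X >= 500) = e^{-0.002*500} = e^{-1}
Pcond = exp(-0.002 * 1000) / exp(-0.002 * 500)
```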
Twenty “identical” units are put into operation. They fail independently. The times to failure (in hours) form an iid class, exponential (0.0002). This means the “expected” life is 5000 hours. Determine the probabilities that at least k, for k=5,8,10,12,15, will survive for 5000 hours.
p = exp(-0.0002*5000) p = 0.3679 k = [5 8 10 12 15]; P = cbinom(20,p,k) P = 0.9110 0.4655 0.1601 0.0294 0.0006
Let T∼ gamma (20, 0.0002) be the total operating time for the units described in Exercise 24.
Use the m-function for the gamma distribution to determine P(T≤100,000).
Use the Poisson distribution to determine P(T≤100,000).
P1 = gammadbn(20,0.0002,100000) = 0.5297 P2 = cpoisson(0.0002*100000,20) = 0.5297
The sum of the times to failure for five independent units is a random variable X∼ gamma . Without using tables or m-programs, determine P(X≤25).
Interarrival times (in minutes) for fax messages on a terminal are independent, exponential (λ=0.1). This means the time X for the arrival of the fourth message is gamma(4, 0.1). Without using tables or m-programs, utilize the relation of the gamma to the Poisson distribution to determine P(X≤30).
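For integer shape n, a gamma (n, λ) waiting time is at or below t exactly when a Poisson (λt) count is at least n. A stdlib Python sketch of this relation for the fax problem:

```python
from math import exp, factorial

lam, t, n = 0.1, 30, 4
# P(gamma(n, lam) <= t) = P(Poisson(lam*t) >= n)
P = 1 - exp(-lam * t) * sum((lam * t)**j / factorial(j) for j in range(n))
```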
Customers arrive at a service center with independent interarrival times in hours, which have exponential (3) distribution. The time X for the third arrival is thus gamma (3, 3). Without using tables or m-programs, determine P(X≤2).
Five people wait to use a telephone, currently in use by a sixth person. Suppose time for the six calls (in minutes) are iid, exponential (1/3). What is the distribution for the total time Z from the present for the six calls? Use an appropriate Poisson distribution to determine P(Z≤20).
Z∼ gamma (6,1/3).
A random number generator produces a sequence of numbers between 0 and 1. Each of these can be considered an observed value of a random variable uniformly distributed on the interval [0, 1]. They assume their values independently. A sequence of 35 numbers is generated. What is the probability 25 or more are less than or equal to 0.71? (Assume continuity. Do not make a discrete adjustment.)
p = cbinom(35,0.71,25) = 0.5620
Five “identical” electronic devices are installed at one time. The units fail independently, and the time to failure, in days, of each is a random variable exponential (1/30). A maintenance check is made each fifteen days. What is the probability that at least four are still operating at the maintenance check?
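Each unit survives to the fifteen-day check with probability e^{–15/30}, and the number still operating is binomial (5, p). A stdlib Python sketch:

```python
from math import comb, exp

p = exp(-15 / 30)   # P(an exponential(1/30) lifetime exceeds 15 days)
# P(at least four of the five units still operating)
P = sum(comb(5, k) * p**k * (1 - p)**(5 - k) for k in range(4, 6))
```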
Suppose X∼N(4,81). That is, X has gaussian distribution with mean μ=4 and variance σ^{2}=81.
Use a table of standardized normal distribution to determine P(2<X<8) and P(|X–4|≤5).
Calculate the probabilities in part (a) with the m-function gaussian.
P(2<X<8) = Φ((8–4)/9) – Φ((2–4)/9) = Φ(4/9) + Φ(2/9) – 1 = 0.6712 + 0.5875 – 1 = 0.2587
P(|X–4|≤5) = 2Φ(5/9) – 1 = 1.4212 – 1 = 0.4212
P1 = gaussian(4,81,8) - gaussian(4,81,2) P1 = 0.2596 P2 = gaussian(4,81,9) - gaussian(4,81,-1) P2 = 0.4181
Suppose X∼N(5,81). That is, X has gaussian distribution with μ=5 and σ^{2}=81. Use a table of standardized normal distribution to determine P(3<X<9) and P(|X–5|≤5). Check your results using the m-function gaussian.
P1 = gaussian(5,81,9) - gaussian(5,81,3) P1 = 0.2596 P2 = gaussian(5,81,10) - gaussian(5,81,0) P2 = 0.4181
Suppose X∼N(3,64). That is, X has gaussian distribution with μ=3 and σ^{2}=64. Use a table of standardized normal distribution to determine P(1<X<9) and P(|X–3|≤4). Check your results with the m-function gaussian.
P1 = gaussian(3,64,9) - gaussian(3,64,1) P1 = 0.3721 P2 = gaussian(3,64,7) - gaussian(3,64,-1) P2 = 0.3829
Items coming off an assembly line have a critical dimension which is represented by a random variable X∼ N(10, 0.01). Ten items are selected at random. What is the probability that three or more are within 0.05 of the mean value μ?
p = gaussian(10,0.01,10.05) - gaussian(10,0.01,9.95) p = 0.3829 P = cbinom(10,p,3) P = 0.8036
The result of extensive quality control sampling shows that a certain model of digital watches coming off a production line have accuracy, in seconds per month, that is normally distributed with μ=5 and σ^{2}=300. To achieve a top grade, a watch must have an accuracy within the range of -5 to +10 seconds per month. What is the probability a watch taken from the production line to be tested will achieve top grade? Calculate, using a standardized normal table. Check with the m-function gaussian.
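Standardizing, the probability of top grade is Φ((10–5)/√300) – Φ((–5–5)/√300). A stdlib Python check using the error function (an illustration alongside the m-function gaussian):

```python
from math import erf, sqrt

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1 + erf(x / sqrt(2)))

mu, sigma = 5, sqrt(300)
# P(-5 < accuracy < 10) for accuracy ~ N(5, 300)
P = Phi((10 - mu) / sigma) - Phi((-5 - mu) / sigma)
```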
Use the m-procedure bincomp with various values of n from 10 to 500 and p from 0.01 to 0.7, to observe the approximation of the binomial distribution by the Poisson.
Use the m-procedure poissapp to compare the Poisson and gaussian distributions. Use various values of μ from 10 to 500.
Random variable X has density f_{X}(t) = (3/2)t^{2}, –1≤t≤1 (and zero elsewhere).
Determine P(–0.5≤X<0.8), P(|X|>0.5), P(|X–0.25|≤0.5).
Determine an expression for the distribution function.
Use the m-procedures tappr and cdbn to plot an approximation to the distribution function.
- F_{X}(t) = (t^{3}+1)/2 for –1≤t≤1 (0 for t<–1, 1 for t>1)
- P(–0.5≤X<0.8) = 0.3185, P(|X|>0.5) = 0.8750, P(|X–0.25|≤0.5) = 0.2188
tappr Enter matrix [a b] of x-range endpoints [-1 1] Enter number of x approximation points 200 Enter density as a function of t 1.5*t.^2 Use row matrices X and PX as in the simple case cdbn Enter row matrix of VALUES X Enter row matrix of PROBABILITIES PX % See MATLAB plot
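Integrating the density (3/2)t^{2} gives the closed-form distribution function F_{X}(t) = (t^{3}+1)/2 on [–1,1], which yields the requested probabilities exactly. A stdlib Python check of the tappr approximation above:

```python
def F(t):
    """Distribution function for density (3/2)t^2 on [-1, 1]."""
    if t < -1:
        return 0.0
    if t > 1:
        return 1.0
    return (t**3 + 1) / 2

P1 = F(0.8) - F(-0.5)          # P(-0.5 <= X < 0.8)
P2 = 1 - (F(0.5) - F(-0.5))    # P(|X| > 0.5)
P3 = F(0.75) - F(-0.25)        # P(|X - 0.25| <= 0.5)
```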
Random variable X has density function f_{X}(t) = t – (3/8)t^{2}, 0≤t≤2 (and zero elsewhere).
Determine P(X≤0.5), P(0.5≤X<1.5), P(|X–1|<1/4).
Determine an expression for the distribution function.
Use the m-procedures tappr and cdbn to plot an approximation to the distribution function.
- F_{X}(t) = t^{2}/2 – t^{3}/8 for 0≤t≤2; P(X≤0.5) = 0.1094, P(0.5≤X<1.5) = 0.5938, P(|X–1|<1/4) = 0.3086
tappr Enter matrix [a b] of x-range endpoints [0 2] Enter number of x approximation points 200 Enter density as a function of t t - (3/8)*t.^2 Use row matrices X and PX as in the simple case cdbn Enter row matrix of VALUES X Enter row matrix of PROBABILITIES PX % See MATLAB plot
Random variable X has density function f_{X}(t) = (6/5)t^{2} for 0≤t≤1, (6/5)(2–t) for 1<t≤2 (and zero elsewhere).
Determine P(X≤0.5), P(0.5≤X<1.5), P(|X–1|<1/4).
Determine an expression for the distribution function.
Use the m-procedures tappr and cdbn to plot an approximation to the distribution function.
- F_{X}(t) = (2/5)t^{3} for 0≤t≤1, (6/5)(2t – t^{2}/2) – 7/5 for 1<t≤2
- P(X≤0.5) = 0.0500, P(0.5≤X<1.5) = 0.8000, P(|X–1|<1/4) = 0.4938
tappr Enter matrix [a b] of x-range endpoints [0 2] Enter number of x approximation points 400 Enter density as a function of t (6/5)*(t<=1).*t.^2 + ... (6/5)*(t>1).*(2 - t) Use row matrices X and PX as in the simple case cdbn Enter row matrix of VALUES X Enter row matrix of PROBABILITIES PX % See MATLAB plot
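Integrating the two pieces of the density ((6/5)t^{2} on [0,1] and (6/5)(2–t) on (1,2], as entered in tappr above) gives the distribution function in closed form, from which the requested probabilities follow. A stdlib Python check:

```python
def F(t):
    """Distribution function for density (6/5)t^2 on [0,1], (6/5)(2-t) on (1,2]."""
    if t < 0:
        return 0.0
    if t <= 1:
        return (2/5) * t**3
    if t <= 2:
        return (6/5) * (2*t - t**2 / 2) - 7/5
    return 1.0

P1 = F(0.5)              # P(X <= 0.5)
P2 = F(1.5) - F(0.5)     # P(0.5 <= X < 1.5)
P3 = F(1.25) - F(0.75)   # P(|X - 1| < 1/4)
```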