Chapter 5. Conditional Independence

5.1. Conditional Independence*

The idea of stochastic (probabilistic) independence is explored in the unit Independence of Events. The concept is approached as lack of conditioning: P(A|B)=P(A). This is equivalent to the product rule P(AB)=P(A)P(B). We consider an extension to conditional independence.

The concept

Examination of the independence concept reveals two important mathematical facts:

• Independence of a class of non mutually exclusive events depends upon the probability measure, and not on the relationship between the events. Independence cannot be displayed on a Venn diagram, unless probabilities are indicated. For one probability measure a pair may be independent while for another probability measure the pair may not be independent.

• Conditional probability is a probability measure, since it has the three defining properties and all those properties derived therefrom.

This raises the question: is there a useful conditional independence—i.e., independence with respect to a conditional probability measure? In this chapter we explore that question in a fruitful way.

Among the simple examples of “operational independence” in the unit on independence of events, which lead naturally to an assumption of “probabilistic independence,” are the following:

• If customers come into a well stocked shop at different times, each unaware of the choice made by the other, the item purchased by one should not be affected by the choice made by the other.

• If two students are taking exams in different courses, the grade one makes should not affect the grade made by the other.

Example 5.1 Buying umbrellas and the weather

A department store has a nice stock of umbrellas. Two customers come into the store “independently.” Let A be the event the first buys an umbrella and B the event the second buys an umbrella. Normally, we should think the events form an independent pair. But consider the effect of weather on the purchases. Let C be the event the weather is rainy (i.e., is raining or threatening to rain). Now we should think P(A|C) > P(A|C^c) and P(B|C) > P(B|C^c). The weather has a decided effect on the likelihood of buying an umbrella. But given the fact the weather is rainy (event C has occurred), it would seem reasonable that purchase of an umbrella by one should not affect the likelihood of such a purchase by the other. Thus, it may be reasonable to suppose

(5.1) $P(A|C) = P(A|BC)$

An examination of the sixteen equivalent conditions for independence, with probability measure P replaced by probability measure P_C, shows that we have independence of the pair {A,B} with respect to the conditional probability measure P_C(·)=P(·|C). Thus, P(AB|C)=P(A|C)P(B|C). For this example, we should also expect that P(A|C^c)=P(A|BC^c), so that there is independence with respect to the conditional probability measure P(·|C^c). Does this make the pair independent (with respect to the prior probability measure P)? Some numerical examples make it plain that only in the most unusual cases would the pair be independent. Without calculations, we can see why this should be so. If the first customer buys an umbrella, this indicates a higher than normal likelihood that the weather is rainy, in which case the second customer is likely to buy. The condition leads to P(B|A)>P(B). Consider the following numerical case. Suppose P(AB|C)=P(A|C)P(B|C) and P(AB|C^c)=P(A|C^c)P(B|C^c) and

(5.2)

Then

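(5.3) $P(A) = P(A|C)P(C) + P(A|C^c)P(C^c), \quad P(B) = P(B|C)P(C) + P(B|C^c)P(C^c)$
(5.4) $P(AB) = P(AB|C)P(C) + P(AB|C^c)P(C^c) = P(A|C)P(B|C)P(C) + P(A|C^c)P(B|C^c)P(C^c) = 0.1110$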

As a result,

(5.5) $P(A)P(B) = 0.0816 \ne 0.1110 = P(AB)$

The product rule fails, so that the pair is not independent. An examination of the pattern of computation shows that independence would require very special probabilities which are not likely to be encountered.

Example 5.2 Students and exams

Two students take exams in different courses. Under normal circumstances, one would suppose their performances form an independent pair. Let A be the event the first student makes grade 80 or better and B be the event the second has a grade of 80 or better. The exam is given on Monday morning. It is the fall semester. There is a probability 0.30 that there was a football game on Saturday, and both students are enthusiastic fans. Let C be the event of a game on the previous Saturday. Now it is reasonable to suppose

(5.6)

If we know that there was a Saturday game, additional knowledge that B has occurred does not affect the likelihood that A occurs. Again, use of equivalent conditions shows that the situation may be expressed

(5.7) $P(AB|C) = P(A|C)P(B|C)$

Under these conditions, we should suppose that P(A|C^c)=P(A|BC^c) and P(AB|C^c)=P(A|C^c)P(B|C^c). If we knew that one did poorly on the exam, this would increase the likelihood there was a Saturday game and hence increase the likelihood that the other did poorly. The failure to be independent arises from a common chance factor that affects both. Although their performances are “operationally” independent, they are not independent in the probability sense. As a numerical example, suppose

(5.8)

Straightforward calculations show . Note that P(A|B)=0.8514>P(A) as would be expected.

Sixteen equivalent conditions

Using the facts on repeated conditioning and the equivalent conditions for independence, we may produce a similar table of equivalent conditions for conditional independence. In the hybrid notation we use for repeated conditioning, we write

(5.9) $P_C(A|B) = P_C(A)$

This translates into

(5.10) $P(A|BC) = P(A|C)$

If it is known that C has occurred, then additional knowledge of the occurrence of B does not change the likelihood of A.

If we write the sixteen equivalent conditions for independence in terms of the conditional probability measure P_C(·), then translate as above, we have the following equivalent conditions.

P(A|BC) = P(A|C)        P(B|AC) = P(B|C)        P(AB|C) = P(A|C)P(B|C)

The patterns of conditioning in the examples above belong to this set. In a given problem, one or the other of these conditions may seem a reasonable assumption. As soon as one of these patterns is recognized, then all are equally valid assumptions. Because of its simplicity and symmetry, we take as the defining condition the product rule P(AB|C)=P(A|C)P(B|C).

Definition. A pair of events {A,B} is said to be conditionally independent, given C, designated {A,B} ci |C, iff the following product rule holds: P(AB|C)=P(A|C)P(B|C).

The equivalence of the four entries in the right hand column of the upper part of the table establishes

The replacement rule

If any of the pairs {A,B}, {A,B^c}, {A^c,B}, or {A^c,B^c} is conditionally independent, given C, then so are the others.

This may be expressed by saying that if a pair is conditionally independent, we may replace either or both by their complements and still have a conditionally independent pair.
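
One step of the replacement rule can be checked directly from the product rule. The short derivation below is a sketch for the case in which B is replaced by its complement; the other cases follow by symmetry or by applying the same step twice. It uses only the additivity of the conditional probability measure P(·|C).

```latex
\begin{align*}
P(AB^c|C) &= P(A|C) - P(AB|C) \\
          &= P(A|C) - P(A|C)\,P(B|C) \\
          &= P(A|C)\,[1 - P(B|C)] \\
          &= P(A|C)\,P(B^c|C)
\end{align*}
```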

To illustrate further the usefulness of this concept, we note some other common examples in which similar conditions hold: there is operational independence, but some chance factor which affects both.

• Two contractors work quite independently on jobs in the same city. The operational independence suggests probabilistic independence. However, both jobs are outside and subject to delays due to bad weather. Suppose A is the event the first contractor completes his job on time and B is the event the second completes on time. If C is the event of “good” weather, then arguments similar to those in Examples 5.1 and 5.2 make it seem reasonable to suppose {A,B} ci |C and {A,B} ci |C^c. Remark. In formal probability theory, an event must be sharply defined: on any trial it occurs or it does not. The event of “good weather” is not so clearly defined. Did a trace of rain or thunder in the area constitute bad weather? Did rain delay on one day in a month long project constitute bad weather? Even with this ambiguity, the pattern of probabilistic analysis may be useful.

• A patient goes to a doctor. A preliminary examination leads the doctor to think there is a thirty percent chance the patient has a certain disease. The doctor orders two independent tests for conditions that indicate the disease. Are results of these tests really independent? There is certainly operational independence: the tests may be done by different laboratories, neither aware of the testing by the other. Yet, if the tests are meaningful, they must both be affected by the actual condition of the patient. Suppose D is the event the patient has the disease, A is the event the first test is positive (indicates the conditions associated with the disease) and B is the event the second test is positive. Then it would seem reasonable to suppose {A,B} ci |D and {A,B} ci |D^c.

In the examples considered so far, it has been reasonable to assume conditional independence, given an event C, and conditional independence, given the complementary event. But there are cases in which the effect of the conditioning event is asymmetric. We consider several examples.

• Two students are working on a term paper. They work quite separately. They both need to borrow a certain book from the library. Let C be the event the library has two copies available. If A is the event the first completes on time and B the event the second is successful, then it seems reasonable to assume {A,B} ci |C. However, if only one book is available, then the two events would not be conditionally independent. In general P(B|AC^c) < P(B|C^c), since if the first student completes on time, then he or she must have been successful in getting the book, to the detriment of the second.

• If the two contractors of the example above both need material which may be in scarce supply, then successful completion would be conditionally independent, given an adequate supply, whereas they would not be conditionally independent, given a short supply.

• Two students in the same course take an exam. If they prepared separately, the events of getting good grades should be conditionally independent. If they study together, then the likelihoods of good grades would not be independent. With neither cheating nor collaborating on the test itself, if one does well, the other should also.

Since conditional independence is ordinary independence with respect to a conditional probability measure, it should be clear how to extend the concept to larger classes of sets.

Definition. A class {Ai : i ∈ J}, where J is an arbitrary index set, is conditionally independent, given event C, denoted {Ai : i ∈ J} ci |C, iff the product rule holds for every finite subclass of two or more.

As in the case of simple independence, the replacement rule extends.

The replacement rule

If the class {Ai : i ∈ J} ci |C, then any or all of the events Ai may be replaced by their complements and still have a conditionally independent class.

The use of independence techniques

Since conditional independence is independence, we may use independence techniques in the solution of problems. We consider two types of problems: an inference problem and a conditional Bernoulli sequence.

Example 5.3 Use of independence techniques

Sharon is investigating a business venture which she thinks has probability 0.7 of being successful. She checks with five “independent” advisers. If the prospects are sound, the probabilities are 0.8, 0.75, 0.6, 0.9, and 0.8 that the advisers will advise her to proceed; if the venture is not sound, the respective probabilities are 0.75, 0.85, 0.7, 0.9, and 0.7 that the advice will be negative. Given the quality of the project, the advisers are independent of one another in the sense that no one is affected by the others. Of course, they are not independent, for they are all related to the soundness of the venture. We may reasonably assume conditional independence of the advice, given that the venture is sound and also given that the venture is not sound. If Sharon goes with the majority of advisers, what is the probability she will make the right decision?

SOLUTION

If the project is sound, Sharon makes the right choice if three or more of the five advisers are positive. If the venture is unsound, she makes the right choice if three or more of the five advisers are negative. Let H= the event the project is sound, F= the event three or more advisers are positive, G=F^c= the event three or more are negative, and E= the event of the correct decision. Then

(5.11) $P(E) = P(FH) + P(GH^c) = P(F|H)P(H) + P(G|H^c)P(H^c)$

Let Ei be the event the ith adviser is positive. Then P(F|H)= the sum of probabilities of the form P(Mk|H), where Mk are minterms generated by the class {Ei : 1 ≤ i ≤ 5}. Because of the assumed conditional independence,

(5.12) $P(E_1 E_2^c E_3^c E_4 E_5|H) = P(E_1|H)P(E_2^c|H)P(E_3^c|H)P(E_4|H)P(E_5|H)$

with similar expressions for each P(Mk|H) and P(Mk|H^c). This means that if we want the probability of three or more successes, given H, we can use ckn with the matrix of conditional probabilities. The following MATLAB solution of the investment problem is indicated.

P1 = 0.01*[80 75 60 90 80];
P2 = 0.01*[75 85 70 90 70];
PH = 0.7;
PE = ckn(P1,3)*PH + ckn(P2,3)*(1 - PH)
PE =    0.9255
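
For readers without the m-function ckn, the same calculation can be sketched in Python. The function name prob_at_least and the variable names are assumptions made for this illustration, not part of the text's toolbox; the function builds the distribution of the number of successes one trial at a time, which is valid because the trials are (conditionally) independent.

```python
def prob_at_least(k, probs):
    """P(at least k successes) in independent trials with success probabilities probs."""
    dist = [1.0]  # dist[j] = P(exactly j successes among the trials processed so far)
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for j, mass in enumerate(dist):
            new[j] += mass * (1 - p)   # this trial is a failure
            new[j + 1] += mass * p     # this trial is a success
        dist = new
    return sum(dist[k:])

# Data of Example 5.3
p_positive_given_sound = [0.80, 0.75, 0.60, 0.90, 0.80]    # P(adviser i positive | H)
p_negative_given_unsound = [0.75, 0.85, 0.70, 0.90, 0.70]  # P(adviser i negative | H^c)
p_sound = 0.7                                               # P(H)

# P(E) = P(F|H)P(H) + P(G|H^c)P(H^c)
p_correct = (prob_at_least(3, p_positive_given_sound) * p_sound
             + prob_at_least(3, p_negative_given_unsound) * (1 - p_sound))
print(round(p_correct, 4))  # 0.9255
```

The two conditional probabilities are 0.9234 (given a sound venture) and 0.9304 (given an unsound venture), which combine to the value 0.9255 reported by the MATLAB calculation.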


Often a Bernoulli sequence is related to some conditioning event H. In this case it is reasonable to assume the sequence {Ei : 1 ≤ i ≤ n} ci |H and ci |H^c. We consider a simple example.

Example 5.4 Test of a claim

A race track regular claims he can pick the winning horse in any race 90 percent of the time. In order to test his claim, he picks a horse to win in each of ten races. There are five horses in each race. If he is simply guessing, the probability of success on each race is 0.2. Consider the trials to constitute a Bernoulli sequence. Let H be the event he is correct in his claim. If S is the number of successes in picking the winners in the ten races, determine P(H|S=k) for various numbers k of correct picks. Suppose it is equally likely that his claim is valid or that he is merely guessing. We assume two conditional Bernoulli trials:

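Claim is valid:       Ten trials, probability p = 0.9.

Guessing at random: Ten trials, probability p = 0.2.

Let S= number of correct picks in ten trials. Then

(5.13) $\frac{P(H|S=k)}{P(H^c|S=k)} = \frac{P(H)}{P(H^c)} \cdot \frac{P(S=k|H)}{P(S=k|H^c)}, \quad 0 \le k \le 10$

Giving him the benefit of the doubt, we suppose P(H)/P(H^c)=1 and calculate the conditional odds.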

k = 0:10;
Pk1 = ibinom(10,0.9,k);    % Probability of k successes, given H
Pk2 = ibinom(10,0.2,k);    % Probability of k successes, given H^c
OH  = Pk1./Pk2;            % Conditional odds-- Assumes P(H)/P(H^c) = 1
e   = OH > 1;              % Selects favorable odds
disp(round([k(e);OH(e)]'))
6           2      % Needs at least six to have creditability
7          73      % Seven would be creditable,
8        2627      % even if P(H)/P(H^c) = 0.1
9       94585
10     3405063


Under these assumptions, he would have to pick at least seven correctly to give reasonable validation of his claim.
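
The conditional odds can also be sketched in Python, using only the binomial formula in place of the m-function ibinom. The helper name binom_pmf is an assumption for this illustration; the prior odds are taken to be 1, as in the example.

```python
from math import comb

def binom_pmf(n, p, k):
    """P(S = k) for S binomial (n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n = 10
prior_odds = 1.0  # assumes P(H)/P(H^c) = 1, as in Example 5.4

# Posterior odds P(H|S=k)/P(H^c|S=k) for each possible number k of correct picks
odds = {k: prior_odds * binom_pmf(n, 0.9, k) / binom_pmf(n, 0.2, k)
        for k in range(n + 1)}

# Keep only the values of k for which the odds favor the claim
favorable = {k: round(v) for k, v in odds.items() if v > 1}
print(favorable)  # {6: 2, 7: 73, 8: 2627, 9: 94585, 10: 3405063}
```

Because the binomial coefficients cancel in the ratio, the odds reduce to 4.5^k · 0.125^(10−k), which explains the rapid growth with k.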

5.2. Patterns of Probable Inference*

Some Patterns of Probable Inference

We are concerned with the likelihood of some hypothesized condition. In general, we have evidence for the condition which can never be absolutely certain. We are forced to assess probabilities (likelihoods) on the basis of the evidence. Some typical examples:

HYPOTHESIS                EVIDENCE
Job success               Personal traits
Presence of oil           Geological structures
Operation of a device     Physical condition
Market condition          Test market condition
Presence of a disease     Tests for symptoms

If H is the event the hypothetical condition exists and E is the event the evidence occurs, the probabilities available are usually P(H) (or an odds value), P(E|H), and P(E|H^c). What is desired is P(H|E) or, equivalently, the odds P(H|E)/P(H^c|E). We simply use Bayes' rule to reverse the direction of conditioning.

(5.14) $\frac{P(H|E)}{P(H^c|E)} = \frac{P(E|H)}{P(E|H^c)} \cdot \frac{P(H)}{P(H^c)}$

No conditional independence is involved in this case.

Independent evidence for the hypothesized condition

Suppose there are two “independent” bits of evidence. Now obtaining this evidence may be “operationally” independent, but if the items both relate to the hypothesized condition, then they cannot be really independent. The condition assumed is usually of the form P(E1|H)=P(E1|HE2); that is, if H occurs, then knowledge of E2 does not affect the likelihood of E1. Similarly, we usually have P(E1|H^c)=P(E1|H^cE2). Thus {E1,E2} ci |H and {E1,E2} ci |H^c.

Example 5.5 Independent medical tests

Suppose a doctor thinks the odds are 2/1 that a patient has a certain disease. She orders two independent tests. Let H be the event the patient has the disease and E1 and E2 be the events the tests are positive. Suppose the first test has probability 0.1 of a false positive and probability 0.05 of a false negative. The second test has probabilities 0.05 and 0.08 of false positive and false negative, respectively. If both tests are positive, what is the posterior probability the patient has the disease?

SOLUTION

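Assuming {E1,E2} ci |H and {E1,E2} ci |H^c, we work first in terms of the odds, then convert to probability.

(5.15) $\frac{P(H|E_1E_2)}{P(H^c|E_1E_2)} = \frac{P(H)}{P(H^c)} \cdot \frac{P(E_1E_2|H)}{P(E_1E_2|H^c)} = \frac{P(H)}{P(H^c)} \cdot \frac{P(E_1|H)P(E_2|H)}{P(E_1|H^c)P(E_2|H^c)}$

The data are

(5.16) $\frac{P(H)}{P(H^c)} = 2, \quad P(E_1|H) = 0.95, \quad P(E_1|H^c) = 0.10, \quad P(E_2|H) = 0.92, \quad P(E_2|H^c) = 0.05$

Substituting values, we get

(5.17) $\frac{P(H|E_1E_2)}{P(H^c|E_1E_2)} = 2 \cdot \frac{0.95 \cdot 0.92}{0.10 \cdot 0.05} = \frac{1748}{5}, \quad \text{so that} \quad P(H|E_1E_2) = \frac{1748}{1753} \approx 0.9971$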
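
The odds calculation for conditionally independent items of evidence is a product of likelihood ratios, which can be sketched in a few lines of Python. The function name posterior_odds is an assumption for this illustration; the data are those stated in Example 5.5 (a false negative rate q gives P(Ei|H) = 1 − q, and the false positive rate is P(Ei|H^c)).

```python
def posterior_odds(prior_odds, p_given_h, p_given_hc):
    """Posterior odds for H, given items of evidence that are
    conditionally independent given H and given H^c."""
    odds = prior_odds
    for ph, phc in zip(p_given_h, p_given_hc):
        odds *= ph / phc  # multiply by the likelihood ratio of each item
    return odds

prior = 2.0                              # prior odds P(H)/P(H^c)
p_pos_given_h = [1 - 0.05, 1 - 0.08]     # P(E1|H), P(E2|H): one minus false negative rate
p_pos_given_hc = [0.10, 0.05]            # P(E1|H^c), P(E2|H^c): false positive rates

odds = posterior_odds(prior, p_pos_given_h, p_pos_given_hc)
prob = odds / (1 + odds)                 # convert odds to probability
print(round(odds, 1), round(prob, 4))    # 349.6 0.9971
```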

Evidence for a symptom

Sometimes the evidence dealt with is not evidence for the hypothesized condition, but for some condition which is stochastically related. For purposes of exposition, we refer to this intermediary condition as a symptom. Consider again the examples above.

HYPOTHESIS                SYMPTOM                   EVIDENCE
Job success               Personal traits           Diagnostic test results
Presence of oil           Geological structures     Geophysical survey results
Operation of a device     Physical condition        Monitoring report
Market condition          Test market condition     Market survey result
Presence of a disease     Physical symptom          Test for symptom

We let S be the event the symptom is present. The usual case is that the evidence is directly related to the symptom and not the hypothesized condition. The diagnostic test results can say something about an applicant's personal traits, but cannot deal directly with the hypothesized condition. The test results would be the same whether or not the candidate is successful in the job (he or she does not have the job yet). A geophysical survey deals with certain structural features beneath the surface. If a fault or a salt dome is present, the geophysical results are the same whether or not there is oil present. The physical monitoring report deals with certain physical characteristics. Its reading is the same whether or not the device will fail. A market survey treats only the condition in the test market. The results depend upon the test market, not the national market. A blood test may be for certain physical conditions which frequently are related (at least statistically) to the disease. But the result of the blood test for the physical condition is not directly affected by the presence or absence of the disease.

Under conditions of this type, we may assume

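(5.18) $P(E|SH) = P(E|SH^c) \quad \text{and} \quad P(E|S^cH) = P(E|S^cH^c)$

These imply {E,H} ci |S and {E,H} ci |S^c. Now

(5.19) $\frac{P(H|E)}{P(H^c|E)} = \frac{P(HE)}{P(H^cE)} = \frac{P(HES) + P(HES^c)}{P(H^cES) + P(H^cES^c)} = \frac{P(HS)P(E|S) + P(HS^c)P(E|S^c)}{P(H^cS)P(E|S) + P(H^cS^c)P(E|S^c)}$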

It is worth noting that each term in the denominator differs from the corresponding term in the numerator by having Hc in place of H. Before completing the analysis, it is necessary to consider how H and S are related stochastically in the data. Four cases may be considered.

1. (Case a) Data are P(S|H), P(S|H^c), and P(H).

2. (Case b) Data are P(S|H), P(S|H^c), and P(S).

3. (Case c) Data are P(H|S), P(H|S^c), and P(S).

4. (Case d) Data are P(H|S), P(H|S^c), and P(H).

Case a:

(5.20) $\frac{P(H|E)}{P(H^c|E)} = \frac{P(H)}{P(H^c)} \cdot \frac{P(S|H)P(E|S) + P(S^c|H)P(E|S^c)}{P(S|H^c)P(E|S) + P(S^c|H^c)P(E|S^c)}$

Example 5.6 Geophysical survey

Let H be the event of a successful oil well, S be the event there is a geophysical structure favorable to the presence of oil, and E be the event the geophysical survey indicates a favorable structure. We suppose {H,E} ci |S and {H,E} ci |S^c. Data are

(5.21)

Then

(5.22)
(5.23)

The geophysical result moved the prior odds of 3/1 to posterior odds of 8.6/1, with a corresponding change of probabilities from 0.75 to 0.90.

Case b: Data are P(S), P(S|H), P(S|H^c), P(E|S), and P(E|S^c). If we can determine P(H), we can proceed as in case a. Now by the law of total probability

(5.24) $P(S) = P(S|H)P(H) + P(S|H^c)[1 - P(H)]$

which may be solved algebraically to give

(5.25) $P(H) = \frac{P(S) - P(S|H^c)}{P(S|H) - P(S|H^c)}$

Example 5.7 Geophysical survey revisited

In many cases a better estimate of P(S) or the odds P(S)/P(S^c) can be made on the basis of previous geophysical data. Suppose the prior odds for S are 3/1, so that P(S)=0.75. Using the other data in Example 5.6, we have

(5.26)

Using the pattern of case a, we have

(5.27)
(5.28)

Usually data relating test results to symptom are of the form P(E|S) and P(E|S^c), or equivalent. Data relating the symptom and the hypothesized condition may go either way. In cases a and b, the data are in the form P(S|H) and P(S|H^c), or equivalent, derived from data showing the fraction of times the symptom is noted when the hypothesized condition is identified. But these data may go in the opposite direction, yielding P(H|S) and P(H|S^c), or equivalent. This is the situation in cases c and d.

Case c: Data are P(E|S), P(E|S^c), P(H|S), P(H|S^c), and P(S).

Example 5.8 Evidence for a disease symptom with prior P(S)

When a certain blood syndrome is observed, a given disease is indicated 93 percent of the time. The disease is found without this syndrome only three percent of the time. A test for the syndrome has probability 0.03 of a false positive and 0.05 of a false negative. A preliminary examination indicates a probability 0.30 that a patient has the syndrome. A test is performed; the result is negative. What is the probability the patient has the disease?

SOLUTION

In terms of the notation above, the data are

(5.29) $P(S) = 0.30, \quad P(E|S^c) = 0.03, \quad P(E^c|S) = 0.05$
(5.30) $P(H|S) = 0.93, \quad P(H|S^c) = 0.03$

We suppose {H,E} ci |S and {H,E} ci |S^c.

(5.31) $\frac{P(H|E^c)}{P(H^c|E^c)} = \frac{P(S)P(H|S)P(E^c|S) + P(S^c)P(H|S^c)P(E^c|S^c)}{P(S)P(H^c|S)P(E^c|S) + P(S^c)P(H^c|S^c)P(E^c|S^c)}$
(5.32) $= \frac{0.30 \cdot 0.93 \cdot 0.05 + 0.70 \cdot 0.03 \cdot 0.97}{0.30 \cdot 0.07 \cdot 0.05 + 0.70 \cdot 0.97 \cdot 0.97} = \frac{429}{8246}$

which implies P(H|E^c) = 429/8675 ≈ 0.05.

Case d: This differs from case c only in the fact that a prior probability for H is assumed. In this case, we determine the corresponding probability for S by

(5.33) $P(S) = \frac{P(H) - P(H|S^c)}{P(H|S) - P(H|S^c)}$

and use the pattern of case c.

Example 5.9 Evidence for a disease symptom with prior P(H)

Suppose for the patient in Example 5.8 the physician estimates the odds favoring the presence of the disease are 1/3, so that P(H)=0.25. Again, the test result is negative. Determine the posterior odds, given E^c.

SOLUTION

First we determine

(5.34) $P(S) = \frac{0.25 - 0.03}{0.93 - 0.03} = \frac{11}{45}$

Then

(5.35) $\frac{P(H|E^c)}{P(H^c|E^c)} = \frac{(11/45) \cdot 0.93 \cdot 0.05 + (34/45) \cdot 0.03 \cdot 0.97}{(11/45) \cdot 0.07 \cdot 0.05 + (34/45) \cdot 0.97 \cdot 0.97} = \frac{15009}{320291} \approx \frac{1}{21.34}$

The result of the test drops the prior odds of 1/3 to approximately 1/21.
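
The calculations for cases c and d can be sketched in Python using the data stated in Examples 5.8 and 5.9. The function name odds_h_given_evidence is an assumption for this illustration. It assumes, as in the text, that the hypothesis and the evidence are conditionally independent given the symptom and given its absence; for case d, P(S) is first recovered from P(H) by the law of total probability.

```python
def odds_h_given_evidence(p_s, p_h_s, p_h_sc, p_ev_s, p_ev_sc):
    """Posterior odds P(H|Ev)/P(H^c|Ev), assuming {H, Ev} is conditionally
    independent given S and given S^c.
    p_h_s = P(H|S), p_h_sc = P(H|S^c), p_ev_s = P(Ev|S), p_ev_sc = P(Ev|S^c)."""
    p_sc = 1 - p_s
    num = p_s * p_h_s * p_ev_s + p_sc * p_h_sc * p_ev_sc
    den = p_s * (1 - p_h_s) * p_ev_s + p_sc * (1 - p_h_sc) * p_ev_sc
    return num / den

# The evidence is a NEGATIVE test result, E^c
p_neg_s = 0.05        # P(E^c|S): false negative
p_neg_sc = 1 - 0.03   # P(E^c|S^c): one minus false positive

# Case c (Example 5.8): prior P(S) = 0.30
odds_c = odds_h_given_evidence(0.30, 0.93, 0.03, p_neg_s, p_neg_sc)
prob_c = odds_c / (1 + odds_c)
print(round(prob_c, 4))       # 0.0495

# Case d (Example 5.9): prior P(H) = 0.25, solve P(H) = P(H|S)P(S) + P(H|S^c)(1 - P(S))
p_h = 0.25
p_s = (p_h - 0.03) / (0.93 - 0.03)
odds_d = odds_h_given_evidence(p_s, 0.93, 0.03, p_neg_s, p_neg_sc)
print(round(1 / odds_d, 1))   # 21.3, i.e. odds of about 1/21
```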

Independent evidence for a symptom

In the previous cases, we consider only a single item of evidence for a symptom. But it may be desirable to have a “second opinion.” We suppose the tests are for the symptom and are not directly related to the hypothetical condition. If the tests are operationally independent, we could reasonably assume

(5.36)

This implies {E1,E2,H} ci |S. A similar condition holds for S^c. As for a single test, there are four cases, depending on the tie between S and H. We consider a "case a" example.

Example 5.10 A market survey problem

A food company is planning to market nationally a new breakfast cereal. Its executives feel confident that the odds are at least 3 to 1 the product would be successful. Before launching the new product, the company decides to investigate a test market. Previous experience indicates that the reliability of the test market is such that if the national market is favorable, there is probability 0.9 that the test market is also. On the other hand, if the national market is unfavorable, there is a probability of only 0.2 that the test market will be favorable. These facts lead to the following analysis. Let

H be the event the national market is favorable (hypothesis)

S be the event the test market is favorable (symptom)

The initial data are the following probabilities, based on past experience:

•      (a) Prior odds: P(H)/P(H^c) = 3

•      (b) Reliability of the test market: P(S|H) = 0.9, P(S|H^c) = 0.2

If it were known that the test market is favorable, we should have

(5.37) $\frac{P(H|S)}{P(H^c|S)} = \frac{P(S|H)}{P(S|H^c)} \cdot \frac{P(H)}{P(H^c)} = \frac{0.9}{0.2} \cdot 3 = 13.5$

Unfortunately, it is not feasible to know with certainty the state of the test market. The company decision makers engage two market survey companies to make independent surveys of the test market. The reliability of the companies may be expressed as follows. Let

E1 be the event the first company reports a favorable test market.

E2 be the event the second company reports a favorable test market.

On the basis of previous experience, the reliability of the evidence about the test market (the symptom) is expressed in the following conditional probabilities.

(5.38)

Both survey companies report that the test market is favorable. What is the probability the national market is favorable, given this result?

SOLUTION

The two survey firms work in an “operationally independent” manner. The report of either company is unaffected by the work of the other. Also, each report is affected only by the condition of the test market, regardless of what the national market may be. According to the discussion above, we should be able to assume

(5.39) $\{E_1, E_2, H\} \text{ ci } |S \quad \text{and} \quad \{E_1, E_2, H\} \text{ ci } |S^c$

We may use a pattern similar to that in Example 5.6, as follows:

(5.40)
(5.41)

In terms of the posterior probability, we have

(5.42)

We note that the odds favoring H, given positive indications from both survey companies, are 10.2, as compared with the odds favoring H, given a favorable test market, of 13.5. The difference reflects the residual uncertainty about the test market after the market surveys. Nevertheless, the results of the market surveys increase the odds favoring a satisfactory market from the prior 3 to 1 to a posterior 10.2 to 1. In terms of probabilities, the market surveys increase the likelihood of a favorable market from the original P(H)=0.75 to the posterior P(H|E1E2) ≈ 0.91. The conditional independence of the results of the survey makes possible direct use of the data.

A classification problem

A population consists of members of two subgroups. It is desired to formulate a battery of questions to aid in identifying the subclass membership of randomly selected individuals in the population. The questions are designed so that for each individual the answers are independent, in the sense that the answers to any subset of these questions are not affected by and do not affect the answers to any other subset of the questions. The answers are, however, affected by the subgroup membership. Thus, our treatment of conditional independence suggests that it is reasonable to suppose the answers are conditionally independent, given the subgroup membership. Consider the following numerical example.

Example 5.11 A classification problem

A sample of 125 subjects is taken from a population which has two subgroups. The subgroup membership of each subject in the sample is known. Each individual is asked a battery of ten questions designed to be independent, in the sense that the answer to any one is not affected by the answer to any other. The subjects answer independently. Data on the results are summarized in the following table:

Table 5.5.
      GROUP 1 (69 members)      GROUP 2 (56 members)
Q     Yes    No     Unc.        Yes    No     Unc.
1     42     22     5           20     31     5
2     34     27     8           16     37     3
3     15     45     9           33     19     4
4     19     44     6           31     18     7
5     22     43     4           23     28     5
6     41     13     15          14     37     5
7     9      52     8           31     17     8
8     40     26     3           13     38     5
9     48     12     9           27     24     5
10    20     37     12          35     16     5

Assume the data represent the general population consisting of these two groups, so that the data may be used to calculate probabilities and conditional probabilities.

Several persons are interviewed. The result of each interview is a “profile” of answers to the questions. The goal is to classify the person in one of the two subgroups on the basis of the profile of answers.

The following profiles were taken.

• Y, N, Y, N, Y, U, N, U, Y, U

• N, N, U, N, Y, Y, U, N, N, Y

• Y, Y, N, Y, U, U, N, N, Y, Y

Classify each individual in one of the subgroups.

SOLUTION

Let G1= the event the person selected is from group 1, and G2=G1c= the event the person selected is from group 2. Let

Ai= the event the answer to the ith question is “Yes”

Bi= the event the answer to the ith question is “No”

Ci= the event the answer to the ith question is “Uncertain”

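The data are taken to mean P(A1|G1) = 42/69, P(B3|G2) = 19/56, etc. The profile

Y, N, Y, N, Y, U, N, U, Y, U corresponds to the event E = A1 B2 A3 B4 A5 C6 B7 C8 A9 C10

We utilize the ratio form of Bayes' rule to calculate the posterior odds

(5.43) $\frac{P(G_1|E)}{P(G_2|E)} = \frac{P(E|G_1)}{P(E|G_2)} \cdot \frac{P(G_1)}{P(G_2)}$

If the ratio is greater than one, classify in group 1; otherwise classify in group 2 (we assume that a ratio exactly one is so unlikely that we can neglect it). Because of conditional independence, we are able to determine the conditional probabilities

(5.44) $P(E|G_1) = P(A_1|G_1)P(B_2|G_1)P(A_3|G_1)P(B_4|G_1)P(A_5|G_1)P(C_6|G_1)P(B_7|G_1)P(C_8|G_1)P(A_9|G_1)P(C_{10}|G_1)$
(5.45) $P(E|G_2) = P(A_1|G_2)P(B_2|G_2)P(A_3|G_2)P(B_4|G_2)P(A_5|G_2)P(C_6|G_2)P(B_7|G_2)P(C_8|G_2)P(A_9|G_2)P(C_{10}|G_2)$

The odds P(G1)/P(G2) = 69/56. We find the posterior odds to be

(5.46) $\frac{P(G_1|E)}{P(G_2|E)} = \frac{42 \cdot 27 \cdot 15 \cdot 44 \cdot 22 \cdot 15 \cdot 52 \cdot 3 \cdot 48 \cdot 12}{20 \cdot 37 \cdot 33 \cdot 18 \cdot 23 \cdot 5 \cdot 17 \cdot 5 \cdot 27 \cdot 5} \cdot \frac{56^9}{69^9} \approx 5.845$

The factor 56^9/69^9 comes from multiplying 56^10/69^10 by the odds P(G1)/P(G2) = 69/56. Since the resulting posterior odds favoring Group 1 are greater than one, we classify the respondent in group 1.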

While the calculations are simple and straightforward, they are tedious and error prone. To make possible rapid and easy solution, say in a situation where successive interviews are underway, we have several m-procedures for performing the calculations. Answers to the questions would normally be designated by some such designation as Y for yes, N for no, and U for uncertain. In order for the m-procedure to work, these answers must be represented by numbers indicating the appropriate columns in matrices A and B. Thus, in the example under consideration, each Y must be translated into a 1, each N into a 2, and each U into a 3. The task is not particularly difficult, but it is much easier to have MATLAB make the translation as well as do the calculations. The following two-stage approach for solving the problem works well.

The first m-procedure oddsdf sets up the frequency information. The next m-procedure odds calculates the odds for a given profile. The advantage of splitting into two m-procedures is that we can set up the data once, then call repeatedly for the calculations for different profiles. As always, it is necessary to have the data in an appropriate form. The following is an example in which the data are entered in terms of actual frequencies of response.

% file oddsf4.m
% Frequency data for classification
A = [42 22 5; 34 27 8; 15 45 9; 19 44 6; 22 43 4;
41 13 15; 9 52 8; 40 26 3; 48 12 9; 20 37 12];
B = [20 31 5; 16 37 3; 33 19 4; 31 18 7; 23 28 5;
14 37 5; 31 17 8; 13 38 5; 27 24 5; 35 16 5];
disp('Call for oddsdf')

Example 5.12 Classification using frequency data
oddsf4              % Call for data in file oddsf4.m
Call for oddsdf     % Prompt built into data file
oddsdf              % Call for m-procedure oddsdf
Enter matrix A of frequencies for calibration group 1  A
Enter matrix B of frequencies for calibration group 2  B
Number of questions = 10
Enter code for answers and call for procedure "odds"
y = 1;              % Use of lower case for easier writing
n = 2;
u = 3;
odds                % Call for calculating procedure
Enter profile matrix E  [y n y n y u n u y u]   % First profile
Odds favoring Group 1:   5.845
Classify in Group 1
odds                % Second call for calculating procedure
Enter profile matrix E  [n n u n y y u n n y]   % Second profile
Odds favoring Group 1:   0.2383
Classify in Group 2
odds                % Third call for calculating procedure
Enter profile matrix E  [y y n y u u n n y y]   % Third profile
Odds favoring Group 1:   5.05
Classify in Group 1


The principal feature of the m-procedure odds is the scheme for selecting the numbers from the A and B matrices. If E=[yynyuunnyy], then the coding translates this into the actual numerical matrix

E = [1 1 2 1 3 3 2 2 1 1]

used internally. Then A(:,E) is a matrix with columns corresponding to elements of E. Thus

e = A(:,E)
e =
    42    42    22    42     5     5    22    22    42    42
    34    34    27    34     8     8    27    27    34    34
    15    15    45    15     9     9    45    45    15    15
    19    19    44    19     6     6    44    44    19    19
    22    22    43    22     4     4    43    43    22    22
    41    41    13    41    15    15    13    13    41    41
     9     9    52     9     8     8    52    52     9     9
    40    40    26    40     3     3    26    26    40    40
    48    48    12    48     9     9    12    12    48    48
    20    20    37    20    12    12    37    37    20    20


The ith entry on the ith column is the count corresponding to the answer to the ith question. For example, the answer to the third question is N (no), and the corresponding count is the third entry in the N (second) column of A. The element on the diagonal in the third column of A(:,E) is the third element in that column, and hence the desired third entry of the N column. By picking out the elements on the diagonal by the command diag(A(:,E)), we have the desired set of counts corresponding to the profile. The same is true for diag(B(:,E)).
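
The selection scheme and the odds calculation can be sketched in Python for comparison. This is not the m-procedure odds itself; the function name odds_group1 and the letter coding are assumptions for this illustration. Picking entry (i, answer i) from each frequency matrix plays the role of diag(A(:,E)) and diag(B(:,E)), and the group sizes are read from the row sums of the frequency data.

```python
from math import prod

# Frequency data of Table 5.5: one row per question, columns Yes, No, Uncertain
A = [[42, 22, 5], [34, 27, 8], [15, 45, 9], [19, 44, 6], [22, 43, 4],
     [41, 13, 15], [9, 52, 8], [40, 26, 3], [48, 12, 9], [20, 37, 12]]
B = [[20, 31, 5], [16, 37, 3], [33, 19, 4], [31, 18, 7], [23, 28, 5],
     [14, 37, 5], [31, 17, 8], [13, 38, 5], [27, 24, 5], [35, 16, 5]]

CODE = {"y": 0, "n": 1, "u": 2}  # column index for each answer (zero-based)

def odds_group1(profile, A, B):
    """Posterior odds favoring Group 1 for a profile such as 'ynynyunuyu',
    assuming the answers are conditionally independent given the group."""
    n1, n2 = sum(A[0]), sum(B[0])          # group sizes (69 and 56 here)
    cols = [CODE[ans] for ans in profile.lower()]
    counts1 = [A[i][j] for i, j in enumerate(cols)]   # analogous to diag(A(:,E))
    counts2 = [B[i][j] for i, j in enumerate(cols)]   # analogous to diag(B(:,E))
    like1 = prod(c / n1 for c in counts1)  # P(E|G1)
    like2 = prod(c / n2 for c in counts2)  # P(E|G2)
    return (like1 / like2) * (n1 / n2)     # likelihood ratio times prior odds

for profile in ["ynynyunuyu", "nnunyyunny", "yynyuunnyy"]:
    odds = odds_group1(profile, A, B)
    group = 1 if odds > 1 else 2
    print(profile, round(odds, 4), "-> Group", group)
```

The three profiles give odds of about 5.845, 0.2383, and 5.05, in agreement with the MATLAB session above.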

Sometimes the data are given in terms of conditional probabilities and probabilities. A slight modification of the procedure handles this case. For purposes of comparison, we convert the problem above to this form by converting the counts in matrices A and B to conditional probabilities. We do this by dividing by the total count in each group (69 and 56 in this case). Also, P(G1) = 69/125 = 0.552 and P(G2) = 56/125 = 0.448.

Table 5.6.
      GROUP 1                       GROUP 2
Q     Yes       No        Unc.      Yes       No        Unc.
1     0.6087    0.3188    0.0725    0.3571    0.5536    0.0893
2     0.4928    0.3913    0.1159    0.2857    0.6607    0.0536
3     0.2174    0.6522    0.1304    0.5893    0.3393    0.0714
4     0.2754    0.6376    0.0870    0.5536    0.3214    0.1250
5     0.3188    0.6232    0.0580    0.4107    0.5000    0.0893
6     0.5942    0.1884    0.2174    0.2500    0.6607    0.0893
7     0.1304    0.7536    0.1160    0.5536    0.3036    0.1428
8     0.5797    0.3768    0.0435    0.2321    0.6786    0.0893
9     0.6957    0.1739    0.1304    0.4821    0.4286    0.0893
10    0.2899    0.5362    0.1739    0.6250    0.2857    0.0893

These data are in an m-file oddsp4.m. The modified setup m-procedure oddsdp uses the conditional probabilities, then calls for the m-procedure odds.

Example 5.13 Calculation using conditional probability data
oddsp4                 % Call for converted data (probabilities)
oddsdp                 % Setup m-procedure for probabilities
Enter conditional probabilities for Group 1  A
Enter conditional probabilities for Group 2  B
Probability p1 individual is from Group 1  0.552
Number of questions = 10
Enter code for answers and call for procedure "odds"
y = 1;
n = 2;
u = 3;
odds
Enter profile matrix E  [y n y n y u n u y u]
Odds favoring Group 1:  5.845
Classify in Group 1


The slight discrepancy in the odds favoring Group 1 (5.8454 compared with 5.8452) can be attributed to rounding of the conditional probabilities to four places. The presentation above rounds the results to 5.845 in each case, so the discrepancy is not apparent. This is quite acceptable, since the discrepancy has no effect on the results.
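
What the calculation amounts to can be sketched without the m-procedures. The sketch below assumes the answers are conditionally independent, given the group, so that the posterior odds are the prior odds times a product of likelihood ratios, one per question. The matrices A and B hold the entries of Table 5.6 and p1 is the value entered in the transcript; the names idx, LR, and R are illustrative, and this is not claimed to be the internal code of odds.

```matlab
% Conditional probabilities from Table 5.6, columns = [Yes No Unc].
A = [0.6087 0.3188 0.0725; 0.4928 0.3913 0.1159; 0.2174 0.6522 0.1304;
     0.2754 0.6376 0.0870; 0.3188 0.6232 0.0580; 0.5942 0.1884 0.2174;
     0.1304 0.7536 0.1160; 0.5797 0.3768 0.0435; 0.6957 0.1739 0.1304;
     0.2899 0.5362 0.1739];
B = [0.3571 0.5536 0.0893; 0.2857 0.6607 0.0536; 0.5893 0.3393 0.0714;
     0.5536 0.3214 0.1250; 0.4107 0.5000 0.0893; 0.2500 0.6607 0.0893;
     0.5536 0.3036 0.1428; 0.2321 0.6786 0.0893; 0.4821 0.4286 0.0893;
     0.6250 0.2857 0.0893];
p1 = 0.552;                       % P(G1); P(G2) is taken as 1 - p1

y = 1; n = 2; u = 3;              % answer codes
E = [y n y n y u n u y u];        % profile from Example 5.13

% Entry (i, E(i)) of each matrix: probability of the ith answer, given group.
idx = sub2ind(size(A), 1:size(A,1), E);

LR = prod(A(idx) ./ B(idx));      % likelihood ratio P(E|G1)/P(E|G2)
R  = (p1/(1 - p1)) * LR;          % posterior odds favoring Group 1

fprintf('Odds favoring Group 1: %.4f\n', R)
if R > 1
    disp('Classify in Group 1')
else
    disp('Classify in Group 2')
end
```

With the four-place table entries, the result should agree with the 5.845 reported in Example 5.13 to the displayed precision.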

5.3. Problems on Conditional Independence*

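Suppose {A,B} ci |C and {A,B} ci |C^c, P(C)=0.7, and

P(A|C)=0.4, P(B|C)=0.6, P(A|C^c)=0.3, P(B|C^c)=0.2    (5.47)

Show whether or not the pair {A,B} is independent.

P(A)=P(A|C)P(C)+P(A|C^c)P(C^c), P(B)=P(B|C)P(C)+P(B|C^c)P(C^c), and P(AB)=P(A|C)P(B|C)P(C)+P(A|C^c)P(B|C^c)P(C^c).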

PA = 0.4*0.7 + 0.3*0.3
PA =  0.3700
PB = 0.6*0.7 + 0.2*0.3
PB =  0.4800
PA*PB
ans = 0.1776
PAB = 0.4*0.6*0.7 + 0.3*0.2*0.3
PAB = 0.1860       % PAB not equal PA*PB;  not independent


Suppose and , with P(C)=0.4, and

(5.48)

Determine the posterior odds .

(5.49)
(5.50)

Five world class sprinters are entered in a 200 meter dash. Each has a good chance to break the current track record. There is a thirty percent chance a late cold front will move in, bringing conditions that adversely affect the runners. Otherwise, conditions are expected to be favorable for an outstanding race. Their respective probabilities of breaking the record are:

• Good weather (no front): 0.75, 0.80, 0.65, 0.70, 0.85

• Poor weather (front in): 0.60, 0.65, 0.50, 0.55, 0.70

The performances are (conditionally) independent, given good weather, and also, given poor weather. What is the probability that three or more will break the track record?

Hint. If B3 is the event of three or more and W is the event of good weather (no front), then P(B3)=P(B3|W)P(W)+P(B3|W^c)P(W^c).

PW = 0.01*[75 80 65 70 85];
PWc = 0.01*[60 65 50 55 70];
P = ckn(PW,3)*0.7 + ckn(PWc,3)*0.3
P =  0.8353
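
The quantity ckn(P,k) is the probability that k or more of a class of independent events occur. As a cross-check that does not need the toolbox, the sketch below builds the distribution of the number of records by convolving the two-point distributions [1-p p] of the individual events, once for each weather condition. This is one way to get the same number; it is not claimed to be how ckn is implemented. The names dW, dWc, PgeW, and PgeWc are illustrative.

```matlab
PW  = 0.01*[75 80 65 70 85];    % P(record | good weather), per runner
PWc = 0.01*[60 65 50 55 70];    % P(record | poor weather), per runner
k   = 3;                        % "three or more"

% Number of successes among independent events: convolve [1-p p] terms.
dW = 1;
for p = PW
    dW = conv(dW, [1-p p]);
end
dWc = 1;
for p = PWc
    dWc = conv(dWc, [1-p p]);
end

% dW(j+1) = P(exactly j records | good weather); sum the upper tail.
PgeW  = sum(dW(k+1:end));
PgeWc = sum(dWc(k+1:end));

% Law of total probability over the weather condition.
P = PgeW*0.7 + PgeWc*0.3        % about 0.8353
```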


A device has five sensors connected to an alarm system. The alarm is given if three or more of the sensors trigger a switch. If a dangerous condition is present, each of the switches has high (but not unit) probability of activating; if the dangerous condition does not exist, each of the switches has low (but not zero) probability of activating (falsely). Suppose D= the event of the dangerous condition and A= the event the alarm is activated. Proper operation consists of AD ∨ A^cD^c. Suppose Ei= the event the ith unit is activated. Since the switches operate independently, we suppose

{E1,E2,E3,E4,E5} ci |D and ci |D^c    (5.51)

Assume the conditional probabilities of the Ei, given D, are 0.91, 0.93, 0.96, 0.87, 0.97, and given D^c, are 0.03, 0.02, 0.07, 0.04, 0.01, respectively. If P(D)=0.02, what is the probability the alarm system acts properly? Suggestion. Use the conditional independence and the procedure ckn.

P1 = 0.01*[91 93 96 87 97];
P2 = 0.01*[3 2 7 4 1];
P  = ckn(P1,3)*0.02 + (1 - ckn(P2,3))*0.98
P =  0.9997


Seven students plan to complete a term paper over the Thanksgiving recess. They work independently; however, the likelihood of completion depends upon the weather. If the weather is very pleasant, they are more likely to engage in outdoor activities and put off work on the paper. Let Ei be the event the ith student completes his or her paper, Ak be the event that k or more complete during the recess, and W be the event the weather is highly conducive to outdoor activity. It is reasonable to suppose {Ei: 1≤i≤7} ci |W and ci |W^c. Suppose

P(Ei|W) = 0.4, 0.5, 0.3, 0.7, 0.5, 0.6, 0.2    (5.52)
P(Ei|W^c) = 0.7, 0.8, 0.5, 0.9, 0.7, 0.8, 0.5    (5.53)

respectively, and P(W)=0.8. Determine the probability that four or more complete their papers and that five or more finish.

PW = 0.1*[4 5 3 7 5 6 2];
PWc = 0.1*[7 8 5 9 7 8 5];
PA4 = ckn(PW,4)*0.8 + ckn(PWc,4)*0.2
PA4 =  0.4993
PA5 = ckn(PW,5)*0.8 + ckn(PWc,5)*0.2
PA5 =  0.2482


A manufacturer claims to have improved the reliability of his product. Formerly, the product had probability 0.65 of operating 1000 hours without failure. The manufacturer claims this probability is now 0.80. A sample of size 20 is tested. Determine the odds favoring the new probability for various numbers of surviving units under the assumption the prior odds are 1 to 1. How many survivors would be required to make the claim creditable?

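Let E1 be the event the probability is 0.80 and E2 be the event the probability is 0.65. Assume P(E1)/P(E2)=1. If Sn is the number of survivors in a sample of size n, then

P(E1|Sn=k)/P(E2|Sn=k) = [P(E1)/P(E2)]·[P(Sn=k|E1)/P(Sn=k|E2)]    (5.54)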
k = 1:20;
odds = ibinom(20,0.80,k)./ibinom(20,0.65,k);
disp([k;odds]')
- - - - - - - - - - - -
13.0000    0.2958
14.0000    0.6372
15.0000    1.3723   % Need at least 15 or 16 successes
16.0000    2.9558
17.0000    6.3663
18.0000   13.7121
19.0000   29.5337
20.0000   63.6111
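
Since the prior odds are 1 to 1, the posterior odds are a ratio of two binomial probabilities with the same n and k, so the binomial coefficient cancels. The sketch below uses that simplification to get the odds without ibinom, and finds the smallest number of survivors for which the odds favor the new probability. The names R and kmin are illustrative; R is used here rather than odds so as not to shadow the m-procedure of that name.

```matlab
n  = 20;                  % sample size
p1 = 0.80;                % claimed (new) probability of survival
p2 = 0.65;                % former probability of survival
k  = 0:n;                 % possible numbers of survivors

% C(n,k) appears in both binomial probabilities and cancels in the ratio.
R = (p1/p2).^k .* ((1 - p1)/(1 - p2)).^(n - k);

kmin = k(find(R > 1, 1)); % smallest k with posterior odds above 1

disp([k(14:end); R(14:end)]')   % rows for k = 13,...,20
fprintf('Odds first exceed 1 at k = %d\n', kmin)
```

Each additional survivor multiplies the odds by (0.80/0.65)(0.35/0.20), roughly 2.15, which accounts for the steady growth in the table.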


A real estate agent in a neighborhood heavily populated by affluent professional persons is working with a customer. The agent is trying to assess the likelihood the customer will actually buy. His experience indicates the following: if H is the event the customer buys, S is the event the customer is a professional with good income, and E is the event the customer drives a prestigious car, then

(5.55)

Since buying a house and owning a prestigious car are not related for a given owner, it seems reasonable to suppose and . The customer drives a Cadillac. What are the odds he will buy a house?

Assumptions amount to and .

(5.56)
(5.57)
(5.58)

In deciding whether or not to drill an oil well in a certain location, a company undertakes a geophysical survey. On the basis of past experience, the decision makers feel the odds are about four to one favoring success. Various other probabilities can be assigned on the basis of past experience. Let

• H be the event that a well would be successful

• S be the event the geological conditions are favorable

• E be the event the results of the geophysical survey are positive

The initial, or prior, odds are . Previous experience indicates

(5.59)

Make reasonable assumptions based on the fact that the result of the geophysical survey depends upon the geological formations and not on the presence or absence of oil. The result of the survey is favorable. Determine the posterior odds .

(5.60)
(5.61)

A software firm is planning to deliver a custom package. Past experience indicates the odds are at least four to one that it will pass customer acceptance tests. As a check, the program is subjected to two different benchmark runs. Both are successful. Given the following data, what are the odds favoring successful operation in practice? Let

• H be the event the performance is satisfactory

• S be the event the system satisfies customer acceptance tests

• E1 be the event the first benchmark tests are satisfactory.

• E2 be the event the second benchmark test is ok.

Under the usual conditions, we may assume and . Reliability data show

(5.62)
(5.63)

Determine the posterior odds .

(5.64)
(5.65)
(5.66)

A research group is contemplating purchase of a new software package to perform some specialized calculations. The systems manager decides to do two sets of diagnostic tests for significant bugs that might hamper operation in the intended application. The tests are carried out in an operationally independent manner. The following analysis of the results is made.

• H= the event the program is satisfactory for the intended application

• S= the event the program is free of significant bugs

• E1= the event the first diagnostic tests are satisfactory

• E2= the event the second diagnostic tests are satisfactory

Since the tests are for the presence of bugs, and are operationally independent, it seems reasonable to assume and . Because of the reliability of the software company, the manager thinks P(S)=0.85. Also, experience suggests

Determine the posterior odds favoring H if results of both diagnostic tests are satisfactory.

(5.67)
(5.68)

with similar expressions for the other terms.

(5.69)

A company is considering a new product now undergoing field testing. Let

• H be the event the product is introduced and successful

• S be the event the R&D group produces a product with the desired characteristics.

• E be the event the testing program indicates the product is satisfactory

The company assumes P(S)=0.9 and the conditional probabilities

(5.70)

Since the testing of the merchandise is not affected by market success or failure, it seems reasonable to suppose and . The field tests are favorable. Determine .

(5.71)
(5.72)

Martha is wondering if she will get a five percent annual raise at the end of the fiscal year. She understands this is more likely if the company's net profits increase by ten percent or more. These will be influenced by company sales volume. Let

• H= the event she will get the raise

• S= the event company profits increase by ten percent or more

• E= the event sales volume is up by fifteen percent or more

Since the prospect of a raise depends upon profits, not directly on sales, she supposes and . She thinks the prior odds favoring suitable profit increase is about three to one. Also, it seems reasonable to suppose

(5.73)

End of the year records show that sales increased by eighteen percent. What is the probability Martha will get her raise?

(5.74)
(5.75)

A physician thinks the odds are about 2 to 1 that a patient has a certain disease. He seeks the “independent” advice of three specialists. Let H be the event the disease is present, and A,B,C be the events the respective consultants agree this is the case. The physician decides to go with the majority. Since the advisers act in an operationally independent manner, it seems reasonable to suppose {A,B,C} ci |H and ci |H^c. Experience indicates

P(A|H)=0.80, P(B|H)=0.70, P(C|H)=0.75    (5.76)
P(A^c|H^c)=0.85, P(B^c|H^c)=0.80, P(C^c|H^c)=0.70    (5.77)

What is the probability of the right decision (i.e., he treats the disease if two or more think it is present, and does not if two or more think the disease is not present)?

PH = 0.01*[80 70 75];
PHc = 0.01*[85 80 70];
pH = 2/3;
P  = ckn(PH,2)*pH + ckn(PHc,2)*(1 - pH)
P =  0.8577
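
With only three consultants, the result can also be checked by brute force. The sketch below enumerates the eight patterns of correct and incorrect opinions and adds up the probabilities of those in which the majority is correct. It assumes, as the solution code above does, that PH holds the probabilities a consultant reports the disease when it is present and PHc the probabilities a consultant reports no disease when it is absent. The names pat, maj, prH, and prHc are illustrative.

```matlab
pH  = 2/3;                     % prior P(H), from odds of 2 to 1
PH  = [0.80 0.70 0.75];        % P(consultant is correct | H)
PHc = [0.85 0.80 0.70];        % P(consultant is correct | H^c)

% All 8 patterns: 1 = consultant correct, 0 = consultant incorrect.
[a, b, c] = ndgrid([0 1]);
pat = [a(:) b(:) c(:)];
maj = sum(pat, 2) >= 2;        % patterns in which the majority is correct

% Pattern probabilities under conditional independence, given H and H^c.
prH  = prod(pat .* repmat(PH, 8, 1)  + (1 - pat) .* repmat(1 - PH, 8, 1),  2);
prHc = prod(pat .* repmat(PHc, 8, 1) + (1 - pat) .* repmat(1 - PHc, 8, 1), 2);

% Right decision: majority correct in either case, weighted by the prior.
P = sum(prH(maj))*pH + sum(prHc(maj))*(1 - pH)   % about 0.8577
```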


A software company has developed a new computer game designed to appeal to teenagers and young adults. It is felt that there is good probability it will appeal to college students, and that if it appeals to college students it will appeal to a general youth market. To check the likelihood of appeal to college students, it is decided to test first by a sales campaign at Rice and University of Texas, Austin. The following analysis of the situation is made.

• H= the event the sales to the general market will be good

• S= the event the game appeals to college students

• E1= the event the sales are good at Rice

• E2= the event the sales are good at UT, Austin

Since the tests for reception are at two separate universities and are operationally independent, it seems reasonable to assume and . Because of its previous experience in game sales, the managers think P(S)=0.80. Also, experience suggests

Determine the posterior odds favoring H if sales results are satisfactory at both schools.

(5.78)
(5.79)
(5.80)

In a region in the Gulf Coast area, oil deposits are highly likely to be associated with underground salt domes. If H is the event that an oil deposit is present in an area, and S is the event of a salt dome in the area, experience indicates P(S|H)=0.9 and . Company executives believe the odds favoring oil in the area is at least 1 in 10. It decides to conduct two independent geophysical surveys for the presence of a salt dome. Let be the events the surveys indicate a salt dome. Because the surveys are tests for the geological structure, not the presence of oil, and the tests are carried out in an operationally independent manner, it seems reasonable to assume and . Data on the reliability of the surveys yield the following probabilities

(5.81)

Determine the posterior odds . Should the well be drilled?

(5.82)
(5.83)

with similar expressions for the other terms.

(5.84)

A sample of 150 subjects is taken from a population which has two subgroups. The subgroup membership of each subject in the sample is known. Each individual is asked a battery of ten questions designed to be independent, in the sense that the answer to any one is not affected by the answer to any other. The subjects answer independently. Data on the results are summarized in the following table:

      GROUP 1 (84 members)     GROUP 2 (66 members)
 Q    Yes    No    Unc         Yes    No    Unc
 1    51     26     7          27     34     5
 2    42     32    10          19     43     4
 3    19     54    11          39     22     5
 4    24     53     7          38     19     9
 5    27     52     5          28     33     5
 6    49     19    16          19     41     6
 7    16     59     9          37     21     8
 8    47     32     5          19     42     5
 9    55     17    12          27     33     6
10    24     53     7          39     21     6

Assume the data represent the general population consisting of these two groups, so that the data may be used to calculate probabilities and conditional probabilities.

Several persons are interviewed. The result of each interview is a “profile” of answers to the questions. The goal is to classify the person in one of the two subgroups.

For the following profiles, classify each individual in one of the subgroups

1. y, n, y, n, y, u, n, u, y, u

2. n, n, u, n, y, y, u, n, n, y

3. y, y, n, y, u, u, n, n, y, y

% file npr05_16.m
% Data for Exercise 16.
A = [51 26  7; 42 32 10; 19 54 11; 24 53  7; 27 52  5;
49 19 16; 16 59  9; 47 32  5; 55 17 12; 24 53  7];
B = [27 34  5; 19 43  4; 39 22  5; 38 19  9; 28 33  5;
19 41  6; 37 21  8; 19 42  5; 27 33  6; 39 21  6];
disp('Call for oddsdf')
npr05_16
Call for oddsdf
oddsdf
Enter matrix A of frequencies for calibration group 1  A
Enter matrix B of frequencies for calibration group 2  B
Number of questions = 10
Enter code for answers and call for procedure "odds"
y = 1;
n = 2;
u = 3;
odds
Enter profile matrix E  [y n y n y u n u y u]
Odds favoring Group 1:   3.743
Classify in Group 1
odds
Enter profile matrix E  [n n u n y y u n n y]
Odds favoring Group 1:   0.2693
Classify in Group 2
odds
Enter profile matrix E  [y y n y u u n n y y]
Odds favoring Group 1:   5.286
Classify in Group 1


The data of Exercise 16, above, are converted to conditional probabilities and probabilities, as follows (probabilities are rounded to two decimal places).

          GROUP 1                GROUP 2
 Q    Yes    No     Unc      Yes    No     Unc
 1    0.61   0.31   0.08     0.41   0.51   0.08
 2    0.50   0.38   0.12     0.29   0.65   0.06
 3    0.23   0.64   0.13     0.59   0.33   0.08
 4    0.29   0.63   0.08     0.57   0.29   0.14
 5    0.32   0.62   0.06     0.42   0.50   0.08
 6    0.58   0.23   0.19     0.29   0.62   0.09
 7    0.19   0.70   0.11     0.56   0.32   0.12
 8    0.56   0.38   0.06     0.29   0.63   0.08
 9    0.65   0.20   0.15     0.41   0.50   0.09
10    0.29   0.63   0.08     0.59   0.32   0.09

For the following profiles classify each individual in one of the subgroups.

1. y, n, y, n, y, u, n, u, y, u

2. n, n, u, n, y, y, u, n, n, y

3. y, y, n, y, u, u, n, n, y, y

npr05_17
% file npr05_17.m
% Data for Exercise 17.
PG1 = 84/150;
PG2 = 66/150;
A = [0.61 0.31 0.08
0.50 0.38 0.12
0.23 0.64 0.13
0.29 0.63 0.08
0.32 0.62 0.06
0.58 0.23 0.19
0.19 0.70 0.11
0.56 0.38 0.06
0.65 0.20 0.15
0.29 0.63 0.08];

B = [0.41 0.51 0.08
0.29 0.65 0.06
0.59 0.33 0.08
0.57 0.29 0.14
0.42 0.50 0.08
0.29 0.62 0.09
0.56 0.32 0.12
0.29 0.64 0.08
0.41 0.50 0.09
0.59 0.32 0.09];
disp('Call for oddsdp')
Call for oddsdp
oddsdp
Enter matrix A of conditional probabilities for Group 1  A
Enter matrix B of conditional probabilities for Group 2  B
Probability p1 an individual is from Group 1  PG1
Number of questions = 10
Enter code for answers and call for procedure "odds"
y = 1;
n = 2;
u = 3;
odds
Enter profile matrix E  [y n y n y u n u y u]
Odds favoring Group 1:   3.486
Classify in Group 1
odds
Enter profile matrix E  [n n u n y y u n n y]
Odds favoring Group 1:   0.2603
Classify in Group 2
odds
Enter profile matrix E  [y y n y u u n n y y]
Odds favoring Group 1:   5.162
Classify in Group 1

Solutions