Chemistry LibreTexts

    Chapter 5 Conditional Independence

    5.1 Conditional Independence

    The idea of stochastic (probabilistic) independence is explored in the unit Independence of Events. The concept is approached as lack of conditioning: P(A|B)=P(A). This is equivalent to the product rule P(AB)=P(A)P(B). We consider an extension to conditional independence.

    The concept

    Examination of the independence concept reveals two important mathematical facts:

    • Independence of a class of non mutually exclusive events depends upon the probability measure, and not on the relationship between the events. Independence cannot be displayed on a Venn diagram, unless probabilities are indicated. For one probability measure a pair may be independent while for another probability measure the pair may not be independent.

    • Conditional probability is a probability measure, since it has the three defining properties and all those properties derived therefrom.

    This raises the question: is there a useful conditional independence—i.e., independence with respect to a conditional probability measure? In this chapter we explore that question in a fruitful way.

    Among the simple examples of “operational independence” in the unit on independence of events, which lead naturally to an assumption of “probabilistic independence,” are the following:

    • If customers come into a well stocked shop at different times, each unaware of the choice made by the other, the item purchased by one should not be affected by the choice made by the other.

    • If two students are taking exams in different courses, the grade one makes should not affect the grade made by the other.

    Example 5.1 Buying umbrellas and the weather

    A department store has a nice stock of umbrellas. Two customers come into the store “independently.” Let A be the event the first buys an umbrella and B the event the second buys an umbrella. Normally, we should think the events \(\{A, B\}\) form an independent pair. But consider the effect of weather on the purchases. Let C be the event the weather is rainy (i.e., is raining or threatening to rain). Now we should think \(P(A|C) > P(A|C^c)\) and \(P(B|C) > P(B|C^c)\). The weather has a decided effect on the likelihood of buying an umbrella. But given the fact the weather is rainy (event C has occurred), it would seem reasonable that purchase of an umbrella by one should not affect the likelihood of such a purchase by the other. Thus, it may be reasonable to suppose

    (5.1)
    \[ P(A|C) = P(A|BC) \quad \text{or, in another notation,} \quad P_C(A) = P_C(A|B) \]

    An examination of the sixteen equivalent conditions for independence, with probability measure P replaced by probability measure \(P_C\), shows that we have independence of the pair \(\{A, B\}\) with respect to the conditional probability measure \(P_C(\cdot) = P(\cdot|C)\). Thus, \(P(AB|C) = P(A|C)P(B|C)\). For this example, we should also expect that \(P(A|C^c) = P(A|BC^c)\), so that there is independence with respect to the conditional probability measure \(P(\cdot|C^c)\). Does this make the pair \(\{A, B\}\) independent (with respect to the prior probability measure P)? Some numerical examples make it plain that only in the most unusual cases would the pair be independent. Without calculations, we can see why this should be so. If the first customer buys an umbrella, this indicates a higher than normal likelihood that the weather is rainy, in which case the second customer is likely to buy. The condition leads to \(P(B|A) > P(B)\). Consider the following numerical case. Suppose \(P(AB|C) = P(A|C)P(B|C)\) and \(P(AB|C^c) = P(A|C^c)P(B|C^c)\) and

    (5.2)
    \[ P(A|C) = 0.60, \quad P(A|C^c) = 0.20, \quad P(B|C) = 0.50, \quad P(B|C^c) = 0.15, \quad \text{with } P(C) = 0.30 \]

    Then

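    (5.3)
    \[ P(A) = P(A|C)P(C) + P(A|C^c)P(C^c) = 0.3200, \qquad P(B) = P(B|C)P(C) + P(B|C^c)P(C^c) = 0.2550 \]
    (5.4)
    \[ P(AB) = P(AB|C)P(C) + P(AB|C^c)P(C^c) = P(A|C)P(B|C)P(C) + P(A|C^c)P(B|C^c)P(C^c) = 0.1110 \]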

    As a result,

    (5.5)
    \[ P(A)P(B) = 0.0816 \ne 0.1110 = P(AB) \]

    The product rule fails, so that the pair is not independent. An examination of the pattern of computation shows that independence would require very special probabilities which are not likely to be encountered.
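
    The pattern of computation in this example can be checked with a short Python sketch. The conditional probabilities used here are assumed illustrative values, chosen because they reproduce the figures 0.0816 and 0.1110 quoted in (5.5); the variable names are ours.

```python
# Assumed data: C = rainy weather, A and B = umbrella purchases.
p_c = 0.30
p_a_c, p_a_cc = 0.60, 0.20   # P(A|C), P(A|C^c)
p_b_c, p_b_cc = 0.50, 0.15   # P(B|C), P(B|C^c)

# Law of total probability for each event separately.
p_a = p_a_c * p_c + p_a_cc * (1 - p_c)
p_b = p_b_c * p_c + p_b_cc * (1 - p_c)

# Conditional independence given C and given C^c lets each joint term factor.
p_ab = p_a_c * p_b_c * p_c + p_a_cc * p_b_cc * (1 - p_c)

print(round(p_a * p_b, 4), round(p_ab, 4))   # product rule fails: the two differ
print(round(p_ab / p_a, 4), round(p_b, 4))   # P(B|A) compared with P(B)
```

    With these values P(B|A) exceeds P(B), which is the dependence the verbal argument predicts.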

    Example 5.2 Students and exams

    Two students take exams in different courses. Under normal circumstances, one would suppose their performances form an independent pair. Let A be the event the first student makes grade 80 or better and B be the event the second has a grade of 80 or better. The exam is given on Monday morning. It is the fall semester. There is a probability 0.30 that there was a football game on Saturday, and both students are enthusiastic fans. Let C be the event of a game on the previous Saturday. Now it is reasonable to suppose

    (5.6)
    \[ P(A|C) = P(A|BC) \quad \text{and} \quad P(A|C^c) = P(A|BC^c) \]

    If we know that there was a Saturday game, additional knowledge that B has occurred does not affect the likelihood that A occurs. Again, use of equivalent conditions shows that the situation may be expressed

    (5.7)
    \[ P(AB|C) = P(A|C)P(B|C) \quad \text{and} \quad P(AB|C^c) = P(A|C^c)P(B|C^c) \]

    Under these conditions, we should suppose that \(P(A|C) < P(A|C^c)\) and \(P(B|C) < P(B|C^c)\). If we knew that one did poorly on the exam, this would increase the likelihood there was a Saturday game and hence increase the likelihood that the other did poorly. The failure to be independent arises from a common chance factor that affects both. Although their performances are “operationally” independent, they are not independent in the probability sense. As a numerical example, suppose

    (5.8)
    \[ P(A|C) = 0.7, \quad P(A|C^c) = 0.9, \quad P(B|C) = 0.6, \quad P(B|C^c) = 0.8, \quad P(C) = 0.3 \]

    Straightforward calculations show \(P(A) = 0.8400\), \(P(B) = 0.7400\), \(P(AB) = 0.6300\). Note that \(P(A|B) = 0.8514 > P(A)\), as would be expected.

    Sixteen equivalent conditions

    Using the facts on repeated conditioning and the equivalent conditions for independence, we may produce a similar table of equivalent conditions for conditional independence. In the hybrid notation we use for repeated conditioning, we write

    (5.9)
    \[ P_C(A|B) = P_C(A) \quad \text{or} \quad P_C(AB) = P_C(A)P_C(B) \]

    This translates into

    (5.10)
    \[ P(A|BC) = P(A|C) \quad \text{or} \quad P(AB|C) = P(A|C)P(B|C) \]

    If it is known that C has occurred, then additional knowledge of the occurrence of B does not change the likelihood of A.

    If we write the sixteen equivalent conditions for independence in terms of the conditional probability measure \(P_C(\cdot) = P(\cdot|C)\), then translate as above, we have the following equivalent conditions.

    Table 5.1. Sixteen equivalent conditions
    \(P(A|BC) = P(A|C)\)            \(P(B|AC) = P(B|C)\)            \(P(AB|C) = P(A|C)P(B|C)\)
    \(P(A|B^cC) = P(A|C)\)          \(P(B^c|AC) = P(B^c|C)\)        \(P(AB^c|C) = P(A|C)P(B^c|C)\)
    \(P(A^c|BC) = P(A^c|C)\)        \(P(B|A^cC) = P(B|C)\)          \(P(A^cB|C) = P(A^c|C)P(B|C)\)
    \(P(A^c|B^cC) = P(A^c|C)\)      \(P(B^c|A^cC) = P(B^c|C)\)      \(P(A^cB^c|C) = P(A^c|C)P(B^c|C)\)
    Table 5.2.
    \(P(A|BC) = P(A|B^cC)\)    \(P(A^c|BC) = P(A^c|B^cC)\)    \(P(B|AC) = P(B|A^cC)\)    \(P(B^c|AC) = P(B^c|A^cC)\)

    The patterns of conditioning in the examples above belong to this set. In a given problem, one or the other of these conditions may seem a reasonable assumption. As soon as one of these patterns is recognized, then all are equally valid assumptions. Because of its simplicity and symmetry, we take as the defining condition the product rule \(P(AB|C) = P(A|C)P(B|C)\).

    Definition. A pair of events \(\{A, B\}\) is said to be conditionally independent, given C, designated \(\{A, B\} \text{ ci } | C\), iff the following product rule holds: \(P(AB|C) = P(A|C)P(B|C)\).

    The equivalence of the four entries in the right-hand column of the upper part of the table establishes the following rule.

    The replacement rule

    If any of the pairs \(\{A, B\}\), \(\{A, B^c\}\), \(\{A^c, B\}\), or \(\{A^c, B^c\}\) is conditionally independent, given C, then so are the others.

    This may be expressed by saying that if a pair is conditionally independent, we may replace either or both by their complements and still have a conditionally independent pair.

    To illustrate further the usefulness of this concept, we note some other common examples in which similar conditions hold: there is operational independence, but some chance factor which affects both.

    • Two contractors work quite independently on jobs in the same city. The operational independence suggests probabilistic independence. However, both jobs are outside and subject to delays due to bad weather. Suppose A is the event the first contractor completes his job on time and B is the event the second completes on time. If C is the event of “good” weather, then arguments similar to those in Examples 5.1 and 5.2 make it seem reasonable to suppose \(\{A, B\} \text{ ci } | C\) and \(\{A, B\} \text{ ci } | C^c\). Remark. In formal probability theory, an event must be sharply defined: on any trial it occurs or it does not. The event of “good weather” is not so clearly defined. Did a trace of rain or thunder in the area constitute bad weather? Did rain delay on one day in a month long project constitute bad weather? Even with this ambiguity, the pattern of probabilistic analysis may be useful.

    • A patient goes to a doctor. A preliminary examination leads the doctor to think there is a thirty percent chance the patient has a certain disease. The doctor orders two independent tests for conditions that indicate the disease. Are results of these tests really independent? There is certainly operational independence: the tests may be done by different laboratories, neither aware of the testing by the other. Yet, if the tests are meaningful, they must both be affected by the actual condition of the patient. Suppose D is the event the patient has the disease, A is the event the first test is positive (indicates the conditions associated with the disease) and B is the event the second test is positive. Then it would seem reasonable to suppose \(\{A, B\} \text{ ci } | D\) and \(\{A, B\} \text{ ci } | D^c\).

    In the examples considered so far, it has been reasonable to assume conditional independence, given an event C, and conditional independence, given the complementary event. But there are cases in which the effect of the conditioning event is asymmetric. We consider several examples.

    • Two students are working on a term paper. They work quite separately. They both need to borrow a certain book from the library. Let C be the event the library has two copies available. If A is the event the first completes on time and B the event the second is successful, then it seems reasonable to assume \(\{A, B\} \text{ ci } | C\). However, if only one book is available, then the two conditions would not be conditionally independent. In general \(P(B|AC^c) < P(B|C^c)\), since if the first student completes on time, then he or she must have been successful in getting the book, to the detriment of the second.

    • If the two contractors of the example above both need material which may be in scarce supply, then successful completion would be conditionally independent, given an adequate supply, whereas they would not be conditionally independent, given a short supply.

    • Two students in the same course take an exam. If they prepared separately, the events of each getting a good grade should be conditionally independent. If they study together, then the likelihoods of good grades would not be independent. With neither cheating nor collaborating on the test itself, if one does well, the other should also.

    Since conditional independence is ordinary independence with respect to a conditional probability measure, it should be clear how to extend the concept to larger classes of sets.

    Definition. A class \(\{A_i : i \in J\}\), where J is an arbitrary index set, is conditionally independent, given event C, denoted \(\{A_i : i \in J\} \text{ ci } | C\), iff the product rule holds for every finite subclass of two or more.

    As in the case of simple independence, the replacement rule extends.

    The replacement rule

    If the class \(\{A_i : i \in J\} \text{ ci } | C\), then any or all of the events \(A_i\) may be replaced by their complements and still have a conditionally independent class.

    The use of independence techniques

    Since conditional independence is independence, we may use independence techniques in the solution of problems. We consider two types of problems: an inference problem and a conditional Bernoulli sequence.

    Example 5.3 Use of independence techniques

    Sharon is investigating a business venture which she thinks has probability 0.7 of being successful. She checks with five “independent” advisers. If the prospects are sound, the probabilities are 0.8, 0.75, 0.6, 0.9, and 0.8 that the advisers will advise her to proceed; if the venture is not sound, the respective probabilities are 0.75, 0.85, 0.7, 0.9, and 0.7 that the advice will be negative. Given the quality of the project, the advisers are independent of one another in the sense that no one is affected by the others. Of course, they are not independent, for they are all related to the soundness of the venture. We may reasonably assume conditional independence of the advice, given that the venture is sound and also given that the venture is not sound. If Sharon goes with the majority of advisers, what is the probability she will make the right decision?

    SOLUTION

    If the project is sound, Sharon makes the right choice if three or more of the five advisers are positive. If the venture is unsound, she makes the right choice if three or more of the five advisers are negative. Let H = the event the project is sound, F = the event three or more advisers are positive, \(G = F^c\) = the event three or more are negative, and E = the event of the correct decision. Then

    (5.11)
    \[ P(E) = P(FH) + P(GH^c) = P(F|H)P(H) + P(G|H^c)P(H^c) \]

    Let \(E_i\) be the event the ith adviser is positive. Then \(P(F|H)\) is the sum of probabilities of the form \(P(M_k|H)\), where the \(M_k\) are minterms generated by the class \(\{E_i : 1 \le i \le 5\}\). Because of the assumed conditional independence,

    (5.12)
    \[ P(E_1 E_2^c E_3^c E_4 E_5 | H) = P(E_1|H)P(E_2^c|H)P(E_3^c|H)P(E_4|H)P(E_5|H) \]

    with similar expressions for each \(P(M_k|H)\) and \(P(M_k|H^c)\). This means that if we want the probability of three or more successes, given H, we can use ckn with the matrix of conditional probabilities. The following MATLAB solution of the investment problem is indicated.

    P1 = 0.01*[80 75 60 90 80];
    P2 = 0.01*[75 85 70 90 70];
    PH = 0.7;
    PE = ckn(P1,3)*PH + ckn(P2,3)*(1 - PH)
    PE =    0.9255
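
    For readers without the m-function ckn, the same calculation can be sketched in Python. The helper prob_at_least below is our own stand-in for what ckn is used for here (the probability that k or more of a class of independent events occur); it is not the author's implementation.

```python
def prob_at_least(probs, k):
    """P(at least k of the independent events occur), via the count distribution."""
    dist = [1.0]  # dist[j] = P(exactly j of the events seen so far occur)
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for j, q in enumerate(dist):
            new[j] += q * (1 - p)   # current event does not occur
            new[j + 1] += q * p     # current event occurs
        dist = new
    return sum(dist[k:])

p_pos_given_sound = [0.80, 0.75, 0.60, 0.90, 0.80]     # P(adviser i positive | H)
p_neg_given_unsound = [0.75, 0.85, 0.70, 0.90, 0.70]   # P(adviser i negative | H^c)
p_h = 0.7

# Right decision: majority positive when sound, majority negative when unsound.
p_correct = (prob_at_least(p_pos_given_sound, 3) * p_h
             + prob_at_least(p_neg_given_unsound, 3) * (1 - p_h))
print(round(p_correct, 4))  # 0.9255
```

    Conditional independence is what justifies treating the five advisers as independent trials within each of the two cases.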
    

    Often a Bernoulli sequence is related to some conditioning event H. In this case it is reasonable to assume the sequence \(\{E_i : 1 \le i \le n\} \text{ ci } | H\) and \(\text{ci } | H^c\). We consider a simple example.

    Example 5.4 Test of a claim

    A race track regular claims he can pick the winning horse in any race 90 percent of the time. In order to test his claim, he picks a horse to win in each of ten races. There are five horses in each race. If he is simply guessing, the probability of success on each race is 0.2. Consider the trials to constitute a Bernoulli sequence. Let H be the event he is correct in his claim. If S is the number of successes in picking the winners in the ten races, determine P(H|S=k) for various numbers k of correct picks. Suppose it is equally likely that his claim is valid or that he is merely guessing. We assume two conditional Bernoulli trials:

    Claim is valid:       Ten trials, probability \(p = P(E_i|H) = 0.9\).

    Guessing at random: Ten trials, probability \(p = P(E_i|H^c) = 0.2\).

    Let S= number of correct picks in ten trials. Then

    (5.13)
    \[ \frac{P(H|S=k)}{P(H^c|S=k)} = \frac{P(H)}{P(H^c)} \cdot \frac{P(S=k|H)}{P(S=k|H^c)}, \quad 0 \le k \le 10 \]

    Giving him the benefit of the doubt, we suppose \(P(H)/P(H^c) = 1\) and calculate the conditional odds.

    k = 0:10;
    Pk1 = ibinom(10,0.9,k);    % Probability of k successes, given H
    Pk2 = ibinom(10,0.2,k);    % Probability of k successes, given H^c
    OH  = Pk1./Pk2;            % Conditional odds-- Assumes P(H)/P(H^c) = 1
    e   = OH > 1;              % Selects favorable odds
    disp(round([k(e);OH(e)]'))
               6           2      % Needs at least six to have creditability
               7          73      % Seven would be creditable,
               8        2627      % even if P(H)/P(H^c) = 0.1
               9       94585
              10     3405063
    

    Under these assumptions, he would have to pick at least seven correctly to give reasonable validation of his claim.
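
    The conditional odds above can be reproduced without the m-function ibinom. The following Python sketch computes the binomial probabilities directly with the standard library; the function name binom_pmf is ours.

```python
from math import comb

def binom_pmf(n, p, k):
    """P(exactly k successes in n Bernoulli trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n = 10
prior_odds = 1.0  # P(H)/P(H^c), as assumed in the example

# Posterior odds P(H|S=k)/P(H^c|S=k) for k = 0, ..., 10.
posterior_odds = [prior_odds * binom_pmf(n, 0.9, k) / binom_pmf(n, 0.2, k)
                  for k in range(n + 1)]

for k, odds in enumerate(posterior_odds):
    if odds > 1:
        print(k, round(odds))
```

    Multiplying these values by a different prior odds value, such as 0.1, shows how strongly the conclusion depends on the prior.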

    5.2 Patterns of Probable Inference

    Some Patterns of Probable Inference

    We are concerned with the likelihood of some hypothesized condition. In general, we have evidence for the condition which can never be absolutely certain. We are forced to assess probabilities (likelihoods) on the basis of the evidence. Some typical examples:

    Table 5.3.
    HYPOTHESIS                EVIDENCE
    Job success               Personal traits
    Presence of oil           Geological structures
    Operation of a device     Physical condition
    Market condition          Test market condition
    Presence of a disease     Tests for symptoms

    If H is the event the hypothetical condition exists and E is the event the evidence occurs, the probabilities available are usually \(P(H)\) (or an odds value), \(P(E|H)\), and \(P(E|H^c)\). What is desired is \(P(H|E)\) or, equivalently, the odds \(P(H|E)/P(H^c|E)\). We simply use Bayes' rule to reverse the direction of conditioning.

    (5.14)
    \[ \frac{P(H|E)}{P(H^c|E)} = \frac{P(E|H)}{P(E|H^c)} \cdot \frac{P(H)}{P(H^c)} \]

    No conditional independence is involved in this case.

    Independent evidence for the hypothesized condition

    Suppose there are two “independent” bits of evidence. Now obtaining this evidence may be “operationally” independent, but if the items both relate to the hypothesized condition, then they cannot be really independent. The condition assumed is usually of the form \(P(E_1|H) = P(E_1|HE_2)\): if H occurs, then knowledge of \(E_2\) does not affect the likelihood of \(E_1\). Similarly, we usually have \(P(E_1|H^c) = P(E_1|H^cE_2)\). Thus \(\{E_1, E_2\} \text{ ci } | H\) and \(\{E_1, E_2\} \text{ ci } | H^c\).

    Example 5.5 Independent medical tests

    Suppose a doctor thinks the odds are 2/1 that a patient has a certain disease. She orders two independent tests. Let H be the event the patient has the disease and E1 and E2 be the events the tests are positive. Suppose the first test has probability 0.1 of a false positive and probability 0.05 of a false negative. The second test has probabilities 0.05 and 0.08 of false positive and false negative, respectively. If both tests are positive, what is the posterior probability the patient has the disease?

    SOLUTION

    Assuming \(\{E_1, E_2\} \text{ ci } | H\) and \(\text{ci } | H^c\), we work first in terms of the odds, then convert to probability.

    (5.15)
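    \[ \frac{P(H|E_1E_2)}{P(H^c|E_1E_2)} = \frac{P(H)}{P(H^c)} \cdot \frac{P(E_1E_2|H)}{P(E_1E_2|H^c)} = \frac{P(H)}{P(H^c)} \cdot \frac{P(E_1|H)P(E_2|H)}{P(E_1|H^c)P(E_2|H^c)} \]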

    The data are

    (5.16)
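    \[ \frac{P(H)}{P(H^c)} = 2, \quad P(E_1|H) = 0.95, \quad P(E_1|H^c) = 0.10, \quad P(E_2|H) = 0.92, \quad P(E_2|H^c) = 0.05 \]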

    Substituting values, we get

    (5.17)
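    \[ \frac{P(H|E_1E_2)}{P(H^c|E_1E_2)} = 2 \cdot \frac{0.95 \cdot 0.92}{0.10 \cdot 0.05} = \frac{1748}{5} \quad \text{so that} \quad P(H|E_1E_2) = \frac{1748}{1753} = 1 - \frac{5}{1753} \approx 0.9971 \]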
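
    The odds calculation for the two medical tests can be checked with a brief Python sketch. The probabilities are read directly from the statement of the example (false positive and false negative rates); the variable names are ours.

```python
prior_odds = 2.0                       # P(H)/P(H^c)
p_e1_h, p_e1_hc = 1 - 0.05, 0.10       # first test: 1 - false negative, false positive
p_e2_h, p_e2_hc = 1 - 0.08, 0.05       # second test: 1 - false negative, false positive

# Conditional independence given H and given H^c: the likelihoods multiply.
likelihood_ratio = (p_e1_h * p_e2_h) / (p_e1_hc * p_e2_hc)
posterior_odds = prior_odds * likelihood_ratio
posterior_prob = posterior_odds / (1 + posterior_odds)

print(round(posterior_odds, 1), round(posterior_prob, 4))
```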

    Evidence for a symptom

    Sometimes the evidence dealt with is not evidence for the hypothesized condition, but for some condition which is stochastically related. For purposes of exposition, we refer to this intermediary condition as a symptom. Consider again the examples above.

    Table 5.4.
    HYPOTHESIS                SYMPTOM                   EVIDENCE
    Job success               Personal traits           Diagnostic test results
    Presence of oil           Geological structures     Geophysical survey results
    Operation of a device     Physical condition        Monitoring report
    Market condition          Test market condition     Market survey result
    Presence of a disease     Physical symptom          Test for symptom

    We let S be the event the symptom is present. The usual case is that the evidence is directly related to the symptom and not the hypothesized condition. The diagnostic test results can say something about an applicant's personal traits, but cannot deal directly with the hypothesized condition. The test results would be the same whether or not the candidate is successful in the job (he or she does not have the job yet). A geophysical survey deals with certain structural features beneath the surface. If a fault or a salt dome is present, the geophysical results are the same whether or not there is oil present. The physical monitoring report deals with certain physical characteristics. Its reading is the same whether or not the device will fail. A market survey treats only the condition in the test market. The results depend upon the test market, not the national market. A blood test may be for certain physical conditions which frequently are related (at least statistically) to the disease. But the result of the blood test for the physical condition is not directly affected by the presence or absence of the disease.

    Under conditions of this type, we may assume

    (5.18)
    \[ P(E|SH) = P(E|SH^c) \quad \text{and} \quad P(E|S^cH) = P(E|S^cH^c) \]

    These imply \(\{E, H\} \text{ ci } | S\) and \(\{E, H\} \text{ ci } | S^c\). Now

    (5.19)
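    \[
    \begin{aligned}
    \frac{P(H|E)}{P(H^c|E)} &= \frac{P(HE)}{P(H^cE)} = \frac{P(HES) + P(HES^c)}{P(H^cES) + P(H^cES^c)} \\
    &= \frac{P(HS)P(E|HS) + P(HS^c)P(E|HS^c)}{P(H^cS)P(E|H^cS) + P(H^cS^c)P(E|H^cS^c)} \\
    &= \frac{P(HS)P(E|S) + P(HS^c)P(E|S^c)}{P(H^cS)P(E|S) + P(H^cS^c)P(E|S^c)}
    \end{aligned}
    \]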

    It is worth noting that each term in the denominator differs from the corresponding term in the numerator by having \(H^c\) in place of H. Before completing the analysis, it is necessary to consider how H and S are related stochastically in the data. Four cases may be considered.

    a. Data are \(P(S|H)\), \(P(S|H^c)\), and \(P(H)\).

    b. Data are \(P(S|H)\), \(P(S|H^c)\), and \(P(S)\).

    c. Data are \(P(H|S)\), \(P(H|S^c)\), and \(P(S)\).

    d. Data are \(P(H|S)\), \(P(H|S^c)\), and \(P(H)\).

    Case a:
    (5.20)
    \[ \frac{P(H|E)}{P(H^c|E)} = \frac{P(H)}{P(H^c)} \cdot \frac{P(S|H)P(E|S) + P(S^c|H)P(E|S^c)}{P(S|H^c)P(E|S) + P(S^c|H^c)P(E|S^c)} \]
    Example 5.6 Geophysical survey

    Let H be the event of a successful oil well, S be the event there is a geophysical structure favorable to the presence of oil, and E be the event the geophysical survey indicates a favorable structure. We suppose \(\{H, E\} \text{ ci } | S\) and \(\{H, E\} \text{ ci } | S^c\). Data are

    (5.21)
    \[ \frac{P(H)}{P(H^c)} = 3, \quad P(S|H) = 0.92, \quad P(S|H^c) = 0.20, \quad P(E|S) = 0.95, \quad P(E|S^c) = 0.15 \]

    Then

    (5.22)
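    \[ \frac{P(H|E)}{P(H^c|E)} = 3 \cdot \frac{0.92 \cdot 0.95 + 0.08 \cdot 0.15}{0.20 \cdot 0.95 + 0.80 \cdot 0.15} = \frac{1329}{155} = 8.5742 \]
    (5.23)
    \[ P(H|E) = 1 - \frac{1}{1 + 1329/155} = 1 - \frac{155}{1484} = 0.8956 \]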

    The geophysical result moved the prior odds of 3/1 to posterior odds of 8.6/1, with a corresponding change of probabilities from 0.75 to 0.90.
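
    The case a pattern lends itself to a small reusable function. The Python sketch below is ours; the numerical data are assumed values that are consistent with the prior odds of 3/1 and the posterior odds of about 8.6/1 quoted above.

```python
def posterior_odds_case_a(prior_odds, p_s_h, p_s_hc, p_e_s, p_e_sc):
    """Odds P(H|E)/P(H^c|E), assuming {H, E} ci | S and {H, E} ci | S^c."""
    num = p_s_h * p_e_s + (1 - p_s_h) * p_e_sc     # P(E|H)
    den = p_s_hc * p_e_s + (1 - p_s_hc) * p_e_sc   # P(E|H^c)
    return prior_odds * num / den

# Assumed data for the geophysical survey example.
odds = posterior_odds_case_a(prior_odds=3.0, p_s_h=0.92, p_s_hc=0.20,
                             p_e_s=0.95, p_e_sc=0.15)
prob = odds / (1 + odds)
print(round(odds, 4), round(prob, 4))
```

    The numerator and denominator are the total probabilities of the evidence under H and under its complement, obtained by splitting on the symptom.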

    Case b: Data are \(P(S)\), \(P(S|H)\), \(P(S|H^c)\), \(P(E|S)\), and \(P(E|S^c)\). If we can determine \(P(H)\), we can proceed as in case a. Now by the law of total probability
    (5.24)
    \[ P(S) = P(S|H)P(H) + P(S|H^c)[1 - P(H)] \]
    which may be solved algebraically to give
    (5.25)
    \[ P(H) = \frac{P(S) - P(S|H^c)}{P(S|H) - P(S|H^c)} \]
    Example 5.7 Geophysical survey revisited

    In many cases a better estimate of \(P(S)\) or the odds \(P(S)/P(S^c)\) can be made on the basis of previous geophysical data. Suppose the prior odds for S are 3/1, so that \(P(S) = 0.75\). Using the other data in Example 5.6, we have

    (5.26)
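    \[ P(H) = \frac{P(S) - P(S|H^c)}{P(S|H) - P(S|H^c)} = \frac{0.75 - 0.20}{0.92 - 0.20} = \frac{55}{72} \quad \text{so that} \quad \frac{P(H)}{P(H^c)} = \frac{55}{17} \]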

    Using the pattern of case a, we have

    (5.27)
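    \[ \frac{P(H|E)}{P(H^c|E)} = \frac{55}{17} \cdot \frac{0.92 \cdot 0.95 + 0.08 \cdot 0.15}{0.20 \cdot 0.95 + 0.80 \cdot 0.15} = \frac{4873}{527} = 9.2467 \]
    (5.28)
    \[ P(H|E) = 1 - \frac{1}{1 + 4873/527} = 1 - \frac{527}{5400} = 0.9024 \]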
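
    Case b adds one step to case a: recover P(H) from P(S) by solving the law of total probability, then proceed as before. A Python sketch of that two-step pattern follows; the conditional probabilities are the same assumed values used for the geophysical survey, and the function name is ours.

```python
def prior_h_from_s(p_s, p_s_h, p_s_hc):
    """Solve P(S) = P(S|H)P(H) + P(S|H^c)(1 - P(H)) for P(H)."""
    return (p_s - p_s_hc) / (p_s_h - p_s_hc)

# Assumed data: P(S) = 0.75, with the conditional probabilities of the survey example.
p_s_h, p_s_hc = 0.92, 0.20
p_e_s, p_e_sc = 0.95, 0.15

p_h = prior_h_from_s(0.75, p_s_h, p_s_hc)
prior_odds = p_h / (1 - p_h)

# Case a pattern applied with the recovered prior odds.
num = p_s_h * p_e_s + (1 - p_s_h) * p_e_sc
den = p_s_hc * p_e_s + (1 - p_s_hc) * p_e_sc
odds = prior_odds * num / den
prob = odds / (1 + odds)
print(round(p_h, 4), round(odds, 4), round(prob, 4))
```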
    Usually data relating test results to symptom are of the form \(P(E|S)\) and \(P(E|S^c)\), or equivalent. Data relating the symptom and the hypothesized condition may go either way. In cases a and b, the data are in the form \(P(S|H)\) and \(P(S|H^c)\), or equivalent, derived from data showing the fraction of times the symptom is noted when the hypothesized condition is identified. But these data may go in the opposite direction, yielding \(P(H|S)\) and \(P(H|S^c)\), or equivalent. This is the situation in cases c and d.
    Case c: Data are \(P(E|S)\), \(P(E|S^c)\), \(P(H|S)\), \(P(H|S^c)\), and \(P(S)\).
    Example 5.8 Evidence for a disease symptom with prior P(S)

    When a certain blood syndrome is observed, a given disease is indicated 93 percent of the time. The disease is found without this syndrome only three percent of the time. A test for the syndrome has probability 0.03 of a false positive and 0.05 of a false negative. A preliminary examination indicates a probability 0.30 that a patient has the syndrome. A test is performed; the result is negative. What is the probability the patient has the disease?

    SOLUTION

    In terms of the notation above, the data are

    (5.29)
    \[ P(H|S) = 0.93, \quad P(H|S^c) = 0.03, \quad P(E|S^c) = 0.03 \]
    (5.30)
    \[ P(E^c|S) = 0.05, \quad P(S) = 0.30 \]

    We suppose \(\{H, E\} \text{ ci } | S\) and \(\{H, E\} \text{ ci } | S^c\).

    (5.31)
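    \[ \frac{P(H|E^c)}{P(H^c|E^c)} = \frac{P(S)P(H|S)P(E^c|S) + P(S^c)P(H|S^c)P(E^c|S^c)}{P(S)P(H^c|S)P(E^c|S) + P(S^c)P(H^c|S^c)P(E^c|S^c)} \]
    (5.32)
    \[ = \frac{0.30 \cdot 0.93 \cdot 0.05 + 0.70 \cdot 0.03 \cdot 0.97}{0.30 \cdot 0.07 \cdot 0.05 + 0.70 \cdot 0.97 \cdot 0.97} = \frac{429}{8246} \]

    which implies \(P(H|E^c) = 429/8675 \approx 0.05\).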
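
    The case c pattern can be written as a short Python function. The sketch below is ours; the data are those stated in the example, with the observed result being a negative test, so the test probabilities are those of a negative result given S and given its complement.

```python
def posterior_odds_case_c(p_s, p_h_s, p_h_sc, p_obs_s, p_obs_sc):
    """Odds P(H|obs)/P(H^c|obs), assuming {H, obs} ci | S and ci | S^c.

    p_obs_s and p_obs_sc are the probabilities of the observed test result
    given S and given S^c, respectively.
    """
    num = p_s * p_h_s * p_obs_s + (1 - p_s) * p_h_sc * p_obs_sc
    den = p_s * (1 - p_h_s) * p_obs_s + (1 - p_s) * (1 - p_h_sc) * p_obs_sc
    return num / den

# Negative result: P(E^c|S) = 0.05 (false negative), P(E^c|S^c) = 1 - 0.03.
odds = posterior_odds_case_c(p_s=0.30, p_h_s=0.93, p_h_sc=0.03,
                             p_obs_s=0.05, p_obs_sc=1 - 0.03)
prob = odds / (1 + odds)
print(round(odds, 4), round(prob, 4))
```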

    Case d: This differs from case c only in the fact that a prior probability for H is assumed. In this case, we determine the corresponding probability for S by
    (5.33)
    \[ P(S) = \frac{P(H) - P(H|S^c)}{P(H|S) - P(H|S^c)} \]
    and use the pattern of case c.
    Example 5.9 Evidence for a disease symptom with prior P(H)

    Suppose for the patient in Example 5.8 the physician estimates the odds favoring the presence of the disease are 1/3, so that \(P(H) = 0.25\). Again, the test result is negative. Determine the posterior odds, given \(E^c\).

    SOLUTION

    First we determine

    (5.34)
    \[ P(S) = \frac{P(H) - P(H|S^c)}{P(H|S) - P(H|S^c)} = \frac{0.25 - 0.03}{0.93 - 0.03} = \frac{11}{45} \]

    Then

    (5.35)
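    \[ \frac{P(H|E^c)}{P(H^c|E^c)} = \frac{(11/45) \cdot 0.93 \cdot 0.05 + (34/45) \cdot 0.03 \cdot 0.97}{(11/45) \cdot 0.07 \cdot 0.05 + (34/45) \cdot 0.97 \cdot 0.97} = \frac{15009}{320291} = 0.0469 \]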

    The result of the test drops the prior odds of 1/3 to approximately 1/21.
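
    Case d differs from case c only in the preliminary step of recovering P(S) from P(H). A Python sketch of the whole calculation, using the data of the two disease examples (variable names are ours):

```python
p_h = 0.25                        # prior from odds 1/3
p_h_s, p_h_sc = 0.93, 0.03        # P(H|S), P(H|S^c)
p_neg_s, p_neg_sc = 0.05, 0.97    # P(E^c|S), P(E^c|S^c)

# Solve P(H) = P(H|S)P(S) + P(H|S^c)(1 - P(S)) for P(S).
p_s = (p_h - p_h_sc) / (p_h_s - p_h_sc)

# Case c pattern with the negative test result as the evidence.
num = p_s * p_h_s * p_neg_s + (1 - p_s) * p_h_sc * p_neg_sc
den = p_s * (1 - p_h_s) * p_neg_s + (1 - p_s) * (1 - p_h_sc) * p_neg_sc
odds = num / den
print(round(p_s, 4), round(odds, 4), round(1 / odds, 1))
```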

    Independent evidence for a symptom

    In the previous cases, we consider only a single item of evidence for a symptom. But it may be desirable to have a “second opinion.” We suppose the tests are for the symptom and are not directly related to the hypothetical condition. If the tests are operationally independent, we could reasonably assume

    (5.36)
    \[ P(E_1|SE_2) = P(E_1|SE_2^c), \quad P(E_1|SH) = P(E_1|SH^c), \quad P(E_2|SH) = P(E_2|SH^c), \quad P(E_1E_2|SH) = P(E_1E_2|SH^c) \]

    This implies \(\{E_1, E_2, H\} \text{ ci } | S\). A similar condition holds for \(S^c\). As for a single test, there are four cases, depending on the tie between S and H. We consider a "case a" example.

    Example 5.10 A market survey problem

    A food company is planning to market nationally a new breakfast cereal. Its executives feel confident that the odds are at least 3 to 1 the product would be successful. Before launching the new product, the company decides to investigate a test market. Previous experience indicates that the reliability of the test market is such that if the national market is favorable, there is probability 0.9 that the test market is also. On the other hand, if the national market is unfavorable, there is a probability of only 0.2 that the test market will be favorable. These facts lead to the following analysis. Let

              H be the event the national market is favorable (hypothesis)

              S be the event the test market is favorable (symptom)

    The initial data are the following probabilities, based on past experience:

    •      (a) Prior odds: \(P(H)/P(H^c) = 3\)

    •      (b) Reliability of the test market: \(P(S|H) = 0.9\), \(P(S|H^c) = 0.2\)

    If it were known that the test market is favorable, we should have

    (5.37)
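    \[ \frac{P(H|S)}{P(H^c|S)} = \frac{P(H)}{P(H^c)} \cdot \frac{P(S|H)}{P(S|H^c)} = 3 \cdot \frac{0.9}{0.2} = 13.5 \quad \text{so that} \quad P(H|S) = \frac{13.5}{1 + 13.5} = 0.9310 \]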

    Unfortunately, it is not feasible to know with certainty the state of the test market. The company decision makers engage two market survey companies to make independent surveys of the test market. The reliability of the companies may be expressed as follows. Let

              \(E_1\) be the event the first company reports a favorable test market.

              \(E_2\) be the event the second company reports a favorable test market.

    On the basis of previous experience, the reliability of the evidence about the test market (the symptom) is expressed in the following conditional probabilities.

    (5.38)
    \[ P(E_1|S) = 0.9, \quad P(E_1|S^c) = 0.3, \quad P(E_2|S) = 0.8, \quad P(E_2|S^c) = 0.2 \]

    Both survey companies report that the test market is favorable. What is the probability the national market is favorable, given this result?

    SOLUTION

    The two survey firms work in an “operationally independent” manner. The report of either company is unaffected by the work of the other. Also, each report is affected only by the condition of the test market— regardless of what the national market may be. According to the discussion above, we should be able to assume

    (5.39)
    \[ \{E_1, E_2, H\} \text{ ci } | S \quad \text{and} \quad \{E_1, E_2, H\} \text{ ci } | S^c \]

    We may use a pattern similar to that in Example 5.6, as follows:

    (5.40)
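    \[ \frac{P(H|E_1E_2)}{P(H^c|E_1E_2)} = \frac{P(HE_1E_2)}{P(H^cE_1E_2)} = \frac{P(H)P(S|H)P(E_1|S)P(E_2|S) + P(H)P(S^c|H)P(E_1|S^c)P(E_2|S^c)}{P(H^c)P(S|H^c)P(E_1|S)P(E_2|S) + P(H^c)P(S^c|H^c)P(E_1|S^c)P(E_2|S^c)} \]
    (5.41)
    \[ = \frac{0.75 \cdot 0.90 \cdot 0.90 \cdot 0.80 + 0.75 \cdot 0.10 \cdot 0.30 \cdot 0.20}{0.25 \cdot 0.20 \cdot 0.90 \cdot 0.80 + 0.25 \cdot 0.80 \cdot 0.30 \cdot 0.20} = \frac{327}{32} \approx 10.22 \]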

    In terms of the posterior probability, we have

    (5.42)
    \[ P(H|E_1E_2) = 1 - \frac{1}{1 + 327/32} = \frac{327}{359} = 0.91 \]

    We note that the odds favoring H, given positive indications from both survey companies, are 10.2, as compared with the odds favoring H, given a favorable test market, of 13.5. The difference reflects the residual uncertainty about the test market after the market surveys. Nevertheless, the results of the market surveys increase the odds favoring a satisfactory market from the prior 3 to 1 to a posterior 10.2 to 1. In terms of probabilities, the market surveys increase the likelihood of a favorable market from the original \(P(H) = 0.75\) to the posterior \(P(H|E_1E_2) = 0.91\). The conditional independence of the results of the survey makes possible direct use of the data.
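
    The market survey calculation can be sketched in Python as follows. The survey reliabilities for the two companies are assumed values, chosen to be consistent with the posterior odds of 10.2 quoted above; the prior and the test market reliabilities come from the statement of the example. The helper joint is ours.

```python
p_h = 0.75                      # prior odds 3 to 1
p_s_h, p_s_hc = 0.9, 0.2        # P(S|H), P(S|H^c)
p_e1_s, p_e1_sc = 0.9, 0.3      # assumed: P(E1|S), P(E1|S^c)
p_e2_s, p_e2_sc = 0.8, 0.2      # assumed: P(E2|S), P(E2|S^c)

def joint(p_hyp, p_s_given_hyp):
    """P(hyp E1 E2), splitting on S and using {E1, E2, H} ci | S and ci | S^c."""
    return p_hyp * (p_s_given_hyp * p_e1_s * p_e2_s
                    + (1 - p_s_given_hyp) * p_e1_sc * p_e2_sc)

odds = joint(p_h, p_s_h) / joint(1 - p_h, p_s_hc)
prob = odds / (1 + odds)
odds_if_s_known = (p_h / (1 - p_h)) * p_s_h / p_s_hc
print(round(odds, 2), round(prob, 2), round(odds_if_s_known, 1))
```

    Comparing the last two quantities shows the cost of observing the test market only through imperfect surveys.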

    A classification problem

    A population consists of members of two subgroups. It is desired to formulate a battery of questions to aid in identifying the subclass membership of randomly selected individuals in the population. The questions are designed so that for each individual the answers are independent, in the sense that the answers to any subset of these questions are not affected by and do not affect the answers to any other subset of the questions. The answers are, however, affected by the subgroup membership. Thus, our treatment of conditional independence suggests that it is reasonable to suppose the answers are conditionally independent, given the subgroup membership. Consider the following numerical example.

    Example 5.11 A classification problem

    A sample of 125 subjects is taken from a population which has two subgroups. The subgroup membership of each subject in the sample is known. Each individual is asked a battery of ten questions designed to be independent, in the sense that the answer to any one is not affected by the answer to any other. The subjects answer independently. Data on the results are summarized in the following table:

    Table 5.5.
           GROUP 1 (69 members)     GROUP 2 (56 members)
    Q      Yes    No    Unc.        Yes    No    Unc.
    1       42    22      5          20    31      5
    2       34    27      8          16    37      3
    3       15    45      9          33    19      4
    4       19    44      6          31    18      7
    5       22    43      4          23    28      5
    6       41    13     15          14    37      5
    7        9    52      8          31    17      8
    8       40    26      3          13    38      5
    9       48    12      9          27    24      5
    10      20    37     12          35    16      5

    Assume the data represent the general population consisting of these two groups, so that the data may be used to calculate probabilities and conditional probabilities.

    Several persons are interviewed. The result of each interview is a “profile” of answers to the questions. The goal is to classify the person in one of the two subgroups on the basis of the profile of answers.

    The following profiles were taken.

    • Y, N, Y, N, Y, U, N, U, Y, U

    • N, N, U, N, Y, Y, U, N, N, Y

    • Y, Y, N, Y, U, U, N, N, Y, Y

    Classify each individual in one of the subgroups.

    SOLUTION

    Let \(G_1\) = the event the person selected is from group 1, and \(G_2 = G_1^c\) = the event the person selected is from group 2. Let

              Ai= the event the answer to the ith question is “Yes”

              Bi= the event the answer to the ith question is “No”

              Ci= the event the answer to the ith question is “Uncertain”

    The data are taken to mean \(P(A_1|G_1) = 42/69\), \(P(B_3|G_2) = 19/56\), etc. The profile

    Y, N, Y, N, Y, U, N, U, Y, U corresponds to the event \(E = A_1B_2A_3B_4A_5C_6B_7C_8A_9C_{10}\)

    We utilize the ratio form of Bayes' rule to calculate the posterior odds

    (5.43)
    \[ \frac{P(G_1|E)}{P(G_2|E)} = \frac{P(G_1)}{P(G_2)} \cdot \frac{P(E|G_1)}{P(E|G_2)} \]

    If the ratio is greater than one, classify in group 1; otherwise classify in group 2 (we assume that a ratio exactly one is so unlikely that we can neglect it). Because of conditional independence, we are able to determine the conditional probabilities

    (5.44)
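    \[ P(E|G_1) = P(A_1|G_1)P(B_2|G_1)P(A_3|G_1)P(B_4|G_1)P(A_5|G_1)P(C_6|G_1)P(B_7|G_1)P(C_8|G_1)P(A_9|G_1)P(C_{10}|G_1) = \frac{42 \cdot 27 \cdot 15 \cdot 44 \cdot 22 \cdot 15 \cdot 52 \cdot 3 \cdot 48 \cdot 12}{69^{10}} \]
    (5.45)
    \[ P(E|G_2) = P(A_1|G_2)P(B_2|G_2)P(A_3|G_2)P(B_4|G_2)P(A_5|G_2)P(C_6|G_2)P(B_7|G_2)P(C_8|G_2)P(A_9|G_2)P(C_{10}|G_2) = \frac{20 \cdot 37 \cdot 33 \cdot 18 \cdot 23 \cdot 5 \cdot 17 \cdot 5 \cdot 27 \cdot 5}{56^{10}} \]

    The odds \(P(G_1)/P(G_2) = 69/56\). We find the posterior odds to be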

    (5.46)
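    \[ \frac{P(G_1|E)}{P(G_2|E)} = \frac{42 \cdot 27 \cdot 15 \cdot 44 \cdot 22 \cdot 15 \cdot 52 \cdot 3 \cdot 48 \cdot 12}{20 \cdot 37 \cdot 33 \cdot 18 \cdot 23 \cdot 5 \cdot 17 \cdot 5 \cdot 27 \cdot 5} \cdot \frac{56^9}{69^9} = 5.85 \]

    The factor \(56^9/69^9\) comes from multiplying \(56^{10}/69^{10}\) by the odds \(P(G_1)/P(G_2) = 69/56\). Since the resulting posterior odds favoring Group 1 is greater than one, we classify the respondent in group 1.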
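
    The same posterior odds can be computed for any profile with a few lines of Python. This sketch is ours and is not the m-procedure described in the text; it uses the frequency data of Table 5.5 and assumes, as in the solution, conditional independence of the answers given the group.

```python
# Frequencies of (Yes, No, Uncertain) for questions 1..10, from Table 5.5.
A = [[42, 22, 5], [34, 27, 8], [15, 45, 9], [19, 44, 6], [22, 43, 4],
     [41, 13, 15], [9, 52, 8], [40, 26, 3], [48, 12, 9], [20, 37, 12]]   # group 1
B = [[20, 31, 5], [16, 37, 3], [33, 19, 4], [31, 18, 7], [23, 28, 5],
     [14, 37, 5], [31, 17, 8], [13, 38, 5], [27, 24, 5], [35, 16, 5]]    # group 2
COLUMN = {"Y": 0, "N": 1, "U": 2}

def odds_group1(profile, freq1=A, freq2=B):
    """Posterior odds P(G1|E)/P(G2|E) for a profile such as 'YNYNYUNUYU'."""
    n1, n2 = sum(freq1[0]), sum(freq2[0])   # group sizes (69 and 56)
    odds = n1 / n2                          # prior odds P(G1)/P(G2)
    for q, answer in enumerate(profile):
        j = COLUMN[answer]
        odds *= (freq1[q][j] / n1) / (freq2[q][j] / n2)
    return odds

profiles = ["YNYNYUNUYU", "NNUNYYUNNY", "YYNYUUNNYY"]
results = [odds_group1(p) for p in profiles]
for p, r in zip(profiles, results):
    print(p, round(r, 4), "group 1" if r > 1 else "group 2")
```

    Representing each answer by a column index is the same translation step that the text describes for the matrices A and B.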

    While the calculations are simple and straightforward, they are tedious and error prone. To make rapid solution possible, say in a situation where successive interviews are underway, we have several m-procedures for performing the calculations. Answers to the questions would normally be recorded as Y for yes, N for no, and U for uncertain. In order for the m-procedure to work, these answers must be represented by numbers indicating the appropriate columns in matrices A and B. Thus, in the example under consideration, each Y must be translated into a 1, each N into a 2, and each U into a 3. The task is not particularly difficult, but it is much easier to have MATLAB make the translation as well as do the calculations. The following two-stage approach for solving the problem works well.

    The first m-procedure oddsdf sets up the frequency information. The next m-procedure odds calculates the odds for a given profile. The advantage of splitting into two m-procedures is that we can set up the data once, then call repeatedly for the calculations for different profiles. As always, it is necessary to have the data in an appropriate form. The following is an example in which the data are entered in terms of actual frequencies of response.

    % file oddsf4.m
    % Frequency data for classification
    A = [42 22 5; 34 27 8; 15 45 9; 19 44 6; 22 43 4;
         41 13 15; 9 52 8; 40 26 3; 48 12 9; 20 37 12];
    B = [20 31 5; 16 37 3; 33 19 4; 31 18 7; 23 28 5;
         14 37 5; 31 17 8; 13 38 5; 27 24 5; 35 16 5];
    disp('Call for oddsdf')
    
    Example 5.12 Classification using frequency data
    oddsf4              % Call for data in file oddsf4.m
    Call for oddsdf     % Prompt built into data file
    oddsdf              % Call for m-procedure oddsdf
    Enter matrix A of frequencies for calibration group 1  A
    Enter matrix B of frequencies for calibration group 2  B
    Number of questions = 10
    Answers per question = 3
     Enter code for answers and call for procedure "odds"
    y = 1;              % Use of lower case for easier writing
    n = 2;
    u = 3;
    odds                % Call for calculating procedure
    Enter profile matrix E  [y n y n y u n u y u]   % First profile
    Odds favoring Group 1:   5.845
    Classify in Group 1
    odds                % Second call for calculating procedure
    Enter profile matrix E  [n n u n y y u n n y]   % Second profile
    Odds favoring Group 1:   0.2383
    Classify in Group 2
    odds                % Third call for calculating procedure
    Enter profile matrix E  [y y n y u u n n y y]   % Third profile
    Odds favoring Group 1:   5.05
    Classify in Group 1
    

    The principal feature of the m-procedure odds is the scheme for selecting the numbers from the A and B matrices. If E = [y y n y u u n n y y], then the coding translates this into the actual numerical matrix

    [1 1 2 1 3 3 2 2 1 1] used internally. Then A(:,E) is a matrix with columns corresponding to elements of E. Thus

    e = A(:,E)
    e =   42    42    22    42     5     5    22    22    42    42
          34    34    27    34     8     8    27    27    34    34
          15    15    45    15     9     9    45    45    15    15
          19    19    44    19     6     6    44    44    19    19
          22    22    43    22     4     4    43    43    22    22
          41    41    13    41    15    15    13    13    41    41
           9     9    52     9     8     8    52    52     9     9
          40    40    26    40     3     3    26    26    40    40
          48    48    12    48     9     9    12    12    48    48
          20    20    37    20    12    12    37    37    20    20
    

    The ith entry in the ith column of A(:,E) is the count corresponding to the answer to the ith question. For example, the answer to the third question is N (no), and the corresponding count is the third entry in the N (second) column of A. The element on the diagonal in the third column of A(:,E) is the third element in that column, and hence the desired third entry of the N column. By picking out the elements on the diagonal with the command diag(A(:,E)), we obtain the set of counts corresponding to the profile. The same is true for diag(B(:,E)).
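    The diagonal selection can be tried directly on the data from oddsf4.m. The following is a sketch, not the code of the odds m-procedure, whose source is not shown here. It also shows an equivalent selection with sub2ind, which is our own alternative and avoids building the 10 by 10 intermediate matrix.

```matlab
% Frequency data from oddsf4.m (rows = questions, columns = Y, N, U).
A = [42 22 5; 34 27 8; 15 45 9; 19 44 6; 22 43 4;
     41 13 15; 9 52 8; 40 26 3; 48 12 9; 20 37 12];
B = [20 31 5; 16 37 3; 33 19 4; 31 18 7; 23 28 5;
     14 37 5; 31 17 8; 13 38 5; 27 24 5; 35 16 5];
y = 1;  n = 2;  u = 3;                 % answer codes = column numbers
E = [y y n y u u n n y y];             % third profile

a = diag(A(:,E))';                     % Group 1 counts along the profile
b = diag(B(:,E))';                     % Group 2 counts along the profile

% Alternative (an assumption of this sketch): pick entry (i, E(i)) directly.
q  = 1:numel(E);
a2 = A(sub2ind(size(A), q, E));        % same values as a

n1 = sum(A(1,:));                      % group sizes from a row total
n2 = sum(B(1,:));
posterior_odds = prod(a/n1) / prod(b/n2) * (n1/n2);
fprintf('Odds favoring Group 1: %.4g\n', posterior_odds);
```

    For a ten-question profile the cost of forming A(:,E) is negligible; the direct indexing only matters for much longer questionnaires.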

    Sometimes the data are given in terms of conditional probabilities and probabilities. A slight modification of the procedure handles this case. For purposes of comparison, we convert the problem above to this form by converting the counts in matrices A and B to conditional probabilities. We do this by dividing by the total count in each group (69 and 56 in this case). Also, $P(G_1) = 69/125 = 0.552$ and $P(G_2) = 56/125 = 0.448$.

    Table 5.6.
             GROUP 1, P(G1) = 69/125           GROUP 2, P(G2) = 56/125
    Q        Yes       No        Unc.          Yes       No        Unc.
    1        0.6087    0.3188    0.0725        0.3571    0.5536    0.0893
    2        0.4928    0.3913    0.1159        0.2857    0.6607    0.0536
    3        0.2174    0.6522    0.1304        0.5893    0.3393    0.0714
    4        0.2754    0.6376    0.0870        0.5536    0.3214    0.1250
    5        0.3188    0.6232    0.0580        0.4107    0.5000    0.0893
    6        0.5942    0.1884    0.2174        0.2500    0.6607    0.0893
    7        0.1304    0.7536    0.1160        0.5536    0.3036    0.1428
    8        0.5797    0.3768    0.0435        0.2321    0.6786    0.0893
    9        0.6957    0.1739    0.1304        0.4821    0.4286    0.0893
    10       0.2899    0.5362    0.1739        0.6250    0.2857    0.0893

    These data are in an m-file oddsp4.m. The modified setup m-procedure oddsdp uses the conditional probabilities, then calls for the m-procedure odds.

    Example 5.13 Calculation using conditional probability data
    oddsp4                 % Call for converted data (probabilities)
    oddsdp                 % Setup m-procedure for probabilities
    Enter conditional probabilities for Group 1  A
    Enter conditional probabilities for Group 2  B
    Probability p1 individual is from Group 1  0.552
     Number of questions = 10
     Answers per question = 3
     Enter code for answers and call for procedure "odds"
    y = 1;
    n = 2;
    u = 3;
    odds
    Enter profile matrix E  [y n y n y u n u y u]
    Odds favoring Group 1:  5.845
    Classify in Group 1
    

    The slight discrepancy in the odds favoring Group 1 (5.8454 compared with 5.8452) can be attributed to rounding of the conditional probabilities to four places. The presentation above rounds the results to 5.845 in each case, so the discrepancy is not apparent. This is quite acceptable, since the discrepancy has no effect on the results.
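    The effect of rounding can be explored with a short sketch that converts the counts to conditional probabilities and compares the odds computed both ways. It rounds each ratio to four places with round, so its rounded table may differ in the last digit from a few entries of Table 5.6, and the rounded odds need not match the value quoted above digit for digit. All variable names are illustrative.

```matlab
% Frequency data from oddsf4.m (rows = questions, columns = Y, N, U).
A = [42 22 5; 34 27 8; 15 45 9; 19 44 6; 22 43 4;
     41 13 15; 9 52 8; 40 26 3; 48 12 9; 20 37 12];
B = [20 31 5; 16 37 3; 33 19 4; 31 18 7; 23 28 5;
     14 37 5; 31 17 8; 13 38 5; 27 24 5; 35 16 5];
n1 = sum(A(1,:));                      % 69
n2 = sum(B(1,:));                      % 56
p1 = n1/(n1 + n2);                     % P(G1) = 0.552

PA = round(1e4*A/n1)/1e4;              % conditional probabilities, 4 places
PB = round(1e4*B/n2)/1e4;

E   = [1 2 1 2 1 3 2 3 1 3];           % profile y n y n y u n u y u
idx = sub2ind(size(A), 1:numel(E), E); % entry (i, E(i)) for each question

odds_exact   = prod(A(idx)/n1) / prod(B(idx)/n2) * p1/(1 - p1);
odds_rounded = prod(PA(idx))   / prod(PB(idx))   * p1/(1 - p1);
fprintf('exact: %.4f   rounded: %.4f\n', odds_exact, odds_rounded);
```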

    5.3 Problems on Conditional Independence*

    Suppose {A,B} ci |C and {A,B} ci |$C^c$, P(C)=0.7, and

    (5.47)
    $$P(A|C) = 0.4, \quad P(B|C) = 0.6, \quad P(A|C^c) = 0.3, \quad P(B|C^c) = 0.2$$

    Show whether or not the pair {A,B} is independent.

    $P(A) = P(A|C)P(C) + P(A|C^c)P(C^c)$, $P(B) = P(B|C)P(C) + P(B|C^c)P(C^c)$, and $P(AB) = P(A|C)P(B|C)P(C) + P(A|C^c)P(B|C^c)P(C^c)$.

    PA = 0.4*0.7 + 0.3*0.3
    PA =  0.3700
    PB = 0.6*0.7 + 0.2*0.3
    PB =  0.4800
    PA*PB
    ans = 0.1776
    PAB = 0.4*0.6*0.7 + 0.3*0.2*0.3
    PAB = 0.1860       % PAB not equal PA*PB;  not independent
    

    Suppose _autogen-svg2png-0008.png and _autogen-svg2png-0009.png, with P(C)=0.4, and

    (5.48)
    _autogen-svg2png-0011.png

    Determine the posterior odds _autogen-svg2png-0012.png.

    (5.49)
    _autogen-svg2png-0013.png
    (5.50)
    _autogen-svg2png-0014.png

    Five world class sprinters are entered in a 200 meter dash. Each has a good chance to break the current track record. There is a thirty percent chance a late cold front will move in, bringing conditions that adversely affect the runners. Otherwise, conditions are expected to be favorable for an outstanding race. Their respective probabilities of breaking the record are:

    • Good weather (no front): 0.75, 0.80, 0.65, 0.70, 0.85

    • Poor weather (front in): 0.60, 0.65, 0.50, 0.55, 0.70

    The performances are (conditionally) independent, given good weather, and also, given poor weather. What is the probability that three or more will break the track record?

    Hint. If $B_3$ is the event of three or more and W is the event of good weather (no front), $P(B_3) = P(B_3|W)P(W) + P(B_3|W^c)P(W^c)$.

    PW = 0.01*[75 80 65 70 85];
    PWc = 0.01*[60 65 50 55 70];
    P = ckn(PW,3)*0.7 + ckn(PWc,3)*0.3
    P =  0.8353
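    The procedure ckn is not a built-in MATLAB function. As a rough stand-in, the probability that k or more of several independent events occur can be obtained from the distribution of the number of successes, built up with conv. The sketch below is our own formulation and is not claimed to be how ckn is implemented.

```matlab
pGood = 0.01*[75 80 65 70 85];   % P(record | good weather), one per sprinter
pPoor = 0.01*[60 65 50 55 70];   % P(record | poor weather)
k = 3;                           % "three or more"

% Multiply the polynomials (1 - p_i) + p_i*x; the coefficient of x^j
% in the product is P(exactly j successes).
dGood = 1;
for p = pGood
    dGood = conv(dGood, [1 - p, p]);
end
dPoor = 1;
for p = pPoor
    dPoor = conv(dPoor, [1 - p, p]);
end

% Entry j+1 holds P(exactly j), so "k or more" is the tail from index k+1.
% The weights 0.7 and 0.3 are the probabilities of good and poor weather.
P = sum(dGood(k+1:end))*0.7 + sum(dPoor(k+1:end))*0.3
```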
    

    A device has five sensors connected to an alarm system. The alarm is given if three or more of the sensors trigger a switch. If a dangerous condition is present, each of the switches has high (but not unit) probability of activating; if the dangerous condition does not exist, each of the switches has low (but not zero) probability of activating (falsely). Suppose D= the event of the dangerous condition and A= the event the alarm is activated. Proper operation consists of $AD \bigvee A^c D^c$. Suppose $E_i$ = the event the ith unit is activated. Since the switches operate independently, we suppose

    (5.51)
    $$\{E_1, E_2, E_3, E_4, E_5\} \ \text{ci} \ |D \quad \text{and} \quad \text{ci} \ |D^c$$

    Assume the conditional probabilities of the $E_i$, given D, are 0.91, 0.93, 0.96, 0.87, 0.97, and given $D^c$, are 0.03, 0.02, 0.07, 0.04, 0.01, respectively. If P(D)=0.02, what is the probability the alarm system acts properly? Suggestion. Use the conditional independence and the procedure ckn.

    P1 = 0.01*[91 93 96 87 97];
    P2 = 0.01*[3 2 7 4 1];
    P  = ckn(P1,3)*0.02 + (1 - ckn(P2,3))*0.98
    P =  0.9997
    

    Seven students plan to complete a term paper over the Thanksgiving recess. They work independently; however, the likelihood of completion depends upon the weather. If the weather is very pleasant, they are more likely to engage in outdoor activities and put off work on the paper. Let $E_i$ be the event the ith student completes his or her paper, $A_k$ be the event that k or more complete during the recess, and W be the event the weather is highly conducive to outdoor activity. It is reasonable to suppose $\{E_i : 1 \le i \le 7\}$ ci |W and ci |$W^c$. Suppose

    (5.52)
    $$P(E_i|W) = 0.4, \ 0.5, \ 0.3, \ 0.7, \ 0.5, \ 0.6, \ 0.2$$
    (5.53)
    $$P(E_i|W^c) = 0.7, \ 0.8, \ 0.5, \ 0.9, \ 0.7, \ 0.8, \ 0.5$$

    respectively, and P(W)=0.8. Determine the probability $P(A_4)$ that four or more complete their papers and $P(A_5)$ that five or more finish.

    PW = 0.1*[4 5 3 7 5 6 2];
    PWc = 0.1*[7 8 5 9 7 8 5];
    PA4 = ckn(PW,4)*0.8 + ckn(PWc,4)*0.2
    PA4 =  0.4993
    PA5 = ckn(PW,5)*0.8 + ckn(PWc,5)*0.2
    PA5 =  0.2482
    

    A manufacturer claims to have improved the reliability of his product. Formerly, the product had probability 0.65 of operating 1000 hours without failure. The manufacturer claims this probability is now 0.80. A sample of size 20 is tested. Determine the odds favoring the new probability for various numbers of surviving units under the assumption the prior odds are 1 to 1. How many survivors would be required to make the claim credible?

    Let $E_1$ be the event the probability is 0.80 and $E_2$ be the event the probability is 0.65. Assume $P(E_1)/P(E_2) = 1$.

    (5.54)
    _autogen-svg2png-0030.png
    k = 1:20;
    odds = ibinom(20,0.80,k)./ibinom(20,0.65,k);
    disp([k;odds]')
    - - - - - - - - - - - -
       13.0000    0.2958
       14.0000    0.6372
       15.0000    1.3723   % Need at least 15 or 16 successes
       16.0000    2.9558
       17.0000    6.3663
       18.0000   13.7121
       19.0000   29.5337
       20.0000   63.6111
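    Because ibinom is a course m-function rather than a built-in, the same odds can be sketched in plain MATLAB. With prior odds of 1 to 1 the posterior odds equal the ratio of the two binomial probabilities of exactly k survivors, and the binomial coefficients cancel in that ratio. The variable names below are illustrative.

```matlab
n  = 20;            % sample size
p1 = 0.80;          % claimed probability of surviving 1000 hours
p2 = 0.65;          % former probability
k  = 13:20;         % numbers of surviving units to examine

% Likelihood ratio for exactly k survivors; C(n,k) cancels top and bottom.
odds = (p1/p2).^k .* ((1 - p1)/(1 - p2)).^(n - k);
disp([k; odds]')
```

    Each additional survivor multiplies the odds by (0.80/0.65)(0.35/0.20), roughly 2.15, which is why the tabulated odds slightly more than double from one row to the next.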
    

    A real estate agent in a neighborhood heavily populated by affluent professional persons is working with a customer. The agent is trying to assess the likelihood the customer will actually buy. His experience indicates the following: if H is the event the customer buys, S is the event the customer is a professional with good income, and E is the event the customer drives a prestigious car, then

    (5.55)
    _autogen-svg2png-0031.png

    Since buying a house and owning a prestigious car are not related for a given owner, it seems reasonable to suppose _autogen-svg2png-0032.png and _autogen-svg2png-0033.png. The customer drives a Cadillac. What are the odds he will buy a house?

    Assumptions amount to _autogen-svg2png-0034.png and _autogen-svg2png-0035.png.

    (5.56)
    _autogen-svg2png-0036.png
    (5.57)
    _autogen-svg2png-0037.png
    (5.58)
    _autogen-svg2png-0038.png

    In deciding whether or not to drill an oil well in a certain location, a company undertakes a geophysical survey. On the basis of past experience, the decision makers feel the odds are about four to one favoring success. Various other probabilities can be assigned on the basis of past experience. Let

    • H be the event that a well would be successful

    • S be the event the geological conditions are favorable

    • E be the event the results of the geophysical survey are positive

    The initial, or prior, odds are $P(H)/P(H^c) = 4$. Previous experience indicates

    (5.59)
    _autogen-svg2png-0040.png

    Make reasonable assumptions based on the fact that the result of the geophysical survey depends upon the geological formations and not on the presence or absence of oil. The result of the survey is favorable. Determine the posterior odds $P(H|E)/P(H^c|E)$.

    (5.60)
    _autogen-svg2png-0042.png
    (5.61)
    _autogen-svg2png-0043.png

    A software firm is planning to deliver a custom package. Past experience indicates the odds are at least four to one that it will pass customer acceptance tests. As a check, the program is subjected to two different benchmark runs. Both are successful. Given the following data, what are the odds favoring successful operation in practice? Let

    • H be the event the performance is satisfactory

    • S be the event the system satisfies customer acceptance tests

    • E1 be the event the first benchmark test is satisfactory

    • E2 be the event the second benchmark test is satisfactory

    Under the usual conditions, we may assume _autogen-svg2png-0044.png and _autogen-svg2png-0045.png. Reliability data show

    (5.62)
    _autogen-svg2png-0046.png
    (5.63)
    _autogen-svg2png-0047.png

    Determine the posterior odds $P(H|E_1E_2)/P(H^c|E_1E_2)$.

    (5.64)
    _autogen-svg2png-0049.png
    (5.65)
    _autogen-svg2png-0050.png
    (5.66)
    _autogen-svg2png-0051.png

    A research group is contemplating purchase of a new software package to perform some specialized calculations. The systems manager decides to do two sets of diagnostic tests for significant bugs that might hamper operation in the intended application. The tests are carried out in an operationally independent manner. The following analysis of the results is made.

    • H= the event the program is satisfactory for the intended application

    • S= the event the program is free of significant bugs

    • E1= the event the first diagnostic tests are satisfactory

    • E2= the event the second diagnostic tests are satisfactory

    Since the tests are for the presence of bugs, and are operationally independent, it seems reasonable to assume _autogen-svg2png-0056.png and _autogen-svg2png-0057.png. Because of the reliability of the software company, the manager thinks P(S)=0.85. Also, experience suggests

    Table 5.7.
    _autogen-svg2png-0059.png _autogen-svg2png-0060.png _autogen-svg2png-0061.png
    _autogen-svg2png-0062.png _autogen-svg2png-0063.png _autogen-svg2png-0064.png

    Determine the posterior odds favoring H if results of both diagnostic tests are satisfactory.

    (5.67)
    _autogen-svg2png-0065.png
    (5.68)
    _autogen-svg2png-0066.png

    with similar expressions for the other terms.

    (5.69)
    _autogen-svg2png-0067.png

    A company is considering a new product now undergoing field testing. Let

    • H be the event the product is introduced and successful

    • S be the event the R&D group produces a product with the desired characteristics.

    • E be the event the testing program indicates the product is satisfactory

    The company assumes P(S)=0.9 and the conditional probabilities

    (5.70)
    _autogen-svg2png-0069.png

    Since the testing of the merchandise is not affected by market success or failure, it seems reasonable to suppose _autogen-svg2png-0070.png and _autogen-svg2png-0071.png. The field tests are favorable. Determine _autogen-svg2png-0072.png.

    (5.71)
    _autogen-svg2png-0073.png
    (5.72)
    _autogen-svg2png-0074.png

    Martha is wondering if she will get a five percent annual raise at the end of the fiscal year. She understands this is more likely if the company's net profits increase by ten percent or more. These will be influenced by company sales volume. Let

    • H= the event she will get the raise

    • S= the event company profits increase by ten percent or more

    • E= the event sales volume is up by fifteen percent or more

    Since the prospect of a raise depends upon profits, not directly on sales, she supposes _autogen-svg2png-0078.png and _autogen-svg2png-0079.png. She thinks the prior odds favoring a suitable profit increase are about three to one. Also, it seems reasonable to suppose

    (5.73)
    _autogen-svg2png-0080.png

    End of the year records show that sales increased by eighteen percent. What is the probability Martha will get her raise?

    (5.74)
    _autogen-svg2png-0081.png
    (5.75)
    _autogen-svg2png-0082.png

    A physician thinks the odds are about 2 to 1 that a patient has a certain disease. He seeks the “independent” advice of three specialists. Let H be the event the disease is present, and A,B,C be the events the respective consultants agree this is the case. The physician decides to go with the majority. Since the advisers act in an operationally independent manner, it seems reasonable to suppose {A,B,C} ci |H and ci |$H^c$. Experience indicates

    (5.76)
    $$P(A|H) = 0.80, \quad P(B|H) = 0.70, \quad P(C|H) = 0.75$$
    (5.77)
    $$P(A^c|H^c) = 0.85, \quad P(B^c|H^c) = 0.80, \quad P(C^c|H^c) = 0.70$$

    What is the probability of the right decision (i.e., he treats the disease if two or more think it is present, and does not if two or more think the disease is not present)?

    PH = 0.01*[80 70 75];
    PHc = 0.01*[85 80 70];
    pH = 2/3;
    P  = ckn(PH,2)*pH + ckn(PHc,2)*(1 - pH)
    P =  0.8577
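    With only three advisers, the result can also be checked by brute-force enumeration of the eight possible patterns of correct and incorrect opinions. This sketch takes its probabilities from the vectors PH and PHc in the solution above; the enumeration approach and the variable names are ours and do not describe how ckn works.

```matlab
pH  = 2/3;                    % prior P(H), from odds of 2 to 1
pA  = [0.80 0.70 0.75];       % P(adviser is correct | H)
pAc = [0.85 0.80 0.70];       % P(adviser is correct | Hc)

% All 2^3 patterns: 1 = adviser correct, 0 = adviser incorrect.
patterns = dec2bin(0:7) - '0';             % 8-by-3 matrix of 0s and 1s
majority = sum(patterns, 2) >= 2;          % two or more advisers correct

% Probability of each pattern under conditional independence.
PA  = repmat(pA,  8, 1);
PAc = repmat(pAc, 8, 1);
probH  = prod(patterns.*PA  + (1 - patterns).*(1 - PA),  2);
probHc = prod(patterns.*PAc + (1 - patterns).*(1 - PAc), 2);

P = sum(probH(majority))*pH + sum(probHc(majority))*(1 - pH)
```

    Enumeration grows as 2^n, so it is only a check for small n.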
    

    A software company has developed a new computer game designed to appeal to teenagers and young adults. It is felt that there is good probability it will appeal to college students, and that if it appeals to college students it will appeal to a general youth market. To check the likelihood of appeal to college students, it is decided to test first by a sales campaign at Rice and University of Texas, Austin. The following analysis of the situation is made.

    • H= the event the sales to the general market will be good

    • S= the event the game appeals to college students

    • E1= the event the sales are good at Rice

    • E2= the event the sales are good at UT, Austin

    Since the tests of the game's reception are at two separate universities and are operationally independent, it seems reasonable to assume _autogen-svg2png-0092.png and _autogen-svg2png-0093.png. Because of their previous experience in game sales, the managers think P(S)=0.80. Also, experience suggests

    Table 5.8.
    _autogen-svg2png-0095.png _autogen-svg2png-0096.png _autogen-svg2png-0097.png
    _autogen-svg2png-0098.png _autogen-svg2png-0099.png _autogen-svg2png-0100.png

    Determine the posterior odds favoring H if sales results are satisfactory at both schools.

    (5.78)
    _autogen-svg2png-0101.png
    (5.79)
    _autogen-svg2png-0102.png
    (5.80)
    _autogen-svg2png-0103.png

    In a region in the Gulf Coast area, oil deposits are highly likely to be associated with underground salt domes. If H is the event that an oil deposit is present in an area, and S is the event of a salt dome in the area, experience indicates P(S|H)=0.9 and _autogen-svg2png-0105.png. Company executives believe the odds favoring oil in the area are at least 1 in 10. The company decides to conduct two independent geophysical surveys for the presence of a salt dome. Let _autogen-svg2png-0106.png be the events the surveys indicate a salt dome. Because the surveys are tests for the geological structure, not the presence of oil, and the tests are carried out in an operationally independent manner, it seems reasonable to assume _autogen-svg2png-0107.png and _autogen-svg2png-0108.png. Data on the reliability of the surveys yield the following probabilities

    (5.81)
    _autogen-svg2png-0109.png

    Determine the posterior odds _autogen-svg2png-0110.png. Should the well be drilled?

    (5.82)
    _autogen-svg2png-0111.png
    (5.83)
    _autogen-svg2png-0112.png

    with similar expressions for the other terms.

    (5.84)
    _autogen-svg2png-0113.png

    A sample of 150 subjects is taken from a population which has two subgroups. The subgroup membership of each subject in the sample is known. Each individual is asked a battery of ten questions designed to be independent, in the sense that the answer to any one is not affected by the answer to any other. The subjects answer independently. Data on the results are summarized in the following table:

    Table 5.9.
             GROUP 1 (84 members)      GROUP 2 (66 members)
    Q        Yes    No    Unc          Yes    No    Unc
    1         51    26      7           27    34      5
    2         42    32     10           19    43      4
    3         19    54     11           39    22      5
    4         24    53      7           38    19      9
    5         27    52      5           28    33      5
    6         49    19     16           19    41      6
    7         16    59      9           37    21      8
    8         47    32      5           19    42      5
    9         55    17     12           27    33      6
    10        24    53      7           39    21      6

    Assume the data represent the general population consisting of these two groups, so that the data may be used to calculate probabilities and conditional probabilities.

    Several persons are interviewed. The result of each interview is a “profile” of answers to the questions. The goal is to classify the person in one of the two subgroups.

    For the following profiles, classify each individual in one of the subgroups:

    1. y, n, y, n, y, u, n, u, y, u

    2. n, n, u, n, y, y, u, n, n, y

    3. y, y, n, y, u, u, n, n, y, y

    % file npr05_16.m
    % Data for Exercise 16.
    A = [51 26  7; 42 32 10; 19 54 11; 24 53  7; 27 52  5;
         49 19 16; 16 59  9; 47 32  5; 55 17 12; 24 53  7];
    B = [27 34  5; 19 43  4; 39 22  5; 38 19  9; 28 33  5;
         19 41  6; 37 21  8; 19 42  5; 27 33  6; 39 21  6];
    disp('Call for oddsdf')
    npr05_16
    Call for oddsdf
    oddsdf
    Enter matrix A of frequencies for calibration group 1  A
    Enter matrix B of frequencies for calibration group 2  B
    Number of questions = 10
    Answers per question = 3
     Enter code for answers and call for procedure "odds"
    y = 1;
    n = 2;
    u = 3;
    odds
    Enter profile matrix E  [y n y n y u n u y u]
    Odds favoring Group 1:   3.743
    Classify in Group 1
    odds
    Enter profile matrix E  [n n u n y y u n n y]
    Odds favoring Group 1:   0.2693
    Classify in Group 2
    odds
    Enter profile matrix E  [y y n y u u n n y y]
    Odds favoring Group 1:   5.286
    Classify in Group 1
    

    The data of Exercise 16, above, are converted to conditional probabilities and probabilities, as follows (probabilities are rounded to two decimal places).

    Table 5.10.
             GROUP 1, P(G1) = 0.56      GROUP 2, P(G2) = 0.44
    Q        Yes     No      Unc        Yes     No      Unc
    1        0.61    0.31    0.08       0.41    0.51    0.08
    2        0.50    0.38    0.12       0.29    0.65    0.06
    3        0.23    0.64    0.13       0.59    0.33    0.08
    4        0.29    0.63    0.08       0.57    0.29    0.14
    5        0.32    0.62    0.06       0.42    0.50    0.08
    6        0.58    0.23    0.19       0.29    0.62    0.09
    7        0.19    0.70    0.11       0.56    0.32    0.12
    8        0.56    0.38    0.06       0.29    0.63    0.08
    9        0.65    0.20    0.15       0.41    0.50    0.09
    10       0.29    0.63    0.08       0.59    0.32    0.09

    For the following profiles classify each individual in one of the subgroups.

    1. y, n, y, n, y, u, n, u, y, u

    2. n, n, u, n, y, y, u, n, n, y

    3. y, y, n, y, u, u, n, n, y, y

    npr05_17               % Call for data in file npr05_17.m (listed next)
    % file npr05_17.m
    % Data for Exercise 17.
    PG1 = 84/150;
    PG2 = 66/150;
    A = [0.61 0.31 0.08
         0.50 0.38 0.12
         0.23 0.64 0.13
         0.29 0.63 0.08
         0.32 0.62 0.06
         0.58 0.23 0.19
         0.19 0.70 0.11
         0.56 0.38 0.06
         0.65 0.20 0.15
         0.29 0.63 0.08];
     
    B = [0.41 0.51 0.08
         0.29 0.65 0.06
         0.59 0.33 0.08
         0.57 0.29 0.14
         0.42 0.50 0.08
         0.29 0.62 0.09
         0.56 0.32 0.12
         0.29 0.64 0.08
         0.41 0.50 0.09
         0.59 0.32 0.09];
    disp('Call for oddsdp')
    Call for oddsdp
    oddsdp
    Enter matrix A of conditional probabilities for Group 1  A
    Enter matrix B of conditional probabilities for Group 2  B
    Probability p1 an individual is from Group 1  PG1
    Number of questions = 10
    Answers per question = 3
     Enter code for answers and call for procedure "odds"
    y = 1;
    n = 2;
    u = 3;
    odds
    Enter profile matrix E  [y n y n y u n u y u]
    Odds favoring Group 1:   3.486
    Classify in Group 1
    odds
    Enter profile matrix E  [n n u n y y u n n y]
    Odds favoring Group 1:   0.2603
    Classify in Group 2
    odds
    Enter profile matrix E  [y y n y u u n n y y]
    Odds favoring Group 1:   5.162
    Classify in Group 1
    
    Solutions