Chemistry LibreTexts

Part V: Ways to Draw Conclusions From Data


    In Part IV we noted that when a population is normally distributed, the probability of obtaining a particular result for any single sample is determined by that result’s area under the normal distribution curve defined by the population’s mean and standard deviation. For example, in Investigation 24 we showed that for 1.69-oz bags of plain M&Ms, 22.8% have a net weight less than 1.69 oz if the population’s mean is 48.98 g and its standard deviation is 1.433 g.
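
    As a rough check on the 22.8% figure, the area under the normal curve below the stated net weight can be computed with Python's standard library. This is only a sketch: the conversion factor of 28.3495 g per ounce is an assumption that is not stated in the text, and the variable names are illustrative.

```python
from statistics import NormalDist

GRAMS_PER_OUNCE = 28.3495  # assumed conversion factor, not given in the text

# Population parameters quoted in the text
population = NormalDist(mu=48.98, sigma=1.433)

# Stated net weight of 1.69 oz expressed in grams (about 47.91 g)
stated_weight_g = 1.69 * GRAMS_PER_OUNCE

# z-score of the stated weight and the area under the curve below it
z = (stated_weight_g - population.mean) / population.stdev
fraction_below = population.cdf(stated_weight_g)

print(f"z = {z:.3f}")
print(f"fraction of bags below the stated weight = {fraction_below:.3f}")
```

    Under these assumptions the computed fraction rounds to 22.8%, in agreement with the value quoted above.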

    Suppose we select a single sample from this population: What can we predict about the net weight of M&Ms in that sample? Rearranging our equation for z, we find that

    \[x = μ ± zσ\]

    We call this equation a confidence interval because the value we choose for z defines the probability (our confidence) that the result for a single sample is in the range μ ± zσ.
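
    The link between a confidence level and z can be sketched numerically. The example below assumes a two-sided interval, so that a 95% confidence level leaves 2.5% of the area in each tail; the function names are hypothetical, and the population values are the ones quoted earlier for 1.69-oz bags.

```python
from statistics import NormalDist

def z_for_confidence(confidence):
    """Return z such that the central area under the standard normal curve
    equals `confidence` (two-sided, equal tails)."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

def single_sample_interval(mu, sigma, confidence):
    """Range mu ± z*sigma expected to contain a single sample's result."""
    z = z_for_confidence(confidence)
    return mu - z * sigma, mu + z * sigma

z95 = z_for_confidence(0.95)
low, high = single_sample_interval(48.98, 1.433, 0.95)

print(f"z for 95% = {z95:.3f}")
print(f"95% interval for one bag: {low:.2f} g to {high:.2f} g")
```

    The same two functions can be called with other confidence levels to compare how the interval widens as the confidence level increases.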

    Investigation 26.

    A z of 1.96 corresponds to a 95% confidence interval. Using Appendix 2, show that this is correct. What value of z corresponds to a 90% confidence interval, and what value of z corresponds to a 99% confidence interval? Report the 90%, the 95% and the 99% confidence intervals for the net weight of a single 1.69-oz bag of plain M&Ms drawn from a population for which μ is 48.98 g and σ is 1.433 g. For the data in Table 2, how many of the 30 samples have net weights that fall outside of the 90% confidence interval? Does this result make sense given your understanding of a confidence interval?

    In Investigation 26 we calculated the confidence interval for a single sample based on the properties of the population from which we obtained the sample. If we draw several replicate samples from this population and calculate their mean, \(\bar{x}\), then the confidence interval becomes

    \[\bar{x} = μ ± \dfrac{zσ}{\sqrt{n}}\]

    where n is the number of samples.
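
    The effect of n on this interval can be illustrated with a short calculation of the half-width \(zσ/\sqrt{n}\). This is a minimal sketch: the function name and the particular values of n are chosen only for illustration, and a two-sided 95% interval is assumed.

```python
from math import sqrt
from statistics import NormalDist

def mean_half_width(sigma, n, confidence=0.95):
    """Half-width z*sigma/sqrt(n) of the interval for the mean of n samples."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sigma / sqrt(n)

sigma = 1.433  # population standard deviation quoted in the text, in grams

# Illustrative sample sizes; each step quadruples n
widths = {n: mean_half_width(sigma, n) for n in (1, 4, 16, 64)}

for n, w in widths.items():
    print(f"n = {n:2d}: ±{w:.3f} g")
```

    Because the half-width scales as \(1/\sqrt{n}\), each fourfold increase in n cuts the half-width in half.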

    Investigation 27.

    Suppose we draw four 1.69-oz bags of M&Ms from a population for which μ is 48.98 g and σ is 1.433 g. What are the 90%, the 95% and the 99% confidence intervals for the mean, \(\bar{x}\), of these samples? Prepare a plot that shows how n affects the width of the 95% confidence interval, expressed as \(±zσ/\sqrt{n}\), and discuss the significance of your plot. Suppose we wish to narrow the confidence interval by a factor of 3 solely by increasing the number of samples taken. If the original confidence interval is based on the mean of four samples, how many additional samples must we acquire?

    In both Investigation 26 and Investigation 27 we attempt to predict a property of a sample based on a population with known values of μ and σ. For most practical analytical problems, however, we need to work in the opposite direction, using the sample’s mean, \(\bar{x}\), and its standard deviation, s, to predict the population’s mean, μ. To do this, we make three modifications to our equation for the confidence interval: we rewrite the equation so that it expresses μ in terms of \(\bar{x}\); we replace the population’s standard deviation, σ, with the sample’s standard deviation, s; and we replace z with the variable t, where we define t such that, for any confidence level, t ≥ z and the value of t approaches z as the number of samples, n, increases.

    \[μ=\bar{x}±\dfrac{ts}{\sqrt n}\]

    The value of t depends on the confidence level and on the number of samples, n, through the degrees of freedom, n – 1; see Appendix 3 for further details.
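
    The behavior of t can be sketched with a few tabulated values. The critical values below are standard two-tailed 95% values of t and are assumed to match the entries in Appendix 3; the function name is hypothetical. The interval is computed for the sample summary quoted later in this section (a mean of 49.52 g and a standard deviation of 1.649 g for n = 10).

```python
from math import sqrt
from statistics import NormalDist

# Two-tailed 95% critical values of t, keyed by degrees of freedom (n - 1).
# Standard tabulated values; assumed to agree with the text's Appendix 3.
T_95 = {1: 12.706, 2: 4.303, 4: 2.776, 9: 2.262, 29: 2.045, 120: 1.980}

z_95 = NormalDist().inv_cdf(0.975)  # about 1.960

def t_interval(mean, s, n, t):
    """Confidence interval for the population mean: mean ± t*s/sqrt(n)."""
    half_width = t * s / sqrt(n)
    return mean - half_width, mean + half_width

# t is never smaller than z and approaches z as n grows
for dof, t in T_95.items():
    print(f"n = {dof + 1:3d}: t = {t:.3f}, t/z = {t / z_95:.2f}")

# 95% interval for n = 10 samples (9 degrees of freedom)
low, high = t_interval(49.52, 1.649, 10, T_95[9])
print(f"95% interval for the population mean: {low:.2f} g to {high:.2f} g")
```

    For small n the interval is noticeably wider than the z-based interval would be, which reflects the added uncertainty of estimating σ from s.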

    Investigation 28.

    Our data for 1.69-oz bags of plain M&Ms includes 30 measurements of the net weight. What are the 90%, the 95% and the 99% confidence intervals for the mean, \(\bar{x}\), of these samples? Using the 99% confidence interval as an example, explain the meaning of this confidence interval. Is the stated net weight of 1.69 oz a reasonable estimate of the true mean for the population of 1.69-oz bags of plain M&Ms?

    Our approach in Investigation 28 suggests we can use a confidence interval to decide whether a known value is consistent with our results, a process that we call significance testing and that we carry out a bit more formally than suggested by Investigation 28. To illustrate the process, we will use the data from Table 2 for the bags of M&Ms purchased at Target and evaluate whether the mean net weight for these samples is consistent with the stated net weight of 1.69 oz (47.9 g).

    To begin, we summarize the experimental results for our sample, which in this case is a mean of 49.52 g and a standard deviation of 1.649 g for n = 10 samples. Next, we state our problem in the form of a yes/no question, the answers to which we define using a null hypothesis (H0) and an alternative hypothesis (HA); for example, for this problem our yes/no question is “Is the mean of the samples consistent with the stated net weight of 1.69 oz?,” which we define as

    \[H_0\textrm{: }\bar{x}=μ\: \textrm{(yes)}\]

    \[H_A\textrm{: }\bar{x}≠μ\: \textrm{(no)}\]

    where \(\bar{x}\) is 49.52 g and μ is 47.91 g (1.69 oz). To evaluate the two hypotheses, we rewrite the equation for the confidence interval so that we can solve for t

    \[t= \dfrac{|\bar{x}-μ| \sqrt n}{s}=\dfrac{|49.52 - 47.91 | \sqrt{10}}{1.649} =3.087\]

    Finally, we compare this experimental value of t to the critical values of t for the correct number of degrees of freedom (in this case, \(ν = n - 1 = 10 - 1 = 9\)). From Appendix 3 we see that \(t(α,ν)\) is 1.833 for a 90% confidence interval (an \(\alpha\) of 0.10), 2.262 for a 95% confidence interval (an \(\alpha\) of 0.05), 2.821 for a 98% confidence interval (an \(\alpha\) of 0.02), and 3.250 for a 99% confidence interval (an \(\alpha\) of 0.01). Our experimental value for t of 3.087 falls between the critical values for the 98% and the 99% confidence interval; if we are willing to accept a 1–2% probability of incorrectly rejecting the null hypothesis, then we can reject the null hypothesis and accept the alternative hypothesis, concluding that the mean of 49.52 g is not consistent with the stated net weight of 1.69 oz. We call this a t-test of \(\bar{x}\) vs. μ.
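
    The comparison can be sketched in a few lines of code. The critical values are those quoted above for ν = 9; the function name is hypothetical, and μ is taken as 47.91 g, an assumption (1.69 oz converted at about 28.35 g per ounce) that reproduces the reported value of t.

```python
from math import sqrt

def t_experimental(sample_mean, mu, s, n):
    """Experimental t for comparing a sample mean to a known value mu."""
    return abs(sample_mean - mu) * sqrt(n) / s

# Critical values of t for 9 degrees of freedom, keyed by alpha (from the text)
T_CRIT_DOF9 = {0.10: 1.833, 0.05: 2.262, 0.02: 2.821, 0.01: 3.250}

mu = 47.91  # assumed: 1.69 oz expressed in grams to four significant figures
t_exp = t_experimental(sample_mean=49.52, mu=mu, s=1.649, n=10)

# Reject the null hypothesis at a given alpha when t_exp exceeds the critical value
rejected = {alpha: t_exp > t_crit for alpha, t_crit in T_CRIT_DOF9.items()}

print(f"experimental t = {t_exp:.3f}")
for alpha, is_rejected in rejected.items():
    verdict = "reject H0" if is_rejected else "retain H0"
    print(f"alpha = {alpha:.2f}: {verdict}")
```

    Under these assumptions the null hypothesis is rejected at α = 0.10, 0.05, and 0.02 but retained at α = 0.01, which matches the conclusion drawn above.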

    Investigation 29.

    In 1996, Mars, the manufacturer of M&Ms, reported the following distribution for the colors of plain M&Ms: 30% brown, 20% red, 20% yellow, 10% blue, 10% green, and 10% orange. Pick any one color of M&Ms and, using the data in Table 2, calculate the percentage of that color in each of the 30 samples. Report the mean and the standard deviation for your color and use a t-test to determine whether your sample’s mean is consistent with the result reported by Mars. Gather results for the remaining five colors from other students and discuss your pooled results. Assuming that the distribution of colors reported by Mars is correct, what can you conclude about the manufacturing process?


    This page titled Part V: Ways to Draw Conclusions From Data is shared under a CC BY-NC-SA 4.0 license and was authored, remixed, and/or curated by Contributor.
