# 7.5: Using R for Significance Testing and Analysis of Variance

- Page ID
- 222325

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

\( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)

\( \newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\)

( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\)

\( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)

\( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\)

\( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)

\( \newcommand{\Span}{\mathrm{span}}\)

\( \newcommand{\id}{\mathrm{id}}\)

\( \newcommand{\Span}{\mathrm{span}}\)

\( \newcommand{\kernel}{\mathrm{null}\,}\)

\( \newcommand{\range}{\mathrm{range}\,}\)

\( \newcommand{\RealPart}{\mathrm{Re}}\)

\( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)

\( \newcommand{\Argument}{\mathrm{Arg}}\)

\( \newcommand{\norm}[1]{\| #1 \|}\)

\( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)

\( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\AA}{\unicode[.8,0]{x212B}}\)

\( \newcommand{\vectorA}[1]{\vec{#1}} % arrow\)

\( \newcommand{\vectorAt}[1]{\vec{\text{#1}}} % arrow\)

\( \newcommand{\vectorB}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

\( \newcommand{\vectorC}[1]{\textbf{#1}} \)

\( \newcommand{\vectorD}[1]{\overrightarrow{#1}} \)

\( \newcommand{\vectorDt}[1]{\overrightarrow{\text{#1}}} \)

\( \newcommand{\vectE}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash{\mathbf {#1}}}} \)

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

\( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)

\(\newcommand{\avec}{\mathbf a}\) \(\newcommand{\bvec}{\mathbf b}\) \(\newcommand{\cvec}{\mathbf c}\) \(\newcommand{\dvec}{\mathbf d}\) \(\newcommand{\dtil}{\widetilde{\mathbf d}}\) \(\newcommand{\evec}{\mathbf e}\) \(\newcommand{\fvec}{\mathbf f}\) \(\newcommand{\nvec}{\mathbf n}\) \(\newcommand{\pvec}{\mathbf p}\) \(\newcommand{\qvec}{\mathbf q}\) \(\newcommand{\svec}{\mathbf s}\) \(\newcommand{\tvec}{\mathbf t}\) \(\newcommand{\uvec}{\mathbf u}\) \(\newcommand{\vvec}{\mathbf v}\) \(\newcommand{\wvec}{\mathbf w}\) \(\newcommand{\xvec}{\mathbf x}\) \(\newcommand{\yvec}{\mathbf y}\) \(\newcommand{\zvec}{\mathbf z}\) \(\newcommand{\rvec}{\mathbf r}\) \(\newcommand{\mvec}{\mathbf m}\) \(\newcommand{\zerovec}{\mathbf 0}\) \(\newcommand{\onevec}{\mathbf 1}\) \(\newcommand{\real}{\mathbb R}\) \(\newcommand{\twovec}[2]{\left[\begin{array}{r}#1 \\ #2 \end{array}\right]}\) \(\newcommand{\ctwovec}[2]{\left[\begin{array}{c}#1 \\ #2 \end{array}\right]}\) \(\newcommand{\threevec}[3]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \end{array}\right]}\) \(\newcommand{\cthreevec}[3]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \end{array}\right]}\) \(\newcommand{\fourvec}[4]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}\) \(\newcommand{\cfourvec}[4]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}\) \(\newcommand{\fivevec}[5]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}\) \(\newcommand{\cfivevec}[5]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}\) \(\newcommand{\mattwo}[4]{\left[\begin{array}{rr}#1 \amp #2 \\ #3 \amp #4 \\ \end{array}\right]}\) \(\newcommand{\laspan}[1]{\text{Span}\{#1\}}\) \(\newcommand{\bcal}{\cal B}\) \(\newcommand{\ccal}{\cal C}\) \(\newcommand{\scal}{\cal S}\) \(\newcommand{\wcal}{\cal W}\) \(\newcommand{\ecal}{\cal E}\) \(\newcommand{\coords}[2]{\left\{#1\right\}_{#2}}\) \(\newcommand{\gray}[1]{\color{gray}{#1}}\) \(\newcommand{\lgray}[1]{\color{lightgray}{#1}}\) \(\newcommand{\rank}{\operatorname{rank}}\) \(\newcommand{\row}{\text{Row}}\) \(\newcommand{\col}{\text{Col}}\) \(\renewcommand{\row}{\text{Row}}\) \(\newcommand{\nul}{\text{Nul}}\) \(\newcommand{\var}{\text{Var}}\) \(\newcommand{\corr}{\text{corr}}\) \(\newcommand{\len}[1]{\left|#1\right|}\) \(\newcommand{\bbar}{\overline{\bvec}}\) \(\newcommand{\bhat}{\widehat{\bvec}}\) \(\newcommand{\bperp}{\bvec^\perp}\) \(\newcommand{\xhat}{\widehat{\xvec}}\) \(\newcommand{\vhat}{\widehat{\vvec}}\) \(\newcommand{\uhat}{\widehat{\uvec}}\) \(\newcommand{\what}{\widehat{\wvec}}\) \(\newcommand{\Sighat}{\widehat{\Sigma}}\) \(\newcommand{\lt}{<}\) \(\newcommand{\gt}{>}\) \(\newcommand{\amp}{&}\) \(\definecolor{fillinmathshade}{gray}{0.9}\)The base installation of R has functions for most of the significance tests covered in Chapter 7.2 - Chapter 7.4.

## Using R to Compare Variances

The R function for comparing variances is`var.test()`

which takes the following form

`var.test(x, y, ratio = 1, alternative = c("two.sided", "less", "greater"), conf.level = 0.95, ...)`

where `x`

and `y`

are numeric vectors that contain the two samples, `ratio`

is the expected ratio for the null hypothesis (which defaults to 1), `alternative`

is a character string that states the alternative hypothesis (which defaults to `two-sided`

or two-tailed), and a `conf.level`

that gives the size of the confidence interval, which defaults to 0.95, or 95%, or \(\alpha = 0.05\). We can use this function to compare the variances of two samples, \(s_1^2\) vs \(s_2^2\), but not the variance of a sample and the variance for a population \(s^2\) vs \(\sigma^2\).

Let's use R on the data from Example 7.2.3, which considers two sets of United States pennies.

`# create vectors to store the data`

`sample1 = c(3.080, 3.094, 3.107, 3.056, 3.112, 3.174, 3.198)`

`sample2 = c(3.052, 3.141, 3.083, 3.083, 3.048)`

`# run two-sided variance test with alpha = 0.05 and null hypothesis that variances are equal`

`var.test(x = sample1, y = sample2, ratio = 1, alternative = "two.sided", conf.level = 0.95)`

The code above yields the following output

`F test to compare two variances `

`data: sample1 and sample2 `

`F = 1.8726, num df = 6, denom df = 4, p-value = 0.5661 `

`alternative hypothesis: true ratio of variances is not equal to 1 `

`95 percent confidence interval: `

`0.2036028 11.6609726 `

`sample estimates: ratio of variances `

`1.872598`

Two parts of this output lead us to retain the null hypothesis of equal variances. First, the reported *p*-value of 0.5661 is larger than our critical value for \(\alpha\) of 0.05, and second, the 95% confidence interval for the ratio of the variances, which runs from 0.204 to 11.7 includes the null hypothesis that it is 1.

R does not include a function for comparing \(s^2\) to \(\sigma^2\).

## Using R to Compare Means

The R function for comparing means is `t.test()`

and takes the following form

`t.test(x, y = NULL, alternative = c("two.sided", "less", "greater"), mu = 0, `

`paired = FALSE, var.equal = FALSE, conf.level = 0.95, ...)`

where `x`

is a numeric vector that contains the data for one sample and `y`

is an optional vector that contains data for a second sample, `alternative`

is a character string that states the alternative hypothesis (which defaults to two-tailed), `mu`

is either the population's expected mean or the expected difference in the means of the two samples, `paired`

is a logical value that indicates whether the data is paired , `var.equal `

is a logical value that indicates whether the variances for two samples are treated as equal or unequal (based on a prior `var.test()`

), and`conf.level`

gives the size of the confidence interval (which defaults to 0.95, or 95%, or \(\alpha = 0.05\)).

### Using R to Compare \(\overline{X}\) to \(\mu\)

Let's use R on the data from Example 7.2.1, which considers the determination of the \(\% \text{Na}_2 \text{CO}_3\) in a standard sample that is known to be 98.76 % w/w \(\text{Na}_2 \text{CO}_3\).

`# create vector to store the data`

` na2co3 = c(98.71, 98.59, 98.62, 98.44, 98.58)`

`# run a two-sided t-test, using mu to define the expected mean; because the default values `

`# for paired and var.equal are FALSE, we can omit them here`

` t.test(x = na2co3, alternative = "two.sided", mu = 98.76, conf.level = 0.95)`

The code above yields the following output

`One Sample t-test `

`data: na2co3 `

`t = -3.9522, df = 4, p-value = 0.01679 `

`alternative hypothesis: true mean is not equal to 98.76 `

`95 percent confidence interval: `

`98.46717 98.70883 `

`sample estimates: `

`mean of x `

`98.588 `

Two parts of this output lead us to reject the null hypothesis of equal variances. First, the reported *p*-value of 0.01679 is less than our critical value for \(\alpha\) of 0.05, and second, the 95% confidence interval for the experimental mean of 98.588, which runs from 98.467 to 98.709, does not includes the null hypothesis that it is 98.76.

### Using R to Compare Means for Two Samples

When comparing the means for two samples, we have to be careful to consider whether the data is unpaired or paired, and for unpaired data we must determine whether we can pool the variances for the two samples.

### Unpaired Data

Let's use R on the data from Example 7.2.4, which considers two sets of United States pennies. This data is unpaired and, as we showed earlier, there is no evidence to suggest that the variances of the two samples are different.

`# create vectors to store the data`

`sample1 = c(3.080, 3.094, 3.107, 3.056, 3.112, 3.174, 3.198)`

`sample2 = c(3.052, 3.141, 3.083, 3.083, 3.048)`

`# run a two-sided t-test, setting mu to 0 as the null hypothesis is that the means are the same, and setting var.equal to TRUE`

`t.test(x = sample1, y = sample2, alternative = "two.sided", mu = 0, var.equal = TRUE, conf.level = 0.95)`

The code above yields the following output

`Two Sample t-test `

`data: sample1 and sample2 `

`t = 1.3345, df = 10, p-value = 0.2116 `

`alternative hypothesis: true difference in means is not equal to 0 `

`95 percent confidence interval: `

`-0.02403040 0.09580182 `

`sample estimates: `

`mean of x mean of y `

`3.117286 3.081400 `

Two parts of this output lead us to retain the null hypothesis of equal means. First, the reported *p*-value of 0.2116 is greater than our critical value for \(\alpha\) of 0.05, and second, the 95% confidence interval for the difference in the experimental means, which runs from -0.0240 to 0.0958, includes the null hypothesis that it is 0.

### Paired Data

Let's use R on the data from Example 7.2.1, which compares two methods for determining the concentration of the antibiotic monensin in fermentation vats.

`# create vectors to store the data`

`microbiological = c(129.5, 89.6, 76.6, 52.2, 110.8, 50.4, 72.4, 141.4, 75.0, 34.1, 60.3)`

`electrochemical = c(132.3, 91.0, 73.6, 58.2, 104.2, 49.9, 82.1, 154.1, 73.4, 38.1, 60.1)`

`# run a two-tailed t-test, setting mu to 0 as the null hypothesis is that the means are the same, and setting paired to TRUE`

` t.test(x = microbiological, y = electrochemical, alternative = "two.sided", mu = 0, paired = TRUE, conf.level = 0.95)`

The code above yields the following output

`Paired t-test `

`data: microbiological and electrochemical `

`t = -1.3225, df = 10, p-value = 0.2155 `

`alternative hypothesis: true difference in means is not equal to 0 `

`95 percent confidence interval: `

`-6.028684 1.537775 `

`sample estimates: `

`mean of the differences `

`-2.245455 `

Two parts of this output lead us to retain the null hypothesis of equal means. First, the reported *p*-value of 0.2155 is greater than our critical value for \(\alpha\) of 0.05, and second, the 95% confidence interval for the difference in the experimental mean, which runs from -6.03 to 1.54, includes the null hypothesis that it is 0.

## Using R to Detect Outliers

The base installation of R does not include tests for outliers, but the `outliers`

package provided functions for Dixon's Q-test and Grubb's test. To install the package, use the following lines of code

`install.packages("outliers")`

`library(outliers)`

You only need to install the package once, but you must use `library()`

to make the package available when you begin a new R session.

### Dixon Q-test

The R function for Dixon's Q-test is `dixon.test()`

and takes the following form

`dixon.test(x, type, two.sided)`

where `x`

is a numeric vector with the data we are considering, `type`

defines the specific value(s) that we are testing (we will use `type = 10`

, which tests for a single outlier on either end of the ranked data), and `two.sided`

, which indicates whether we use a one-tailed or two-tailed test (we will use `two.sided = FALSE `

as we are interested in whether the smallest value is too small or the largest value is too large).

Let's use R on the data from Example 7.2.7, which considers the masses of a set of United States pennies.

`penny = c(3.067, 2.514, 3.094, 3.049, 3.048, 3.109, 3.039, 3.079, 3.102)`

`dixon.test(x = penny, two.sided = FALSE, type = 10)`

The code above yields the following output

`Dixon test for outliers `

`data: penny Q = 0.88235, p-value < 2.2e-16 `

`alternative hypothesis: lowest value 2.514 is an outlier`

The reported *p*-value of less than \(2.2 \times 10^{-16}\) is less than our critical value for \(\alpha\) of 0.05, which suggests that the penny with a mass of 2.514 g is drawn from a different population than the other pennies.

### Grubb's Test

The R function for the Grubb's test is `grubbs.test()`

and takes the following form

`gurbbs.test(x, type, two.sided)`

where `x`

is a numeric vector with the data we are considering, `type`

defines the specific value(s) that we are testing (we will use `type = 10`

, which tests for a single outlier on either end of the ranked data), and `two.sided`

, which indicated whether we use a one-tailed or two-tailed test (we will use `two.sided = FALSE `

as we are interested in whether the smallest value is too small or the largest value is too large).

Let's use R on the data from Example 7.2.7, which considers the masses of a set of United States pennies.

`penny = c(3.067, 2.514, 3.094, 3.049, 3.048, 3.109, 3.039, 3.079, 3.102)`

`grubbs.test(x = penny, two.sided = FALSE, type = 10)`

The code above yields the following output

`Grubbs test for one outlier `

`data: penny `

`G = 2.64300, U = 0.01768, p-value = 9.69e-07 `

`alternative hypothesis: lowest value 2.514 is an outlier`

The reported p-value of \(9.69 \times 10^{-7}\) is less than our critical value for \(\alpha\) of 0.05, which suggests that the penny with a mass of 2.514 g is drawn from a different population than the other pennies.

## Using R to Complete Non-Parametric Significance Tests

The R function for completing the Wilcoxson signed rank test and the Wilcoxson rank sum test is `wilcox.test()`

, which takes the following form

`wilcox.test(x, y = NULL, alternative = c("two.sided", "less", "greater"), mu = 0, `

`paired = FALSE, conf.level = 0.95, ...)`

where `x`

is a numeric vector that contains the data for one sample and `y`

is an optional vector that contains data for a second sample, `alternative`

is a character string that states the alternative hypothesis (which defaults to two-tailed), `mu`

is either the population's expected mean or the expected difference in the means of the two samples, `paired`

is a logical value that indicates whether the data is paired , and`conf.level`

gives the size of the confidence interval (which defaults to 0.95, or 95%, or \(\alpha = 0.05\)).

### Using R to Complete a Wilcoxson Signed Rank Test

Let's use R on the data from Example 7.3.1, which compares two methods for determining the concentration of the antibiotic monensin in fermentation vats.

`# create vectors to store the data`

`microbiological = c(129.5, 89.6, 76.6, 52.2, 110.8, 50.4, 72.4, 141.4, 75.0, 34.1, 60.3)`

`electrochemical = c(132.3, 91.0, 73.6, 58.2, 104.2, 49.9, 82.1, 154.1, 73.4, 38.1, 60.1)`

`# run a two-tailed wilcoxson signed rank test, setting mu to 0 as the null hypothesis is that `

`# the means are the same and setting paired to TRUE`

` wilcox.test(x = microbiological, y = electrochemical, alternative = "two.sided", mu = 0, paired = TRUE, conf.level = 0.95)`

The code above yields the following output

`Wilcoxon signed rank test `

`data: microbiological and electrochemical `

`V = 22, p-value = 0.3652 `

`alternative hypothesis: true location shift is not equal to 0`

where the value *V* is the smaller of the two signed ranks. The reported *p*-value of 0.3652 is greater than our critical value for \(\alpha\) of 0.05, which means we do not have evidence to suggest that there is a difference between the mean values for the two methods.

### Using R to Complete a Wilcoxson Rank Sum Test

Let's use R on the data from Example 7.3.2, which compares two methods for determining the amount of aspirin in tablets from two production lots.

`# create vectors to store the data`

`lot1 = c(256, 248, 245, 244, 248, 261) `

`lot2= c(241, 258, 241, 256, 254)`

`# run a two-tailed wilcoxson signed rank test, setting mu to 0 as the null hypothesis is `

`# that the means are the same, and setting paired to TRUE`

` wilcox.test(x = lot1, y = lot2, alternative = "two.sided", mu = 0, paired = FALSE, conf.level = 0.95)`

The code above yields the following output

`Wilcoxon rank sum test with continuity correction `

`data: lot1 and lot2 `

`W = 16.5, p-value = 0.8541 `

`alternative hypothesis: true location shift is not equal to 0 `

`Warning message:`

`In wilcox.test.default(x = lot1, y = lot2, alternative = "two.sided", : cannot compute exact p-value with ties`

where the value *W* is the larger of the two ranked sums. The reported *p*-value of 0.8541 is greater than our critical value for \(\alpha\) of 0.05, which means we do not have evidence to suggest that there is a difference between the mean values for the two methods. Note: we can ignore the warning message here as our calculated value for *p* is very large relative to an \(\alpha\) of 0.05.

## Using R to Complete an Analysis of Variance

Let's use the data in Example 7.3.1 to show how to complete an analysis of variance in R. First, we need to create individual numerical vectors for each treatment and then combine these vectors into a single numerical vector, which we will call recovery, that contains the results for each treatment.

`a = c(101, 101, 104)`

`b = c(100, 98, 102)`

`c = c(90, 92, 94)`

`recovery = c(a, b, c)`

We also need to create a vector of character strings that identifies the individual treatments for each element in the vector recovery.

`treatment = c(rep("a", 3), rep("b", 3), rep("c", 3))`

The R function for completing an analysis of variance is `aov()`

, which takes the following form

`aov(formula, ...)`

where formula is a way of telling R to "explain this variable by using that variable." We will examine formulas in more detail in Chapter 8, but in this case the syntax is `recovery ~ treatment`

, which means to model the recovery based on the treatment. In the code below, we assign the output of the `aov()`

function to a variable so that we have access to the results of the analysis of variance

`aov_output = aov(recovery ~ treatment)`

through the `summary() `

function

`summary(aov_output)`

`Df Sum Sq Mean Sq F value Pr(>F) `

`treatment 2 168 84.00 22.91 0.00155 ** `

`Residuals 6 22 3.67 `

`--- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1`

Note that what we earlier called the between variance is identified here as the variance due to the treatments, and that we earlier called the within variance is identified her as the residual variance. As we saw in Example 7.3.1, the value for *F _{exp}* is significantly greater than the critical value for

*F*at \(\alpha = 0.05\).

Having found evidence that there is a significant difference between the treatments, we can use R's `TukeyHSD()`

function to identify the source(s) of that difference (HSD stands for Honest Significant Difference), which takes the general form

`TukeyHSD(x, conf.level = 0.95, ...)`

where `x`

is an object that contains the results of an analysis of variance.

`TukeyHSD(aov_output)`

`Tukey multiple comparisons of means `

`95% family-wise confidence level `

`Fit: aov(formula = recovery ~ treatment) `

`$treatment `

`diff lwr upr p adj `

`b-a -2 -6.797161 2.797161 0.4554965 `

`c-a -10 -14.797161 -5.202839 0.0016720 `

`c-b -8 -12.797161 -3.202839 0.0052447`

The table at the end of the output shows, for each pair of treatments, the difference in their mean values, the lower and the upper values for the confidence interval about the mean, and the value for \(\alpha\), which in R is listed as an adjusted *p*-value, for which we can reject the null hypothesis that the means are identical. In this case, we can see that the results for treatment C are significantly different from both treatments A and B.

We also can view the results of the TukeyHSD analysis visually by passing it to R's `plot()`

function.

`plot(TukeyHSD(aov_output))`