# 4.2: Characterizing Experimental Errors

Characterizing the mass of a penny using the data in Table 4.1 suggests two questions. First, does our measure of central tendency agree with the penny’s expected mass? Second, why is there so much variability in the individual results? The first of these questions addresses the accuracy of our measurements, and the second asks about their precision. In this section we consider the types of experimental errors affecting accuracy and precision.

#### 4.2.1 Errors Affecting Accuracy

Accuracy is a measure of how close a measure of central tendency is to the expected value, μ. We can express accuracy as either an absolute error, *e*

\[e = \overline{X} - μ\tag{4.2}\]

or as a percent relative error, %*e*_{r}.

\[\%e_\ce{r}= \dfrac{\overline{X} − μ}{μ} × 100\tag{4.3}\]

Note

The convention for representing statistical parameters is to use a Roman letter for a value calculated from experimental data, and a Greek letter for the corresponding expected value. For example, the experimentally determined mean is \(\overline{X}\), and its underlying expected value is μ. Likewise, the standard deviation by experiment is *s*, and the underlying expected value is σ.

Although equations 4.2 and 4.3 use the mean as the measure of central tendency, we also can use the median.
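As a minimal sketch of equations 4.2 and 4.3, the following Python snippet computes both error measures using either the mean or the median. The replicate masses and the expected value are hypothetical numbers chosen only for illustration; they are not the data in Table 4.1.

```python
from statistics import mean, median

def absolute_error(central_value, expected):
    """Equation 4.2: e = X-bar - mu."""
    return central_value - expected

def percent_relative_error(central_value, expected):
    """Equation 4.3: %e_r = (X-bar - mu) / mu * 100."""
    return (central_value - expected) / expected * 100

# Hypothetical replicate masses (g) and a hypothetical expected value (g).
masses = [3.080, 3.094, 3.107, 3.056, 3.112]
mu = 3.100

x_bar = mean(masses)
print(f"mean = {x_bar:.4f} g, e = {absolute_error(x_bar, mu):+.4f} g")
print(f"%e_r (mean)   = {percent_relative_error(x_bar, mu):+.2f}%")
print(f"%e_r (median) = {percent_relative_error(median(masses), mu):+.2f}%")
```

A negative sign indicates that the measure of central tendency falls below the expected value.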

We call errors affecting the accuracy of an analysis determinate. Although there may be several different sources of **determinate error**, each source has a specific magnitude and sign. Some sources of determinate error are positive and others are negative, and some are larger in magnitude and others are smaller. The cumulative effect of these determinate errors is a net positive or negative error in accuracy.

Note

It is possible, although unlikely, that the positive and negative determinate errors will offset each other, producing a result with no net error in accuracy.

We assign determinate errors into four categories—sampling errors, method errors, measurement errors, and personal errors—each of which we consider in this section.

##### Sampling Errors

A determinate **sampling error** occurs when our sampling strategy does not provide a representative sample. For example, if you monitor the environmental quality of a lake by sampling a single location near a point source of pollution, such as an outlet for industrial effluent, then your results will be misleading. In determining the mass of a U.S. penny, our strategy for selecting pennies must ensure that we do not include pennies from other countries.

Note

An awareness of potential sampling errors is especially important when working with heterogeneous materials. Strategies for obtaining representative samples are covered in Chapter 5.

##### Method Errors

In any analysis the relationship between the signal and the absolute amount of analyte, *n*_{A}, or the analyte’s concentration, *C*_{A}, is

\[S_\ce{total} = k_\ce{A}n_\ce{A} + S_\ce{mb}\tag{4.4}\]

\[S_\ce{total} = k_\ce{A}C_\ce{A} + S_\ce{mb}\tag{4.5}\]

where *k*_{A} is the method’s sensitivity for the analyte and *S*_{mb} is the signal from the method blank. A determinate **method error** exists when our value for *k*_{A} or *S*_{mb} is invalid. For example, a method in which *S*_{total} is the mass of a precipitate assumes that *k*_{A} is defined by a pure precipitate of known stoichiometry. If this assumption is not true, then the resulting determination of *n*_{A} or *C*_{A} is inaccurate. We can minimize a determinate error in *k*_{A} by calibrating the method. A method error due to an interferent in the reagents is minimized by using a proper method blank.

##### Measurement Errors

The manufacturers of analytical instruments and equipment, such as glassware and balances, usually provide a statement of the item’s maximum **measurement error**, or **tolerance**. For example, a 10-mL volumetric pipet (Figure 4.2) has a tolerance of ±0.02 mL, which means that the pipet delivers an actual volume within the range 9.98–10.02 mL at a temperature of 20 °C. Although we express this tolerance as a range, the error is determinate; thus, the pipet’s expected volume is a fixed value within the stated range.

**Figure 4.2** Close-up of a 10-mL volumetric pipet showing that it has a tolerance of ±0.02 mL at 20 °C.

Volumetric glassware is categorized into classes depending on its accuracy. Class A glassware is manufactured to comply with tolerances specified by agencies such as the National Institute of Standards and Technology or the American Society for Testing and Materials. The tolerance level for Class A glassware is small enough that we normally can use it without calibration. The tolerance levels for Class B glassware are usually twice those for Class A glassware. Other types of volumetric glassware, such as beakers and graduated cylinders, are unsuitable for accurately measuring volumes. Table 4.2 provides a summary of typical measurement errors for Class A volumetric glassware. Tolerances for digital pipets and for balances are listed in Table 4.3 and Table 4.4.

We can minimize determinate measurement errors by calibrating our equipment. Balances are calibrated using a reference weight whose mass can be traced back to the SI standard kilogram. Volumetric glassware and digital pipets can be calibrated by determining the mass of water that they deliver or contain and using the density of water to calculate the actual volume. It is never safe to assume that a calibration will remain unchanged during an analysis or over time. One study, for example, found that repeatedly exposing volumetric glassware to higher temperatures during machine washing and oven drying leads to small but significant changes in the glassware’s calibration.^{2} Many instruments drift out of calibration over time and may require frequent recalibration during an analysis.
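The gravimetric calibration described above might be sketched as follows. The water masses are hypothetical, and the sketch assumes a water temperature of 20 °C and ignores the buoyancy correction for weighing in air, which a careful calibration would include.

```python
from statistics import mean

# Density of water at 20 °C in g/mL (assumes the water is at this temperature).
WATER_DENSITY_20C = 0.99821

def delivered_volume(mass_g, density_g_per_ml=WATER_DENSITY_20C):
    """Convert the mass of delivered water to a volume in mL."""
    return mass_g / density_g_per_ml

# Hypothetical masses (g) of water delivered by a nominal 10-mL pipet.
water_masses = [9.9736, 9.9815, 9.9742, 9.9791]

volumes = [delivered_volume(m) for m in water_masses]
calibrated_volume = mean(volumes)
print(f"calibrated volume = {calibrated_volume:.3f} mL")
```

The mean of the delivered volumes then serves as the pipet's calibrated volume in place of its nominal value.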

^{†} Tolerance values are from the ASTM E288, E542, and E694 standards.

^{†} Values are from www.eppendorf.com. ^{‡} Units for volume match the units for the pipet’s range.

Balance | Capacity (g) | Measurement Error |
---|---|---|
Precisa 160M | 160 | ±1 mg |
A & D ER 120M | 120 | ±0.1 mg |
Mettler H54 | 160 | ±0.01 mg |

##### Personal Errors

Finally, analytical work is always subject to **personal error**. Examples include a limited ability to see a change in the color of an indicator signaling the endpoint of a titration; biases, such as consistently overestimating or underestimating the value on an instrument’s readout scale; failing to calibrate instrumentation; and misinterpreting procedural directions. You can minimize personal errors by taking proper care.

##### Identifying Determinate Errors

Determinate errors can be difficult to detect. Without knowing the expected value for an analysis, the usual situation in any analysis that matters, there is nothing to which we can compare our experimental result. Nevertheless, there are strategies we can use to detect determinate errors.

The magnitude of a **constant determinate error** is the same for all samples and is more significant when analyzing smaller samples. Analyzing samples of different sizes, therefore, allows us to detect a constant determinate error. For example, consider a quantitative analysis in which we separate the analyte from its matrix and determine its mass. Let’s assume that the sample is 50.0% w/w analyte. As shown in Table 4.5, the expected amount of analyte in a 0.100 g sample is 0.050 g. If the analysis has a positive constant determinate error of 0.010 g, then analyzing the sample gives 0.060 g of analyte, or a concentration of 60.0% w/w. As we increase the size of the sample the obtained results become closer to the expected result. An upward or downward trend in a graph of the analyte’s obtained concentration versus the sample’s mass (Figure 4.3) is evidence of a constant determinate error.

**Figure 4.3** Effect of a constant determinate error on the determination of an analyte in samples of varying size.

A **proportional determinate error**, in which the error’s magnitude depends on the amount of sample, is more difficult to detect because the result of the analysis is independent of the amount of sample. Table 4.6 outlines an example showing the effect of a positive proportional error of 1.0% on the analysis of a sample that is 50.0% w/w in analyte. Regardless of the sample’s size, each analysis gives the same result of 50.5% w/w analyte.
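The contrast between the two kinds of determinate error can be sketched numerically. The function name and the list of sample masses below are illustrative assumptions; only the 0.100 g sample, the 50.0% w/w composition, the 0.010 g constant error, and the 1.0% proportional error come from the text.

```python
def obtained_percent(sample_g, true_fraction=0.500,
                     constant_error_g=0.0, proportional_error=0.0):
    """Return the obtained %w/w analyte for one sample.

    constant_error_g   -- fixed mass added to every result (g)
    proportional_error -- fractional error that scales with the analyte mass
    """
    expected_g = true_fraction * sample_g
    obtained_g = expected_g * (1 + proportional_error) + constant_error_g
    return obtained_g / sample_g * 100

# Hypothetical sample masses (g); the first matches the example in the text.
sample_masses = [0.100, 0.200, 0.400, 0.800, 1.600]

constant = [obtained_percent(m, constant_error_g=0.010) for m in sample_masses]
proportional = [obtained_percent(m, proportional_error=0.010) for m in sample_masses]

for m, c, p in zip(sample_masses, constant, proportional):
    print(f"{m:5.3f} g   constant: {c:5.1f}%   proportional: {p:5.1f}%")
```

With the constant error, the obtained concentration drifts toward 50.0% w/w as the sample grows; with the proportional error, it stays at 50.5% w/w for every sample, so varying the sample size reveals nothing.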

One approach for detecting a proportional determinate error is to analyze a standard containing a known amount of analyte in a matrix similar to the samples. Standards are available from a variety of sources, such as the National Institute of Standards and Technology (where they are called **Standard Reference Materials**) or the American Society for Testing and Materials. Table 4.7, for example, lists certified values for several analytes in a standard sample of *Ginkgo biloba* leaves. Another approach is to compare your analysis to an analysis carried out using an independent analytical method known to give accurate results. If the two methods give significantly different results, then a determinate error is the likely cause.

^{†} The primary purpose of this Standard Reference Material is to validate analytical methods for determining flavonoids, terpene lactones, and toxic elements in Ginkgo biloba or other materials with a similar matrix. Values are from the official Certificate of Analysis available at www.nist.gov.

Constant and proportional determinate errors have distinctly different sources, which we can define in terms of the relationship between the signal and the moles or concentration of analyte (equation 4.4 and equation 4.5). An invalid method blank, *S*_{mb}, is a constant determinate error as it adds or subtracts a constant value to the signal. A poorly calibrated method, which yields an invalid sensitivity for the analyte, *k*_{A}, will result in a proportional determinate error.
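To see why the two terms in equation 4.5 behave differently, consider the following sketch. The true sensitivity, the true blank signal, the analyst's incorrect values, and the concentrations are all hypothetical numbers in arbitrary units.

```python
def signal(c_a, k_a, s_mb):
    """Equation 4.5: S_total = k_A * C_A + S_mb."""
    return k_a * c_a + s_mb

def reported_concentration(s_total, k_a_assumed, s_mb_assumed):
    """Invert equation 4.5 using the analyst's values for k_A and S_mb."""
    return (s_total - s_mb_assumed) / k_a_assumed

# Hypothetical true values, in arbitrary units.
K_TRUE, S_MB_TRUE = 2.00, 0.10
concentrations = [1.0, 2.0, 4.0]

for c in concentrations:
    s = signal(c, K_TRUE, S_MB_TRUE)
    blank_err = reported_concentration(s, K_TRUE, 0.00) - c    # invalid S_mb
    sens_err = reported_concentration(s, 1.90, S_MB_TRUE) - c  # invalid k_A
    print(f"C_A = {c:.1f}: blank error = {blank_err:+.3f}, "
          f"sensitivity error = {sens_err:+.3f} ({sens_err / c:+.1%})")
```

In this sketch the invalid blank shifts every result by the same absolute amount, whereas the invalid sensitivity shifts every result by the same percentage.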

#### 4.2.2 Errors Affecting Precision

Precision is a measure of the spread of individual measurements or results about a central value, which we express as a range, a standard deviation, or a variance. We make a distinction between two types of precision: repeatability and reproducibility. **Repeatability** is the precision when a single analyst completes the analysis in a single session using the same solutions, equipment, and instrumentation. **Reproducibility**, on the other hand, is the precision under any other set of conditions, including between analysts, or between laboratory sessions for a single analyst. Since reproducibility includes additional sources of variability, the reproducibility of an analysis cannot be better than its repeatability.

Errors affecting precision are indeterminate and are characterized by random variations in their magnitude and their direction. Because they are random, positive and negative **indeterminate errors** tend to cancel, provided that enough measurements are made. In such situations the mean or median is largely unaffected by the precision of the analysis.
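A short simulation can illustrate this cancellation. It is only a sketch: it assumes the indeterminate error is normally distributed about zero, and the expected value and standard deviation are hypothetical.

```python
import random
from statistics import mean

rng = random.Random(0)   # fixed seed so the sketch is repeatable
TRUE_VALUE = 3.100       # hypothetical expected value (g)
SIGMA = 0.050            # hypothetical size of the indeterminate error (g)

def simulated_mean(n):
    """Mean of n measurements that contain only random, zero-centered error."""
    return mean(rng.gauss(TRUE_VALUE, SIGMA) for _ in range(n))

for n in (5, 50, 5000):
    diff = simulated_mean(n) - TRUE_VALUE
    print(f"n = {n:4d}: mean - true value = {diff:+.4f} g")
```

As the number of measurements grows, the difference between the mean and the expected value tends to shrink, even though the scatter of the individual measurements does not change.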

##### Sources of Indeterminate Error

We can assign indeterminate errors to several sources, including collecting samples, manipulating samples during the analysis, and making measurements. When collecting a sample, for instance, only a small portion of the available material is taken, increasing the chance that small-scale inhomogeneities in the sample will affect repeatability. Individual pennies, for example, may show variations from several sources, including the manufacturing process, and the loss of small amounts of metal or the addition of dirt during circulation. These variations are sources of indeterminate sampling errors.

During an analysis there are many opportunities for introducing indeterminate method errors. If our method for determining the mass of a penny includes directions for cleaning them of dirt, then we must be careful to treat each penny in the same way. Cleaning some pennies more vigorously than others introduces an indeterminate method error.

Finally, any measuring device is subject to an indeterminate measurement error due to limitations in reading its scale. For example, a buret with scale divisions every 0.1 mL has an inherent indeterminate error of ±0.01–0.03 mL when we estimate the volume to the hundredth of a milliliter (Figure 4.4).

**Figure 4.4** Close-up of a buret showing the difficulty in estimating volume. With scale divisions every 0.1 mL it is difficult to read the actual volume to better than ±0.01–0.03 mL.

##### Evaluating Indeterminate Error

An indeterminate error due to analytical equipment or instrumentation is generally easy to estimate by measuring the standard deviation for several replicate measurements, or by monitoring the signal’s fluctuations over time in the absence of analyte (Figure 4.5) and calculating the standard deviation. Other sources of indeterminate error, such as treating samples inconsistently, are more difficult to estimate.

**Figure 4.5** Background noise in an instrument showing the random fluctuations in the signal.
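Estimating an instrument's indeterminate error from its background signal could look like the following sketch. The blank readings are hypothetical values in arbitrary signal units; they are not taken from Figure 4.5.

```python
from statistics import mean, stdev

# Hypothetical blank readings (arbitrary signal units) recorded over time
# with no analyte present.
blank_signal = [0.012, 0.009, 0.011, 0.014, 0.008, 0.010, 0.013, 0.011]

noise = stdev(blank_signal)   # sample standard deviation (n - 1 denominator)
print(f"mean blank = {mean(blank_signal):.4f}, s = {noise:.4f}")
```

The same calculation applies to replicate measurements of a single sample.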

To evaluate the effect of indeterminate measurement error on our analysis of the mass of a circulating United States penny, we might make several determinations for the mass of a single penny (Table 4.8). The standard deviation for our original experiment (see Table 4.1) is 0.051 g, and it is 0.0024 g for the data in Table 4.8. The significantly better precision when determining the mass of a single penny suggests that the precision of our analysis is not limited by the balance. A more likely source of indeterminate error is a significant variability in the masses of individual pennies.

Note

In Section 4.5 we will discuss a statistical method—the F-test—that you can use to show that this difference is significant.
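As a rough numerical comparison, the sketch below computes the ratio of the two variances, which is the quantity an F-test is based on. Judging significance also requires a critical value that depends on the number of measurements in each data set, which is not computed here. The estimate of the balance's share of the variance assumes that the sources of variance are independent and additive.

```python
s_many_pennies = 0.051    # g, standard deviation for the data in Table 4.1
s_single_penny = 0.0024   # g, standard deviation for the data in Table 4.8

# Ratio of variances, larger over smaller.
variance_ratio = s_many_pennies**2 / s_single_penny**2

# Fraction of the total variance attributable to the balance, assuming the
# sources of variance are independent and additive.
balance_share = s_single_penny**2 / s_many_pennies**2

print(f"variance ratio = {variance_ratio:.0f}")
print(f"balance share of variance = {balance_share:.2%}")
```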

#### 4.2.3 Error and Uncertainty

Analytical chemists make a distinction between error and uncertainty.^{3} **Error** is the difference between a single measurement or result and its expected value. In other words, error is a measure of **bias**. As discussed earlier, we can divide error into determinate and indeterminate sources. Although we can correct for determinate errors, the indeterminate portion of the error remains. With statistical significance testing, which is discussed later in this chapter, we can determine if our results show evidence of bias.

**Uncertainty** expresses the range of possible values for a measurement or result. Note that this definition of uncertainty is not the same as our definition of precision. We calculate precision from our experimental data, providing an estimate of indeterminate errors. Uncertainty accounts for all errors—both determinate and indeterminate—that might reasonably affect a measurement or result. Although we always try to correct determinate errors before beginning an analysis, the correction itself is subject to uncertainty.

Here is an example to help illustrate the difference between precision and uncertainty. Suppose you purchase a 10-mL Class A pipet from a laboratory supply company and use it without any additional calibration. The pipet’s tolerance of ±0.02 mL is its uncertainty because your best estimate of its expected volume is 10.00 mL ± 0.02 mL. (See Table 4.2 for the tolerance of a 10-mL Class A transfer pipet.) This uncertainty is primarily determinate error. If you use the pipet to dispense several replicate portions of solution, the resulting standard deviation is the pipet’s precision. Table 4.9 shows results for ten such trials, with a mean of 9.992 mL and a standard deviation of ±0.006 mL. This standard deviation is the precision with which we expect to deliver a solution using a Class A 10-mL pipet. In this case the published uncertainty for the pipet (±0.02 mL) is worse than its experimentally determined precision (±0.006 mL). Interestingly, the data in Table 4.9 allow us to calibrate this specific pipet’s delivery volume as 9.992 mL. If we use this volume as a better estimate of this pipet’s expected volume, then its uncertainty is ±0.006 mL. As expected, calibrating the pipet allows us to decrease its uncertainty.^{4}
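The numbers in this example can be checked with a few lines of Python. The helper function is an illustrative assumption; the values are those quoted in the paragraph above.

```python
NOMINAL_ML, TOLERANCE_ML = 10.00, 0.02     # Class A tolerance quoted in the text
CALIBRATED_ML, STD_DEV_ML = 9.992, 0.006   # mean and s quoted for Table 4.9

def relative_percent(uncertainty, value):
    """Express an uncertainty as a percentage of the value it applies to."""
    return uncertainty / value * 100

# The calibrated volume should fall inside the manufacturer's stated range.
within_tolerance = abs(CALIBRATED_ML - NOMINAL_ML) <= TOLERANCE_ML

print(f"calibrated volume within tolerance: {within_tolerance}")
print(f"uncalibrated: {relative_percent(TOLERANCE_ML, NOMINAL_ML):.2f}%")
print(f"calibrated:   {relative_percent(STD_DEV_ML, CALIBRATED_ML):.2f}%")
```

The calibrated volume lies within the pipet's stated tolerance, and the relative uncertainty after calibration is smaller than the relative uncertainty implied by the tolerance alone.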