When using an analytical method we make three separate evaluations of experimental error. First, before beginning an analysis we evaluate potential sources of error to ensure that they will not adversely affect our results. Second, during the analysis we monitor our measurements to ensure that errors remain acceptable. Finally, at the end of the analysis we evaluate the quality of the measurements and results, comparing them to our original design criteria. This chapter provides an introduction to sources of error, to evaluating errors in analytical measurements, and to the statistical analysis of data.
One way to characterize data from replicate measurements is to assume that the measurements are scattered randomly around a central value that provides the best estimate of the expected, or “true,” value. There are two common ways to estimate central tendency: the mean and the median.
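As a minimal illustration in R, the snippet below computes both estimates for a set of seven replicate penny masses; the values are illustrative and are chosen to be consistent with the 3.117 g mean quoted later in this summary:

```r
# Seven replicate penny masses (g); illustrative values
mass <- c(3.080, 3.094, 3.107, 3.056, 3.112, 3.174, 3.198)

mean(mass)    # arithmetic mean, ~3.117 g
median(mass)  # middle value of the sorted data, 3.107 g
```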
We call errors that affect the accuracy of an analysis determinate errors. Although there may be several different sources of determinate error, each source has a specific magnitude and sign. Some sources of determinate error are positive and others are negative, and some are larger in magnitude than others. The cumulative effect of these determinate errors is a net positive or negative error in accuracy.
A propagation of uncertainty allows us to estimate the uncertainty in a result from the uncertainties in the measurements used to calculate the result.
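For independent measurements that are added or subtracted, the standard uncertainties combine in quadrature. A minimal R sketch, using hypothetical masses for a weighing by difference:

```r
# Propagation of uncertainty for R = A - B (mass by difference).
# Values and standard uncertainties are hypothetical.
A  <- 25.0541; uA <- 0.0001   # gross mass and its uncertainty (g)
B  <- 21.9371; uB <- 0.0001   # tare mass and its uncertainty (g)

R  <- A - B                   # the result
uR <- sqrt(uA^2 + uB^2)       # uncertainties add in quadrature for +/- operations
c(result = R, uncertainty = uR)
```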
A population is the set of all objects in the system we are investigating. For our experiment, the population is all United States pennies in circulation. This population is so large that we cannot analyze every member. Instead, we select and analyze a limited subset, or sample, of the population.
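To make the distinction concrete, the following R simulation draws a small sample from a synthetic population of one million pennies; the population’s mean and standard deviation are assumptions chosen only for illustration:

```r
# Simulated population of one million penny masses (assumed parameters).
set.seed(1)
population <- rnorm(1e6, mean = 3.10, sd = 0.05)

penny_sample <- sample(population, size = 7)  # the subset we actually analyze
mean(penny_sample)  # the sample mean estimates the population mean...
mean(population)    # ...which we normally cannot measure directly
```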
A confidence interval is a useful way to report the result of an analysis because it sets limits on the expected result. In the absence of determinate error, a confidence interval indicates the range of values in which we expect to find the population’s expected mean. When we report a 95% confidence interval for the mass of a penny as 3.117 g ± 0.047 g, for example, we are claiming that there is only a 5% probability that the expected mass of a penny is less than 3.070 g or more than 3.164 g.
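Using the replicate masses shown earlier, the interval follows from the t distribution; this short R sketch reproduces the quoted values:

```r
# 95% confidence interval for the mean penny mass.
mass <- c(3.080, 3.094, 3.107, 3.056, 3.112, 3.174, 3.198)
n    <- length(mass)

half <- qt(0.975, df = n - 1) * sd(mass) / sqrt(n)
c(mean = mean(mass), plus_minus = half)  # ~3.117 g and ~0.047 g
```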
The normal distribution is the most common distribution used for experimental results. Because the area between any two limits of a normal distribution is well defined, constructing and evaluating significance tests is straightforward.
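For example, R’s pnorm() returns the cumulative area under a normal curve, so the area between any two limits is a simple difference:

```r
# Area under the standard normal curve between two limits.
pnorm(1) - pnorm(-1)        # within +/- 1 sigma of the mean: ~0.683
pnorm(1.96) - pnorm(-1.96)  # within +/- 1.96 sigma: ~0.950
```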
A method’s detection limit is the smallest concentration or absolute amount of analyte that produces a signal significantly larger than the signal from a suitable blank. Although our interest is in the amount of analyte, in this section we will define the detection limit in terms of the analyte’s signal.
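A common working definition takes the detection-limit signal as the mean blank signal plus z times the blank’s standard deviation, often with z = 3. A minimal R sketch using hypothetical blank signals:

```r
# Signal detection limit: S_DL = mean(blank) + z * sd(blank), with z = 3.
# The blank signals below are hypothetical values for illustration.
blank <- c(0.0021, 0.0030, 0.0025, 0.0028, 0.0019, 0.0027, 0.0024)
z     <- 3

mean(blank) + z * sd(blank)   # smallest signal treated as detectable
```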
It can be tedious to work problems using nothing more than a calculator. Both Excel and R include functions for descriptive statistics, for finding probabilities for different distributions, and for carrying out significance tests. In addition, R provides useful functions for visualizing your data.
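For instance, a few base-R functions cover most of this chapter’s calculations; the reference value passed to t.test() below is hypothetical:

```r
mass <- c(3.080, 3.094, 3.107, 3.056, 3.112, 3.174, 3.198)

summary(mass)             # descriptive statistics
sd(mass)                  # standard deviation
t.test(mass, mu = 3.083)  # one-sample t-test against a hypothetical mean
hist(mass)                # quick visualization of the data's distribution
```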
This is the summary of "Chapter 4: Evaluating Analytical Data" from Harvey's "Analytical Chemistry 2.0" Textmap.
Thumbnail: The blue vertical line segments represent multiple realizations of a confidence interval for the population mean μ, represented as a red horizontal dashed line; note that some confidence intervals do not contain the population mean, as expected. Image used with permission (Public Domain; Tsyplakov).