Chemistry LibreTexts

2.1: Measures of Center


    Both graphical and numerical methods of summarizing data make up the branch of statistics known as descriptive statistics.  This section introduces numerical measurements to describe sample data.

    This section focuses on measures of central tendency. Often you want to know what to expect “on average.” For example, when you pick a career, you would probably ask how much you can expect to earn in that field. If you are trying to buy a home, you might ask how much homes are selling for in your area. If you are planting vegetables in your garden, you might want to know how long it will be until you can harvest. These questions, and many more, can be answered by knowing the center of the data set. The three most common measures of the “center” of the data are the mode, mean, and median.

    3.1.1 Mode

    To find the mode, you count how often each data value occurs, and then determine which data value occurs most often.

    The mode is the data value that occurs the most frequently in the data.

    There may not be a mode at all, or there may be more than one mode. If two values tie for the greatest number of occurrences, then both values are modes and the data is called bimodal (two modes). If every data value occurs the same number of times, there is no mode. If more than two values tie for the greatest number of occurrences, we usually write that there is no mode. For grouped data in a frequency distribution or a histogram, the class with the largest frequency is called the modal class.

    The dotplot below shows the heights, in cm, of some 3-year-old children. We would like to answer the question, “How tall are 3-year-olds?”

    Figure 3-1: Dotplot of the heights (in cm) of 3-year-old children.

    From the graph, we can see that the most frequent value is 95 cm. This is not exactly the middle of the distribution, but it is the most common height and is close to the middle in this case. We call this most frequent value the mode.

    For larger data sets, use software to find the mode, or at least sort the data so that you can see groupings of equal values. Excel reports only one mode: when several values tie for the highest frequency, it returns the one that appears first in the data. Be careful in Excel with bimodal data, or with data in which many values tie and there is really no mode at all.

    Note that zero may be the most frequent value in a data set. A mode of 0 is not the same as “no mode” in the data set.

    The mode is the observation that occurs most often.

    • Example 3-1: -5, 4, 8, 3, 4, 2, 0; mode = 4
    • Example 3-2: 3, -6, 0, 1, -2, 1, 0, 5, 0; mode = 0
    • Example 3-3: 18, 25, 15, 32, 10, 27; no mode (Excel writes N/A)
    • Example 3-4: 15, 23, 18, 15, 24, 23, 17; modes = 15, 23 (bimodal)
    • Example 3-5: 100, 125, 100, 125, 130, 140, 130, 140; no mode (Excel gives 100)
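
    The counting procedure for the mode can be sketched in Python. This is only an illustration: the function name `find_modes` is hypothetical, and the sketch reports every tied value rather than applying the convention that more than two tied values means "no mode". It returns an empty list when every value occurs equally often.

```python
from collections import Counter


def find_modes(data):
    """Return the modes of data as a sorted list, or [] if there is no mode."""
    if not data:
        return []
    counts = Counter(data)            # maps each value to its frequency
    highest = max(counts.values())    # the largest frequency in the data
    # If every distinct value occurs equally often, report no mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(value for value, c in counts.items() if c == highest)


print(find_modes([-5, 4, 8, 3, 4, 2, 0]))                     # Example 3-1: [4]
print(find_modes([3, -6, 0, 1, -2, 1, 0, 5, 0]))              # Example 3-2: [0]
print(find_modes([18, 25, 15, 32, 10, 27]))                   # Example 3-3: []
print(find_modes([15, 23, 18, 15, 24, 23, 17]))               # Example 3-4: [15, 23]
print(find_modes([100, 125, 100, 125, 130, 140, 130, 140]))   # Example 3-5: []
```

    Unlike a single-valued spreadsheet function, this sketch separates a mode of 0 (the list `[0]`) from no mode (the empty list).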

    Summation Notation

    Throughout this course, we will be using summation notation, also called sigma notation. The capital Greek letter Σ “sigma” means to add. For example, Σx means to sum all of the x values, where x is the variable name.

     

    A random sample of households had the following number of children living at home 4, –3, 2, 1, and 3. Calculate Σx.

    Solution

    Let \(x_1 = 4\), \(x_2 = -3\), \(x_3 = 2\), \(x_4 = 1\), \(x_5 = 3\). Start with the first value, i = 1, and continue up to the nth value, i = 5, to get \(\sum_{i=1}^{n} x_{i} = 4 + (-3) + 2 + 1 + 3 = 7\).

    To make things simpler we will drop the limits of summation, and often the subscript as well, and write \(\sum_{i=1}^{n} x_{i}\) as \(\sum x_i\) or \(\sum x\).

    The order of operations is important in summation notation.

    For example, \(\sum x^2 = (4)^2 + (-3)^2 + (2)^2 + (1)^2 + (3)^2 = 39\).

    When we insert parentheses, \((\sum x)^2 = (4 + (-3) + 2 + 1 + 3)^2 = (7)^2 = 49\).

    Note that \(\sum x^2 \neq (\sum x)^2\).
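
    The difference between the two orders of operations can be checked with a short Python sketch. The variable names are chosen here only for illustration.

```python
# The five data values from the summation example.
x = [4, -3, 2, 1, 3]

sum_x = sum(x)                            # add the values: 7
sum_of_squares = sum(v ** 2 for v in x)   # square each value, then add: 39
square_of_sum = sum(x) ** 2               # add first, then square the total: 49

print(sum_x, sum_of_squares, square_of_sum)  # 7 39 49
```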

    “‘One of the interesting things about space,’ Arthur heard Slartibartfast saying to a large and voluminous creature who looked like someone losing a fight with a pink duvet and was gazing raptly at the old man's deep eyes and silver beard, ‘is how dull it is.’

    ‘Dull?’ said the creature, and blinked her rather wrinkled and bloodshot eyes.

    ‘Yes,’ said Slartibartfast, ‘staggeringly dull. Bewilderingly so. You see, there's so much of it and so little in it. Would you like me to quote some statistics?’

    ‘Er, well…’

    ‘Please, I would like to. They, too, are quite sensationally dull.’” (Adams, 2002)

    3.1.2 Mean

    The mean is the arithmetic average of the numbers. This is the center that most people call the average.

    Distinguishing between a population and a sample is very important in statistics. We frequently use a representative sample to generalize about a population.

    A statistic is any characteristic or measure from a sample. A parameter is any characteristic or measure from a population. We use sample statistics to make inferences about population parameters.

    The sample mean \(\overline{x}\) (pronounced “x bar”) of a sample of n observations \(x_1, x_2, x_3, \ldots, x_n\) taken from a population is given by the formula:

    \(\overline{x} = \frac{\sum x}{n} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n}\).

    The population mean μ (pronounced “mu”) is the average of the entire population of N values and is given by the formula:

    \(\mu = \frac{\sum x}{N} = \frac{x_1 + x_2 + x_3 + \cdots + x_N}{N}\).

    In most cases, you cannot find the population parameter, so you use the sample statistic to estimate it. Since μ cannot be calculated in most situations, the value of \(\overline{x}\) is used to estimate μ. You should memorize the symbol μ and what it represents for future reference.

     

    Find the mean for the following sample of house prices ($1,000): 325, 375, 385, 395, 420, and 825.

    Solution

    Before starting any mathematics problem, it is always a good idea to define the unknown in the problem. In this case, you want to define the variable. The symbol for the variable is x. The variable is x = price of a house in $1,000.

    \(\overline{x} = \frac{\sum x}{n} = \frac{325+375+385+395+420+825}{6} = 454.1\overline{6}\)

    The sample mean house price is $454,166.67.
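
    As a quick check, the same calculation can be sketched in Python. The names `prices` and `sample_mean` are illustrative only.

```python
# House prices in $1,000 from the example above.
prices = [325, 375, 385, 395, 420, 825]

n = len(prices)                  # sample size, n = 6
sample_mean = sum(prices) / n    # x-bar = (sum of x) / n

print(round(sample_mean, 2))     # 454.17, i.e. about $454,167
```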

     

    3.1.3 Median

    Another statistic that measures the center of a distribution is the median.

    The median is the data value in the middle of the ordered data that has 50% of the data below that point and 50% of the data above that point. The median is also referred to as the 50th percentile and is the midpoint of a distribution.

    To find the median:

    1. Arrange the observations from smallest to largest.
    2. If the number of observations n is odd, the middle observation is the median.
    3. If the number of observations n is even, the mean of the two middle observations is the median.

     

    Find the median for the following sample of ages: 15, 23, 18, 15, 24, 23, and 17.

    Solution

    First, sort the data: 15, 15, 17, 18, 23, 23, and 24. The sample size is odd so the median will be the middle number. Use your fingers to cover outside numbers, one pair at a time until you get to 18. Median = 18 years old.

     

    Find the median for the following sample of house prices (in $1,000): 325, 375, 385, 395, 420, and 825.

    Solution

    The data is already ordered from smallest to largest. The sample size is even, so take the average of the two middle values: \(\frac{385+395}{2} = 390\). The median house price is $390,000.

    We can use technology to find the median. Directions for the TI calculators are in the next section. In Excel the median is found using the cell function MEDIAN(array). For this example, we can type the data into column A and then in a blank cell =MEDIAN(A1:A6).
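
    The three-step procedure for the median can also be written out directly. The following Python sketch is one possible way to do it; the function name `find_median` is hypothetical.

```python
def find_median(values):
    """Return the median of a non-empty list of numbers."""
    if not values:
        raise ValueError("the median of an empty data set is undefined")
    ordered = sorted(values)     # step 1: arrange from smallest to largest
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:               # step 2: n is odd, take the middle observation
        return ordered[middle]
    # step 3: n is even, take the mean of the two middle observations
    return (ordered[middle - 1] + ordered[middle]) / 2


ages = [15, 23, 18, 15, 24, 23, 17]
prices = [325, 375, 385, 395, 420, 825]   # in $1,000

print(find_median(ages))     # 18
print(find_median(prices))   # 390.0
```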

    Recall that the sample mean house price is $454,167. Note that the median is much lower than the mean for this example. The observation of 825 is an outlier and is very large compared to the rest of the data. The sample mean is sensitive to unusual observations, i.e. outliers. The median is resistant to outliers.

    3.1.4 Outliers

    An outlier is a data value that is very different from the rest of the data and lies far from the center. If there are extreme values in the data, the median is a better measure of the center than the mean. The mean is not a resistant measure because it is pulled in the direction of the outlier. The median and the mode are resistant measures because extreme values have little or no effect on them.
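
    To see how differently the two measures react to an outlier, the sketch below recomputes the mean and median of the house price sample with and without the value 825. It uses Python's standard `statistics` module; the variable names are illustrative only.

```python
import statistics

# House prices in $1,000; 825 is the outlier.
with_outlier = [325, 375, 385, 395, 420, 825]
without_outlier = [325, 375, 385, 395, 420]

mean_with = statistics.mean(with_outlier)          # about 454.17
mean_without = statistics.mean(without_outlier)    # 380
median_with = statistics.median(with_outlier)      # 390.0
median_without = statistics.median(without_outlier)  # 385

print("change in mean:  ", round(mean_with - mean_without, 2))
print("change in median:", median_with - median_without)
```

    Adding the single value 825 raises the mean by roughly 74 (thousand dollars) but raises the median by only 5.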

    As a consumer, you need to be aware that people choose the measure of center that best supports their claim. When you read an article in the newspaper and it talks about the “average,” it usually means the mean but sometimes it refers to the median. Some articles will use the word “median” instead of “average” to be more specific. If you need to make an important decision and the information says “average,” it would be wise to ask if the “average” is the mean or the median before you decide.

    As an example, suppose that a company's administration wants to use the mean salary as the average salary for the company. The high salaries of the administration pull the mean higher, so the company can say that the employees are paid well because the average is high. However, the employees’ union wants to use the median, since it discounts the extreme values of the administration and gives a lower value for the average. This makes the salaries seem lower and suggests that a raise is in order.

    Why use the mean instead of the median? When multiple samples are taken from the same population, the sample means tend to be more consistent than other measures of the center. The sample mean is the more reliable measure of center.
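
    The consistency claim can be explored with a small simulation. The sketch below is an illustration under stated assumptions, not a proof: it assumes a normally distributed population with mean 100 and standard deviation 15, draws 2,000 samples of size 25, and compares how much the sample means and sample medians vary from sample to sample. All names and parameter values are chosen here for illustration. For populations with heavy tails or many outliers, the comparison can come out differently.

```python
import random
import statistics


def sampling_spread(num_samples=2000, n=25, seed=1):
    """Return (st. dev. of sample means, st. dev. of sample medians)."""
    rng = random.Random(seed)    # fixed seed so the run is repeatable
    means, medians = [], []
    for _ in range(num_samples):
        # Assumed population: normal with mean 100 and standard deviation 15.
        sample = [rng.gauss(100, 15) for _ in range(n)]
        means.append(statistics.mean(sample))
        medians.append(statistics.median(sample))
    return statistics.stdev(means), statistics.stdev(medians)


spread_of_means, spread_of_medians = sampling_spread()
print("spread of sample means:  ", round(spread_of_means, 2))
print("spread of sample medians:", round(spread_of_medians, 2))
```

    Under these assumptions the sample means should show the smaller spread, which is the sense in which the mean is described as more consistent.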

     


    This page titled 2.1: Measures of Center is shared under a CC BY-SA 4.0 license and was authored, remixed, and/or curated by Rachel Webb via source content that was edited to the style and standards of the LibreTexts platform.