3.13: The Expected Value of a Function of Several Variables and the Central Limit Theorem



We can extend the idea of an expected value to a function of multiple random variables. Let \(U\) and \(V\) be distributions whose random variables are \(u\) and \(v\), respectively. Let the probability density functions for these distributions be \({df_u\left(u\right)}/{du}\) and \({df_v\left(v\right)}/{dv}\). In general, these probability density functions are different functions; that is, \(U\) and \(V\) are different distributions. We assume that an observation made on \(U\) is independent of an observation made on \(V\). Let \(g\left(u,v\right)\) be some function of these random variables. The probability that an observation made on \(U\) produces a value of \(u\) in the range \(u^*<u<u^*+du\) is

\[P\left(u^*<u<u^*+du\right)\ =\frac{df_u\left(u^*\right)}{du}du\nonumber \]

and the probability that an observation made on \(V\) produces a value of \(v\) in the range \(v^*<v<v^*+dv\) is

\[P\left(v^*<v<v^*+dv\right)=\frac{df_v\left(v^*\right)}{dv}dv\nonumber \]

The probability that making one observation on each of these distributions produces a value of \(u\) that lies in the range \(u^*<u<u^*+du\) and a value of \(v\) that lies in the range \(v^*<v<v^*+dv\) is

\[\frac{df_u\left(u^*\right)}{du}\frac{df_v\left(v^*\right)}{dv}du\,dv\nonumber \]

In a straightforward generalization, we define the expected value of \(g\left(u,v\right)\), \(\left\langle g\left(u,v\right)\ \right\rangle\), as \[\left\langle g\left(u,v\right)\ \right\rangle =\int^{\infty }_{v=-\infty }{\int^{\infty }_{u=-\infty }{g\left(u,v\right)}}\frac{df_u\left(u\right)}{du}\frac{df_v\left(v\right)}{dv}dudv\nonumber \]

If \(g\left(u,v\right)\) is a sum of functions of independent variables, \(g\left(u,v\right)=h\left(u\right)+k\left(v\right)\), we have

\[\left\langle g\left(u,v\right)\right\rangle =\int^{\infty }_{-\infty }{\int^{\infty }_{-\infty }{\left[h\left(u\right)+k\left(v\right)\right]\frac{df_u\left(u\right)}{du}\frac{df_v\left(v\right)}{dv}}dudv}=\int^{\infty }_{-\infty }{h\left(u\right)\frac{df_u\left(u\right)}{du}}du+\int^{\infty }_{-\infty }{k\left(v\right)\frac{df_v\left(v\right)}{dv}}dv=\ \left\langle h\left(u\right)\ \right\rangle +\left\langle k\left(v\right)\ \right\rangle\nonumber \]

If \(g\left(u,v\right)\) is a product of functions of the independent variables, \(g\left(u,v\right)=h\left(u\right)k\left(v\right)\), we have

\[\left\langle g\left(u,v\right)\right\rangle =\int^{\infty }_{-\infty }{\int^{\infty }_{-\infty }{h\left(u\right)k\left(v\right)\frac{df_u\left(u\right)}{du}\frac{df_v\left(v\right)}{dv}}dudv}=\int^{\infty }_{-\infty }{h\left(u\right)\frac{df_u\left(u\right)}{du}}du\times \int^{\infty }_{-\infty }{k\left(v\right)\frac{df_v\left(v\right)}{dv}}dv=\left\langle h\left(u\right)\right\rangle \left\langle k\left(v\right)\right\rangle\nonumber \]
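The sum and product rules above can be checked numerically. The following is a minimal Monte Carlo sketch, not part of the original derivation. The choices of a uniform distribution on \([0,1]\) for \(U\), an exponential distribution with unit mean for \(V\), \(h\left(u\right)=u^2\), and \(k\left(v\right)=3v+1\) are illustrative assumptions, as is the function name `estimate_expectations`.

```python
import random


def estimate_expectations(n_samples=200_000, seed=0):
    """Estimate <h>, <k>, <h + k>, and <h * k> from independent draws of u and v.

    Assumed for illustration only: U is uniform on [0, 1], V is exponential
    with mean 1, h(u) = u**2, and k(v) = 3*v + 1.
    """
    rng = random.Random(seed)
    sum_h = sum_k = sum_sum = sum_prod = 0.0
    for _ in range(n_samples):
        u = rng.uniform(0.0, 1.0)   # one observation on U
        v = rng.expovariate(1.0)    # one independent observation on V
        h_u = u * u
        k_v = 3.0 * v + 1.0
        sum_h += h_u
        sum_k += k_v
        sum_sum += h_u + k_v
        sum_prod += h_u * k_v
    mean_h = sum_h / n_samples
    mean_k = sum_k / n_samples
    mean_sum = sum_sum / n_samples
    mean_prod = sum_prod / n_samples
    return mean_h, mean_k, mean_sum, mean_prod


if __name__ == "__main__":
    mean_h, mean_k, mean_sum, mean_prod = estimate_expectations()
    print("<h> + <k>  =", mean_h + mean_k, "  <h + k> =", mean_sum)
    print("<h> * <k>  =", mean_h * mean_k, "  <h * k> =", mean_prod)
```

For these assumed choices, the exact values are \(\left\langle h\right\rangle =1/3\) and \(\left\langle k\right\rangle =4\). The sample estimates of \(\left\langle h+k\right\rangle\) and \(\left\langle h\right\rangle +\left\langle k\right\rangle\) agree to rounding error, because averaging is linear. The estimates of \(\left\langle hk\right\rangle\) and \(\left\langle h\right\rangle \left\langle k\right\rangle\) agree only to within sampling error, and only because \(u\) and \(v\) are drawn independently.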

We can extend these conclusions to functions of the random variables of any number of distributions. If \(u_i\) is the random variable of distribution \(U_i\) whose probability density function is \({df_i\left(u_i\right)}/{du_i}\), the expected value of

\[g\left(u_1,\dots ,u_i,\dots ,u_N\right)=h_1\left(u_1\right)+\dots +h_i\left(u_i\right)+\dots +h_N\left(u_N\right)\nonumber \]

becomes

\[\left\langle g\left(u_1,\dots ,u_i,\dots ,u_N\right)\right\rangle =\sum^N_{i=1}{\left\langle h_i\left(u_i\right)\right\rangle }\nonumber \]

and the expected value of

\[g\left(u_1,\dots ,u_i,\dots ,u_N\right)=h_1\left(u_1\right)\dots h_i\left(u_i\right)\dots h_N\left(u_N\right)\nonumber \]

becomes \[\left\langle g\left(u_1,\dots ,u_i,\dots ,u_N\right)\right\rangle =\ \ \prod^N_{i=1}{\left\langle h_i\left(u_i\right)\right\rangle }\nonumber \]

We are particularly interested in expected values for repeated trials made on the same distribution. We consider distributions for which the outcome of one trial is independent of the outcome of any other trial. The probability density function is the same for every trial, so we have \(f\left(u\right)=f_1\left(u_1\right)=\dots =f_i\left(u_i\right)=\dots =f_N\left(u_N\right)\). Let the values obtained for the random variable in a series of trials on the same distribution be \(u_1\),…, \(u_i\),…, \(u_N\). For each trial, we have

\[\left\langle h_i\left(u_i\right)\right\rangle \ =\ \ \int^{\infty }_{-\infty }{h_i\left(u_i\right)\frac{df_i\left(u_i\right)}{du_i}}du_i\nonumber \]

If we consider the special case of repeated trials in which the functions \(h_i\left(u_i\right)\) are all the same function, so that \(h\left(u\right)=h_1\left(u_1\right)=\dots =h_i\left(u_i\right)=\dots =h_N\left(u_N\right)\), the expected value of

\[g\left(u_1,\dots ,u_i,\dots ,u_N\right)=h_1\left(u_1\right)+\dots +h_i\left(u_i\right)+\dots +h_N\left(u_N\right)\nonumber \]

becomes

\[\left\langle g\left(u_1,\dots ,u_i,\dots ,u_N\right)\right\rangle =\ \sum^N_{i=1}{\left\langle h_i\left(u_i\right)\right\rangle \ }=N\left\langle h\left(u\right)\right\rangle\nonumber \]

and the expected value of

\[g\left(u_1,\dots ,u_i,\dots ,u_N\right)=h_1\left(u_1\right)\dots h_i\left(u_i\right)\dots h_N\left(u_N\right)\nonumber \]

becomes

\[\left\langle g\left(u_1,\dots ,u_i,\dots ,u_N\right)\right\rangle =\ \prod^N_{i=1}{\left\langle h_i\left(u_i\right)\right\rangle }={\left\langle h\left(u\right)\right\rangle }^N\nonumber \]
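The two repeated-trial results can be verified exactly for a small discrete case. The sketch below is an illustration under stated assumptions, not part of the original text: it replaces the continuous probability density with a fair six-sided die, so the integrals become sums over equally probable outcomes, and it takes \(h\left(u\right)=u\) and \(N=3\). The function name `exact_expectations` is hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import prod


def exact_expectations(n_trials=3, faces=(1, 2, 3, 4, 5, 6)):
    """Return (<u>, <u_1 + ... + u_N>, <u_1 * ... * u_N>) as exact fractions.

    Assumed for illustration only: each trial is a roll of a fair die,
    trials are independent, and h(u) = u.
    """
    p_face = Fraction(1, len(faces))
    mean_single = sum(Fraction(face) * p_face for face in faces)

    # Independent trials: every sequence of N faces has probability p_face**N.
    p_outcome = p_face ** n_trials
    expected_sum = Fraction(0)
    expected_product = Fraction(0)
    for outcome in product(faces, repeat=n_trials):
        expected_sum += p_outcome * sum(outcome)
        expected_product += p_outcome * prod(outcome)
    return mean_single, expected_sum, expected_product


if __name__ == "__main__":
    mean_single, expected_sum, expected_product = exact_expectations()
    print(expected_sum == 3 * mean_single)        # sum rule: N <h(u)>
    print(expected_product == mean_single ** 3)   # product rule: <h(u)>**N
```

Because the arithmetic uses exact fractions, both comparisons are equalities rather than approximate agreements. The enumeration grows as \(6^N\), so this approach is practical only for small \(N\).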

Now let us consider \(N\) independent trials on the same distribution and let \(h_i\left(u_i\right)=h\left(u_i\right)=u_i\). Then, the expected value of

\[g\left(u_1,\dots ,u_i,\dots ,u_N\right)= h_1\left(u_1\right)+\dots +h_i\left(u_i\right)+\dots +h_N\left(u_N\right)\nonumber \]

becomes \[\ \left\langle u_1+\dots +u_i+\dots +u_N\right\rangle =\sum^N_{i=1}{\left\langle u_i\right\rangle }=N\left\langle u\right\rangle =N\mu\nonumber \]

By definition, the average of \(N\) repeated trials is \({\overline{u}}_N={\left(u_1+\dots +u_i+\dots +u_N\right)}/{N}\), so that its expected value is

\[\left\langle {\overline{u}}_N\right\rangle =\frac{\left\langle u_1+\dots +u_i+\dots +u_N\right\rangle }{N}=\mu\nonumber \]

This proves one element of the central limit theorem: The mean of a distribution of averages-of- \(N\) values of a random variable drawn from a parent distribution is equal to the mean of the parent distribution.

The variance of these averages-of- \(N\) is

\[\begin{aligned}\sigma^2_N&=\left\langle {\left({\overline{u}}_N-\mu \right)}^2\right\rangle =\left\langle {\left[\left(\frac{1}{N}\sum^N_{i=1}{u_i}\right)-\mu \right]}^2\right\rangle =\left\langle {\left[\left(\frac{1}{N}\sum^N_{i=1}{u_i}\right)-\frac{N\mu }{N}\right]}^2\right\rangle \\ &=\frac{1}{N^2}\left\langle {\left[\left(\sum^N_{i=1}{u_i}\right)-N\mu \right]}^2\right\rangle =\frac{1}{N^2}\left\langle {\left[\sum^N_{i=1}{\left(u_i-\mu \right)}\right]}^2\right\rangle \\ &=\frac{1}{N^2}\left\langle \sum^N_{i=1}\left(u_i-\mu \right)^2\right\rangle +\frac{2}{N^2}\left\langle \sum^{N-1}_{i=1}\sum^N_{j=i+1}\left(u_i-\mu \right)\left(u_j-\mu \right)\right\rangle \\ &=\frac{1}{N^2}\sum^N_{i=1}\left\langle \left(u_i-\mu \right)^2\right\rangle +\frac{2}{N^2}\sum^{N-1}_{i=1}\sum^N_{j=i+1}\left\langle u_i-\mu \right\rangle \left\langle u_j-\mu \right\rangle \end{aligned}\nonumber \]

In the last step, we use the fact that the expected value of a sum is the sum of the expected values. Because the trials are independent, the expected value of each cross term \(\left(u_i-\mu \right)\left(u_j-\mu \right)\), with \(j\neq i\), is the product of the expected values of its factors. The last term is zero, because for every trial

\[\left\langle u_i-\mu \right\rangle =\left\langle u_i\right\rangle -\mu =0\nonumber \]
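As an illustration of why the cross terms vanish, the case \(N=2\) can be written out in full. This special case is a sketch added for concreteness; it assumes only that the two trials are independent. There is a single cross term, and it factors into a product of two expected values, each of which is zero:

\[\left\langle {\left[\left(u_1-\mu \right)+\left(u_2-\mu \right)\right]}^2\right\rangle =\left\langle {\left(u_1-\mu \right)}^2\right\rangle +\left\langle {\left(u_2-\mu \right)}^2\right\rangle +2\left\langle u_1-\mu \right\rangle \left\langle u_2-\mu \right\rangle =\left\langle {\left(u_1-\mu \right)}^2\right\rangle +\left\langle {\left(u_2-\mu \right)}^2\right\rangle\nonumber \]

Only the two squared terms survive, so the variance of the average of two trials is one-fourth of their sum.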

By definition, \(\sigma^2=\ \left\langle \ {\left(u_i-\mu \right)}^2\ \right\rangle\), so that we have

\[\sigma^2_N=\frac{N\sigma^2}{N^2}=\frac{\sigma^2}{N}\nonumber \]

This proves a second element of the central limit theorem: The variance of an average of \(N\) values of a random variable drawn from a parent distribution is equal to the variance of the parent distribution divided by \(N\).
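Both results can be illustrated by simulation. The sketch below is a minimal example under stated assumptions, not part of the original derivation: the parent distribution is taken to be uniform on \([0,1]\), for which \(\mu =1/2\) and \(\sigma^2=1/12\), and the function name `simulate_averages` and its default parameters are hypothetical.

```python
import random
import statistics


def simulate_averages(n_per_average=10, n_averages=20_000, seed=1):
    """Return the mean and the variance of many averages-of-N.

    Assumed for illustration only: the parent distribution is uniform on
    [0, 1], so its mean is 1/2 and its variance is 1/12.
    """
    rng = random.Random(seed)
    parent_mean = 0.5
    averages = []
    for _ in range(n_averages):
        # One average-of-N: N independent trials on the same distribution.
        draws = [rng.random() for _ in range(n_per_average)]
        averages.append(sum(draws) / n_per_average)

    mean_of_averages = statistics.fmean(averages)
    # Variance about the parent mean, matching <(u_bar_N - mu)**2>.
    variance_of_averages = statistics.pvariance(averages, mu=parent_mean)
    return mean_of_averages, variance_of_averages


if __name__ == "__main__":
    n = 10
    mean_n, var_n = simulate_averages(n_per_average=n)
    print("mean of averages:    ", mean_n, " expected:", 0.5)
    print("variance of averages:", var_n, " expected:", (1.0 / 12.0) / n)
```

With these assumed settings, the simulated mean should lie close to \(\mu =1/2\), and the simulated variance should lie close to \(\sigma^2/N=1/120\), with small deviations from sampling error. Increasing `n_per_average` narrows the distribution of the averages in proportion to \(1/N\).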