20.6: The Most Probable Population Set at Constant N, V, and T



We are imagining that we can examine a collection of \(N\) distinguishable molecules and determine the energy of each molecule in the collection at any particular instant. If we do so, we find the population set, \(\{N_1,N_2,\dots ,N_i,\dots \}\), that characterizes the system at that instant. In Section 3.9, we introduce the idea that the most probable population set, \(\{N^{\textrm{⦁}}_1,N^{\textrm{⦁}}_2,\dots ,N^{\textrm{⦁}}_i,\dots \}\), or its proxy, \(\{NP\left({\epsilon }_1\right),NP\left({\epsilon }_2\right),\dots ,NP\left({\epsilon }_i\right),\dots \}\), is the best prediction we can make about the outcome of a future replication of this measurement. In Section 20.2, we hypothesize that the properties of the system when it is characterized by the most probable population set are indistinguishable from the properties of the system at equilibrium.

Now let us show that this hypothesis is implied by the central limit theorem. We suppose that the population set that characterizes the system varies from instant to instant and that we can find this population set at any given instant. The population set that we find at a particular instant comprises a random sample of \(N\) molecular energies. For this sample, we can find the average energy from

\[\overline{\epsilon }=\sum^{\infty }_{i=1}{\left(\frac{N_i}{N}\right)}{\epsilon }_i \nonumber \]

The expected value of the molecular energy is \[\left\langle \epsilon \right\rangle =\sum^{\infty }_{i=1}{P_i{\epsilon }_i} \nonumber \]

It is important to remember that \(\overline{\epsilon }\) and \(\left\langle \epsilon \right\rangle\) are not the same thing. There is a distribution of \(\overline{\epsilon }\) values, one \(\overline{\epsilon }\) value for each of the possible population sets \(\{N_1,N_2,\dots ,N_i,\dots \}\). In contrast, when \(N\), \(V\), and \(T\) are fixed, the expected value, \(\left\langle \epsilon \right\rangle\), is a constant; the value of \(\left\langle \epsilon \right\rangle\) is completely determined by the values of the variables that determine the state of the system and fix the probabilities \(P_i\). If our theory is to be useful, the value of \(\left\langle \epsilon \right\rangle\) must be the per-molecule energy that we observe for the macroscopic system we are modeling.
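The distinction can be made concrete with a small simulation. The following is a minimal sketch rather than part of the theory: it assumes a hypothetical four-level system whose energies (`energies`) and probabilities (`probs`) are made up for illustration, draws several independent samples of \(N=1000\) molecular energies, and compares each sample average \(\overline{\epsilon }\) with the single fixed expected value \(\left\langle \epsilon \right\rangle\).

```python
import random

# Hypothetical four-level system: these energies and probabilities are
# illustrative assumptions, not values taken from the text.
energies = [0.0, 1.0, 2.0, 3.0]
probs = [0.4, 0.3, 0.2, 0.1]


def expected_energy(energies, probs):
    """<eps> = sum_i P_i * eps_i; a constant once the P_i are fixed."""
    return sum(p * e for p, e in zip(probs, energies))


def sample_average(energies, probs, n, rng):
    """eps-bar for one randomly drawn population set of n molecules."""
    draws = rng.choices(range(len(energies)), weights=probs, k=n)
    counts = [draws.count(i) for i in range(len(energies))]  # the N_i
    return sum((n_i / n) * e for n_i, e in zip(counts, energies))


rng = random.Random(0)  # fixed seed so the run is repeatable
expected = expected_energy(energies, probs)
averages = [sample_average(energies, probs, 1000, rng) for _ in range(5)]

print("expected value <eps>:", expected)
print("sample averages eps-bar:", averages)
```

Each entry of `averages` corresponds to a different population set and so generally differs from the others, while `expected` does not change from sample to sample.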

According to the central limit theorem, the average energy of a randomly selected sample, \(\overline{\epsilon }\), approaches the expected value for the distribution, \(\left\langle \epsilon \right\rangle\), as the number of molecules in the sample becomes arbitrarily large. In the present instance, we hypothesize that the most probable population set, or its proxy, characterizes the equilibrium system. When \(N\) is sufficiently large, this hypothesis implies that the probability of the \(i^{th}\) energy level is given by \(P_i={N^{\textrm{⦁}}_i}/{N}\). Then the expected value of a molecular energy is

\[\left\langle \epsilon \right\rangle =\sum^{\infty }_{i=1}{P_i{\epsilon }_i}=\sum^{\infty }_{i=1}{\left(\frac{N^{\textrm{⦁}}_i}{N}\right){\epsilon }_i} \nonumber \]

Since the central limit theorem asserts that \(\overline{\epsilon }\) approaches \(\left\langle \epsilon \right\rangle\) as \(N\) becomes arbitrarily large:

\[0=\lim_{N\to \infty } \left(\overline{\epsilon }-\left\langle \epsilon \right\rangle \right)=\lim_{N\to \infty } \sum^{\infty }_{i=1}{\left(\frac{N_i}{N}-P_i\right){\epsilon }_i}=\lim_{N\to \infty } \sum^{\infty }_{i=1}{\left(\frac{N_i}{N}-\frac{N^{\textrm{⦁}}_i}{N}\right){\epsilon }_i} \nonumber \]

One way for the limit of this sum to be zero is for the limit of every individual term to be zero. If the \({\epsilon }_i\) were arbitrary, this would be the only way that the sum could always be zero. However, the \({\epsilon }_i\) and the \(P_i\) are related, so we might think that the sum is zero because of these relationships.

To see that the limit of every individual term must in fact be zero, we devise a new distribution. We assign a completely arbitrary number, \(X_i\), to each energy level. Now the \(i^{th}\) energy level is associated with an \(X_i\) as well as an \({\epsilon }_i\). We have an \(X\) distribution as well as an energy distribution. We can immediately calculate the expected value of \(X\). It is

\[\left\langle X\right\rangle =\sum^{\infty }_{i=1}{P_iX_i} \nonumber \]

When we find the population set \(\{N_1,N_2,\dots ,N_i,\dots \}\), we can calculate the corresponding average value of \(X\). It is \[\overline{X}=\sum^{\infty }_{i=1}{\left(\frac{N_i}{N}\right)}X_i \nonumber \]

The central limit theorem applies to any distribution. So, it certainly applies to the \(X\) distribution; the average value of \(X\) approaches the expected value of \(X\) as \(N\) becomes arbitrarily large:

\[0=\lim_{N\to \infty } \left(\overline{X}-\left\langle X\right\rangle \right)=\lim_{N\to \infty } \sum^{\infty }_{i=1}{\left(\frac{N_i}{N}-P_i\right)X_i}=\lim_{N\to \infty } \sum^{\infty }_{i=1}{\left(\frac{N_i}{N}-\frac{N^{\textrm{⦁}}_i}{N}\right)X_i} \nonumber \]

Now, because the \(X_i\) can be chosen completely arbitrarily, the only way that the limit of this sum can always be zero is for the limit of every individual term to be zero.

In the limit as \(N\to \infty\), we find that

\[{N_i}/{N}\to {N^{\textrm{⦁}}_i}/{N} \nonumber \]

As the number of molecules in the equilibrium system becomes arbitrarily large, the fraction of the molecules in each energy level at an arbitrarily selected instant approaches the fraction in that energy level in the equilibrium-characterizing most-probable population set, \(\{N^{\textrm{⦁}}_1,N^{\textrm{⦁}}_2,\dots N^{\textrm{⦁}}_i\dots \}\). In other words, the only population sets that we have any significant chance of observing in a large equilibrium system are population sets whose occupation fractions, \({N_i}/{N}\), are all very close to those, \({N^{\textrm{⦁}}_i}/{N}\), in the equilibrium-characterizing population set. Estimating \(P_i\) as the ratio \({N_i}/{N}\) gives essentially the same result whichever of these population sets we use. Below, we see that the \({\epsilon }_i\) and the \(P_i\) determine the thermodynamic properties of the system. Consequently, when we calculate any observable property of the macroscopic system, each of these population sets gives the same result.
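The convergence of the occupation fractions can be checked numerically. The sketch below is illustrative only: it assumes a hypothetical five-level system with made-up probabilities `probs` and arbitrary labels `x_values` (playing the role of the \(X_i\)), and it treats each molecule as an independent random draw. For increasing \(N\) it reports the largest deviation of any \({N_i}/{N}\) from \(P_i\), together with \(\left|\overline{X}-\left\langle X\right\rangle \right|\).

```python
import random

# Hypothetical five-level system; the probabilities P_i and the arbitrary
# labels X_i are illustrative assumptions.
probs = [0.35, 0.25, 0.2, 0.15, 0.05]
x_values = [7.0, -3.0, 42.0, 0.5, 11.0]


def occupation_fractions(probs, n, rng):
    """Draw n molecules and return the fraction N_i / n for each level."""
    counts = [0] * len(probs)
    for level in rng.choices(range(len(probs)), weights=probs, k=n):
        counts[level] += 1
    return [c / n for c in counts]


rng = random.Random(1)  # fixed seed so the run is repeatable
expected_x = sum(p * x for p, x in zip(probs, x_values))  # <X>

results = {}
for n in (100, 10_000, 1_000_000):
    fractions = occupation_fractions(probs, n, rng)
    max_dev = max(abs(f - p) for f, p in zip(fractions, probs))
    avg_x = sum(f * x for f, x in zip(fractions, x_values))  # X-bar
    results[n] = (max_dev, abs(avg_x - expected_x))
    print(n, max_dev, abs(avg_x - expected_x))
```

Because the labels in `x_values` are arbitrary, replacing them with any other finite numbers should leave the qualitative behavior unchanged: both deviations shrink as \(N\) grows.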

Since the only population sets that we have a significant chance of observing are those for which

\[{N_i}/{N}\approx {N^{\textrm{⦁}}_i}/{N} \nonumber \]

we frequently say that we can ignore all but the most probable population set. What we have in mind is that the most probable population set is the only one we need in order to calculate the macroscopic properties of the equilibrium system. We are incorrect, however, if we allow ourselves to think that the most probable population set is necessarily much more probable than any of the others. Nor does the fact that the \({N_i}/{N}\) are all very close to the \({N^{\textrm{⦁}}_i}/{N}\) mean that the \(N_i\) are all very close to the \(N^{\textrm{⦁}}_i\). Suppose that the difference between the two ratios is \({10}^{-10}\). If \(N={10}^{20}\), the difference between \(N_i\) and \(N^{\textrm{⦁}}_i\) is \({10}^{10}\), which probably falls outside the range of values that we usually understand by the words “very close.”
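Both cautions can be illustrated with the simplest possible case. The sketch below assumes a hypothetical system with only two equally probable energy levels (an assumption made purely for illustration), so that the most probable population set is \(\{N/2,N/2\}\) and its probability is the binomial term \(N!/\left[\left(N/2\right)!\left(N/2\right)!\,2^N\right]\). It also prints the standard deviation of \(N_1\), which is \(\sqrt{N}/2\) for this model, and the standard deviation of the fraction \({N_1}/{N}\).

```python
import math


def prob_most_probable(n):
    """Probability of the most probable population set {n/2, n/2} when n
    molecules (n even) are distributed over two equally likely levels."""
    log_p = (math.lgamma(n + 1)            # ln n!
             - 2 * math.lgamma(n // 2 + 1)  # ln[(n/2)!]^2
             - n * math.log(2))            # ln 2^n
    return math.exp(log_p)


for n in (100, 10_000, 1_000_000):
    p_best = prob_most_probable(n)
    sd_count = math.sqrt(n) / 2   # typical size of N_1 - N/2
    sd_fraction = sd_count / n    # typical size of N_1/n - 1/2
    print(n, p_best, sd_count, sd_fraction)
```

In this model the probability of the single most probable population set falls as \(N\) grows, and the typical absolute deviation of \(N_1\) grows, even though the typical deviation of the fraction \({N_1}/{N}\) shrinks.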

We develop a theory that includes a mathematical model for the probability that a molecule has any one of its quantum-mechanically possible energies. It turns out that we are frequently interested in macroscopic systems in which the number of energy levels greatly exceeds the number of molecules. For such systems, we find \(NP_i\ll 1\), and it is no longer possible to say that a single most-probable population set, \(\{N^{\textrm{⦁}}_1,N^{\textrm{⦁}}_2,\dots ,N^{\textrm{⦁}}_i,\dots \}\), describes the equilibrium state of the system. When it is very unlikely that any energy level is occupied by more than one molecule, the probability of any population set in which any \(N_i\) is greater than one becomes negligibly small. In the population sets that remain, every \(N_i\) is zero or one, so every \(N_i!\) equals one and the multinomial coefficient \(N!/\left(N_1!N_2!\dots N_i!\dots \right)\) reduces to \(N!\). We can therefore approximate the total probability sum as

\[1={\left(P_1+P_2+\dots +P_i+\dots \right)}^N\approx \sum_{\{N_i\}}{N!}P^{N_1}_1P^{N_2}_2\dots P^{N_i}_i\dots \nonumber \]

However, the idea that the proxy, \(\{NP\left({\epsilon }_1\right),NP\left({\epsilon }_2\right),\dots ,NP\left({\epsilon }_i\right),\dots \}\), describes the equilibrium state of the system remains valid. In these circumstances, a great many population sets can have essentially identical properties; the properties calculated from any of these are indistinguishable from each other and indistinguishable from the properties calculated from the proxy. Since the equilibrium properties are fixed, the value of these extended products is fixed. For any of the population sets available to such a system at equilibrium, we have

\[P^{N_1}_1P^{N_2}_2\dots P^{N_i}_i\dots =P^{{NP}_1}_1P^{{NP}_2}_2\dots P^{{NP}_i}_i\dots =\mathrm{constant} \nonumber \]
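A numerical sketch can make this near-constancy plausible. Everything specific in it is an assumption chosen for illustration: 200,000 hypothetical levels with probabilities proportional to \(\exp \left(-2i/M\right)\), only \(N=50\) molecules (so that \(NP_i\ll 1\) for every level), and independent random draws for the molecules. It compares the logarithm of the product, \(\sum_i{N_i \ln P_i}\), for several sampled population sets with the value obtained from the proxy exponents \(NP_i\).

```python
import math
import random
from itertools import accumulate

# Illustrative assumptions: M levels with P_i proportional to exp(-2 i / M)
# and N molecules, chosen so that N * P_i << 1 for every level.
M = 200_000
N = 50
weights = [math.exp(-2.0 * i / M) for i in range(M)]
total = sum(weights)
log_probs = [math.log(w / total) for w in weights]  # ln P_i

# ln of the product evaluated with the proxy exponents N * P_i.
proxy_log_product = N * sum((w / total) * lp
                            for w, lp in zip(weights, log_probs))

rng = random.Random(2)  # fixed seed so the run is repeatable
cum_weights = list(accumulate(weights))
sample_log_products = []
for _ in range(5):
    levels = rng.choices(range(M), cum_weights=cum_weights, k=N)
    # Each drawn molecule contributes ln P_i, so this is sum_i N_i ln P_i.
    sample_log_products.append(sum(log_probs[i] for i in levels))

print("proxy:", proxy_log_product)
print("samples:", sample_log_products)
```

For these assumed parameters the sampled values are expected to scatter around the proxy value by only a small percentage, and the relative scatter should narrow further as \(N\) increases.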

It follows that, for some constant, \(c\), we have

\[c=\sum^{\infty }_{i=1}{NP_i{ \ln P_i\ }}=N\sum^{\infty }_{i=1}{P_i{ \ln P_i\ }} \nonumber \]
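The step from the constant product to this sum is a logarithm. As a brief sketch of the intermediate algebra, using only the symbols already defined and taking \(c\) to be the logarithm of the constant product evaluated with the proxy exponents \(NP_i\):

\[ \ln \left(P^{NP_1}_1P^{NP_2}_2\dots P^{NP_i}_i\dots \right)=\sum^{\infty }_{i=1}{NP_i \ln P_i}=N\sum^{\infty }_{i=1}{P_i \ln P_i}=c \nonumber \]

Because every \(P_i\le 1\), every \(\ln P_i\le 0\), so \(c\) cannot be positive.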

As the theory develops, we see that the probability of finding a molecule in an energy level is its central feature.