Chemistry LibreTexts

17.2: The Thermal Boltzmann Distribution

Consider an N-particle ensemble. The particles are not necessarily indistinguishable and may have mutual potential energy. Since this is a large system, there are many different ways to arrange its particles that yield the same thermodynamic state. Only one arrangement can occur at a time, so the probability of the state is the sum of the probabilities of the separate arrangements. If every arrangement is equally probable, the probability of the system is

\[p_N=W_N p_i\]

where \(p_N\) is the probability of the system, \(W_N\) is the total number of different possible arrangements of the N particles in the system, and \(p_i\) is the probability of each separate arrangement. Heisenberg's uncertainty principle states that it is impossible to simultaneously know the momentum and the position of an object with complete precision. In agreement with the uncertainty principle, the total possible number of separate ways can be defined as the total number of distinguishable rearrangements of the N particles.

The most practical ensemble is the canonical ensemble, with N, V, and T fixed. We can imagine a collection of boxes with equal volumes and equal numbers of particles, with the entire collection kept in thermal equilibrium. From the Boltzmann factor, we know that for a system that has states with energies \(e_1,e_2,e_3\)..., the probability \(p_j\) that the system will be in the state \(j\) with energy \(e_j\) decreases exponentially with the energy of state \(j\). The partition function plays a very important role in calculating the properties of a system; for example, it can be used to calculate the probability, as well as the energy, heat capacity, and pressure.

The Boltzmann Distribution

We are ultimately interested in the probability that a given distribution will occur. The reason for this is that we must have this information in order to obtain useful thermodynamic averages. The method used to obtain the distribution function of the ensemble of systems is known as the method of the most probable distribution. We begin with the statistical entropy,

\[S = k \ln W\]

The weight, \(W\) (or thermodynamic probability), is the number of ways that distinguishable particles can be arranged into groups such that \(a_0\) is the number in the zeroth group, \(a_1\) is the number in the first group, etc., so that \(W = A!/(a_0!\,a_1!\,a_2! \cdots)\), where \(A\) is the total number of systems in the ensemble.

  • \(A\) = total number of systems.
  • \(a_0, a_1, a_2 … \) are the occupation numbers, i.e., the number of systems in each quantum state.
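The counting behind the weight can be made concrete with a short Python sketch. It assumes the multinomial form \(W = A!/(a_0!\,a_1!\,a_2! \cdots)\) implied by the expression for \(\ln W\) used later in this section; the function name `weight` and the example occupation numbers are illustrative choices, not part of the original text.

```python
from math import factorial, log

def weight(occupations):
    """Number of ways W to place distinguishable systems into states.

    occupations[j] is a_j, the number of systems in state j.
    W = A! / (a_0! a_1! a_2! ...), with A the sum of the a_j.
    """
    w = factorial(sum(occupations))
    for a in occupations:
        w //= factorial(a)  # each division is exact for a multinomial coefficient
    return w

# Hypothetical ensemble of A = 4 systems spread over three states.
print(weight([4, 0, 0]))       # all systems in one state: 1 arrangement
print(weight([2, 1, 1]))       # 4!/(2! 1! 1!) = 12 arrangements
print(log(weight([2, 1, 1])))  # S/k = ln W for that distribution
```

Spreading the systems over more states raises \(W\), which is why the entropy \(S = k \ln W\) favors broad distributions until the constraints intervene.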

The overall probability \(P_j\) that a system is in the jth quantum state is obtained by averaging \(a_j/A\) over all the allowed distributions. Thus, \(P_j\) is given by

\[ P_j = \dfrac{\langle a_j \rangle}{A} = \dfrac{1}{A} \dfrac{ \sum_a W(a) a_j(a)}{\sum_a W(a)}\]

where the angle brackets indicate an ensemble average. Using this definition we can calculate any average property (i.e. any thermodynamic property)

\[ \langle M \rangle = \sum_j M_j P_j \label{avg}\]
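As a minimal illustration of this average, the sketch below evaluates \(\langle M \rangle = \sum_j M_j P_j\) for a hypothetical three-state system. The energies and probabilities are made-up numbers chosen only to show the arithmetic, and `ensemble_average` is an illustrative name.

```python
def ensemble_average(values, probabilities):
    """Return <M> = sum_j M_j P_j for per-state values M_j and probabilities P_j."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(m * p for m, p in zip(values, probabilities))

# Hypothetical three-state system: energies in arbitrary units, assumed probabilities.
energies = [0.0, 1.0, 2.0]
probs = [0.5, 0.3, 0.2]
print(ensemble_average(energies, probs))  # 0*0.5 + 1*0.3 + 2*0.2, about 0.7
```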

The method of the most probable distribution is based on the idea that the average over \(P_j\) is identical to the most probable distribution (i.e., that the distribution is arbitrarily narrow in width). Physically, this results from the fact that we have so many particles in a typical system that the fluctuations from the mean are extremely (immeasurably) small. This point is confusing to most students. If we think only of translational motion, the number of states increases dramatically as the energy (and the quantum number) increases. Although the number of states is an increasing function of energy, the total kinetic energy is fixed and must be distributed in some statistical manner among all of the available molecules.

The equivalence of the average probability of an occupation number and the most probable distribution is expressed as follows:

\[ P_j = \dfrac{\langle a_j \rangle}{A} = \dfrac{a_j}{A}\]

To find the most probable distribution we maximize the probability function subject to two constraints.

  • Constraint 1: Conservation of energy requires \[ E_{total} = \sum_j a_j e_j \label{con1}\] where \(e_j\) is the energy of the jth quantum state.
  • Constraint 2: Conservation of mass (here, of the number of systems) requires \[ A = \sum_j a_j \label{con2}\] which says only that the total number of systems in the ensemble must be \(A\).
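For a very small ensemble, the constrained maximization can be carried out by brute force, which may help make the two constraints concrete. The sketch below is a toy example under stated assumptions: six systems, four evenly spaced integer energy levels, and a total energy of six units, all hypothetical. It lists every distribution that satisfies both constraints, weights each by \(W = A!/\prod_j a_j!\), and compares the most probable distribution with the \(W\)-weighted average occupation numbers.

```python
from itertools import product
from math import factorial

def weight(occupations):
    """W = A! / (a_0! a_1! ...) for one distribution of occupation numbers."""
    w = factorial(sum(occupations))
    for a in occupations:
        w //= factorial(a)
    return w

def allowed_distributions(n_systems, total_energy, energies):
    """Yield every tuple (a_0, a_1, ...) that satisfies both constraints."""
    for occ in product(range(n_systems + 1), repeat=len(energies)):
        if sum(occ) != n_systems:                                        # constraint 2
            continue
        if sum(a * e for a, e in zip(occ, energies)) != total_energy:   # constraint 1
            continue
        yield occ

# Hypothetical toy ensemble: A = 6, e_j = 0, 1, 2, 3 (integer units), E_total = 6.
energies = [0, 1, 2, 3]
dists = list(allowed_distributions(6, 6, energies))
weights = [weight(d) for d in dists]
most_probable = dists[weights.index(max(weights))]
total_w = sum(weights)
average = [sum(w * d[j] for w, d in zip(weights, dists)) / total_w
           for j in range(len(energies))]
print(most_probable)  # the distribution with the largest W
print(average)        # W-weighted average occupation numbers <a_j>
```

With only six systems the average and the most probable distribution differ noticeably. The method of the most probable distribution relies on the two becoming indistinguishable as \(A\) grows very large.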

Using \(S = k \ln W\) we can reason that the system will tend towards the distribution among the \(a_j\) that maximizes \(S\). This can be expressed as

\[\sum_j \left(\dfrac{\partial S}{\partial a_j}\right) da_j = 0\]

Since \(S = k \ln W\) and \(k\) is a constant, this condition is satisfied by

\[\sum_j \left(\dfrac{\partial \ln W }{\partial a_j}\right) da_j = 0\]

subject to constraints

\[ \sum_j e_j da_j =0\]

\[ \sum_j da_j =0\]

The method of Lagrange multipliers (named after Joseph Louis Lagrange) is a strategy for finding the local maxima and minima of a function subject to equality constraints. Using the method of Lagrange undetermined multipliers, we have:

\[ \sum_j \left[  \left(\dfrac{\partial \ln W }{\partial a_j}\right)da_j + \alpha da_j - \beta e_j da_j \right] = 0\]

We can use Stirling's approximation

\[\ln x! \approx x\ln x - x\]
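A quick numerical check shows how good this approximation is for the large numbers that occur in an ensemble. The sketch below compares the exact \(\ln x!\), computed with Python's `math.lgamma`, against \(x \ln x - x\); the sample values of \(x\) are arbitrary.

```python
from math import lgamma, log

def ln_factorial(x):
    """Exact ln(x!) via the log-gamma function: ln(x!) = lgamma(x + 1)."""
    return lgamma(x + 1)

def stirling(x):
    """Stirling's approximation ln(x!) ~ x ln x - x."""
    return x * log(x) - x

for x in (10, 100, 10_000, 1_000_000):
    exact = ln_factorial(x)
    approx = stirling(x)
    # The relative error shrinks steadily as x grows.
    print(x, exact, approx, (exact - approx) / exact)
```

The relative error is more than ten percent at \(x = 10\) but falls below one part in a million at \(x = 10^6\), far smaller than the numbers of systems usually considered.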

to evaluate

\[ \left(\dfrac{\partial \ln W }{\partial a_j}\right) \]

to get

\[ \dfrac{\partial \ln A! }{\partial a_j} - \sum_i \dfrac{\partial \ln a_i! }{\partial a_j} \]

as outlined below.

Application of Stirling's Approximation

The first step is to note that

\[\ln W = \ln A! - \sum_j \ln a_j! \approx A \ln A - A - \sum_j a_j \ln a_j + \sum_j a_j\]

Since (from Equation \(\ref{con2}\))

\[A = \sum_j a_j\]

the terms \(-A\) and \(+\sum_j a_j\) cancel to give

\[\ln W = A \ln A - \sum_j a_j \ln a_j\]

The derivative is:

\[  \left(\dfrac{\partial \ln W}{\partial a_j} \right) =   \dfrac{\partial A \ln A}{\partial a_j}  - \sum_i  \dfrac{\partial a_i \ln a_i}{\partial a_j} \]

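Evaluating each term with the product rule, we have:

\[\left(\dfrac{\partial A \ln A}{\partial a_j} \right) =  \dfrac{\partial A }{\partial a_j}  \ln A + \dfrac{\partial A }{\partial a_j}  = \ln A + 1\]

\[\sum_i \left(\dfrac{\partial a_i \ln a_i}{\partial a_j} \right) = \sum_i \left[ \dfrac{\partial a_i }{\partial a_j}  \ln a_i + \dfrac{\partial a_i }{\partial a_j} \right] = \ln a_j + 1\]

These derivatives result from the fact that \(\partial A/\partial a_j = 1\) (from Equation \(\ref{con2}\)) and

\[\left( \dfrac{\partial a_i}{\partial a_j} \right) = 1 \quad (i = j)\]

\[\left( \dfrac{\partial a_i}{\partial a_j}\right)=0 \quad (i \neq j)\]

Subtracting gives \(\partial \ln W/\partial a_j = \ln A - \ln a_j = -\ln (a_j/A)\). Substituting this into the Lagrange condition and requiring the coefficient of each \(da_j\) to vanish, the simple expression that results from these manipulations is: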

\[ - \ln \left( \dfrac{a_j}{A} \right) + \alpha - \beta e_j =0\]
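The derivative \(\partial \ln W/\partial a_j = -\ln(a_j/A)\) that enters this expression can be checked numerically. The sketch below is only a sanity check under assumptions: it evaluates \(\ln W = \ln A! - \sum_j \ln a_j!\) exactly with `math.lgamma`, uses hypothetical large occupation numbers, and lets \(A\) change together with \(a_j\), as in the derivation above.

```python
from math import lgamma, log

def ln_weight(occ):
    """ln W = ln A! - sum_j ln a_j!, with A = sum of the a_j (lgamma allows real a_j)."""
    total = sum(occ)
    return lgamma(total + 1) - sum(lgamma(a + 1) for a in occ)

occ = [5000.0, 3000.0, 2000.0]  # hypothetical large occupation numbers
j, h = 1, 1.0                   # state to vary and finite-difference step

up = list(occ)
up[j] += h
down = list(occ)
down[j] -= h

numeric = (ln_weight(up) - ln_weight(down)) / (2 * h)  # central difference
analytic = -log(occ[j] / sum(occ))                     # -ln(a_j / A)
print(numeric, analytic)
```

The two numbers agree to about four significant figures here; the small residual comes from the terms that Stirling's approximation drops.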

The most probable distribution is

\[ \dfrac{a_j}{A} = e^{\alpha} e^{-\beta e_j} \label{Eq3}\]

Now we need to find the undetermined multipliers \(\alpha\) and \(\beta\).

Summing Equation \(\ref{Eq3}\) over all states \(j\), the left-hand side is 1 (from Equation \(\ref{con2}\)), so \(e^{\alpha} = 1/\sum_i e^{-\beta e_i}\). Thus, we have

\[ P_j= \dfrac{a_j}{A} = \dfrac{ e^{-\beta e_j}} {\sum_i e^{-\beta e_i}}\]
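The distribution is straightforward to evaluate once the energy levels and \(\beta\) are known. The following is a minimal sketch, assuming a hypothetical set of four levels and \(\beta = 1\) in matching units; the function name `boltzmann_probabilities` is illustrative.

```python
from math import exp

def boltzmann_probabilities(energies, beta):
    """Return P_j = exp(-beta*e_j) / sum_i exp(-beta*e_i) for each state."""
    # Shifting by the lowest energy leaves every ratio unchanged
    # and keeps the exponentials from overflowing.
    e_min = min(energies)
    factors = [exp(-beta * (e - e_min)) for e in energies]
    norm = sum(factors)
    return [f / norm for f in factors]

energies = [0.0, 1.0, 2.0, 3.0]  # hypothetical level energies
probs = boltzmann_probabilities(energies, beta=1.0)
print(probs)       # populations fall off with increasing energy
print(sum(probs))  # normalization: the probabilities add up to 1
```

The ratio of any two populations depends only on the energy difference, \(P_j/P_i = e^{-\beta (e_j - e_i)}\), which is why shifting all energies by a constant does not change the result.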

This determines \(\alpha\) and defines the Boltzmann distribution. We will show that \(\beta\) from the optimization procedure of the method of Lagrange multipliers is

\[\beta = \dfrac{1}{k_B T}\]

This identification will show the importance of temperature in the Boltzmann distribution. The distribution represents a thermally equilibrated most probable distribution over all energy levels (Figure \(\PageIndex{1}\)).

Figure \(\PageIndex{1}\): At lower temperatures, the lower energy states are more greatly populated. At higher temperatures, there are more higher energy states populated, but each is populated less. \(k_BT \approx 2.5\; kJ \;mol^{-1}\) at 300 K. (CC-SA-BY; Wikiversity).

Note: Boltzmann Distribution

The Boltzmann distribution represents a thermally equilibrated most probable distribution over all energy levels. There is always a higher population in a state of lower energy than in one of higher energy.

Once we know the probability distribution for energy, we can calculate thermodynamic properties like the energy, entropy, free energies and heat capacities, which are all average quantities (Equation \(\ref{avg}\)). To calculate \(P_j\), we need the energy levels of a system (i.e., \(\{e_i\}\)). The energy ("levels") of a system can be built up from the quantum energy levels.

It must always be remembered that no matter how large the energy spacing is, there is always a non-zero probability of the upper level being populated. The only exception is a system that is at absolute zero. This situation is however hypothetical as absolute zero can be approached but not reached.
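To put numbers on this, the sketch below computes the ratio of upper-level to lower-level population, \(e^{-\Delta E/RT}\), for a few energy gaps at 300 K. It works per mole, so it uses the gas constant \(R\) (Boltzmann's constant times Avogadro's number) in place of \(k_B\); the gap values are hypothetical.

```python
from math import exp

R = 8.314462618  # molar gas constant in J mol^-1 K^-1

def population_ratio(gap_kj_per_mol, temperature):
    """Upper-to-lower population ratio exp(-gap / RT), with the gap in kJ/mol."""
    return exp(-gap_kj_per_mol * 1000.0 / (R * temperature))

print(R * 300.0 / 1000.0)  # thermal energy at 300 K in kJ/mol, close to 2.5
for gap in (1.0, 10.0, 50.0):  # hypothetical energy gaps in kJ/mol
    print(gap, population_ratio(gap, 300.0))
```

Even for a gap of 50 kJ/mol, about twenty times the thermal energy, the ratio is tiny but still greater than zero.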

Partition Function

The sum over all factors \( e^{-\beta e_j} \) is given a name. It is called the molecular partition function, \(q\).

\[ q = \sum_j e^{-\beta e_j}\]

The molecular partition function \(q\) gives an indication of the average number of states that are thermally accessible to a molecule at the temperature of the system. The partition function is a sum over states (with the factor \(\beta\) multiplying the energy in the exponent of each Boltzmann factor) and is a pure number. The larger the value of \(q\), the larger the number of states that are available for the molecular system to occupy (Figure \(\PageIndex{2}\)).

Figure \(\PageIndex{2}\): At lower temperatures, the lower energy states are more greatly populated. At higher temperatures, there are more higher energy states populated, but each is populated less.
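The interpretation of \(q\) as a count of thermally accessible states can be illustrated with a short sketch. It assumes a hypothetical ladder of 50 evenly spaced states, with energies measured in units of the level spacing, and evaluates \(q\) at a few values of \(k_BT\) in the same units.

```python
from math import exp

def partition_function(energies, beta):
    """q = sum_j exp(-beta * e_j) over the listed states."""
    return sum(exp(-beta * e) for e in energies)

# Hypothetical ladder of 50 evenly spaced states; the spacing defines the energy unit.
levels = [float(n) for n in range(50)]
for kT in (0.1, 1.0, 5.0):
    print(kT, partition_function(levels, 1.0 / kT))
```

When \(k_BT\) is much smaller than the spacing, \(q\) is very close to 1 because only the ground state is populated. As \(k_BT\) rises, \(q\) grows to roughly 1.6 and then 5.5 for this ladder, tracking the number of states with appreciable population.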

We distinguish here between the partition function of the ensemble, \(Q\), and that of an individual molecule, \(q\). Since \(Q\) represents a sum over all states accessible to the system, it can be written as

\[ Q(N,V,T) = \sum_{i,j,k ...} e^{-\beta ( e_i + e_j +e_k ...)}\]

where the indices \(i,\,j,\,k\) represent energy levels of different particles.

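Regardless of the type of particle, the molecular partition function \(q\) represents the energy levels of one individual molecule. Because the exponential of a sum is a product of exponentials, we can rewrite the above sum as

\[Q = q_iq_jq_k \ldots\]

or, if all the molecules are of the same type,

\[Q = q^N\]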

for \(N\) particles. Note that \(q_i\) means a sum over states or energy levels accessible to molecule \(i\) and \(q_j\) means the same for molecule \(j\). The molecular partition function \(q_i\) counts the energy levels accessible to molecule \(i\) only. \(Q\) counts not only the states of all of the molecules, but all of the possible combinations of occupations of those states. However, if the particles are not distinguishable, then we will have overcounted the states by a factor of \(N!\), so that \(Q = q^N/N!\). The factor of \(N!\) is exactly how many times we can swap the indices in \(Q(N,V,T)\) and get the same value (again provided that the particles are not distinguishable).
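The factorization of \(Q\) for distinguishable, non-interacting particles can be verified directly on a small example. The sketch below assumes three hypothetical energy levels, \(N = 3\), and an arbitrary \(\beta\). It sums \(e^{-\beta(e_i + e_j + e_k)}\) over every assignment of levels to the particles and compares the result with \(q^N\). The last line prints the commonly used \(q^N/N!\) estimate for indistinguishable particles, which is itself an approximation that holds when the accessible states greatly outnumber the particles.

```python
from itertools import product
from math import exp, factorial

def molecular_q(energies, beta):
    """q = sum_j exp(-beta * e_j) for a single molecule."""
    return sum(exp(-beta * e) for e in energies)

def ensemble_q_brute_force(energies, beta, n_particles):
    """Sum exp(-beta * (e_i + e_j + ...)) over all level assignments
    for n_particles distinguishable, non-interacting particles."""
    return sum(exp(-beta * sum(combo))
               for combo in product(energies, repeat=n_particles))

energies = [0.0, 1.0, 2.5]  # hypothetical molecular levels
beta, n = 0.7, 3            # arbitrary inverse temperature and particle number
q = molecular_q(energies, beta)
print(ensemble_q_brute_force(energies, beta, n), q ** n)  # the two values agree
print(q ** n / factorial(n))  # q^N / N! estimate for indistinguishable particles
```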




  1. Complete the justification of Boltzmann's distribution law by computing the proportionality constant \(a\).
  2. A system contains two energy levels \(E_1, E_2\). Using Boltzmann statistics, express the average energy of the system in terms of \(E_1, E_2\).
  3. Consider a system that contains N energy levels. Redo problem #2.
  4. Using the properties of the exponential function, derive equation (17.9).
  5. What are the uses of partition functions?