
4.2: The Variational Method

Let us now turn to the other method that is used to solve Schrödinger equations approximately, the variational method. In this approach, one must again have some reasonable wave function \(\psi^{(0)}\) that is used to approximate the true wave function. Within this approximate wave function, one imbeds one or more variables {\(\alpha_J\)} that one subsequently varies to achieve a minimum in the energy of \(\psi^{(0)}\) computed as an expectation value of the true Hamiltonian \(H\):

\[E({\alpha_J}) = \dfrac{\langle\psi^{(0)}| H | \psi^{(0)}\rangle}{\langle\psi^{(0)} | \psi^{(0)}\rangle}\]

The optimal values of the \(\alpha_J\) parameters are determined by requiring

\[\dfrac{\partial E}{\partial\alpha_J} = 0\]

to achieve the desired energy minimum. We should also verify that the second-derivative matrix

\[\dfrac{\partial^2E}{\partial\alpha_J \partial \alpha_L}\]

has all positive eigenvalues; otherwise one may have found a saddle point or a maximum rather than the desired minimum.
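To make the procedure concrete before deriving why it works, here is a minimal numerical sketch (my own illustration, not part of the original text). It uses a single-parameter Gaussian trial function \(\psi^{(0)} = (2\alpha/\pi)^{3/4} e^{-\alpha r^2}\) for the hydrogen atom, for which the energy expectation value in atomic units works out to \(E(\alpha) = \tfrac{3\alpha}{2} - 2\sqrt{2\alpha/\pi}\), and minimizes it with respect to \(\alpha\).

```python
# A minimal sketch (not from the text): variational optimization of a
# one-parameter Gaussian trial function for the hydrogen atom (atomic units).
import numpy as np
from scipy.optimize import minimize_scalar

def energy(alpha):
    # E(alpha) = <psi|H|psi>/<psi|psi> = 3*alpha/2 - 2*sqrt(2*alpha/pi)
    return 1.5 * alpha - 2.0 * np.sqrt(2.0 * alpha / np.pi)

res = minimize_scalar(energy, bounds=(1e-6, 10.0), method="bounded")
print(f"optimal alpha = {res.x:.4f}  (analytic value 8/(9*pi) = {8/(9*np.pi):.4f})")
print(f"E_min = {res.fun:.4f} hartree  (exact ground-state energy: -0.5)")
```

The minimum, \(E = -4/(3\pi) \approx -0.424\) hartree, lies above the exact value of \(-0.5\) hartree, exactly as the bound derived below guarantees.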

The theoretical basis underlying the variational method can be understood through the following derivation. Suppose that someone knew the exact eigenstates (i.e., true \(\psi_K\) and true \(E_K\)) of the true Hamiltonian \(H\). These states obey

\[H \psi_K = E_K \psi_K.\]

Because these true states form a complete set (it can be shown that the eigenfunctions of all the Hamiltonian operators we ever encounter have this property), our so-called “trial wave function” \(\psi^{(0)}\) can, in principle, be expanded in terms of these \(\psi_K\):

\[\psi^{(0)}  = \sum_K c_K \psi_K.\]

Before proceeding further, allow me to overcome one likely misconception. What I am going through now is only a derivation of the working formula of the variational method. The final formula will not require us to ever know the exact \(\psi_K\) or the exact \(E_K\), but we are allowed to use them as tools in our derivation because we know they exist even if we never know them.

With the above expansion of our trial function in terms of the exact eigenfunctions, let us now substitute this into the quantity

\[\dfrac{\langle\psi^{(0)}| H | \psi^{(0)}\rangle}{\langle\psi^{(0)} | \psi^{(0)}\rangle}\]

that the variational method instructs us to compute:

\[E=\dfrac{\langle\psi^{(0)}| H | \psi^{(0)}\rangle}{\langle\psi^{(0)} | \psi^{(0)}\rangle}= \dfrac{\left \langle \sum_K c_K \psi_K | H | \sum_L c_L \psi_L \right \rangle}{\langle\sum_K c_K \psi_K|\sum_L c_L \psi_L\rangle} \]

Using the fact that the \(\psi_K\) obey \(H\psi_K = E_K \psi_K\) and that the \(\psi_K\) are orthonormal

\[\langle\psi_K|\psi_L\rangle = \delta_{K,L}\]

the above expression reduces to

\[E = \dfrac{\sum_K \langle c_K \psi_K | H | c_K \psi_K\rangle}{\sum_K\langle c_K \psi_K| c_K \psi_K\rangle} = \dfrac{\sum_K |c_K|^2 E_K}{\sum_K|c_K|^2}.\]

One of the basic properties of the kind of Hamiltonian we encounter is that they have a lowest-energy state. Sometimes we say they are bounded from below, which means their energy states do not continue all the way to minus infinity. There are systems for which this is not the case (we saw one earlier when studying the Stark effect), but we will now assume that we are not dealing with such systems. This allows us to introduce the inequality \(E_K \geq E_0\) which says that all of the energies are higher than or equal to the energy of the lowest state which we denote \(E_0\). Introducing this inequality into the above expression gives

\[E \geq \dfrac{\sum_K |c_K|^2 E_0}{\sum_K|c_K|^2} = E_0.\]

This means that the variational energy, computed as

\[\dfrac{\langle\psi^{(0)}| H | \psi^{(0)}\rangle}{\langle\psi^{(0)} | \psi^{(0)}\rangle} \label{energy}\]

will lie above the true ground-state energy no matter what trial function \(\psi^{(0)}\) we use.
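This bound is easy to verify numerically. The sketch below (my own illustration; the random symmetric matrix stands in for a Hamiltonian expressed in an orthonormal basis) confirms that the energy expectation value of randomly chosen trial vectors never falls below the lowest eigenvalue.

```python
# A toy check of the variational bound: for a Hermitian matrix H, the
# Rayleigh quotient of any trial vector c satisfies (c.H.c)/(c.c) >= E_0.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
H = (A + A.T) / 2                     # random real symmetric "Hamiltonian"
E0 = np.linalg.eigvalsh(H)[0]         # exact lowest eigenvalue

for _ in range(5):
    c = rng.standard_normal(5)        # random (unnormalized) trial vector
    E_trial = c @ H @ c / (c @ c)     # variational energy of this trial state
    print(f"E_trial = {E_trial:+.4f}  >=  E0 = {E0:+.4f}")
```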

The significance of the above result that \(E \geq E_0\) is as follows. We are allowed to imbed into our trial wave function \(\psi^{(0)}\) parameters that we can vary to make \(E\), computed as in Equation \(\ref{energy}\), as low as possible, because we know we can never make it lower than the true ground-state energy. The philosophy, then, is to vary the parameters in \(\psi^{(0)}\) to render \(E\) as low as possible, because the closer \(E\) is to \(E_0\), the better our variational wave function is. Let me now demonstrate how the variational method is used in this manner by solving an example problem.

Example \(\PageIndex{1}\)

Suppose you are given a trial wave function of the form:

\[ \phi = \dfrac{Z_e^3}{\pi a_0^3}\exp\left(\dfrac{-Z_e r_1}{a_0}\right) \exp\left(\dfrac{-Z_e r_2}{a_0}\right)\]

to represent a two-electron ion of nuclear charge \(Z\) and suppose that you are lucky enough that I have already evaluated the

\[\dfrac{\langle\phi| H | \phi\rangle}{\langle\phi | \phi\rangle}\]

integral, which I’ll call \(W\), for you and found

\[W =\left(Z_e^2-2ZZ_e+\dfrac{5}{8}Z_e\right)\dfrac{e^2}{a_0} .\]

Now, let’s find the optimum value of the variational parameter \(Z_e\) for an arbitrary nuclear charge \(Z\) by setting \(\dfrac{dW}{dZ_e} = 0\). After finding the optimal value of \(Z_e\), we’ll then find the optimal energy by plugging this \(Z_e\) into the above \(W\) expression. I’ll do the algebra and see if you can follow.

\[W =\left(Z_e^2-2ZZ_e+\dfrac{5}{8}Z_e\right)\dfrac{e^2}{a_0}\]

\[\dfrac{dW}{dZ_e}= \left(2Z_e-2Z+\dfrac{5}{8}\right)\dfrac{e^2}{a_0}= 0\]

\[2Z_e - 2Z +\dfrac{5}{8} = 0\]

\[2Z_e = 2Z -\dfrac{5}{8}\]

\[Z_e = Z - \dfrac{5}{16}= Z - 0.3125\]

(n.b., 0.3125 represents the shielding factor of one 1s electron to the other, reducing the optimal effective nuclear charge by this amount). Now, using this optimal \(Z_e\) in our energy expression gives

\[W = Z_e\left(Z_e-2Z+\dfrac{5}{8}\right)\dfrac{e^2}{a_0} ,\]

\[W =\left(Z-\dfrac{5}{16}\right) \left(\left(Z-\dfrac{5}{16}\right)-2Z+\dfrac{5}{8}\right)\dfrac{e^2}{a_0}\]

\[W =\left(Z-\dfrac{5}{16}\right)\left(-Z+\dfrac{5}{16}\right)\dfrac{e^2}{a_0}\]

\[W = -\left(Z-\dfrac{5}{16}\right)\left(Z-\dfrac{5}{16}\right) \dfrac{e^2}{a_0}= -\left(Z-\dfrac{5}{16}\right)^2\dfrac{e^2}{a_0}\]

\[= - (Z - 0.3125)^2 (27.21\ {\rm eV})\]

Since \(a_0\) is the Bohr radius 0.529 Å, \(e^2/a_0\) = 27.21 eV, or one atomic unit of energy. Is this energy any good? The total energies of some two-electron atoms and ions have been experimentally determined to be as shown in the Table below.

Z       Atom/Ion    Experimental Total Energy
Z = 1   H-          -14.35 eV
Z = 2   He          -78.98 eV
Z = 3   Li+         -198.02 eV
Z = 4   Be+2        -371.5 eV
Z = 5   B+3         -599.3 eV
Z = 6   C+4         -881.6 eV
Z = 7   N+5         -1218.3 eV
Z = 8   O+6         -1609.5 eV

Using our optimized expression for \(W\), let’s now calculate the estimated total energies of each of these atoms and ions as well as the percent error in our estimate for each ion. 

Z       Atom/Ion    Experimental    Calculated     % Error
Z = 1   H-          -14.35 eV       -12.86 eV      10.38%
Z = 2   He          -78.98 eV       -77.46 eV      1.92%
Z = 3   Li+         -198.02 eV      -196.46 eV     0.79%
Z = 4   Be+2        -371.5 eV       -369.86 eV     0.44%
Z = 5   B+3         -599.3 eV       -597.66 eV     0.27%
Z = 6   C+4         -881.6 eV       -879.86 eV     0.19%
Z = 7   N+5         -1218.3 eV      -1216.48 eV    0.15%
Z = 8   O+6         -1609.5 eV      -1607.46 eV    0.13%

The absolute energy errors are essentially constant (roughly 1.5 to 2 eV) across this range of \(Z\), so they produce a larger percentage error at small \(Z\).
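For readers who want to reproduce these numbers, the sketch below (assuming SymPy; all variable names are mine) re-derives \(Z_e = Z - 5/16\) symbolically and then evaluates the Calculated column. Note that the tabulated entries match \(e^2/a_0 \approx 27.2\) eV; using 27.21 eV, or rounding the energies at different points, shifts the last digits slightly.

```python
# A sketch that re-derives the optimal Z_e and reproduces the table above.
import sympy as sp

Z, Ze = sp.symbols("Z Z_e", positive=True)
W = Ze**2 - 2*Z*Ze + sp.Rational(5, 8)*Ze       # W in units of e^2/a_0

Ze_opt = sp.solve(sp.diff(W, Ze), Ze)[0]        # -> Z - 5/16
W_opt = sp.factor(W.subs(Ze, Ze_opt))           # -> -(16*Z - 5)**2/256
print("Z_e =", Ze_opt, "  W =", W_opt)

HARTREE_EV = 27.2  # e^2/a_0 in eV, rounded as in the table
experimental = {1: -14.35, 2: -78.98, 3: -198.02, 4: -371.5,
                5: -599.3, 6: -881.6, 7: -1218.3, 8: -1609.5}
for z, E_exp in experimental.items():
    W_eV = float(W_opt.subs(Z, z)) * HARTREE_EV
    err = 100 * abs((W_eV - E_exp) / E_exp)
    print(f"Z = {z}: calculated {W_eV:9.2f} eV, "
          f"experimental {E_exp:9.2f} eV, error {err:5.2f}%")
```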

In 1928, when quantum mechanics was quite young, it was not known whether the isolated, gas-phase hydride ion, \(H^-\), was stable with respect to loss of an electron to form a hydrogen atom. Let’s compare our estimated total energy for \(H^-\) to the ground-state energy of a hydrogen atom plus an isolated electron, which is known to be -13.60 eV. When we use our expression for \(W\) and take \(Z = 1\), we obtain \(W = -12.86\) eV, which is higher than -13.60 eV (\(H + e^-\)), so this simple variational calculation erroneously predicts \(H^-\) to be unstable. More sophisticated variational treatments give a ground-state energy of \(H^-\) of -14.35 eV, in agreement with experiment and confirming that \(H^-\) is indeed stable with respect to electron detachment.

Linear Variational Method

A widely used example of the variational method is provided by the so-called linear variational method. Here one expresses the trial wave function as a linear combination of so-called basis functions {\(\chi_j\)}:

\[\psi=\sum_j c_j \chi_j.\]

Substituting this expansion into \(\langle\psi|H|\psi\rangle\) and then making this quantity stationary with respect to variations in the \(c_i\) subject to the constraint that \(\psi\) remains normalized

\[1=\langle\psi|\psi\rangle=\sum_i\sum_j c_i^*\langle\chi_i|\chi_j\rangle c_j\]

gives, with the energy \(E\) appearing as the Lagrange multiplier that enforces this constraint,

\[\sum_j \langle\chi_i|H|\chi_j\rangle c_j = E \sum_j \langle\chi_i|\chi_j\rangle c_j.\]

This is a generalized matrix eigenvalue problem that we can write in matrix notation as

\[\textbf{HC}=E\,\textbf{SC}.\]

It is called a generalized eigenvalue problem because of the appearance of the overlap matrix \(\textbf{S}\) on its right hand side.

This set of equations for the \(c_j\) coefficients can be made into a conventional eigenvalue problem as follows:

1. The eigenvectors \(\textbf{v}_k\) and eigenvalues \(s_k\) of the overlap matrix are found by solving

\[\sum_j S_{i,j}\nu_{k,j}=s_k\nu_{k,i}\]

All of the eigenvalues \(s_k\) are positive because \(\textbf{S}\) is a positive-definite matrix.

2. Next one forms the matrix \(\textbf{S}^{-1/2}\) whose elements are

\[S_{i,j}^{-1/2}=\sum_k\nu_{k,i}\dfrac{1}{\sqrt{s_k}}\nu_{k,j}\]

(another matrix \(\textbf{S}^{1/2}\) can be formed in a similar way replacing  \(\dfrac{1}{\sqrt{s_k}}\) with \(\sqrt{s_k}\)).

3. One then multiplies the generalized eigenvalue equation on the left by \(\textbf{S}^{-1/2}\) to obtain

\[\textbf{S}^{-1/2}\textbf{HC}=E\, \textbf{S}^{-1/2}\textbf{SC}.\]

4. This equation is then rewritten, using \(\textbf{S}^{-1/2}\textbf{S} = \textbf{S}^{1/2}\) and \(\textbf{1}=\textbf{S}^{-1/2}\textbf{S}^{1/2}\), as

\[\textbf{S}^{-1/2}\textbf{H} \textbf{S}^{-1/2} (\textbf{S}^{1/2}\textbf{C})=E\, (\textbf{S}^{1/2}\textbf{C}).\]

This is a conventional eigenvalue problem in which the matrix is \(\textbf{S}^{-1/2}\textbf{H} \textbf{S}^{-1/2}\) and the eigenvectors are \((\textbf{S}^{1/2}\textbf{C})\).

The net result is that one can form \(\textbf{S}^{-1/2}\textbf{H} \textbf{S}^{-1/2}\) and then find its eigenvalues and eigenvectors. Its eigenvalues will be the same as those of the original generalized eigenvalue problem. Its eigenvectors \((\textbf{S}^{1/2}\textbf{C})\) can be used to determine the eigenvectors \(\textbf{C}\) of the original problem by multiplying by \(\textbf{S}^{-1/2}\)

\[\textbf{C}= \textbf{S}^{-1/2} (\textbf{S}^{1/2}\textbf{C}).\]
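Steps 1 through 4 translate directly into a few lines of linear algebra. In the sketch below (my own; in a real calculation \(\textbf{H}\) and \(\textbf{S}\) would be built from integrals over the basis functions \(\chi_j\)), random symmetric matrices stand in for the Hamiltonian and overlap matrices.

```python
# A sketch of steps 1-4: solving HC = E S C via symmetric orthogonalization.
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n))
H = (A + A.T) / 2                        # stand-in Hamiltonian matrix
B = rng.standard_normal((n, n))
S = B @ B.T + np.eye(n)                  # positive-definite stand-in overlap

# Step 1: eigenvalues/eigenvectors of S (all s_k > 0, S positive definite).
s, V = np.linalg.eigh(S)

# Step 2: build S^(-1/2) from the eigenpairs, as in the formula above.
S_inv_half = V @ np.diag(1.0 / np.sqrt(s)) @ V.T

# Steps 3-4: diagonalize S^(-1/2) H S^(-1/2); its eigenvectors are S^(1/2) C.
E, Y = np.linalg.eigh(S_inv_half @ H @ S_inv_half)

# Recover the original eigenvectors C by multiplying by S^(-1/2).
C = S_inv_half @ Y

# Verify that the columns of C indeed satisfy H C = E S C.
print(np.allclose(H @ C, S @ C @ np.diag(E)))   # True
print("eigenvalues:", E)
```

A single diagonalization yields all \(n\) eigenvalues at once, which connects directly to the bracketing property discussed next.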

Although the derivation of the matrix eigenvalue equations resulting from the linear variational method was carried out as a means of minimizing \(\langle\psi|H|\psi\rangle\), the solutions offer more than just an upper bound to the lowest true energy of the Hamiltonian. It can be shown that the \(n\)th eigenvalue of the matrix \(\textbf{S}^{-1/2}\textbf{H} \textbf{S}^{-1/2}\) is an upper bound to the true energy of the \(n\)th state of the Hamiltonian. A consequence is that between any two eigenvalues of the matrix \(\textbf{S}^{-1/2}\textbf{H} \textbf{S}^{-1/2}\) there lies at least one true energy of the Hamiltonian. This observation is often called the bracketing condition. The ability of linear variational methods to provide estimates of the ground- and excited-state energies from a single calculation is one of the main strengths of this approach.
