4.2: The Variational Method
Let us now turn to the other method that is used to solve Schrödinger equations approximately, the variational method. In this approach, one must again have some reasonable wavefunction \(\psi^{(0)}\) that is used to approximate the true wavefunction. Within this approximate wavefunction, one embeds one or more variables {\(\alpha_J\)} that one subsequently varies to achieve a minimum in the energy of \(\psi^{(0)}\) computed as an expectation value of the true Hamiltonian \(H\):
\[E({\alpha_J}) = \dfrac{\langle\psi^{(0)}| H | \psi^{(0)}\rangle}{\langle\psi^{(0)} | \psi^{(0)}\rangle}\nonumber \]
The optimal values of the \(\alpha_J\) parameters are determined by making
\[\dfrac{dE}{d\alpha_J} = 0\nonumber \]
to achieve the desired energy minimum. We should also verify that the second-derivative matrix
\[\dfrac{\partial^2E}{\partial\alpha_J \partial \alpha_L}\nonumber \]
has all positive eigenvalues; otherwise, one may not have found a true minimum.
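To make this recipe concrete, here is a minimal numerical sketch (an illustration added here, not part of the original text) for a standard textbook case: a single Gaussian \(\exp(-\alpha r^2)\) used as a trial function for the hydrogen atom in atomic units. For that choice the energy expectation value takes the well-known closed form \(E(\alpha) = \tfrac{3\alpha}{2} - 2\sqrt{2\alpha/\pi}\), quoted here without derivation; minimizing it over the single variational parameter \(\alpha\) illustrates the \(dE/d\alpha_J = 0\) step.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def variational_energy(alpha):
    """E(alpha) = <psi|H|psi>/<psi|psi> for a Gaussian trial function
    exp(-alpha r^2) applied to the hydrogen atom (atomic units).
    The closed form below is the standard textbook result."""
    kinetic = 1.5 * alpha                              # <T> = 3*alpha/2
    potential = -2.0 * np.sqrt(2.0 * alpha / np.pi)    # <V> = -<1/r> = -2*sqrt(2*alpha/pi)
    return kinetic + potential

# Minimize E(alpha) over the single variational parameter alpha > 0.
result = minimize_scalar(variational_energy, bounds=(1e-4, 10.0), method="bounded")

print(f"optimal alpha = {result.x:.4f}")        # analytic optimum: 8/(9*pi) ~ 0.2829
print(f"variational E = {result.fun:.4f} Ha")   # analytic minimum: -4/(3*pi) ~ -0.4244
print("exact ground-state energy = -0.5 Ha")
```

The optimal \(\alpha = 8/(9\pi) \approx 0.283\) gives \(E \approx -0.424\) hartree, which lies above the exact value of \(-0.5\) hartree, just as the variational theorem derived below guarantees.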
The theoretical basis underlying the variational method can be understood through the following derivation. Suppose that someone knew the exact eigenstates (i.e., true \(\psi_K\) and true \(E_K\)) of the true Hamiltonian \(H\). These states obey
\[H \psi_K = E_K \psi_K.\nonumber \]
Because these true states form a complete set (it can be shown that the eigenfunctions of all the Hamiltonian operators we ever encounter have this property), our so-called “trial wavefunction” \(\psi^{(0)}\) can, in principle, be expanded in terms of these \(\psi_K\):
\[\psi^{(0)} = \displaystyle \sum_K c_K \psi_K.\nonumber \]
Before proceeding further, allow me to address one likely misconception. What I am going through now is only a derivation of the working formula of the variational method. The final formula will not require us to ever know the exact \(\psi_K\) or the exact \(E_K\), but we are allowed to use them as tools in our derivation because we know they exist even if we never know them.
With the above expansion of our trial function in terms of the exact eigenfunctions, let us now substitute this into the quantity
\[\dfrac{\langle\psi^{(0)}| H | \psi^{(0)}\rangle}{\langle\psi^{(0)} | \psi^{(0)}\rangle}\nonumber \]
that the variational method instructs us to compute:
\[E=\dfrac{\langle\psi^{(0)}| H | \psi^{(0)}\rangle}{\langle\psi^{(0)} | \psi^{(0)}\rangle}= \dfrac{\left \langle \displaystyle \sum_K c_K \psi_K | H | \displaystyle \sum_L c_L \psi_L \right \rangle}{\left \langle\displaystyle \sum_K c_K \psi_K|\displaystyle \sum_L c_L \psi_L \right \rangle} \nonumber \]
Using the fact that the \(\psi_K\) obey \(H\psi_K = E_K \psi_K\) and that the \(\psi_K\) are orthonormal
\[\langle\psi_K|\psi_L\rangle = \delta_{K,L}\nonumber \]
the above expression reduces to
\[E = \dfrac{\displaystyle \sum_K \langle c_K \psi_K | H | c_K \psi_K\rangle}{\displaystyle \sum_K\langle c_K \psi_K| c_K \psi_K\rangle} = \dfrac{\displaystyle \sum_K |c_K|^2 E_K}{\displaystyle \sum_K|c_K|^2}.\nonumber \]
One of the basic properties of the kinds of Hamiltonians we encounter is that they have a lowest-energy state. Sometimes we say they are bounded from below, which means their energy states do not continue all the way to minus infinity. There are systems for which this is not the case (we saw one earlier when studying the Stark effect), but we will now assume that we are not dealing with such systems. This allows us to introduce the inequality \(E_K \geq E_0\), which says that all of the energies are higher than or equal to the energy of the lowest state, which we denote \(E_0\). Introducing this inequality into the above expression gives
\[E \geq \dfrac{\displaystyle \sum_K |c_K|^2 E_0}{\displaystyle \sum_K|c_K|^2} = E_0.\nonumber \]
This means that the variational energy, computed as
\[\dfrac{\langle\psi^{(0)}| H | \psi^{(0)}\rangle}{\langle\psi^{(0)} | \psi^{(0)}\rangle} \label{energy}\]
will lie at or above the true ground-state energy, no matter what trial function \(\psi^{(0)}\) we use.
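The inequality \(E \geq E_0\) can be checked numerically on any finite model. In the sketch below (an illustration added here, not from the original text), a random Hermitian matrix plays the role of \(H\), an arbitrary vector plays the role of the trial function \(\psi^{(0)}\), and the Rayleigh quotient of Equation \ref{energy} is compared against the lowest eigenvalue \(E_0\).

```python
import numpy as np

rng = np.random.default_rng(0)

# Model "Hamiltonian": any Hermitian matrix has a complete set of eigenstates,
# so it plays the role of H in the argument above.
n = 8
A = rng.normal(size=(n, n))
H = 0.5 * (A + A.T)                    # symmetrize -> real Hermitian matrix

E0 = np.linalg.eigvalsh(H)[0]          # exact lowest eigenvalue, i.e. E_0

# Arbitrary (unnormalized) trial vector, playing the role of psi^(0).
psi0 = rng.normal(size=n)

# Variational (Rayleigh-quotient) energy <psi0|H|psi0> / <psi0|psi0>.
E_var = psi0 @ H @ psi0 / (psi0 @ psi0)

print(f"E0    = {E0:.6f}")
print(f"E_var = {E_var:.6f}")
print(f"E_var >= E0 ?  {E_var >= E0}")  # True for any choice of trial vector
```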
The significance of the above result that \(E \geq E_0\) is as follows. We are allowed to embed into our trial wavefunction \(\psi^{(0)}\) parameters that we can vary to make \(E\), computed as in Equation \(\ref{energy}\), as low as possible, because we know that we can never make it lower than the true ground-state energy. The philosophy, then, is to vary the parameters in \(\psi^{(0)}\) to render \(E\) as low as possible, because the closer \(E\) is to \(E_0\), the “better” our variational wavefunction is. Let me now demonstrate how the variational method is used in this manner by solving an example problem.
Example \(\PageIndex{1}\): Two-Electron Atoms
Suppose you are given a trial wavefunction of the form:
\[ \phi = \dfrac{Z_e^3}{\pi a_0^3}\exp\left(\dfrac{-Z_er_1}{a_0}\right) \exp\left(\dfrac{-Z_er_2}{a_0}\right)\nonumber \]
to represent a two-electron ion of nuclear charge \(Z\), and suppose that you are lucky enough that I have already evaluated the variational energy expression (Equation \ref{energy}), which I’ll call \(W\), for you and found
\[W =\left(Z_e^2-2ZZ_e+\dfrac{5}{8}Z_e\right)\dfrac{e^2}{a_0} .\nonumber \]
Now, let’s find the optimum value of the variational parameter \(Z_e\) for an arbitrary nuclear charge \(Z\) by setting \(\dfrac{dW}{dZ_e} = 0\). After finding the optimal value of \(Z_e\), we’ll then find the optimal energy by plugging this \(Z_e\) into the above \(W\) expression.
\[\begin{align*} \dfrac{dW}{dZ_e}= \left(2Z_e-2Z+\dfrac{5}{8}\right)\dfrac{e^2}{a_0}&= 0 \\[4pt] 2Z_e - 2Z +\dfrac{5}{8} &= 0 \\[4pt] 2Z_e &= 2Z -\dfrac{5}{8} \\[4pt] Z_e &= Z - \dfrac{5}{16} \\[4pt] &= Z - 0.3125 \end{align*}\]
Note that 0.3125 represents the extent to which one 1s electron shields the nucleus from the other, reducing the optimal effective nuclear charge by this amount (those familiar with Slater's rules will not be surprised by this number). Now, using this optimal \(Z_e\) in our energy expression gives
\[ \begin{align*} W &= Z_e\left(Z_e-2Z+\dfrac{5}{8}\right)\dfrac{e^2}{a_0} \\[4pt] &=\left(Z-\dfrac{5}{16}\right) \left(\left(Z-\dfrac{5}{16}\right)-2Z+\dfrac{5}{8}\right)\dfrac{e^2}{a_0} \\[4pt]&=\left(Z-\dfrac{5}{16}\right)\left(-Z+\dfrac{5}{16}\right)\dfrac{e^2}{a_0} \\[4pt]&= -\left(Z-\dfrac{5}{16}\right)\left(Z-\dfrac{5}{16}\right) \dfrac{e^2}{a_0} \\[4pt] &= -\left(Z-\dfrac{5}{16}\right)^2\dfrac{e^2}{a_0} \\[4pt] &= - (Z - 0.3125)^2(27.21) {\rm eV}\end{align*}\]
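As a check on the algebra above, the same optimization can be carried out symbolically. The short sketch below (illustrative only, with energies in units of \(e^2/a_0\)) recovers \(Z_e = Z - 5/16\), confirms that the second derivative is positive so the stationary point is a minimum, and reproduces \(W_{\min} = -(Z - 5/16)^2\, e^2/a_0\).

```python
import sympy as sp

Z, Ze = sp.symbols("Z Z_e", positive=True)

# W in units of e^2/a_0, as given in the text: W = Z_e**2 - 2*Z*Z_e + (5/8)*Z_e
W = Ze**2 - 2*Z*Ze + sp.Rational(5, 8)*Ze

# Stationary condition dW/dZ_e = 0  ->  optimal effective nuclear charge
Ze_opt = sp.solve(sp.diff(W, Ze), Ze)[0]
print(Ze_opt)                 # Z - 5/16

# Second derivative is 2 > 0, confirming a true minimum
print(sp.diff(W, Ze, 2))      # 2

# Optimal variational energy, factored
W_opt = sp.factor(W.subs(Ze, Ze_opt))
print(W_opt)                  # -(16*Z - 5)**2/256, equivalent to -(Z - 5/16)**2
```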
Here \(a_0\) is the Bohr radius (0.529 Å), so \(e^2/a_0 = 27.21\) eV, which is one atomic unit of energy. Is this energy any good? The total energies of some two-electron atoms and ions have been experimentally determined to be as shown in Table \(\PageIndex{1}\) below. Using our optimized expression for \(W\), let’s now calculate the estimated total energies of each of these atoms and ions, as well as the percent error in our estimate for each; a short script reproducing these numbers follows the table.
Table \(\PageIndex{1}\): Total energies of two-electron atoms and ions.

| \(Z\) | Atom/Ion | Experimental Total Energy | Calculated \(W\) | % Error |
|---|---|---|---|---|
| 1 | H\(^{-}\) | -14.35 eV | -12.86 eV | 10.38% |
| 2 | He | -78.98 eV | -77.46 eV | 1.92% |
| 3 | Li\(^{+}\) | -198.02 eV | -196.46 eV | 0.79% |
| 4 | Be\(^{2+}\) | -371.5 eV | -369.86 eV | 0.44% |
| 5 | B\(^{3+}\) | -599.3 eV | -597.66 eV | 0.27% |
| 6 | C\(^{4+}\) | -881.6 eV | -879.86 eV | 0.19% |
| 7 | N\(^{5+}\) | -1218.3 eV | -1216.48 eV | 0.15% |
| 8 | O\(^{6+}\) | -1609.5 eV | -1607.46 eV | 0.13% |
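The Calculated and % Error columns follow directly from \(W = -(Z - 5/16)^2\,(27.21\ \text{eV})\). The short script below (added as an illustration; the experimental values are simply copied from the table, and the variable names are mine) reproduces those columns, up to small rounding differences coming from the precision of the 27.21 eV conversion factor.

```python
# Reproduce the Calculated column: W = -(Z - 5/16)^2 * 27.21 eV,
# compared against the experimental total energies quoted in the table.
experimental = {          # eV, values as listed in Table 1 above
    1: -14.35, 2: -78.98, 3: -198.02, 4: -371.5,
    5: -599.3, 6: -881.6, 7: -1218.3, 8: -1609.5,
}

hartree_eV = 27.21        # e^2/a_0 in eV, as used in the text

for Z, E_exp in experimental.items():
    W = -(Z - 5.0 / 16.0) ** 2 * hartree_eV
    err = 100.0 * abs((W - E_exp) / E_exp)
    print(f"Z = {Z}:  W = {W:9.2f} eV   experiment = {E_exp:9.2f} eV   error = {err:5.2f}%")
```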
Aside: In 1928, when quantum mechanics was quite young, it was not known whether the isolated, gas-phase hydride ion, \(H^-\), was stable with respect to loss of an electron to form a hydrogen atom. Let’s compare our estimated total energy for \(H^-\) to the ground-state energy of a hydrogen atom plus an isolated electron, which is known to be -13.60 eV. When we use our expression for \(W\) and take \(Z = 1\), we obtain \(W = -12.86\) eV, which is higher than -13.60 eV (the energy of \(H + e^-\)), so this simple variational calculation erroneously predicts \(H^-\) to be unstable. More sophisticated variational treatments give a ground-state energy of \(H^-\) of -14.35 eV, in agreement with experiment and confirming that \(H^-\) is indeed stable with respect to electron detachment.
Contributors and Attributions
Jack Simons (Henry Eyring Scientist and Professor of Chemistry, U. Utah) Telluride Schools on Theoretical Chemistry
Integrated by Tomoyuki Hayashi (UC Davis)