Skip to main content
Chemistry LibreTexts

6.1: Theoretical Treatment of Electronic Structure

In Chapter 5’s discussion of molecular structure, I introduced you to the strategies that theory uses to interpret experimental data relating to such matters, and how and why theory can also be used to simulate the behavior of molecules. In carrying out simulations, the Born-Oppenheimer electronic energy \(E(R)\) as a function of the \(3N\) coordinates of the \(N\) atoms in the molecule plays a central role. It is on this landscape that one searches for stable isomers and transition states, and it is the second derivative (Hessian) matrix of this function that provides the harmonic vibrational frequencies of such isomers. In the present Chapter, I want to provide you with an introduction to the tools that we use to solve the electronic Schrödinger equation to generate \(E(R)\) and the electronic wave function \(\psi(r|R)\). In essence, this treatment will focus on orbitals of atoms and molecules and how we obtain and interpret them.

For an atom, one can approximate the orbitals by using the solutions of the hydrogenic Schrödinger equation discussed in Part 1 of this text. Although such functions are not proper solutions to the actual \(N\)-electron Schrödinger equation (believe it or not, no one has ever solved exactly any such equation for \(N > 1\)) of any atom, they can be used as perturbation or variational starting-point approximations when one may be satisfied with qualitatively accurate answers. In particular, the solutions of this one-electron hydrogenic problem form the qualitative basis for much of atomic and molecular orbital theory. As discussed in detail in Part 1, these orbitals are labeled by \(n\), \(l\), and \(m\) quantum numbers for the bound states and by \(l\) and \(m\) quantum numbers and the energy \(E\) for the continuum states.

Much as the particle-in-a-box orbitals are used to qualitatively describe \(\pi\)-electrons in conjugated polyenes or electronic bands in solids, these so-called hydrogen-like orbitals provide qualitative descriptions of orbitals of atoms with more than a single electron. By introducing the concept of screening as a way to represent the repulsive interactions among the electrons of an atom, an effective nuclear charge \(Z_{\rm eff}\) can be used in place of \(Z\) in the hydrogenic \(\psi_{n,l,m}\) and \(E_{n,l}\) formulas to generate approximate atomic orbitals to be filled by electrons in a many-electron atom. For example, in the crudest approximation of a carbon atom, the two \(1s\) electrons experience the full nuclear attraction so \(Z_{\rm eff} =6\) for them, whereas the \(2s\) and \(2p\) electrons are screened by the two \(1s\) electrons, so \(Z_{\rm eff}= 4\) for them. Within this approximation, one then occupies two \(1s\) orbitals with \(Z=6\), two \(2s\) orbitals with \(Z=4\) and two \(2p\) orbitals with \(Z=4\) in forming the full six-electron product wave function of the lowest-energy state of carbon

\[\psi(1, 2, …, 6) = \psi_{1s} \alpha(1) \psi_{1s} \alpha(2) \psi_{2s} \alpha(3) … \psi_{1p}(0) \beta(6). \]

However, such approximate orbitals are not sufficiently accurate to be of use in quantitative simulations of atomic and molecular structure. In particular, their energies do not properly follow the trends in atomic orbital (AO) energies that are taught in introductory chemistry classes and that are shown pictorially in Figure 6.1.


Figure 6.1.1: Energies of Atomic Orbitals as Functions of Nuclear Charge for Neutral Atoms.

For example, the relative energies of the \(3d\) and \(4s\) orbitals are not adequately described in a model that treats electron repulsion effects in terms of a simple screening factor. So, now it is time to examine how we can move beyond the screening model and take the electron repulsion effects, which cause the inter-electronic couplings that render the Schrödinger equation insoluble, into account in a more reliable manner.

6.1.1 Orbitals

1. The Hartree Description

The energies and wave functions within the most commonly used theories of atomic structure are assumed to arise as solutions of a Schrödinger equation whose Hamiltonian \(h_e(r)\) possess three kinds of energies:

  1. Kinetic energy, whose average value is computed by taking the expectation value of the kinetic energy operator \(-\dfrac{\hbar^2}{2m} \nabla^2\) with respect to any particular solution \(\phi_j(r)\) to the Schrödinger equation: \[KE = \langle\phi_j| -\dfrac{\hbar^2}{2m} \nabla^2 |\phi_j\rangle \]
  2. Coulombic attraction energy with the nucleus of charge \(Z\): \[\langle\phi_j| -\dfrac{Z_e^2}{r} |\phi_j\rangle\]
  3. Coulomb repulsion energies with all of the \(N-1\) other electrons, which are assumed to occupy other atomic orbitals (AOs) denoted \(\phi_K\), with this energy computed as

\[\sum_K \langle\phi_j(r) \phi_K(r’) |\frac{e^2}{|r-r’|} | \phi_j(r) \phi_K(r’)\rangle.\label{6.1.2}\]

The Dirac notation \(\langle\phi_j(r) \phi_K(r’) |\dfrac{e^2}{|r-r’|} | \phi_j(r) \phi_K(r’)\rangle\) is used to represent the six-dimensional Coulomb integral

\[J_{J,K} =  \int |\phi_j(r)|^2 |\phi_K(r’)|^2 \dfrac{e^2}{r-r'} dr dr’ \label{6.1.3}\]

that describes the Coulomb repulsion between the charge density \(|\phi_j(r)|^2\) for the electron in \(\phi_j\) and the charge density \(|\phi_K(r’)|^2\) for the electron in \(\phi_K\). Of course, the sum over \(K\) must be limited to exclude \(K=J\) to avoid counting a “self-interaction” of the electron in orbital \(\phi_j\)  with itself.

The total energy \(\epsilon_J\) of the orbital \(\phi_j\),  is the sum of the above three contributions:

\[\epsilon_J = \langle\phi_j| - \frac{\hbar^2}{2m} \nabla^2 |\phi_j\rangle + \langle\phi_j| -\frac{Z_e^2}{r} |\phi_j\rangle + \sum_K \langle\phi_j(r) \phi_K(r’) |\frac{e^2}{|r-r’|} | \phi_j(r) \phi_K(r’)\rangle.\label{6.1.4}\]

This treatment of the electrons and their orbitals is referred to as the Hartree-level of theory. As stated above, when screened hydrogenic AOs are used to approximate the \(\phi_j\) and \(\phi_K\) orbitals, the resultant \(\epsilon_J\) values do not produce accurate predictions. For example, the negative of \(\epsilon_J\) should approximate the ionization energy for removal of an electron from the AO \(\phi_j\). Such ionization potentials (IP s) can be measured, and the measured values do not agree well with the theoretical values when a crude screening approximation is made for the AO s.

2. The LACO-Expansion

To improve upon the use of screened hydrogenic AOs, it is most common to approximate each of the Hartree AOs {\(\phi_K\)} as a linear combination of so-called basis AOs {\(\chi_\mu\)}:

\[\phi_J = \sum_\mu C_{J,\mu}  \chi_\mu.\label{6.1.5}\]

using what is termed the linear-combination-of-atomic-orbitals (LCAO) expansion. In this equation, the expansion coefficients {\(C_{J,\mu}\)} are the variables that are to be determined by solving the Schrödinger equation

\[h_e \phi_J = \epsilon_J \phi_J. \label{6.1.6}\]

After substituting the LCAO expansion for \(\phi_J\) into this Schrödinger equation, multiplying on the left by one of the basis AOs \(\chi_\nu\), and then integrating over the coordinates of the electron in \(\phi_J\), one obtains

\[\sum_\mu \langle\chi_\nu| h_e| \chi_\mu\rangle C_{J,\mu}  = \epsilon_J  \sum_\mu \langle\chi_\nu| \chi_\mu\rangle C_{J,\mu} . \label{6.1.7}\]

This is a matrix eigenvalue equation in which the \(\epsilon_J\) and {\(C_{J,\mu}\)} appear as eigenvalues and eigenvectors. The matrices \(\langle\chi_\nu| h_e| \chi_\mu\rangle\) and \(\langle\chi_\nu| \chi_\mu\rangle\) are called the Hamiltonian and overlap matrices, respectively. An explicit expression for the former is obtained by introducing the earlier definition of he:

\[\langle\chi_\nu| h_e| \chi_\mu\rangle = \langle\chi_\nu| - \frac{\hbar^2}{2m} \nabla^2 |\chi_\mu\rangle + \langle\chi_\nu| -\frac{Ze^2}{r} |\chi_\mu\rangle \label{6.1.8}\]

\[+ \sum_{\eta,\gamma} \sum_K C_{K,\eta} C_{K,\gamma} \langle\chi_\nu(r) \chi_\eta(r’) |\frac{e^2}{|r-r’|} | \chi_\mu(r) \chi_\gamma(r’)\rangle. \label{6.1.9}\]

An important thing to notice about the form of the matrix Hartree equations is that to compute the Hamiltonian matrix, one must know the LCAO coefficients {\(C_{K,\gamma}\)} of the orbitals which the electrons occupy. On the other hand, these LCAO coefficients are supposed to be found by solving the Hartree matrix eigenvalue equations. This paradox leads to the need to solve these equations iteratively in a so-called self-consistent field (SCF) technique. In the SCF process, one inputs an initial approximation to the {\(C_{K,\gamma}\)} coefficients. This then allows one to form the Hamiltonian matrix defined above. The Hartree matrix equations

\[\sum_\mu \langle\chi_\nu| h_e| \chi_\mu\rangle C_{J,m}  = \epsilon_J  \sum_\mu \langle\chi_\nu| \chi_\mu\rangle C_{J,\mu} \label{6.1.10}\]

are then solved for new {\(C_{K,\gamma}\)} coefficients and for the orbital energies {\(\epsilon_K\)}. The new LCAO coefficients of those orbitals that are occupied are then used to form a new Hamiltonian matrix, after which the Hartree equations are again solved for another generation of LCAO coefficients and orbital energies. This process is continued until the orbital energies and LCAO coefficients obtained in successive iterations do not differ appreciably. Upon such convergence, one says that a self-consistent field has been realized because the {\(C_{K,\gamma}\)} coefficients are used to form a Coulomb field potential that details the electron-electron interactions.

3. AO Basis Sets

a. Slater-type orbitals and Gaussian-type orbitals

As noted above, it is possible to use the screened hydrogenic orbitals as the {\(\chi_\mu\)}. However, much effort has been expended at developing alternative sets of functions to use as basis orbitals. The result of this effort has been to produce two kinds of functions that currently are widely used. The basis orbitals commonly used in the LCAO process fall into two primary classes:

  1. Slater-type orbitals (STOs) \[c_{n,l,m} (r,\theta,\phi) = N_{n,l,m,z}  Y_{l,m} (\theta,\phi) r_{n-1} e^{-zr}\] are characterized by quantum numbers \(n\), \(l\), and \(m\) and exponents (which characterize the orbital’s radial size) \(z\). The symbol \(N_{n,l,m,z}\) denotes the normalization constant.
  2. Cartesian Gaussian-type orbitals (GTOs) \[c_{a,b,c} (r,\theta,\phi) = N'_{a,b,c,a} x_a y_b z_c e^{-ar^2}\] are characterized by quantum numbers \(a\), \(b\), and \(c\), which detail the angular shape and direction of the orbital, and exponents \(a\) which govern the radial size.

For both types of AOs, the coordinates \(r\), \(\theta\), and \(\phi\) refer to the position of the electron relative to a set of axes attached to the nucleus on which the basis orbital is located. Note that Slater-type orbitals (STO's) are similar to hydrogenic orbitals in the region close to the nucleus. Specifically, they have a non-zero slope near the nucleus

\[\dfrac{d}{dr}(e^{-zr})_{r=0} = -z.\]

In contrast, GTOs, have zero slope near \(r=0\) because

\[\dfrac{d}{dr}(e^{-ar^2})_{r=0} = 0.\]

We say that STOs display a cusp at \(r=0\) that is characteristic of the hydrogenic solutions, whereas GTOs do not. Although STOs have the proper cusp behavior near nuclei, they are used primarily for atomic and linear-molecule calculations because the multi-center integrals

\[\langle\chi_\mu(1) \chi_\kappa(2)|\dfrac{e^2}{|r_1-r_2|}| \chi_\nu(1) \chi_\gamma(2)\rangle\label{6.1.11}\]

which arise in polyatomic-molecule calculations (we will discuss these integrals later in this Chapter) cannot efficiently be evaluated when STOs are employed. In contrast, such integrals can routinely be computed when GTOs are used. This fundamental advantage of GTOs has lead to the dominance of these functions in molecular quantum chemistry.

To overcome the primary weakness of GTO functions (i.e., their radial derivatives vanish at the nucleus), it is common to combine two, three, or more GTOs, with combination coefficients which are fixed and not treated as LCAO parameters, into new functions called contracted  GTOs (CGTOs). Typically, a series of radially tight, medium, and loose GTOs are multiplied by contraction coefficients and summed to produce a CGTO that approximates the proper cusp at the nuclear center (although no such combination of GTOs can exactly produce such a cusp because each GTO has zero slope at \(r = 0\).

Although most calculations on molecules are now performed using Gaussian orbitals, it should be noted that other basis sets can be used as long as they span enough of the regions of space (radial and angular) where significant electron density resides. In fact, it is possible to use plane wave orbitals of the form

\[\chi(r,\theta,\phi) = N\exp[i(k_x r \sin{\theta} \cos{\phi} + k_y  r \sin_{\theta} \sin{\phi} + k_z r \cos{\theta})],\label{6.1.12}\]

where \(N\) is a normalization constant and \(k_x\), \(k_y\), and \(k_z\) are quantum numbers detailing the momenta or wavelength of the orbital along the \(x\), \(y\), and \(z\) Cartesian directions. The advantage to using such simple orbitals is that the integrals one must perform are much easier to handle with such functions. The disadvantage is that one must use many such functions to accurately describe sharply peaked charge distributions of, for example, inner-shell core orbitals while still retaining enough flexibility to also describe the much smoother electron density in the valence regions. Much effort has been devoted to developing and tabulating in widely available locations sets of STO or GTO basis orbitals for main-group elements and transition metals. This ongoing effort is aimed at providing standard basis set libraries which:

  1. Yield predictable chemical accuracy in the resultant energies.
  2. Are cost effective to use in practical calculations.
  3. Are relatively transferable so that a given atom's basis is flexible enough to be used for that atom in various bonding environments (e.g., hybridization and degree of ionization).

b. The Fundamental Core and Valence Basis

In constructing an atomic orbital basis, one can choose from among several classes of functions. First, the size and nature of the primary core and valence basis must be specified. Within this category, the following choices are common:

  1. A minimal basis in which the number of CGTO orbitals is equal to the number of core and valence atomic orbitals in the atom.
  2. A double-zeta (DZ) basis in which twice as many CGTOs are used as there are core and valence atomic orbitals. The use of more basis functions is motivated by a desire to provide additional variational flexibility so the LCAO process can generate molecular orbitals of variable diffuseness as the local electronegativity of the atom varies. A valence double-zeta (VDZ) basis has only one CGTO to represent the inner-shell orbitals, but uses two sets of CGTOs to describe the valence orbitals.
  3. A triple-zeta (TZ) basis in which three times as many CGTOs are used as the number of core and valence atomic orbitals (of course, there are quadruple-zeta and higher-zeta bases also). Moreover, there are VTZ bases that treat the inner-shell orbitals with one CGTO and the valence orbitals with three CGTOs.

Optimization of the orbital exponents (z’s or a's) and the GTO-to-CGTO contraction coefficients for the kind of bases described above has undergone considerable growth in recent years. The theory group at the Pacific Northwest National Labs (PNNL) offer a world wide web site from which one can find (and even download in a form prepared for input to any of several commonly used electronic structure codes) a wide variety of Gaussian atomic basis sets. This site can be accessed here. Professor Kirk Peterson at Washington State University is involved in the PNNL basis set development project, but he also hosts his own basis set site.

c. Polarization Functions

One usually enhances any core and valence basis set with a set of so-called polarization functions. They are functions of one higher angular momentum than appears in the atom's valence orbital space (e.g., \(d\)-functions for C, N, and O and \(p\)-functions for H), and they have exponents (\(z\) or \(a\)) which cause their radial sizes to be similar to the sizes of the valence orbitals ( i.e., the polarization \(p\) orbitals of the H atom are similar in size to the \(1s\) orbital rather than to the \(2s\) valence orbital of hydrogen). Thus, they are not orbitals which describe the atom's valence orbital with one higher l-value; such higher-l valence orbitals would be radially more diffuse.

A primary purpose of polarization functions is to give additional angular flexibility to the LCAO process in forming bonding orbitals between pairs of valence atomic orbitals. This is illustrated in Figure 6.1.2 where polarization dp orbitals on C and O are seen to contribute to formation of the bonding \(p\) orbital of a carbonyl group by allowing polarization of the carbon atom's \(p_\pi\) orbital toward the right and of the oxygen atom's \(p_\pi\) orbital toward the left.



Figure 6.1.2 Oxygen and Carbon Form a \(\pi\) Bond That Uses the Polarization Functions on Each Atom

Polarization functions are essential in strained ring compounds such as cyclopropane because they provide the angular flexibility needed to direct the electron density into regions between bonded atoms, but they are also important in unstrained compounds when high accuracy is required.

d. Diffuse Functions

When dealing with anions or Rydberg states, one must further augment the AO basis set by adding so-called diffuse basis orbitals. The valence and polarization functions described above do not provide enough radial flexibility to adequately describe either of these cases. The PNNL web site data base cited above offers a good source for obtaining diffuse functions appropriate to a variety of atoms as does the site of Prof. Kirk Peterson.

Once one has specified an atomic orbital basis for each atom in the molecule, the LCAO-MO procedure can be used to determine the \(\chi_{\mu,i}\)  coefficients that describe the occupied and virtual (i.e., unoccupied) orbitals. It is important to keep in mind that the basis orbitals are not themselves the SCF orbitals of the isolated atoms; even the proper atomic orbitals are combinations (with atomic values for the \(\chi_{\mu,i}\)  coefficients) of the basis functions. The LCAO-MO-SCF process itself determines the magnitudes and signs of the \(\chi_{\nu,i}\).  In particular, it is alternations in the signs of these coefficients allow radial nodes to form.

4. The Hartree-Fock Approximation

Unfortunately, the Hartree approximation discussed above ignores an important property of electronic wave functions- their permutational antisymmetry. The full electronic Hamiltonian

\[H = \sum_j {- \dfrac{\hbar^2}{2m} \nabla^2_j  - \dfrac{Ze^2}{r_j}} + \dfrac{1}{2} \sum_j^k \dfrac{e^2}{|r_j-r_k|}\label{6.1.13}\]

is invariant (i.e., is left unchanged) under the operation \(P_{i,j}\) in which a pair of electrons have their labels (i, j) permuted. We say that \(H\) commutes with the permutation operator \(P_{i,j}\). This fact implies that any solution \(\psi\) to \(H\psi = E\psi\) must also be an eigenfunction of \(P_{i,j}\) Because permutation operators are idempotent, which means that if one applies \(P_{i,j}\) twice, one obtains the identity \(PP=1\), it can be seen that the eigenvalues of \(P_{i,j}\) must be either \(+1\) or\( –1\). That is, if \(P \psi=c\psi\), then \(P P \psi = cc \psi\), but \(PP=1\) means that \(cc = 1\), so \(c = +1\) or \(–1\).

As a result of \(H\) commuting with electron permutation operators and of the idempotency of \(P\), the eigenfunctions \(\psi\) must either be odd or even under the application of any such permutation. Particles whose wave functions are even under \(P\) are called Bose particles or Bosons; those for which \(\psi\) is odd are called Fermions. Electrons belong to the latter class of particles.

The simple spin-orbital product function used in Hartree theory

\[\psi= ​\prod_{k=1}^N \phi_k\label{6.1.14}\]

does not have the proper permutational symmetry. For example, the Be atom function

\[\psi = 1s\alpha(1) 1s\beta(2) 2s\alpha(3) 2s\beta(4) \label{6.1.15}\]

is not odd under the interchange of the labels of electrons 3 and 4; instead one obtains \(1s\alpha(1) 1s\beta(2) 2s\alpha(4) 2s\beta(3)\). However, such products of spin-orbitals (i.e., orbitals multiplied by a or b spin functions) can be made into properly antisymmetric functions by forming the determinant of an \(N \times N\) matrix whose row index labels the spin orbital and whose column index labels the electron. For example, the Be atom function \(1s\alpha(1) 1s\beta(2) 2s\alpha(3) 2s\beta(4)\) produces the \(4 \times 4\) matrix whose determinant is shown below

1s\alpha(1) & 1s\alpha(2) & 1s\alpha(3) & 1s\alpha(4)\\
1s\beta(1) & 1s\beta(2) & 1s\beta(3) & 1s\beta(4)\\
2s\alpha(1) & 2s\alpha(2) & 2s\alpha(3) & 2s\alpha(4)\\
 2s\beta(1) &  2s\beta(2) &  2s\beta(3) &  2s\beta(4)

Clearly, if one were to interchange any columns of this determinant, one changes the sign of the function. Moreover, if a determinant contains two or more rows that are identical (i.e., if one attempts to form such a function having two or more spin-orbitals equal), it vanishes. This is how such antisymmetric wave functions embody the Pauli exclusion principle.

A convenient way to write such a determinant is as follows:

\[\sum_P (-1)^p \phi_{P1} (1) \phi_{P2}(2) … \phi_{PN}(N),\]

where the sum is over all N! permutations of the \(N\) spin-orbitals and the notation \((-1)^p\)  means that a –1 is affixed to any permutation that involves an odd number of pair wise interchanges of spin-orbitals and a +1 sign is given to any that involves an even number. To properly normalize such a determinental wave function, one must multiply it by \(\dfrac{1}{\sqrt{N!}}\). So, the final result is that a wave function of the form

\[\psi = \frac{1}{\sqrt{N!}} \sum_P (-1)^p  \phi_{P1} (1) \phi_{P2}(2) … \phi_{PN}(N),\label{6.1.16}\]

which is often written in short-hand notation as,

\[\psi = |\phi_1 (1) \phi_2(2) … \phi_N(N)|\label{6.1.17}\]

has the proper permutational antisymmetry. Note that such functions consist of as sum of \(N!\) factors, all of which have exactly the same number of electrons occupying the same spin-orbitals; the only difference among the \(N!\) terms involves which electron occupies which spin-orbital. For example, in the \(1s\alpha 2s\alpha\) function appropriate to the excited state of He, one has

\[\psi = \frac{1}{\sqrt{2}} \{1s\alpha(1) 2s\alpha(2) – 2s\alpha(1) 1s\alpha(2)\} \label{6.1.18}\]

This function is clearly odd under the interchange of the labels of the two electrons, yet each of its two components has one electron is a \(1s\alpha\) spin-orbital and another electron in a \(2s\alpha\) spin-orbital.

Although having to make \(\psi\) antisymmetric appears to complicate matters significantly, it turns out that the Schrödinger equation appropriate to the spin-orbitals in such an antisymmetrized product wave function is nearly the same as the Hartree Schrödnger equation treated earlier. In fact, if one variationally minimizes the expectation value of the \(N\)-electron Hamiltonian for the above antisymmetric product wave function subject to the condition that the spin-orbitals are orthonormal

\[\langle\phi_J(r)| \phi_k(r)\rangle = \delta_{J,K} \label{6.1.19}\]

one obtains the following equation for the optimal \({\phi_J(r)}\):

\[h_e \phi_J =  \left[– \dfrac{\hbar^2}{2m} \nabla^2  -\dfrac{Ze^2}{r} + \sum_K \langle\phi_K(r’) | \dfrac{e^2}{|r-r’|} | \phi_K(r’)\rangle \right] \phi_J(r)\label{6.1.20}\]

\[- \sum_K \langle\phi_K(r’) |\dfrac{e^2}{|r-r’|} | \phi_J(r’)\rangle \phi_K(r)  = \epsilon_J \phi_J(r).\label{6.21}\]

In this expression, which is known as the Hartree-Fock equation, the same kinetic and nuclear attraction potentials occur as in the Hartree equation. Moreover, the same Coulomb potential

\[\sum_K \int \phi_K(r’) \dfrac{e^2}{|r-r’|} \phi_K(r’) dr’ = \sum_K \langle\phi_K(r’)|\dfrac{e^2}{|r-r’|} |\phi_K(r’)\rangle = \sum_K J_K (r)\label{6.1.22}\]

appears. However, one also finds a so-called exchange contribution to the Hartree-Fock potential that is equal to \(\sum_L \langle\phi_L(r’) |\dfrac{e^2}{|r-r’|} | \phi_J(r’)\rangle \phi_L(r)\) and is often written in short-hand notation as \(\sum_L K_L \phi_J(r)\). Notice that the Coulomb and exchange terms cancel for the \(L=J\) case; this causes the artificial self-interaction term \(J_L \phi_L(r) \) that can appear in the Hartree equations (unless one explicitly eliminates it) to automatically cancel with the exchange term \(K_L \phi_L(r)\) in the Hartree-Fock equations.

To derive the above Hartree-Fock equations, one must make use of the so-called Slater-Condon rules given in Section 6.1.2 of this Chapter (if you wish to follow all the details, it is probably wise to pause here and go to Section 6.1.2 to learn these rules and then return here to proceed) to express the Hamiltonian expectation value as

\[ \langle|\phi_1(1)\phi_2(2)\cdots \phi_{N-1}\phi_N(N)|H|\phi_1(1)\phi_2(2)\cdots \phi_{N-1}\phi_N(N)|\rangle\]


\[+\frac{1}{2}\sum_{j,k=1}^N\left[ \langle\phi_j(r)\phi_k(r')|\frac{e^2}{|r-r'|}|\phi_j(r)\phi_k(r')\rangle
- \langle\phi_j(r)\phi_k(r')|\frac{e^2}{|r-r'|}|\phi_k(r)\phi_j(r')\rangle\right]\] 

This expectation value is a sum of terms (the kinetic energy and electron-nuclear Coulomb potentials) that vary quadratically on the spin-orbitals (i.e., as \(\langle \phi| {\rm operator} |\phi\rangle\)) plus another sum (the Coulomb and exchange electron-electron interaction terms) that depend on the fourth power of the spin-orbitals (i.e., as \(\langle \phi\phi |{\rm operator} |\phi\phi \rangle\). When these terms are differentiated to minimize the expectation value, they generate factors that scale linearly and with the third power of the spin-orbitals. These are the factors

\[{-\dfrac{\hbar^2}{2m} \nabla^2  -\frac{Z_e^2}{r} } \phi_J(r)\] and \[\sum_K \langle\phi_K(r’) |\frac{e^2}{|r-r’|} | \phi_K(r’)\rangle \phi_J(r) - \sum_K \langle \phi_K(r’) |\frac{e^2}{|r-r’|} | \phi_J(r’)\rangle \phi_K(r) \label{6.1.23}\]

appearing in the Hartree-Fock equations shown above.

When the LCAO expansion of each Hartree-Fock (HF) spin-orbital is substituted into the above HF Schrödinger equation, a matrix equation is again obtained:

\[\sum_\mu \langle\chi_\nu |h_e| \chi_\mu\rangle C_{J,\mu} = \epsilon_J \sum_\mu \langle\chi_\nu|\chi_\mu\rangle C_{J,\mu}\label{6.1.24}\]

where the overlap integral \(\langle\chi_\nu|\chi_\mu\rangle\) is as defined earlier, and the \(h_e\) matrix element is

\[\langle\chi_\nu| h_e| \chi_\mu\rangle = \langle\chi_\nu| -\dfrac{\hbar^2}{2m} \nabla^2 |\chi_\mu\rangle + \langle\chi_\nu| -\frac{Ze^2}{r} |\chi_\mu\rangle \label{6.1.25}\]

\[+ \sum_{K,\eta,\gamma} C_{K,\eta} C_{K,\gamma} [\langle\chi_\nu(r) \chi_\eta(r’) |\frac{e^2}{|r-r’|} | \chi_\mu(r) \chi_\gamma(r’)\rangle - \langle\chi_\nu(r) \chi_\eta(r’) |\frac{e^2}{|r-r’|} | \chi_\gamma(r) \chi_\mu (r’)\rangle].\label{6.1.26}\]

Clearly, the only difference between this expression and the corresponding result of Hartree theory is the presence of the last term, the exchange integral. The SCF iterative procedure used to solve the Hartree equations is again used to solve the HF equations.

It is useful to reflect on the physical meaning of the Coulomb and exchange interactions between pairs of orbitals. For example, the Coulomb integral

\[ J_{1,2} =  \int |\phi_1(r)|^2  \frac{e^2}{|r-r’|} |\phi_2(r’)|^2 dr dr’ \label{6.1.27}\]\]

appropriate to the two orbitals shown in Figure 6.1.3 represents the Coulombic repulsion energy \(\dfrac{e^2}{|r-r’|}\) of two charge densities, \(|\phi_1|^2\) and \(|\phi_2|^2\), integrated over all locations \(r\) and \(r’\) of the two electrons.


Figure 6.1.3: An s and a p Orbital and Their Overlap Region

In contrast, the exchange integral

\[K_{1,2} = \int  \phi_1(r) \phi_2(r’) \frac{e^2}{|r-r’|}  \phi_2(r) \phi_1(r’) dr dr’\label{6.1.28}\]

can be thought of as the Coulombic repulsion between two electrons whose coordinates \(r\) and \(r’\) are both distributed throughout the “overlap region” \(\phi_1\) \(\phi_2\). This overlap region is where both \(\phi_1\) and \(\phi_2\) have appreciable magnitude, so exchange integrals tend to be significant in magnitude only when the two orbitals involved have substantial regions of overlap.

Finally, a few words are in order about one of the most computer time-consuming parts of any Hartree-Fock calculation (or those discussed later)- the task of evaluating and transforming the two-electron integrals

\[\langle\chi_\nu(r) \chi_\eta(r’) |\frac{e^2}{|r-r’|} | \chi_\mu(r) \chi_\gamma(r’)\rangle.\label{6.1.29}\]

When M GTOs are used as basis functions, the evaluation of \(\dfrac{M^4}{8}\) of these integrals often poses a major hurdle. For example, with 500 basis orbitals, there will be of the order of 7.8 x109 such integrals. With each integral requiring 2 words of disk storage (most integrals need to be evaluated in double precision), this would require at least 1.5 x104 Mwords of disk storage. Even in the era of modern computers that possess 500 Gby disks, this is a significant requirement. One of the more important technical advances that is under much current development is the efficient calculation of such integrals when the product functions \(\chi_\nu(r) \chi_\mu(r)\) and \(\chi_\gamma(r’) \chi_\eta(r’)\) that display the dependence on the two electrons’ coordinates r and r’ are spatially distant. In particular, so-called multipole expansions of these product functions are used to obtain more efficient approximations to their integrals when these functions are far apart. Moreover, such expansions offer a reliable way to ignore (i.e., approximate as zero) many integrals whose product functions are sufficiently distant. Such approaches show considerable promise for reducing the \(\dfrac{M^4}{8}\) two-electron integral list to one whose size scales much less strongly with the size of the AO basis and form an important component if efforts to achieve CPU and storage needs that scale linearly with the size of the molecule.

a. Koopmans’ Theorem

The HF-SCF equations \(h_e \phi_i = \epsilon_i \phi_i\)  imply that the orbital energies \(\epsilon_i\) can be written as:

\[\epsilon_i = \langle \phi_i | h_e | \phi_i \rangle = \langle \phi_i | T + V | \phi_i \rangle + \sum_{j({\rm occupied})} \langle \phi_i | J_j - K_j | \phi_i \rangle \]

\[= \langle \phi_i | T + V | \phi_i \rangle + \sum_{j({\rm occupied})} [ J_{i,j} - K_{i,j} ],\label{6.1.30}\]

where \(T + V\) represents the kinetic (\(T\)) and nuclear attraction (\(V\)) energies, respectively. Thus, \(\epsilon_i\) is the average value of the kinetic energy plus Coulombic attraction to the nuclei for an electron in \(\phi_i\) plus the sum over all of the spin-orbitals occupied in \(\psi\) of Coulomb minus Exchange interactions of these electrons with the electron in \(\phi_i\).

If \(\phi_i\) is an occupied spin-orbital, the \(j = i\) term \([ J_{i,i} - K_{i,i}]\) disappears in the above sum and the remaining terms in the sum represent the Coulomb minus exchange interaction of \(\phi_i\) with all of the \(N-1\) other occupied spin-orbitals. If \(\phi_i\) is a virtual spin-orbital, this cancelation does not occur because the sum over \(j\) does not include \(j = i\). So, one obtains the Coulomb minus exchange interaction of \(\phi_i\) with all \(N\) of the occupied spin-orbitals in \(\psi\). Hence the energies of occupied orbitals pertain to interactions appropriate to a total of \(N\) electrons, while the energies of virtual orbitals pertain to a system with \(N+1\) electrons. This difference is very important to understand and to keep in mind.

 Let us consider the following model of the detachment or attachment of an electron in an \(N\)-electron system.

  1. In this model, both the parent molecule and the species generated by adding or removing an electron are treated at the single-determinant level.
  2. The Hartree-Fock orbitals of the parent molecule are used to describe both species. It is said that such a model neglects orbital relaxation (i.e., the re-optimization of the spin-orbitals to allow them to become appropriate to the daughter species).

Within this model, the energy difference between the daughter and the parent can be written as follows (\(\phi_k\) represents the particular spin-orbital that is added or removed):

for electron detachment:

\[E_{N-1} - E_N = - \epsilon_k \label{6.1.31}\]

and for electron attachment:

\[E_N - E_{N+1} = - \epsilon_k .\label{6.1.32}\]

Let’s derive this result for the case in which an electron is added to the \(N+1^{st}\) spin-orbital. Again, using the Slater-Condon rules from Section 6.1.2 of this Chapter, the energy of the \(N\)-electron determinant with spin-orbitals \(\phi_1\) through \(f_N\) occupied is

\[E_N = \sum_{i=1}^N \langle \phi_i | T + V | \phi_i \rangle + \sum_{i=1}^{N} [ J_{i,j} - K_{i,j} ],\label{6.1.33}\]

which can also be written as

\[E_N = \sum_{i=1}^N \langle \phi_i | T + V | \phi_i \rangle + \frac{1}{2} \sum_{i,j=1}^{N} [ J_{i,j} - K_{i,j} ].\label{6.1.34}\]

Likewise, the energy of the \(N+1\)-electron determinant wave function is

\[E_{N+1} = \sum_{i=1}^{N+1} \langle \phi_i | T + V | \phi_i \rangle + \frac{1}{2} \sum_{i,j=1}^{N+1} [ J_{i,j} - K_{i,j} ].\label{6.1.35}\]

The difference between these two energies is given by

\[E_{N} – E_{N+1} = - \langle \phi_{N+1} | T + V | \phi_{N+1} \rangle - \frac{1}{2} \sum_{i=1}^{N+1} [ J_{i,N+1} - K_{i,N+1} ]\label{6.1.36}\]

\[- \frac{1}{2} \sum_{j=1}^{N+1} [ J_{N+1,j} - K_{N+1,j} ] = - \langle \phi_{N+1} | T + V | \phi_{N+1} \rangle - \sum_{i=1}^{N+1} [ J_{i,N+1} - K_{i,N+1} ]\label{6.1.37}\]

\[= - \epsilon_{N+1}.\label{6.1.38}\]

That is, the energy difference is equal to minus the expression for the energy of the \(N+1^{st}\) spin-orbital, which was given earlier.

So, within the limitations of the HF, frozen-orbital model, the ionization potentials (IPs) and electron affinities (EAs) are given as the negative of the occupied and virtual spin-orbital energies, respectively. This statement is referred to as Koopmans’ theorem; it is used extensively in quantum chemical calculations as a means of estimating IPs and EAs and often yields results that are qualitatively correct (i.e., ± 0.5 eV).

b. Orbital Energies and the Total Energy

The total HF-SCF electronic energy can be written as:

\[E = \sum_{i({\rm occupied})} \langle \phi_i | T + V | \phi_i  \rangle + \sum_{i>j_{({\rm occupied})}} [ J_{i,j} - K_{i,j} ] \label{6.1.39}\]

and the sum of the orbital energies of the occupied spin-orbitals is given by:

\[\sum_{i({\rm occupied})} \epsilon_i = \sum_{i({\rm occupied})} \langle \phi_i | T + V | \phi_i \rangle + \sum_{i,j({\rm occupied})} [J_{i,j} - K_{i,j} ]. \label{6.1.40}\]

These two expressions differ in a very important way; the sum of occupied orbital energies double counts the Coulomb minus exchange interaction energies. Thus, within the Hartree-Fock approximation, the sum of the occupied orbital energies is not equal to the total energy. This finding teaches us that we can not think of the total electronic energy of a given orbital occupation in terms of the orbital energies alone. We need to also keep track of the inter-electron Coulomb and Exchange energies.

5. Molecular Orbitals

Before moving on to discuss methods that go beyond the HF model, it is appropriate to examine some of the computational effort that goes into carrying out a HF SCF calculation on a molecule. The primary differences that appear when molecules rather than atoms are considered are

  1. The electronic Hamiltonian \(H_e\) contains not only one nuclear-attraction Coulomb potential \(\sum_j Ze^2/r_j\), but a sum of such terms, one for each nucleus in the molecule:

\[\sum_a \sum_j \dfrac{Z_a e^2}{|r_j-R_a|}, \label{6.1.41}\]

whose locations are denoted \(R_a\).

  1. One has AO basis functions of the type discussed above located on each nucleus of the molecule. These functions are still denoted \(\chi_\mu(r-R_a)\), but their radial and angular dependences involve the distance and orientation of the electron relative to the particular nucleus on which the AO is located.

Other than these two changes, performing a SCF calculation on a molecule (or molecular ion) proceeds just as in the atomic case detailed earlier. Let us briefly review how this iterative process occurs.

Once atomic basis sets have been chosen for each atom, the one- and two-electron integrals appearing in the \(H_e\) and overlap matrices must be evaluated. There are numerous highly efficient computer codes that allow such integrals to be computed for \(s\), \(p\), \(d\), \(f\), and even \(g\), \(h\), and \(i\) basis functions. After executing one of these so-called integral packages for a basis with a total of \(M\) functions, one has available (usually on the computer's hard disk) of the order of \(\dfrac{M^2}{2}\) one-electron (\(\langle \chi_\mu | H_e | \chi_\nu \rangle\) and \(\langle \chi_\mu | \chi_\nu \rangle\)) and \(\dfrac{M^4}{8}\) two-electron (\(\langle \chi_\mu \chi_\delta  | \chi_\nu \chi_\kappa \rangle\)) integrals. When treating extremely large atomic orbital basis sets (e.g., 500 or more basis functions), modern computer programs calculate the requisite integrals, but never store them on the disk. Instead, their contributions to the \(\langle\chi_\mu |H_e|\chi_\nu\rangle\) matrix elements are accumulated on the fly after which the integrals are discarded. This is usually referred to as the direct integral-driven approach.

a. Shapes, Sizes, and Energies of Orbitals

 Each molecular spin-orbital (MO) that results from solving the HF SCF equations for a molecule or molecular ion consists of a sum of components involving all of the basis AOs:

\[\phi_j = \sum_\mu  C_{J,\mu} \chi_\mu.\label{6.1.42}\]

In this expression, the \(C_{j,\mu}\)  are referred to as LCAO-MO coefficients because they tell us how to linearly combine AOs to form the MOs. Because the AOs have various angular shapes (e.g., \(s\), \(p\), or \(d\) shapes) and radial extents (i.e., different orbital exponents), the MOs constructed from them can be of different shapes and radial sizes. Let’s look at a few examples to see what I mean.

The first example is rather simple and pertains to two H atoms combining to form the \(H_2\) molecule. The valence AOs on each H atom are the \(1s\) AOs; they combine to form the two valence MOs (\(\sigma\) and \(\sigma^*\)) depicted in Figure 6.1.4.



Figure 6.1.4: Two \(1s\) Hydrogen Atomic Orbitals Combine to Form a Bonding and Antibonding Molecular Orbital

The bonding MO labeled s has LCAO-MO coefficients of equal sign for the two \(1s\) AOs, as a result of which this MO has the same sign near the left H nucleus (A) as near the right H nucleus (B). In contrast, the antibonding MO labeled \(\sigma^*\) has LCAO-MO coefficients of different sign for the A and B \(1s\) AOs. As was the case in the Hückel or tight-binding model outlined in Chapter 2, the energy splitting between the two MOs depends on the overlap \(\langle \chi_{1sA}|\chi_{1sB} \rangle\) between the two AOs which, in turn, depends on the distance \(R\) between the two nuclei.

An analogous pair of bonding and antibonding MOs arises when two \(p\) orbitals overlap sideways as in ethylene to form \(\pi\) and \(\pi^*\) MOs which are illustrated in Figure 6.1.5.



Figure 6.1.5 Two \(p_\pi\) Atomic Orbitals on Carbon Atoms Combine to Form a Bonding and Antibonding Molecular Orbital.

The shapes of these MOs clearly are dictated by the shapes of the AOs that comprise them and the relative signs of the LCAO-MO coefficients that relate the MOs to AOs. For the \(\pi\) MO, these coefficients have the same sign on the left and right atoms; for the \(\pi^*\) MO, they have opposite signs.

I should stress that the signs and magnitudes of the LCAO-MO coefficients arise as eigenvectors of the HF SCF matrix eigenvalue equation:

\[\sum_\mu \langle \chi_\nu|h_e| \chi_\mu \rangle  C_{j,m} = \epsilon_j \sum_\mu \langle \chi_\nu|\chi_\mu \rangle C_{j,m} \]

It is a characteristic of such eigenvalue problems for the lower energy eigenfunctions to have fewer nodes than the higher energy solutions as we learned from several examples that we solved in Part 1 of this text.

Another thing to note about the MOs shown above is that they will differ in their quantitative details, but not in their overall shapes, when various functional groups are attached to the ethylene molecule’s C atoms. For example, if electron-withdrawing groups such as Cl, OH or Br are attached to one of the C atoms, the attractive potential experienced by a \(\pi\) electron near that C atom will be enhanced relative to the potential near the other C atom. As a result, the bonding MO will have larger LCAO-MO coefficients Ck,m belonging to tighter basis AOs \(\chi_\mu\) on this C atom. This will make the bonding \(\pi\) MO more radially compact in this region of space, although its nodal character and gross shape will not change. Alternatively, an electron donating group such as H3C- or t-butyl attached to one of the C centers will cause the \(\pi\) MO to be more diffuse (by making its LCAO-MO coefficients for more diffuse basis AOs larger).

In addition to MOs formed primarily of AOs of one type (i.e., for \(H_2\) it is primarily s-type orbitals that form the \(\sigma\) and \(\sigma^*\) MOs; for ethylene’s \(\pi\) bond, it is primarily the C \(2p\) AOs that contribute), there are bonding and antibonding MOs formed by combining several AOs. For example, the four equivalent C-H bonding MOs in \(CH_4\) shown in Figure 6.1. 6 each involve C \(2s\) and \(2p\) as well as H \(1s\) basis AOs.



Figure 6.1. 6 The Four C-H Bonds in Methane

The energies of the MOs depend on two primary factors: the energies of the AOs from which the MOs are constructed and the overlap between these AOs. The pattern in energies for valence MOs formed by combining pairs of first-row atoms to form homo-nuclear diatomic molecules is shown in Figure 6.1. 7.



Figure 6.1.7 Energies of the Valence Molecular Orbitals in Homonuclear Diatomics Involving First-Row Atoms


In this figure, the core MOs formed from the \(1s\) AOs are not shown; only those MOs formed from \(2s\) and \(2p\) AOs appear. The clear trend toward lower orbital energies as one moves from left to right is due primarily to the trends in orbital energies of the constituent AOs. That is, F being more electronegative than \(N\) has a lower-energy \(2p\) orbital than does \(N\).

b. Bonding, Anti-bonding, Non-bonding, and Rydberg Orbitals

As noted above, when valence AOs combine to form MOs, the relative signs of the combination coefficients determine, along with the AO overlap magnitudes, the MO’s energy and nodal properties. In addition to the bonding and antibonding MOs discussed and illustrated earlier, two other kinds of MOs are important to know about.

Non-bonding MOs arise, for example, when an orbital on one atom is not directed toward and overlapping with an orbital on a neighboring atom. For example, the lone pair orbitals on \(H_2O\) or on the oxygen atom of \(H_2C=O\) are non-bonding orbitals. They still are described in the LCAO-MO manner, but their \(\chi_{m,i}\) coefficients do not contain dominant contributions from more than one atomic center.

Finally, there is a type of orbital that all molecules possess but that is ignored in most elementary discussions of electronic structure. All molecules have so-called Rydberg orbitals. These orbitals can be thought of as large diffuse orbitals that describe the regions of space an electron would occupy if it were in the presence of the corresponding closed-shell molecular cation. Two examples of such Rydberg orbitals are shown in Figure 6.1.8. On the left, we see the Rydberg orbital of \(NH_4\) and on the right, that of \(H_3N-CH_3\). The former species can be thought of as a closed-shell ammonium cation \(NH_4^+\)  around which a Rydberg orbital resides. The latter is protonated methyl amine with its Rydberg orbital.


Figure 6.1.8 Rydberg Orbitals of \(NH_4^+\) and of Protonated Methyl Amine

6.1.2. Deficiencies in the Single Determinant Model

To achieve reasonable chemical accuracy (e.g., ± 5 kcal/mole in EAs or IPs or bond energies) in electronic structure calculations, one can not describe the wave function \(\psi\) in terms of a single determinant. The reason such a wave function is inadequate is because the spatial probability density functions are not correlated. This means the probability of finding one electron at position r is independent of where the other electrons are, which is absurd because the electrons’ mutual Coulomb repulsion causes them to avoid one another. This mutual avoidance is what we call electron correlation because the electrons’ motions, as reflected in their spatial probability densities, are correlated (i.e., inter-related). Let us consider a simple example to illustrate this problem with single determinant functions. The \(|1s\alpha(r) 1s\beta(r’)|\) determinant, when written as

\[|1s\alpha(r) 1s\beta(r’)| = \frac{1}{\sqrt{2}}\{1s\alpha(r) 1s\beta(r’) - 1s\alpha(r’) 1s\beta(r)\} \]

can be multiplied by itself to produce the 2-electron spin- and spatial- probability density:

\[P(r, r’) = \frac{1}{2}\{[1s\alpha(r) 1s\beta(r’)]^2  + [1s\alpha(r’) 1s\beta(r)]^2 -1s\alpha(r) 1s\beta(r’) 1s\alpha(r’) 1s\beta(r) - 1s\alpha(r’) 1s\beta(r) 1s\alpha(r) 1s\beta(r’)\}.\]

If we now integrate over the spins of the two electrons and make use of

\[\langle a|a \rangle = \langle b|b \rangle = 1 \label{6.1.1a}\]


\[\langle a|b \rangle = \langle b|a\rangle = 0 \label{6.1.1b}\]

we obtain the following spatial (i.e., with spin absent) probability density:

\[P(r,r’) = |1s(r)|^2  |1s(r’)|^2. \]

This probability, being a product of the probability density for finding one electron at r times the density of finding another electron at \(r’\), clearly has no correlation in it. That is, the probability of finding one electron at r does not depend on where \((r’)\) the other electron is. This product form for \(P(r,r’)\) is a direct result of the single-determinant form for y, so this form must be wrong if electron correlation is to be accounted for.

1. Electron Correlation

Now, we need to ask how \(\psi\) should be written if electron correlation effects are to be taken into account. As we now demonstrate, it turns out that one can account for electron avoidance by taking \(\psi\) to be a combination of two or more determinants that differ by the promotion of two electrons from one orbital to another orbital. For example, in describing the \(\pi^2\) bonding electron pair of an olefin or the \(ns^2\) electron pair in alkaline earth atoms, one mixes in doubly excited determinants of the form \((\pi^*)^2\) or \(np^2\), respectively.

Briefly, the physical importance of such doubly-excited determinants can be made clear by using the following identity involving determinants:

\[C_1 | ..\phi_\alpha \phi_\beta​..| - C_2 | ..\phi'_\alpha \phi'_\beta..|\]

\[= \dfrac{C_1}{2} { | ..( \phi - x\phi')\alpha ( \phi + x\phi')b..| - | ..( \phi - x\phi')\beta ( \phi + x\phi')\alpha..| },\]


\[x = \sqrt{\dfrac{C_2}{C_1}} .\]

This identity is important to understand, so please make sure you can work through the algebra needed to prove it. It allows one to interpret the combination of two determinants that differ from one another by a double promotion from one orbital \((\phi)\) to another \((\phi')\) as equivalent to a singlet coupling (i.e., having \(\alpha\beta-\beta\alpha\) spin function) of two different orbitals \((\phi - x\phi')\) and \((\phi  + x\phi')\) that comprise what are called polarized orbital pairs. In the simplest embodiment of such a configuration interaction (CI) description of electron correlation, each electron pair in the atom or molecule is correlated by mixing in a configuration state function (CSF) in which that electron pair is doubly excited to a correlating orbital. A CSF is the minimum combination of determinants needed to express the proper spin eigenfunction for a given orbital occupation.

In the olefin example mentioned above, the two non-orthogonal polarized orbital pairs involve mixing the p and p* orbitals to produce two left-right polarized orbitals as depicted in Figure 6.1.9:


Figure 6.1. 9 Left and Right Polarized Orbitals of an Olefin


In this case, one says that the \(\pi^2\) electron pair undergoes left-right correlation when the \((\pi^*)^2\) determinant is mixed into the CI wave function.

In the alkaline earth atom case, the polarized orbital pairs are formed by mixing the \(ns\) and \(np\) orbitals (actually, one must mix in equal amounts of \(p_x, p_y\) , and \(p_z\) orbitals to preserve overall \(^1S\) symmetry in this case), and give rise to angular correlation of the electron pair. Such a pair of polarized orbitals is shown in Figure 6.1.10.



Figure 6.1.10 Angularly Polarized Orbital Pairs


More specifically, the following four determinants are found to have the largest  amplitudes in \(\psi\) for Be:

\[\psi \cong C_1 |1s^22s^2 | - C_2 [|1s^22p_x^2 | +|1s^22p_y^2 | +|1s^22p_z^2 |].\]

The fact that the latter three terms possess the same amplitude \(C_2\) is a result of the requirement that a state of \(^1S\) symmetry is desired. It can be shown that this function is equivalent to:

\[\psi \cong \frac{1}{6} C_1 |1s\alpha1s\beta  [[(2s-a2p_x)\alpha(2s+a2p_x)\beta - (2s-a2p_x)\beta(2s+a2p_x)\alpha]\\
+[(2s-a2p_y)\alpha(2s+a2p_y)\beta - (2s-a2p_y)\beta(2s+a2p_y)\alpha]\\
+[(2s-a2p_z)\alpha(2s+a2p_z)\beta -  (2s-a2p_z)\beta(2s+a2p_z)\alpha] ] |,\]


where \(a = \sqrt{3C_2/C_1}\).

Here two electrons occupy the \(1s\) orbital (with opposite, \(\alpha\) and \(\beta\) spins), and are thus not being treated in a correlated manner, while the other pair resides in \(2s\)/\(2p\) polarized orbitals in a manner that instantaneously correlates their motions. These polarized orbital pairs  \((2s ± a 2p_{x,y,\text{ or }z})\) are formed by combining the \(2s\) orbital with the \(2p_{x,y,\text{ or }z}\) orbital in a ratio determined by \(C_2/C_1\).

This ratio \(C_2/C_1\) can be shown using perturbation theory to be proportional to the magnitude of the coupling \(\langle 1s^22s^2 |H|1s^22p^2 \rangle\) matrix element between the two configurations involved and inversely proportional to the energy difference \([\langle 1s^22s^2H|1s^22s^2 \rangle - \langle 1s^22p^2|H|1s^22p^2 \rangle]\) between these configurations. In general, configurations that have similar Hamiltonian expectation values and that are coupled strongly give rise to strongly mixed (i.e., with large \(|C_2/C_1|\) ratios) polarized orbital pairs.

II.Later in this Chapter, you will learn how to evaluate Hamiltonian matrix elements between pairs of antisymmetric wave functions. If you are anxious to learn this now, go to the subsection entitled The Slater-Condon Rules and read that before returning here.

In each of the three equivalent terms in the alkaline earth wave function, one of the valence electrons moves in a \(2s+a2p\) orbital polarized in one direction while the other valence electron moves in the \(2s-a2p\) orbital polarized in the opposite direction. For example, the first term \([(2s-a2p_x)\alpha(2s+a2p_x)\beta - (2s-a2p_x)\beta(2s+a2p_x)\alpha]\) describes one electron occupying a \(2s-a2p_x\) polarized orbital while the other electron occupies the \(2s+a2p_x\) orbital. The electrons thus reduce their Coulomb repulsion by occupying different regions of space; in the SCF picture \(1s^22s^2\), both electrons reside in the same \(2s\) region of space. In this particular example, the electrons undergo angular correlation to avoid one another.

The use of doubly excited determinants is thus seen as a mechanism by which \(\psi\) can place electron pairs, which in the single-configuration picture occupy the same orbital, into different regions of space (i.e., each one into a different member of the polarized orbital pair) thereby lowering their mutual Coulomb repulsion. Such electron correlation effects are extremely important to include if one expects to achieve chemically meaningful accuracy (i.e., ± 5 kcal/mole).

2. Essential Configuration Interaction

There are occasions in which the inclusion of two or more determinants in \(\psi\) is essential to obtaining even a qualitatively correct description of the molecule’s electronic structure. In such cases, we say that we are including essential correlation effects. To illustrate, let us consider the description of the two electrons in a single covalent bond between two atoms or fragments that we label X and Y. The fragment orbitals from which the bonding \(\sigma\) and antibonding \(\sigma^*\) MOs are formed we will label \(s_X\)  and \(s_Y\), respectively.

Several spin- and spatial- symmetry adapted 2-electron determinants (i.e., CSFs) can be formed by placing two electrons into the \(\sigma\) and \(\sigma^*\) orbitals. For example, to describe the singlet determinant corresponding to the closed-shell \(\sigma^2\) orbital occupancy, a single Slater determinant 

\[^1\Sigma (0)  =  |\sigma\alpha \sigma\beta|  =  \frac{1}{\sqrt{2}} [\sigma\alpha(1)\sigma\beta(2) - \sigma\beta(1)\sigma\alpha(2)  ]\]

suffices. An analogous expression for the \((\sigma^*)^2\)  determinant is given by

\[{}^1\Sigma^{**} (0)  =  | \sigma^*\alpha \sigma^*\beta |  =  \frac{1}{\sqrt{2}} [ \sigma^*\alpha(1) \sigma^*\beta(2) - \sigma^*\alpha(2) \sigma^*\beta(1) ]\]

Also, the \(M_S = 1\) component of the triplet state having  \(\sigma\sigma^*\) orbital occupancy can be written as a single Slater determinant:

\[{}^3\Sigma^{*} (1)  =  |\sigma\alpha \sigma^*\alpha |  =  \frac{1}{\sqrt{2}} [\sigma\alpha(1) \sigma^*\alpha(2) -  \sigma^*\alpha(1)\sigma\alpha(2)  ],\]

 as can  the \(M_S = -1\) component of the triplet state

\[{}^3\Sigma^{*}(-1)  =  |\sigma\beta \sigma^*\beta |  =  \frac{1}{\sqrt{2}} [\sigma\beta(1) \sigma^*\beta(2) -  \sigma^*\beta(1)\sigma\beta(2)  ].\]

However, to describe the singlet and \(M_S = 0\) triplet states belonging to the  \(\sigma\sigma^*\) occupancy, two determinants are needed:

\[  {}^1\Sigma^{*} (0)  = \frac{1}{\sqrt{2}}   [\sigma\alpha \sigma^*\beta -  \sigma\beta\sigma^*\alpha ]\]

is the singlet  and

\[ {}^3\Sigma^{*}(0)  = \frac{1}{\sqrt{2}}   [\sigma\alpha \sigma^*\beta +  \sigma\beta\sigma^*\alpha ] \]

is the triplet (note, you can obtain this \(M_S = 0\) triplet by applying \(\textbf{S}_- = \textbf{S}_- (1) + \textbf{S}_- (2)\) to the \(M_S = 1\) triplet). In each case, the spin quantum number S, its z-axis projection \(M_S\) , and the \(L\) quantum number are given in the conventional \(^{2S+1}\Lambda(M_S)\) term symbol notation.

As the distance \(R\) between the X and Y fragments is changed from near its equilibrium value of \(R_e\) and approaches infinity, the energies of the \(\sigma\) and \(\sigma^*\) orbitals vary in a manner well known to chemists as depicted in Figure 6.1.11 if X and Y are identical.


Figure 6.1.11: Orbital Correlation Diagram Showing Two \(\sigma\)- Type Orbitals Combining to Form a Bonding and an Antibonding Molecular Orbital.

If X and Y are not identical, the \(s_x\) and \(s_y\) orbitals still combine to form a bonding \(\sigma\) and an antibonding \(\sigma^*\) orbital.  The energies of these orbitals, for R values ranging from near \(R_e\) to \(R\rightarrow \infty\), are depicted in Figure 6.1.12 for the case in which X is more electronegative than Y.



Figure 6.1.12: Orbital Correlation Diagram For \(\sigma\)- Type Orbitals in the Heteronuclear Case

The energy variation in these orbital energies gives rise to variations in the energies of the six determinants listed above. As \(R \rightarrow \infty\), the determinants’ energies are difficult to intuit because the \(\sigma\) and \(\sigma^*\) orbitals become degenerate (in the homonuclear case) or nearly so (in the \(X \ne Y\) case). To pursue this point and arrive at an energy ordering for the determinants that is appropriate to the \(R \rightarrow \infty\) region, it is useful to express each such function in terms of the fragment orbitals \(s_x \) and \(s_y\) that comprise \(\sigma\) and \(\sigma^*\). To do so, the LCAO-MO expressions for \(\sigma\) and \(\sigma^*\),

\[\sigma = C [s_x + z s_y]\]


\[\sigma^* = C^* [z s_x  - s_y],\]

are substituted into the Slater determinant definitions given above.  Here \(C\) and \(C^*\) are the normalization constants.  The parameter \(z\) is 1.0 in the homonuclear case and deviates from 1.0 in relation to the \(s_x\) and \(s_y\) orbital energy difference (if \(s_x\) lies below \(s_y\), then \(z < 1.0\); if \(s_x\) lies above \(s_y\), \(z > 1.0\)).

Let us examine the \(X=Y\) case to keep the analysis as simple as possible. The process of substituting the above expressions for \(\sigma\) and \(\sigma^*\) into the Slater determinants that define the singlet and triplet functions can be illustrated as follows for the \(^1\Sigma(0)\) case:

\[{}^1\Sigma(0) = |\sigma\alpha \sigma\beta| = C_2 | (s_x + s_y) \alpha(s_x + s_y) \beta| \]

\[= C_2 [|s_x \alpha s_x b| + |s_y \alpha s_y \beta| + |s_x \alpha s_y \beta| + |s_y \alpha s_x \beta|] \]

The first two of these atomic-orbital-based Slater determinants (\(|s_x \alpha s_x  b|\) and \(|s_y \alpha s_y \beta|\)) are called ionic because they describe atomic orbital occupancies, which are appropriate to the \(R \rightarrow \infty\) region that correspond to \(X \bullet\bullet + X \) and \(X + X \bullet\bullet\)  valence bond structures, while \(|s_x \alpha s_y \beta|\) and \(|s_y \alpha s_x \beta|\) are called "covalent" because they correspond to \(X\bullet   + X\bullet\) structures.

In similar fashion, the remaining five determinant functions may be expressed in terms of fragment-orbital-based Slater determinants. In so doing, use is made of the antisymmetry of the Slater determinants (e.g., \(| \phi_1 \phi_2 \phi_3 | =  - | \phi_1 \phi_3 \phi_2 |\)), which implies that any determinant in which two or more spin-orbitals are identical vanishes \(| \phi_1 \phi_2 \phi_2 | =  - | \phi_1 \phi_2 \phi_2 | = 0\). The result of decomposing the MO-based determinants into their fragment-orbital components is as follows:

           \[ {}^1\Sigma^{**} (0)  = |\sigma^*\alpha \sigma^*\beta |  = C^*{}^2 [ |s_x \alpha s_x \beta| + |s_y \alpha s_y \beta| - |s_x \alpha s_y \beta| - |s_y \alpha s_x \beta|]\]

             \[ {}^1\Sigma^{*} (0)  =\frac{1}{\sqrt{2}}[ |\sigma\alpha \sigma^*\beta | - |\sigma\beta \sigma^*\alpha | ]= CC^* \sqrt{2} [|s_x \alpha s_x \beta| - |s_y \alpha s_y \beta|] \]

                       \[ {}^3\Sigma^{*} (1) = |\sigma\alpha \sigma^*\alpha |  = CC^* 2|s_y \alpha s_x \alpha|\]

           \[ {}^3\Sigma^{*} (0) = \frac{1}{\sqrt{2}}[ \sigma\alpha \sigma^*\beta | + |\sigma\beta \sigma^*\alpha |]=CC^* \sqrt{2} [|s_y \alpha s_x \beta| - |s_x \alpha s_y \beta|]\]

\[  {}^3\Sigma^{*} (-1) = |\sigma\alpha \sigma^*\alpha | = CC^* 2|s_y \beta s_x \beta|\]

These decompositions of the six valence determinants into fragment-orbital or valence bond components allow the \(R  = \infty\) energies of these states to specified.  For example, the fact that both \({}^1\Sigma\) and \({}^1\Sigma^{**}\) contain 50% ionic and 50% covalent structures implies that, as \(R \rightarrow \infty\), both of their energies will approach the average of the covalent and ionic atomic energies \(\frac{1}{2} [E (X\bullet )  + E (X\bullet )  + E (X) + E ( X \bullet\bullet) ]\).  The \({}^1\Sigma^{*}\) energy approaches the purely ionic value \(E (X)+ E (X \bullet\bullet )\) as \(R \rightarrow \infty\). The energies of \({}^3\Sigma^{*}(0), {}^3\Sigma^{*}(1)\) and \({}^3\Sigma^{*}(-1)\) all approach the purely covalent value \(E (X\bullet ) + E (X\bullet )\)  as \(R \rightarrow \infty\).

The behaviors of the energies of the six valence determinants as \(R\) varies are depicted in Figure 6.1.13 for situations in which the homolytic bond cleavage is energetically favored (i.e., for which \(E (X\bullet ) + E (X\bullet )  <  E (X) +E ( X \bullet\bullet)\)).



Figure 6.1. 13: Configuration Correlation  Diagram Showing How the Determinants’ Energies Vary With \(R\).

It is essential to realize that the energies of the determinants do not represent the energies of the true electronic states. For \(R\)-values at which the determinant energies are separated widely, the true state energies are rather well approximated by individual determinant energies; such is the case near Re for the \({}^1\Sigma\) state.

However, at large \(R\), the situation is very different, and it is in such cases that what we term essential configuration interaction occurs. Specifically, for the \(X=Y\) example, the \({}^1\Sigma\) and \({}^1\Sigma^{**}\) determinants undergo essential CI coupling to form a pair of states of \({}^1\Sigma\) symmetry (the \({}^1\Sigma^{*}\) CSF cannot partake in this CI mixing because it is of ungerade symmetry; the \({}^3\Sigma^{*}\) states can not mix because they are of triplet spin symmetry). The CI mixing of the \({}^1\Sigma\) and \({}^1\Sigma^{**}\) determinants is described in terms of a 2x2 secular problem

\[ \left[\begin{array}{cc}
\langle ^1\Sigma | H | ^1\Sigma \rangle &  \langle ^1\Sigma | H | ^1\Sigma^{**} \rangle \\
\langle ^1\Sigma^{**} | H | ^1\Sigma \rangle &  \langle ^1\Sigma^{**} | H | ^1\Sigma^{**} \rangle 
= E\left[\begin{array}{c}A\\B\end{array}\right] \] 

The diagonal entries are the determinants’ energies depicted in Figure 6.1.13. The off-diagonal coupling matrix elements can be expressed in terms of an exchange integral between the \(\sigma\) and \(\sigma^*\) orbitals:

\[\langle {}^1\Sigma|H|{}^1\Sigma^{**} \rangle = \langle |\sigma\alpha \sigma\beta|H||\sigma^*\alpha \sigma^*\beta |\rangle = \langle \sigma\sigma|| \sigma^*\sigma^* \rangle = K_{\sigma \sigma^*}\]

Later in this Chapter, you will learn how to evaluate Hamiltonian matrix elements between pairs of antisymmetric wave functions and to express them in terms of one- and two-electron integrals.  If you are anxious to learn this now, go to the subsection entitled the Slater-Condon Rules and read that before returning here.

At \(R \rightarrow \infty\), where the \({}^1\Sigma\) and \({}^1\Sigma^{**}\) determinants are degenerate, the two solutions to the above CI matrix eigenvalue problem are:

\[ E =\frac{1}{2} [  E (X\bullet ) + E (X\bullet )  + E (X)+ E (X \bullet\bullet) ]  \mp \langle \sigma\sigma | \frac{1}{r_{12}} | \sigma^* \sigma^*\rangle \]

with respective amplitudes for the \({}^1\Sigma\) and \({}^1\Sigma^{**}\) CSFs given by

\[ A_\mp = \pm   \frac{1}{\sqrt{2}} ;   \hspace{15pt}      B_{\mp} = \mp ​\frac{1}{\sqrt{2}}.\]

The first solution thus has

\[ \psi_{-}  = \frac{1}{\sqrt{2}} [|\sigma\alpha \sigma\beta| - |\sigma^*\alpha \sigma^*\beta |]\]

which, when decomposed into atomic orbital components, yields

\[\psi_{-} = \frac{1}{\sqrt{2}} [ |s_x\alpha s_y\beta| - |s_x\beta s_y\alpha|].\]

The other root has

\[  \psi_{+}  = \frac{1}{\sqrt{2}}  [|\sigma\alpha \sigma\beta| + |\sigma^*\alpha  \sigma^*\beta |] = \frac{1}{\sqrt{2}} [ |s_x\alpha  s_x\beta| + |s_y a  s_y\beta|].\]

So, we see that \({}^1\Sigma\) and \({}^1\Sigma^{**}\), which both contain 50% ionic and 50% covalent parts, combine to produce \(\psi_{-}\) which is purely covalent and \(\psi_{+}\) which is purely ionic.

The above essential CI mixing of \({}^1\Sigma\) and \({}^1\Sigma^{**}\) as \(R \rightarrow \infty\) qualitatively alters the energy diagrams shown above. Descriptions of the resulting valence singlet and triplet S states are given in Figure 6.1.14 for homonuclear situations in which covalent products lie below the ionic fragments.


Figure 6.1.14: State Correlation Diagram Showing How the Energies of the States, Comprised of Combinations of Determinants, Vary With \(R\).

3. Various Approaches to Electron Correlation

There are numerous procedures currently in use for determining the best Born-Oppenheimer electronic wave function that is usually expressed in the form:

\[\psi = \sum_i m_i C_I \Phi_i,\]

where \(\Phi_I\) is a spin-and space- symmetry-adapted configuration state function (CSF) that consists of one or more determinants  \(| \phi_{I1}\phi_{I2}\phi_{I3}\cdots \phi_{IN}|\) combined to produce the desired symmetry. In all such wave functions, there are two kinds of parameters that need to be determined- the CI coefficients and the LCAO-MO coefficients describing the fIk in terms of the AO basis functions. The most commonly employed methods used to determine these parameters include:

a. The CI Method

In this approach, the LCAO-MO coefficients are determined first usually via a single-configuration HF SCF calculation. The CI coefficients are subsequently determined by making the expectation value \(\langle \psi | H | \psi \rangle / \langle \psi | \psi \rangle\) variationally stationary with \(\psi\) chosen to be of the form

\[\psi = \sum_I C_I \Phi_I.\]

As with all such linear variational problems, this generates a matrix eigenvalue equation

\[\sum_J \langle\Phi_I|H|\Phi_J\rangle C_J=EC_I\]

to be solved for the optimum {\(C_I\)} coefficients and for the optimal energy \(E\).

The CI wave function is most commonly constructed from spin- and spatial- symmetry adapted combinations of determinants called configuration state functions (CSFs) \(\Phi_J\)  that include:

  1. The so-called reference CSF that is the SCF wave function used to generate the molecular orbitals \(\phi_i\).
  2. CSFs generated by carrying out single, double, triple, etc. level excitations (i.e., orbital replacements) relative to the reference CSF. CI wave functions limited to include contributions through various levels of excitation are denoted S (singly), D (doubly), SD (singly and doubly), SDT (singly, doubly, and triply) excited.

The orbitals from which electrons are removed can be restricted to focus attention on correlations among certain orbitals. For example, if excitations out of core orbitals are excluded, one computes a total energy that contains no core correlation energy. The number of CSFs included in the CI calculation can be large. CI wave functions including 5,000 to 50,000 CSFs are routine, and functions with one to several billion CSFs are within the realm of practicality.

The need for such large CSF expansions can be appreciated by considering (i) that each electron pair requires at least two CSFs to form the polarized orbital pairs discussed earlier in this Chapter, (ii) there are of the order of \(\dfrac{N(N-1)}{2} = X\) electron pairs for a molecule containing \(N\) electrons, hence (iii) the number of terms in the CI wave function scales as \(2^X\). For a molecule containing ten electrons, there could be \(2^{45} = 3.5 \times 10^{13}\) terms in the CI expansion. This may be an over estimate of the number of CSFs needed, but it demonstrates how rapidly the number of CSFs can grow with the number of electrons.

The Hamiltonian matrix elements \(H_{I,J}\) between pairs of CSFs are, in practice, evaluated in terms of one- and two- electron integrals over the molecular orbitals. Prior to forming the \(H_{I,J}\) matrix elements, the one- and two- electron integrals, which can be computed only for the atomic (e.g., STO or GTO) basis, must be transformed to the molecular orbital basis. This transformation step requires computer resources proportional to the fifth power of the number of basis functions, and thus is one of the more troublesome steps in most configuration interaction (and most other correlated) calculations.

To transform the two-electron integrals \(\langle \chi_a(r)\chi_b(r')|\dfrac{1}{|r-r'|}|\chi_c(r)\chi_d(r')\rangle\)  from this AO basis to the MO basis, one proceeds as follows:

1. First one utilizes the original AO-based integrals to form a partially transformed set of integrals


\[\langle \chi_a(r)\chi_b(r')|\dfrac{1}{|r-r'|}|\chi_c(r)\phi_l(r')\rangle = \sum_d C_{l,d} \langle \chi_a(r)\chi_b(r')|\dfrac{1}{|r-r'|}|\chi_c(r)\chi_d(r')\rangle.\]


This step requires of the order of \(M^5\) operations.

2. Next one takes the list \(\langle \chi_a(r)\chi_b(r')|\dfrac{1}{|r-r'|}|\chi_c(r)\phi_l(r')\rangle\) and carries out another so-called one-index transformation


\[\langle \chi_a(r)\chi_b(r')|\dfrac{1}{|r-r'|}|\phi_k(r)\phi_l(r')\rangle = \sum_c C_{k,c} \langle \chi_a(r)\chi_b(r')|\dfrac{1}{|r-r'|}|\chi_c(r)\phi_l(r')\rangle.\]


3. This list \(\langle \chi_a(r)\chi_b(r')|\dfrac{1}{|r-r'|}|\phi_k(r)\phi_l(r')\rangle\) is then subjected to another one-index transformation to generate \(\langle \chi_a(r)\phi_j(r')|\dfrac{1}{|r-r'|}|\phi_k(r)\phi_l(r')\rangle\), after which

4. \(\langle \chi_a(r)\phi_j(r')|\dfrac{1}{|r-r'|}|\phi_k(r)\phi_l(r')\rangle\) is subjected to the fourth one-index transformation to form the final MO-based integral list \(\langle \phi_i(r)\phi_j(r')|\dfrac{1}{|r-r'|}|\phi_k(r)\phi_l(r')\rangle\). In total, these four transformation steps require \(4M^5\) computer operations.

A variant of the CI method that is sometimes used is called the multi-configurational self-consistent field (MCSCF) method. To derive the working equations of this approach, one minimizes the expectation value of the Hamiltonian for a trial wave function consisting of a linear combination of CSFs

\[ \psi = \sum_I C_I \Phi_I.\]

In carrying out this minimization process, one varies both the linear {\(C_I\)} expansion coefficients and the LCAO-MO coefficients {\(C_{J,\mu}\)} describing those spin-orbitals that appear in any of the CSFs {\(\Phi_I\)}. This produces two sets of equations that need to be solved:

1. A matrix eigenvalue equation

\[\sum_J \langle\Phi_I|H|\Phi_J\rangle C_J=EC_I\]

of the same form as arises in the CI method, and

2. equations that look very much like the HF equations

\[\sum_\mu \langle\chi_\nu |h_e| \chi_\mu\rangle C_{J,\mu} = \epsilon_J \sum_\mu \langle\chi_\nu|\chi_\mu\rangle C_{J,\mu} \]

but in which the he matrix element is

\[\langle\chi_\nu| h_e| \chi_\mu\rangle = \langle\chi_\nu| -\dfrac{\hbar^2}{2m} \nabla^2 |\chi_\mu\rangle + \langle\chi_\nu| -\frac{Ze^2}{r} |\chi_\mu\rangle\]

\[+ \sum_{\eta,\gamma} \Gamma{\eta,\gamma} [\langle\chi_\nu(r) \chi_\eta(r’) |\frac{e^2}{|r-r’|} | \chi_\mu(r) \chi_\gamma(r’)\rangle
- \langle\chi_\nu(r) \chi_\eta(r’) |\frac{e^2}{|r-r’|} | \chi_\gamma(r) \chi_\mu (r’)\rangle].\]


Here \(\Gamma_{\eta,\gamma}\) replaces the sum \(\sum_K C_{K,\eta} C_{K,\gamma}\) that appears in the HF equations, with \(\Gamma_{\eta,\gamma}\)  depending on both the LCAO-MO coefficients {\(C_{K,\eta}\)} of the spin-orbitals and on the {\(C_I\)} expansion coefficients. These equations are solved through a self-consistent process in which initial {\(C_{K,\eta}\)} coefficients are used to form the  matrix and solve for the {\(C_I\)} coefficients, after which the \(\Gamma_{\eta,\gamma}\) can be determined and the HF-like equations solved for a new set of {\(C_{K,\eta}\)} coefficients, and so on until convergence is reached.

b. Perturbation Theory

This method uses the single-configuration SCF process to determine a set of orbitals {\(\phi_i\)}. Then, with a zeroth-order Hamiltonian equal to the sum of the \(N\) electrons’ Fock operators \(H_0 = \sum_{i=1}^N h_e(i)\), perturbation theory is used to determine the CI amplitudes for the other CSFs. The Møller-Plesset perturbation (MPPT) procedure is a special case in which the above sum of Fock operators is used to define \(H_0\).  The amplitude for the reference CSF is taken as unity and the other CSFs' amplitudes are determined by using \(H-H_0\) as the perturbation. This perturbation is the difference between the true Coulomb interactions among the electrons and the mean-field approximation to those interactions:


\[V=H-H^{(0)}=\frac{1}{2}\sum_{i \ne j\ne 1}^N \frac{1}{r_{i,j}}-\sum_{k=1}^N[J_j(r)-K_k(r)]\]


where \(J_k\) and \(K_k\) are the Coulomb and exchange operators defined earlier in this Chapter and the sum over \(k\) runs over the \(N\) spin-orbitals that are occupied in the Hartree-Fock wave function that forms the zeroth-order approximation to \(\psi\).

In the MPPT method, once the reference CSF is chosen and the SCF orbitals belonging to this CSF are determined, the wave function \(\psi\) and energy \(E\) are determined in an order-by-order manner as is the case in the RSPT discussed in Chapter 3. In fact, MPPT is just RSPT with the above fluctuation potential as the perturbation. The perturbation equations determine what CSFs to include through any particular order. This is one of the primary strengths of this technique; it does not require one to make further choices, in contrast to the CI treatment where one needs to choose which CSFs to include.

For example, the first-order wave function correction \(\psi_1\) is:

\[\psi_1 = - \sum_{i < j,m < n} \dfrac{\langle i,j |\dfrac{1}{r_{12}}| m,n \rangle -\langle i,j |\dfrac{1}{r_{12}}| n,m \rangle}{ \varepsilon_m-\varepsilon_i +\varepsilon_n-\varepsilon_j} | \Phi_{i,j}^{m,n} \rangle,\]

where the SCF orbital energies are denoted \(\varepsilon_k\) and \(\Phi_{i,j}^{m,n}\) represents a CSF that is doubly excited (\(\phi_i\) and \(\phi_j\) are replaced by \(\phi_m\) and \(\phi_n\)) relative to the SCF wave function \(\Phi\). The denominators \([ \varepsilon_m-\varepsilon_i +\varepsilon_n-\varepsilon_j]\) arise from \(E_0-E_k^0\)  because each of these zeroth-order energies is the sum of the orbital energies for all spin-orbitals occupied. The excited CSFs \(\Phi_{i,j}^{m,n}\) are the zeroth-order wave functions other than the reference CSF. Only doubly excited CSFs contribute to the first-order wave function; the fact that the contributions from singly excited configurations vanish in \(\psi_1\) is known at the Brillouin theorem.

The Brillouin theorem can be proven by considering Hamiltonian matrix elements coupling the reference CSF \(F\) to singly-excited CSFs Fim. The rules for evaluating all such matrix elements are called Slater-Condon rules and are given later in this Chapter. If you don’t know them, this would be a good time to go read the subsection on these rules before returning here. From the Slater-Condon rules, we know that the matrix elements in question are given by


\[\langle \Phi|H|\Phi_i^m\rangle= \langle \phi_i(r)| -\frac{1}{2}\nabla^2 - \sum_a \dfrac{Z_a}{|r-R_a|} |\phi_m(r)\rangle  + \sum_{j=1(\ne i,m)}^N \langle \phi_i(r) \phi_j(r')|\dfrac{1-P_{r,r'}}{|r-r'|}| \phi_m(r) \phi_j(r')\rangle.\]


Here, the factor \(P_{r,r’}\) simply permutes the coordinates \(r\) and \(r’\) to generate the exchange integral. The sum of two electron integrals on the right-hand side above can be extended to include the terms arising from \(j =i\) because  vanishes. As a result, the entire right-hand side can be seen to reduce to the matrix element of the Fock operator \(h_{\rm HF}(r)\):


\[\langle \Phi|H|\Phi_i^m\rangle=\langle \phi_i|h_{\rm HF}(r)|\phi_m(r)\rangle=\varepsilon_m\delta_{i,m}=0.\]


The matrix elements vanish because the spin-orbitals are eigenfunctions of \(h_{\rm HF}(r)\) and are orthogonal to each other.

The MPPT energy \(E\) is given through second order as in RSPT by

\[E = E_{SCF} - \sum_{i < j,m < n} \frac{| \langle i,j | \dfrac{1}{r_{12}} | m,n \rangle -\langle i,j | \dfrac{1}{r_{12}} | n,m \rangle |^2}{ \varepsilon_m-\varepsilon_i +\varepsilon_n-\varepsilon_j }\]

and again only contains contributions from the doubly excited CSFs.  Both \(\psi\) and \(E\) are expressed in terms of two-electron integrals \(\langle i,j | \frac{1}{r_{12}} | m,n \rangle\) (that are sometimes denoted \(\langle i,j|k,l\rangle\)) coupling the virtual spin-orbitals \(\phi_m\)  and \(\phi_n\) to the spin-orbitals from which electrons were excited \(\phi_i\) and \(\phi_j\) as well as the orbital energy differences \([ \varepsilon_m-\varepsilon_i +\varepsilon_n-\varepsilon_j ]\) accompanying such  excitations. Clearly, major contributions to the correlation energy are made by double excitations into virtual orbitals \(\phi_m \phi_n\) with large \(\langle i,j | \frac{1}{r_{12}} | m,n \rangle\) integrals and small orbital energy gaps \([\varepsilon_m-\varepsilon_i +\varepsilon_n-\varepsilon_j]\). In higher order corrections, contributions from CSFs that are singly, triply, etc. excited relative to the HF reference function \(F\) appear, and additional contributions from the doubly excited CSFs also enter. The various orders of MPPT are usually denoted MPn (e.g., MP2 means second-order MPPT).

c. The Coupled-Cluster Method

As noted above, when the Hartree-Fock wave function \(\psi_0\) is used as the zeroth-order starting point in a perturbation expansion, the first (and presumably most important) corrections to this function are the doubly-excited determinants. In early studies of CI treatments of electron correlation, it was observed that double excitations had the largest \(C_J\) coefficients (after the SCF wave function, which has the very largest \(C_J\)). Moreover, in CI studies that included single, double, triple, and quadruple level excitations relative to the dominant SCF determinant, it was observed that quadruple excitations had the next largest \(C_J\) amplitudes after the double excitations. And, very importantly, it was observed that the amplitudes \(C_{abcd}^{mnpq}\) of the quadruply excited CSFs \(\Phi_{abcd}^{mnpq}\)​  could be very closely approximated as products of the amplitudes \(C_{ab}^{mn} C_{cd}^{pq}\)​ of the doubly excited CSFs \(\Phi_{ab}^{mn}\) and \(\Phi_{cd}^{pq}\). This observation prompted workers to suggest that a more compact and efficient expansion of the correlated wave function might be realized by writing \(\psi\) as:

\[\psi = \exp(T) \Phi,\]

where \(\Phi\) is the SCF determinant and the operator \(T\) appearing in the exponential is taken to be a sum of operators

\[T = T_1 + T_2 + T_3 + … + T­_N \]

that create single (\(T_1\)), double (\(T_2\)), etc. level excited CSFs when acting on \(\Phi\). As I show below, this so-called coupled-cluster (CC) form for \(\psi\) then has the characteristic that the dominant contributions from quadruple excitations have coefficients nearly equal to the products of the coefficients of their constituent double excitations.

In any practical calculation, this sum of \(T_n\) operators would be truncated to keep the calculation practical. For example, if excitation operators higher than \(T_3\) were neglected, then one would use \(T » T_1 + T_2 + T_3\). However, even when \(T\) is so truncated, the resultant \(\psi\) would contain excitations of higher order.  For example, using the truncation just introduced, we would have

\[y = (1 + T_1 + T_2 + T_3 + \frac{1}{2} (T_1 + T_2 + T_3) (T_1 + T_2 + T_3) + \frac{1}{6} (T_1 + T_2 + T_3) \]

\[(T_1 + T_2 + T_3) (T_1 + T_2 + T_3) + …) \Phi. \]

This function contains single excitations (in \(T_1\Phi\)), double excitations (in \(T_2\Phi\) and in \(T_1T_1\Phi\)), triple excitations (in \(T_3\Phi\), \(T_2T_1\Phi\), \(T_1T_2\Phi\), and \(T_1T_1T_1\Phi\)), and quadruple excitations in a variety of terms including \(T_3 T_1\Phi\) and \(T_2 T_2\Phi\), as well as even higher level excitations. By the design of this wave function, the quandruple excitations \(T_2 T_2\Phi\) will have amplitudes given as products of the amplitudes of the double excitations \(T_2\Phi\) just as were found by earlier CI workers to be most important. Hence, in CC theory, we say that quadruple excitations include unlinked products of double excitations arising from the \(T_2 T_2\) product; the quadruple excitations arising from \(T_4\Phi\) would involve linked terms and would have amplitudes that are not products of double-excitation amplitudes.

After writing \(\psi\) in terms of an exponential operator, one is faced with determining the amplitudes of the various single, double, etc. excitations generated by the \(T\) operator acting on \(\Phi\). This is done by writing the Schrödinger equation as:

\[H \exp(T) \Phi = E \exp(T) \Phi,\]

and then multiplying on the left by \(\exp(-T)\) to obtain:

\[exp(-T) H \exp(T) \Phi = E \Phi.\]

The CC energy is then calculated by multiplying this equation on the left by \(\Phi^*\) and integrating over the coordinates of all the electrons:

\[\langle\Phi| \exp(-T) H \exp(T) \Phi> = E.\]

In practice, the combination of operators appearing in this expression is rewritten and dealt with as follows:

\[E = \langle\Phi| T + [H,T] + \frac{1}{2} [[H,T],T] + \frac{1}{6} [[[H,T],T],T] + \frac{1}{24} [[[[H,T],T],T],T] |\Phi\rangle;\]

this so-called Baker-Campbell-Hausdorf expansion of the exponential operators can be shown truncate exactly after the fourth power term shown here. So, once the various operators and their amplitudes that comprise \(T\) are known, \(E\) is computed using the above expression that involves various powers of the \(T\) operators.

The equations used to find the amplitudes (e.g., those of the \(T_2\) operator \(\sum_{a,b,m,n} t_{ab}^{mn}T_{ab}^{mn}\), where the \(t_{ab}^{mn}\) are the amplitudes and \(T_{ab}^{mn}\) are the excitation operators that promote two electrons from \(\phi_a\) and \(\phi_b\) into \(\phi_m\) and \(\phi_n\)) of the various excitation level are obtained by multiplying the above Schrödinger equation on the left by an excited determinant of that level and integrating. For example, the equation for the double-excitations is:

\[0 = \langle\Phi_{ab}^{mn}| T + [H,T] + \frac{1}{2} [[H,T],T] + \frac{1}{6} [[[H,T],T],T] + \frac{1}{24} [[[[H,T],T],T],T] |\Phi\rangle. \]

The zero arises from the right-hand side of \(\exp(-T) H \exp(T) \Phi = E \Phi\) and the fact that \(\langle\Phi_{ab}^{mn}|\Phi\rangle = 0 \); that is, the determinants are orthonormal. The number of such equations is equal to the number of doubly excited determinants \(\Phi_{ab}^{mn}\), which is equal to the number of unknown \(t_{ab}^{mn}\) amplitudes. So, the above quartic equations must be solved to determine the amplitudes appearing in the various \(T_J\) operators. Then, as noted above, once these amplitudes are known, the energy \(E\) can be computed using the earlier quartic equation. Having to solve many coupled quartic equations is one of the most severe computational challenges of CC theory.

Clearly, the CC method contains additional complexity as a result of the exponential expansion form of the wave function \(\psi\) and the resulting coupled quartic equations that need to be solved to determine the \(t\) amplitudes. However, it is this way of writing \(\psi\) that allows us to automatically build in the fact that products of double excitations are the dominant contributors to quadruple excitations (and \(T_2 T_2 T_2\) is the dominant component of six-fold excitations, not \(T_6\)). In fact, the CC method is today one of the most accurate tools we have for calculating molecular electronic energies and wave functions.

d. The Density Functional Method

These approaches provide alternatives to the conventional tools of quantum chemistry, which move beyond the single-configuration picture by adding to the wave function more configurations (i.e., excited determinants) whose amplitudes they each determine in their own way. As noted earlier, these conventional approaches can lead to a very large number of CSFs in the correlated wave function, and, as a result, a need for extraordinary computer resources.

The density functional approaches are different. Here one solves a set of orbital-level equations

\[ - \frac{\hbar^2}{2m_e} \nabla^2 - \sum_a \frac{Z_ae^2}{|\textbf{r}-\textbf{R}_a|} + \int \rho(\textbf{r}')\frac{e^2}{|\textbf{r}-\textbf{r}'|} + U(r)] \phi_i = \varepsilon_i \phi_i\]

in which the orbitals {\(\phi_i\)} feel potentials due to the nuclear centers (having charges \(Z_a\)), Coulombic interaction with the total electron density \(\rho(\textbf{r}')\), and a so-called exchange-correlation potential denoted \(U(\textbf{r}')\). The particular electronic state for which the calculation is being performed is specified by forming a corresponding density \(\rho(\textbf{r}')\) that, in turn, is often expressed as a sum of squares of occupied orbitals multiplied by orbitial occupation numbers. Before going further in describing how DFT calculations are carried out, let us examine the origins underlying this theory.

The so-called Hohenberg-Kohn theorem states that the ground-state electron density \(\rho(\textbf{r})\) of the atom or molecule or ion of interest uniquely determines the potential \(V(\textbf{r})\) in the molecule’s electronic Hamiltonian (i.e., the positions and charges of the system’s nuclei)

\[H = \sum_j {-\frac{\hbar^2}{2m_e} \nabla_j^2 + V(r_j) + \frac{e^2}{2} \sum_{k\ne j} \frac{1}{r_{j,k}} },\]

and, because H  determines all of the energies and wave functions of the system, the ground-state density \(\rho(\textbf{r})\) therefore determines all properties of the system.

One proof of this theorem proceeds as follows:

  1. \(\rho(\textbf{r})\) determines the number of electrons \(N\) because \(\int \rho(\textbf{r}) d^3r = N\).
  2. Assume that there are two distinct potentials (aside from an additive constant that simply shifts the zero of total energy)  \(V(\textbf{r})\)  and  \(V'(\textbf{r})\)  which, when used in \(H\) and \(H’\), respectively, to solve for a ground state produce \(E_0\), \(\psi (r)\) and \(E_0’\), \(\psi'(r)\) that have the same one-electron density: \(\int |\psi|^2 dr_2 dr_3 ... dr_N = \rho(\textbf{r})=  \int   |\psi'|^2 dr_2 dr_3 ... dr_N \).
  3. If we think of \(\psi'\) as trial variational wave function for the Hamiltonian \(H\), we know that  \(E_0  < \langle \psi'|H|\psi'\rangle = \langle \psi'|H’|\psi'\rangle + \int \rho(\textbf{r}) [V(\textbf{r}) - V’(\textbf{r})] d^3r = E_0’ + \int \rho(\textbf{r}) [V(\textbf{r}) - V’(\textbf{r})] d^3r\).
  4. Similarly, taking \(\psi\) as a trial function for the \(H’\) Hamiltonian, one finds that \(E_0’  < E_0 + \int \rho(\textbf{r}) [V’(\textbf{r}) - V(\textbf{r})] d^3r\).
  5. Adding the equations in c and d gives \[E_0 + E_0’ < E_0 + E_0’,\] a clear contradiction unless the electronic state of interest is degenerate.


Hence, there cannot be two distinct potentials \(V\) and \(V’\) that give the same non-degenerate ground-state \(\rho(\textbf{r})\). So, the ground-state density \(\rho(\textbf{r})\) uniquely determines \(N\) and \(V\), and thus H. Furthermore, because the eigenfunctions of \(H\) determine all properties of the ground state, then \(\rho(\textbf{r})\), in principle, determines all such properties. This means that even the kinetic energy and the electron-electron interaction energy of the ground-state are determined by \(\rho(\textbf{r})\). It is easy to see that \(\int \rho(\textbf{r}) V(r) d^3r = V[\rho]\) gives the average value of the electron-nuclear (plus any additional one-electron additive potential) interaction in terms of the ground-state density \(\rho(\textbf{r})\). However, how are the kinetic energy \(T[\rho]\) and the electron-electron interaction \(V_{ee}[\rho]\) energy expressed in terms of r?

There is another point of view that I find sheds even more light on why it makes sense that the ground-state electron density \(\rho(\textbf{r})\) contains all the information needed to determine all properties. It was shown many years ago, by examining the mathematical character of the Schrödinger equation, that the ground-state wave function \(\psi_0(r)\) has certain so-called cusps in the neighborhoods of the nuclear centers \(R_a\). In particular \(\psi_0(r)\) must obey

\[\frac{\partial \psi_0(r_1,r_2,\cdots,r_N)}{\partial r_k}=-\frac{m_eZ_ae^2}{\hbar^2}\psi_0(r_1,r_2,\cdots,r_N)\text{ as }\textbf{r}_k \rightarrow \textbf{R}_a\]

That is, the derivative or slope of the natural logarithm of the true ground-state wave function must be  as any of the electrons’ positions approach the nucleus of charge \(Z_a\) residing at position \(R_a\). Because the ground-state electron density can be expressed in terms of the ground-state wave function as

it can be shown that the ground-state density also displays cusps at the nuclear centers  as \(r \rightarrow R_a\).

where me is the electron mass and e is the unit of charge. So, imagine that you knew the true ground-state density at all points in space. You could integrate the density over all space

to determine how many electrons the system has. Then, you could explore over all space to find points at which the density had sharp points characterized by non-zero derivatives in the natural logarithm of the density. The positions \(R_a\) of such points specify the nuclear centers, and by measuring the slopes in \(\ln(\rho(\textbf{r}))\) at each location, one could determine the charges of these nuclei through

\[{\rm slope}=\left(\dfrac{\partial\ln(\rho(r))}{dr}\right)_{r\rightarrow R_a}=-2\frac{m_eZ_ae^2}{\hbar^2}\]

This demonstrates why the ground-state density is all one needs to fully determine the locations and charges of the nuclei as well as the number of electrons and thus the entire Hamiltonian \(H\).

The main difficulty with DFT is that the Hohenberg-Kohn theorem shows the values of \(T\), \(V_{ee}\) , \(V\), etc. are all unique functionals of the ground-state \(\rho\) (i.e., that they can, in principle, be determined once \(\rho\) is given), but it does not tell us what these functional relations are.

To see how it might make sense that a property such as the kinetic energy, whose operator \(-\hbar^2 /2m_e \nabla^2\) involves derivatives, can be related to the electron density, consider a simple system of \(N\) non-interacting electrons moving in a three-dimensional cubic box potential. The energy states of such electrons are known to be

\[E  = \frac{\hbar^2}{8m_eL^2} (n_x^2 + n_y^2 +n_z^2 ),\]

where \(L\) is the length of the box along the three axes, and \(n_x\) , \(n_y\) , and \(n_z\)  are the quantum numbers describing the state. We can view \(n­_x^2 + n_y^2 +n_z^2 = R^2\)  as defining the squared radius of a sphere in three dimensions, and we realize that the density of quantum states in this space is one state per unit volume in the \(n_x\), \(n_y\), \(n_z\) space. Because \(n_x\), \(n_y\), and \(n_z\) must be positive integers, the volume covering all states with energy less than or equal to a specified energy \(E = (h^2/8m_eL^2) R^2\)  is 1/8 the volume of the sphere of radius \(R\):

\[\Phi(E) = \frac{1}{8} \frac{4\pi}{3} R^3 = \frac{\pi}{6} \left(\frac{8m_eL^2E}{\hbar^2}\right)^{3/2} \]

Since there is one state per unit of such volume, \(\Phi(E)\) is also the number of states with energy less than or equal to \(E\), and is called the integrated density of states. The number of states \(g(E) dE\) with energy between \(E\) and \(E+dE\), the density of states, is the derivative of \(\Phi\):

\[g(E) = \frac{d\Phi}{dE} = \frac{\pi}{4} \left(\frac{8m_eL^2}{\hbar^2}\right)^{3/2} \sqrt{E}  .\]

If we calculate the total energy for these non-interacting \(N\) electrons that doubly occupy all states having energies up to the so-called Fermi energy (i.e., the energy of the highest occupied molecular orbital HOMO), we obtain the ground-state energy:

 \[E_0=2\int_0^{E_F} g(E)EdE = \frac{8\pi}{5} \left(\frac{2m_e}{\hbar^2}\right)^{3/2} L^3 E\Phi^{5/2}.\]

The total number of electrons \(N\) can be expressed as

\[N =  2\int_0^{E_F} g(E)dE = \frac{8\pi}{3} \left(\frac{2m_e}{\hbar^2}\right)^{3/2} L^3 E\Phi^{3/2},\]

which can be solved for \(E\Phi\) in terms of \(N\) to then express \(E_0\)  in terms of \(N\) instead of in terms of \(E\Phi\):

\[E_0 = \frac{3\hbar^2}{10m_e} \left(\frac{3}{8\pi}\right)^{2/3} L^3 \left(\frac{N}{L^3}\right)^{5/3} .\]

This gives the total energy, which is also the kinetic energy in this case because the potential energy is zero within the box and because the electrons are assumed to have no interactions among themselves, in terms of the electron density \(\rho (x,y,z) = \dfrac{N}{L^3}\). It therefore may be plausible to express kinetic energies in terms of electron densities \(\rho(\textbf{r})\), but it is still by no means clear how to do so for real atoms and molecules with electron-nuclear and electron-electron interactions operative.

In one of the earliest DFT models, the Thomas-Fermi theory, the kinetic energy of an atom or molecule is approximated using the above kind of treatment on a local level. That is, for each volume element in \(\textbf{r}\) space, one assumes the expression given above to be valid, and then one integrates over all \(\textbf{r}\) to compute the total kinetic energy:

\[ T_{\rm TF}[\rho] = \int \frac{3\hbar^2}{10m_e} \left(\frac{3}{8\pi}\right)^{2/3}  [\rho(\textbf{r})]^{5/3} d^3r = C_F \int [\rho(\textbf{r})]^{5/3} d^3r ,\]

where the last equality simply defines the \(C_F\) constant. Ignoring the correlation and exchange contributions to the total energy, this \(T\) is combined with the electron-nuclear \(V\) and Coulombic electron-electron potential energies to give the Thomas-Fermi total energy:

\[E_{\rm 0,TF} [\rho] = C_F \int [\rho(\textbf{r})]^{5/3} d^3r +  \int V(r) \rho(\textbf{r}) d^3r + e^2/2  \int \frac{\rho(\textbf{r}) \rho(\textbf{r}’)}{|r-r’|}  d^3r d^3r’,\]

This expression is an example of how \(E_0\) is given as a local density functional approximation (LDA). The term local means that the energy is given as a functional (i.e., a function of \(\rho\)) which depends only on \(\rho(\textbf{r})\) at points in space but not on \(\rho(\textbf{r})\) at more than one point in space or on spatial derivatives of \(\rho(\textbf{r})\).

Unfortunately, the Thomas-Fermi energy functional does not produce results that are of sufficiently high accuracy to be of great use in chemistry. What is missing in this theory are the exchange energy and the electronic correlation energy.  Moreover, the kinetic energy is treated only in the approximate manner described earlier (i.e., for non-interacting electrons within a spatially uniform potential). 

Dirac was able to address the exchange energy for the uniform electron gas (\(N\) Coulomb interacting electrons moving in a uniform positive background charge whose magnitude balances the total charge of the \(N\) electrons). If the exact expression for the exchange energy of the uniform electron gas is applied on a local level, one obtains the commonly used Dirac local density approximation to the exchange energy:

\[E_{\rm ex,Dirac}[\rho] = - C_x \int [\rho(\textbf{r})]^{4/3} d^3r,\]

with \(C_x = (3/4) (3/\pi)^{1/3}\). Adding this exchange energy to the Thomas-Fermi total energy \(E_{\rm 0,TF} [\rho]\) gives the so-called Thomas-Fermi-Dirac (TFD) energy functional.

Because electron densities vary rather strongly spatially near the nuclei, corrections to the above approximations to \(T[\rho]\) and \(E_{\rm ex,Dirac}\)  are needed. One of the more commonly used so-called gradient-corrected approximations is that invented by Becke, and referred to as the Becke88 exchange functional:

\[E_{\rm ex}({\rm Becke88}) = E_{\rm ex,Dirac}[\rho] -\gamma \int \frac{x^2 r^{4/3}}{1+6 \gamma x \sinh^{-1}(x)} dr,\]

where \(x =r^{-4/3} |\nabla\rho|\), and \(\gamma\) is a parameter chosen so that the above exchange energy can best reproduce the known exchange energies of specific electronic states of the inert gas atoms (Becke finds \(\gamma\) to equal 0.0042). A common gradient correction to the earlier local kinetic energy functional \(T[\rho]\) is called the Weizsacker correction and is given by

\[\delta{T_{\rm Weizsacker}} = \frac{1}{72} \frac{\hbar}{m_e}  \int \frac{ | \nabla \rho(\textbf{r})|^2}{\rho(\textbf{r})} dr.\]

Although the above discussion suggests how one might compute the ground-state energy once the ground-state density \(\rho(\textbf{r})\) is given, one still needs to know how to obtain \(\rho\). Kohn and Sham (KS) introduced a set of so-called KS orbitals obeying the following equation:

\[{-\dfrac{\hbar^2}{2m} \nabla^2 + V(r)  + e^2  \int \frac{\rho(\textbf{r}’)}{|r-r’|}   dr’ + U_{\rm xc}(r) }\phi_J = \varepsilon_j \phi_j ,\]

where the so-called exchange-correlation potential \(U­_{xc} (r) = dE_{\rm xc}[\rho]/d\rho(\textbf{r})\) could be obtained by functional differentiation if the exchange-correlation energy functional \(E_{\rm xc}[\rho]\) were known. KS also showed that the KS orbitals {\(\phi_J\)} could be used to compute the density \(\rho\) by simply adding up the orbital densities multiplied by orbital occupancies \(n_j\):

\[\rho(\textbf{r}) = \sum_j n_j |\phi_J(r)|^2\]

(here \(n_j =0,1,\) or 2 is the occupation number of the orbital \(\phi_J\) in the state being studied) and that the kinetic energy should be calculated as

\[T = \sum_j n_j \langle \phi_J(r)| -\dfrac{\hbar^2}{2m} \nabla^2 |\phi_J(r)\rangle \]

The same investigations of the idealized uniform electron gas that identified the Dirac exchange functional found that the correlation energy (per electron) could also be written exactly as a function of the electron density \(\rho\) of the system for this model system, but only in two limiting cases- the high-density limit (large \(\rho\)) and the low-density limit. There still exists no exact expression for the correlation energy even for the uniform electron gas that is valid at arbitrary values of \(\rho\). Therefore, much work has been devoted to creating efficient and accurate interpolation formulas connecting the low- and high- density uniform electron gas. One such expression is

\[E_C[\rho] = \int \rho(\textbf{r}) \varepsilon_c(r) dr,\]


\[\varepsilon_c(r) = \dfrac{A}{2}\ln\Big(\dfrac{x}{X}\Big) + \dfrac{2b}{Q} \tan^{-1}\dfrac{Q}{2x+b} -\dfrac{bx_0}{X_0} [\ln\Big(\dfrac{(x-x_0)^2}{X}\Big) +\dfrac{2(b+2x_0)}{Q} \tan^{-1}\dfrac{Q}{2x+b}\]

is the correlation energy per electron. Here \(x = \sqrt{r_s}\) , \(X=x^2 +bx+c\), \(X_0 =x_0^2 +bx_0+c\) and \(Q=\sqrt{4c - b^2}\), \(A = 0.0621814\),  \(x_0= -0.409286\), \(b = 13.0720\), and \(c = 42.7198\). The parameter \(r_s\) is how the density \(\rho\) enters since \(4/3 \pi r_s^3\) is equal to \(1/\rho\); that is, \(r_s\) is the radius of a sphere whose volume is the effective volume occupied by one electron.

A reasonable approximation to the full \(E_{\rm xc}[\rho]\) would contain the Dirac (and perhaps gradient corrected) exchange functional plus the above \(E_C[\rho]\), but there are many alternative approximations to the exchange-correlation energy functional. Currently, many workers are doing their best to cook up functionals for the correlation and exchange energies, but no one has yet invented functionals that are so reliable that most workers agree to use them.

To summarize, in implementing any DFT, one usually proceeds as follows:

  1. An atomic orbital basis is chosen in terms of which the KS orbitals are to be expanded. Most commonly, this is a Gaussian basis or a plane-wave basis.
  2. Some initial guess is made for the LCAO-KS expansion coefficients \(C_{j,a}: \phi_J = \sum_a C_{j,a} \chi_a\) of the occupied KS orbitals.
  3. The density is computed as \[\rho(\textbf{r}) = \sum_j n_j |\phi_J(r)|^2\] . Often, \(\rho(\textbf{r})\) itself is expanded in an atomic orbital basis, which need not be the same as the basis used for the \(\phi_J\), and the expansion coefficients of \(\rho\) are computed in terms of those of the this new basis. It is also common to use an atomic orbital basis to expand \(\rho^{1/3}(r)\), which, together with \(\rho\), is needed to evaluate the exchange-correlation functional’s contribution to \(E_0\).
  4. The current iteration’s density is used in the KS equations to determine the Hamiltonian  \[{-\dfrac{\hbar^2}{2m} \nabla^2 + V(r)  + e^2  \int \frac{\rho(\textbf{r}’)}{|r-r’|}   dr’ + U_{\rm xc}(r) }\] whose new eigenfunctions {\(\phi_J\)} and eigenvalues {\(\epsilon_J\)} are found by solving the KS equations.
  5. These new \(\phi_J\)  are used to compute a new density, which, in turn, is used to solve a new set of KS equations. This process is continued until convergence is reached (i.e., until the \(\phi_J\) used to determine the current iteration’s \(\rho\) are the same \(\phi_J\) that arise as solutions on the next iteration.
  6.  Once the converged \(\rho(\textbf{r})\) is determined, the energy can be computed using the earlier expression

\[E [\rho] = \sum_j n_j \langle \phi_J(r)| -\dfrac{\hbar^2}{2m} \nabla^2|\phi_J(r)\rangle +  \int V(r) \rho(\textbf{r}) dr + \frac{e^2}{2} \int \frac{\rho(\textbf{r})\rho(\textbf{r}’)}{|r-r’|}dr dr’+ E_{\rm xc}[\rho].\]

e. Energy Difference Methods

In addition to the methods discussed above for treating the energies and wave functions as solutions to the electronic Schrödinger equation, there exists a family of tools that allow one to compute energy differences directly rather than by finding the energies of pairs of states and subsequently subtracting them. Various energy differences can be so computed: differences between two electronic states of the same molecule (i.e., electronic excitation energies \(\Delta E\)), differences between energy states of a molecule and the cation or anion formed by removing or adding an electron (i.e., ionization potentials (IPs) and electron affinities (EAs)). In the early 1970s, the author developed one such tool for computing EAs (J. Simons, and W. D. Smith, Theory of Electron Affinities of Small Molecules, J. Chem. Phys., 58, 4899-4907 (1973)) and he called this the equations of motion (EOM) method. Throughout much of the 1970s and 1980s, his group advanced and applied this tool to their studies of molecular EAs and electron-molecule interactions.

Because of space limitations, we will not be able to elaborate much in great detail on these methods. However, it is important to stress that:

  1. These so-called EOM or Greens function or propagator methods utilize essentially the same input information (e.g., atomic orbital basis sets) and perform many of the same computational steps (e.g., evaluation of one- and two- electron integrals, formation of a set of mean-field molecular orbitals, transformation of integrals to the MO basis, etc.) as do the other techniques discussed earlier.
  2. These methods are now rather routinely used when \(\Delta E\), IP, or EA information is sought.

The basic ideas underlying most if not all of the energy-difference methods are:

  1. One forms a reference wave function \(\psi\) (this can be of the SCF, MPn, CI, CC, DFT, etc. variety); the energy differences are computed relative to the energy of this function.
  2. One expresses the final-state wave function \(\psi’\) (i.e., that describing the excited, cation, or anion state) in terms of an operator \(\Omega\) acting on the reference \(\psi\): \(\psi’ = \Omega \psi\). Clearly, the \(\Omega\) operator must be one that removes or adds an electron when one is attempting to compute IPs or EAs, respectively.
  3. One writes equations which \(\psi\) and \(\psi’\) are expected to obey. For example, in the early development of these methods, the Schrödinger equation itself was assumed to be obeyed, so \(H\psi = E \psi \) and \(H\psi' = E’ \psi’\) are the two equations.
  4. One combines \(\Omega\psi = \psi’\) with the equations that \(\psi\) and \(\psi’\) obey to obtain an equation that \(\Omega\) must obey. In the above example, one (a) uses \(\Omega\psi = \psi’\) in the Schrödinger equation for \(\psi’\), (b) allows \(\Omega\) to act from the left on the Schrödinger equation for \(\psi\), and (c) subtracts the resulting two equations to achieve \((H\Omega - \Omega H) \psi = (E’ - E) \Omega \psi \), or, in commutator form \([H,\Omega] \psi = \Delta E \Omega \psi\).
  5. One can, for example, express \(\psi\) in terms of a superposition of configurations \(\psi = \sum_J C_J \phi_J\)  whose amplitudes \(C_J\)  have been determined from a CI or MPn calculation and express \(\Omega\) in terms of operators {\(O_K\)} that cause single-, double-, etc. level excitations (for the IP (EA) cases, \(\Omega\) is given in terms of operators that remove (add), remove and singly excite (add and singly excite, etc.) electrons): \(\Omega = \sum_K D_K O_K\) .
  6. Substituting the expansions for \(\psi\) and for \(\Omega\) into the equation of motion (EOM)


\([H,\Omega] \psi = \Delta E \Omega \psi\), and then projecting the resulting equation on the left against a set of functions (e.g., {\(O_{K’} |\psi>\)}) gives a matrix eigenvalue-eigenvector equation

\[\sum_K \langle O_{K’}\psi| [H,O_K] \psi \rangle D_K  = \Delta E \sum_K \langle O_{K’}\psi|O_K\psi\rangle D_K\]

to be solved for the \(D_K\) operator coefficients and the excitation (or IP or EA) energies \(\Delta E\). Such are the working equations of the EOM (or Greens function or propagator) methods.

In recent years, these methods have been greatly expanded and have reached a degree of reliability where they now offer some of the most accurate tools for studying excited and ionized states. In particular, the use of time dependent variational principles have allowed a much more rigorous development of equations for energy differences and non-linear response properties. In addition, the extension of the EOM theory to include coupled-cluster reference functions now allows one to compute excitation and ionization energies using some of the most accurate ab initio tools.

f. The Slater-Condon Rules

To form Hamiltonian matrix elements \(H_{K,L}\) between any pair of Slater determinants constructed from spin-orbitals that are orthonormal, one uses the so-called Slater-Condon rules. These rules express all non-vanishing matrix elements involving either one- or two- electron operators. One-electron operators are additive and appear as

\[F = \sum_i \phi(i);\]

two-electron operators are pairwise additive and appear as

\[G = \sum_{i< j}g(i,j) = \frac{1}{2} \sum_{i \ne j} g(i,j).\]

The Slater-Condon rules give the matrix elements between two determinants

\[| \rangle = |\phi_1\phi_2\phi_3...   \phi_N|\]


\[| '\rangle = |\phi'_1\phi'_2\phi'_3...\phi'_N|\]

for any quantum mechanical operator that is a sum of one- and two- electron operators (\(F + G\)). It expresses these matrix elements in terms of one-and two-electron integrals involving the spin-orbitals that appear in \(| \rangle\) and \(| '\rangle\) and the operators \(f\) and \(g\).

As a first step in applying these rules, one must examine \(| \rangle\) and \(| '\rangle\) and determine by how many (if any) spin-orbitals \(| \rangle\) and \(| '\rangle\) differ.  In so doing, one may have to reorder the spin-orbitals in one of the determinants to achieve maximal coincidence with those in the other determinant; it is essential to keep track of the number of permutations ( \(N_p\)) that one makes in achieving maximal coincidence. The results of the Slater-Condon rules given below are then multiplied by \((-1)^{N_p}\) to obtain the matrix elements between the original \(| \rangle\)  and \(| '\rangle\). The final result does not depend on whether one chooses to permute \(| \rangle\) or \(| '\rangle\) to determine \(N_p\).

The Hamiltonian is, of course, a specific example of such an operator that contains both one- and two-electron components; the electric dipole operator \(\sum_i e\textbf{r}_i\) and the electronic kinetic energy \(- \frac{\hbar^2}{2m_e}\sum_i\nabla_i^2\) are examples of one-electron operators (for which one takes \(g = 0\)); the electron-electron coulomb interaction \(\sum_{i<j} e^2/r_{ij}\)  is a two-electron operator (for which one takes \(f = 0\)).

The two Slater determinants whose matrix elements are to be determined can be written as

\[| \rangle = \frac{1}{\sqrt{N!}} \sum_{P=1}^{N!} (-1)^p P \phi_1(1)\phi_2(2)\cdots\phi_k(k)\cdots\phi_n(n)\cdots \phi_N(N)\]

\[| '\rangle = \frac{1}{\sqrt{N!}} \sum_{P=1}^{N!} (-1)^q Q \phi'_1(1)\phi'_2(2)\cdots\phi'_k(k)\cdots\phi'_n(n)\cdots\phi'_N(N)\]

where the spin-orbitals {\(\phi_j\)} and {\(\phi’_j\)} appear in the first and second determinants, respectively, and the operators \(P\) and \(Q\) describe the permutations of the spin-orbitals appearing in these two determinants. The factors \((-1)^p\) and \((-1)^q\) are the signs associated with these permutations as discussed earlier in Section 6.1.1. Any matrix element involving one- and two-electron operators

\[\langle |F+G|'\rangle =\frac{1}{\sqrt{N!}} \sum_{P,Q} (-1)^{p+q} \\\langle P \phi_1(1)\phi_2(2)\cdots\phi_k(k)\cdots\phi_n(n)\cdots \phi_N(N)|F+G|Q \phi'_1(1)\phi'_2(2)\cdots\phi'_k(k)\cdots\phi'_n(n)\cdots\phi'_N(N)\rangle \]

needs to be expressed in terms of integrals involving the spin-orbitals in the two determinants and the one- and two-electron operators.

To simplify the above expression, which contains \((N!)^2\) terms in its two summations, one proceeds as follows:

a. Use is made of the identity \(\langle P\psi |\psi’\rangle = \langle y|P\psi’\rangle\) to move the permutation operator \(P\) to just before the (\(F+G\))

\[\langle P \phi_1(1)\phi_2(2)\cdots\phi_k(k)\cdots\phi_n(n)\cdots \phi_N(N)| F+G |Q \phi'_1(1)\phi'_2(2)\cdots\phi'_k(k)\cdots\phi'_n(n)\cdots\phi'_N(N)\rangle \\
=\langle \phi_1(1)\phi_2(2)\cdots\phi_k(k)\cdots\phi_n(n)\cdots \phi_N(N)| P(F+G) |Q \phi'_1(1)\phi'_2(2)\cdots\phi'_k(k)\cdots\phi'_n(n)\cdots\phi'_N(N)\rangle

b. Because \(F\) and \(G\) contain sums over all \(N\) electrons in a symmetric fashion, any permutation \(P\) acting on \(F+G\) leaves these sums unchanged. So, \(P\) commutes with \(F\) and with \(G\). This allows the above quantity to be rewritten as

\[\langle \phi_1(1)\phi_2(2)\cdots\phi_k(k)\cdots\phi_n(n)\cdots \phi_N(N)| F+G |PQ \phi'_1(1)\phi'_2(2)\cdots\phi'_k(k)\cdots\phi'_n(n)\cdots\phi'_N(N)\rangle \]

c. For any permutation operator \(Q\), the operator \(PQ\) is just another permutation operator. Moreover, for any \(Q\), the set of all operators \(PQ\) runs over all \(N!\) permutations, and the sign associated with the operator \(PQ\) is the sign belonging to \(P\) times the sign associated with \(Q\), \((-1)^{p+q}\). So, the double sum (i.e., over \(P\) and over \(Q\)) appearing in the above expression for the general matrix element of \(F+G\) contains \(N!\) identical sums over the single operator \(PQ\) of the sign of this operator \((-1)^{p+q}\) multiplied by the effect of this operator on the spin-orbital product on the right-hand side

\[ \langle |F+G|'\rangle =\frac{1}{\sqrt{N!}}N!\\
\sum_{P,Q} (-1)^{p+q} \langle \phi_1(1)\phi_2(2)\cdots\phi_k(k)\cdots\phi_n(n)\cdots \phi_N(N)| F+G |PQ \phi'_1(1)\phi'_2(2)\cdots\phi'_k(k)\cdots\phi'_n(n)\cdots\phi'_N(N)\rangle \]

By assumption, as explained earlier, the two Slater determinants have been compared and arranged in an order of maximal coincidence and the factor \((-1)^{N_p}\) needed to bring them into maximal coincidence has been determined. So, let us begin by assuming that the two determinants differ by three spin-orbitals and let us first consider the terms arising from the identity permutation \(PQ = E\) (i.e., the permutation that alters none of the spin-orbitals’ labels). These terms will involve integrals of the form

\[\langle \phi_1(1)\phi_2(2)\cdots\phi_k(k)\cdots\phi_n(n)\cdots\phi_j(j)\cdots\phi_N(N)| F+G |PQ \phi'_1(1)\phi'_2(2)\cdots\phi'_k(k)\cdots\phi'_n(n)\cdots\phi'_j(j)\cdots​\phi'_N(N)\rangle\]


where the three-spin orbitals that differ in the two determinants appear in positions \(k\), \(n\), and \(j\). In these \(4N\)-dimensional (3 spatial and 1 spin coordinate for each of \(N\) electrons) integrals:

a. Integrals of the form (for all \(i\ne k\), \(n\), or \(j\))

\[\langle \phi_1(1)\phi_2(2)\cdots\phi_k(k)\cdots\phi_n(n)\cdots\phi_j(j)\cdots\phi_N(N)| f(i) | \phi'_1(1)\phi'_2(2)\cdots\phi'_k(k)\cdots\phi'_n(n)\cdots\phi'_j(j)\cdots​\phi'_N(N)\rangle\]


and (for all i and \(l \ne k\), \(n\), or \(j\))

\[\langle \phi_1(1)\phi_2(2)\cdots\phi_k(k)\cdots\phi_n(n)\cdots\phi_j(j)\cdots\phi_N(N)| g(i,l) | \phi'_1(1)\phi'_2(2)\cdots\phi'_k(k)\cdots\phi'_n(n)\cdots\phi'_j(j)\cdots​\phi'_N(N)\rangle\]



vanish because the spin-orbitals appearing in positions \(k\), \(n\), and \(j\) in the two determinants are orthogonal to one another. For the \(F\)-operator, even integrals with \(i = k\), \(n\), or \(j\) vanish because there are still two spin-orbital mismatches at the other two locations among \(k\), \(n\), and \(j\). For the \(G\)-operator, even integrals with \(i\) or \(l = k\), \(n\), or \(j\) vanish because two mismatches remain; and even with both \(i\) and \(l = k\), \(n\), or \(j\), the integrals vanish because one spin-orbital mismatch remains. The main observation to make is that, even for \(PQ = E\), if there are three spin-orbital differences, neither the \(F\) nor \(G\) operator gives rise to any non-vanishing results.

b. If we now consider any other permutation \(PQ\), the situation does not improve because any permutation cannot alter the fact that three spin-orbital mismatches do not generate any non-vanishing results.

If there are only two spin-orbital mismatches (say in locations \(k\) and \(n\)), the integrals we need to evaluate are of the form

\[\langle \phi_1(1)\phi_2(2)\cdots\phi_k(k)\cdots\phi_n(n)\cdots\phi_N(N)| f(i) |PQ \phi'_1(1)\phi'_2(2)\cdots\phi'_k(k)\cdots\phi'_n(n)\cdots\phi'_N(N)\rangle\]



\[\langle \phi_1(1)\phi_2(2)\cdots\phi_k(k)\cdots\phi_n(n)\cdots\phi_N(N)| g(i,l) |PQ \phi'_1(1)\phi'_2(2)\cdots\phi'_k(k)\cdots\phi'_n(n)\cdots\phi'_N(N)\rangle\]

c. Again, beginning with \(PQ = E\), we can conclude that all of the integrals involving the \(F\)-operator (i.e., \(\phi(i)\), \(\phi(k)\), and \(\phi(n)\)) vanish because the two spin-orbital mismatch is too much even for \(\phi(k)\) or \(\phi(n)\) to overcome; at least one spin-orbital orthogonality integral remains. For the \(G\)-operator, the only non-vanishing result arises from the \(i = k\) and \(l = n\) term \(\langle \phi_k(k)\phi_n(n)| g(k,n) | \phi'_k(k)\phi'_n(n)\rangle\).

d. The only other permutation that generates another non-vanishing result is the permutation that interchanges \(k\) and \(n\), and it produces \(-\langle \phi_k(k)\phi_n(n)| g(k,n) | \phi'_n(k)\phi'_k(n)\rangle\)

, where the negative sign arises from the \((-1)^{p+q}\)  factor. All other permutations would interchange other spin-orbitals and thus generate orthogonality integrals involving other electrons’ coordinates.

If there is only one spin-orbital mismatch (say in location \(k\)), the integrals we need to evaluate are of the form

\[\langle \phi_1(1)\phi_2(2)\cdots\phi_k(k)\cdots\phi_N(N)| f(i) |PQ \phi'_1(1)\phi'_2(2)\cdots\phi'_k(k)\cdots\phi'_N(N)\rangle\]


\[\langle \phi_1(1)\phi_2(2)\cdots\phi_k(k)\cdots\phi_N(N)| g(i,l) |PQ \phi'_1(1)\phi'_2(2)\cdots\phi'_k(k)\cdots\phi'_N(N)\rangle.\]


e. Again beginning with \(PQ = E\), the only non-vanishing contribution from the \(F\)-operator is \(\langle \phi_k(k)|f(k)|\phi'_k(k) \rangle\). For all other permutations, the \(F\)-operator produces no non-vanishing contributions because these permutations generate orthogonality integrals. For the \(G\)-operator and \(PQ = E\), the only non-vanishing contributions are

\[\langle \phi_k(k)\phi_j(j)| g(k,j) | \phi'_k(k)\phi_j(j)\rangle\]


where the sum over \(j\) runs over all of the spin-orbitals that are common to both of the two determinants.

f. Among all other permutations, the only one that produces a non-vanishing result are those that permute the spin-orbital in the kth location with another spin-orbital, and they produce


\[-\langle \phi_k(k)\phi_j(j)| g(k,j) | \phi'_j(k)\phi_k(j)\rangle.\]


The minus sign arises from the \((-1)^{p+q}\) factor associated with this pair wise permutation operator.

            Finally, if there is no mismatch (i.e., the two determinants are identical), then

g. The identity permutation generates

\[-\langle \phi_k(k)| f(k) | \phi_k(k)\rangle.\]

from the \(F\)-operator and

\[\frac{1}{2}\sum_{j \ne k=1}^N \langle \phi_j(j)\phi_k(k)| g(k,j) | \phi_j(j)\phi_k(k)\rangle\]


from the \(G\)-operator.

h. The permutation that interchanges spin-orbitals in the kth  and jth location produces


\[-\frac{1}{2}\sum_{j \ne k=1}^N \langle \phi_j(j)\phi_k(k)| g(k,j) | \phi_k(j)\phi_j(k)\rangle .\]


The summations over \(j\) and \(k\) appearing above can, alternatively, be written as

\[\sum_{j < k=1}^N \langle \phi_j(j)\phi_k(k)| g(k,j) | \phi_j(j)\phi_k(k)\rangle\]



\[-\sum_{j < k=1}^N \langle \phi_j(j)\phi_k(k)| g(k,j) | \phi_k(j)\phi_j(k)\rangle .\]


So, in summary, once maximal coincidence has been achieved, the Slater-Condon (SC) rules provide the following prescriptions for evaluating the matrix elements of any operator \(F+G\) containing a one-electron part \(F = \sum_i \phi(i)\) and a two-electron part \(G = \sum_{i< j}g(i,j)\).:

(i) If \(| \rangle\) and \(| '\rangle\) are identical, then

\[\langle | F+G | \rangle = \sum_i \langle \phi_i| f | \phi_i\rangle +\sum_{i\rangle j} [\langle \phi_i \phi_j | g | \phi_i \phi_j \rangle - \langle \phi_i \phi_j | g | \phi_j \phi_i​ \rangle ],\]

where the sums over \(i\) and \(j\) run over all spin-orbitals in \(| \rangle\) ;

(ii) If \(| \rangle\) and \(| '\rangle\) differ by a single spin-orbital mismatch ( \(\phi_p \ne \phi'_p\) ),

\[\langle | F+G | '\rangle = (-1)^{N_p} {\langle \phi_p | f | \phi'_p \rangle +\sum_j [\langle \phi_p\phi_j | g | \phi'_p\phi_j \rangle - \langle \phi_p\phi_j | g | \phi_j\phi'_p \rangle ]},\]

where the sum over \(j\) runs over all spin-orbitals in \(| \rangle\) except \(\phi_p\);

(iii) If \(| \rangle\) and \(| '\rangle\) differ by two spin-orbitals ( \(\phi_p \ne \phi'_p\) and \(\phi_q \ne \phi'_q\)),

\[\langle | F+G | '\rangle = (-1)^{N_p} {\langle \phi_p \phi_q | g | \phi'_p \phi'_q \rangle - \langle \phi_p \phi_q | g | \phi'_q \phi'_p \rangle }\]

(note that the \(F\) contribution vanishes in this case);

(iv) If \(| \rangle\) and \(| '\rangle\) differ by three or more spin orbitals, then

\[\langle | F+G | '\rangle = 0;\]

(v) \(\Phi\) or the identity operator \(I\), the matrix elements \(\langle | I | '\rangle = 0\) if \(| \rangle\) and \(| '\rangle\) differ by one or more spin-orbitals (i.e., the Slater determinants are orthonormal if their spin-orbitals are).

In these expressions,

\[\langle \phi_i| f | \phi_j \rangle \]

is used to denote the one-electron integral

\[\int \phi^*_i(r) f(r) \phi_j(r) dr\]


\[\langle \phi_i \phi_j | g | \phi_k\phi_l \rangle \]

(or, in short hand notation, \(\langle i j| k l \rangle\) ) represents the two-electron integral

\[\int \phi^*_i(r) \phi^*_j(r') g(r,r') \phi_k(r)\phi_l(r') drdr'.\]

The notation \(\langle i j | k l \rangle\) introduced above gives the two-electron integrals for the \(g(r,r')\) operator in the so-called Dirac notation, in which the \(i\) and \(k\) indices label the spin-orbitals that refer to the coordinates \(r\) and the \(j\) and l indices label the spin-orbitals referring to coordinates \(r'\). The \(r\) and \(r'\) denote \(r,\theta,\phi,\sigma\) and \(r',\theta',\phi',\sigma'\) (with \(\sigma\) and \(\sigma'\) being the \(\alpha\) or \(\beta\) spin functions).

If the operators \(f\) and \(g\) do not contain any electron spin operators, then the spin integrations implicit in these integrals (all of the \(\phi_i\) are spin-orbitals, so each \(\phi\) is accompanied by an \(\alpha\) or \(\beta\) spin function and each \(\phi^*\) involves the adjoint of one of the \(\alpha\) or \(\beta\) spin functions) can be carried out using \(\langle a|a\rangle =1\), \(\langle a|b\rangle =0\), \(\langle b|a\rangle =0\), \(\langle b|b\rangle =1\), thereby yielding integrals over spatial orbitals.

g. Atomic Units

The electronic Hamiltonian that appears throughout this text is commonly expressed in the literature and in other texts in so-called atomic units (aus). In that form, it is written as follows:

\[H_e = \sum_j \left[ -\frac{1}{2} \nabla_j^2 - \sum_a \frac{Z_a}{r_{j,a}} \right] + \sum_{j< k} \frac{1}{r_{j,k}} .\]

Atomic units are introduced to remove all of the h , e, and me factors from the Schrödinger equation.

To effect the unit transformation that results in the Hamiltonian appearing as above, one notes that the kinetic energy operator scales as \(r_j^{-2}\) whereas the Coulomb potentials scale as \(r_j^{-1}\)​ and as \(r_{j,k}^{-1}\). So, if each of the Cartesian coordinates of the electrons and nuclei were expressed as a unit of length \(a_0\) multiplied by a dimensionless length factor, the kinetic energy operator would involve terms of the form

\(( - \hbar^2/2(a_0)^2m_e ) \nabla_j^2\) , and the Coulomb potentials would appear as \(Z_ae^2/(a_0)r_{j,a}\)  and \(e^2/(a_0)r_{j,k}\) , with the \(r_{j,a}\) and \(r_{j,k}\) factors now referring to the dimensionless coordinates.  A factor of \(e^2/a_0\) (which has units of energy since a_0 has units of length) can then be removed from the Coulomb and kinetic energies, after which the kinetic energy terms appear as \(( - \hbar^2/2(e^2a_0)m_e ) \nabla_j^2\) and the potential energies appear as \(Z_a/r_{j,a}\) and \(1/r_{j,k}\). Then, choosing \(a_0 = \hbar^2/e^2m_e\) changes the kinetic energy terms into \(-1/2 \nabla_j^2\); as a result, the entire electronic Hamiltonian takes the form given above in which no \(e^2\), me, or \(\hbar^2\) factors appear. The value of the so-called Bohr radius \(a_0 = \hbar^2/e^2m_e\) turns out to be 0.529 Å, and the so-called Hartree energy unit \(e^2/a_0\), which factors out of He, is 27.21 eV or 627.51 kcal/mol.

6.1.3  Molecules Embedded in Condensed Media

Often one wants to model the behavior of a molecule or ion that is not isolated as it might be in a gas-phase experiment. When one attempts to describe a system that is embedded, for example, in a crystal lattice, in a liquid or a glass, one has to have some way to treat both the effects of the surrounding medium on the molecule of interest and the motions of the medium’s constituents.  In so-called quantum mechanics- molecular mechanics (QM-MM) approaches to this problem, one treats the molecule or ion of interest using the electronic structure methods outlined earlier in this Chapter, but with one modification. The one-electron component of the Hamiltonian, which contains the electron-nuclei Coulomb potential \(\sum_{a,i} (-Z_ae^2/|r_i – R_a|)\), is modified to also contain a term that describes the potential energy of interaction of the electrons and nuclei with the surrounding medium. In the simplest such models, this solvation potential depends only on the dielectric constant of the surroundings. In more sophisticated models, the surroundings are represented by a collection of (fractional) point charges that may also be attributed with local dipole moments and polarizabilities that allow them to respond to changes in the internal charge distribution of the molecule or ion. The locations of such partial charges and the magnitudes of their dipoles and polarizabilities are determined to make the resultant solvation potential reproduce known (from experiment or other simulations) solvation characteristics (e.g., solvation energy, radial distribution functions) in a variety of calibration cases. The book Molecular Modeling, 2nd ed., A. R. Leach, Prentice Hall, Englewood Cliffs (2001) offers a good source of information about how these terms are added into the one-electron component of the Hamiltonian to account for solvation effects.

In addition to describing how the surroundings affect the Hamiltonian of the molecule or ion of interest, one needs to describe the motions or spatial distributions of the medium’s constituent atoms or molecules. This is usually done within a purely classical treatment of these degrees of freedom. That is, if equilibrium properties of the solvated system are to be simulated, then Monte-Carlo (MC) sampling (this subject is treated in Chapter 7 of this text) of the surrounding medium’s coordinates is used. Within such a MC sampling, the potential energy of the entire system is calculated as a sum of two parts:

i. the electronic energy of the solute molecule or ion, which contains the interaction energy of the molecule’s electrons and nuclei with the surrounding medium, plus

ii. the intra-medium potential energy, which is taken to be of a simple molecular mechanics (MM) force field character (i.e., to depend on inter-atomic distances and internal angles in an analytical and easily computed manner). Again, the book Molecular Modeling, 2nd ed., A. R. Leach, Prentice Hall, Englewood Cliffs (2001) offers a good source of information about these matters.

If, alternatively, dynamical characteristics of the solvated species are to be simulated, a classical molecular dynamics (MD) treatment is used. In this approach, the solute-medium and internal-medium potential energies are handled in the same way as in the MC case but where the time evolution of the medium’s coordinates are computed using the MD techniques discussed in Chapter 7 of this text.

6.1.4 High-End Methods for Treating Electron Correlation

Although their detailed treatment is beyond the scope of this text, it is important to appreciate that new approaches are always under development in all areas of theoretical chemistry. In this Section, I want to introduce you to two tools that are proving to offer high precision in the treatment of electron correlation energies. These are the so-called quantum Quantum Monte-Carlo and r1,2- approaches to this problem.

1. Quantum Monte-Carlo

In this method, one first re-writes the time dependent Schrödinger equation

\[i \hbar \frac{d\Psi}{dt} = - \frac{\hbar^2}{2m_e} \sum_j \nabla_j^2 \Psi + V \Psi\]

for negative imaginary values of the time variable \(t\) (i.e., one simply replaces \(t\) by \(-it\)). This gives

\[\frac{d\Psi}{dt} = \frac{\hbar}{2m_e}​ \sum_j \nabla_j^2 \Psi - \frac{V}{\hbar} \Psi,\]

which is analogous to the well-known diffusion equation

\[\frac{dC}{dt} = D \nabla^2C + S C.\]

The re-written Schrödinger equation can be viewed as a diffusion equation in the \(3N\) spatial coordinates of the \(N\) electrons with a diffusion coefficient \(D\) that is related to the electrons' mass me by

\[D = \frac{\hbar}{2m_e}.\]

The so-called source and sink term \(S\) in the diffusion equation is related to the electron-nuclear and electron-electron Coulomb potential energies denoted V:

\[S = - \frac{V}{\hbar}.\]

In regions of space where \(V\) is large and negative (i.e., where the potential is highly attractive), \(V\) is large and negative, so \(S\) is large and positive. This causes the concentration \(C\) of the diffusing material to accumulate in such regions. Likewise, where \(V\) is positive, \(C\) will decrease. Clearly by recognizing \(\Psi\) as the concentration variable in this analogy, one understands that \(\Psi\) will accumulate where \(V\) is negative and will decay

where \(V\) is positive, as one expects. Prof. Bill Lester at Berkeley has done a lot to advance this method as applied to the electronic structure of molecules, so his web site offers a good source of further information.

So far, we see that the trick of taking \(t\) to be negative and imaginary causes the electronic Schrödinger equation to look like a \(3N\)-dimensional diffusion equation. Why is this useful and why does this trick work? It is useful because, as we see in Chapter 7 of this text, Monte-Carlo methods are highly efficient tools for solving certain equations; it turns out that the diffusion equation is one such case. So, the Quantum Monte-Carlo approach can be used to solve the imaginary-time Schrödinger equation even for systems containing many electrons. But, what does this imaginary time mean?

To understand the imaginary time trick, let us recall that any wave function
(e.g., the trial wave function with which one begins to use Monte-Carlo methods to propagate the diffusing \(\Psi\) function) \(\Phi\) can be written in terms of the exact eigenfunctions {\(\psi_K\)} of the Hamiltonian

\[H = - \frac{\hbar^2}{2m_e} \sum_j \nabla_j^2  + V\]

as follows:

\[F = \sum_K C_K \psi_K.\]

If the Monte-Carlo method can, in fact be used to propagate forward in time such a function but with \(t = -it\), then it will, in principle, generate the following function at such an imaginary time:

\[F = \sum_K C_K \psi_K \exp(-iEKt/\hbar) = \sum_K C_K \psi_K \exp(-EKt/\hbar).\]

As \(t\) increases, the relative amplitudes {\(C_K \exp(-E_Kt/\hbar)\)} of all states but the lowest state (i.e., that with smallest \(E_K\)) will decay compared to the amplitude \(C_0 \exp(-E_0t/\hbar)\) of the lowest state. So, the time-propagated wave function will, at long enough t, be dominated by its lowest-energy component. In this way, the quantum Monte-Carlo propagation method can generate a wave function in \(3N\) dimensions that approaches the ground-state wave function.

It has turned out that this approach, which tackles the \(N\)-electron correlation problem head-on, has proven to yield highly accurate energies and wave functions that display the proper cusps near nuclei as well as the negative cusps (i.e., the wave function vanishes) whenever two electrons' coordinates approach one another. Finally, it turns out that by using a starting function \(F\) of a given symmetry and nodal structure, this method can be extended to converge to the lowest-energy state of the chosen symmetry and nodal structure. So, the method can be used on excited states also. In Chapter 7 of this text, you will learn how the Monte-Carlo tools can be used to simulate the behavior of many-body systems (e.g., the \(N\)-electron system we just discussed) in a highly efficient and easily parallellized manner.

2. The \(r_{1,2}\) Method

In this approach to electron correlation, one employs a trial variational wave function that contains components that depend explicitly on the inter-electron distances \(r_{i,j}\). By so doing, one does not rely on the polarized orbital pair approach introduced earlier in this Chapter to represent all of the correlations among the electrons. An example of such an explicitly correlated wave function is:

\[\psi = |\phi_1 \phi_2 \phi_3 …\phi_N|  (1 + a \sum_{i<j} r_{i,j})\]

which consists of an antisymmetrized product of \(N\) spin-orbitals multiplied by a factor that is symmetric under interchange of any pair of electrons and contains the electron-electron distances in addition to a single variational parameter \(a\). Such a trial function is said to contain linear-\(r_{i,j}\) correlation factors. Of course, it is possible to write many other forms for such an explicitly correlated trial function. For example, one could use:

\[\psi = |\phi_1 \phi_2 \phi_3 …\phi_N|  \exp(-a \sum_{i<j} r_{i,j}))\]

as a trial function. Both the linear and the exponential forms have been used in developing this tool of quantum chemistry. Because the integrals that must be evaluated when one computes the Hamiltonian expectation value \(\langle \psi|H| \psi \rangle\) are most computationally feasible (albeit still very taxing) when the linear form is used, this particular parameterization is currently the most widely used.

Both the \(r_{i,j}\)- and quantum Monte-Carlo methods currently are used when one wishes to obtain the absolute highest precision in an electronic structure calculation. The computational requirements of both of these methods are very high, so, at present, they can only be used on species containing fewer than ca. 100 electrons. However, with the power and speed of computers growing as fast as they are, it is likely that these high-end methods will be more and more widely used as time goes by.


Jack Simons (Henry Eyring Scientist and Professor of Chemistry, U. Utah) Telluride Schools on Theoretical Chemistry

Integrated by Tomoyuki Hayashi (UC Davis)