6.5: Various Approaches to Electron Correlation

Last updated
Save as PDF

Page ID: 162850

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

\( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)

\( \newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\)

( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\)

\( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)

\( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\)

\( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)

\( \newcommand{\Span}{\mathrm{span}}\)

\( \newcommand{\id}{\mathrm{id}}\)

\( \newcommand{\Span}{\mathrm{span}}\)

\( \newcommand{\kernel}{\mathrm{null}\,}\)

\( \newcommand{\range}{\mathrm{range}\,}\)

\( \newcommand{\RealPart}{\mathrm{Re}}\)

\( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)

\( \newcommand{\Argument}{\mathrm{Arg}}\)

\( \newcommand{\norm}[1]{\| #1 \|}\)

\( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)

\( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\AA}{\unicode[.8,0]{x212B}}\)

\( \newcommand{\vectorA}[1]{\vec{#1}} % arrow\)

\( \newcommand{\vectorAt}[1]{\vec{\text{#1}}} % arrow\)

\( \newcommand{\vectorB}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

\( \newcommand{\vectorC}[1]{\textbf{#1}} \)

\( \newcommand{\vectorD}[1]{\overrightarrow{#1}} \)

\( \newcommand{\vectorDt}[1]{\overrightarrow{\text{#1}}} \)

\( \newcommand{\vectE}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash{\mathbf {#1}}}} \)

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

\( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)

\(\newcommand{\avec}{\mathbf a}\) \(\newcommand{\bvec}{\mathbf b}\) \(\newcommand{\cvec}{\mathbf c}\) \(\newcommand{\dvec}{\mathbf d}\) \(\newcommand{\dtil}{\widetilde{\mathbf d}}\) \(\newcommand{\evec}{\mathbf e}\) \(\newcommand{\fvec}{\mathbf f}\) \(\newcommand{\nvec}{\mathbf n}\) \(\newcommand{\pvec}{\mathbf p}\) \(\newcommand{\qvec}{\mathbf q}\) \(\newcommand{\svec}{\mathbf s}\) \(\newcommand{\tvec}{\mathbf t}\) \(\newcommand{\uvec}{\mathbf u}\) \(\newcommand{\vvec}{\mathbf v}\) \(\newcommand{\wvec}{\mathbf w}\) \(\newcommand{\xvec}{\mathbf x}\) \(\newcommand{\yvec}{\mathbf y}\) \(\newcommand{\zvec}{\mathbf z}\) \(\newcommand{\rvec}{\mathbf r}\) \(\newcommand{\mvec}{\mathbf m}\) \(\newcommand{\zerovec}{\mathbf 0}\) \(\newcommand{\onevec}{\mathbf 1}\) \(\newcommand{\real}{\mathbb R}\) \(\newcommand{\twovec}[2]{\left[\begin{array}{r}#1 \\ #2 \end{array}\right]}\) \(\newcommand{\ctwovec}[2]{\left[\begin{array}{c}#1 \\ #2 \end{array}\right]}\) \(\newcommand{\threevec}[3]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \end{array}\right]}\) \(\newcommand{\cthreevec}[3]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \end{array}\right]}\) \(\newcommand{\fourvec}[4]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}\) \(\newcommand{\cfourvec}[4]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}\) \(\newcommand{\fivevec}[5]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}\) \(\newcommand{\cfivevec}[5]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}\) \(\newcommand{\mattwo}[4]{\left[\begin{array}{rr}#1 \amp #2 \\ #3 \amp #4 \\ \end{array}\right]}\) \(\newcommand{\laspan}[1]{\text{Span}\{#1\}}\) \(\newcommand{\bcal}{\cal B}\) \(\newcommand{\ccal}{\cal C}\) \(\newcommand{\scal}{\cal S}\) \(\newcommand{\wcal}{\cal W}\) \(\newcommand{\ecal}{\cal E}\) \(\newcommand{\coords}[2]{\left\{#1\right\}_{#2}}\) \(\newcommand{\gray}[1]{\color{gray}{#1}}\) \(\newcommand{\lgray}[1]{\color{lightgray}{#1}}\) \(\newcommand{\rank}{\operatorname{rank}}\) \(\newcommand{\row}{\text{Row}}\) \(\newcommand{\col}{\text{Col}}\) \(\renewcommand{\row}{\text{Row}}\) \(\newcommand{\nul}{\text{Nul}}\) \(\newcommand{\var}{\text{Var}}\) \(\newcommand{\corr}{\text{corr}}\) \(\newcommand{\len}[1]{\left|#1\right|}\) \(\newcommand{\bbar}{\overline{\bvec}}\) \(\newcommand{\bhat}{\widehat{\bvec}}\) \(\newcommand{\bperp}{\bvec^\perp}\) \(\newcommand{\xhat}{\widehat{\xvec}}\) \(\newcommand{\vhat}{\widehat{\vvec}}\) \(\newcommand{\uhat}{\widehat{\uvec}}\) \(\newcommand{\what}{\widehat{\wvec}}\) \(\newcommand{\Sighat}{\widehat{\Sigma}}\) \(\newcommand{\lt}{<}\) \(\newcommand{\gt}{>}\) \(\newcommand{\amp}{&}\) \(\definecolor{fillinmathshade}{gray}{0.9}\)

There are numerous procedures currently in use for determining the best Born-Oppenheimer electronic wave function that is usually expressed in the form:

\[\psi = \sum_i m_i C_I \Phi_i,\]

where \(\Phi_I\) is a spin-and space- symmetry-adapted configuration state function (CSF) that consists of one or more determinants \(| \phi_{I1}\phi_{I2}\phi_{I3}\cdots \phi_{IN}|\) combined to produce the desired symmetry. In all such wave functions, there are two kinds of parameters that need to be determined- the CI coefficients and the LCAO-MO coefficients describing the fIk in terms of the AO basis functions. The most commonly employed methods used to determine these parameters include:

The CI Method

In this approach, the LCAO-MO coefficients are determined first usually via a single-configuration HF SCF calculation. The CI coefficients are subsequently determined by making the expectation value \(\langle \psi | H | \psi \rangle / \langle \psi | \psi \rangle\) variationally stationary with \(\psi\) chosen to be of the form

\[\psi = \sum_I C_I \Phi_I.\]

As with all such linear variational problems, this generates a matrix eigenvalue equation

\[\sum_J \langle\Phi_I|H|\Phi_J\rangle C_J=EC_I\]

to be solved for the optimum {\(C_I\)} coefficients and for the optimal energy \(E\).

The CI wave function is most commonly constructed from spin- and spatial- symmetry adapted combinations of determinants called configuration state functions (CSFs) \(\Phi_J\) that include:

The so-called reference CSF that is the SCF wave function used to generate the molecular orbitals \(\phi_i\).
CSFs generated by carrying out single, double, triple, etc. level excitations (i.e., orbital replacements) relative to the reference CSF. CI wave functions limited to include contributions through various levels of excitation are denoted S (singly), D (doubly), SD (singly and doubly), SDT (singly, doubly, and triply) excited.

The orbitals from which electrons are removed can be restricted to focus attention on correlations among certain orbitals. For example, if excitations out of core orbitals are excluded, one computes a total energy that contains no core correlation energy. The number of CSFs included in the CI calculation can be large. CI wave functions including 5,000 to 50,000 CSFs are routine, and functions with one to several billion CSFs are within the realm of practicality.

The need for such large CSF expansions can be appreciated by considering (i) that each electron pair requires at least two CSFs to form the polarized orbital pairs discussed earlier in this Chapter, (ii) there are of the order of \(\dfrac{N(N-1)}{2} = X\) electron pairs for a molecule containing \(N\) electrons, hence (iii) the number of terms in the CI wave function scales as \(2^X\). For a molecule containing ten electrons, there could be \(2^{45} = 3.5 \times 10^{13}\) terms in the CI expansion. This may be an over estimate of the number of CSFs needed, but it demonstrates how rapidly the number of CSFs can grow with the number of electrons.

The Hamiltonian matrix elements \(H_{I,J}\) between pairs of CSFs are, in practice, evaluated in terms of one- and two- electron integrals over the molecular orbitals. Prior to forming the \(H_{I,J}\) matrix elements, the one- and two- electron integrals, which can be computed only for the atomic (e.g., STO or GTO) basis, must be transformed to the molecular orbital basis. This transformation step requires computer resources proportional to the fifth power of the number of basis functions, and thus is one of the more troublesome steps in most configuration interaction (and most other correlated) calculations.

To transform the two-electron integrals \(\langle \chi_a(r)\chi_b(r')|\dfrac{1}{|r-r'|}|\chi_c(r)\chi_d(r')\rangle\) from this AO basis to the MO basis, one proceeds as follows:

1. First one utilizes the original AO-based integrals to form a partially transformed set of integrals

\[\langle \chi_a(r)\chi_b(r')|\dfrac{1}{|r-r'|}|\chi_c(r)\phi_l(r')\rangle = \sum_d C_{l,d} \langle \chi_a(r)\chi_b(r')|\dfrac{1}{|r-r'|}|\chi_c(r)\chi_d(r')\rangle.\]

This step requires of the order of \(M^5\) operations.

2. Next one takes the list \(\langle \chi_a(r)\chi_b(r')|\dfrac{1}{|r-r'|}|\chi_c(r)\phi_l(r')\rangle\) and carries out another so-called one-index transformation

\[\langle \chi_a(r)\chi_b(r')|\dfrac{1}{|r-r'|}|\phi_k(r)\phi_l(r')\rangle = \sum_c C_{k,c} \langle \chi_a(r)\chi_b(r')|\dfrac{1}{|r-r'|}|\chi_c(r)\phi_l(r')\rangle.\]

3. This list \(\langle \chi_a(r)\chi_b(r')|\dfrac{1}{|r-r'|}|\phi_k(r)\phi_l(r')\rangle\) is then subjected to another one-index transformation to generate \(\langle \chi_a(r)\phi_j(r')|\dfrac{1}{|r-r'|}|\phi_k(r)\phi_l(r')\rangle\), after which

4. \(\langle \chi_a(r)\phi_j(r')|\dfrac{1}{|r-r'|}|\phi_k(r)\phi_l(r')\rangle\) is subjected to the fourth one-index transformation to form the final MO-based integral list \(\langle \phi_i(r)\phi_j(r')|\dfrac{1}{|r-r'|}|\phi_k(r)\phi_l(r')\rangle\). In total, these four transformation steps require \(4M^5\) computer operations.

A variant of the CI method that is sometimes used is called the multi-configurational self-consistent field (MCSCF) method. To derive the working equations of this approach, one minimizes the expectation value of the Hamiltonian for a trial wave function consisting of a linear combination of CSFs

\[ \psi = \sum_I C_I \Phi_I.\]

In carrying out this minimization process, one varies both the linear {\(C_I\)} expansion coefficients and the LCAO-MO coefficients {\(C_{J,\mu}\)} describing those spin-orbitals that appear in any of the CSFs {\(\Phi_I\)}. This produces two sets of equations that need to be solved:

1. A matrix eigenvalue equation

\[\sum_J \langle\Phi_I|H|\Phi_J\rangle C_J=EC_I\]

of the same form as arises in the CI method, and

2. equations that look very much like the HF equations

\[\sum_\mu \langle\chi_\nu |h_e| \chi_\mu\rangle C_{J,\mu} = \epsilon_J \sum_\mu \langle\chi_\nu|\chi_\mu\rangle C_{J,\mu} \]

but in which the he matrix element is

\[\langle\chi_\nu| h_e| \chi_\mu\rangle = \langle\chi_\nu| -\dfrac{\hbar^2}{2m} \nabla^2 |\chi_\mu\rangle + \langle\chi_\nu| -\frac{Ze^2}{r} |\chi_\mu\rangle\]

\[+ \sum_{\eta,\gamma} \Gamma{\eta,\gamma} [\langle\chi_\nu(r) \chi_\eta(r’) |\frac{e^2}{|r-r’|} | \chi_\mu(r) \chi_\gamma(r’)\rangle
- \langle\chi_\nu(r) \chi_\eta(r’) |\frac{e^2}{|r-r’|} | \chi_\gamma(r) \chi_\mu (r’)\rangle].\]

Here \(\Gamma_{\eta,\gamma}\) replaces the sum \(\sum_K C_{K,\eta} C_{K,\gamma}\) that appears in the HF equations, with \(\Gamma_{\eta,\gamma}\) depending on both the LCAO-MO coefficients {\(C_{K,\eta}\)} of the spin-orbitals and on the {\(C_I\)} expansion coefficients. These equations are solved through a self-consistent process in which initial {\(C_{K,\eta}\)} coefficients are used to form the matrix and solve for the {\(C_I\)} coefficients, after which the \(\Gamma_{\eta,\gamma}\) can be determined and the HF-like equations solved for a new set of {\(C_{K,\eta}\)} coefficients, and so on until convergence is reached.

Perturbation Theory

This method uses the single-configuration SCF process to determine a set of orbitals {\(\phi_i\)}. Then, with a zeroth-order Hamiltonian equal to the sum of the \(N\) electrons’ Fock operators \(H_0 = \sum_{i=1}^N h_e(i)\), perturbation theory is used to determine the CI amplitudes for the other CSFs. The Møller-Plesset perturbation (MPPT) procedure is a special case in which the above sum of Fock operators is used to define \(H_0\). The amplitude for the reference CSF is taken as unity and the other CSFs' amplitudes are determined by using \(H-H_0\) as the perturbation. This perturbation is the difference between the true Coulomb interactions among the electrons and the mean-field approximation to those interactions:

\[V=H-H^{(0)}=\frac{1}{2}\sum_{i \ne j\ne 1}^N \frac{1}{r_{i,j}}-\sum_{k=1}^N[J_j(r)-K_k(r)]\]

where \(J_k\) and \(K_k\) are the Coulomb and exchange operators defined earlier in this Chapter and the sum over \(k\) runs over the \(N\) spin-orbitals that are occupied in the Hartree-Fock wave function that forms the zeroth-order approximation to \(\psi\).

In the MPPT method, once the reference CSF is chosen and the SCF orbitals belonging to this CSF are determined, the wave function \(\psi\) and energy \(E\) are determined in an order-by-order manner as is the case in the RSPT discussed in Chapter 3. In fact, MPPT is just RSPT with the above fluctuation potential as the perturbation. The perturbation equations determine what CSFs to include through any particular order. This is one of the primary strengths of this technique; it does not require one to make further choices, in contrast to the CI treatment where one needs to choose which CSFs to include.

For example, the first-order wave function correction \(\psi_1\) is:

\[\psi_1 = - \sum_{i < j,m < n} \dfrac{\langle i,j |\dfrac{1}{r_{12}}| m,n \rangle -\langle i,j |\dfrac{1}{r_{12}}| n,m \rangle}{ \varepsilon_m-\varepsilon_i +\varepsilon_n-\varepsilon_j} | \Phi_{i,j}^{m,n} \rangle,\]

where the SCF orbital energies are denoted \(\varepsilon_k\) and \(\Phi_{i,j}^{m,n}\) represents a CSF that is doubly excited (\(\phi_i\) and \(\phi_j\) are replaced by \(\phi_m\) and \(\phi_n\)) relative to the SCF wave function \(\Phi\). The denominators \([ \varepsilon_m-\varepsilon_i +\varepsilon_n-\varepsilon_j]\) arise from \(E_0-E_k^0\) because each of these zeroth-order energies is the sum of the orbital energies for all spin-orbitals occupied. The excited CSFs \(\Phi_{i,j}^{m,n}\) are the zeroth-order wave functions other than the reference CSF. Only doubly excited CSFs contribute to the first-order wave function; the fact that the contributions from singly excited configurations vanish in \(\psi_1\) is known at the Brillouin theorem.

The Brillouin theorem can be proven by considering Hamiltonian matrix elements coupling the reference CSF \(F\) to singly-excited CSFs Fim. The rules for evaluating all such matrix elements are called Slater-Condon rules and are given later in this Chapter. If you don’t know them, this would be a good time to go read the subsection on these rules before returning here. From the Slater-Condon rules, we know that the matrix elements in question are given by

\[\langle \Phi|H|\Phi_i^m\rangle= \langle \phi_i(r)| -\frac{1}{2}\nabla^2 - \sum_a \dfrac{Z_a}{|r-R_a|} |\phi_m(r)\rangle + \sum_{j=1(\ne i,m)}^N \langle \phi_i(r) \phi_j(r')|\dfrac{1-P_{r,r'}}{|r-r'|}| \phi_m(r) \phi_j(r')\rangle.\]

Here, the factor \(P_{r,r’}\) simply permutes the coordinates \(r\) and \(r’\) to generate the exchange integral. The sum of two electron integrals on the right-hand side above can be extended to include the terms arising from \(j =i\) because vanishes. As a result, the entire right-hand side can be seen to reduce to the matrix element of the Fock operator \(h_{\rm HF}(r)\):

\[\langle \Phi|H|\Phi_i^m\rangle=\langle \phi_i|h_{\rm HF}(r)|\phi_m(r)\rangle=\varepsilon_m\delta_{i,m}=0.\]

The matrix elements vanish because the spin-orbitals are eigenfunctions of \(h_{\rm HF}(r)\) and are orthogonal to each other.

The MPPT energy \(E\) is given through second order as in RSPT by

\[E = E_{SCF} - \sum_{i < j,m < n} \frac{| \langle i,j | \dfrac{1}{r_{12}} | m,n \rangle -\langle i,j | \dfrac{1}{r_{12}} | n,m \rangle |^2}{ \varepsilon_m-\varepsilon_i +\varepsilon_n-\varepsilon_j }\]

and again only contains contributions from the doubly excited CSFs. Both \(\psi\) and \(E\) are expressed in terms of two-electron integrals \(\langle i,j | \frac{1}{r_{12}} | m,n \rangle\) (that are sometimes denoted \(\langle i,j|k,l\rangle\)) coupling the virtual spin-orbitals \(\phi_m\) and \(\phi_n\) to the spin-orbitals from which electrons were excited \(\phi_i\) and \(\phi_j\) as well as the orbital energy differences \([ \varepsilon_m-\varepsilon_i +\varepsilon_n-\varepsilon_j ]\) accompanying such excitations. Clearly, major contributions to the correlation energy are made by double excitations into virtual orbitals \(\phi_m \phi_n\) with large \(\langle i,j | \frac{1}{r_{12}} | m,n \rangle\) integrals and small orbital energy gaps \([\varepsilon_m-\varepsilon_i +\varepsilon_n-\varepsilon_j]\). In higher order corrections, contributions from CSFs that are singly, triply, etc. excited relative to the HF reference function \(F\) appear, and additional contributions from the doubly excited CSFs also enter. The various orders of MPPT are usually denoted MPn (e.g., MP2 means second-order MPPT).

The Coupled-Cluster Method

As noted above, when the Hartree-Fock wave function \(\psi_0\) is used as the zeroth-order starting point in a perturbation expansion, the first (and presumably most important) corrections to this function are the doubly-excited determinants. In early studies of CI treatments of electron correlation, it was observed that double excitations had the largest \(C_J\) coefficients (after the SCF wave function, which has the very largest \(C_J\)). Moreover, in CI studies that included single, double, triple, and quadruple level excitations relative to the dominant SCF determinant, it was observed that quadruple excitations had the next largest \(C_J\) amplitudes after the double excitations. And, very importantly, it was observed that the amplitudes \(C_{abcd}^{mnpq}\) of the quadruply excited CSFs \(\Phi_{abcd}^{mnpq}\) could be very closely approximated as products of the amplitudes \(C_{ab}^{mn} C_{cd}^{pq}\) of the doubly excited CSFs \(\Phi_{ab}^{mn}\) and \(\Phi_{cd}^{pq}\). This observation prompted workers to suggest that a more compact and efficient expansion of the correlated wave function might be realized by writing \(\psi\) as:

\[\psi = \exp(T) \Phi,\]

where \(\Phi\) is the SCF determinant and the operator \(T\) appearing in the exponential is taken to be a sum of operators

\[T = T_1 + T_2 + T_3 + … + T_N \]

that create single (\(T_1\)), double (\(T_2\)), etc. level excited CSFs when acting on \(\Phi\). As I show below, this so-called coupled-cluster (CC) form for \(\psi\) then has the characteristic that the dominant contributions from quadruple excitations have coefficients nearly equal to the products of the coefficients of their constituent double excitations.

In any practical calculation, this sum of \(T_n\) operators would be truncated to keep the calculation practical. For example, if excitation operators higher than \(T_3\) were neglected, then one would use \(T » T_1 + T_2 + T_3\). However, even when \(T\) is so truncated, the resultant \(\psi\) would contain excitations of higher order. For example, using the truncation just introduced, we would have

\[y = (1 + T_1 + T_2 + T_3 + \frac{1}{2} (T_1 + T_2 + T_3) (T_1 + T_2 + T_3) + \frac{1}{6} (T_1 + T_2 + T_3) \]

\[(T_1 + T_2 + T_3) (T_1 + T_2 + T_3) + …) \Phi. \]

This function contains single excitations (in \(T_1\Phi\)), double excitations (in \(T_2\Phi\) and in \(T_1T_1\Phi\)), triple excitations (in \(T_3\Phi\), \(T_2T_1\Phi\), \(T_1T_2\Phi\), and \(T_1T_1T_1\Phi\)), and quadruple excitations in a variety of terms including \(T_3 T_1\Phi\) and \(T_2 T_2\Phi\), as well as even higher level excitations. By the design of this wave function, the quandruple excitations \(T_2 T_2\Phi\) will have amplitudes given as products of the amplitudes of the double excitations \(T_2\Phi\) just as were found by earlier CI workers to be most important. Hence, in CC theory, we say that quadruple excitations include unlinked products of double excitations arising from the \(T_2 T_2\) product; the quadruple excitations arising from \(T_4\Phi\) would involve linked terms and would have amplitudes that are not products of double-excitation amplitudes.

After writing \(\psi\) in terms of an exponential operator, one is faced with determining the amplitudes of the various single, double, etc. excitations generated by the \(T\) operator acting on \(\Phi\). This is done by writing the Schrödinger equation as:

\[H \exp(T) \Phi = E \exp(T) \Phi,\]

and then multiplying on the left by \(\exp(-T)\) to obtain:

\[exp(-T) H \exp(T) \Phi = E \Phi.\]

The CC energy is then calculated by multiplying this equation on the left by \(\Phi^*\) and integrating over the coordinates of all the electrons:

\[\langle\Phi| \exp(-T) H \exp(T) \Phi> = E.\]

In practice, the combination of operators appearing in this expression is rewritten and dealt with as follows:

\[E = \langle\Phi| T + [H,T] + \frac{1}{2} [[H,T],T] + \frac{1}{6} [[[H,T],T],T] + \frac{1}{24} [[[[H,T],T],T],T] |\Phi\rangle;\]

this so-called Baker-Campbell-Hausdorf expansion of the exponential operators can be shown truncate exactly after the fourth power term shown here. So, once the various operators and their amplitudes that comprise \(T\) are known, \(E\) is computed using the above expression that involves various powers of the \(T\) operators.

The equations used to find the amplitudes (e.g., those of the \(T_2\) operator \(\sum_{a,b,m,n} t_{ab}^{mn}T_{ab}^{mn}\), where the \(t_{ab}^{mn}\) are the amplitudes and \(T_{ab}^{mn}\) are the excitation operators that promote two electrons from \(\phi_a\) and \(\phi_b\) into \(\phi_m\) and \(\phi_n\)) of the various excitation level are obtained by multiplying the above Schrödinger equation on the left by an excited determinant of that level and integrating. For example, the equation for the double-excitations is:

\[0 = \langle\Phi_{ab}^{mn}| T + [H,T] + \frac{1}{2} [[H,T],T] + \frac{1}{6} [[[H,T],T],T] + \frac{1}{24} [[[[H,T],T],T],T] |\Phi\rangle. \]

The zero arises from the right-hand side of \(\exp(-T) H \exp(T) \Phi = E \Phi\) and the fact that \(\langle\Phi_{ab}^{mn}|\Phi\rangle = 0 \); that is, the determinants are orthonormal. The number of such equations is equal to the number of doubly excited determinants \(\Phi_{ab}^{mn}\), which is equal to the number of unknown \(t_{ab}^{mn}\) amplitudes. So, the above quartic equations must be solved to determine the amplitudes appearing in the various \(T_J\) operators. Then, as noted above, once these amplitudes are known, the energy \(E\) can be computed using the earlier quartic equation. Having to solve many coupled quartic equations is one of the most severe computational challenges of CC theory.

Clearly, the CC method contains additional complexity as a result of the exponential expansion form of the wave function \(\psi\) and the resulting coupled quartic equations that need to be solved to determine the \(t\) amplitudes. However, it is this way of writing \(\psi\) that allows us to automatically build in the fact that products of double excitations are the dominant contributors to quadruple excitations (and \(T_2 T_2 T_2\) is the dominant component of six-fold excitations, not \(T_6\)). In fact, the CC method is today one of the most accurate tools we have for calculating molecular electronic energies and wave functions.

The Density Functional Method

These approaches provide alternatives to the conventional tools of quantum chemistry, which move beyond the single-configuration picture by adding to the wave function more configurations (i.e., excited determinants) whose amplitudes they each determine in their own way. As noted earlier, these conventional approaches can lead to a very large number of CSFs in the correlated wave function, and, as a result, a need for extraordinary computer resources.

The density functional approaches are different. Here one solves a set of orbital-level equations

\[ - \frac{\hbar^2}{2m_e} \nabla^2 - \sum_a \frac{Z_ae^2}{|\textbf{r}-\textbf{R}_a|} + \int \rho(\textbf{r}')\frac{e^2}{|\textbf{r}-\textbf{r}'|} + U(r)] \phi_i = \varepsilon_i \phi_i\]

in which the orbitals {\(\phi_i\)} feel potentials due to the nuclear centers (having charges \(Z_a\)), Coulombic interaction with the total electron density \(\rho(\textbf{r}')\), and a so-called exchange-correlation potential denoted \(U(\textbf{r}')\). The particular electronic state for which the calculation is being performed is specified by forming a corresponding density \(\rho(\textbf{r}')\) that, in turn, is often expressed as a sum of squares of occupied orbitals multiplied by orbitial occupation numbers. Before going further in describing how DFT calculations are carried out, let us examine the origins underlying this theory.

The so-called Hohenberg-Kohn theorem states that the ground-state electron density \(\rho(\textbf{r})\) of the atom or molecule or ion of interest uniquely determines the potential \(V(\textbf{r})\) in the molecule’s electronic Hamiltonian (i.e., the positions and charges of the system’s nuclei)

\[H = \sum_j {-\frac{\hbar^2}{2m_e} \nabla_j^2 + V(r_j) + \frac{e^2}{2} \sum_{k\ne j} \frac{1}{r_{j,k}} },\]

and, because H determines all of the energies and wave functions of the system, the ground-state density \(\rho(\textbf{r})\) therefore determines all properties of the system.

One proof of this theorem proceeds as follows:

\(\rho(\textbf{r})\) determines the number of electrons \(N\) because \(\int \rho(\textbf{r}) d^3r = N\).
Assume that there are two distinct potentials (aside from an additive constant that simply shifts the zero of total energy) \(V(\textbf{r})\) and \(V'(\textbf{r})\) which, when used in \(H\) and \(H’\), respectively, to solve for a ground state produce \(E_0\), \(\psi (r)\) and \(E_0’\), \(\psi'(r)\) that have the same one-electron density: \(\int |\psi|^2 dr_2 dr_3 ... dr_N = \rho(\textbf{r})= \int |\psi'|^2 dr_2 dr_3 ... dr_N \).
If we think of \(\psi'\) as trial variational wave function for the Hamiltonian \(H\), we know that \(E_0 < \langle \psi'|H|\psi'\rangle = \langle \psi'|H’|\psi'\rangle + \int \rho(\textbf{r}) [V(\textbf{r}) - V’(\textbf{r})] d^3r = E_0’ + \int \rho(\textbf{r}) [V(\textbf{r}) - V’(\textbf{r})] d^3r\).
Similarly, taking \(\psi\) as a trial function for the \(H’\) Hamiltonian, one finds that \(E_0’ < E_0 + \int \rho(\textbf{r}) [V’(\textbf{r}) - V(\textbf{r})] d^3r\).
Adding the equations in c and d gives \[E_0 + E_0’ < E_0 + E_0’,\] a clear contradiction unless the electronic state of interest is degenerate.

Hence, there cannot be two distinct potentials \(V\) and \(V’\) that give the same non-degenerate ground-state \(\rho(\textbf{r})\). So, the ground-state density \(\rho(\textbf{r})\) uniquely determines \(N\) and \(V\), and thus H. Furthermore, because the eigenfunctions of \(H\) determine all properties of the ground state, then \(\rho(\textbf{r})\), in principle, determines all such properties. This means that even the kinetic energy and the electron-electron interaction energy of the ground-state are determined by \(\rho(\textbf{r})\). It is easy to see that \(\int \rho(\textbf{r}) V(r) d^3r = V[\rho]\) gives the average value of the electron-nuclear (plus any additional one-electron additive potential) interaction in terms of the ground-state density \(\rho(\textbf{r})\). However, how are the kinetic energy \(T[\rho]\) and the electron-electron interaction \(V_{ee}[\rho]\) energy expressed in terms of r?

There is another point of view that I find sheds even more light on why it makes sense that the ground-state electron density \(\rho(\textbf{r})\) contains all the information needed to determine all properties. It was shown many years ago, by examining the mathematical character of the Schrödinger equation, that the ground-state wave function \(\psi_0(r)\) has certain so-called cusps in the neighborhoods of the nuclear centers \(R_a\). In particular \(\psi_0(r)\) must obey

\[\frac{\partial \psi_0(r_1,r_2,\cdots,r_N)}{\partial r_k}=-\frac{m_eZ_ae^2}{\hbar^2}\psi_0(r_1,r_2,\cdots,r_N)\text{ as }\textbf{r}_k \rightarrow \textbf{R}_a\]

That is, the derivative or slope of the natural logarithm of the true ground-state wave function must be as any of the electrons’ positions approach the nucleus of charge \(Z_a\) residing at position \(R_a\). Because the ground-state electron density can be expressed in terms of the ground-state wave function as

it can be shown that the ground-state density also displays cusps at the nuclear centers as \(r \rightarrow R_a\).

where me is the electron mass and e is the unit of charge. So, imagine that you knew the true ground-state density at all points in space. You could integrate the density over all space

to determine how many electrons the system has. Then, you could explore over all space to find points at which the density had sharp points characterized by non-zero derivatives in the natural logarithm of the density. The positions \(R_a\) of such points specify the nuclear centers, and by measuring the slopes in \(\ln(\rho(\textbf{r}))\) at each location, one could determine the charges of these nuclei through

\[{\rm slope}=\left(\dfrac{\partial\ln(\rho(r))}{dr}\right)_{r\rightarrow R_a}=-2\frac{m_eZ_ae^2}{\hbar^2}\]

This demonstrates why the ground-state density is all one needs to fully determine the locations and charges of the nuclei as well as the number of electrons and thus the entire Hamiltonian \(H\).

The main difficulty with DFT is that the Hohenberg-Kohn theorem shows the values of \(T\), \(V_{ee}\), \(V\), etc. are all unique functionals of the ground-state \(\rho\) (i.e., that they can, in principle, be determined once \(\rho\) is given), but it does not tell us what these functional relations are.

To see how it might make sense that a property such as the kinetic energy, whose operator \(-\hbar^2 /2m_e \nabla^2\) involves derivatives, can be related to the electron density, consider a simple system of \(N\) non-interacting electrons moving in a three-dimensional cubic box potential. The energy states of such electrons are known to be

\[E = \frac{\hbar^2}{8m_eL^2} (n_x^2 + n_y^2 +n_z^2 ),\]

where \(L\) is the length of the box along the three axes, and \(n_x\), \(n_y\), and \(n_z\) are the quantum numbers describing the state. We can view \(n_x^2 + n_y^2 +n_z^2 = R^2\) as defining the squared radius of a sphere in three dimensions, and we realize that the density of quantum states in this space is one state per unit volume in the \(n_x\), \(n_y\), \(n_z\) space. Because \(n_x\), \(n_y\), and \(n_z\) must be positive integers, the volume covering all states with energy less than or equal to a specified energy \(E = (h^2/8m_eL^2) R^2\) is 1/8 the volume of the sphere of radius \(R\):

\[\Phi(E) = \frac{1}{8} \frac{4\pi}{3} R^3 = \frac{\pi}{6} \left(\frac{8m_eL^2E}{\hbar^2}\right)^{3/2} \]

Since there is one state per unit of such volume, \(\Phi(E)\) is also the number of states with energy less than or equal to \(E\), and is called the integrated density of states. The number of states \(g(E) dE\) with energy between \(E\) and \(E+dE\), the density of states, is the derivative of \(\Phi\):

\[g(E) = \frac{d\Phi}{dE} = \frac{\pi}{4} \left(\frac{8m_eL^2}{\hbar^2}\right)^{3/2} \sqrt{E} .\]

If we calculate the total energy for these non-interacting \(N\) electrons that doubly occupy all states having energies up to the so-called Fermi energy (i.e., the energy of the highest occupied molecular orbital HOMO), we obtain the ground-state energy:

\[E_0=2\int_0^{E_F} g(E)EdE = \frac{8\pi}{5} \left(\frac{2m_e}{\hbar^2}\right)^{3/2} L^3 E\Phi^{5/2}.\]

The total number of electrons \(N\) can be expressed as

\[N = 2\int_0^{E_F} g(E)dE = \frac{8\pi}{3} \left(\frac{2m_e}{\hbar^2}\right)^{3/2} L^3 E\Phi^{3/2},\]

which can be solved for \(E\Phi\) in terms of \(N\) to then express \(E_0\) in terms of \(N\) instead of in terms of \(E\Phi\):

\[E_0 = \frac{3\hbar^2}{10m_e} \left(\frac{3}{8\pi}\right)^{2/3} L^3 \left(\frac{N}{L^3}\right)^{5/3} .\]

This gives the total energy, which is also the kinetic energy in this case because the potential energy is zero within the box and because the electrons are assumed to have no interactions among themselves, in terms of the electron density \(\rho (x,y,z) = \dfrac{N}{L^3}\). It therefore may be plausible to express kinetic energies in terms of electron densities \(\rho(\textbf{r})\), but it is still by no means clear how to do so for real atoms and molecules with electron-nuclear and electron-electron interactions operative.

In one of the earliest DFT models, the Thomas-Fermi theory, the kinetic energy of an atom or molecule is approximated using the above kind of treatment on a local level. That is, for each volume element in \(\textbf{r}\) space, one assumes the expression given above to be valid, and then one integrates over all \(\textbf{r}\) to compute the total kinetic energy:

\[ T_{\rm TF}[\rho] = \int \frac{3\hbar^2}{10m_e} \left(\frac{3}{8\pi}\right)^{2/3} [\rho(\textbf{r})]^{5/3} d^3r = C_F \int [\rho(\textbf{r})]^{5/3} d^3r ,\]

where the last equality simply defines the \(C_F\) constant. Ignoring the correlation and exchange contributions to the total energy, this \(T\) is combined with the electron-nuclear \(V\) and Coulombic electron-electron potential energies to give the Thomas-Fermi total energy:

\[E_{\rm 0,TF} [\rho] = C_F \int [\rho(\textbf{r})]^{5/3} d^3r + \int V(r) \rho(\textbf{r}) d^3r + e^2/2 \int \frac{\rho(\textbf{r}) \rho(\textbf{r}’)}{|r-r’|} d^3r d^3r’,\]

This expression is an example of how \(E_0\) is given as a local density functional approximation (LDA). The term local means that the energy is given as a functional (i.e., a function of \(\rho\)) which depends only on \(\rho(\textbf{r})\) at points in space but not on \(\rho(\textbf{r})\) at more than one point in space or on spatial derivatives of \(\rho(\textbf{r})\).

Unfortunately, the Thomas-Fermi energy functional does not produce results that are of sufficiently high accuracy to be of great use in chemistry. What is missing in this theory are the exchange energy and the electronic correlation energy. Moreover, the kinetic energy is treated only in the approximate manner described earlier (i.e., for non-interacting electrons within a spatially uniform potential).

Dirac was able to address the exchange energy for the uniform electron gas (\(N\) Coulomb interacting electrons moving in a uniform positive background charge whose magnitude balances the total charge of the \(N\) electrons). If the exact expression for the exchange energy of the uniform electron gas is applied on a local level, one obtains the commonly used Dirac local density approximation to the exchange energy:

\[E_{\rm ex,Dirac}[\rho] = - C_x \int [\rho(\textbf{r})]^{4/3} d^3r,\]

with \(C_x = (3/4) (3/\pi)^{1/3}\). Adding this exchange energy to the Thomas-Fermi total energy \(E_{\rm 0,TF} [\rho]\) gives the so-called Thomas-Fermi-Dirac (TFD) energy functional.

Because electron densities vary rather strongly spatially near the nuclei, corrections to the above approximations to \(T[\rho]\) and \(E_{\rm ex,Dirac}\) are needed. One of the more commonly used so-called gradient-corrected approximations is that invented by Becke, and referred to as the Becke88 exchange functional:

\[E_{\rm ex}({\rm Becke88}) = E_{\rm ex,Dirac}[\rho] -\gamma \int \frac{x^2 r^{4/3}}{1+6 \gamma x \sinh^{-1}(x)} dr,\]

where \(x =r^{-4/3} |\nabla\rho|\), and \(\gamma\) is a parameter chosen so that the above exchange energy can best reproduce the known exchange energies of specific electronic states of the inert gas atoms (Becke finds \(\gamma\) to equal 0.0042). A common gradient correction to the earlier local kinetic energy functional \(T[\rho]\) is called the Weizsacker correction and is given by

\[\delta{T_{\rm Weizsacker}} = \frac{1}{72} \frac{\hbar}{m_e} \int \frac{ | \nabla \rho(\textbf{r})|^2}{\rho(\textbf{r})} dr.\]

Although the above discussion suggests how one might compute the ground-state energy once the ground-state density \(\rho(\textbf{r})\) is given, one still needs to know how to obtain \(\rho\). Kohn and Sham (KS) introduced a set of so-called KS orbitals obeying the following equation:

\[{-\dfrac{\hbar^2}{2m} \nabla^2 + V(r) + e^2 \int \frac{\rho(\textbf{r}’)}{|r-r’|} dr’ + U_{\rm xc}(r) }\phi_J = \varepsilon_j \phi_j ,\]

where the so-called exchange-correlation potential \(U_{xc} (r) = dE_{\rm xc}[\rho]/d\rho(\textbf{r})\) could be obtained by functional differentiation if the exchange-correlation energy functional \(E_{\rm xc}[\rho]\) were known. KS also showed that the KS orbitals {\(\phi_J\)} could be used to compute the density \(\rho\) by simply adding up the orbital densities multiplied by orbital occupancies \(n_j\):

\[\rho(\textbf{r}) = \sum_j n_j |\phi_J(r)|^2\]

(here \(n_j =0,1,\) or 2 is the occupation number of the orbital \(\phi_J\) in the state being studied) and that the kinetic energy should be calculated as

\[T = \sum_j n_j \langle \phi_J(r)| -\dfrac{\hbar^2}{2m} \nabla^2 |\phi_J(r)\rangle \]

The same investigations of the idealized uniform electron gas that identified the Dirac exchange functional found that the correlation energy (per electron) could also be written exactly as a function of the electron density \(\rho\) of the system for this model system, but only in two limiting cases- the high-density limit (large \(\rho\)) and the low-density limit. There still exists no exact expression for the correlation energy even for the uniform electron gas that is valid at arbitrary values of \(\rho\). Therefore, much work has been devoted to creating efficient and accurate interpolation formulas connecting the low- and high- density uniform electron gas. One such expression is

\[E_C[\rho] = \int \rho(\textbf{r}) \varepsilon_c(r) dr,\]

where

\[\varepsilon_c(r) = \dfrac{A}{2}\ln\Big(\dfrac{x}{X}\Big) + \dfrac{2b}{Q} \tan^{-1}\dfrac{Q}{2x+b} -\dfrac{bx_0}{X_0} [\ln\Big(\dfrac{(x-x_0)^2}{X}\Big) +\dfrac{2(b+2x_0)}{Q} \tan^{-1}\dfrac{Q}{2x+b}\]

is the correlation energy per electron. Here \(x = \sqrt{r_s}\), \(X=x^2 +bx+c\), \(X_0 =x_0^2 +bx_0+c\) and \(Q=\sqrt{4c - b^2}\), \(A = 0.0621814\), \(x_0= -0.409286\), \(b = 13.0720\), and \(c = 42.7198\). The parameter \(r_s\) is how the density \(\rho\) enters since \(4/3 \pi r_s^3\) is equal to \(1/\rho\); that is, \(r_s\) is the radius of a sphere whose volume is the effective volume occupied by one electron.

A reasonable approximation to the full \(E_{\rm xc}[\rho]\) would contain the Dirac (and perhaps gradient corrected) exchange functional plus the above \(E_C[\rho]\), but there are many alternative approximations to the exchange-correlation energy functional. Currently, many workers are doing their best to cook up functionals for the correlation and exchange energies, but no one has yet invented functionals that are so reliable that most workers agree to use them.

To summarize, in implementing any DFT, one usually proceeds as follows:

An atomic orbital basis is chosen in terms of which the KS orbitals are to be expanded. Most commonly, this is a Gaussian basis or a plane-wave basis.
Some initial guess is made for the LCAO-KS expansion coefficients \(C_{j,a}: \phi_J = \sum_a C_{j,a} \chi_a\) of the occupied KS orbitals.
The density is computed as \[\rho(\textbf{r}) = \sum_j n_j |\phi_J(r)|^2\] . Often, \(\rho(\textbf{r})\) itself is expanded in an atomic orbital basis, which need not be the same as the basis used for the \(\phi_J\), and the expansion coefficients of \(\rho\) are computed in terms of those of the this new basis. It is also common to use an atomic orbital basis to expand \(\rho^{1/3}(r)\), which, together with \(\rho\), is needed to evaluate the exchange-correlation functional’s contribution to \(E_0\).
The current iteration’s density is used in the KS equations to determine the Hamiltonian \[{-\dfrac{\hbar^2}{2m} \nabla^2 + V(r) + e^2 \int \frac{\rho(\textbf{r}’)}{|r-r’|} dr’ + U_{\rm xc}(r) }\] whose new eigenfunctions {\(\phi_J\)} and eigenvalues {\(\epsilon_J\)} are found by solving the KS equations.
These new \(\phi_J\) are used to compute a new density, which, in turn, is used to solve a new set of KS equations. This process is continued until convergence is reached (i.e., until the \(\phi_J\) used to determine the current iteration’s \(\rho\) are the same \(\phi_J\) that arise as solutions on the next iteration.
Once the converged \(\rho(\textbf{r})\) is determined, the energy can be computed using the earlier expression

\[E [\rho] = \sum_j n_j \langle \phi_J(r)| -\dfrac{\hbar^2}{2m} \nabla^2|\phi_J(r)\rangle + \int V(r) \rho(\textbf{r}) dr + \frac{e^2}{2} \int \frac{\rho(\textbf{r})\rho(\textbf{r}’)}{|r-r’|}dr dr’+ E_{\rm xc}[\rho].\]

Energy Difference Methods

In addition to the methods discussed above for treating the energies and wave functions as solutions to the electronic Schrödinger equation, there exists a family of tools that allow one to compute energy differences directly rather than by finding the energies of pairs of states and subsequently subtracting them. Various energy differences can be so computed: differences between two electronic states of the same molecule (i.e., electronic excitation energies \(\Delta E\)), differences between energy states of a molecule and the cation or anion formed by removing or adding an electron (i.e., ionization potentials (IPs) and electron affinities (EAs)). In the early 1970s, the author developed one such tool for computing EAs (J. Simons, and W. D. Smith, Theory of Electron Affinities of Small Molecules, J. Chem. Phys., 58, 4899-4907 (1973)) and he called this the equations of motion (EOM) method. Throughout much of the 1970s and 1980s, his group advanced and applied this tool to their studies of molecular EAs and electron-molecule interactions.

Because of space limitations, we will not be able to elaborate much in great detail on these methods. However, it is important to stress that:

These so-called EOM or Greens function or propagator methods utilize essentially the same input information (e.g., atomic orbital basis sets) and perform many of the same computational steps (e.g., evaluation of one- and two- electron integrals, formation of a set of mean-field molecular orbitals, transformation of integrals to the MO basis, etc.) as do the other techniques discussed earlier.
These methods are now rather routinely used when \(\Delta E\), IP, or EA information is sought.

The basic ideas underlying most if not all of the energy-difference methods are:

One forms a reference wave function \(\psi\) (this can be of the SCF, MPn, CI, CC, DFT, etc. variety); the energy differences are computed relative to the energy of this function.
One expresses the final-state wave function \(\psi’\) (i.e., that describing the excited, cation, or anion state) in terms of an operator \(\Omega\) acting on the reference \(\psi\): \(\psi’ = \Omega \psi\). Clearly, the \(\Omega\) operator must be one that removes or adds an electron when one is attempting to compute IPs or EAs, respectively.
One writes equations which \(\psi\) and \(\psi’\) are expected to obey. For example, in the early development of these methods, the Schrödinger equation itself was assumed to be obeyed, so \(H\psi = E \psi \) and \(H\psi' = E’ \psi’\) are the two equations.
One combines \(\Omega\psi = \psi’\) with the equations that \(\psi\) and \(\psi’\) obey to obtain an equation that \(\Omega\) must obey. In the above example, one (a) uses \(\Omega\psi = \psi’\) in the Schrödinger equation for \(\psi’\), (b) allows \(\Omega\) to act from the left on the Schrödinger equation for \(\psi\), and (c) subtracts the resulting two equations to achieve \((H\Omega - \Omega H) \psi = (E’ - E) \Omega \psi \), or, in commutator form \([H,\Omega] \psi = \Delta E \Omega \psi\).
One can, for example, express \(\psi\) in terms of a superposition of configurations \(\psi = \sum_J C_J \phi_J\) whose amplitudes \(C_J\) have been determined from a CI or MPn calculation and express \(\Omega\) in terms of operators {\(O_K\)} that cause single-, double-, etc. level excitations (for the IP (EA) cases, \(\Omega\) is given in terms of operators that remove (add), remove and singly excite (add and singly excite, etc.) electrons): \(\Omega = \sum_K D_K O_K\).
Substituting the expansions for \(\psi\) and for \(\Omega\) into the equation of motion (EOM)

\([H,\Omega] \psi = \Delta E \Omega \psi\), and then projecting the resulting equation on the left against a set of functions (e.g., {\(O_{K’} |\psi>\)}) gives a matrix eigenvalue-eigenvector equation

\[\sum_K \langle O_{K’}\psi| [H,O_K] \psi \rangle D_K = \Delta E \sum_K \langle O_{K’}\psi|O_K\psi\rangle D_K\]

to be solved for the \(D_K\) operator coefficients and the excitation (or IP or EA) energies \(\Delta E\). Such are the working equations of the EOM (or Greens function or propagator) methods.

In recent years, these methods have been greatly expanded and have reached a degree of reliability where they now offer some of the most accurate tools for studying excited and ionized states. In particular, the use of time dependent variational principles have allowed a much more rigorous development of equations for energy differences and non-linear response properties. In addition, the extension of the EOM theory to include coupled-cluster reference functions now allows one to compute excitation and ionization energies using some of the most accurate ab initio tools.

Search

Text Color

Text Size

Margin Size

Font Type