19.3: Strengths and Weaknesses of Various Methods


Variational Methods Such as MCSCF, SCF, and CI Produce Energies that are Upper Bounds, but These Energies are not Size-Extensive

Methods that are based on making the energy functional \( \frac{\langle \Psi \big| \text{ H } \big| \Psi \rangle}{\langle \Psi \big| \Psi \rangle} \) stationary (i.e., variational methods) yield upper bounds to the lowest energy of the symmetry that characterizes the CSFs that comprise \(\Psi\). These methods can also provide approximate excited-state energies and wavefunctions (e.g., in the form of other solutions of the secular equation \( \sum\limits_J \text{H}_{I,J}C_J = \text{E } C_I \) that arises in the CI and MCSCF methods). Excited-state energies obtained in this manner can be shown to 'bracket' the true energies of the given symmetry in that between any two approximate energies obtained in the variational calculation, there exists at least one true eigenvalue. This characteristic is commonly referred to as the 'bracketing theorem' (E. A. Hylleraas and B. Undheim, Z. Phys. 65, 759 (1930); J. K. L. MacDonald, Phys. Rev. 43, 830 (1933)). These are strong attributes of the variational methods, as is the long and rich history of developments of analytical and computational tools for efficiently implementing such methods (see the discussions of the CI and MCSCF methods in MTC and ACP).
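The upper-bound behaviour of the secular-equation roots can be checked numerically on a small model. The sketch below is only an illustration under stated assumptions: a random real symmetric matrix stands in for the full Hamiltonian matrix, and a "variational calculation" means diagonalizing it within its first m basis functions. The names `variational_roots`, `check_bounds`, and `h_model` are hypothetical. It tests two properties: each approximate root lies at or above the exact root of the same index, and the roots obtained with m functions separate those obtained with m+1 functions.

```python
import numpy as np

def variational_roots(h_full, m):
    """Roots of the secular problem restricted to the first m basis functions."""
    return np.linalg.eigvalsh(h_full[:m, :m])

def check_bounds(h_full, tol=1e-10):
    """Return True if every truncated calculation obeys the bound properties."""
    n = h_full.shape[0]
    exact = np.linalg.eigvalsh(h_full)  # ascending order
    for m in range(1, n):
        small = variational_roots(h_full, m)
        large = variational_roots(h_full, m + 1)
        # k-th approximate root must not fall below the k-th exact root
        if np.any(small < exact[:m] - tol):
            return False
        # roots from m functions must separate roots from m + 1 functions
        if np.any(large[:-1] > small + tol) or np.any(small > large[1:] + tol):
            return False
    return True

# Assumed model: a random symmetric 8 x 8 matrix, not a real molecular Hamiltonian
rng = np.random.default_rng(0)
a = rng.normal(size=(8, 8))
h_model = (a + a.T) / 2.0

print("bound properties hold:", check_bounds(h_model))
```

Because the bounds hold root by root, enlarging the CSF list can only lower (or leave unchanged) each approximate energy.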

However, all variational techniques suffer from at least one serious drawback: they are not size-extensive (J. A. Pople, pg. 51 in Energy, Structure, and Reactivity, D. W. Smith and W. B. McRae, Eds., Wiley, New York (1973)). This means that the energy computed using these tools cannot be trusted to scale with the size of the system. For example, a calculation performed on two \(CH_3\) species at large separation may not yield an energy equal to twice the energy obtained by performing the same kind of calculation on a single \(CH_3\) species. Lack of size-extensivity precludes these methods from use in extended systems (e.g., solids) where errors due to improper scaling of the energy with the number of molecules produce nonsensical results.
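A minimal numerical sketch of this failure uses a toy model rather than \(CH_3\): each "monomer" is described by just two CSFs, a reference and one doubly excited CSF, with a hypothetical excitation energy `DELTA` and coupling `K` (both values are assumptions chosen for illustration). Two non-interacting monomers are then treated either with all product CSFs or with the quadruply excited product CSF removed.

```python
import numpy as np

DELTA = 1.0  # assumed excitation energy of the doubly excited CSF
K = 0.2      # assumed coupling between reference and excited CSF

# Monomer CI matrix in the basis {reference, doubly excited}
h_mono = np.array([[0.0, K],
                   [K, DELTA]])
e_mono = np.linalg.eigvalsh(h_mono)[0]

# Non-interacting dimer: H = H_a (x) 1 + 1 (x) H_b.
# Basis order: |ref,ref>, |ref,exc>, |exc,ref>, |exc,exc>
eye = np.eye(2)
h_dimer = np.kron(h_mono, eye) + np.kron(eye, h_mono)

e_full = np.linalg.eigvalsh(h_dimer)[0]           # keeps the quadruple |exc,exc>
e_trunc = np.linalg.eigvalsh(h_dimer[:3, :3])[0]  # reference + doubles only

print(f"2 x monomer energy     : {2 * e_mono:.6f}")
print(f"dimer, all product CSFs: {e_full:.6f}")
print(f"dimer, doubles only    : {e_trunc:.6f}")
```

In this model the truncated dimer energy lies above twice the monomer energy, while the untruncated one reproduces it.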

By carefully adjusting the kind of variational wavefunction used, it is possible to circumvent size-extensivity problems for selected species. For example, a CI calculation on \(Be_2\) using all \(^1\Sigma_g\) CSFs that can be formed by placing the four valence electrons into the orbitals \(2\sigma_g \text{, } 2\sigma_u \text{, } 3\sigma_g \text{, } 3\sigma_u \text{, } 1\pi_u \text{, and } 1\pi_g\) can yield an energy equal to twice that of the Be atom described by CSFs in which the two valence electrons of the Be atom are placed into the 2s and 2p orbitals in all ways consistent with a \(^1\)S symmetry. Such special choices of configurations give rise to what are called complete-active-space (CAS) MCSCF or CI calculations (see the article by B. O. Roos in ACP for an overview of this approach).

Let us consider an example to understand why the CAS choice of configurations works. The \(^1\)S ground state of the Be atom is known to form a wavefunction that is a strong mixture of CSFs that arise from the \(2s^2 \text{ and } 2p^2\) configurations:

\[ \Psi_{\text{Be}} = C_1 \big| 1s^2 2s^2 \big| + C_2 \big| 1s^2 2p^2 \big| , \nonumber \]

where the latter CSF is a short-hand representation for the proper spin- and space-symmetry adapted CSF

\[ \big| 1s^2 2p^2 \big| = \dfrac{1}{\sqrt{3}} \left[ \big| 1s\alpha 1s\beta 2p_0 \alpha 2p_0 \beta \big| \text{ - } \big| 1s\alpha 1s\beta 2p_1 \alpha 2p_{-1}\beta \big| \text{ - } \big| 1s\alpha 1s\beta 2p_{-1}\alpha 2p_1\beta \big| \right] . \nonumber \]

The reason the CAS process works is that the Be\(_2\) CAS wavefunction has the flexibility to dissociate into the product of two CAS Be wavefunctions:

\[ \Psi = \Psi_{\text{Bea}} \Psi_{\text{Beb}} = \left[ C_1 \big| 1s^2 2s^2 \big| + C_2 \big| 1s^2 2p^2\big| \right]_a \left[ C_1 \big| 1s^2 2s^2 \big| + C_2 \big| 1s^2 2p^2 \big| \right]_b , \nonumber \]

where the subscripts a and b label the two Be atoms, because the four-electron CAS function distributes the four electrons in all ways among the \(2s_a \text{, } 2s_b \text{, } 2p_a \text{, and } 2p_b\) orbitals. In contrast, if the Be\(_2\) calculation had been carried out using only the CSF \(\big| 1\sigma^2_g \text{ } 1\sigma^2_u \text{ } 2\sigma^2_g \text{ } 2\sigma^2_u \big|\) and all single and double excitations relative to this (dominant) CSF, which is a very common type of CI procedure to follow, the Be\(_2\) wavefunction would not have contained the particular CSFs \(\big| 1s^2 2p^2 \big|_a \big| 1s^2 2p^2 \big|_b\) because these CSFs are four-fold excited relative to the \(\big| 1\sigma^2_g 1\sigma^2_u 2\sigma^2_g 2\sigma^2_u \big|\) 'reference' CSF.
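Multiplying out the product of the two atomic CAS functions makes the excitation levels explicit. The expansion below is only algebra applied to that product, written with the same shorthand CSF symbols:

\[ \Psi = C_1^2 \big| 1s^2 2s^2 \big|_a \big| 1s^2 2s^2 \big|_b + C_1 C_2 \left[ \big| 1s^2 2s^2 \big|_a \big| 1s^2 2p^2 \big|_b + \big| 1s^2 2p^2 \big|_a \big| 1s^2 2s^2 \big|_b \right] + C_2^2 \big| 1s^2 2p^2 \big|_a \big| 1s^2 2p^2 \big|_b . \nonumber \]

The first term is the doubly occupied 2s product, the two cross terms are doubly excited relative to it, and the last term, with amplitude \(C_2^2\), is quadruply excited. A CI limited to single and double excitations omits that last term and so cannot reproduce the product form.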

In general, one finds that if the 'monomer' uses CSFs that are K-fold excited relative to its dominant CSF to achieve an accurate description of its electron correlation, a size-extensive variational calculation on the 'dimer' will require the inclusion of CSFs that are 2K-fold excited relative to the dimer's dominant CSF. To perform a size-extensive variational calculation on a species containing M monomers therefore requires the inclusion of CSFs that are \(M \times K\)-fold excited relative to the M-mer's dominant CSF.
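The \(M \times K\) rule can be illustrated with a toy model, sketched here under explicit assumptions: each monomer has only a reference CSF and one doubly excited CSF (so K = 2), the monomers do not interact, and the excitation energy and coupling passed to `monomer` are hypothetical numbers. The helper names are likewise illustrative. The CI space is truncated by the number of monomers allowed to be excited at once.

```python
import numpy as np

def monomer(delta, coupling):
    """Two-CSF monomer CI matrix: reference and one doubly excited CSF."""
    return np.array([[0.0, coupling], [coupling, delta]])

def noninteracting_hamiltonian(h_mono, m):
    """Kronecker sum of m identical monomer CI matrices."""
    dim = h_mono.shape[0]
    h = np.zeros((1, 1))
    for _ in range(m):
        h = np.kron(h, np.eye(dim)) + np.kron(np.eye(h.shape[0]), h_mono)
    return h

def truncated_energy(h, max_excited_monomers):
    """Lowest CI root keeping product CSFs with a limited number of excited monomers.

    Each set bit in a basis-state index marks one excited monomer, which is
    valid only because every monomer here has exactly two CSFs.
    """
    keep = [i for i in range(h.shape[0])
            if bin(i).count("1") <= max_excited_monomers]
    return np.linalg.eigvalsh(h[np.ix_(keep, keep)])[0]

M = 4                        # number of monomers (assumed)
h_mono = monomer(1.0, 0.2)   # assumed model parameters
e_mono = np.linalg.eigvalsh(h_mono)[0]
h = noninteracting_hamiltonian(h_mono, M)

for k in range(1, M + 1):
    e_k = truncated_energy(h, k)
    print(f"up to {2 * k}-fold excitations: E = {e_k:.6f}   "
          f"M x E(monomer) = {M * e_mono:.6f}")
```

Only the last line of output, where excitations up to \(M \times K = 8\)-fold are kept, matches M times the monomer energy in this model.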

Non-Variational Methods Such as MPPT/MBPT and CC do not Produce Upper Bounds, but Yield Size-Extensive Energies

In contrast to variational methods, perturbation theory and coupled-cluster methods achieve their energies from a 'transition formula' \( \langle \Phi \big| \text{ H } \big| \Psi \rangle \) rather than from an expectation value \( \langle \Psi \big| \text{ H } \big| \Psi \rangle\). It can be shown (H. P. Kelly, Phys. Rev. 131, 684 (1963)) that this difference allows non-variational techniques to yield size-extensive energies. This can be seen in the MPPT/MBPT case by considering the energy of two non-interacting Be atoms. The reference CSF is \( \Phi = \big| 1s_a^2 2s_a^2 1s_b^2 2s_b^2 \big| \); the Slater-Condon (SC) rules limit the CSFs in \(\Psi\) which can contribute to

\[ \text{E} = \langle \Phi \big| \text{ H } \big| \Psi \rangle = \langle \Phi \big| \text{ H } \big| \sum\limits_J C_J \Phi_J \rangle \nonumber \]

to be \(\Phi\) itself and those CSFs that are singly or doubly excited relative to \(\Phi\). These 'excitations' can involve atom a, atom b, or both atoms. However, any CSFs that involve excitations on both atoms (e.g., \(\big| 1s_a^2 2s_a 2p_a 1s_b^2 2s_b 2p_b \big|\)) give rise, via the SC rules, to one- and two-electron integrals over orbitals on both atoms; these integrals (e.g., \(\langle 2s_a 2p_a \big| \text{ g } \big| 2s_b 2p_b \rangle\)) vanish if the atoms are far apart, as a result of which the contributions due to such CSFs vanish in our consideration of size-extensivity. Thus, only CSFs that are excited on one or the other atom contribute to the energy:

\[ \text{E} = \langle \Phi_a \Phi_b \big| \text{ H } \big| \sum\limits_{Ja} C_{Ja} \Phi_{Ja}^{\text{*}}\Phi_b + \sum\limits_{Jb}C_{Jb}\Phi_a \Phi^{\text{*}}_{Jb} \rangle , \nonumber \]

where \(\Phi_a \text{ and } \Phi_b \text{ as well as } \Phi^{\text{*}}_{Ja} \text{ and } \Phi^{\text{*}}_{Jb}\) are used to denote the a and b parts of the reference and excited CSFs, respectively.

This expression, once the SC rules are used to reduce it to one- and two-electron integrals, is of the additive form required of any size-extensive method:

\[ \text{E} = \langle \Phi_a \big| \text{ H } \big| \sum\limits_{Ja}C_{Ja}\Phi_{Ja}^{\text{*}} \rangle + \langle \Phi_b \big| \text{ H } \big| \sum\limits_{Jb} C_{Jb} \Phi_{Jb}^{\text{*}} \rangle , \nonumber \]

and will yield a size-extensive energy if the equations used to determine the \(C_{Ja}\) and \(C_{Jb}\) amplitudes are themselves separable. In MPPT/MBPT, these amplitudes are expressed, in first order, as:

\[ C_{Ja} = \dfrac{\langle \Phi_a \Phi_b \big| \text{ H } \big| \Phi_{Ja}^{\text{*}} \Phi_b \rangle}{\text{E}^0_a + \text{E}^0_b - \text{E}^{\text{*}}_{Ja} - \text{E}^0_b } \nonumber \]

(and analogously for \(C_{Jb}\)). Again using the SC rules, this expression reduces to one that involves only atom a:

\[ C_{Ja} = \dfrac{\langle \Phi_a \big| \text{ H } \big| \Phi_{Ja}^{\text{*}} \rangle}{\text{E}_a^0 - \text{E}_{Ja}^{\text{*}}} . \nonumber \]

The additivity of E and the separability of the equations determining the \(C_J\) coefficients make the MPPT/MBPT energy size-extensive. This property can also be demonstrated for the Coupled-Cluster energy (see the references given above in Chapter 19. I.4). However, size-extensive methods have at least one serious weakness: their energies do not provide upper bounds to the true energies of the system (because their energy functional is not of the expectation-value form for which the upper bound property has been proven).
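Both points, additivity and the loss of the upper bound, can be seen in a toy calculation. The sketch below assumes a model in which each non-interacting monomer has a reference CSF and one doubly excited CSF, with hypothetical parameters `DELTA` and `K`, and it takes the zeroth-order energies from the diagonal of the model CI matrix (an assumption of this sketch, not the Møller-Plesset partitioning itself). The second-order energy is formed by combining the transition formula with the first-order amplitudes.

```python
import numpy as np

def second_order_energy(h):
    """E(2) = sum_J <Phi|H|Phi_J>^2 / (E0 - E_J), with Phi the first basis CSF.

    Zeroth-order energies are taken from the diagonal of h (model assumption).
    """
    e0 = h[0, 0]
    couplings = h[0, 1:]
    denominators = e0 - np.diag(h)[1:]
    return float(np.sum(couplings**2 / denominators))

DELTA = 1.0  # assumed excitation energy
K = 0.2      # assumed coupling

h_mono = np.array([[0.0, K], [K, DELTA]])
eye = np.eye(2)
h_dimer = np.kron(h_mono, eye) + np.kron(eye, h_mono)  # non-interacting dimer

e2_mono = second_order_energy(h_mono)
e2_dimer = second_order_energy(h_dimer)
e_exact_mono = np.linalg.eigvalsh(h_mono)[0]

print(f"E(2) monomer     : {e2_mono:.6f}")
print(f"E(2) dimer       : {e2_dimer:.6f}")
print(f"exact monomer    : {e_exact_mono:.6f}")
```

In this model the dimer value is exactly twice the monomer value, because the quadruply excited product CSF has no matrix element with the reference, and the second-order monomer energy falls below the exact one, so it is not an upper bound.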

Which Method is Best?

At this time, it may not be possible to say which method is preferred for applications where all are practical. Nor is it possible to assess, in a way that is applicable to most chemical species, the accuracies with which various methods predict bond lengths and energies or other properties. However, there are reasons to recommend some methods over others in specific cases. For example, certain applications require a size-extensive energy (e.g., extended systems that consist of a large or macroscopic number of units or studies of weak intermolecular interactions), so MBPT/MPPT or CC or CAS-based MCSCF are preferred. Moreover, certain chemical reactions (e.g., Woodward-Hoffmann forbidden reactions) and certain bond-breaking events require two or more 'essential' electronic configurations. For them, single-configuration-based methods such as conventional CC and MBPT/MPPT should not be used; MCSCF or CI calculations would be better. Very large molecules, in which thousands of atomic orbital basis functions are required, may be impossible to treat by methods whose effort scales as \(N^4\) or higher; density functional methods would be better to use then.

For all calculations, the choice of atomic orbital basis set must be made carefully, keeping in mind the \(N^4\) scaling of the one- and two-electron integral evaluation step and the \(N^5\) scaling of the two-electron integral transformation step. Of course, basis functions that describe the essence of the states to be studied are essential (e.g., Rydberg or anion states require diffuse functions, and strained rings require polarization functions).
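To get a feel for these scalings, the sketch below counts the symmetry-unique two-electron integrals for N real basis functions (using the usual eightfold permutational symmetry) and gives a rough operation count for a four-index transformation carried out as four successive one-index steps. Both function names are hypothetical, and the operation count is an order-of-magnitude assumption that ignores symmetry savings and prefactors.

```python
def unique_two_electron_integrals(n_basis):
    """Unique (ij|kl) for real orbitals: pairs of index pairs, ~ N^4 / 8."""
    pairs = n_basis * (n_basis + 1) // 2
    return pairs * (pairs + 1) // 2

def transformation_operations(n_basis):
    """Rough cost of four one-index transformation steps, each ~ N^5."""
    return 4 * n_basis**5

for n in (50, 100, 200):
    print(f"N = {n:4d}: integrals = {unique_two_electron_integrals(n):>12d}, "
          f"transformation ops ~ {transformation_operations(n):.2e}")
```

Doubling N multiplies the integral count by roughly 16 and the transformation estimate by 32, which is why the basis set must be chosen with care.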

As larger atomic basis sets are employed, the size of the CSF list used to treat dynamic correlation increases rapidly. For example, most of the above methods use singly and doubly excited CSFs for this purpose. For large basis sets, the number of such CSFs, \(N_C\), scales as the number of electrons squared, \(n_e^2\), times the number of basis functions squared, \(N^2\). Since the effort needed to solve the CI secular problem varies as \(N_C^2 \text{ or } N_C^3\), a dependence as strong as \(N^4\) to \(N^6\) can result. To handle such large CSF spaces, all of the multiconfigurational techniques mentioned in this paper have been developed to the extent that calculations involving on the order of 100 to 5,000 CSFs are routinely performed and calculations using 10,000, 100,000, and even several million CSFs are practical.
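The \(n_e^2 N^2\) growth can be made concrete with a simple count. The sketch below counts singly and doubly excited Slater determinants built from spin-orbitals, ignoring spin adaptation and spatial symmetry, so it overcounts relative to a true CSF list; it is meant only to show the scaling. The function name and the electron and orbital numbers are illustrative assumptions.

```python
from math import comb

def singles_and_doubles(n_electrons, n_spin_orbitals):
    """Count singly and doubly excited determinants from one reference."""
    n_virtual = n_spin_orbitals - n_electrons
    singles = n_electrons * n_virtual
    doubles = comb(n_electrons, 2) * comb(n_virtual, 2)
    return singles, doubles

n_e = 10  # assumed number of electrons
for n_so in (50, 100, 200):
    singles, doubles = singles_and_doubles(n_e, n_so)
    n_c = singles + doubles
    print(f"spin-orbitals = {n_so:4d}: singles = {singles:6d}, "
          f"doubles = {doubles:9d}, N_C^2 ~ {n_c**2:.2e}")
```

Doubling the number of orbitals increases the doubles count by a factor somewhat above four, so an effort that varies as \(N_C^2\) grows by more than a factor of sixteen.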

Other methods, most of which can be viewed as derivatives of the techniques introduced above, have been and are still being developed. This ongoing process has been, in large part, stimulated by the explosive growth in computer power and change in computer architecture that has been realized in recent years. All indications are that this growth pattern will continue, so ab initio quantum chemistry will likely have an even larger impact on future chemistry research and education (through new insights and concepts).
