# Exercise I: Structure and Electronic Energy of a Small Molecule

- Page ID
- 94998

## Introduction

In this exercise the equilibrium structure and the electronic energy of a small molecule (or molecules) are calculated using various ab initio methods. This exercise has seven basic goals:

- to get to know the computing environment used in ab initio calculations
- to learn how to connect to a remote computer from the computers at the university and how to use the unix environment.
- to learn how to construct a simple z-matrix.
- to obtain basic knowledge of the Molpro program, so that you can use it to perform your ab initio calculations.
- to familiarize yourself with various ab initio methods encountered in the lectures such as HF, MP2, QCISD, CCSD(T) and B3LYP, and to let you understand their accuracy and capabilities with respect to experimental data.
- to familiarize yourself with some of the common ab initio basis set acronyms, to give you some hints on the methods which are used to obtain them, and to give you an idea of the computational efforts to obtain accurate results.
- to apply these methods and basis sets for the calculation of the equilibrium structures and electronic energies for H2, CO, H2, H2O2 and NH3.

## 2 A Quick Theoretical Overview

### 2.1 Ab initio methods

The term *ab initio* means from first principles. This does not mean we are solving the Schr¨odinger equation exactly. Rather, we are selecting a method that, in principle, can lead to a reasonable approximation to the solution of the Schr¨odinger equation, and then selecting a basis set that will implement that method in a reasonable way. By reasonable, we mean that the results are adequate for the application at hand. A method and basis set that is adequate for one application may be inadequate for another. We also have to take into 1 account the cost of doing calculations and the total amount of time required. A wide range of methods have been employed, but in this exercise we will restrict ourselves to one density funtional method and some commonly used methods that use molecular orbital theory (i.e. Hartree-Fock). The methods used in this exercise are the following:

- HF
- MP2
- QCISD
- CCSD(T)
- B3LYP(To use B3LYP you should type the following for method in the work.com file $functional=b3lyp rks)

In the basic HF (Hartree-Fock) calculation each electron is considered in the average field caused by all the other electrons and the nuclei. The next three methods try to improve the Hartree-Fock electronic wavefunction: the MP2 (Møller-Plesset) method, employs perturbation theory at the second order. The QCISD (Quadratic Configuration Interaction of Single and Double excitations) calculates the electronic wavefunction as a linear combination of Hartree-Fock determinants in which all the single and double excitations are included. The CCSD(T) (Coupled-Cluster Single, Double, and perturbative Triple excitations) is based on an exponential approach. B3LYP is a density functional method; the electronic energy is expressed using the electronic density, not the wave function.

### 2.2 Basis functions

Historically, quantum calculations for molecules were performed as LCAO MO, i.e. Linear Combination of Atomic Orbitals - Molecular Orbitals:

\[ψ_i = \sum_{j=1}^n c_{ij}φ_{j} \label{1}\]

where ψi is the i-th molecular orbital, cij are the coefficients of linear combination, φj is the j-th atomic orbital, and n is the number of atomic orbitals. Atomic 2 Orbitals (AO) are solutions of the Hartree-Fock equations for the atom, i.e. a wave functions for a single electron in the atom. More recently, the term atomic orbital has been replaced by ”basis function” or ”contraction”, when appropriate.

Slater-Type-Orbitals (STOs) are similar to the AOs of the hydrogen atom. They are described by a function that depends on spherical coordinates:

\[φ_i (ζ, n, l; r, θ, φ) = Nr^{n−1} e^{−ζr}Y_l^m (θ, φ) \label{2}\]

where \(N\) is the normalization constant and ζ is the exponential factor. The variables r, θ, and φ are spherical coordinates, and Ylm is the angular momentum part (function describing ”shape”). The integers n, l, and m are quantum numbers: principal, angular momentum, and magnetic, respectively. Unfortunately, STOs are not suitable for fast calculations of necessary two-electron integrals. That is why the Gaussian-Type-Orbitals (GTOs) were introduced. They can be used to approximate the shape of the STO function by summing up a number of GTOs with different exponents and coefficients. You can use several GTOs (4-5) to represent an STO, and still calculate integrals faster than if the original STOs were used. The GTO is expressed as:

\[φ GT O (α, l, m, n; x, y, z) = Ne−αr2 x l y mz n (3)\]

where N is the normalization constant and α is the exponential factor. The variables x, y, and z are cartesian coordinates. The GTOs are not really orbitals, they are simpler functions. In recent literature they have been called gaussian primitives.

In order to describe the shapes of the molecular orbitals correctly, several basis functions (or contractions) are required for accurate treatment in molecules. Basis sets that possess more than one contraction to describe each electron are called extended basis sets. There are several types of extended basis sets:

- Double-Zeta, Triple-Zeta,... Multiple contractions are used to represent a single Slater orbital.
- Split-Valence Multiple contractions are used for only the valence (outer) orbitals. 3
- Polarized Sets The basic picture of atomic orbitals existing only as s, p, d, f etc. is modified by mixing the different types. This treatment takes into account the polarization effect which distorts the shape of the atomic orbitals when atoms are brought together.
- Diffuse Sets These basis sets utilize very small exponents to clarify the properties of the tail of the wave function. When an atom is in an anion, in an excited state or when long range effects are important for some other reason, the tail of the wave function becomes important. In this exercise we will focus on only five different basis sets. These are
- STO-3G: Minimal basis set. Each Slater-Type-Orbital expressed by 3 gaussians.
- 6-31G(d) (or 6-31G*): Pople’s (Split Valence) basis set which includes 6 gaussian primitives for the inner shells and uses the 31 contraction scheme for the valence electrons. In this contraction scheme each orbital is described by two basis functions, the first of which includes three gaussian primitives and the second of which contains one. Adding a single polarization function to 6-31G (i.e. 6-31G(d)) will result in one d function for first and second row atoms and one f function for first transition row atoms, since d functions are already present for the valence electrons in the latter.
- aug-cc-pVXZ(X=D,T): Dunning’s correlation consistent basis set. Includes polarization functions by definition. The terms DZ (double zeta) and TZ (triple zeta) mean that two (DZ) and three (TZ) contractions are used to represent a single Slater orbital. These basis sets are augmented with diffuse functions by including the aug- prefix to the basis set keyword. However, the elements He, Mg, Li, Be, and Na do not have diffuse functions defined within this basis set. 4

## 3 Practical Issues

### 3.1 The Computing Environment

Many of the ab initio programs have such high capacity and memory requirements that it is unfeasible to run them on a standard desktop computer. In addition, some of the ab initio programs are so expensive that most users cannot afford them. Because of these reasons, ab initio calculations are in general performed on supercomputers or computer clusters. In this exercise the calculations will be performed on Hippu which is one of the supercomputers of the Finnish CSC-IT Center for Science. This computer is located in Espoo and will be accessed by remote connection from the student’s computers. (Further information on Hippu and CSC in general can be found in http://www.csc.fi/english/english/pages/hippu guide and http://www.csc.fi)

### 3.2 Connecting to a CSC Supercomputer

The connecting procedure differs whether you are using a unix or windows computer.

### 3.2.1 Unix Systems

- Open Konsole. (Applications → System Tools → Konsole)
- Type ssh -x jlkX@hippu.csc.fi where X = 01, 02, . . . , 16. The password will be given at the start of the exercise.

### 3.2.2 Windows Systems

- Open WinaXe. (Start → All Programs → WinaXe v7.1 → Xsession) Open PuTTY. (Start → All Programs → PuTTY → PuTTY) On the PuTTY window select SSH → X11 from the panel on the left and check the box for Enable X11 Forwarding. Pick the session interleaf at the very beginning of the panel. Write hippu.csc.fi below Host Name (or IP adress).
- Answer yes to any questions 5
- write your user id to the login field. The user id is of the form jlkX (X = 01, 02, ..., 16). The password will be given at the start of the exercise. 3.3 The Unix Environment 3.3.1 Essential Unix Commands mkdir creates a new directory (mkdir folder1 )
- cd changes from one directory to another (cd folder1 )
- cp copies to a folder (cp work.com folder1 )
- mv moves to a folder (mv work.com folder1 ) rm removes a file (rm work.com) pwd returns the directory you are currently in ll and ls return the contents of the current directory -r extends the cp or rm command to the contents of the whole directory (cp -r folder1 folder2 ) 3.3.2 Moving Between Directories with the cd Command
- cd exercise1 moves you to the exercise1 directory cd exercise1/task1 moves you to the task1 directory within the exercise1 directory cd .. moves you up one directory from the present directory
- cd ../.. moves you up two directories from the present directory cd ../exercise2 moves you up one directory from your present directory and to exercise2 directory therein
- cd $WRKDIR moves you to you work directory where all the computations will be performed. 6 3.4 Nano the Text Editor and Using the less Command The work file work.com can be opened with the command nano work.com. You can exit the text editor by simultaneously pressing CTRL and X. The program asks whether you want to save your files to which you should replay y. You can view your work file with the command less work.com. Pressing SHIFT and G simultaneously will take you to the end of the file. You can exit the viewer by typing q.

### 3.5 Molden

We will use the Molden program to visualize our results. To use molden you must first type the command module load molden followed by molden work.molden to open the program.

### 3.6 Using Molpro

- All the computations will be performed in the work directory accessible via the command cd $WRKDIR.
- Before you perform any calculations you must call molpro with the command module load molpro
- The file work.com will tell the program what to calculate.
- You can start the work by typing molpro work.com. The command molpro work.com & & will perform the calculation on the background. You can view ongoing computations with the command ps You can stop any ongoing process by pressing CTRL and C. (This is useful if you spot an error in your work file.)
- Molpro automatically makes the files work.out and work.xml
- After the calculation you should visualize your results with Molden

### 3.7 A Sample work.com File

With this sample file the geometry optimization of the H2 molecule can be performed. The texts after # are comments and can be left out of your file. ***, H2, #THIS IS HOW THE WORK NAME IS GIVEN ang #THE BOND LENGTHS WILL BE GIVEN IN Å geometry={ H1; H2,H1,r12} #THIS Z-MATRIX IDENTIFIES THE MOLECULE UNDER INVESTIGATION r12=0.74144 #HERE ARE THE STARTING VALUES FOR THE VARIABLES WITHIN THE Z-MATRIX basis=STO-3G #THE BASIS SET IS GIVEN HERE hf ccsd(t) #THIS IS THE COMPUTATIONAL METHOD USED. ALL METHODS EMPLOYING THE HF DETERMINANT AS A STARTING POINT SHOULD DO A HF CALCULATION FIRST optg #THE COMMAND TO OPTIMIZE GEOMETRY. IF THIS IS MISSING, ELECTRONIC ENERGY WILL BE COMPUTED ONLY WITH THE GIVEN GEOMETRY put,molden,h2.molden #MAKES AN INPUT FILE FOR MOLDEN 8

### 3.8 The Structure of the Z-Matrix

In general the z-matrix has the following structure

geometry={

Atom1;

Atom2,Atom1,r12;

Atom3,Atom1,r13,Atom2,a312;

Atom4,Atom3,r43,Atom1,a431,Atom2,t4312;

Atom5,Atom3,r53,Atom1,a531,Atom4,t5314}

Where Atomx should be replaced by the chemical symbol for the atom in question (O for oxygen etc.). To differentiate between atoms of the same kind, a serial number can be inserted behind the symbol (such as H1 and H2 in the sample work.com file). The letters r stand for bond lengths, a stand for bond angles and t for dihedral angles. For example r12 is the bond length between atoms 1 and 2, a312 is the bond angle between atoms 3, 1 and 2 (with atom 1 in the middle) and t4312 is the dihedral angle between atoms 4,3,1 and 2.

### 3.9 The Results

You can collect some of your results to the table below. Feel free to extend the table for the other calculations you perform. If you have time you may try other methods besides those described in this exercise, such as the explicitly correlated CCSD(T)-F12a-method (or if you are looking for a real challenge the local DFSCS-LMP2 method). In these cases you should consult the Molpro manual at http://www.molpro.net/info/current/doc/manual.pdf

Molecule | HF/STO-3G | MP2/6-31G(d) | CCSD(T)/AVTZ | Experimental |

Coord1 | ||||

Coord2 | ||||

Energy | ||||

time |

## 4 Experimental structures of some molecules:

- \(\ce{H2}\): r=0.74144 Å
- \(\ce{CO}\): r=1.12832 Å
- \(\ce{H2O}\): r=0.9578 Å and a=104.48°
- \(\ce{H2O2}\): r(O-H)=0.967 Å, r(O-O)=1.4556 Å, a(O-O-H)=102.32°, and a(dihedral)=113.70°
- \(\ce{NH3}\): r(N-H)=1.016, and a(H-N-H)=106.7 °