gaussian

About

Gaussian is a quantum chemistry program produced by Gaussian, Inc. Since Gaussian is commercially licensed software, only users from an institution that has purchased a license can use Gaussian, or any additionally licensed special features, on that institution's LONI machine.

Version and Availability

Softenv Keys for gaussian on pandora
Machine     Version       Softenv Key
pandora     03            +gaussian-03
pandora     09            +gaussian-09
pandora     09            +gaussian-09-D01
pandora     4.1.2         +gaussian-view-4.1.2
pandora     5.0.8         +gaussian-view-5.0.8

Softenv Keys for gaussian on all clusters
Machine     Version       Softenv Key
eric        03            +gaussian-03
eric        09            +gaussian-09
eric        09            +gaussian-09-C01
eric        09            +gaussian-09-D01
oliver      03            +gaussian-03
louie       03            +gaussian-03
poseidon    09            +gaussian-09-D01
painter     03            +gaussian-03
painter     09            +gaussian-09
painter     09 C01        +gaussian-09-C01
philip      03            +gaussian-03
philip      09            +gaussian-09
philip      09 Rev C01    +gaussian-09-C01
philip      09 Rev D01    +gaussian-09-D01
pandora     03            +gaussian-03
pandora     09            +gaussian-09
pandora     09            +gaussian-09-D01
pandora     4.1.2         +gaussian-view-4.1.2
pandora     5.0.8         +gaussian-view-5.0.8
supermike2  09            +gaussian-09
supermike2  09            +gaussian-09-C01
supermike2  09            +gaussian-09-D01

Softenv FAQ

The information here is applicable to LSU HPC and LONI systems.

Softenv

SoftEnv is a utility that helps users manage complex user environments with potentially conflicting application versions and libraries.

System Default Path

When a user logs in, the system /etc/profile or /etc/csh.cshrc (depending on login shell, and mirrored from csm:/cfmroot/etc/profile) calls /usr/local/packages/softenv-1.6.2/bin/use.softenv.sh to set up the default path via the SoftEnv database.

SoftEnv looks for a user's ~/.soft file and updates the variables and paths accordingly.

Viewing Available Packages

Using the softenv command, a user may view the list of available packages. Currently, there is no guarantee that the packages shown are actually available or working on a particular machine. Every attempt is made to present an identical environment on all of the LONI clusters, but sometimes this is not the case.

Example:

$ softenv
These are the macros available:
*   @default
These are the keywords explicitly available:
+amber-8                       Applications: 'Amber', version: 8 Amber is a
+apache-ant-1.6.5              Ant, Java based XML make system version: 1.6.
+charm-5.9                     Applications: 'Charm++', version: 5.9 Charm++
+default                       this is the default environment...nukes /etc/
+essl-4.2                      Libraries: 'ESSL', version: 4.2 ESSL is a sta
+gaussian-03                   Applications: 'Gaussian', version: 03 Gaussia
....

Listing of Available Packages

See Packages Available via SoftEnv on LSU HPC and LONI.

For a more accurate, up to date list, use the softenv command.

Caveats

Currently there are some caveats to using this tool.

  1. The packages listed may be out of sync with what is actually available.
  2. The resoft and soft utilities may not be available; in that case, log out and log back in after modifying the ~/.soft file to update the environment.

Availability

softenv is available on all LSU HPC and LONI clusters to all users in both interactive login sessions (i.e., just logging into the machine) and the batch environment created by the PBS job scheduler on Linux clusters and by LoadLeveler on AIX clusters.

Packages Availability

This information can be viewed using the softenv command:

% softenv

Managing Environment with SoftEnv

The file ~/.soft in the user's home directory is where the different packages are managed. Add the +keyword to your .soft file. For instance, if one wants to add the Amber Molecular Dynamics package to their environment, the end of the .soft file should look like this:

+amber-8

@default

To update the environment after modifying this file, one simply uses the resoft command:

% resoft
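
The edit to ~/.soft described above can also be scripted. The following is a minimal sketch, not part of SoftEnv itself: the function name add_soft_key is hypothetical, and it assumes the file should keep each key on its own line ahead of the @default macro, as in the example above.

```shell
# add_soft_key KEY [SOFT_FILE]
# Hypothetical helper: put KEY on its own line ahead of @default,
# unless the file already contains it. SOFT_FILE defaults to ~/.soft.
add_soft_key() (
    key=$1
    soft_file=${2:-$HOME/.soft}
    [ -f "$soft_file" ] || : > "$soft_file"

    # Nothing to do if the key is already present as a whole line.
    if grep -qxF -e "$key" "$soft_file"; then
        exit 0
    fi

    if grep -qxF -e "@default" "$soft_file"; then
        # Insert the key just before the first @default line.
        awk -v k="$key" '
            !seen && $0 == "@default" { print k; seen = 1 }
            { print }
        ' "$soft_file" > "$soft_file.new" && mv "$soft_file.new" "$soft_file"
    else
        # No @default line: append the key at the end.
        printf '%s\n' "$key" >> "$soft_file"
    fi
)
```

For instance, add_soft_key +gaussian-09 would add the Gaussian 09 key; the environment still has to be refreshed afterwards as described above.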

Usage

Gaussian is run from the command line and does not provide a graphical interface, so interactive and batch job usage is the same. The TCP Linda extension is required to run Gaussian in parallel on more than one node per job. Currently only LSU has such a license.

Please refer to the FAQ on Common Problems below or the Gaussian User Manual for the memory requirements of your gaussian job.

An input file is used to specify the desired calculations. It may be as simple as:

  
%chk=water.chk

# HF/6-31G(d)     

water energy              Title section

0   1            
O  -0.464   0.177   0.0
H  -0.464   1.137   0.0
H   0.441  -0.143   0.0

Please refer to the program documentation for details.

Once an input file has been created, the next step is creating a PBS job file.

QSub FAQ

Portable Batch System: qsub

qsub

All HPC@LSU clusters use the Portable Batch System (PBS) for production processing. Jobs are submitted to PBS using the qsub command. A PBS job file is basically a shell script which also contains directives for PBS.

Usage
$ qsub job_script

Where job_script is the name of the file containing the script.

PBS Directives

PBS directives take the form:

#PBS -X value

Where X is one of many single letter options, and value is the desired setting. All PBS directives must appear before any active shell statement.
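
The ordering rule can be checked mechanically before a job is submitted. The sketch below is an illustration only and not a PBS tool: the function name check_pbs_directives is hypothetical, and it assumes that any line which is neither blank nor a comment counts as an active shell statement.

```shell
# check_pbs_directives SCRIPT
# Prints every "#PBS" line (optionally indented) that follows the first
# active shell statement, and returns non-zero if any were found.
check_pbs_directives() (
    awk '
        /^[ \t]*$/ { next }            # skip blank lines
        /^[ \t]*#PBS/ {
            if (active) {
                printf "line %d: %s\n", NR, $0
                bad = 1
            }
            next
        }
        /^[ \t]*#/ { next }            # skip comments and the shebang
        { active = 1 }                 # anything else is a real command
        END { exit bad }
    ' "$1"
)
```

Running check_pbs_directives job_script before qsub job_script would print nothing and return success when every directive is in the right place.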

Example Job Script
 #!/bin/bash
 #
 # Use "workq" as the job queue, and specify the allocation code.
 #
 #PBS -q workq
 #PBS -A your_allocation_code
 # 
 # Assuming you want to run 16 processes, and each node supports 4 processes, 
 # you need to ask for a total of 4 nodes. The number of processes per node 
 # will vary from machine to machine, so double-check that you have the right 
 # values before submitting the job.
 #
 #PBS -l nodes=4:ppn=4
 # 
 # Set the maximum wall-clock time. In this case, 10 minutes.
 #
 #PBS -l walltime=00:10:00
 # 
 # Specify the name of a file which will receive all standard output,
 # and merge standard error with standard output.
 #
 #PBS -o /scratch/myName/parallel/output
 #PBS -j oe
 # 
 # Give the job a name so it can be easily tracked with qstat.
 #
 #PBS -N MyParJob
 #
 # That is it for PBS instructions. The rest of the file is a shell script.
 # 
 # PLEASE ADOPT THE EXECUTION SCHEME USED HERE IN YOUR OWN PBS SCRIPTS:
 #
 #   1. Copy the necessary files from your home directory to your scratch directory.
 #   2. Execute in your scratch directory.
 #   3. Copy any necessary files back to your home directory.

 # Let's mark the time things get started.

 date

 # Set some handy environment variables.

 export HOME_DIR=/home/$USER/parallel
 export WORK_DIR=/scratch/myName/parallel
 
 # Set a variable that will be used to tell MPI how many processes will be run.
 # This makes sure MPI gets the same information provided to PBS above.

 export NPROCS=`wc -l $PBS_NODEFILE |gawk '//{print $1}'`

 # Copy the files, jump to WORK_DIR, and execute! The program is named "hydro".

 cp $HOME_DIR/hydro $WORK_DIR
 cd $WORK_DIR
 mpirun -machinefile $PBS_NODEFILE -np $NPROCS $WORK_DIR/hydro

 # Mark the time processing ends.

 date
 
 # And we're out'a here!

 exit 0

An example batch file follows:

 #!/bin/tcsh
 #PBS -A your_allocation
 # specify the allocation. Change it to your allocation
 #PBS -q checkpt
 # the queue to be used. 
 #PBS -l nodes=1:ppn=4
 # Number of nodes and processors
 #PBS -l walltime=1:00:00
 # requested Wall-clock time.
 #PBS -o g09_output
 # name of the standard out file to be "g09_output".
 #PBS -j oe
 # standard error output merge to the standard output file.
 #PBS -N g09test
 # name of the job (that will appear on executing the qstat command).
 #
 # cd to the directory with Your input file
 cd ~USER/g09test
 #  
 # Change this line to reflect your input file and output file
 g09 water.inp

Multi-node Job Submission

Only LSU has the TCP Linda license, so the following applies only to LSU users. No one else is permitted to run gaussian jobs on more than one node.

There are two ways to run Gaussian with Linda across multiple nodes. Both rely on the %LindaWorkers Link 0 command, which has the form:

%LindaWorkers=node1[:#procs],node2[:#procs]

For example, %LindaWorkers=Johnny:4,Cash:2 will spawn one 4-way SMP worker on Johnny and one 2-way SMP worker on Cash.

A couple of notes:

  1. [:#procs] is optional if all nodes are equal.
  2. One can modify this information to use only 1 SMP node. You would be using %LindaWorkers=1, which is the default value.

1. The first method uses a submission script and the standard G'09 input file.

The input file must contain the Link 0 command "%lindaworkers=LINDA". Below is the submission script.

#!/bin/tcsh
#PBS -q checkpt
# the queue to be used.
#
#PBS -l nodes=2:ppn=4
#  
#PBS -l cput=00:20:00
# requested CPU time.
#
#PBS -l walltime=00:20:00
# requested Wall-clock time.
#
#PBS -o output-file_nodes
# name of the standard out file to be "output-file_nodes".
#
#PBS -j oe
# standard error output merge to the standard output file.
#PBS -V
#PBS -N jobtest
# name of the job 

set NPROCS=`wc -l $PBS_NODEFILE |gawk '//{print $1}'`
set LINDA=`cat $PBS_NODEFILE | uniq | tr '\n' "," | sed 's|,$||' `
setenv GAUSS_SCRDIR /work/YOUR_USERNAME/
source $g09root/g09/bsd/g09.login
#move to directory with input file
cd /PATH/TO/INPUT
cat INPUT_FILE.inp | sed "s/LINDA/$LINDA/" > temp$$.inp
g09 < temp$$.inp > OUTPUT_FILE.log
rm -f temp$$.inp

With this approach, the script creates a temporary input file named temp$$.inp, where $$ expands to the process ID of the shell running the script, which keeps the file name unique for each job.
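
The node-list substitution can be tried outside of a batch job. The following is a sketch in POSIX shell rather than tcsh; the function names linda_workers and fill_linda are hypothetical, and the sed pattern is deliberately stricter than the one in the script above, matching only a line that reads exactly %lindaworkers=LINDA.

```shell
# linda_workers NODEFILE
# Prints the distinct host names from a PBS-style node file as one
# comma-separated line, keeping their original order.
linda_workers() (
    uniq "$1" | paste -sd, -
)

# fill_linda TEMPLATE NODEFILE
# Writes TEMPLATE to stdout with the placeholder on the
# %lindaworkers=LINDA line replaced by the node list.
fill_linda() (
    nodes=$(linda_workers "$2")
    sed "s/^%lindaworkers=LINDA\$/%lindaworkers=$nodes/" "$1"
)
```

Inside a job, fill_linda INPUT_FILE.inp "$PBS_NODEFILE" > temp$$.inp would play the same role as the cat and sed line in the script above.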

In the sample input above, INPUT_FILE.inp, the Link 0 directives should be

%chk=/work/mmcken6/g09.chk
%mem=16mw
%nprocs=4
%lindaworkers=LINDA
(rest of input)

2. The second method uses the %lindaworkers command with the input file embedded in the submission script.

In the example below, the variable $LINDA is set to the comma-separated node list and substituted into the embedded input.

#!/bin/tcsh
#PBS -q checkpt
#PBS -l nodes=2:ppn=4
#PBS -l cput=00:20:00
#PBS -l walltime=00:20:00
#PBS -o output-file
#PBS -j oe
#PBS -N jobtest

set NPROCS=`wc -l $PBS_NODEFILE |gawk '//{print $1}'`

set LINDA=`cat $PBS_NODEFILE | uniq | tr '\n' "," | sed 's|,$||' `

cd /work/USER
# Change this line to reflect your input file and output file
g09 << END > gaussian.log
%mem=16mw
%nprocs= 4
%lindaworkers=$LINDA
#p rb3lyp/3-21g force test scf=novaracc
Valinomycinforce2
0,1
O,-1.3754834437,-2.5956821046,3.7664927822
… rest of input …
<< Gaussian's typical ending blank line >>
END

The execution line, "g09 << END > gaussian.log", means the following:

Everything between the '<< END' line and the closing 'END' is passed to g09 as regular Gaussian input, and '>' directs the output to the file gaussian.log.
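
Variable substitution inside a here-document depends on whether the delimiter is quoted. The sketch below shows this in POSIX shell, with cat standing in for g09 so that it can be run anywhere; the function names render_input and render_literal are hypothetical.

```shell
# render_input WORKERS
# The delimiter END is unquoted, so the shell replaces $LINDA
# before the text reaches the command.
render_input() (
    LINDA=$1
    cat << END
%mem=16mw
%lindaworkers=$LINDA
END
)

# render_literal
# The delimiter is quoted, so $LINDA is passed through unchanged.
render_literal() (
    cat << 'END'
%lindaworkers=$LINDA
END
)
```

The submission script above relies on the first behavior: because END is unquoted there, the node list stored in $LINDA ends up in the input that Gaussian reads.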

Resources

Common Problems FAQ

Gaussian Common Problems

There are a few common Gaussian problems that can be easily resolved. These issues usually stem from disk or memory space limitations.

Memory Requirements

%Mem=N sets the amount of dynamic memory used to N 8-byte words (default); this value may also be followed by KB,MB,GB,KW,MW or GW (without intervening spaces) to specify units of kilo-, mega- or giga- bytes or words. The default memory size is 256 MB.

All LONI clusters and the LSU HPC Tezpur cluster have only 4 GB of RAM per node. For jobs on these clusters, the value of N should not be greater than 3500MB or 450MW.

LSU HPC clusters such as Philip, Pandora and SuperMike II have 24/48/96, 128 and 32 GB RAM per node respectively. The maximum value of N should be 120GB or 15GW on Pandora, 28GB or 3500MW on SuperMike II and 20/40/90GB on Philip (depending on queue).

If you use a value of N greater than these values, your job will use virtual memory. This not only makes the job run slower but also causes excessive swapping, which can bring down the node. If your jobs repeatedly use more memory than is available on the node and/or bring down the compute node, your privileges to use the cluster will be suspended.

LSU HPC users have access to TCP Linda to run gaussian jobs on multiple nodes. Note that the %Mem=N sets the amount of dynamic memory per node and not total memory for the job, so the maximum value of N should be the same as described above.

You can estimate the amount of memory in 8-byte words that your job will require using the formula

$$N = M + 2 N_B^2$$

where $N_B$ is the number of basis functions used in the calculation, and M is a minimum value that is usually generously covered by the default memory size.
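
As a worked example of the estimate, the sketch below evaluates M + 2*NB^2 with shell integer arithmetic. The function names are hypothetical, and M defaults to 0 here only because the text gives no specific value for it, so by default the result is just the basis-set term.

```shell
# estimate_mem_words NB [M]
# Prints M + 2*NB^2, the estimated memory in 8-byte words.
# M defaults to 0 (an assumption); pass a baseline if you know one.
estimate_mem_words() (
    nb=$1
    m=${2:-0}
    echo $(( m + 2 * nb * nb ))
)

# words_to_bytes WORDS
# Converts a count of 8-byte words to bytes.
words_to_bytes() (
    echo $(( $1 * 8 ))
)
```

With 1000 basis functions, estimate_mem_words 1000 gives 2000000 words, which words_to_bytes turns into 16000000 bytes on top of whatever M is.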

Please refer to Gaussian manual on Link 0 Commands and Efficiency Considerations for more details.

Scaling

First one needs to understand the basic run-time needs of Gaussian calculations. The table below shows the formal scaling behavior of Gaussian methods, where N is the number of basis functions. Use this table to determine how much more work will be required, compared to current selections, if N is increased (e.g. if the behavior is N^4, doubling N would result in 16 times more work).

Scaling Behavior    Method(s)
N^4                 HF
N^5                 MP2
N^6                 MP3, CISD, CCSD, QCISD
N^7                 MP4, CCSD(T), QCISD(T)
N^8                 MP5, CISDT, CCSDT
N^9                 MP6
N^10                MP7, CISDTQ, CCSDTQ
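
The growth in work implied by the scaling exponents can be computed directly. This is a small sketch with a hypothetical function name, relative_work, limited to integer exponents and factors; it reflects formal scaling only, not measured run times.

```shell
# relative_work EXPONENT FACTOR
# Prints FACTOR raised to EXPONENT: how much the formal work grows when
# the number of basis functions is multiplied by FACTOR.
relative_work() (
    exponent=$1
    factor=$2
    result=1
    i=0
    while [ "$i" -lt "$exponent" ]; do
        result=$(( result * factor ))
        i=$(( i + 1 ))
    done
    echo "$result"
)
```

For example, relative_work 4 2 prints 16 for an HF calculation with twice the basis functions, while relative_work 7 2 prints 128 for CCSD(T).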

Large files and memory usage

Computational cost and demand increase quickly when trying to obtain accuracies better than the MP2 level. On the other hand, one can supply a large molecule at a lower level of theory and still come across the same disk/memory errors.

If one has a large model and needs a good electron correlation method, starting the calculation directly from a default initial guess wave function will likely cause it to fail instantly. A typical route to such accuracies with a large model begins with a good initial guess of the wave function at a lower level of theory. In this method, one takes the orbital coefficients from the lower-level calculation, projects them onto a larger basis set, and uses the result as the initial guess for the higher level of theory. Every chemical model is different; care needs to be taken at each step, and it may even be worth repeating the calculation with a different set of inputs to see if it converges properly.

For instance, to run a large model at the MP2/6-311G** level of theory:

  1. Optimize the wave function at the HF/3-21G level
  2. Re-optimize at the MP2/6-31G* level
  3. Re-optimize at the MP2/6-311G** level

When restarting the calculation, the following Guess, Geom and SCF options are important:

Guess=Read
Reads the initial guess from the checkpoint file. If the basis set specified is different from the basis set used in the job which generated the checkpoint file, then the wave function will be projected from one basis to the other. This is an efficient way to switch from one basis to another.
Geom=AllCheckpoint
Reads the molecular geometry, charge, multiplicity and title from the checkpoint file. This is often used to start a second calculation at a different level of theory.
SCF=Restart
Restarts the SCF from the checkpoint file.
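
A follow-up input that uses these options can be quite short. The sketch below writes one with a hypothetical helper, write_restart_input. It assumes that the checkpoint file from the previous step is reused under the same name and that, because Geom=AllCheckpoint takes the title and molecule specification from the checkpoint file, only the Link 0 line, the route line and a terminating blank line are needed.

```shell
# write_restart_input CHK ROUTE OUTFILE
# Writes a minimal input that reuses the checkpoint file CHK.
# No title or molecule section is written (see assumptions above);
# the file ends with the blank line Gaussian expects.
write_restart_input() (
    chk=$1
    route=$2
    out=$3
    {
        printf '%%chk=%s\n' "$chk"
        printf '# %s Guess=Read Geom=AllCheckpoint\n' "$route"
        printf '\n'
    } > "$out"
)
```

For step 2 of the sequence above, write_restart_input model.chk "MP2/6-31G*" step2.inp would produce the input, where model.chk and step2.inp are placeholder names.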

Break up restart files

Sometimes when writing a large restart file, Gaussian will crash, complaining that shared memory is too small or that there is not enough memory. This is caused by reading/writing too much information at one time. One can split the read-write file (*.rwf) into several pieces with:

%rwf=/work/username/tmp1,2GB,/work/username/tmp2,2GB,/work/username/tmp3,2GB

If the last file is given without a size, then the rest of the rwf is written to that file.
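
Typing a long %rwf line by hand is error-prone, so it can be generated. The sketch below uses a hypothetical function, rwf_line, and assumes the same naming pattern as the example above (tmp1, tmp2, ... inside one directory, all pieces the same size).

```shell
# rwf_line DIR COUNT SIZE
# Prints a %rwf line that splits the read-write file into COUNT pieces
# of SIZE each, named DIR/tmp1 ... DIR/tmpCOUNT.
rwf_line() (
    dir=$1
    count=$2
    size=$3
    line="%rwf="
    i=1
    while [ "$i" -le "$count" ]; do
        # Separate entries with a comma, but not before the first one.
        if [ "$i" -gt 1 ]; then
            line="$line,"
        fi
        line="$line$dir/tmp$i,$size"
        i=$(( i + 1 ))
    done
    printf '%s\n' "$line"
)
```

Calling rwf_line /work/username 3 2GB reproduces the line shown above.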

One problem, two solutions

If the same calculation gives two different results, either on two different machines or at different times on the same machine, one is likely using an incorrect restart file. Check the output, namely the NOrb value (a different number of orbitals will likely produce a different energy result).

More information on memory and disk space usage is available on-line.

Warnings not to be ignored
Warning!!: The largest alpha MO coefficient is

This warning is usually associated with post-HF calculations (MP2 or CC). Although it is not an error and will not cause your job to crash, it is an important warning about the accuracy of your calculation. It occurs when there are near-linear dependencies in the basis set. For instance, diffuse functions on two close atoms are likely to be linearly dependent. When transforming to molecular orbitals, the atomic orbital integrals are multiplied by all the molecular orbital coefficients. The accuracy of the molecular orbitals will decrease since one or more of these coefficients are very large.

Last modified: March 14 2013 09:59:18.