Logging in to HPC Linux systems for the first time
HPC @ LSU has several large Linux systems. Tezpur, LSU's fastest supercomputer, is used as the example in this guide.
Tezpur has two head nodes, tezpur1 and tezpur2. You can log in by connecting to either of them via ssh. If you are a Windows user, you can find a good ssh client here.
When you first log in to Tezpur you'll see something like this:
Generating public/private dsa key pair.
Enter file in which to save the key (/home/honggao/.ssh/id_dsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/honggao/.ssh/id_dsa.
Your public key has been saved in /home/honggao/.ssh/id_dsa.pub.
The key fingerprint is:
31:53:a6:cb:ed:dd:8c:44:57:fd:d1:81:b5:b2:ec:29 honggao@tezpur2
You should accept the default file as the one in which to save the key, and you should use an empty passphrase. This will configure your account so that you can ssh to the nodes without receiving the login prompt. This is necessary if you want to run parallel jobs on Tezpur.
For this reason you should also be careful about modifying anything in your .ssh directory. If you cannot freely ssh between Tezpur nodes, you will not be able to run your parallel program. In that case, reset your ssh key with the following commands (note: accept the default file, answer "y" to Overwrite, and use an empty passphrase):
$ cd ~/.ssh
$ ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/home/honggao/.ssh/id_dsa):
/home/honggao/.ssh/id_dsa already exists.
Overwrite (y/n)? y
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/honggao/.ssh/id_dsa.
Your public key has been saved in /home/honggao/.ssh/id_dsa.pub.
The key fingerprint is:
11:a0:3f:a9:b5:85:69:6b:c7:6e:fb:f5:5d:fe:7c:57 honggao@tezpur2
Or you can create a new ssh key with NO passphrase and save it to the default ssh key file by adding flags to ssh-keygen.
$ ssh-keygen -N "" -q -t dsa -f ~/.ssh/id_dsa
/home/honggao/.ssh/id_dsa already exists.
Overwrite (y/n)? y
$ cp -p ~/.ssh/id_dsa.pub ~/.ssh/authorized_keys
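To confirm that the key setup took effect, a quick check along the following lines may help. This is only a minimal sketch: the function name check_ssh_keys is hypothetical (it is not an HPC @ LSU tool), and it merely verifies that the public key appears as a line in authorized_keys without contacting any node.

```shell
#!/bin/sh
# Hypothetical helper: report whether the DSA public key in a given .ssh
# directory is listed in authorized_keys. It does not contact any node.
check_ssh_keys() {
    dir="${1:-$HOME/.ssh}"
    pub="$dir/id_dsa.pub"
    auth="$dir/authorized_keys"

    [ -f "$pub" ]  || { echo "missing $pub"; return 1; }
    [ -f "$auth" ] || { echo "missing $auth"; return 1; }

    # -F: fixed strings, -x: match whole lines, -q: quiet, -f: read patterns from file
    if grep -qxF -f "$pub" "$auth"; then
        echo "ok"
    else
        echo "public key not found in $auth"
        return 1
    fi
}
```

Calling check_ssh_keys with no argument inspects ~/.ssh; passing a directory inspects that directory instead.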
Tezpur has 2 head nodes (tezpur1 and tezpur2) and 360 compute nodes (tezpur001 to tezpur360). You will compile your code on a head node and execute it on one or more compute nodes. The remainder of this tutorial will guide you through an example of executing a parallel job on the compute nodes.
Setting up your environment
First you have to set up your environment. You must decide which packages you want from the big list. Take note of the magic strings under the "softenv" column. In this case the magic strings we want are:
- +intel-fc-9.1
- +intel-cc-9.1
- +mvapich-0.98-intel9.1
Next you need to add the appropriate variables to your environment. You can do this by using softenv. You just need to add these magic strings to the .soft file in your home directory (${HOME}/.soft) and then reset your environment with the resoft command.
[user_name@tezpur ~]$ vi ${HOME}/.soft
@default
+intel-cc-9.1
+intel-fc-9.1
+mvapich-0.98-intel9.1
[user_name@tezpur ~]$ resoft
Or you can simply use the soft-dbq command to get the information associated with the package and then modify your bashrc/cshrc.
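If you prefer the .soft route and would rather script it than open an editor, something like the following could work. It is a sketch under assumptions: add_soft_key is a hypothetical helper name, and it simply appends a key as its own line when that exact line is not already present. It does not try to reason about the ordering of keys relative to @default.

```shell
#!/bin/sh
# Hypothetical helper: append a softenv key to a .soft file only if that
# exact line is not already there. Defaults to ${HOME}/.soft.
add_soft_key() {
    key="$1"
    file="${2:-$HOME/.soft}"
    touch "$file"
    # -F: literal match (so "+" and "." are not special), -x: whole line
    if ! grep -qxF -e "$key" "$file"; then
        printf '%s\n' "$key" >> "$file"
    fi
}

# Example (commented out so nothing is modified by accident):
# add_soft_key +intel-fc-9.1
# add_soft_key +intel-cc-9.1
# add_soft_key +mvapich-0.98-intel9.1
# resoft
```

Running the helper twice with the same key leaves the file unchanged the second time, so it is safe to repeat.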
If you want to use soft-dbq you could do this:
[user_name@tezpur ~]$ /usr/local/packages/softenv-1.4.2/bin/soft-dbq +intel-fc-9.1
The result should be something like this (you'll get something similar but different if you are a csh or tcsh user):
This is all the information associated with
the key or macro +intel-fc-9.1.
-------------------------------------------
Name: +intel-fc-9.1
Description: compiler: 'Intel Fortran Compilers', version: 9.1. The Fortran compilers from Intel. docs => http://www.intel.com/software/products/compilers/clin/docs/manuals.htm
Flags: none
Groups: none
Exists on: Linux
-------------------------------------------
On the Linux architecture,
the following will be done to the environment:
The following environment changes will be made:
INTEL_HOME = /usr/local/compilers/Intel/intel_fc91
LD_LIBRARY_PATH = ${LD_LIBRARY_PATH}:/usr/local/compilers/Intel/intel_fc91/lib
MANPATH = ${MANPATH}:/usr/local/compilers/Intel/intel_fc91/man
PATH = ${PATH}:/usr/local/compilers/Intel/intel_fc91/bin
-------------------------------------------
[user_name@tezpur ~]$ /usr/local/packages/softenv-1.4.2/bin/soft-dbq +intel-cc-9.1
This is all the information associated with
the key or macro +intel-cc-9.1.
-------------------------------------------
Name: +intel-cc-9.1
Description: compiler: 'Intel C/C++ Compiler', version: 9.1. The compilers for C/C++ from Intel. docs => http://www.intel.com/software/products/compilers/clin/docs/manuals.htm
Flags: none
Groups: none
Exists on: Linux
-------------------------------------------
On the Linux architecture,
the following will be done to the environment:
The following environment changes will be made:
INTEL_HOME = /usr/local/compilers/Intel/intel_cc91
LD_LIBRARY_PATH = ${LD_LIBRARY_PATH}:/usr/local/compilers/Intel/intel_cc91/lib
MANPATH = ${MANPATH}:/usr/local/compilers/Intel/intel_cc91/man
PATH = ${PATH}:/usr/local/compilers/Intel/intel_cc91/bin
-------------------------------------------
[user_name@tezpur ~]$ /usr/local/packages/softenv-1.4.2/bin/soft-dbq +mvapich-0.98-intel9.1
This is all the information associated with
the key or macro +mvapich-0.98-intel9.1.
-------------------------------------------
Name: +mvapich-0.98-intel9.1
Description: library: 'Mvapich', version: 0.98-intel9.1. Mvapich is a free, portable implementation of the message passing libraries. docs => http://mvapich.cse.ohio-state.edu/overview/
Flags: none
Groups: none
Exists on: Linux
-------------------------------------------
On the Linux architecture,
the following will be done to the environment:
The following environment changes will be made:
LD_LIBRARY_PATH = ${LD_LIBRARY_PATH}:/usr/local/packages/mvapich-0.98-intel9.1/lib
MANPATH = ${MANPATH}:/usr/local/packages/mvapich-0.98-intel9.1/man
PATH = ${PATH}:/usr/local/packages/mvapich-0.98-intel9.1/bin
-------------------------------------------
What we have done here is display the required settings on the screen. What you really need is to set up your environment using this information. If you use the bash shell, you can add the following to your .bashrc file (using the export command).
# Environment setup for +intel-fc-9.1
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/compilers/Intel/intel_fc91/lib"
export MANPATH="$MANPATH:/usr/local/compilers/Intel/intel_fc91/man"
export INTEL_HOME="/usr/local/compilers/Intel/intel_fc91"
export PATH="$PATH:/usr/local/compilers/Intel/intel_fc91/bin"
# Environment setup for +intel-cc-9.1
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/compilers/Intel/intel_cc91/lib"
export MANPATH="$MANPATH:/usr/local/compilers/Intel/intel_cc91/man"
export INTEL_HOME="/usr/local/compilers/Intel/intel_cc91"
export PATH="$PATH:/usr/local/compilers/Intel/intel_cc91/bin"
# Environment setup for +mvapich-0.98-intel9.1
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/packages/mvapich-0.98-intel9.1/lib"
export MANPATH="$MANPATH:/usr/local/packages/mvapich-0.98-intel9.1/man"
export PATH="$PATH:/usr/local/packages/mvapich-0.98-intel9.1/bin"
export MPICH_HOME="/usr/local/packages/mvapich-0.98-intel9.1"
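After logging in again (or sourcing .bashrc), it can be worth checking that the tools are actually found on your PATH. The following is a small sketch, not a site-provided script: check_tools is a hypothetical name, and the command names ifort and icc for the Intel compilers are assumptions here, since only mpif90 appears in this guide. Run it by hand from an interactive shell rather than from .bashrc, because it prints to the screen.

```shell
#!/bin/sh
# Report which of the given commands can be found on PATH.
# Prints one "found:" or "missing:" line per command; returns 1 if any is missing.
check_tools() {
    status=0
    for tool in "$@"; do
        if path=$(command -v "$tool" 2>/dev/null); then
            echo "found: $tool -> $path"
        else
            echo "missing: $tool"
            status=1
        fi
    done
    return $status
}

# Assumed command names: ifort and icc (Intel compilers), mpif90 (MVAPICH wrapper).
# check_tools ifort icc mpif90
```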
Changing your password
You can change or reset your password on-line at https://www.cct.lsu.edu/scss/allocations/profile.php (click "Forgot your password?" to reset your password if you forget it). You can also change your password with the "elpasswd" command once you log in to Tezpur. This will change your password in LDAP, affecting all your logins on CCT machines that are LDAP enabled.
There are some commands related to "elpasswd", namely "elchsh" to change your login shell and "elchfn" to change your finger info.
Man pages are available for these commands.
Gotchas
- It is important that you do not display any text from inside your .cshrc or .bashrc files. If you do this, you will not be able to get the status back from your batch jobs.
- Do not set environment variables in your .bash_profile or .login if you will need them during parallel computation, as these files will not be sourced on the compute nodes during a normal MPI run.
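If you do want a greeting or other message in your .bashrc, one common way to respect the first point is to print only when the shell is interactive. The fragment below is a sketch of that idea, wrapped in a hypothetical function named greet_if_interactive so it can be tried on its own; the message text is just a placeholder.

```shell
# Fragment for ~/.bashrc. "$-" holds the current shell option flags and
# contains "i" only when the shell is interactive.
greet_if_interactive() {
    case "$-" in
        *i*)
            # Interactive login: messages are safe here.
            echo "Welcome to $(hostname)"
            ;;
        *)
            # Batch jobs and MPI launches: stay silent.
            ;;
    esac
}

greet_if_interactive
```

In a batch job the shell is not interactive, so the function prints nothing and job status reporting is left undisturbed.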
Compiling on Tezpur
So, you've managed to log in and set up your environment on Tezpur. You've done whatever tweaking you like to do on any Linux machine you've worked on in the past, and your environment points to the Intel compilers and MPI packages. What now?
Assuming we have a Fortran program named "test.f" that we wish to compile and run under MPI, we issue the following command:
[user_name@tezpur ~]$ mpif90 -o test test.f
The packages on Tezpur will usually have a HOME variable of some kind set to facilitate writing compile commands like this. There are several flavors of MPICH available, and using the MPICH_HOME variable in your makefiles will make it easier for you to switch flavors if you need to.
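The same idea can be used from a shell script. The sketch below assumes that the mpif90 wrapper lives in the bin directory under MPICH_HOME, which matches the PATH entry shown earlier for +mvapich-0.98-intel9.1 but may not hold for every flavor; mpi_compiler is a hypothetical helper name.

```shell
#!/bin/sh
# Build the compiler path from MPICH_HOME when it is set; otherwise fall back
# to whatever mpif90 is first on PATH.
mpi_compiler() {
    if [ -n "${MPICH_HOME:-}" ]; then
        echo "$MPICH_HOME/bin/mpif90"
    else
        echo "mpif90"
    fi
}

# Example (requires the MPI package to be installed):
# "$(mpi_compiler)" -o test test.f
```

Switching flavors then only requires changing MPICH_HOME, not the compile command itself.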
Running on Tezpur
To run a parallel job on Tezpur you submit it to the batch queue. Our queuing system is Torque, a Portable Batch System (PBS) workload manager from Cluster Resources, together with Moab, the job scheduler, also from Cluster Resources. The command that you use to submit your job is qsub.
Below is a sample PBS script, submitted using this command:
[user_name@tezpur ~]$ qsub test.qsub
The contents of test.qsub are as follows:
#!/bin/sh
#PBS -q checkpt
# the queue to be used.
#PBS -M your_mail_address@somehost.edu
#
#PBS -l nodes=16:ppn=4
#
# number of nodes and number of processors on each node to be used.
# Do NOT use ppn=1 except for a serial job submitted to the single queue.
#
#PBS -l cput=01:00:00
# requested CPU time.
#
#PBS -l walltime=01:00:00
# requested Wall-clock time.
#
#PBS -V
#
#PBS -o stdout
#PBS -e stdout
# name of the standard output and standard error files to be "stdout".
#
#PBS -j oe
# merge standard error into the standard output file.
#
#PBS -N pbs-test
# name of the job (that will appear on executing the qstat command) to be "pbs-test".
#
# Following are non PBS commands. PLEASE ADOPT THE SAME EXECUTION SCHEME
# i.e. execute the job by copying the necessary files from your home directory
# to the scratch space, execute in the scratch space, and copy back
# the necessary files to your home directory.
#
export WORK_DIR=/scratch/$USER/
export NPROCS=`wc -l $PBS_NODEFILE |gawk '//{print $1}'`
# NPROCS is REQUIRED by the mpirun command below.
# Copy any necessary files from your home directory to the scratch space here.
cd $WORK_DIR
# changing the working directory to the scratch space
mpirun -machinefile $PBS_NODEFILE -np $NPROCS $WORK_DIR/test
# executes the executable.
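The execution scheme described in the script comments (copy to scratch, run there, copy results back) is not spelled out as commands in the sample. One way to express it is sketched below. Everything here is illustrative: stage_and_run is a hypothetical helper, output.dat is a placeholder file name, and using PBS_O_WORKDIR (the directory from which qsub was run) as the source directory is an assumption rather than a site requirement.

```shell
#!/bin/sh
# Hypothetical helper: copy inputs to a work directory, run a command there,
# and copy one result file back. Arguments:
#   $1 source directory   $2 work directory   $3 result file name
#   remaining arguments: the command to run inside the work directory
stage_and_run() {
    src="$1"; work="$2"; result="$3"
    shift 3

    mkdir -p "$work" || return 1
    cp -pR "$src"/. "$work"/ || return 1       # stage in

    rc=0
    ( cd "$work" && "$@" ) || rc=$?            # run in the work directory

    if [ -f "$work/$result" ]; then
        cp -p "$work/$result" "$src"/          # stage out
    fi
    return $rc
}

# Inside a PBS script this might look like (names are placeholders):
# NPROCS=$(wc -l < "$PBS_NODEFILE")
# stage_and_run "$PBS_O_WORKDIR" "/scratch/$USER" output.dat \
#     mpirun -machinefile "$PBS_NODEFILE" -np "$NPROCS" ./test
```

The helper returns the exit status of the command it ran, so a failed run is still visible to the calling script after the result file has been copied back.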
So now you've successfully submitted your job to the queue -- but is it actually running? And if it does run, how can you analyze how well it did?
Commands for Monitoring
- qstat: this will show you the status of your job and the jobs of others in the queue. It can show you various other bits of information about your job as well, such as the number of nodes it intends to use, the name of the queue it's in, etc.
- mshow: this command lists all the jobs in the queue, first those that are running, then those that are queued in the order that they will be run.
- showstart: this command gives an estimated starting time for your job.
- qshow: this command shows the load on each compute node that your job is using.
Job queuing priority
The queuing system schedules jobs based on job priority, which takes into account several factors. Jobs with a higher priority are scheduled ahead of jobs with a lower priority. The scheduler also has a backfill capability for jobs that are short in duration or require a small number of nodes: it schedules small jobs while waiting for the start time of a large job that requires many nodes. To determine which jobs run first, Moab uses the following formula to calculate job priority:
Job priority = credential priority + fairshare priority + resource priority + service priority
(1) Credential Priority Subcomponent:
credential priority = credweight * (userweight * job.user.priority)
credential priority = 100 * (10 * 100) = 100000 (a constant)
(2) Fairshare Priority Subcomponent:
fairshare priority = fsweight * min (fscap, (fsuserweight * DeltaUserFSUsage))
fairshare priority = 100 * (10 * DeltaUserFSUsage)
A user's fairshare usage is the sum over seven days of the daily processor seconds used by that user, each multiplied by the daily decay factor, divided by the sum over the same seven days of the total daily processor seconds used, each multiplied by the daily decay factor. The decay factor is 0.9. DeltaUserFSUsage is the fairshare target percentage for each user (20 percent) minus the calculated fairshare usage percentage; in other words, the target percentage minus the percentage actually used. For a user who has not used the cluster for a week:
fairshare priority = 100 * (10 * 20) = 20000
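To make the fairshare arithmetic concrete, here is a rough sketch in shell and awk. It rests on an assumption the text does not state explicitly: that the usage of the day i days back (0 being the most recent) is weighted by 0.9 raised to the power i. The function names and the input layout (one line per day, newest first, user seconds followed by total seconds) are invented for illustration and are not part of Moab.

```shell
#!/bin/sh
# Decayed fairshare usage in percent.
# Input on stdin: one line per day, newest first, "used_seconds total_seconds".
# Assumption: day i (0 = most recent) is weighted by 0.9^i.
fairshare_usage() {
    awk -v decay=0.9 '
        { w = decay ^ (NR - 1); used += $1 * w; total += $2 * w }
        END { if (total > 0) printf "%.2f\n", 100 * used / total; else print "0.00" }
    '
}

# fairshare priority = 100 * (10 * (target - usage)), with a 20 percent target.
fairshare_priority() {
    awk -v usage="$1" 'BEGIN { printf "%.0f\n", 100 * (10 * (20 - usage)) }'
}
```

For a user with zero usage, fairshare_priority 0 reproduces the value of 20000 shown above; a user who has consumed exactly the 20 percent target gets a fairshare priority of 0.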
(3) Resource Priority Subcomponent:
resource priority = resweight * min (rescap, (procweight * TotalProcessorsRequested))
resource priority = 30 * min (3840, (10 * TotalProcessorsRequested))
For a 32 processor job:
resource priority = 30 * 10 * 32 = 9600
(4) Service Priority Subcomponent:
service priority = serviceweight * (queuetimeweight * QUEUETIME + xfactorweight * XFACTOR)
service priority = 2 * (2 * QUEUETIME + 20 * XFACTOR)
QUEUETIME is the time the job has been queued in minutes.
XFACTOR = 1 + QUEUETIME / WALLTIMELIMIT
For a one hour job in the queue for one day:
service priority = 2 * (2 * 1440 + 20 * (1 + 1440 / 60))
service priority = 2 * (2880 + 500) = 6760
These factors are adjusted as needed to make jobs of all sizes start fairly.
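Putting the four subcomponents together, the following sketch adds them up using the weights quoted above. It is an illustration of the arithmetic only, not how Moab is configured or queried: job_priority is a hypothetical function, the fairshare cap is ignored (as in the simplified formula above), and the weights may have changed since this guide was written.

```shell
#!/bin/sh
# job_priority DELTA_FS_PERCENT PROCS QUEUED_MINUTES WALLTIME_MINUTES
# Sums the four priority subcomponents with the weights given in this guide.
job_priority() {
    awk -v dfs="$1" -v procs="$2" -v queued="$3" -v wall="$4" 'BEGIN {
        cred = 100 * (10 * 100)                     # credential (constant)
        fair = 100 * (10 * dfs)                     # fairshare, cap ignored
        res  = 10 * procs                           # resource, capped at 3840
        if (res > 3840) res = 3840
        res  = 30 * res
        xfactor = 1 + queued / wall
        svc  = 2 * (2 * queued + 20 * xfactor)      # service
        printf "%.0f\n", cred + fair + res + svc
    }'
}

# A user with no recent usage, 32 processors, queued one day, one hour walltime:
job_priority 20 32 1440 60    # prints 136360
```

The example combines the worked values from each subsection: 100000 + 20000 + 9600 + 6760 = 136360.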





