Painter

Painter is a 128-compute-node Dell Linux cluster with a peak performance of 4.77 TFlops, running the Red Hat Enterprise Linux 4 operating system. Each node contains two dual-core 64-bit Xeon processors operating at a core frequency of 2.33 GHz. Painter is a LONI cluster housed at Louisiana Tech University.

  • 128 Compute Nodes
    • Two 2.33 GHz Dual Core Xeon 64-bit Processors
    • 4 GB RAM
    • 10 Gb/sec InfiniBand network interface
    • 10/100/1000 Ethernet network interface
    • Red Hat Enterprise Linux 4
  • 1 Interactive Node
    • Two 3.00 GHz Dual Core Xeon 64-bit Processors
    • 8 GB RAM
    • 10/100/1000 Ethernet network interface
    • Red Hat Enterprise Linux 4
  • Cluster Storage
    • 2.3 TB of local storage
    • 12 TB Lustre filesystem

1. Access to Painter

*nix and Mac Users - Issue a command similar to the following:

$ ssh -X username@painter.loni.org

You will then be prompted for your password. The -X flag sets up X11 forwarding automatically.

Windows Users - For a Windows client please use the PuTTY utility.

If you have forgotten your password, or you wish to reset it, see here (click "Forgot your password?").

2. User Environment

Painter uses softenv to add software to the user's environment. Executing softenv on the cluster displays a list of the available software:

$ softenv
  +ImageMagick-6.4.6.9-intel-11.1
  +ParMetis-3.1.1-intel-11.1-mpich-1.2.7p1
  +R-2.8.1-gcc-4.3.2
  ...

In order to add software to your environment, you'll need to add the appropriate key to your ~/.soft file. For example, to add the package ImageMagick to your user environment, you would need to add the following:

$ cat ~/.soft
  +ImageMagick-6.4.6.9-intel-11.1
  @default

The order in which you add keys to ~/.soft is important. The first occurrence of a setting takes precedence.

Once the entries are to your liking, you must then execute the command resoft, i.e.:

$ resoft
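
The edit step above can be sketched as a small shell helper. This is only an illustration: the function name add_soft_key and its optional file argument are hypothetical and not part of softenv. It assumes that the lines in ~/.soft have no leading whitespace. The key is written on the line before @default so that it takes precedence, and the file is left unchanged if the key is already present.

```shell
#!/bin/bash
# add_soft_key KEY [FILE]
# Hypothetical helper (not part of softenv): put a softenv key ahead of
# @default in a .soft file. FILE defaults to ~/.soft; pass another path
# to try it on a copy first.
add_soft_key() {
    local key="$1"
    local file="${2:-$HOME/.soft}"
    local tmp

    # Nothing to do if the exact key line is already present.
    if [ -f "$file" ] && grep -qxF -e "$key" "$file"; then
        return 0
    fi

    tmp="$(mktemp)" || return 1
    if [ -f "$file" ] && grep -qxF -e "@default" "$file"; then
        # Print the new key immediately before the first @default line.
        awk -v key="$key" '
            $0 == "@default" && !done { print key; done = 1 }
            { print }
        ' "$file" > "$tmp"
    else
        # No @default line: keep existing content and append the key.
        [ -f "$file" ] && cat "$file" > "$tmp"
        printf '%s\n' "$key" >> "$tmp"
    fi
    mv "$tmp" "$file"
}

# Example:
#   add_soft_key "+ImageMagick-6.4.6.9-intel-11.1"
```

After the file has been changed, resoft still has to be run as shown above for the new key to take effect.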

If your code needs to link to a library from a given package, you will find all software installed under /usr/local/packages/, e.g.:

$ ls /usr/local/packages/
  apache_ant  boost     fuse     gold      hdf5         ...
  arpack      boostjam  gamess   graphviz  hypre
  atlas       condor    git      gromacs   ImageMagick
  blacs       fftw      gnuplot  gsl       iozone
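
As an illustration of linking against one of these packages, the sketch below builds compiler flags from a package prefix. The helper name pkg_flags is hypothetical, and the include/ and lib/ subdirectories are an assumption about the package layout; list the package directory first to see how it is actually organized, since it may contain version-specific subdirectories.

```shell
#!/bin/bash
# pkg_flags PREFIX [LIBNAME]
# Hypothetical helper: print -I/-L (and optionally -l) flags for a package
# installed under PREFIX, assuming the common include/ and lib/ layout.
pkg_flags() {
    local prefix="${1%/}"    # drop one trailing slash, if any
    local lib="$2"
    local flags="-I${prefix}/include -L${prefix}/lib"
    if [ -n "$lib" ]; then
        flags="${flags} -l${lib}"
    fi
    printf '%s\n' "$flags"
}

# Example (prefix, library name, and file names are placeholders):
#   icc my_code.c $(pkg_flags /path/to/package/prefix mylib) -o my_code
```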

3. File Storage

3.1. Home Directory

The /home file system quota on Painter is 5 GB. Files can be stored on /home permanently, which makes it an ideal place for your source code and executables. The /home file system is meant for interactive use such as editing and active code development. Do not use /home for batch job I/O.

3.2. Work (Scratch) Directory

The /work volume is meant for the input and output of executing batch jobs, not for long-term storage. We expect files to be copied to other locations or deleted in a timely manner, usually within 30-120 days. For performance reasons, our policy on all volumes is to limit the number of files per directory to around 10,000 and the total number of files to about 500,000.
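
To compare a directory against the file-count guidelines above, a sketch along the following lines may help. The function names are hypothetical, and the directory to check is whatever path you use under /work.

```shell
#!/bin/bash
# count_files DIR
# Print the number of regular files directly inside DIR (not recursive).
count_files() {
    find "$1" -maxdepth 1 -type f | wc -l | tr -d ' '
}

# count_files_total DIR
# Print the number of regular files under DIR, including subdirectories.
count_files_total() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Example (the path is a placeholder):
#   count_files /path/to/your/work/dir          # compare with ~10,000
#   count_files_total /path/to/your/work/dir    # compare with ~500,000
```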

The /work file system quota on Painter is 100 GB. If it becomes overutilized, we will enforce a 30-day purge policy, which means that any files that have not been accessed in the last 30 days will be permanently deleted. An email message will be sent out weekly to users targeted for a purge, informing them of their /work utilization.
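
To see which of your files would fall under such a purge, so that you can copy them elsewhere in time, the following sketch lists files by last access time. It assumes that the purge is based on the file access time (atime) reported by find; the site's purge tooling may measure access differently. The function name is hypothetical.

```shell
#!/bin/bash
# list_purge_candidates DIR [DAYS]
# Print regular files under DIR whose last access time is at least DAYS
# full 24-hour periods in the past (DAYS defaults to 30).
list_purge_candidates() {
    local dir="$1"
    local days="${2:-30}"
    # find truncates file age to whole days, so "-atime +N" matches ages of
    # N+1 days or more; using DAYS-1 therefore matches ages of DAYS or more.
    find "$dir" -type f -atime +"$((days - 1))"
}

# Example (the path is a placeholder):
#   list_purge_candidates /path/to/your/work/dir 30
```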

Please do not try to circumvent the removal process by changing file dates. We expect most files over 30 days old to disappear. Attempts to circumvent the purge process may lead to access restrictions on the /work volume or the cluster.

Please note that the /work volume is not unlimited, so keep your usage to a reasonable amount. When the utilization of /work is over 80%, a 14-day purge may be performed on users using more than 2 TB or having more than 500,000 files. Should disk space become critically low, all files not accessed in 14 days will be purged, and more drastic measures may be taken if needed. Users using the largest portions of the /work volume will be contacted when problems arise and will be expected to take action to help resolve the issues.

4. Programming/Compiling

Version 11.1 of the Intel compilers is loaded by default. Codes can be compiled according to the following chart:

Language   Serial Codes   MPI Codes   OpenMP Codes    Hybrid Codes
Fortran    ifort          mpif90      ifort -openmp   mpif90 -openmp
C          icc            mpicc       icc -openmp     mpicc -openmp
C++        icpc           mpiCC       icpc -openmp    mpiCC -openmp

Default MPI: mvapich 1.1 compiled with intel 11.1
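
The chart above can also be expressed as a small lookup, which may be handy in build scripts. The function name compile_cmd and its argument spellings are hypothetical; the compiler names and the -openmp flag are taken directly from the chart.

```shell
#!/bin/bash
# compile_cmd LANG MODE
# Print the compiler command from the chart above.
# LANG: fortran | c | c++      MODE: serial | mpi | openmp | hybrid
compile_cmd() {
    local lang="$1" mode="$2" serial mpi
    case "$lang" in
        fortran) serial=ifort; mpi=mpif90 ;;
        c)       serial=icc;   mpi=mpicc  ;;
        c++)     serial=icpc;  mpi=mpiCC  ;;
        *) echo "unknown language: $lang" >&2; return 1 ;;
    esac
    case "$mode" in
        serial) echo "$serial" ;;
        mpi)    echo "$mpi" ;;
        openmp) echo "$serial -openmp" ;;
        hybrid) echo "$mpi -openmp" ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}

# Example (file names are placeholders):
#   $(compile_cmd c hybrid) -o my_app my_app.c
```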

5. Running Jobs

Below are the possible job queues to choose from:

  • single - Used for jobs that will only execute on a single node, i.e. nodes=1:ppn<=4.
  • workq - Used for jobs that will use at least one node, i.e. nodes>=1:ppn=4. Currently, this queue has a limit of 72 hours (3 days) of wallclock time.
  • checkpt - Used for jobs that will use at least one node.

Queue Name   Max Walltime (hours)   Max Nodes (per job)
workq        72                     24
checkpt      72                     48
single       336                    1
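
The walltime limits above are given in hours, while the job scripts below request walltime as HH:MM:SS. The following sketch converts between the two and checks a request against the table. The function names are hypothetical, and it assumes that the scheduler accepts hour counts above 24 in the HH field, as the 336-hour limit suggests.

```shell
#!/bin/bash
# hours_to_walltime HOURS
# Print a whole number of hours in the HH:MM:SS form used by "#PBS -l walltime".
hours_to_walltime() {
    # 10# forces base 10 so that input such as "08" is not read as octal.
    printf '%02d:00:00\n' "$((10#$1))"
}

# queue_max_hours QUEUE
# Print the maximum walltime in hours for a queue, per the table above.
queue_max_hours() {
    case "$1" in
        workq|checkpt) echo 72 ;;
        single)        echo 336 ;;
        *) echo "unknown queue: $1" >&2; return 1 ;;
    esac
}

# fits_queue QUEUE HOURS
# Succeed if HOURS does not exceed the queue's walltime limit.
fits_queue() {
    local max
    max="$(queue_max_hours "$1")" || return 1
    [ "$2" -le "$max" ]
}

# Example:
#   fits_queue workq 48 && echo "#PBS -l walltime=$(hours_to_walltime 48)"
```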

Single Queue Job Script Template

$ cat ~/script

#!/bin/bash
#PBS -q single
#PBS -l nodes=1:ppn=1
#PBS -l walltime=HH:MM:SS
#PBS -o desired_output_file_name
#PBS -N NAME_OF_JOB

/path/to/your/executable

Workq Queue Job Script Template

$ cat ~/script

#!/bin/bash
#PBS -q workq
#PBS -l nodes=1:ppn=4
#PBS -l walltime=HH:MM:SS
#PBS -o desired_output_file_name
#PBS -j oe
#PBS -N NAME_OF_JOB

# mpi jobs would execute:
#   mpirun -np 4 -machinefile $PBS_NODEFILE /path/to/your/executable
# OpenMP jobs would execute:
#   export OMP_NUM_THREADS=4; /path/to/your/executable

Checkpt Queue Job Script Template

$ cat ~/script

#!/bin/bash
#PBS -q checkpt 
#PBS -l nodes=1:ppn=4
#PBS -l walltime=HH:MM:SS
#PBS -o desired_output_file_name
#PBS -j oe
#PBS -N NAME_OF_JOB

# mpi jobs would execute:
#   mpirun -np 4 -machinefile $PBS_NODEFILE /path/to/your/executable
# OpenMP jobs would execute:
#   export OMP_NUM_THREADS=4; /path/to/your/executable
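
The templates above hard-code -np 4, which matches nodes=1:ppn=4. For jobs that request more than one node, the process count can be derived from the node file instead. The sketch below assumes that the file named by $PBS_NODEFILE contains one line per allocated core, which is the usual Torque behavior; the function names are hypothetical.

```shell
#!/bin/bash
# count_procs NODEFILE
# Print the number of lines in a PBS node file. With one line per allocated
# core, this equals nodes * ppn for the job.
count_procs() {
    wc -l < "$1" | tr -d ' '
}

# count_nodes NODEFILE
# Print the number of distinct hosts in the node file.
count_nodes() {
    sort -u "$1" | wc -l | tr -d ' '
}

# Inside a job script (the executable path is a placeholder):
#   NPROCS=$(count_procs "$PBS_NODEFILE")
#   mpirun -np "$NPROCS" -machinefile "$PBS_NODEFILE" /path/to/your/executable
```

A finished script is normally submitted with the Torque qsub command, for example qsub ~/script.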

6. Monitoring Jobs

The following commands can be used to view or modify the queue:

  • qdel jobid - deletes a PBS job in the queue.
  • qstat - shows the status of your job and the jobs of others in the queue. It can also show other information about your job, such as the number of nodes it intends to use and the name of the queue it is in.
  • showq - displays job information within the batch system.
  • showstart jobid - gives an estimated starting time for your job.
  • checkjob jobid - displays detailed job state information.
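
For scripting around these commands, the sketch below pulls the state of one job out of qstat output. It is an assumption that qstat is run without options and prints its default column layout (Job id, Name, User, Time Use, S, Queue), so that the state letter is the fifth column; other options such as -u change the layout. The function name job_state is hypothetical.

```shell
#!/bin/bash
# job_state JOBID
# Read default-format qstat output on stdin and print the state letter
# (for example Q for queued or R for running) of the job whose numeric
# id is JOBID. Prints nothing if the job is not listed.
job_state() {
    awk -v id="$1" '
        {
            # The first column looks like "12345.servername"; compare the
            # part before the first dot (or the whole field) with the id.
            split($1, part, ".")
            if (part[1] == id || $1 == id) { print $5; exit }
        }
    '
}

# Example (the job id is a placeholder):
#   qstat | job_state 12345
```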

More detailed information on using the Torque PBS commands and Moab to schedule and monitor jobs can be found in the Adaptive Computing online documentation.