LSU HPC Storage Policy

1. Storage Systems

Available storage on HPC@LSU high performance computing machines is divided into four file systems (Table 1).

Table 1. Storage File Systems
File System Description
Scratch Temporary storage on each node that is available only during job execution.
Work Shared storage provided to each user for input, intermediate, and output files of a job.
Home Persistent storage provided to each active user account.
Project Persistent storage provided by request for a limited time to hold large amounts of project-specific data.

All of the file systems share common characteristics. When they are too full, system performance begins to suffer. Similarly, placing too many individual files in a directory degrades performance. System management activities are aimed at keeping the different file systems below 85% of capacity. It is strongly recommended that users hold file counts below 1,000, and never exceed 10,000, per directory. If atypical usage begins to impact performance, individual users may be contacted and asked to help resolve the issue. When performance or capacity issues become significant, system managers may intervene by requiring users to offload files, stop jobs, or take other actions to ensure the continued operation of the system.
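
The per-directory file count guidance above can be checked with a short script. The following is a minimal sketch, not an official HPC@LSU tool. The function names are hypothetical, and it assumes that only non-directory entries count toward the limits, which the policy does not state.

```python
import os

RECOMMENDED_MAX = 1_000   # policy: hold file counts below this per directory
HARD_MAX = 10_000         # policy: never exceed this per directory


def classify_count(n_files):
    """Return 'ok', 'warn', or 'exceeded' for a per-directory file count."""
    if n_files > HARD_MAX:
        return "exceeded"
    if n_files >= RECOMMENDED_MAX:
        return "warn"
    return "ok"


def crowded_directories(root):
    """Yield (directory, file_count, status) for directories that are not 'ok'.

    Assumption: only the non-directory entries reported by os.walk are counted.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        status = classify_count(len(filenames))
        if status != "ok":
            yield dirpath, len(filenames), status
```

Calling `list(crowded_directories(path))` on a user's own directory tree would list the directories worth splitting up or archiving.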

Management makes every effort to avoid data loss. In the event of a failure, every effort will be made to recover data. However, the volume of data housed makes it impractical to provide system-wide data backup. Users are expected to safeguard their own data by making sure that all important code, scripts, documents and data are transferred to another location in a timely manner. The use of inexpensive, high capacity external hard drives attached to a workstation is highly recommended for individual user backup.

2. File System Details

2.1. Scratch

Scratch space is available on all systems, and to all users during the course of a job, but files are subject to deletion once a job ends. The size of this file system will vary from system to system, and possibly across nodes within a system. This is the preferred place to put any intermediate files required while a job is executing. Users should not have any expectation that files will exist after a job terminates, and are expected to move the data from Scratch to their Work or Home directory as part of the clean up process in their job script.
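
The cleanup step described above could be handled by a small helper called at the end of a job. This is a sketch only. The function name and its arguments are hypothetical, and the actual Scratch and Work paths differ by system and must be supplied by the user.

```python
import shutil
from pathlib import Path


def save_scratch_results(scratch_dir, work_dir, pattern="*"):
    """Copy files matching `pattern` from scratch_dir into work_dir.

    The directory layout below scratch_dir is reproduced under work_dir.
    Returns the list of destination paths that were written.
    """
    scratch = Path(scratch_dir)
    dest_root = Path(work_dir)
    copied = []
    for src in sorted(scratch.rglob(pattern)):
        if not src.is_file():
            continue  # directories are created on demand below
        dest = dest_root / src.relative_to(scratch)
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)  # copy2 also preserves timestamps
        copied.append(dest)
    return copied
```

Copying rather than moving leaves the Scratch files in place until the system removes them, so a failed copy does not lose the only remaining version of a result.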

2.2. Work

Work is the primary file storage area to utilize when running jobs. Work is a common file system that all system nodes have access to. It is ideal for input files, checkpoint files and output. A storage quota may be enforced per user on the Work space file system, and usage allowed will vary by system, as shown in Table 3. This does not imply that each user should intend to fully consume this amount, since the quota system uses overbooking to optimize available space. This quota serves as a hard upper limit to prevent a single user from disrupting performance of the entire system. Files on Work will persist indefinitely. Users should not have any expectation of backup. The safekeeping of data is the responsibility of the user.

On systems that do not enforce a quota on Work space, the file system is fully shared. However, if capacity approaches 85%, the files are subject to purge by management. The purge process targets the oldest files first, and continues until the capacity drops to an acceptable level. The purge process will normally occur once per month, as necessary. On systems with no Work quota, persistent storage for large amounts of data is provided on the Project file system.
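
The oldest-first purge described above can be illustrated with a short selection routine. This is not the procedure management actually runs. It is a sketch that assumes file age is measured by last-modified time and that the "acceptable level" is the 85% threshold, neither of which the policy specifies. All names are hypothetical.

```python
from typing import NamedTuple


class FileInfo(NamedTuple):
    path: str
    size: int      # bytes
    mtime: float   # last-modified time, seconds since the epoch (assumed age measure)


def select_for_purge(files, used_bytes, capacity_bytes, target_fraction=0.85):
    """Pick the oldest files first until usage would fall to the target fraction.

    Assumption: target_fraction stands in for the policy's "acceptable level".
    """
    target_bytes = target_fraction * capacity_bytes
    selected = []
    for info in sorted(files, key=lambda f: f.mtime):  # oldest first
        if used_bytes <= target_bytes:
            break
        selected.append(info)
        used_bytes -= info.size
    return selected
```

Under these assumptions, the files a user touched least recently are the first candidates, which is one more reason to copy important data elsewhere promptly.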

Should the need arise, users may request an increase in their Work quota for a reasonable period of time. A request containing a justification for the increased storage should be sent to sys-help@loni.org. The email must include the name of the PI, a valid contact email, a summary of any existing quotas, and a justification statement. Table 2 shows the approval authority for quota increases, based on size.

Table 2. Work Quota Class and Approval Authority
Class Size Approval Authority
Default 50/100GB Default by system for each user
Medium Between 100GB and 1TB HPC@LSU Operations Manager
Large Over 1TB HPC@LSU Director
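
Before sending a quota request, a user could check it against the required contents and Table 2. The sketch below is illustrative only. The field names and functions are hypothetical. It assumes 1 TB equals 1000 GB and that class boundaries are inclusive at the upper end, and it treats any request up to 100 GB as falling in the Default class even though some systems default to 50 GB.

```python
# Hypothetical field names for the four items the request email must include.
REQUIRED_FIELDS = ("pi_name", "contact_email", "existing_quotas", "justification")

GB_PER_TB = 1000  # assumption: the policy does not say whether 1 TB is 1000 or 1024 GB


def missing_fields(request):
    """Return the required fields that are absent or blank in a request dict."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = request.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


def approval_class(requested_gb):
    """Map a requested Work quota in GB to (class, approval authority) per Table 2.

    Assumption: upper boundaries (100 GB, 1 TB) belong to the smaller class.
    """
    if requested_gb > GB_PER_TB:
        return "Large", "HPC@LSU Director"
    if requested_gb > 100:
        return "Medium", "HPC@LSU Operations Manager"
    return "Default", "Default by system for each user"
```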

2.3. Home

All user home directories are located in the Home file system. Home is intended for the user to store source code, executables, and scripts. Home may be considered persistent storage. The data here should remain as long as the user has a valid account on the system. Home always has a storage quota, and that quota is a hard limit. While Home is not subject to management activities controlling capacity and performance, it should not be considered permanent storage, as system failure may result in the loss of information. Users should arrange for backing up their own data.

2.4. Project

The Project file system may be available on some systems, and provides space specific to a project. Project space allocations must be requested, and they are available for a limited time period. Allocations are typically 6 months or less. Shortly before an allocation expires, the user will be notified of the upcoming expiration. Users may request to have the allocation extended. Renewal requests should be submitted at least 1 month prior to expiration to allow decision and planning time. Users should have no expectation that data will persist after expiration; it may be erased any time after 1 month from a project’s expiration. Thus alternate safekeeping and protection actions must be taken in advance.
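
The dates that matter for a Project allocation follow directly from its expiration date. The sketch below works them out. It approximates "1 month" as 30 days, which is an assumption, since the policy does not define the term more precisely. The function name is hypothetical.

```python
from datetime import date, timedelta

ONE_MONTH = timedelta(days=30)  # assumption: "1 month" approximated as 30 days


def allocation_milestones(expiration):
    """Return the key dates implied by a Project allocation's expiration date."""
    return {
        "renew_by": expiration - ONE_MONTH,        # latest recommended renewal request
        "expires": expiration,
        "erasable_after": expiration + ONE_MONTH,  # data may be erased after this date
    }
```

For an allocation expiring on 6 August 2013, this gives a renewal deadline of 7 July 2013 and possible erasure any time after 5 September 2013.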

3. System Specific Information

Table 3. System Specific File System Information
System File System Storage (TB) Quota (GB) Purge File Limit (Million)
IBM P5-575 Pelican
Work 28 (GPFS) N/A 5
Home (GPFS)[1] 5 N/A
Dell x86 Cluster Tezpur
Work 60 (LPFS)[2] N/A 8
Home (NFS)[3] 5 N/A
Project 60 By Request

4. Job Use

On all systems, jobs must be run from the Work file system, and not from the Home or Project file systems. Files should be copied from Home or Project space to Work before a job is executed, and back when a job terminates, to avoid excessive I/O during execution that degrades system performance.
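
A job launcher could refuse to start when the run directory is not on Work. The following is a minimal sketch of such a guard. The function names are hypothetical, and the location of the Work file system is passed in by the caller because it differs by system.

```python
from pathlib import Path


def is_under(path, root):
    """Return True if `path` is `root` itself or located somewhere beneath it."""
    path = Path(path).resolve()
    root = Path(root).resolve()
    return path == root or root in path.parents


def check_job_directory(job_dir, work_root):
    """Raise ValueError unless job_dir lies on the Work file system.

    work_root is supplied by the caller; no particular mount point is assumed.
    """
    if not is_under(job_dir, work_root):
        raise ValueError(f"{job_dir} is not under the Work file system at {work_root}")
    return Path(job_dir).resolve()
```

Comparing resolved path components, rather than string prefixes, avoids mistaking a directory such as `/workspace` for one under `/work`.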

5. /project Allocation Requests

Limitations: Space is allocated on the /project file systems by request for periods of up to 6 months at a time. Renewal requests are allowed but, in the interest of fairness, are subject to availability and competing requests.

How to Apply: A storage allocation may be requested by completing a web form. The provided information should fully justify the need for the storage, and indicate how the data will be handled in the event the allocation cannot be renewed.

Allocation Class: Allocations are divided into 3 classes, each with a separate approval authority (Table 4). All storage allocation requests are limited by the available space, and justifications must be commensurate with the amount of space required.

Table 4. Project Allocation Class and Approval Authority
Class Size Approval Authority
Small Up to 100GB HPC@LSU Staff
Medium Between 100GB and 1TB HPC@LSU Operations Manager
Large Over 1TB HPC@LSU Director

[1] General Parallel File System

[2] Lustre Parallel File System

[3] Network File System



Last revised: 6 August 2013