LSU HPC Storage Policy
See also: ITS Faculty Data Storage.
1. Storage Systems
Available storage on HPC@LSU high performance computing machines is divided into four file systems (Table 1). More complete descriptions follow below.
|File System||Description|
|/var/scratch||Local storage on each node where the existence of files is guaranteed only during job execution (i.e. files could be deleted as soon as the job finishes).|
|/work||Shared storage provided to all users for job input, output, and related data files. Files are not backed up and may be subject to purge!|
|/home||Persistent storage provided to each active user account. Files are backed up to tape.|
|/project||Persistent storage provided by request for a limited time to hold large amounts of project-specific data. Files are not backed up.|
All file systems share a common characteristic. When they are too full, system performance begins to suffer and all users suffer from the effects. Similarly, placing too many individual files in a directory degrades performance. System management activities are aimed at keeping the different file systems below 80% of capacity. It is strongly recommended that users hold file counts below 1,000, and never exceed 10,000, per subdirectory. If atypical usage begins to impact performance, individual users may be contacted and asked to help resolve the issue. When performance or capacity issues become significant, system management may intervene by requiring users to offload files, stop jobs, or take other actions to ensure the stability and continued operation of the system.
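The per-directory file count guidance above can be checked before it becomes a problem. The following is a minimal sketch, not an official HPC@LSU tool: the function names `classify_count` and `crowded_directories` are hypothetical, and the boundary handling (1,000 itself counts as a warning, 10,000 itself is still tolerated) is one reading of "below 1,000" and "never exceed 10,000".

```python
import os

RECOMMENDED_MAX = 1_000   # policy: hold file counts below this per subdirectory
HARD_MAX = 10_000         # policy: never exceed this per subdirectory


def classify_count(n_files):
    """Return 'ok', 'warn', or 'over' for a per-directory file count."""
    if n_files > HARD_MAX:
        return "over"
    if n_files >= RECOMMENDED_MAX:
        return "warn"
    return "ok"


def crowded_directories(root):
    """Yield (path, file_count, status) for every directory that is not 'ok'."""
    for dirpath, _dirnames, filenames in os.walk(root):
        # filenames holds the non-directory entries directly inside dirpath
        status = classify_count(len(filenames))
        if status != "ok":
            yield dirpath, len(filenames), status
```

A user might run `for path, count, status in crowded_directories("/work/username"): print(status, count, path)`, where `/work/username` stands in for their own directory. Walking a very large tree is itself I/O heavy, so a check like this is best run occasionally rather than inside every job.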
Management makes every effort to avoid data loss. In the event of a failure, every effort will be made to recover data. However, the volume of data housed makes it impractical to provide system-wide data backup. Users are expected to safeguard their own data by making sure that all important code, scripts, documents and data are transferred to another location in a timely manner. The use of inexpensive, high capacity external hard drives attached to a local workstation is highly recommended for individual user backup.
2. File System Details
/var/scratch space is provided on all compute nodes, and is local to each node (i.e. files stored in /var/scratch cannot be accessed by other nodes). The size of this file system will vary from system to system, and possibly across nodes within a system. This is the preferred place to put any intermediate files required while a job is executing. Once the job ends, the files it stores in /var/scratch are subject to deletion. Users should not expect files to exist after a job terminates, and are expected to move their data from /var/scratch to their /work or /home directory as part of the cleanup step in their job script.
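The cleanup step described above can be made hard to forget by tying it to the lifetime of the scratch directory. The sketch below assumes Python 3.8 or later (for `dirs_exist_ok`); the helper name `scratch_workspace` and the example paths are hypothetical, and the policy itself does not prescribe any particular mechanism.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def scratch_workspace(scratch_root, save_to):
    """Create a private directory under scratch_root and, on exit,
    copy everything in it to save_to before deleting the scratch copy."""
    workdir = Path(tempfile.mkdtemp(dir=scratch_root))
    try:
        yield workdir
    finally:
        # Copy first; if the copy raises, the scratch files are left in place.
        shutil.copytree(workdir, save_to, dirs_exist_ok=True)
        shutil.rmtree(workdir, ignore_errors=True)
```

In a job this might be used as `with scratch_workspace("/var/scratch", "/work/username/run1") as tmp: ...`, writing intermediate files under `tmp`. Because the copy happens in a `finally` clause, results are saved even if the computation raises an exception, though not if the job is killed outright.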
/work is the primary file storage area to utilize when running jobs. /work is a common file system that all system nodes have access to. It is the recommended location for input files, checkpoint files, other job output, as well as related data files. Files on /work are not backed up, making the safekeeping of data the responsibility of the user.
Users may consume as much space as needed to run jobs, but must be aware that, because the /work file systems are fully shared, files are subject to purge if a file system becomes overfull. If capacity approaches 80%, management starts an automatic purge process. This process targets files based on age and size, removing them in turn until usage drops to an acceptable level. The purge process will normally occur once per month, as necessary. In short, use the space required, but clean up afterwards. See the description of /project space below for longer term storage options.
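To see which files are most exposed to a purge, a user can list their oldest files and check how full the file system is. This is only a rough sketch: the policy does not state how age is measured or how age and size are weighed, so using modification time and breaking ties by size are assumptions, and the names `usage_fraction` and `oldest_files` are hypothetical.

```python
import os
import shutil

PURGE_THRESHOLD = 0.80  # policy: a purge may start as capacity approaches 80%


def usage_fraction(path):
    """Fraction of the file system containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def oldest_files(root, limit=20):
    """Return up to `limit` (mtime, size, path) tuples, oldest first;
    among files with the same mtime, larger files come first."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.lstat(full)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            entries.append((st.st_mtime, st.st_size, full))
    entries.sort(key=lambda e: (e[0], -e[1]))
    return entries[:limit]
```

Comparing `usage_fraction("/work")` against `PURGE_THRESHOLD` gives a hint of whether a purge is likely soon, and `oldest_files` points at the data worth moving or deleting first.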
All user home directories are located in the /home file system. /home is intended for the user to store source code, executables, and scripts. /home may be considered persistent storage: the data here should remain as long as the user has a valid account on the system. /home always has a storage quota, which is enforced as a hard limit. While /home is not subject to management activities controlling capacity and performance, it should not be considered permanent storage, as system failure could result in the loss of information. Users should arrange for backing up their own data, even though /home is periodically backed up to tape.
The /project file system may be available on some systems, and provides storage space for a specific project. /project space is allocated and must be requested. It is made available for 6 months at a time. Shortly before an allocation expires, the user will be notified of the upcoming expiration. Users may request to have the allocation extended. Renewal requests should be submitted at least 1 month prior to expiration to allow decision and planning time. Users should not expect data to persist after an allocation expires; it may be erased any time after 1 month from a project’s expiration. Users are therefore encouraged to employ alternate safekeeping and protection solutions of their own.
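The dates that matter for a /project allocation follow directly from the rules above. The sketch below works them out for a full 6-month allocation, assuming that "month" means a calendar month and that a day beyond the end of a shorter month is clamped to that month's last day; `add_months` and `allocation_timeline` are hypothetical helper names, not part of any HPC@LSU tool.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Shift a date by whole calendar months, clamping the day if needed."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def allocation_timeline(start):
    """Key dates for a 6-month /project allocation beginning on `start`."""
    expires = add_months(start, 6)
    return {
        "expires": expires,
        "renew_by": add_months(expires, -1),       # request renewal no later than this
        "erasable_after": add_months(expires, 1),  # data may be erased after this
    }
```

For an allocation starting on 15 January 2015, this gives an expiration of 15 July 2015, a renewal deadline of 15 June 2015, and data eligible for erasure after 15 August 2015.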
2.5 System Specific Information
|Purge File Limit
|/project||200 (LPFS)||By Request|
|/project||840 (LPFS)||By Request|
3. Job Use
On all systems, jobs must be run from the /work file systems, and not from the /home or /project file systems. The individual nodes assigned to a job will have access to their local /var/scratch space. Files should be copied from /home or /project space to /work before a job is executed, and back when a job terminates, to avoid excessive I/O during execution that degrades system performance.
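A job can verify that it is running from the right place before it starts any heavy I/O. The following sketch compares path prefixes only; it does not resolve symbolic links or `..` components, and the function names and return strings are hypothetical.

```python
from pathlib import PurePosixPath


def is_under(path, root):
    """True if `path` equals `root` or lies somewhere beneath it."""
    path, root = PurePosixPath(path), PurePosixPath(root)
    return path == root or root in path.parents


def check_job_directory(path):
    """Classify a job's working directory against the job-use rule."""
    if is_under(path, "/work"):
        return "ok"
    for root in ("/home", "/project"):
        if is_under(path, root):
            return f"move to /work (currently under {root})"
    return "unknown location"
```

Calling `check_job_directory(os.getcwd())` at the top of a job (after `import os`) and exiting when the result is not `"ok"` is one simple way to catch a job that was submitted from /home by mistake.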
4. /project Allocation Requests
Limitations: Space is allocated on the /project file systems by request for periods of up to 6 months at a time. Renewal requests are allowed but, in the interest of fairness, are subject to availability and competing requests.
How to Apply: A storage allocation may be requested by completing a web form. The information provided should fully justify the need for the storage, and indicate how the data will be handled in the event the allocation cannot be renewed. A user group can be set up for sharing access to the space if the requestor includes a list of user names.
Allocation Class: Allocations are divided into 3 classes, each with a separate approval authority (Table 3). All storage allocation requests are limited by the available space, and justifications must be commensurate with the amount of space required.
|Class||Size||Approval Authority|
|Small||Up to 100GB||HPC@LSU Staff|
|Medium||Between 100GB and 1TB||HPC@LSU Operations Manager|
|Large||Over 1TB||HPC@LSU Director|
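The class boundaries in the table can be expressed as a small lookup, which may help when preparing a request. This sketch assumes 1TB means 1,000GB and that a request of exactly 100GB is Small and exactly 1TB is Medium; the table does not settle either point, and `allocation_class` is a hypothetical name.

```python
GB_PER_TB = 1_000  # assumption: decimal terabytes


def allocation_class(size_gb):
    """Return (class name, approval authority) for a request of size_gb gigabytes."""
    if size_gb <= 0:
        raise ValueError("requested size must be positive")
    if size_gb <= 100:
        return ("Small", "HPC@LSU Staff")
    if size_gb <= GB_PER_TB:
        return ("Medium", "HPC@LSU Operations Manager")
    return ("Large", "HPC@LSU Director")
```

For example, `allocation_class(250)` returns `("Medium", "HPC@LSU Operations Manager")`.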
GPFS: Global Parallel File System
LPFS: Lustre Parallel File System
NFS: Network File System
 Physically shares /work
Last revised: 10 Dec 2014