LONI User Allocation and Account Policy
Last revised: 10 Apr 2015.
Table of Contents
- Accounts and Allocations
- The Allocation Process
- Machine Access Policy
- Resource Allocation Committee Members
1. Accounts and Allocations
LONI (Louisiana Optical Network Initiative) maintains a high-speed optical network that connects most of the institutions of higher education in the State of Louisiana as well as several other entities. LONI also provides high performance computing (HPC) systems to support research, education, and economic development in the State. Detailed information about LONI can be found on the LONI web site. LONI controls access to these resources via a formal user account and resource allocation process. All decisions in these matters are made by the LONI Resource Allocation Committee (LRAC), which is made up of representatives from each of the LONI member organizations.
Gaining access to LONI resources, be it computer time or data storage, involves a two-step process. Since all interactions with the systems are controlled by user account, the first step is to request one. Only one account is needed to access any of the LONI systems, as well as the process management tools provided on the various web sites. All faculty and research staff are eligible for user accounts, which may be applied for online. Students and research collaborators are also eligible for accounts, but they must be sponsored by a faculty member. Be aware that all usage is governed by the LONI Usage Policy.
The second step in the access process is to associate a user account with a resource allocation. Compute time is a finite resource, so access is controlled by an allocation process. An allocation provides a block of computational time measured in Service Units (SUs), where 1 SU is 1 wallclock hour on 1 processor core. Likewise, storage space (on some systems) is controlled by a quota system, which allows for storing and sharing large data sets. The remainder of this document discusses allocations, while use of storage is discussed in more detail in the LONI Storage Policy document.
Allocations are awarded through a proposal submitted by a principal investigator (PI). Full-time faculty and research staff at LONI Member and Associate institutions are eligible to serve as a PI for allocation purposes. Resource allocations for undergraduate and graduate students should be requested by a faculty advisor. The LONI Management Council reserves the right to designate others for PI eligibility. Once an allocation is awarded, the PI is able to authorize other users to charge resource consumption against it.
The second step associates user accounts with the specific allocations against which they can charge resource usage. To accomplish this, tools are made available to the PI to control authorization of users on their allocations. The PI associates an email address with each allocation, and all user requests for authorization on that allocation are sent to this address. The PI is responsible for all authorizations, but may formally delegate someone to manage this process. Once a user account is authorized to use an allocation, the user is allowed to submit jobs or otherwise expend the allocation. This implies that a user must be authorized for at least one non-expired allocation before the systems can be used. A user account may also be authorized to use multiple allocations, in which case it is the joint responsibility of the PI and user to make sure the appropriate allocation is charged for work performed.
A single machine will have multiple nodes (individual servers) available, with multiple cores within each node. This allows many cores to be used for a single parallel processing job, but can lead to charges that are not obvious. For example, running for 1 hour using 8 cores on an 8-core node consumes 8 SUs. On the other hand, running for 1 hour on 1 core of the same 8-core node, which allows the single core to access all of the node memory, also consumes 8 SUs. In simple terms, the number of cores that are reserved for a job, and hence are unavailable to others, is the number used to calculate SU usage.
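The charging rule above can be sketched in a few lines of Python. This is only an illustration, not LONI's accounting code: the function name `service_units` is hypothetical, and the sketch assumes that a job reserves whole nodes (which matches the 8-core example above but would not apply to a queue that shares a node between users).

```python
import math


def service_units(cores_requested: int, cores_per_node: int, wallclock_hours: float) -> float:
    """Estimate SUs charged, assuming whole nodes are reserved for the job."""
    # Round the request up to whole nodes (assumption: no node sharing).
    nodes_reserved = math.ceil(cores_requested / cores_per_node)
    # Every core on a reserved node is unavailable to others, so all are charged.
    cores_reserved = nodes_reserved * cores_per_node
    return cores_reserved * wallclock_hours


# The two cases from the text: both reserve one 8-core node for 1 hour.
print(service_units(8, 8, 1.0))  # 8.0
print(service_units(1, 8, 1.0))  # 8.0
```

Under the same assumption, a 9-core request on 8-core nodes would reserve two nodes and be charged for 16 cores.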
2. The Allocation Process
LONI resources are partitioned by category, and all allocation proposals go through the LONI Resource Allocation Committee (LRAC). The LRAC represents three distinct approval authorities: individual member institutions; the LONI Management Council (LMC), represented by the LONI director; and the LRAC as a panel. Table 1 shows the distribution of LONI resources by category:
|Category|Available Resources|Allocation Authority|
|---|---|---|
|Louisiana Non-Member LONI|5%|LRAC|
|Small Allocations (< 50,000 SUs)|30%|By member, 5% per institution|
|Large Allocations (> 50,000 SUs)|45%|LRAC|
All allocations have a duration of one year, and are considered active until they expire regardless of current balance. They are considered valid so long as they have a positive balance and have not yet expired. Small Allocations may be awarded at any time, while Large Allocations are granted at the beginning of every calendar quarter. Application deadlines for Large Allocations are one month prior to the start date, as summarized in Table 2.
|Allocation Start Date|Application Deadline|
|---|---|
|January 1|December 1 (prior year)|
|April 1|March 1|
|July 1|June 1|
|October 1|September 1|
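The deadline rule summarized in Table 2 can be expressed as a small date calculation. The sketch below is illustrative only: the function names are hypothetical, and it assumes that an application submitted on the deadline day itself is still on time, which this policy does not state.

```python
from datetime import date

# Large Allocations start on the first day of each calendar quarter.
QUARTER_START_MONTHS = (1, 4, 7, 10)


def application_deadline(start: date) -> date:
    """Deadline is the first day of the month before the quarterly start date."""
    if start.day != 1 or start.month not in QUARTER_START_MONTHS:
        raise ValueError("Large Allocations start on the first day of a calendar quarter")
    if start.month == 1:
        # A January 1 start has its deadline on December 1 of the prior year.
        return date(start.year - 1, 12, 1)
    return date(start.year, start.month - 1, 1)


def next_start_date(today: date) -> date:
    """Earliest start date whose deadline has not yet passed (deadline day counts as on time)."""
    for year in (today.year, today.year + 1):
        for month in QUARTER_START_MONTHS:
            start = date(year, month, 1)
            if application_deadline(start) >= today:
                return start
    raise AssertionError("unreachable: next year's October start always qualifies")


print(next_start_date(date(2015, 4, 10)))  # 2015-07-01
```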
A PI may request, or be asked, to make a formal presentation to the allocation authority in conjunction with their proposal submission. At the end of a Large allocation, a report in PDF format must be uploaded to the allocation web site. Requests for new Large allocations will not be approved until reports for expired allocations have been submitted. The report should detail:
- Whether the goals of the proposed work were accomplished
- Any major results or outcomes
- All resulting publications and presentations
To facilitate management of proposals, they are identified as belonging to one of several classes (Table 3).
|Class|Description|
|---|---|
|Economic|A proposal to assist commercial interests with the adoption of high performance computing as part of their business process. Awarded by the LMC from the Economic Development resources.|
|Discretionary|A proposal determined to be of value, but lying outside the normal allocation process. Awarded by the LMC from the Discretionary resources.|
|Small|A proposal requesting an allocation for the purpose of exploring the value of high performance computing for a new project. This may be awarded in any category, within a member's 5% resource pool. Only 2 Small allocations may be active at any time per PI. Serial application for Small Allocations is suitable only for projects with low processing demands. Their primary purpose is to provide resources for developing proposal justification material for a Large allocation.|
|Large|A proposal requesting a Large allocation. May be awarded by the LRAC against Large or Non-Member resources. Large requests are limited to a maximum of 4M SU, and a PI may have a total of 6M SU active at any given time. Only faculty members of LONI member and Louisiana associate institutions may serve as the PI for Large allocations.|
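The per-PI limits stated for the Small and Large classes can be checked mechanically before a proposal is submitted. The following is a minimal sketch and not part of any LONI tool: the `Allocation` record and `check_request` function are hypothetical, and the sketch assumes the 6M SU total counts the awarded size of active Large allocations only, since the policy does not say whether Small allocations or remaining balances are counted.

```python
from dataclasses import dataclass

MAX_ACTIVE_SMALL = 2               # at most 2 active Small allocations per PI
MAX_LARGE_REQUEST_SU = 4_000_000   # cap on a single Large request
MAX_ACTIVE_LARGE_SU = 6_000_000    # cap on the PI's active total (assumed: Large awards only)


@dataclass
class Allocation:
    kind: str            # "small" or "large"
    awarded_su: int
    active: bool = True


def check_request(kind: str, requested_su: int, held: list[Allocation]) -> list[str]:
    """Return the limits a new request would violate; an empty list means none."""
    active = [a for a in held if a.active]
    problems: list[str] = []
    if kind == "small":
        if sum(a.kind == "small" for a in active) >= MAX_ACTIVE_SMALL:
            problems.append("PI already holds 2 active Small allocations")
    elif kind == "large":
        if requested_su > MAX_LARGE_REQUEST_SU:
            problems.append("Large request exceeds 4M SU")
        total = requested_su + sum(a.awarded_su for a in active if a.kind == "large")
        if total > MAX_ACTIVE_LARGE_SU:
            problems.append("PI would exceed 6M SU active")
    else:
        raise ValueError(f"unknown allocation kind: {kind!r}")
    return problems


held = [Allocation("large", 4_000_000), Allocation("small", 40_000)]
print(check_request("large", 3_000_000, held))  # ['PI would exceed 6M SU active']
```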
2.1 Proposal Requirements
All requests for an allocation are made via the LONI web interface to the LRAC. Small allocations require only the data requested on the form. Requests for Large allocations must attach a more extensive proposal in PDF format. The proposal is expected to be 5 pages or fewer and to concentrate on justifying the computational resources requested. The following outline should be followed:
- Problem Statement - section limited to 1 page or less, describing the desired outcomes of the project.
- Background - section limited to 1 page or less, describing how the resources will be used to address the problem (e.g. student access for course work, specific models, etc.).
- Methodology - section limited to 1 page or less, describing the computational methodology that will be used. This should include the applications required.
- Research Plan - section limited to 1 page or less, describing the research schedule, including the anticipated expenditure of granted resources. Allocations are assumed to be uniformly consumed over their lifetime. If this will not be the case, an estimate of expenditure by calendar quarter is required.
- Requirements Analysis - section limited to 2 pages or less, detailing the basis for the requested computer time. Requests for Large allocations must demonstrate an understanding of application efficiency and scaling, and provide accurate estimates of the SU requirements.
- Attachments - supplementary information relating to the proposal, such as copies of awarded grants which will be supported by the allocation, and lists of resulting publications. This same attachment area is used for submitting reports at the end of Large allocations.
Please note that section page limits are suggestions, but the maximum proposal page limit remains 5, not including any Attachments. The page limit was chosen with an eye to making it relatively easy to compose an allocation request, or to modify and reuse a successful application made to another center.
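For the Requirements Analysis, one common way to show an understanding of efficiency and scaling is to tabulate speedup, parallel efficiency, and SU cost per run from a handful of timing measurements. The sketch below uses the standard definitions (speedup relative to the smallest measured core count, efficiency as the ratio of baseline core-hours to core-hours at the larger size). The function name and the sample timings are illustrative assumptions, not LONI requirements or real benchmark data, and the SU figure assumes each core count fills whole nodes.

```python
def scaling_table(timings: dict[int, float]) -> list[tuple[int, float, float, float]]:
    """Rows of (cores, speedup, efficiency, su_per_run) for one fixed problem size.

    `timings` maps a core count to the measured wallclock hours at that count.
    """
    base_cores = min(timings)
    base_hours = timings[base_cores]
    base_core_hours = base_cores * base_hours
    rows = []
    for cores in sorted(timings):
        hours = timings[cores]
        speedup = base_hours / hours
        # Efficiency of 1.0 means the run costs no more core-hours than the baseline.
        efficiency = base_core_hours / (cores * hours)
        rows.append((cores, speedup, efficiency, cores * hours))
    return rows


# Made-up timings for illustration only.
for cores, speedup, eff, su in scaling_table({16: 10.0, 32: 5.5, 64: 3.2}):
    print(f"{cores:4d} cores  speedup {speedup:5.2f}  efficiency {eff:4.2f}  {su:7.1f} SU/run")
```

Multiplying the SU per run at the chosen core count by the number of planned runs gives a defensible total for the request.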
Project allocations are competitively reviewed and granted based upon the description of the proposed research and use of the available technology. They are not vetted on scientific merit. For Large Allocations, priority is given to funded research. All decisions made by the LRAC are deemed final. Appeals can be directed to the LMC or the member institution council representative.
Renewal allocations follow the same process as all other allocations. Proposal writers should be aware that both past usage history and submission of reports will be considered in the award determination. Applications for renewal require only an updated version of a previously successful application, and should ideally cite peer-reviewed publications that acknowledge LONI resources.
2.2 Allocation Management
User accounts without access to a valid allocation will be blocked from submitting work on the systems. There is no mechanism for extending an allocation beyond one year, nor for adding resources once an allocation has been expended.
A user account that is not associated with a valid allocation will be retained for a maximum of 1 year pending authorization against a renewed or different allocation. With these restrictions in mind, the PI is required to use the tools provided to monitor system usage and control authorization of project member accounts. PIs are strongly advised to budget their usage carefully throughout the year. Automatic reminder emails will be sent by the management system as an allocation nears expiration. PIs are ultimately responsible for ensuring that a current and actively monitored management email address has been assigned to each allocation.
At the end of a Large allocation, a short report must be submitted, as detailed above. Failure to submit this report may be taken into account when future allocations are considered.
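The distinction this policy draws between an active allocation (not yet expired, regardless of balance) and a valid one (not expired and with a positive balance) determines whether a user can submit work. A minimal sketch of that logic follows. The class and function names are hypothetical, and the sketch assumes an allocation counts as expired on its expiry date, a boundary the policy does not specify.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class AllocationStatus:
    expires: date
    balance_su: float

    def is_active(self, today: date) -> bool:
        # Active until expiry, regardless of balance (assumed expired on the expiry date).
        return today < self.expires

    def is_valid(self, today: date) -> bool:
        # Valid only while active and holding a positive balance.
        return self.is_active(today) and self.balance_su > 0


def may_submit(authorized: list[AllocationStatus], today: date) -> bool:
    """A user can submit work only if authorized on at least one valid allocation."""
    return any(a.is_valid(today) for a in authorized)


today = date(2015, 4, 10)
spent = AllocationStatus(expires=date(2015, 7, 1), balance_su=0.0)
print(spent.is_active(today), spent.is_valid(today))  # True False
print(may_submit([spent], today))                     # False
```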
2.3 Early Allocation Access
If a PI who has already been awarded a Large allocation by the LRAC submits a new request, and there is a good reason to start the project before the next cycle, then the local representative can request staff to award as much as 25% of the requested amount immediately. Justification must be provided in the renewal proposal. The local representative must then inform the LRAC of this action and its reasons using the "LONI Allocations" listserv.
3. Machine Access Policy
3.1 Job Queueing
Various workload balancing algorithms are used to determine how jobs are assigned resources on a given machine. The way a job is handled is determined by the job queue it is submitted to. Efficient use of the queuing system requires that users request runtimes consistent with the estimated runtimes of their jobs. In particular, requesting more time than is necessary for a particular job can lead to inefficient and unfair queuing. Therefore, users who routinely request more time than is needed for their jobs are subject to a “priority penalty” that will lower the priority of their jobs. Each system sets a maximum number of jobs that a single user may have running without special permission (see below). There is no limit to the number of jobs that a particular user may have queued. Users who wish to obtain a higher priority for their jobs may use special priority queues (see below).
The processors on the systems are further subdivided into preemptory and dedicated pools. Certain mission critical applications, such as storm surge prediction during a hurricane threat, are granted immediate access to processors in the preemptory pool. Processors in the dedicated pool are used to run all other job types. The processors are accessed through one of 6 different types of job queue. This allows for different combinations of processors and job characteristics. Not every type is available on every machine.
3.2.1. Preempt Queue
The preempt queue controls access to the preemptory pool. Authorized applications submitted to this queue will cause the termination of all other user applications running on preemptory nodes.
3.2.2. Checkpt Queue
The checkpt queue controls access to nodes in both the dedicated and preemptory pools. Jobs running in this queue may be terminated by the preempt queue, and are therefore assumed to support restarts based on periodically saved information. No refunds of lost SUs are offered if jobs in the checkpt queue are terminated by preemption. Users running jobs without restart capability assume this risk. However, the benefits of using this queue include access to larger numbers of nodes and/or faster throughput, depending on how busy the queue is.
3.2.3. Workq Queue
The workq queue controls access to nodes in the dedicated pool. Jobs in the workq queue will run until they terminate as planned, their requested run time expires, or they stop due to an abnormal system failure. Jobs which are terminated due to system errors beyond the user's control may be eligible for a refund of expended SUs. Poor planning is not considered grounds for a refund.
3.2.4. Single Queue
The single queue allows several users to share the resources of a single node. The resource share given to a user is determined by how many cores the job requests.
3.2.5. Priority Queue
The priority queue controls nodes in the dedicated pool, but allows applications to be given higher priority with prior approval. Approval may be granted during training sessions, for demonstration purposes, or other special needs. The SUs charged will be adjusted by a factor of 1.3, and no more than 20% of an allocation may be expended in the queue. Requests for priority access should be directed to email@example.com. This queue does not impact other running user jobs, but will delay the start of lower priority jobs already in the queue.
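The two numeric rules for the priority queue (a charge factor of 1.3 and a 20% cap) can be combined into a simple pre-submission check. This is a sketch under stated assumptions rather than the scheduler's actual logic: the function names are hypothetical, and it assumes the 20% cap is measured in charged (adjusted) SUs against the originally awarded allocation size.

```python
PRIORITY_CHARGE_FACTOR = 1.3   # SUs charged in the priority queue are scaled by this factor
PRIORITY_SHARE_LIMIT = 0.20    # at most 20% of an allocation may be spent in this queue


def priority_charge(cores_reserved: int, wallclock_hours: float) -> float:
    """SUs charged for a priority-queue job: base core-hours times the charge factor."""
    return cores_reserved * wallclock_hours * PRIORITY_CHARGE_FACTOR


def fits_priority_cap(awarded_su: float, priority_su_spent: float,
                      cores_reserved: int, wallclock_hours: float) -> bool:
    """True if the job's charge keeps priority-queue spending within 20% of the award."""
    charge = priority_charge(cores_reserved, wallclock_hours)
    return priority_su_spent + charge <= PRIORITY_SHARE_LIMIT * awarded_su


# A 16-core, 10-hour job costs about 208 SUs here instead of 160.
print(round(priority_charge(16, 10.0), 1))
print(fits_priority_cap(100_000, 19_900, 16, 10.0))  # False: the cap is 20,000 SUs
```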
3.2.6. Queue Availability
The availability of the queues may be impacted by system maintenance and servicing. The appropriate job management commands can be used to check their status in real time. The maximum time allowed in the queues may be adjusted from time to time in order to improve overall utilization of the machines.
Disk space usage is controlled in one of 3 ways. The /home file system is quota controlled. A large pool of shared space is provided in the /work file system. Data in it is purged periodically, and users are cautioned to treat it as large, but temporary, storage space. Storage on the /project file system is quota controlled and may be requested via the allocation process. It serves as longer term (6 months at a time, subject to renewal and current demand) storage for very large data sets. It also supports user groups for sharing access to the data. For more details, see the separate LONI Storage Policy.
3.4. Special Requests
A request for special access to LONI machines (such as usage of all nodes on a machine or exceptionally long runs) must be explicitly stated in the proposal for LONI resources. Appeals to the decision of the LRAC may be made to the LONI Management Council.
4. Resource Allocation Committee Members
The current LRAC members are shown in Table 4.
Each member holds approval authority for their respective institution. The LONI Director is also a member, and holds approval authority for the LONI Management Council.