Storage Allocations

See our Storage page for more details.

When you obtain a new HPC account, you will be provided with storage. The shared storage (/home, /groups, /xdisk) is accessible from any of the three production clusters: Puma, Ocelote, and El Gato. The temporary (/tmp) space is unique to each compute node.

Location           Allocation                   Usage

Permanent Storage
/home/uxx/netid    50 GB                        Individual allocations specific to each user.
/groups/PI         500 GB                       Allocated as a communal space to each PI and
                                                their group members.

Temporary Storage
/xdisk/PI          Up to 20 TB                  Requested at the PI level. Available for up to
                                                150 days, with one 150-day extension possible
                                                for a total of 300 days.
/tmp               ~1400 GB NVMe (Puma)         Local storage specific to each compute node.
                   ~840 GB spinning (Ocelote)   Usable as scratch space for compute jobs. Not
                   ~840 GB spinning (El Gato)   accessible once jobs end.
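
For a quick check of how much of each allocation you are consuming, standard shell tools work from any node. This is a minimal sketch; uxx/netid and mygroup are placeholders for your own NetID and PI group name.

    du -sh /home/uxx/netid     # usage against your individual 50 GB allocation
    du -sh /groups/mygroup     # usage against the group's shared 500 GB allocation
    du -sh /xdisk/mygroup      # usage of an active xdisk allocation, if any
    df -h /tmp                 # free local scratch space on the current node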



Job Allocations

All University of Arizona Principal Investigators (PIs, i.e. faculty) who register for access to UA High Performance Computing (HPC) receive free allocations on the HPC machines, shared among all members of their team. Currently, all PIs receive:

HPC Machine   Standard Allocation Time per Month per PI   Windfall
Puma          100,000 CPU hours                           Unlimited, but can be preempted
Ocelote       35,000 CPU hours                            Unlimited, but can be preempted
El Gato       7,000 CPU hours                             Unlimited, but can be preempted

Best Practices

  1. Use your standard allocation first! The standard allocation is guaranteed time on the HPC. It refreshes monthly and does not accrue (if a month's allocation isn't used, it is lost).
  2. Use the windfall queue when your standard allocation is exhausted. Windfall provides unlimited CPU-hours, but jobs in this queue can be stopped and restarted (preempted) by standard jobs; see the submission sketch after this list.
  3. If your group consistently needs more time than the free allocations provide, consider the HPC buy-in program.
  4. As a last resort for tight deadlines, PIs can request a special project allocation once per year (https://portal.hpc.arizona.edu/portal/, under the Support tab). A special project provides qualified hours, which are effectively the same as standard hours.
  5. We do not offer checkpointing, for several reasons. It may be desirable to build this capability into your own code, so that preempted windfall jobs can resume where they left off.
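
As an illustration of items 1 and 2, here is a minimal Slurm batch script sketch for a standard job on Puma. The partition names match the queue table below; mygroup is a placeholder for your PI group's name.

    #!/bin/bash
    #SBATCH --job-name=example
    #SBATCH --partition=standard   # guaranteed hours from the monthly allocation
    #SBATCH --account=mygroup      # placeholder: your PI group
    #SBATCH --ntasks=4
    #SBATCH --mem=16gb
    #SBATCH --time=01:00:00

    # your commands here
    hostname

When the standard allocation is exhausted, changing --partition=standard to --partition=windfall resubmits the same work as a preemptible windfall job.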

How to Find Your Remaining Allocation

You can view your remaining allocation using the HPC User Portal at https://portal.hpc.arizona.edu/portal/.
PIs can also use the portal to create groups, manage the time in each of their groups, subdivide their allocation, and so on.

The command va will display your remaining time in the terminal.
The command va -v will display more detail.
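
For example, from a login-node shell:

    va       # summary of the time remaining for each of your groups
    va -v    # the same information in more detail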

You can use this time either on the standard nodes, which do not require special attributes in the scheduler script, or on the GPU nodes, which do. The queues are set up so that jobs that do not request GPUs will not run on the GPU nodes; the sketch below shows the GPU request.
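
As a sketch, the directives below use Slurm's standard --gres syntax to request a GPU; dropping the --gres line keeps the same job on the CPU-only standard nodes. mygroup is again a placeholder for your PI group.

    #SBATCH --partition=standard
    #SBATCH --account=mygroup   # placeholder: your PI group
    #SBATCH --gres=gpu:1        # request one GPU; required to land on a GPU node
    #SBATCH --ntasks=1
    #SBATCH --time=00:30:00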

Slurm and PBS Batch Queues

The batch queues on the different systems are described below; their memory, time, and core limits follow under Job Resource Limits.

Queue           Description
standard        Used to consume the monthly allocation of hours provided to each group
windfall        Used when standard is depleted, but subject to preemption
high_priority   Used by 'buy-in' users for purchased nodes
qualified       Used by groups who have a temporary special project allocation
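
The queue is selected in the batch script. A sketch for both schedulers, using the queue names above (this assumes Slurm on Puma and PBS on the older systems, as the section heading suggests):

    # Slurm (Puma)
    #SBATCH --partition=windfall

    # PBS (Ocelote, El Gato)
    #PBS -q windfall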


Job Resource Limits

Each row lists: Number of Compute Nodes | Max Wallclock Hrs Per Job | Largest Job (Max Cores) | Total Cores in Use Per Group | Total GPUs in Use Per Group | Largest Job (Max Memory GB) | Max Number of Running Jobs | Max Queued + Running Jobs Per User | Max Queued Jobs

Puma
  standard    348 | 240 | 2016 | 2016** | 4   | 8064  | 500 | 1000 | 3000
  windfall    400 | 240 | 2016 | -      | -   | 8064  | 75  | 1000 | 3000
  high_pri    52  | 240 | 2016 | -      | *** | 8064  | 500 | -    | 5000
  qualified   348 | 240 | 2016 | -      | -   | 12096 | 100 | -    | -

Ocelote
  standard    348 | 240 | 2016 | 2016** | -   | 8064  | 500 | -    | 3000
  windfall    400 | 240 | 2016 | -      | -   | 8064  | 75  | -    | 3000
  high_pri    52  | 240 | 2016 | -      | -   | 8064  | 500 | -    | 5000
  qualified   348 | 240 | 2016 | -      | -   | 12096 | 100 | -    | -

El Gato
  standard    131 | 240 | 512  | 512    | -   | 1024  | 75  | -    | 1000
  windfall    131 | 240 | 512  | 512    | -   | 1024  | 75  | -    | 100
  high_pri    131 | 240 | 2016 | 704    | -   | 12096 | 704 | -    | 5000
**  This limit is shared by all members of a group across all queues. For example, one user can consume the full 2016-core limit on the standard queue, or it can be shared across multiple users or queues.
*** Groups who have purchased GPUs will have a limit set to the number purchased. If no GPUs were purchased, the high_pri hours are restricted to standard CPU nodes.
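
To make the Puma standard limits concrete, the largest job the table allows could be requested as follows. This is a sketch; mygroup is a placeholder, and the 4 GB-per-core figure is simply 8064 GB divided evenly across 2016 cores.

    #SBATCH --partition=standard
    #SBATCH --account=mygroup    # placeholder: your PI group
    #SBATCH --ntasks=2016        # largest job: 2016 cores
    #SBATCH --time=240:00:00     # ceiling: 240 wallclock hours per job
    #SBATCH --mem-per-cpu=4gb    # 2016 cores x 4 GB = 8064 GB, the max job memory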



