Disk Storage

 

Page Banner
imagehttps://public.confluence.arizona.edu/download/attachments/86409274/timer.jpg?api=v2
titleAllocations and Limits

Excerpt Include
Getting Help
Getting Help
nopaneltrue

 


Panel
borderColor#9c9fb5
bgColor#fafafe
borderStylesolid

Storage Allocations

Tip

See our Storage page for more information.

When you obtain a new HPC account, you will be provided with storage. The shared storage (/home, /groups, /xdisk) is accessible from any of the three production clusters: Puma, Ocelote, and El Gato. The temporary (/tmp) space is unique to each compute node.

    Location | Allocation | Usage
    Permanent Storage
    /home/uxx/netid | 50 GB | Individual allocations specific to each user.
    /groups/PI | 500 GB | Allocated as a communal space to each PI and their group members.
    Temporary Storage
    /xdisk/PI | Up to 20 TB | Requested at the PI level. Available for up to 150 days, with one 150-day extension possible for a total of 300 days.
    /tmp | ~1400 GB NVMe (Puma); ~840 GB spinning (Ocelote); ~840 GB spinning (El Gato) | Local storage specific to each compute node. Usable as scratch space for compute jobs. Not accessible once jobs end.
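
    Since /tmp is local to each compute node and is not accessible after a job finishes, a common pattern is to stage data into it, compute there, and copy results back to shared storage before the job exits. The sketch below is illustrative only; the group path, file names, and program are placeholders.

    Code Block
    languagebash
    themeMidnight
    #!/bin/bash
    # Illustrative sketch; replace the placeholder paths and program name.
    #SBATCH --job-name=tmp_scratch_example
    #SBATCH --ntasks=4
    #SBATCH --time=02:00:00

    SCRATCH=/tmp/$SLURM_JOB_ID           # node-local space, removed after the job
    mkdir -p "$SCRATCH"
    cp /groups/PI/mydata/input.dat "$SCRATCH"/

    cd "$SCRATCH"
    ./my_program input.dat > output.dat  # hypothetical executable

    # Copy results back to shared storage BEFORE the job ends;
    # /tmp is not accessible once the job finishes.
    cp output.dat /groups/PI/mydata/results/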




    Panel
    borderColor#9c9fb5
    bgColor#fafafe
    titleColor#fcfcfc
    titleBGColor#021D61
    borderStylesolid
    titleContents

    Table of Contents
    maxLevel2




    Panel
    borderColor#9c9fb5
    bgColor#fafafe
    borderStylesolid

    Job Allocations

    Overview

    All University of Arizona Principal Investigators (PIs; i.e., faculty) who register for access to UA High Performance Computing (HPC) receive free allocations on the HPC machines, which are shared among all members of their team. Currently, every PI receives:

    HPC Machine | Standard Allocation Time per Month per PI | Windfall
    Puma | 100,000 CPU Hours per month | Unlimited, but can be pre-empted
    Ocelote | 70,000 CPU Hours per month | Unlimited, but can be pre-empted
    El Gato | 7,000 CPU Hours per month | Unlimited, but can be pre-empted

    Best practices

    1. Use your standard allocation first! The standard allocation is guaranteed time on the HPC. It refreshes monthly and does not accrue (if a month's allocation isn't used it is lost).
    2. Use the windfall queue when your standard allocation is exhausted. Windfall provides unlimited CPU-hours, but jobs in this queue can be stopped and restarted (pre-empted) by standard jobs.
    3. If your group consistently needs more time than the free allocations, consider the HPC buy-in program.
    4. Last resort for tight deadlines: PIs can request a special project allocation once per year (https://portal.hpc.arizona.edu/portal/; under the Support tab). Requesting a special project provides qualified hours, which are effectively the same as standard hours.
    5. For several reasons, we do not offer checkpointing. It may be desirable to build this capability into your code, for example so that pre-empted windfall jobs can resume where they left off.

    How Allocations are Charged

    The number of CPU hours a job consumes is determined by the number of CPUs it is allocated multiplied by its requested walltime. When a job is submitted, the CPU hours it requires are automatically deducted from the account. If the job ends early, the unused hours are automatically refunded.

    For example, a job requesting 50 CPUs for 10 hours will be charged 500 CPU hours. When the job is submitted, all 500 CPU hours are deducted from the user's account; however, if the job only runs for 5 hours and then completes, the unused 250 hours are refunded.

    This accounting is the same regardless of which type of node you request. Standard, GPU, and high-memory nodes are all charged using the same model and draw from the same allocation pool. If you find you are being charged for more CPUs than you are specifying in your submission script, it may be an issue with your job's memory request.
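
    As an illustration only (the memory-per-CPU ratio below is an assumed example, not the actual value for any particular node type), a large memory request can silently increase the number of CPUs, and therefore the CPU hours, a job is charged for:

    Code Block
    languagebash
    themeMidnight
    # Assumed example: on a node type that allocates roughly 5 GB of memory per
    # CPU, requesting 1 task with 50 GB of memory makes SLURM set aside 10 CPUs.
    # A 10-hour job is then charged 10 CPUs x 10 hours = 100 CPU hours, even
    # though only a single task was requested.
    #SBATCH --ntasks=1
    #SBATCH --mem=50gb
    #SBATCH --time=10:00:00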

    Allocations are refreshed on the first day of each month. Unused hours from the previous month do not roll over.

    How to Use Your Allocation

    To use your allocation, you will include your account and partition information as a SLURM directive in your batch script. The formatting for this can be found in our Running Jobs with SLURM documentation.
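
    As a minimal sketch (the group name and resource requests are placeholders; see Running Jobs with SLURM for the exact format used on each cluster), the relevant directives look something like this:

    Code Block
    languagebash
    themeMidnight
    #!/bin/bash
    # Placeholder values: replace "mygroup" with the group name shown by "va"
    # and adjust the resource requests to match your job.
    #SBATCH --job-name=allocation_example
    #SBATCH --account=mygroup        # group whose allocation is charged
    #SBATCH --partition=standard     # queue/partition (see SLURM Batch Queues below)
    #SBATCH --ntasks=4
    #SBATCH --time=01:00:00

    srun ./my_program                # hypothetical executable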

    How to Find Your Remaining Allocation

    To view your remaining allocation, use the command va in a terminal. For example:

    Code Block
    languagebash
    themeMidnight
    (elgato) [user@gpu5 ~]$ va
    Windfall: Unlimited
    
    PI: parent_974 Total time: 7000:00:00
    	Total used*: 1306:39:00
    	Total encumbered: 92:49:00
    	Total remaining: 5600:32:00
    	Group: group1 Time used: 862:08:00 Time encumbered: 92:49:00
    	Group: group2 Time used: 0:00:00 Time encumbered: 0:00:00
    
    *Usage includes all subgroups, some of which may not be displayed here


    Field | Description
    Total time: | The total number of CPU hours available on the cluster each month.
    Total used: | The total number of CPU hours used by the group for the current month. This includes all the subgroups managed by your PI, not just the subgroups you are a member of.
    Total encumbered: | The total number of CPU hours reserved by active (pending/running) jobs.
    Total remaining: | The total number of CPU hours that are unencumbered/unused.
    Group: | Hours used, broken down by the groups your PI manages. Only usage for the groups you are a member of will be shown.

    SLURM Batch Queues

    The batch queues, also known as partitions, on the different systems are the following:

    Queue | Description
    standard | Used to consume the monthly allocation of hours provided to each group
    windfall | Used when standard is depleted, but subject to preemption
    high_priority | Used by 'buy-in' users for purchased nodes
    qualified | Used by groups who have a temporary special project allocation
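
    As a minimal sketch (assuming the partition names match the queue names above; consult Running Jobs with SLURM for the exact directives), switching a job from standard to windfall is essentially a change of partition:

    Code Block
    languagebash
    themeMidnight
    # Standard: charged against the group's monthly allocation ("mygroup" is a placeholder).
    #SBATCH --account=mygroup
    #SBATCH --partition=standard

    # Windfall alternative: no allocation charge, but the job may be pre-empted.
    ##SBATCH --partition=windfall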





    Panel
    borderColor#9c9fb5
    bgColor#fafafe
    borderStylesolid

    Job Limits

    To check group, user, and job limitations on resource usage, use the command job-limits $YOUR_GROUP in the terminal.
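
    For example, with "mygroup" as a placeholder for your actual group name:

    Code Block
    languagebash
    themeMidnight
    # List the groups you belong to, then query the limits for one of them.
    groups
    job-limits mygroup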




    Panel
    borderColor#9c9fb5
    bgColor#fafafe
    borderStylesolid

    Special Allocations

    Sometimes you may need an extra allocation, for example to meet a conference deadline or a paper submission. We can offer a temporary allocation according to the guidelines here: Special Projects