    For questions, please open a UAService ticket and assign to the Tools Team.

This is an abbreviated version of information mostly found on other pages, intended for existing users who wish to use the new cluster.

Puma

Puma is our latest supercomputer, which came online in the middle of 2020.  The key differences from Ocelote are that it runs CentOS 7 rather than CentOS 6, and that it uses Slurm as its scheduler.  Detailed information on using Slurm is here.

As with our other supercomputers, we used the RFP process to get the best value for our financial resources while meeting our technical requirements.  This time Penguin Computing won with AMD processors.  This is tremendously valuable, as each standard node comes with:

  • AMD Zen2 96 core processors
  • 512GB RAM
  • 25Gb path to storage
  • 25Gb path to other nodes for MPI
  • 2TB internal NVMe disk (largely available as /tmp; see the staging sketch after this list)

In addition, the cluster includes:

  • A Qumulo all-flash storage array for shared filesystems
  • Two large memory nodes with 3TB of RAM and the same processors as the standard nodes
  • Six nodes with four Nvidia V100S GPUs each
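
The node-local NVMe space can speed up I/O-heavy work considerably. Below is a minimal sketch of staging data through /tmp inside a batch job; the paths, group directory, and program name are placeholders, not real locations.

# Stage data through the node-local NVMe disk (all paths and names below are placeholders)
cp /xdisk/mygroup/input.dat /tmp/        # copy input to fast local scratch
cd /tmp
./my_program input.dat > results.out     # run against the local copy
cp results.out /xdisk/mygroup/           # copy results back to shared storage before the job ends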

Ocelote

Ocelote arrived in 2016.  Lenovo's NeXtScale M5 technology won the RFP, mainly on price, performance, and meeting our specific requirements. This cluster is the next generation of the IBM cluster we call ElGato; Lenovo purchased IBM's Intel server line in 2015.

Ocelote will continue in service through 2020, when it will be rebuilt with a new operating system and configured to use Slurm like Puma.  It will remain in service until it is either too expensive to maintain or replaced by something else.

Ocelote was built with CentOS 6. The exception is that half of Ocelote's GPU nodes have been upgraded to CentOS 7.  See here for more information.

Features:

  • Intel Haswell V3 28 core processors 
  • 192GB RAM per node
  • FDR InfiniBand for fast MPI interconnect
  • Qumulo all-flash storage array (all HPC storage is integrated into one array)
  • One large memory node with 2TB of RAM and 48 Intel Ivy Bridge V2 cores
  • 46 nodes with Nvidia P100 GPUs


ElGato

ElGato is the cluster we obtained prior to Ocelote.  It was rebuilt last year with CentOS 7, compared to the CentOS 6 that runs on Ocelote.  Ocelote will be upgraded when its official life span of five years is up; this is policy, to preserve a consistent compute environment.  So if you need CentOS 7, or simply want another place to run jobs, ElGato is available.

Usage information is here. 

Access

We now use a bastion host for access to all clusters; it is required for every cluster. An example login session is shown below.

When you log in for the first time, your home directory is created (see below).

Bastion Host Access:

~ > ssh <insert NetID>@hpc.arizona.edu
Password:
Duo two-factor login for netid 

Enter a passcode or select one of the following options:

1. Duo Push to XXX-XXX-3614
2. Phone call to XXX-XXX-3614
3. SMS passcodes to XXX-XXX-3614 (next code starts with: 1) 

Passcode or option (1-3):
Success. Logging you in...
Creating home directory for cmitts.
This is a bastion host used to access the rest of the RT/HPC environment.

Shortcut commands to access each resource
-----------------------------------------
Ocelote:
$ ocelote

El Gato:
$ elgato

Puma:
$ puma

[<hostname> ~]$ 
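
If you prefer to reach a cluster login node in a single step, OpenSSH's -J (ProxyJump) option can route the connection through the bastion. The internal hostname below is only a placeholder; if you are unsure of the real login node names, use the shortcut commands shown above instead.

# One-hop login through the bastion (the internal hostname is a placeholder)
ssh -J <netid>@hpc.arizona.edu <netid>@<cluster-login-node>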


Schedulers

Puma uses Slurm. More details are at Running Jobs with Slurm (Puma).
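
As a quick orientation, a minimal Slurm batch script looks like the sketch below. The partition, account, module, and program names are placeholders; see the Slurm page linked above for the values that apply to your group.

#!/bin/bash
#SBATCH --job-name=example
#SBATCH --partition=standard      # placeholder partition name
#SBATCH --account=mygroup         # placeholder account/allocation
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --time=01:00:00

module load gcc                   # placeholder module
./my_program                      # placeholder executable

Submit it with: sbatch myjob.slurm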

Ocelote and ElGato use PBS. More details are at Running Jobs with PBS (Ocelote and El Gato).
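
For comparison, a minimal PBS script for Ocelote or El Gato follows the same pattern. The queue, group, module, and program names are placeholders; see the PBS page linked above for the correct values.

#!/bin/bash
#PBS -N example
#PBS -q standard                  # placeholder queue name
#PBS -W group_list=mygroup        # placeholder group/allocation
#PBS -l select=1:ncpus=1:mem=6gb
#PBS -l walltime=01:00:00

cd $PBS_O_WORKDIR                 # run from the directory the job was submitted from
module load gcc                   # placeholder module
./my_program                      # placeholder executable

Submit it with: qsub myjob.pbs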

The reason we have both is that we are transitioning to Slurm: as clusters are acquired or upgraded, they will use Slurm.
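
For users moving between the two schedulers, the everyday commands map across fairly directly. Both columns below are standard Slurm and PBS commands; <netid> and <jobid> are placeholders.

sbatch job.slurm      <->   qsub job.pbs        # submit a batch job
squeue -u <netid>     <->   qstat -u <netid>    # list your queued and running jobs
scancel <jobid>       <->   qdel <jobid>        # cancel a job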



 


