
Compute Resources



Overview

ElGato

Note

During the quarterly maintenance cycle on April 27, 2022, the ElGato K20s were removed because they are no longer supported by Nvidia.

Implemented at the start of 2014, ElGato has been reprovisioned with CentOS 7 and new compilers and libraries. From July 2021 it has been using Slurm for job submission. ElGato is our smallest cluster, with 130 standard nodes of 16 CPUs each; it was purchased with an NSF MRI grant by researchers in Astronomy and SISTA.
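As a point of reference, a minimal Slurm batch script for a single ElGato node might look like the sketch below. The partition and account names (standard, your_group) and the executable are illustrative placeholders, not confirmed values; the authoritative syntax is in the Example Resource Requests link at the bottom of this page.

#!/bin/bash
#SBATCH --job-name=elgato-example
#SBATCH --partition=standard     # placeholder partition name
#SBATCH --account=your_group     # placeholder PI group account
#SBATCH --nodes=1
#SBATCH --ntasks=16              # a standard ElGato node has 16 CPUs
#SBATCH --time=01:00:00

srun ./my_program                # placeholder executable

Submit with: sbatch elgato-example.slurm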

Ocelote

Implemented in the middle of 2016, Ocelote is designed to support the majority of workloads on its standard nodes. Additionally, Ocelote has one large-memory node with 2TB of memory for jobs that need more than the 188GB available on a standard node, and 46 nodes with Nvidia P100 GPUs for GPU-accelerated workflows.
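For GPU work on Ocelote, the same kind of script adds a GPU request. The sketch below uses placeholder partition and account names, and the generic gres syntax is assumed; confirm the exact form against the Running Jobs with SLURM examples.

#!/bin/bash
#SBATCH --job-name=ocelote-gpu-example
#SBATCH --partition=standard     # placeholder partition name
#SBATCH --account=your_group     # placeholder PI group account
#SBATCH --nodes=1
#SBATCH --ntasks=28              # a standard Ocelote node has 28 schedulable cores
#SBATCH --gres=gpu:1             # one P100; gres syntax assumed, check the User Guide
#SBATCH --time=01:00:00

nvidia-smi                       # report the GPU allocated to this job
srun ./my_gpu_program            # placeholder executable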

Puma

Implemented in 2020, Puma is the biggest cat yet. Similar to Ocelote, it has standard CPU nodes (with 94 cores and 512 GB of memory per node), GPU nodes (with Nvidia V100 GPUs), and two high-memory nodes (3 TB each). Local scratch storage increased to ~1.4 TB. Puma runs on CentOS 7.
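A comparable sketch for a full Puma standard node is shown below; partition and account names are again placeholders, and the memory figure is illustrative (nodes have 512 GB in total, the schedulable amount may be somewhat lower).

#!/bin/bash
#SBATCH --job-name=puma-example
#SBATCH --partition=standard     # placeholder partition name
#SBATCH --account=your_group     # placeholder PI group account
#SBATCH --nodes=1
#SBATCH --ntasks=94              # a Puma standard node has 94 schedulable cores
#SBATCH --mem=470G               # illustrative request, below the 512 GB physical total
#SBATCH --time=02:00:00

srun ./my_program                # placeholder executable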







Free vs. Buy-In

The HPC resources at UArizona are differentiated from many other universities in that there is central funding for a significant portion of the available resources. Each PI receives a standard monthly allocation of hours at no charge.

There is no charge to the allocation for windfall usage, which has proven to be very valuable for researchers with substantial compute requirements.

Research groups can 'Buy-In' to the base HPC systems, adding compute nodes as funding becomes available. Buy-In research groups have the highest priority on the resources they add to the system. If the expansion resources are not fully utilized by the Buy-In group, they are made available to all users as windfall.
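In practice the difference shows up at submission time: the same batch script can be charged to the group's monthly allocation or run as windfall at lower priority. The partition names below (standard, windfall) are assumed for illustration; check the User Guide for the partitions actually configured.

# Charge the job to the group's standard monthly allocation (assumed partition name)
sbatch --partition=standard --account=your_group job.slurm

# Run without charging the allocation, at lower priority (assumed partition name)
sbatch --partition=windfall --account=your_group job.slurm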





Compute System Details

Note

During the quarterly maintenance cycle on April 27, 2022, the ElGato K20s and Ocelote K80s were removed because they are no longer supported by Nvidia.


Name | El Gato | Ocelote | Puma
Model | IBM System X iDataPlex dx360 M4 | Lenovo NeXtScale nx360 M5 | Penguin Altus XE2242
Year Purchased | 2013 | 2016 (P100 nodes added 2018) | 2020
Node Count | 131 | 400 | 236 CPU-only, 8 GPU, 2 high-memory
Total System Memory | 26 TB | 82.6 TB | 128 TB
Processors | 2x Xeon E5-2650v2 8-core (Ivy Bridge) | 2x Xeon E5-2695v3 14-core (Haswell); 2x Xeon E5-2695v4 14-core (Broadwell); 4x Xeon E7-4850v2 12-core (Ivy Bridge, high-memory node) | 2x AMD EPYC 7642 48-core (Rome)
Cores / Node (schedulable) | 16 | 28 (48 on the high-memory node) | 94
Total Cores | 2160* | 11528* | 23616*
Processor Speed | 2.66 GHz | 2.3 GHz (2.4 GHz on Broadwell CPUs) | 2.4 GHz
Memory / Node | 64 GB CPU-only nodes; 256 GB GPU nodes | 192 GB (2 TB on the high-memory node) | 512 GB (3 TB on the high-memory nodes)
Accelerators | None (K20s removed) | 46 NVIDIA P100 (16 GB) | 29 NVIDIA V100S
/tmp (local scratch) | ~840 GB spinning; /tmp is part of the root filesystem | ~840 GB spinning; /tmp is part of the root filesystem | ~1440 GB NVMe /tmp
HPL Rmax (TFlop/s) | 46 | 382 | —
OS | CentOS 7 | CentOS 7 | CentOS 7
Interconnect | FDR InfiniBand | FDR InfiniBand node-to-node; 10 Gb Ethernet node-to-storage | 1x 25 Gb/s Ethernet RDMA (RoCEv2); 1x 25 Gb/s Ethernet to storage

* Includes high-memory and GPU node CPUs
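These figures can be cross-checked against the live scheduler from a login node. sinfo and scontrol are standard Slurm commands; the output reflects whichever cluster you are logged into, and the node name below is a placeholder.

# Per-partition summary: node count, CPUs per node, memory per node, generic resources (GPUs)
sinfo -o "%P %D %c %m %G"

# Full hardware details for one node (take a real node name from 'sinfo -N')
scontrol show node <nodename>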

Example Resource Requests

See our User Guide under Running Jobs with SLURM → Example Resource Requests: https://public.confluence.arizona.edu/display/UAHPC/Running+Jobs+with+SLURM#RunningJobswithSLURM-examplerequestsNodeTypes/ExampleResourceRequests
