The University of Arizona
    For questions, please open a UAService ticket and assign to the Tools Team.



Ocelote has 46 new compute nodes with Nvidia P100 GPUs. These are available to researchers on campus. Fairshare limits will apply, but the intention is for the nodes to be as widely available as possible. El Gato still has 70 compute nodes provisioned with Nvidia Tesla K20s.

Specifications


CUDA Modules

The following CUDA modules are currently available on Ocelote:

/cm/shared/modulefiles

cuda75/blas/7.5.18      cuda75/nsight/7.5.18    cuda80/blas/8.0.61      cuda80/nsight/8.0.61
cuda75/fft/7.5.18       cuda75/profiler/7.5.18  cuda80/fft/8.0.61       cuda80/profiler/8.0.61
cuda75/gdk/352.79       cuda75/toolkit/7.5.18   cuda80/gdk/352.79       cuda80/toolkit/8.0.61

/cm/shared/uamodulefiles

cuda75/neuralnet/5/5.1  cuda75/neuralnet/6/6.0  cuda80/neuralnet/5/5.1  cuda80/neuralnet/6/6.0
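As a minimal sketch, a module from the list above can be loaded by its full name. The choice of the CUDA 8.0 toolkit module is only an example, and the assumption that it places nvcc on your PATH has not been verified here.

```shell
# Module name copied from the list above; use a cuda75/... name for CUDA 7.5.
CUDA_MODULE="cuda80/toolkit/8.0.61"

if command -v module >/dev/null 2>&1; then
    module load "$CUDA_MODULE"
    module list       # confirm the module is now loaded
    nvcc --version    # assumption: the toolkit module provides nvcc
else
    echo "module command not found; run this on Ocelote"
fi
```

Outside the cluster the guard simply prints a message, so the snippet is safe to paste into any shell.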

OpenACC

We support OpenACC through GCC 6.1, which is loaded automatically as a module when you log in. Verify with "module list".
The GCC 6 release includes a much improved implementation of the OpenACC 2.0a specification.
A useful quick reference guide can be found at:
https://gcc.gnu.org/wiki/OpenACC#Quick_Reference_Guide 
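The following sketch shows one way to try OpenACC with GCC: it writes a small C test program and compiles it with the -fopenacc flag. The file name vecadd.c and the program itself are illustrative and not part of the Ocelote installation. Whether the loop actually runs on a GPU depends on how GCC was built (offloading support is an assumption); without it, GCC falls back to running the loop on the host.

```shell
# Write a small OpenACC test program (hypothetical file name: vecadd.c).
cat > vecadd.c <<'EOF'
#include <stdio.h>

#define N 1000000

static float x[N], y[N];

int main(void)
{
    for (int i = 0; i < N; i++) {
        x[i] = 1.0f;
        y[i] = 2.0f;
    }

    /* Parallelize the loop; copy x to the device, copy y in and back out. */
    #pragma acc parallel loop copyin(x[0:N]) copy(y[0:N])
    for (int i = 0; i < N; i++) {
        y[i] = 2.0f * x[i] + y[i];
    }

    printf("y[0] = %.1f\n", y[0]);
    return 0;
}
EOF

# -fopenacc tells GCC to honor the OpenACC directives.
if command -v gcc >/dev/null 2>&1; then
    gcc -fopenacc -O2 -o vecadd vecadd.c && ./vecadd || echo "compile or run failed"
else
    echo "gcc not found; check 'module list' for the GCC module"
fi
```

If the compile succeeds, the program prints y[0] = 4.0, since each element is computed as 2 * 1 + 2.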

About twice a year we host the XSEDE Workshop on Programming GPUs with OpenACC. Watch for announcements to the HPC-Info list.

Nvidia offers free online OpenACC courses:
https://developer.nvidia.com/openacc/overview
https://developer.nvidia.com/openacc-courses 

Applications

Many applications have been optimized to run faster on GPUs. These include:

  • NAMD - Installed as a module; module load namd
  • VASP - A restricted license version is installed on Ocelote; available only to licensed users
  • GROMACS - Installed as a module on Ocelote; module load gromacs
  • LAMMPS - Installed as a module on Ocelote; module load lammps/gcc/16Mar18
  • ABAQUS - Installed as a module on Ocelote; module load abaqus
  • GAUSSIAN
  • MATLAB - See GPU Coder on the MathWorks web site
  • AMBER
  • ANSYS Fluent
  • ML and DL frameworks - See the next section below

Machine Learning 

*** Nvidia Provided GPU Codes ***

Building the popular ML and DL frameworks with GPU support is not a trivial task, so Nvidia builds them and has made them available to us. They will be updated regularly. They are currently located at:
/unsupported/singularity/nvidia  

Current list:



nvidia-caffe.18.03-py2.simg
Caffe is a deep learning framework made with expression, speed, and modularity in mind. It was originally developed by the Berkeley Vision and Learning Center (BVLC).

nvidia-pytorch.18.03-py3.simg
PyTorch is a Python package that provides two high-level features:

  • Tensor computation (like numpy) with strong GPU acceleration
  • Deep Neural Networks built on a tape-based autograd system

nvidia-mxnet.18.03.simg
MXNet is a deep learning framework designed for both efficiency and flexibility. It allows you to mix the flavors of symbolic programming and imperative programming to maximize efficiency and productivity.

nvidia-tensorflow.18.03-py3.simg
TensorFlow is an open source software library for numerical computation using data flow graphs. TensorFlow was originally developed by researchers and engineers working on the Google Brain team within Google's Machine Intelligence research organization for the purposes of conducting machine learning and deep neural networks research.

nvidia-theano.18.03.simg
Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently.

Each is provided in a Singularity container.

The file name includes a tag that indicates when the image was built, so 18.03 means March 2018.
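To illustrate the naming scheme, here is a small sketch that extracts the build date from an image file name. The function name image_date is hypothetical, and it assumes every file follows the name.YY.MM pattern shown in the list above.

```shell
# Print the build date (YYYY-MM) encoded in a container file name.
# Assumes the pattern <name>.<YY>.<MM>[-suffix].simg used in the list above.
image_date() {
    name=${1##*/}                            # drop any leading directory
    tag=${name#*.}                           # e.g. 18.03-py3.simg
    yy=${tag%%.*}                            # e.g. 18
    rest=${tag#*.}                           # e.g. 03-py3.simg
    mm=$(printf '%s' "$rest" | cut -c1-2)    # e.g. 03
    echo "20${yy}-${mm}"
}

image_date nvidia-tensorflow.18.03-py3.simg   # prints 2018-03
```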

USAGE

Copy the file you wish to use to your own directory. Your home directory, /extra, and /xdisk are bound into the image, so copy it to one of those locations.
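The copy step might look like the sketch below. The source directory and image name come from this page; copying to your home directory is only one of the allowed choices, and the variable names are illustrative.

```shell
SRC_DIR="/unsupported/singularity/nvidia"
IMAGE="nvidia-tensorflow.18.03-py3.simg"   # pick any image from the current list
DEST="$HOME"                               # or a directory under /extra or /xdisk

if [ -f "$SRC_DIR/$IMAGE" ]; then
    cp "$SRC_DIR/$IMAGE" "$DEST/"
    ls -lh "$DEST/$IMAGE"                  # confirm the copy and check its size
else
    echo "$SRC_DIR/$IMAGE not found; run this on Ocelote"
fi
```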

For interactive use, start an interactive job on a GPU node by modifying this command:

$ qsub -I -N jobname -m bea -W group_list=GROUP-NAME -q windfall -l select=1:ncpus=28:mem=168gb:ngpus=1 -l cput=1:0:0 -l walltime=1:0:0

You must change the group_list and you should change the other attributes as desired.
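One way to keep track of which attributes to change is to hold them in shell variables and assemble the command from them. This is only a sketch: the variable names are made up for illustration, GROUP-NAME is a placeholder for your own group, and the command is printed rather than run so you can review it first.

```shell
GROUP="GROUP-NAME"     # placeholder: replace with your own group
QUEUE="windfall"
NCPUS=28
MEM="168gb"
NGPUS=1
WALLTIME="1:0:0"       # also used for cput, as in the command above

QSUB_CMD="qsub -I -N jobname -m bea -W group_list=$GROUP -q $QUEUE -l select=1:ncpus=$NCPUS:mem=$MEM:ngpus=$NGPUS -l cput=$WALLTIME -l walltime=$WALLTIME"

# Print the assembled command; paste it at the prompt once it looks right.
echo "$QSUB_CMD"
```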

On the compute node assigned to you, as an example you can run:

$ module load singularity
$ singularity exec --nv nvidia-tensorflow.18.03-py3.simg python tensorflow_example.py

You must include --nv (note the two dashes), which binds the CUDA libraries into the container.
The example file, tensorflow_example.py, is included in this directory.

For batch use, include these two lines in your submission script:

module load singularity
singularity exec --nv nvidia-tensorflow.18.03-py3.simg python tensorflow_example.py
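For context, a complete submission script built around those two lines might look like the sketch below, which writes the script to a file. The resource requests mirror the interactive command on this page. The file name tf_example.pbs and the job name are illustrative, GROUP-NAME is a placeholder, and the cd to PBS_O_WORKDIR assumes the image and the example file sit in the directory you submit from.

```shell
# Write a PBS submission script (hypothetical file name: tf_example.pbs).
cat > tf_example.pbs <<'EOF'
#!/bin/bash
#PBS -N tf_example
#PBS -W group_list=GROUP-NAME
#PBS -q windfall
#PBS -l select=1:ncpus=28:mem=168gb:ngpus=1
#PBS -l cput=1:0:0
#PBS -l walltime=1:0:0

# Run from the directory the job was submitted from.
cd "$PBS_O_WORKDIR"

module load singularity
# Use the file name of the image you copied.
singularity exec --nv nvidia-tensorflow.18.03-py3.simg python tensorflow_example.py
EOF

echo "Edit group_list in tf_example.pbs, then submit with: qsub tf_example.pbs"
```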

There are more detailed examples here

Singularity

For more information on Singularity, see their web site at:

http://singularity.lbl.gov/user-guide

There are tutorials for Singularity on HPC here

Training

We host workshops from the Pittsburgh Supercomputing Center, which is an NSF-funded center. We are working with Nvidia to offer a workshop in the April 2018 timeframe.

Watch for announcements from the hpc-info list.
