



Simple Cases

You can use the pull or build commands to download pre-built images from resources such as Singularity Hub or Docker Hub.
This example pulls a container from Singularity Hub:

$ singularity pull shub://vsoch/hello-world
Progress |===================================| 100.0% 
Done. Container is at: /tmp/vsoch-hello-world-master.sif
$ singularity run vsoch-hello-world-master.sif
RaawwWWWWWRRRR!!

This example pulls a container from Docker Hub:

$  singularity pull docker://godlovedc/lolcow
INFO:    Converting OCI blobs to SIF format
INFO:    Starting build...
Getting image source signatures
[...]
Writing manifest to image destination
Storing signatures
INFO:    Creating SIF file...
INFO:    Build complete: /home/u23/ric/.singularity/cache/oci-tmp/a692b57abc43035b197b10390ea2c12855d21649f2ea2cc28094d18b93360eeb/lolcow_latest.sif


$ singularity run lolcow_latest.sif
 ______________________________________
/ Perilous to all of us are the devices \
| of an art deeper than we ourselves    |
| possess.                              |
|                                       |
| -- Gandalf the Grey [J.R.R. Tolkien,  |
\ "Lord of the Rings"]                  /
 ---------------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
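
The build command can be used to the same effect, writing the image to a filename of your choice (the name lolcow.sif below is just an example):

$ singularity build lolcow.sif docker://godlovedc/lolcow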


Singularity Build

CentOS with Tensorflow Example

This is an example of creating a Singularity image to run code that is not supported on HPC.  This example uses TensorFlow, but any application could be installed in its place.  It also uses CentOS, but it could just as easily be Ubuntu.

  1. Install Singularity on a Linux workstation - http://singularity.lbl.gov/install-linux

  2. Create the recipe file (also called a definition file) on a workstation where you have root authority. Let's call it centosTflow.def:

    BootStrap: yum
    OSVersion: 7
    MirrorURL: http://mirror.centos.org/centos-%{OSVERSION}/%{OSVERSION}/os/$basearch/
    Include: yum wget
    # It is best to build up the container with a kickstart mentality:
    # to add more packages to the image, re-run the build (bootstrap) command.
    # Building on an existing image adds to it rather than overwriting it or
    # restarting from scratch; the Singularity .def file plays the same role
    # as a kickstart file.
    # Ordinary shell commands can be run, but if any command returns an error
    # the bootstrap process ends.
    %setup
        # commands to be executed on the host, outside the container, during bootstrap
    %post
        # commands to be executed inside the container during bootstrap
        # add Python and install some packages
        yum -y install vim wget python epel-release
        yum -y install python-pip
        # install TensorFlow
        pip install --upgrade pip
        pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.9.0-cp27-none-linux_x86_64.whl
        # create bind points for HPC storage
        mkdir /extra
        mkdir /xdisk
        exit 0
    %runscript
        # commands to be executed when the container runs
        echo "Arguments received: $*"
        exec /usr/bin/python "$@"
    %test
        # commands to be executed within the container at the close of the bootstrap process
        python --version



  3. Create the Singularity container using the recipe file.  This new image is about 240MB in size.  Using older Singularity versions, the image created was 1.5GB.

    This must be done on a workstation where you have root authority. This capability is not permitted on the Ocelote login or compute nodes.

    singularity build centosTFlow.simg centosTflow.def



  4. Copy the new image file to your space on HPC.  /extra might be a good location, since the image could use up much of the space remaining in your home directory.  There is a line in the definition file that creates the mount point for /extra.
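
    For example, you could copy it from your workstation with scp (the file-transfer hostname below is an assumption; use whatever transfer method you normally use):

    scp centosTFlow.simg netid@filexfer.hpc.arizona.edu:/extra/netid/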

    When you later run the image on HPC you may see a warning like the one below; it simply means that your current working directory was not bind-mounted into the container:

    WARNING: Not mounting current directory: user bind control is disabled by system administrator
  5. Test with a simple command

    $ module load singularity
    $ singularity exec centosTFlow.simg python --version
    Python 2.7.5
  6. Or, for something slightly more complex, create a simple Python script called hello.py and run it:
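
    A minimal hello.py matching the output below might look like this (a sketch; the exact script is not shown here):

    # hello.py - print a greeting along with the version of Python in use
    import platform
    print("Hello World: The Python version is %s" % platform.python_version())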

    $ singularity exec centosTFlow.simg python hello.py
    Hello World: The Python version is 2.7.5 
    $
    $ singularity shell centosTFlow.simg
    Singularity.centosTFlow.simg> python hello.py 
    Hello World: The Python version is 2.7.5 
    Singularity.centosTFlow.simg>
  7. And now test TensorFlow with this example from the TensorFlow website, TFlow_example.py (a sketch of the script appears at the end of this step):

    $ singularity exec centosTFlow.simg python /extra/netid/TFlow_example.py
    (0, array([-0.08299404], dtype=float32), array([ 0.59591037], dtype=float32))
    (20, array([ 0.03721666], dtype=float32), array([ 0.3361423], dtype=float32))
    (40, array([ 0.08514741], dtype=float32), array([ 0.30855015], dtype=float32))
    (60, array([ 0.09648635], dtype=float32), array([ 0.3020227], dtype=float32))
    (80, array([ 0.0991688], dtype=float32), array([ 0.30047852], dtype=float32))
    (100, array([ 0.09980337], dtype=float32), array([ 0.3001132], dtype=float32))
    (120, array([ 0.09995351], dtype=float32), array([ 0.30002677], dtype=float32))
    (140, array([ 0.09998903], dtype=float32), array([ 0.30000633], dtype=float32))
    (160, array([ 0.0999974], dtype=float32), array([ 0.3000015], dtype=float32))
    (180, array([ 0.09999938], dtype=float32), array([ 0.30000037], dtype=float32))
    (200, array([ 0.09999986], dtype=float32), array([ 0.3000001], dtype=float32)) 

    You may see messages like:

    2017-07-26 00:07:48.647369: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.

    This is because you are not using a version of TensorFlow that was compiled on this cluster, so it lacks some of the optimizations.

    This example has been created and copied to /unsupported/singularity, which is available for anyone to copy to their personal space and run. Look for 24CentOStflow.img; the recipe file, if you want it, is 24CentOStflow.recipe.

    Note that this example specifies dtype=float32. The default is float64, but that is not supported on the GPUs on ElGato.
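
    For reference, TFlow_example.py here is essentially the classic TensorFlow "getting started" linear-fit script of that era. A sketch that produces output in the (step, W, b) format shown above (the exact file contents may differ):

    # TFlow_example.py - fit y = W*x + b to data generated with W=0.1, b=0.3
    import numpy as np
    import tensorflow as tf

    # create 100 phony x, y data points; y = x * 0.1 + 0.3
    x_data = np.random.rand(100).astype(np.float32)
    y_data = x_data * 0.1 + 0.3

    # try to find values for W and b that fit the data
    W = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
    b = tf.Variable(tf.zeros([1]))
    y = W * x_data + b

    # minimize the mean squared error with gradient descent
    loss = tf.reduce_mean(tf.square(y - y_data))
    optimizer = tf.train.GradientDescentOptimizer(0.5)
    train = optimizer.minimize(loss)

    # launch the graph (old TensorFlow 0.x session API, matching the container)
    init = tf.initialize_all_variables()
    sess = tf.Session()
    sess.run(init)

    # fit the line, printing (step, W, b) every 20 steps
    for step in range(201):
        sess.run(train)
        if step % 20 == 0:
            print(step, sess.run(W), sess.run(b))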

Running Jobs

Singularity is not to be run on login nodes.  That is a general policy for any application.

To run a Singularity container image on ElGato or Ocelote, you need to either allocate an interactive session or submit a job using a batch script. In this sample session, the TensorFlow Singularity container from above is started and Python is run. Note that in this example you would be running the version of Python that is installed within the Singularity container, not the version on the cluster.

ElGato Interactive Example

[netid@elgato singularity]$ bsub -Is -R gpu bash
Job <633365> is submitted to default queue <windfall>.
<<Waiting for dispatch ...>>
<<Starting on gpu44>>

[netid@gpu44 singularity]$ module load singularity
[netid@gpu44 singularity]$ singularity exec dockerTF.img python /extra/chrisreidy/singularity/TFlow_example.py 
Instructions for updating:
Use `tf.global_variables_initializer` instead.
(0, array([ 0.12366909], dtype=float32), array([ 0.3937912], dtype=float32))
(20, array([ 0.0952933], dtype=float32), array([ 0.30251619], dtype=float32))
...
(200, array([ 0.0999999], dtype=float32), array([ 0.30000007], dtype=float32))
[netid@gpu44 singularity]$ exit

Ocelote Interactive Example

The process is the same except that the command to initiate the interactive session will look more like:

 $ qsub -I -N jobname -W group_list=hpcteam -q windfall -l select=1:ncpus=28:mem=168gb:ngpus=1 -l cput=1:0:0 -l walltime=1:0:0
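
Once the interactive session starts on a GPU node, the steps mirror the ElGato example above; for instance (the image and script paths are placeholders):

 $ module load singularity
 $ singularity exec --nv dockerTF.img python TFlow_example.py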

ElGato Job Submission

Running a job with Singularity is as easy as running other jobs.  The LSF script might look like this, and the results will be found in lsf_tf.out.

###========================================
#!/bin/bash
#BSUB -n 1
#BSUB -q "windfall"
#BSUB -R "span[ptile=1]"
#BSUB -o lsf_tf.out
#BSUB -e lsf_tf.err
#BSUB -J testtensorflow
#---------------------------------------------------------------------

module load singularity
cd /extra/netid/data
singularity exec dockerTF.img python /extra/chrisreidy/singularity/TFlow_example.py
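
Assuming the script above is saved as, say, tensorflow.lsf (the filename is arbitrary), submit it with:

bsub < tensorflow.lsf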


Ocelote Job Submission

The PBS script might look like this, and the results will be found in singularity-job.ojobid.

#!/bin/bash
#PBS -N singularity-job
#PBS -W group_list=pi
#PBS -q windfall
#PBS -j oe
#PBS -l select=1:ncpus=1:mem=6gb
#PBS -l walltime=01:00:00
#PBS -l cput=12:00:00
module load singularity
cd /extra/chrisreidy/singularity
date
singularity exec --nv dockerTF.img python TFlow_example.py
date
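
Assuming the script is saved as, say, singularity-job.pbs, submit it with:

qsub singularity-job.pbs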

GPU Computing

The nodes on Ocelote are required to keep the operating system at CentOS 6 for reasons of consistency.  All of the Machine Learning workflows require CentOS 7.  By using Singularity containers built with CentOS 7 (or Ubuntu 16.04), you bypass this constraint.

There is more detail in the section on GPUs.

Cuda / Tensorflow Example

This is a complex example that details the use of Singularity. We already maintain a library of Machine Learning workflows that are completely built as Singularity containers.


If you want to build a Singularity container image that can run applications on Ocelote or ElGato GPU nodes, you must prepare your container as follows:

  • Download the .run installer for the same NVIDIA driver that is currently running on our GPU nodes
  • Extract the contents (you don't actually need to install the driver)
  • Move all of the libraries to a central location such as /usr/local/nvidia and make all necessary symbolic links
  • Download and install the .run file for the same CUDA libraries that are currently running on the GPU nodes
  • Download, extract, and copy the correct cuDNN libraries
  • Edit and export $PATH, $LD_LIBRARY_PATH, and $CUDA_HOME to point to the correct libraries and binaries

For your convenience, the staff at the National Institutes of Health maintain an installation script (called cuda4singularity) that automates this process. It has been tested with Ubuntu 16.04 and CentOS 7. You can either copy or download cuda4singularity into an existing container and execute it with root privileges, or you can add the lines of code to your .def file and install the NVIDIA/CUDA libraries during the bootstrap procedure. (Note that your container will probably need ~10GB of empty space for the CUDA libraries to install properly, because the installer checks for minimum space requirements before running.)

BootStrap: yum
OSVersion: 7
MirrorURL: http://mirror.centos.org/centos-%{OSVERSION}/%{OSVERSION}/os/$basearch/
Include: yum wget

%setup
    # commands to be executed on host outside container during bootstrap

%post
    # commands to be executed inside container during bootstrap

    # yum needs some tlc to work properly in container
    RELEASEVER=7
    ARCH=x86_64
    echo $RELEASEVER > /etc/yum/vars/releasever
    echo $ARCH > /etc/yum/vars/arch
    wget http://dl.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-8.noarch.rpm
    rpm -ivh --nodeps epel-release-7-8.noarch.rpm
    # yum -d 10 check-update  # this line caused problems in testing

    # install other needed packages
    yum -y install man which tar gzip vim-minimal perl python python-devel python-pip

    # create bind points for the UA HPC environment
    mkdir -p /extra /rsgrps

    # download and run NIH HPC cuda for singularity installer
    wget ftp://helix.nih.gov/CUDA/cuda4singularity
    chmod 755 cuda4singularity
    ./cuda4singularity
    rm cuda4singularity

    # install tensorflow
    pip install --upgrade pip
    pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.10.0-cp27-none-linux_x86_64.whl

%runscript
    # commands to be executed when the container runs

%test
    # commands to be executed within container at close of bootstrap process

Be patient. This bootstrap procedure will take a long time to run and may look like it has stalled at several points. If you watch the output you may see CUDA issuing warnings and complaining about an incomplete installation. Don't worry. The drivers are running on the GPU nodes so they don't need to be installed within the container.

After creating a container with one of these files, you can copy it to ElGato or Ocelote and test it. On Ocelote, request an interactive session on a GPU node with a command like:

qsub -I -N jobname -W group_list=YOUR_GROUP_HERE -q windfall -l select=1:ncpus=28:mem=168gb:ngpus=1 -l cput=3:0:0 -l walltime=1:0:0

On ElGato, a test session looks like this:

[netid@elgato]$ bsub -R gpu -Is bash
Job <633858> is submitted to default queue <windfall>.
<<Waiting for dispatch ...>>
<<Starting on gpu64>>
[netid@gpu64]$ module load singularity
[netid@gpu64]$ singularity shell cuda4tf.img 
WARNING: Not mounting current directory: user bind control is disabled by system administrator
Singularity: Invoking an interactive shell within container...

Singularity.cuda4tf.img> nvidia-smi
Mon Dec 12 14:17:55 2016       
+------------------------------------------------------+                       
| NVIDIA-SMI 352.39     Driver Version: 352.39         |                       
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K20Xm         Off  | 0000:20:00.0     Off |                    0 |
| N/A   24C    P8    30W / 235W |     12MiB /  5759MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla K20Xm         Off  | 0000:8B:00.0     Off |                    0 |
| N/A   24C    P8    30W / 235W |     12MiB /  5759MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
Singularity.cuda4tf.img> python -m tensorflow.models.image.mnist.convolutional
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting data/train-images-idx3-ubyte.gz
Extracting data/train-labels-idx1-ubyte.gz
Extracting data/t10k-images-idx3-ubyte.gz
Extracting data/t10k-labels-idx1-ubyte.gz
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties: 
name: Tesla K20Xm
major: 3 minor: 5 memoryClockRate (GHz) 0.732
pciBusID 0000:20:00.0
Total memory: 5.62GiB
Free memory: 5.54GiB
W tensorflow/stream_executor/cuda/cuda_driver.cc:572] creating context when one is currently active; existing: 0x4c25cb0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 1 with properties: 
name: Tesla K20Xm
major: 3 minor: 5 memoryClockRate (GHz) 0.732
pciBusID 0000:8b:00.0
Total memory: 5.62GiB
Free memory: 5.54GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:59] cannot enable peer access from device ordinal 0 to device ordinal 1
I tensorflow/core/common_runtime/gpu/gpu_init.cc:59] cannot enable peer access from device ordinal 1 to device ordinal 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0 1 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0:   Y N 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 1:   N Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K20Xm, pci bus id: 0000:20:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] Creating TensorFlow device (/gpu:1) -> (device: 1, name: Tesla K20Xm, pci bus id: 0000:8b:00.0)
Initialized!
Step 0 (epoch 0.00), 205.4 ms
....


Writable Container

You might want to create a Singularity image from a Docker container and then update it, for example because the code it contains is no longer current.  There are other reasons for doing this, but the approach is the same: the container is built with the --writable option.

This must be done on a workstation where you have root authority. This capability is not permitted on the Ocelote login or compute nodes.

$ sudo singularity build --writable ubuntu.simg docker://ubuntu

Docker image path: index.docker.io/library/ubuntu:latest
Cache folder set to /root/.singularity/docker
Importing: base Singularity environment
Importing: /root/.singularity/docker/sha256:1be7f2b886e89a58e59c4e685fcc5905a26ddef3201f290b96f1eff7d778e122.tar.gz
...
Creating empty Singularity writable container 120MB
Creating empty 150MiB image file: ubuntu.simg
Formatting image with ext3 file system
Image is done: ubuntu.simg
Building Singularity image...
Singularity container built: ubuntu.simg
Cleaning up...

$ singularity shell ubuntu.simg

Singularity: Invoking an interactive shell within container...

Singularity ubuntu.simg:~> apt-get update

Hit:1 http://archive.ubuntu.com/ubuntu xenial InRelease
Get:2 http://security.ubuntu.com/ubuntu xenial-security InRelease [102 kB]
...
Fetched 23.2 MB in 4s (5569 kB/s)
Reading package lists... Done

You will need root authority to execute this function, since root owns the container and you are modifying it.  So, for example, you might have to use sudo and the --writable flag:

sudo singularity shell --writable ubuntu.simg


MPI

Singularity supports MPI fairly well. Since (by default) the network is the same inside and outside the container, the communication between containers usually just works. The more complicated bit is making sure that the container has the right set of MPI libraries. MPI is an open specification, but there are several implementations (OpenMPI, MVAPICH2, and Intel MPI, to name three) with some non-overlapping feature sets. If the host and container are running different MPI implementations, or even different versions of the same implementation, hilarity may ensue.

The general rule is that you want the version of MPI inside the container to be the same version or newer than the host. You may be thinking that this is not good for the portability of your container, and you are right. Containerizing MPI applications is not terribly difficult with Singularity, but it comes at the cost of additional requirements for the host system. The definition file below, for example, builds an Ubuntu container with MVAPICH2 and a small MPI test program (hellow):

BootStrap: debootstrap
OSVersion: xenial
MirrorURL: http://us.archive.ubuntu.com/ubuntu/


%runscript
    echo "This is what happens when you run the container..."


%post
    echo "Hello from inside the container"
    # enable the universe repository
    sed -i 's/$/ universe/' /etc/apt/sources.list
    apt update
    # install build tools and the InfiniBand userspace libraries
    apt -y --allow-unauthenticated install vim build-essential wget gfortran bison libibverbs-dev libibmad-dev libibumad-dev librdmacm-dev libmlx5-dev libmlx4-dev
    # build and install MVAPICH2 2.1 from source
    wget http://mvapich.cse.ohio-state.edu/download/mvapich/mv2/mvapich2-2.1.tar.gz
    tar xvf mvapich2-2.1.tar.gz
    cd mvapich2-2.1
    ./configure --prefix=/usr/local
    make -j4
    make install
    # compile the bundled MPI hello-world example into the container
    /usr/local/bin/mpicc examples/hellow.c -o /usr/bin/hellow

In the example above, the InfiniBand pieces are installed first, followed by the MVAPICH2 implementation of MPI.  When the job is run, the submission script will need to load the host module with the matching version of MVAPICH2:

module load mvapich2/gcc/64/2.2rc1
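
A job script for Ocelote that runs the containerized hellow program might then look something like this sketch (the image name mpi_hello.simg, the group name, and the resource requests are placeholders):

#!/bin/bash
#PBS -N mpi-singularity
#PBS -W group_list=YOUR_GROUP_HERE
#PBS -q windfall
#PBS -j oe
#PBS -l select=1:ncpus=4:mem=24gb
#PBS -l walltime=00:30:00
# load Singularity and the host MPI module that matches the version inside the container
module load singularity
module load mvapich2/gcc/64/2.2rc1
cd /extra/netid/singularity
# the host mpirun launches one container instance per MPI rank
mpirun -np 4 singularity exec mpi_hello.simg /usr/bin/hellow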