A container is a packaged unit of software that contains code and all its dependencies including, but not limited to, system tools, libraries, settings, and data. This makes applications and pipelines portable and reproducible, providing a consistent environment that can run on multiple platforms. Shipping containers are frequently used as an analogy: a shipping container is standard, does not care what is put inside, and can be carried on any ship; likewise, a computing container can run on many different systems. Docker is widely used by researchers; however, Docker images require root privileges to run, which is not permitted in a shared HPC environment. Singularity addresses this by keeping all the privileges needed at runtime inside the container, which makes it ideal for the shared environment of a supercomputer. Even better, a Docker image can be encapsulated inside a Singularity image. This makes Singularity well suited to a wide range of use cases on HPC.
The documentation here provides instructions on how to either take a Docker image and run it from Singularity, or create an image using Singularity only.
Accessing Singularity on HPC
Note: During the October 26, 2022 maintenance window, Singularity was removed and replaced with Apptainer. The singularity command is retained as an alias for apptainer, so existing commands and scripts continue to work.
Accessing Apptainer on HPC
Apptainer is installed on the operating systems of all HPC compute nodes, so it can be easily accessed either from an interactive session or a batch script without needing to load any software modules.
Pulling Images
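Images can be pulled directly from public registries. As a minimal sketch (assuming a public image on Docker Hub, such as ubuntu:22.04, purely for illustration):
$ singularity pull docker://ubuntu:22.04
This downloads the image and converts it to Singularity's format, producing ubuntu_22.04.sif in the current working directory.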
Building a Container
Local Builds
Building a container locally requires root privileges, which users do not have on HPC. This means you must use a Mac or Linux workstation where you have sudo privileges and Singularity installed. The Sylabs website has instructions that can help users get started on building their own containers. Additionally, Nvidia provides HPC Container Maker, which lets you build a recipe without having to know all the syntax: you just include the building blocks you need (e.g., CUDA or InfiniBand) and it creates a recipe that you can use for a build on your local workstation, as sketched below.
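As a rough sketch (assuming HPC Container Maker has been installed with pip install hpccm; the recipe below is purely illustrative), a recipe is a short Python script made of building blocks:
# recipe.py -- hypothetical HPC Container Maker recipe
Stage0 += baseimage(image='ubuntu:20.04')
Stage0 += gnu()                     # GNU compiler toolchain
Stage0 += openmpi(version='4.0.5')  # OpenMPI built from source
Converting it into a Singularity definition file that you can then build on your local workstation:
$ hpccm --recipe recipe.py --format singularity > Singularity.def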
Remote Builds
To bypass the need for root privileges when building your container, the Sylabs Cloud remote builder (https://cloud.sylabs.io) lets you build and keep containers in the cloud as well as share them with other users. You maintain your recipes there, and each time you need one it is built remotely and retrieved to your workstation. This also conveniently allows you to build containers directly from HPC.
As an example, if you want to build a container in your account, first go to https://cloud.sylabs.io, generate an access token (API key), and save it to your clipboard. Next, log in to an interactive terminal session and find your recipe file. In this example, we'll use the recipe:
BootStrap: docker
From: nersc/ubuntu-mpi:14.04
%runscript
echo "This is what happens when you run the container..." |
Then, assuming the recipe is stored in our home directory, we can build it remotely using:
$ singularity remote login # paste in your API key at the prompt
$ singularity build --remote ~/nersc.sif ~/nersc.recipe
This will produce a .sif file in your home directory that is ready for use.
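For example, running the resulting image executes the %runscript defined in the recipe above:
$ singularity run ~/nersc.sif
This is what happens when you run the container...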
Singularity, Nvidia, and GPUs
Pulling Nvidia Images
The NVIDIA GPU Cloud (NGC) provides GPU-accelerated HPC and deep learning containers for scientific computing. NVIDIA tests HPC container compatibility with the Singularity runtime through a rigorous QA process. Application-specific information may vary so it is recommended that you follow the container-specific documentation before running with Singularity. If the container documentation does not include Singularity information, then the container has not yet been tested under Singularity.
With the introduction of Apptainer during the October 26, 2022 maintenance cycle, remote builds on Sylabs are no longer supported. Instead, in most cases it should be possible to build your images directly on a compute node.
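For example, from an interactive session on a compute node (the recipe and image names below are placeholders):
$ apptainer build local_image.sif my_recipe.def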
This has been tested for recipes bootstrapping off of Docker images. We have found that it may not work in every case; if a build fails on a compute node, try a local build instead.
Apptainer, Nvidia, and GPUs
For example, to pull the NAMD container from NGC and build a local Singularity image:
$ singularity build ~/namd.simg docker://nvcr.io/hpc/namd:2.12-171025
Running
Directory access:
Singularity containers are read-only, so to read from or write to host directories from inside a container you must bind them in at runtime using the -B flag, in the form -B <host_src_dir>:<container_dst_dir>.
You may also make use of the --pwd <container_dir> flag, which sets the present working directory of the command to be run within the container.
Ocelote does not support filesystem overlay, so the container_dst_dir must already exist within the image for a bind to be successful. To get around the inability to bind arbitrary directories, $HOME and /tmp are mounted in automatically and may be used for application I/O.
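For example (the host path and image name below are placeholders), to bind a host directory to /data inside the container and run a command from that directory:
$ singularity exec -B /xdisk/your_pi/your_analysis:/data --pwd /data mycontainer.sif ls
On Ocelote, /data would need to already exist inside the image for this bind to succeed.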
GPU support:
All NGC containers are optimized for NVIDIA GPU acceleration, so you will always want to add the --nv flag to enable NVIDIA GPU support within the container.
Standard run command:
The Singularity command below represents the canonical form that will be used on the Ocelote cluster.
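A sketch of that canonical form, with placeholders in angle brackets (the exact image, bind paths, and command depend on your workflow):
$ singularity exec --nv --pwd <container_dir> <image.sif> <command>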
Containers Available on HPC
We support the use of HPC and ML/DL containers available on NVIDIA GPU Cloud (NGC). Containers for many popular HPC applications, including NAMD, LAMMPS, and GROMACS, are optimized for performance and available to run with Apptainer on Ocelote or Puma. The containers and their respective README files can be found in /contrib/singularity/nvidia.
Note that they are only available from compute nodes, so start an interactive session if you want to view them. We do not update these very often since it is time-consuming and some of them change frequently, so we encourage you to pull your own images from Nvidia.
Tutorials
- The Sylabs GitHub site has files and instructions for creating sample containers.
- Our GitHub repository has Singularity examples available that can be run on HPC.
Simple Example
The lolcow image is often used as the standard "hello world!" introduction to containers and is described in Singularity's documentation. To follow their example, first start by logging into an interactive terminal session and pull the image; pulled images and cached layers are stored in your home directory under .singularity. Next, run the image simply using singularity run.
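A minimal sketch of those steps, assuming the commonly used godlovedc/lolcow image from Docker Hub:
$ singularity pull docker://godlovedc/lolcow
$ singularity run lolcow_latest.sif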
Sharing Your Containers
If you have containers that you would like to share with your research group or the broader HPC community, you may do so in the designated shared space. To do this, start an interactive session and change to that directory.
Next, create a directory, set the group ownership, and set the permissions. For example, if you wanted your directory to be writable only by you and accessible to the whole HPC community, you could run commands like the following (substituting your own directory and group names):
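A sketch of those steps, using a hypothetical directory name shared_containers and group name mypigroup:
$ mkdir shared_containers
$ chgrp mypigroup shared_containers
$ chmod 755 shared_containers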
Next, add any images you'd like to share to your new directory, for example:
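For instance, assuming the lolcow image pulled earlier and the hypothetical directory above:
$ cp ~/lolcow_latest.sif shared_containers/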
As soon as your images are in this location, other HPC users can access them interactively or in a batch script. An example batch job is shown below:
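A minimal sketch of such a job; the account name and image path are placeholders:
#!/bin/bash
#SBATCH --job-name=shared-container-job
#SBATCH --account=your_pi
#SBATCH --partition=standard
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --time=00:10:00
singularity run /path/to/shared_containers/lolcow_latest.sif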
Submitting the job and checking the output:
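For example, assuming the script above is saved as shared_container.slurm (the job ID shown is hypothetical):
$ sbatch shared_container.slurm
Submitted batch job 1234567
$ cat slurm-1234567.out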
Running Singularity in a Batch Job
Running a job with Singularity is as easy as running other jobs: simply include your resource requests and any commands necessary to execute your workflow. For more detailed information on creating and running jobs, see our SLURM documentation or Puma Quick Start. An example script might look like:
#!/bin/bash
#SBATCH --job-name singularity-job
#SBATCH --account=your_pi
#SBATCH --partition=standard
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --time=01:00:00
date
# Run your workflow inside the container (the image name and command below are placeholders)
singularity exec ~/my_image.sif my_command
Example Recipe Files
CentOS with TensorFlow
BootStrap: yum
OSVersion: 7
MirrorURL: http://mirror.centos.org/centos-%{OSVERSION}/%{OSVERSION}/os/$basearch/
Include: yum
# best to build up container using kickstart mentality.
# ie, to add more packages to image,
# re-run bootstrap command again.
# bootstrap on existing image will build on top of it, not overwriting it/restarting from scratch
# singularity .def file is like kickstart file
# unix commands can be run, but if there is any error, the bootstrap process ends
%setup
# commands to be executed on host outside container during bootstrap
%post
# commands to be executed inside container during bootstrap
# add python and install some packages
yum -y install vim wget python3 epel-release
# install tensorflow
pip3 install --upgrade pip
pip3 install tensorflow-gpu==2.0.0-rc1
# create bind points for storage.
mkdir /xdisk
mkdir /groups
exit 0
# %runscript
# commands to be executed when the container runs
# %test
# commands to be executed within container at close of bootstrap process
python --version
To build and test a container from this recipe from an interactive session on a GPU node:
$ singularity build centosTflow.sif centosTflow.def # Remember, you will need to either build this on a workstation where you have root privileges or use a --remote build
$ singularity exec --nv centosTflow.sif python3 TFlow_example.py
As a TensorFlow example, you could save the following script as TFlow_example.py:
# Linear Regression Example with TensorFlow v2 library
from __future__ import absolute_import, division, print_function
#
import tensorflow as tf
import numpy as np
rng = np.random
#
# Parameters.
learning_rate = 0.01
training_steps = 1000
display_step = 50
#
# Training Data.
X = np.array([3.3,4.4,5.5,6.71,6.93,4.168,9.779,6.182,7.59,2.167,
              7.042,10.791,5.313,7.997,5.654,9.27,3.1])
Y = np.array([1.7,2.76,2.09,3.19,1.694,1.573,3.366,2.596,2.53,1.221,
              2.827,3.465,1.65,2.904,2.42,2.94,1.3])
n_samples = X.shape[0]
#
# Weight and Bias, initialized randomly.
W = tf.Variable(rng.randn(), name="weight")
b = tf.Variable(rng.randn(), name="bias")
# Linear regression (Wx + b).
def linear_regression(x):
    return W * x + b
# Mean square error.
def mean_square(y_pred, y_true):
    return tf.reduce_sum(tf.pow(y_pred - y_true, 2)) / (2 * n_samples)
# Stochastic Gradient Descent Optimizer.
optimizer = tf.optimizers.SGD(learning_rate)
#
# Optimization process.
def run_optimization():
    # Wrap computation inside a GradientTape for automatic differentiation.
    with tf.GradientTape() as g:
        pred = linear_regression(X)
        loss = mean_square(pred, Y)
    # Compute gradients.
    gradients = g.gradient(loss, [W, b])
    # Update W and b following gradients.
    optimizer.apply_gradients(zip(gradients, [W, b]))
#
# Run training for the given number of steps.
for step in range(1, training_steps + 1):
    # Run the optimization to update W and b values.
    run_optimization()
    if step % display_step == 0:
        pred = linear_regression(X)
        loss = mean_square(pred, Y)
        print("step: %i, loss: %f, W: %f, b: %f" % (step, loss, W.numpy(), b.numpy()))
MPI
Singularity supports MPI applications using a hybrid approach: an MPI stack is installed both on the host and inside the container, and the two must be compatible (ideally the same implementation and version).
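A sketch of launching an MPI program this way (the image name, program path, and task count below are placeholders):
$ mpirun -n 4 singularity exec my_mpi_image.sif /opt/app/mpi_program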
Cache Directory
To speed up image downloads and make builds and pulls less redundant, Apptainer sets a cache directory in your home under ~/.apptainer. This directory stores images, metadata, and Docker layers that can wind up being reasonably large. If you're struggling with space usage and your home's 50 GB quota, one option is to set a new Apptainer cache directory by pointing the environment variable APPTAINER_CACHEDIR at a different location; see Apptainer's documentation for details.
For example, if you wanted to set your cache directory to your PI's /groups directory under a directory you own, you could use:
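A sketch of this, using hypothetical group and user directory names:
$ mkdir -p /groups/your_pi/your_netid/.apptainer
$ export APPTAINER_CACHEDIR=/groups/your_pi/your_netid/.apptainer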
Assuming you want this to be your default cache location, you can make the change persistent by adding the export line to your ~/.bashrc.