Creating a Custom R Library
R packages can be finicky. See Switching Between Custom Libraries and Common Problems below to help with frequent user issues.
Creating Your First Library
Make a local directory to store your packages:
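A minimal sketch of this step, assuming you want the library at ~/R/library (the path is only an illustrative choice; any writable location works):

```shell
# Hypothetical location for the custom library.
R_LIB_DIR="$HOME/R/library"

# -p creates parent directories as needed and does not fail if the
# directory already exists.
mkdir -p "$R_LIB_DIR"
```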
Tell R where the directory is by creating an environment file:
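One way to do this from the command line is sketched below. It assumes the library lives at ~/R/library and uses the R_LIBS_USER variable, which R reads from ~/.Renviron at startup; your site may prefer a different variable or path.

```shell
# Hypothetical library path; it should match the directory you made in
# the previous step.
R_LIB_DIR="$HOME/R/library"

# ">>" appends, so any settings already in ~/.Renviron are preserved.
echo "R_LIBS_USER=$R_LIB_DIR" >> "$HOME/.Renviron"

# Show the result.
cat "$HOME/.Renviron"
```

R only adds the directory to its search path if it exists when R starts, so create the directory before launching R.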
That's it! Now you can install packages normally. For example, to install and load the package "ggplot2":
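As a rough sketch, the R commands for this can be saved to a small script and run from the terminal. The file name install_ggplot2.R and the CRAN mirror URL are assumptions made for illustration.

```shell
# Write the R commands to a script file (the name is arbitrary).
cat > install_ggplot2.R <<'EOF'
# install.packages() installs into the first entry of .libPaths(), which
# is your custom library once ~/.Renviron is set up.
install.packages("ggplot2", repos = "https://cloud.r-project.org")
library(ggplot2)
# Confirm which library the package was loaded from.
print(find.package("ggplot2"))
EOF

# Run it once R is available in your session:
#   Rscript install_ggplot2.R
```

The same two lines, install.packages("ggplot2") and library(ggplot2), can also be typed directly at the R prompt.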
Switching Between Custom Libraries
If you're using different versions of R, we recommend you use different libraries. See Common Problems below for more information. When creating a library, consider including pertinent information in the name such as R version. For example:
If you start by using R version 4.0, following the instructions provided above:
If you later decide to switch to R version 4.1, instead of using your existing library, create a new one:
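The naming idea might look like the sketch below, where the directory names library_4.0 and library_4.1 are hypothetical examples of putting the R version in the library name:

```shell
# One library directory per R version.
mkdir -p "$HOME/R/library_4.0"   # used while working with R 4.0
mkdir -p "$HOME/R/library_4.1"   # created later, when moving to R 4.1

# List the libraries you have so far.
ls -d "$HOME"/R/library_*
```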
To use your new library, edit your .Renviron file:
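If you would rather not open an editor, the switch can be scripted. This is only a sketch: it assumes the new library is ~/R/library_4.1 and that the library is set through R_LIBS_USER (or R_LIBS) in ~/.Renviron.

```shell
RENVIRON="$HOME/.Renviron"
NEW_LIB="$HOME/R/library_4.1"   # hypothetical path for the R 4.1 library

mkdir -p "$NEW_LIB"
touch "$RENVIRON"

# Drop any previous library line, keep every other setting, then add the
# new library so that exactly one library setting remains.
grep -vE '^R_LIBS(_USER)?=' "$RENVIRON" > "$RENVIRON.tmp" || true
echo "R_LIBS_USER=$NEW_LIB" >> "$RENVIRON.tmp"
mv "$RENVIRON.tmp" "$RENVIRON"

cat "$RENVIRON"
```

Restart R afterwards so it rereads the file.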
Now you can go about your business and install as you normally would.
Common Problems
Working on a cluster without root privileges can lead to complications. Some are listed below with suggested solutions:
- General Installation Questions
Solution: Check out http://www.r-bloggers.com/installing-r-packages/ for more information.
- A Corrupted Environment: One of the most common reasons R packages won't install is an altered environment. Most frequently this is caused by the presence of Anaconda (or Miniconda) installed locally.
Solution: If Anaconda is present, follow the instructions in the Resolving Anaconda Issues section below. Otherwise:
Look for any of the file types listed below on your account. If you find them, remove them (make a backup somewhere if you need them) and try the installation again.
- Saved R sessions. If this is the case, after starting a session, you will get the message "[Previously saved workspace restored]". Old sessions are saved as a hidden file .RData in your home directory.
- Gnu compilers
- Windows files
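For the saved-session case, a cautious approach is to move the file aside rather than delete it. The sketch below assumes the workspace was saved as ~/.RData; the dated backup name is an arbitrary choice.

```shell
# Move a saved workspace out of the way instead of deleting it outright.
if [ -f "$HOME/.RData" ]; then
    mv "$HOME/.RData" "$HOME/.RData.bak.$(date +%Y%m%d)"
    echo "Moved ~/.RData to a dated backup"
else
    echo "No saved workspace found in $HOME"
fi
```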
Solution: Double-check that you have an .Renviron file. This is a hidden file located in your home directory and should set the path to your custom R library. If you do not have a custom library name set up, R will create one for you saved as something like:
This directory can lead to unwanted behavior. For example, if you're trying to use a new custom library (such as when switching R version), R will still search x86_64-pc-linux-gnu-library for package dependencies and may cause installs to fail. To fix this, rename these types of folders something unique and descriptive.
To set up/switch custom libraries, follow the instructions in the Creating a Custom R Library section above.
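Renaming the automatically created library could look like the sketch below. It assumes the folder sits at ~/R/x86_64-pc-linux-gnu-library, a common default on Linux builds of R; check the actual location on your account, and treat the new name as a placeholder.

```shell
# Assumed default per-user library location; yours may differ.
DEFAULT_LIB="$HOME/R/x86_64-pc-linux-gnu-library"

if [ -d "$DEFAULT_LIB" ]; then
    # Hypothetical new name; choose something unique and descriptive.
    mv "$DEFAULT_LIB" "$HOME/R/old_default_library"
    echo "Renamed $DEFAULT_LIB"
else
    echo "No default library found at $DEFAULT_LIB"
fi
```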
- Mixing R Versions: Because HPC is a cluster where multiple versions of R are available, users should take care to avoid mixing and matching. Because packages often depend on one another, libraries using different versions of R can turn into a tangled mess. Common errors that can crop up include: "Error: package or namespace load failed."
Solution: If you're switching R versions, we recommend creating a new library.
OOD RStudio Issues: OOD RStudio is a great tool! Sometimes though, because it's a different environment than working directly from the terminal, you may run into problems. Specifically, these typically arise for installs or when using packages that rely on software modules.
Package Installations: If you're trying to install a package in an OOD RStudio session and you've tried all the troubleshooting advice above without luck, try starting R in the terminal and give the installation another try. You can start an R session in the terminal using:
Remember that when you log into HPC, you're on a login node, so you'll need to start an interactive session to access R and run the installation.
Accessing Modules: RStudio does not have access to module load commands. This means that if you have a package that relies on a system module, the easiest option is to work through an interactive terminal session.
The alternative is to modify your RStudio environment. For example, the library hdf5r relies on the hdf5 software module. If you try to load hdf5r, you will get an error complaining about a shared object file. To get around this, you will need to manually add that shared object to your environment using dyn.load(). For example:
This requires that you know the location of the relevant file(s). These can usually be tracked down by looking at your system path variables (e.g. LD_LIBRARY_PATH) after loading the relevant module in a terminal. It should be noted that modifying your system paths from RStudio will not help since RStudio has its own configuration file that overrides these.
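Tracking down the file can be done with a small helper like the one sketched here. The function name find_shared_lib and the pattern libhdf5*.so* are illustrative assumptions; the actual file name depends on the module you loaded.

```shell
# Print every file matching a name pattern in the directories listed in
# LD_LIBRARY_PATH. Usage: find_shared_lib 'libhdf5*.so*'
find_shared_lib() {
    pattern="$1"
    echo "${LD_LIBRARY_PATH:-}" | tr ':' '\n' | while read -r dir; do
        # Skip empty or missing entries.
        [ -d "$dir" ] || continue
        find "$dir" -maxdepth 1 -name "$pattern" 2>/dev/null
    done
}

# Run this in a terminal after loading the relevant module.
find_shared_lib 'libhdf5*.so*'
```

A path printed by this helper is what you would pass to dyn.load() in your RStudio session before loading hdf5r.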
- Software Installed in Non-Standard Locations: When packages are dependent on 3rd party software, particularly when the software is installed locally, R can have trouble finding it. This can usually be fixed by changing your environment paths but can sometimes be challenging.
- Packages that require 3rd party software should be installed in a terminal session and not through an OOD RStudio session.
- Check whether the software you need is installed as a module using the module avail command.
- If you know which paths need to be changed, point them to the correct location.
- Search online help forums such as R-Bloggers, Stack Exchange, Stack Overflow, etc. for your specific error. It's likely others have experienced the same problem you're encountering and know where the trouble spots are.
- If you're in too deep, reach out to the consultants with a support ticket.
- Sometimes it can't be helped and you need the software installed globally. Submit a software installation request to get things up and running. Note: there is no expected timeframe for software requests.
Resolving Anaconda Issues
When Anaconda is initialized, your .bashrc file is edited so that Anaconda becomes the first entry in your PATH variable. This can cause all sorts of mayhem. To get around this, you can either remove Anaconda from your PATH and deactivate your environment, or comment out/delete the initialization in your ~/.bashrc if you want the change to be permanent.
Turn off Auto-activation
Anaconda's initialization will tell it to automatically activate itself when you log in (when Anaconda is active, you will see an environment name such as "(base)" preceding your command prompt). To disable this behavior, run the following from the command line in an interactive terminal session:
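A sketch of the command, assuming conda is on your PATH. The check around it is only there so the snippet does nothing harmful on an account without conda.

```shell
# Stop conda from activating its base environment at login. The setting
# is stored in your ~/.condarc file.
if command -v conda >/dev/null 2>&1; then
    conda config --set auto_activate_base false
else
    echo "conda not found on PATH; nothing to change"
fi
```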
This will suppress anaconda's activation until you explicitly call conda activate and is a handy way to have more control over your environment. Once you run this, you will either need to log out and log back in again to make the changes live, or you can follow the instructions in the section below.
You can either use the command conda deactivate and then manually edit your PATH variable to remove all instances of anaconda/miniconda or copy the following and run it in your terminal:
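One possible version of that snippet is sketched below. The helper name strip_conda_from_path is made up for this example, and it assumes the conda install directory has "anaconda" or "miniconda" somewhere in its path.

```shell
# Return a PATH-style string with every anaconda/miniconda entry removed.
strip_conda_from_path() {
    echo "$1" | tr ':' '\n' | grep -viE 'anaconda|miniconda' | paste -sd: -
}

# Leave any active environment first (ignored if conda is not set up).
conda deactivate 2>/dev/null || true

# Rebuild PATH without the conda directories and show the result.
PATH="$(strip_conda_from_path "$PATH")"
export PATH
echo "$PATH"
```

This only affects the current terminal session; a new login will pick up whatever your ~/.bashrc sets.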
Your .bashrc file configures your environment each time you start a new session. You may consider making a backup before editing in case of unwanted changes.
Note: this change will remove anaconda from all future terminal sessions but will not make the changes live right away. To make the changes live, either follow the instructions above for removing anaconda from your PATH, or log out and back in again.
Then comment out or delete the following lines and the text in between:
To exit and save, use control+x and follow the prompts.
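If you prefer not to edit the file by hand, the same change can be sketched with sed. This assumes the block is delimited by the marker lines "# >>> conda initialize >>>" and "# <<< conda initialize <<<" that conda init writes; the function name comment_out_conda_init is hypothetical.

```shell
# Comment out the conda initialization block in a shell startup file.
# A copy of the original is saved next to it with a .bak suffix.
comment_out_conda_init() {
    sed -i.bak \
        '/^# >>> conda initialize >>>/,/^# <<< conda initialize <<</ s/^/#/' \
        "$1"
}

# Apply it to ~/.bashrc if the file exists.
if [ -f "$HOME/.bashrc" ]; then
    comment_out_conda_init "$HOME/.bashrc"
fi
```

To undo the change, copy ~/.bashrc.bak back over ~/.bashrc.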
Open OnDemand Application
We provide access to the popular development environment RStudio through our Open OnDemand web interface. This is a very handy tool, though it should be noted that it is a less flexible environment than using R from the command line. This is because RStudio sets its own environment, which prevents easy access to third party software installed as system modules. These issues can sometimes be worked around by following the guide in the debugging section above.
In some circumstances, you may want to run RStudio using your own Singularity image. For example, this allows access to different versions of R not provided when using our OOD application. We have some instructions on how to do this below.
First, log into HPC using an Open OnDemand Desktop session and open a terminal. A Desktop session is the easiest solution to access RStudio since it eliminates the need for port forwarding.
In the terminal, make an RStudio directory where all of the necessary files will be stored. In this example, we'll be working in our home directory and will pull an RStudio image from Dockerhub to use as a test. If you're interested, you can find different RStudio images under rocker in Dockerhub.
Next, create the necessary directories RStudio will use to generate temporary files. You will also generate a secure cookie key.
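A rough sketch of this step follows. The directory layout under ~/RStudio and the key file name secure-cookie-key are assumptions made for illustration and should be adjusted to match your rserver.sh script.

```shell
# Hypothetical working directory for the RStudio files.
RSTUDIO_DIR="$HOME/RStudio"

# Directories RStudio Server can write to at runtime.
mkdir -p "$RSTUDIO_DIR/tmp/rstudio-server" \
         "$RSTUDIO_DIR/var/lib" \
         "$RSTUDIO_DIR/var/run"

# Generate a random key and make it readable only by you.
KEY_FILE="$RSTUDIO_DIR/tmp/rstudio-server/secure-cookie-key"
if command -v uuidgen >/dev/null 2>&1; then
    uuidgen > "$KEY_FILE"
else
    # Fallback: 32 random hex characters from /dev/urandom.
    od -An -N16 -tx1 /dev/urandom | tr -d ' \n' > "$KEY_FILE"
fi
chmod 600 "$KEY_FILE"
```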
Next, create a file in your RStudio directory called rserver.sh and make it an executable:
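For instance, assuming your RStudio directory is ~/RStudio (a hypothetical path used throughout these sketches):

```shell
RSTUDIO_DIR="$HOME/RStudio"

# Create the directory if needed, then an empty script file.
mkdir -p "$RSTUDIO_DIR"
touch "$RSTUDIO_DIR/rserver.sh"

# Give yourself permission to execute it.
chmod u+x "$RSTUDIO_DIR/rserver.sh"
```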
Open the file in your favorite editor and enter the content below. Modify the variables under USER OPTIONS to match your account if necessary. You can change PASSWORD to any password you'd like to use. Once you've entered the contents, save and exit:
Now, in your desktop session's terminal, execute the rserver.sh script using:
Next, open a Firefox window and enter "localhost:8787" for the URL. In your browser, you will be prompted to log into your RStudio server. Enter your NetID under Username. Under Password, enter the password you defined in the script rserver.sh.
This will open your RStudio session:
Example R Scripts