Files are transferred to shared data storage and not to the bastion node, login nodes, or compute nodes. Because storage is not cluster-specific, your files are accessible on both Ocelote and ElGato.
For small data transfers the web portal offers the most intuitive method.
For data transfers under 100 GB, we recommend sftp, scp, or rsync using filexfer.hpc.arizona.edu.
For large data transfers (over 100 GB) or transfers outside the University, we recommend Globus (GridFTP).
Globus is also efficient for moving data within the system. For example, if you want to make a large transfer from /extra to /rsgrps, using Globus instead of a login node is both easier and lighter on the login node.
The bastion host has limited storage capacity and is not intended for file transfers.
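The size guidance above can be captured in a small shell helper. This is only a sketch: the function name recommend_method is made up for illustration, the size is supplied by hand in whole gigabytes, and treating exactly 100 GB as a large transfer is an assumption, since the guidance does not say which side that case falls on. The web portal option is left out because no size limit for "small" is given.

```shell
# Sketch: suggest a transfer method from a size given in whole GB.
# recommend_method is a hypothetical helper, not an HPC-provided tool.
recommend_method() {
  size_gb="$1"
  if [ "$size_gb" -lt 100 ]; then
    echo "sftp, scp, or rsync via filexfer.hpc.arizona.edu"
  else
    # Assumption: exactly 100 GB is treated as a large transfer.
    echo "Globus (GridFTP)"
  fi
}

recommend_method 20
recommend_method 250
```

To find the size to pass in, something like du -sh my-directory on your local machine gives a quick estimate.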
GridFTP / Globus
GridFTP is an extension of the standard File Transfer Protocol (FTP) for high-speed, reliable, and secure data transfer. Because GridFTP provides more reliable, higher-performance file transfer than protocols such as SCP or rsync, it enables the transmission of very large files. GridFTP also addresses the problem of incompatibility between storage and access systems.
Globus is one way to use GridFTP that the UA supports. To use Globus, first complete a one-time setup that enables your local machine as a Globus endpoint; after that you can transfer files.
Even if you have an active HPC account, if you have never logged in you will not be able to successfully set up and use Globus. If this is the case, when attempting to connect to HPC you will receive the error: "The directory was not found. Please check the path and try again." To log in, you may either connect using the CLI through the bastion or you may log into OnDemand.
Set up Globus Connect Personal endpoint:
1) Go to https://www.globus.org/ and click “Log In” in the top right corner
2) In the “Use your existing organizational login“ box, type in or find “The University of Arizona” and hit Continue
3) This will take you to WebAuth; log in as normal
4) You will end up at the Globus “File Manager” web interface, but wait, there’s more.
5) Choose Endpoints on the left and Create New Endpoint on the top right. Now you have the choice to download Globus Connect Personal.
6) Type a descriptive name for your local computer into the “Endpoint Display Name” box and click “Generate Setup Key”.
7) Copy the key to your clipboard, and also just leave this page open for the moment.
8) Under the Setup Key it returned, you’ll see links to download the software. Click the appropriate software download button for your operating system.
9) Install the software as normal and launch it.
10) It will ask for your setup key; copy and paste it from the website. Globus should now show up as a small “g” icon in your menu bar/system tray.
Transfer files via the Globus interface:
1) Go to the Globus web interface and log in again if you need to
2) You should see a pretty classic “commander”-style file transfer view. You’ll pick an endpoint for each side and then tell it to move files from one to the other. Click the “Endpoint” box on the left-hand side and it should pop up a search interface.
3) Click the “My Endpoints” tab and you should see an entry that matches the “Display Name” you typed in earlier. Click that. The interface will load a view of the files on your local machine.
4) On the right-hand side, click the “Endpoint” box; you should see the same search interface as before.
5) In the search box, type this: arizona#sdmz-dtn
That will take you back to the Transfer Files screen and you should see a list of your files on the HPC system in the pane on the right-hand side.
6) Browse in the left-hand pane to the file(s) you want to transfer to HPC and once you’ve selected them, you should see the arrow facing to the right at the top of the interface light up blue. Click the arrow. You’ll get a green alert box at the top of the screen that says something like “Transfer request submitted successfully. Task id: <a uuid of many letters and numbers>”. This confirms that we have asked Globus very politely to tell your computer to send some files to HPC. It will just start happening.
7) Depending on how large/how many files, it may take a bit to transfer. You can see in-progress transfers by clicking the Activity tab near the top of the screen.
You might want to change your Globus preferences to make directories other than /home available.
filexfer.hpc.arizona.edu is intended to be used for most file transfers.
sftp encrypts data before it is sent across the network. Additional capabilities include resuming interrupted transfers, directory listings, and remote file removal.
- Open an SSH v2 compliant terminal client and navigate to a desired working directory on your local machine.
- sftp NetId@filexfer.hpc.arizona.edu
- **NetId@ can be omitted if it's the same on both local and remote machines**
- Use put or get command at the sftp> prompt for the file transfer
- Type help at the sftp> prompt for commands and their usages
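The same put and get commands can also be collected in a batch file for sftp's -b option. The sketch below only writes such a file and prints the command that would use it; it does not connect anywhere. The NetID and file names are placeholders. Note that batch mode needs non-interactive authentication (for example SSH keys), and whether that is available on the file transfer node is not stated here; if it is not, type the same commands at the sftp> prompt instead.

```shell
# Sketch: write an sftp batch file and print the command that would run it.
# NETID and the file names are placeholders; nothing is transferred here.
NETID="yournetid"
BATCH_FILE="$(mktemp)"

# Each line is a command you could also type at the sftp> prompt.
cat > "$BATCH_FILE" <<'EOF'
put results.tar.gz
get job-output.log
bye
EOF

echo "sftp -b $BATCH_FILE $NETID@filexfer.hpc.arizona.edu"
```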
ftp / lftp
HPC uses the ftp client lftp to transfer files between the file transfer node and remote machines. This can be done by following the steps outlined below:
Due to security risks, it is not possible to ftp to the file transfer node from a remote machine; however, you may ftp from the file transfer node to a remote machine.
Connect to the data transfer node:
Connect to the external host using the command lftp:
Use the commands get and put to transfer files :
For complete documentation on lftp usage:
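As a rough illustration of the steps above, the sketch below prints the two commands involved rather than running them. It assumes you reach the file transfer node with ssh, and the NetID, the external host ftp.example.org, and the file name are all placeholders, not real HPC values. The helper name lftp_steps is made up for this example.

```shell
# Sketch: print the commands for fetching one file with lftp.
# NETID, REMOTE_FTP_HOST, and REMOTE_FILE are placeholders.
NETID="yournetid"
REMOTE_FTP_HOST="ftp.example.org"
REMOTE_FILE="pub/dataset.tar.gz"

lftp_steps() {
  # Step 1: log in to the file transfer node (assumed to be done with ssh).
  echo "ssh $NETID@filexfer.hpc.arizona.edu"
  # Step 2, run on the file transfer node: get one file, then quit lftp.
  echo "lftp -e 'get $REMOTE_FILE; bye' $REMOTE_FTP_HOST"
}

lftp_steps
```

Replacing get with put in the second command would send a file from the file transfer node to the external host instead.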
scp uses Secure Shell (SSH) for data transfer and utilizes the same mechanisms for authentication, thereby ensuring the authenticity and confidentiality of the data in transit.
Moving a File or Directory to the HPC:
- Open an SSH v2 compliant terminal client
- Navigate to a desired working directory on your local machine (laptop or desktop usually)
- To transfer files to HPC storage through the file transfer node: scp -rp filenameordirectory NetId@filexfer.hpc.arizona.edu:subdirectory
**NetId can be omitted if it's the same on both local and remote machines**
- The transferred file will be at the specified directory.
Getting a File or Directory From the HPC:
- Open an SSH v2 compliant terminal client
- Navigate to a working directory on your local machine (laptop or desktop usually)
- "scp -rp NetId@filexfer.hpc.arizona.edu:filenameordirectory ."
**The space followed by a period at the end means the destination is the current directory**
Wildcards can be used for multiple file transfers (e.g. all files with .dat extension):
- scp NetId@filexfer.hpc.arizona.edu:subdirectory/\*.dat . (Note: the backslash " \ " preceding * keeps your local shell from expanding the wildcard)
For More Information Type:
- man scp at the shell prompt
- -r option is good for transferring directories and files in the directories
- -p option is good for preserving time and mode from the original files
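The scp patterns described above can be put side by side as follows. This sketch prints each command instead of executing it, so you can check the source and destination before running anything. The NetID, directory names, and helper function names are placeholders chosen for illustration.

```shell
# Sketch: print scp commands for upload, download, and wildcard download.
# NETID and all paths are placeholders.
NETID="yournetid"
XFER="$NETID@filexfer.hpc.arizona.edu"

# Upload a local file or directory ($1) into a remote subdirectory ($2).
scp_upload() { echo "scp -rp $1 $XFER:$2"; }

# Download a remote file or directory ($1) into the current directory.
scp_download() { echo "scp -rp $XFER:$1 ."; }

# Download by wildcard; the single quotes keep the local shell from
# expanding the pattern before scp sees it.
scp_download_glob() { echo "scp '$XFER:$1' ."; }

scp_upload my-results project
scp_download project/my-results
scp_download_glob 'project/*.dat'
```

Quoting the remote argument has the same effect as the backslash before the asterisk.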
rsync is a fast and extraordinarily versatile file copying tool. It synchronizes files and directories between two different locations (or servers). Rsync copies only the differences of files that have actually changed.
An important feature of rsync not found in most similar programs/protocols is that the mirroring takes place with only one transmission in each direction. Rsync can copy or display directory contents and copy files, optionally using compression and recursion.
You use rsync in the same way you use scp. You must specify a source and a destination, one of which may be remote.
rsync -avz computer-name:src/directory-name /data/tmp --log-file=hpc-user-rsync.log
This would recursively transfer all files from the directory src/directory-name on the machine computer-name into the /data/tmp/directory-name directory on the local machine. The files are transferred in archive mode, which ensures that symbolic links, devices, attributes, permissions, ownerships, etc. are preserved in the transfer. Additionally, compression will be used to reduce the size of data portions of the transfer.
rsync -avz computer-name:src/directory-name/ /data/tmp --log-file=hpc-user-rsync.log
A trailing slash on the source changes this behavior to avoid creating an additional directory level at the destination. You can think of a trailing / on a source as meaning “copy the contents of this directory” as opposed to “copy the directory by name”, but in both cases the attributes of the containing directory are transferred to the containing directory on the destination.
-a archive mode; will preserve timestamps
-v increase verbosity
-z compress file data during the transfer.
--log-file=FILE log what we're doing to the specified FILE.
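Because a trailing slash changes where files land, it can help to preview a transfer with rsync's --dry-run option before doing it for real. The sketch below prints a preview command and the matching real command for sending a local directory to HPC storage; it does not run rsync itself. The NetID, paths, and the helper name rsync_cmd are placeholders for illustration.

```shell
# Sketch: print an rsync command, optionally with an extra flag such as --dry-run.
# NETID and the paths are placeholders.
NETID="yournetid"

rsync_cmd() {
  # $1 = local source, $2 = remote destination, $3 = optional extra flag
  echo "rsync -avz ${3:+$3 }$1 $NETID@filexfer.hpc.arizona.edu:$2 --log-file=hpc-user-rsync.log"
}

# Preview first: --dry-run lists what would be copied without copying it.
rsync_cmd my-results/ project/my-results --dry-run
# Then the real transfer.
rsync_cmd my-results/ project/my-results
```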
The Research Computing test iRODS instance has been dismantled. iRODS servers are available elsewhere (like CyVerse).
There are two ways to use iRODS: from the command line, or on your workstation through a GUI such as Cyberduck.
Note that iCommands cannot be used to upload files into Data Store via URL from other sites (ftp, http, etc.).
To transfer data from an external site, you first must download the file to a local machine using wget or a similar mechanism, and then use iput to upload it to the Data Store.
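The two-step route described above might look like the following. The sketch prints the commands rather than running them; the URL is a made-up example, and the helper name fetch_then_upload is hypothetical.

```shell
# Sketch: print the download-then-upload commands for one external file.
# The URL passed in below is a placeholder, not a real data source.
fetch_then_upload() {
  url="$1"
  file="$(basename "$url")"
  # Step 1: download the file to the local machine.
  echo "wget $url"
  # Step 2: upload the downloaded file to the Data Store (-P shows progress).
  echo "iput -P $file"
}

fetch_then_upload "https://data.example.org/releases/genome.fa.gz"
```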
On Ocelote, iRODS 4 is installed as a standard operating system package on every node, so you do not need to run "module load irods". You will still need to run "iinit" the first time (see below).
For any system using iRODS 4.x
Unlike its iRODS 3 counterpart, iinit in iRODS 4 does not help you set up the environment the first time you run it. For iRODS 4 you need to run create_irods_env manually with suitable options for the iRODS host, zone, username, and so on. As an example, we'll set up for the UA test iRODS instance and presume you have an account there.
|For this key:||Enter this:|
|-h||<hostname of iRODS server>|
|-p||<port number of iRODS server> (1247 is default)|
|-z||<Zone name of iRODS zone>|
|-u||<user name on the iRODS server> (may not match your netid)|
|-a||<authentication method for the iRODS server> (PAM, native,...)|
Running create_irods_env with these options will suffice to create an appropriate ~/.irods/irods_environment.json file that allows you to run iinit; in the above example we took the defaults -p 1247 and -u <your NetId> by omitting -p and -u. You only need to do this step ONE time; afterwards you will just run iinit and it will ask for your password. Note that create_irods_env will NOT overwrite or alter an existing ~/.irods/irods_environment.json file.
Once the ~/.irods/irods_environment.json file is created properly, you should be able to sign in to the iRODS server you selected using iinit, viz:
At this point you can use other iRods commands such as icp to move files.
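Put together, the one-time setup and first sign-in could be sketched as below. The commands are printed, not executed. The host irods.example.org and the zone exampleZone are placeholders, not real servers, and the choice of native authentication is only an assumption; use whatever your iRODS provider specifies. The -p and -u options are omitted to take the defaults described above.

```shell
# Sketch: print the one-time setup and sign-in commands for iRODS 4.
# IRODS_HOST and IRODS_ZONE are placeholders for your provider's values.
IRODS_HOST="irods.example.org"
IRODS_ZONE="exampleZone"

irods_setup_steps() {
  # One time only: create ~/.irods/irods_environment.json.
  echo "create_irods_env -a native -h $IRODS_HOST -z $IRODS_ZONE"
  # Every session: sign in (prompts for your password).
  echo "iinit"
  # Quick check that the connection works.
  echo "ils"
}

irods_setup_steps
```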
|Command||Description|
|icd||Changes working directory|
|ichmod||Sets access permissions. For help, enter ichmod -h.|
|ichmod read <user> <file or folder>||Grant read-only permission level for specified user to selected file or folder.|
|ichmod write <user> <file or folder>||Grant read and write permission level for specified user to selected file or folder.|
|ichmod own <user> <file or folder>||Grant full ownership permission level for specified user to selected file or folder.|
|ichmod null <user> <file or folder>||Remove permission level for the user to the file or folder.|
|iexit||Log off/disconnect from the Data Store.|
|iget||Download file/directory from iRODS to local device|
|iinit||Initialize and start the connection to iRODS|
|ils||Lists contents of current working directory. For help, enter ils -h|
|ils -A||Lists directory permissions|
|imkdir||Creates new directory|
|iput||Uploads file/directory from local device to iRODS|
|ipwd||Shows name and path of current remote folder|
|irm||Moves a file to the trash|
|irm -f||Deletes a file.|
|irm -r||Moves a folder to the trash.|
|irm -fr||Deletes a folder.|
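As an illustration of how these operations combine, the sketch below prints a short sequence that creates a folder, uploads a file into it, grants a collaborator read-only access, and then lists the permissions. It uses the standard iCommands imkdir, iput, ichmod, and ils -A. The folder, file, and user names are placeholders, and the helper name share_folder_steps is made up for this example.

```shell
# Sketch: print the iCommands for sharing one uploaded file read-only.
# Folder, file, and user names are placeholders.
share_folder_steps() {
  folder="$1"
  user="$2"
  echo "imkdir $folder"
  echo "iput results.csv $folder"
  # Grant read-only access to the uploaded file.
  echo "ichmod read $user $folder/results.csv"
  # Confirm the permissions that are now set.
  echo "ils -A $folder"
}

share_folder_steps shared-results collaborator
```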
In the following examples:
- my-files-to-transfer/ is the example name of the directory or folder for bulk transfers.
- my-file-to-transfer.txt is the example name for single file transfers.
- Any filename may be used for the checkpoint-file.
Bulk files transfer
Example: BULK FILES TRANSFER
iput -P -b -r -T --retries 3 -X checkpoint-file my-files-to-transfer/
Single large file transfer
Example: SINGLE LARGE FILE TRANSFER
iput -P -T --retries 3 --lfrestart checkpoint-lf-file my-file-to-transfer.txt
Graphical / Cyberduck
Cyberduck is a free, open-source, cross-platform file transfer program that supports high-throughput, parallel transfers over multiple protocols (FTP, SFTP, WebDAV, Cloud files, Amazon S3, etc.). It serves as an alternative to the iDrop Java applet and has been extensively tested with large data transfers (60-70 GB). How large a file you can transfer in practice depends on your available bandwidth and network settings.
Cyberduck versions are available for Mac OS (10.6 and higher on Intel 64-bit) and Windows (Windows XP, Windows Vista, Windows 7, or Windows 8). Linux users should use iDrop Desktop or iCommands. Cyberduck version 4.7.1 (released July 7, 2015) and later supports the iRODS protocol.
Install or Update Cyberduck
- If Cyberduck is already installed, check if you need to update:
- Click the Cyberduck menu.
- Click Check for Updates.
- If an update is available click Install Update.
- To install Cyberduck for your operating system for the first time:
- Go to the Cyberduck installation page at https://cyberduck.io/.
- Follow the steps for your OS (not available for LINUX users):
- For Mac OS:
- Click Download Cyberduck-5.3.9.zip (or current).
- Move the downloaded file (either a zip file or the unzipped application file, depending on your browser) to your Applications folder. If the zip file is listed, unzip the file in your Applications folder.
IMPORTANT: The file must be located in your Applications folder.
- For Windows:
- Click Download Cyberduck-Installer-5.3.9.exe (or current).
- Locate the downloaded file and double click to begin installation.
- Go through the install process.
Configure Cyberduck for use with iRODS
- Click to open Cyberduck.
See the Cyberduck Preferences Help page on the Cyberduck website for more information on installation.
- Click on Open Connection.
- In the first drop-down field, choose or enter a profile name
- Create the connection:
- The Server field contains <your irods server>
- The Port field contains 1247.
- To create the connection:
- Enter your user name in the username field
- Verify your userid is added to the URL field, as shown above.
- The remaining fields are populated.
7. Click in the Transfer Files drop-down list and select Open multiple connections.
8. Close the window.