Today's research generates datasets that are increasingly large, complex, and distributed. This makes modern research analysis, archiving, and sharing ever more challenging. Support for advanced techniques to transport, store, manipulate, visualize, and interpret large datasets is critical to advancing modern science.
The University’s Research Data Center provides data storage for active analysis on the high-performance computers (HPCs). Using central computing storage services and resources, University researchers, faculty researchers, and post-doctoral researchers are able to:
- Share research data in a collaborative environment with other UA affiliates on the HPC system
- Store large-scale computational research data
- Request additional storage for further data analysis
New storage is available with the new cluster. All storage will be consolidated for all compute clusters, dramatically increasing capacity.
Ending 2020: A DDN SFA12KX storage array is the primary storage for all the systems, with a sustained performance of 44 GB/second raw I/O and 2PB of raw disk, expandable to 5PB.
Qumulo Storage array (2020-2025)
Check Disk Quota - Disk quotas can be checked through https://portal.hpc.arizona.edu/portal by selecting Storage.
All allocations will be updated between March 7th, 2020 and December 1st, 2020 as members of the community are moved to the new 2020 data array.
Storage locations: /home, /xdisk, and on-node /tmp
Duration: lasts as long as your duration of job
Capacity: 200GB to 1TB
File count limit: 600 files / GB
- If you have been using xdisk, it has been simplified. No more tables to guess the size or duration
- The capacity has been greatly increased - the default size is 200GB and the maximum is 1TB
- The default and maximum duration is 45 days. You can specify less, and you can renew once for a total of 90 days.
- The usage is detailed on this page.
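The duration rules above can be worked through with a small date calculation. This is only a sketch: the function name `xdisk_expiry` is hypothetical, and it assumes that the single renewal adds a full 45 days on top of the original allocation, which is one reading of "a total of 90 days". The actual expiry is whatever the HPC portal reports.

```python
from datetime import date, timedelta

MAX_DAYS = 45  # default and maximum duration stated above


def xdisk_expiry(start: date, days: int = MAX_DAYS, renewed: bool = False) -> date:
    """Estimate when an xdisk allocation created on `start` would expire.

    Assumption (not confirmed by the documentation): the one permitted
    renewal extends the allocation by MAX_DAYS.
    """
    if not 1 <= days <= MAX_DAYS:
        raise ValueError(f"duration must be between 1 and {MAX_DAYS} days")
    total_days = days + (MAX_DAYS if renewed else 0)
    return start + timedelta(days=total_days)


if __name__ == "__main__":
    created = date(2020, 3, 7)
    print("Expires without renewal:", xdisk_expiry(created))
    print("Expires with one renewal:", xdisk_expiry(created, renewed=True))
```

A calculation like this can help plan when data must be moved off xdisk before the allocation ends.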
Is being merged with /home and /group allocations.
- Purchase disk drives to be added to the storage system for dedicated group storage. The end of support for the current array is at the end of 2025.
- The estimated cost for the Qumulo storage array is $120,000 for 133 TB. This is more expensive than our previous clusters. The rental program based on the DDN array will no longer be available.
- For groups that need less than 133TB, the free storage allocations or /xdisk are the best option.
- This space is NOT backed up
- Files that need to be kept for 1-3 years can be offloaded to other platforms, for example: the University of Arizona Google Drive.
We strongly recommend that you do some regular housekeeping of your allocated space.
Millions of files are hard to manage for both the user and systems support. Archiving them with a tool like tar will help keep our disk arrays efficient.
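As an illustration of the archiving advice, the following sketch packs a directory of many small files into one compressed tar archive using Python's standard `tarfile` module. The function name `archive_directory` and the example paths are hypothetical. The original files are left in place, so they still need to be removed separately once the archive has been verified.

```python
import tarfile
from pathlib import Path


def archive_directory(src_dir: str, archive_path: str) -> int:
    """Pack every regular file under src_dir into one gzip-compressed tar file.

    Returns the number of files added. Write the archive OUTSIDE src_dir,
    otherwise the archive would be picked up while it is being written.
    """
    src = Path(src_dir).resolve()
    count = 0
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                # Store names relative to the parent directory so the
                # archive unpacks into a single top-level folder.
                tar.add(path, arcname=str(path.relative_to(src.parent)))
                count += 1
    return count


if __name__ == "__main__":
    # Hypothetical paths; substitute your own directories.
    added = archive_directory("results/run1", "run1.tar.gz")
    print(f"Archived {added} files")
```

The command-line equivalent is `tar -czf run1.tar.gz run1`, run from the parent directory.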
Reading or writing millions of files will likely cause response time issues for other users. Please ask our consultants for ideas on efficient use of storage.
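For housekeeping, it can help to know how many files a directory tree holds relative to the 600 files / GB limit quoted on this page. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and which allocation the limit applies to should be confirmed in the HPC portal. Note that walking a very large tree is itself a metadata-heavy operation, so run it sparingly.

```python
import os

FILES_PER_GB = 600  # file count limit quoted on this page


def file_count_limit(allocation_gb: int) -> int:
    """Maximum number of files for an allocation of the given size in GB."""
    return FILES_PER_GB * allocation_gb


def count_files(root: str) -> int:
    """Count the files (not directories) under root, recursively."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total


def files_remaining(root: str, allocation_gb: int) -> int:
    """Files that can still be created before reaching the limit.

    A negative result means the tree is already over the limit.
    """
    return file_count_limit(allocation_gb) - count_files(root)


if __name__ == "__main__":
    # A 200GB allocation allows 600 * 200 = 120,000 files.
    print("Limit for 200GB:", file_count_limit(200))
    print("Remaining in current directory:", files_remaining(".", 200))
```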
Research computing is implementing an iRODS configuration. This resource will provide large capacity for storing large datasets. iRODS is implemented with policies for the retention of data.
Benefits of iRODS
- iRODS enables data discovery using a metadata catalog that describes every file, every directory, and every storage resource in the data grid.
- iRODS automates data workflows, with a rule engine that permits any action to be initiated by any trigger on any server or client in the grid.
- iRODS enables secure collaboration, so users only need to log in to their home grid to access data hosted on a remote grid.
- iRODS implements data virtualization, allowing access to distributed storage assets under a unified namespace, and freeing organizations from getting locked in to single-vendor storage solutions.
This section has details on how to use iRODS