This service does not support HIPAA or other protected data.
Research Technologies, in partnership with UITS, is implementing an AWS rental storage solution, necessitated by the expiration of free unlimited Google Drive storage. The documentation below walks researchers through creating an S3 account managed by AWS Intelligent Tiering. After 90 days of nonuse, data are moved to Glacier, and after 90 additional days, to Deep Glacier. There is no charge for data stored at either Glacier level, nor for data transfers. The data can be retrieved at any time, although retrieval can take from a few hours to a day or two depending on the storage level.
This AWS option is called Tier 2, as distinct from Tier 1, the primary storage that is directly connected to the HPC clusters. Tier 1 is very fast, very expensive, and immediately available for active analyses. Tier 2 is intended for data not currently undergoing active analysis and for backups (highly encouraged!). Researchers can use the software Globus to move data to Tier 2 from Tier 1 or from other sources (called endpoints) such as Google Drive. Data in Tier 2 are not mounted on HPC, so Globus is also used to move them back to Tier 1 when needed.
AWS storage is organized in buckets. One S3 intelligent tiering bucket is supported per KFS account. A PI can sponsor multiple buckets by submitting separate requests, each with a unique KFS number, and then grant permissions as they see fit. Note that this differs from Google Drive, where anyone could set up storage on their own.
For any support questions, our consultants use ServiceNow and can be reached with a support ticket.
Very small files (less than 128 KB) are not subject to intelligent tiering and are not migrated to Glacier/Deep Glacier. This means they are permanently stored in the paid storage class. If you have many small files, we recommend making archives of your directories (.tar.gz, .zip, etc.) prior to uploading them to AWS. This will also reduce transfer times significantly.
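As a minimal sketch of the archiving recommendation above, the following Python snippet counts the files in a directory that fall under the 128 KB threshold and packs the directory into a single .tar.gz archive. The function names and the example paths are hypothetical and only for illustration; the 128 KB limit is the figure quoted above.

```python
import tarfile
from pathlib import Path

SMALL_FILE_LIMIT = 128 * 1024  # 128 KB threshold quoted above, in bytes


def count_small_files(directory: str) -> int:
    """Count files under `directory` that are smaller than the threshold."""
    return sum(
        1
        for path in Path(directory).rglob("*")
        if path.is_file() and path.stat().st_size < SMALL_FILE_LIMIT
    )


def archive_directory(directory: str, archive_path: str) -> str:
    """Pack `directory` into a gzip-compressed tar archive and return its path.

    Write the archive outside of `directory`, otherwise the partially
    written archive would be picked up as one of its own members.
    """
    source = Path(directory).resolve()
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps only the top-level folder name inside the archive
        tar.add(str(source), arcname=source.name)
    return archive_path
```

For example, `count_small_files("my_project")` reports how many files would stay in the paid storage class if uploaded individually, and `archive_directory("my_project", "my_project.tar.gz")` produces one archive to upload instead.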
Part of this service is paid for by researchers, and the rest is either subsidized or covered by UITS. Data stored in S3 are billed monthly by AWS to the KFS account provided when the bucket is set up. The first TB is free, meaning that it is covered by UITS. Data migrated to Glacier or Deep Glacier are covered by UITS, as are any transfer or other costs. Refer to AWS's website for more detailed, up-to-date information on storage costs.
- The PI will go to the Portal and request a special AWS allocation. They will need to provide KFS account information including the Department's financial contact for billing purposes.
- Our infrastructure team will create the "S3 bucket". Once the bucket is ready, the PI will be notified by email.
- The PI and their group will set up a Globus endpoint by following the detailed instructions below.
- Once the Globus endpoint is created, data can be moved between the new AWS account and Tier 1, Google Drive, or external data sources.
General Globus usage information is here.
- A bill will be generated monthly for S3 usage beyond the subsidized 1 TB.
Who can submit a request?
First, log into the User Portal and navigate to the Storage tab at the top of the page. Select Submit Tier 2 Storage Request.
This will open a web form. Add your KFS number under KFS Number (the number can be obtained from your Department's financial contact) and the financial contact's email address under Business contact email. There are also two optional fields, Subaccount and Project, which are used for tagging and reporting purposes in KFS billing. You can safely leave these blank if you're not sure what they are. Once you have completed the form, click Send request.
Submitting this form will open a ServiceNow ticket. Processing may take up to a few days. Once your request has been completed, you will receive a confirmation email with a link to subscribe to account alerts (e.g., notifications for a sudden spike in usage).
Checking Your Usage
AWS runs a batch update every night with the results being reported the following day. This means that if you have made any modifications to your allocation, your usage information will not be accurately reflected until the next batch update.
You may check your storage usage at any time in the User Portal. Navigate to the Storage tab, select View Tier 2 Storage, and click Query Usage.
Generate Access Keys
Access keys will allow you to connect your AWS bucket to the software Globus. This will enable you to make transfers directly between HPC and your Tier 2 storage allocation.
To generate an access key, log into the User Portal, navigate to the Storage tab, and select Regenerate IAM Access Key.
This will generate a KeyID and Secret Access Key used to establish the connection. Save these keys somewhere safe since once the window is closed, they cannot be retrieved. If you forget your keys, you can regenerate them.
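One way to keep the keys safe is to store them in an AWS-style shared credentials file that only your user can read. The sketch below is an illustration rather than a required step: the function name and the profile name `tier2` are hypothetical, while the `aws_access_key_id` and `aws_secret_access_key` field names follow the standard AWS credentials file format.

```python
import configparser
import os
from pathlib import Path


def save_access_key(key_id: str, secret_key: str,
                    credentials_file: str, profile: str = "tier2") -> None:
    """Store a key pair under `profile` in an AWS-style credentials file."""
    path = Path(credentials_file).expanduser()
    path.parent.mkdir(parents=True, exist_ok=True)

    # interpolation=None so that a '%' in a key is stored literally
    config = configparser.ConfigParser(interpolation=None)
    config.read(path)  # keep any profiles already present in the file
    config[profile] = {
        "aws_access_key_id": key_id,
        "aws_secret_access_key": secret_key,
    }

    # Create the file readable and writable by the owner only (mode 0600).
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        config.write(handle)
    os.chmod(path, 0o600)  # also tighten permissions on a pre-existing file
```

Calling `save_access_key(key_id, secret_key, "~/.aws/credentials")` right after generating the keys in the portal writes them to the conventional location. Whether a given transfer tool reads this file, and under which profile name, depends on that tool's own configuration.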
The easiest way to transfer files from AWS to HPC is using Globus. We have instructions in our Transferring Files page on how to set up an endpoint to access your AWS bucket as well as how to initiate file transfers.
Some other file transfer programs include rclone and Cyberduck.
Restoring Archived Data
Data that are not touched for at least 90 days are automatically retiered to Glacier, and data not touched for at least 180 days are retiered to Deep Glacier. Files stored in an archival state cannot be transferred out of AWS until they are restored. Restore requests can be submitted through the User Portal under the Storage tab by clicking Restore Archived Tier 2 Storage Object.
This will open a box where you can enter the path to the file or directory in your bucket that you would like to restore.
Once you select an object, click Send Request to initiate the retrieval.
The time it takes for an object to be retrieved is dependent on its storage class. Objects in Glacier may take a few hours while objects in Deep Glacier may take up to a day or two. Once an object has been restored, it will move back up to the frequent access tier and can be downloaded using any transfer method you prefer.
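The tiering rules described on this page can be summarized in a short Python sketch, which may help when judging whether a file is likely to need a restore before transfer. The function names are hypothetical, and the thresholds (90 days, 180 days, 128 KB) and wait times are simply those quoted on this page. AWS performs the actual transitions, so treat the result as an estimate and not as the authoritative state of an object.

```python
SMALL_FILE_LIMIT = 128 * 1024  # files below 128 KB are never archived


def expected_tier(days_since_last_access: int, size_bytes: int) -> str:
    """Estimate the storage tier of an object from its idle time and size."""
    if days_since_last_access < 0 or size_bytes < 0:
        raise ValueError("days and size must be non-negative")
    if size_bytes < SMALL_FILE_LIMIT:
        return "S3"  # small files stay in the paid storage class
    if days_since_last_access >= 180:
        return "Deep Glacier"
    if days_since_last_access >= 90:
        return "Glacier"
    return "S3"


def restore_wait(tier: str) -> str:
    """Return the approximate retrieval wait quoted on this page."""
    waits = {
        "S3": "none, no restore needed",
        "Glacier": "a few hours",
        "Deep Glacier": "up to a day or two",
    }
    return waits[tier]
```

For instance, `restore_wait(expected_tier(200, 5_000_000))` describes the wait to expect for a 5 MB file that has been idle for 200 days.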
Frequently Asked Questions
Official AWS FAQs are here: https://aws.amazon.com/s3/faqs/
For current storage pricing, check the Amazon site: https://aws.amazon.com/s3/pricing/?nc=sn&loc=4
As of March 2022:
- Frequent Access Tier, First 50 TB / Month $0.023 per GB
- Frequent Access Tier, Next 450 TB / Month $0.022 per GB
- Frequent Access Tier, Over 500 TB / Month $0.021 per GB
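To get a rough idea of a monthly bill, the March 2022 rates above can be combined with the 1 TB subsidy in a short Python sketch. Several things here are assumptions and not confirmed billing rules: that 1 TB equals 1024 GB, that the free TB is subtracted before the price tiers are applied, and that only data in the frequent access tier is billed. The function names are hypothetical, and the estimate ignores any other storage classes or fees AWS may apply.

```python
GB_PER_TB = 1024            # assumption: binary units
FREE_GB = 1 * GB_PER_TB     # first TB is covered by UITS

# (tier size in GB, price in USD per GB per month), March 2022 rates
TIERS = [
    (50 * GB_PER_TB, 0.023),
    (450 * GB_PER_TB, 0.022),
    (float("inf"), 0.021),
]


def tiered_cost(gb: float) -> float:
    """Apply the tiered per-GB rates to `gb` gigabytes of storage."""
    remaining = gb
    total = 0.0
    for tier_size, price in TIERS:
        portion = min(remaining, tier_size)
        total += portion * price
        remaining -= portion
        if remaining <= 0:
            break
    return total


def estimate_monthly_bill(frequent_access_gb: float) -> float:
    """Estimate the monthly charge in USD after the 1 TB subsidy."""
    if frequent_access_gb < 0:
        raise ValueError("usage must be non-negative")
    # assumption: the subsidized TB is removed before tiers are applied
    billable_gb = max(0.0, frequent_access_gb - FREE_GB)
    return round(tiered_cost(billable_gb), 2)
```

Under these assumptions, `estimate_monthly_bill(5 * 1024)` (5 TB in the frequent access tier) comes to about $94.21 per month, and any usage at or below 1 TB comes to $0. Always confirm against the current AWS pricing page, since the rates may have changed since March 2022.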