Storage Services at CHPC

CHPC currently offers four different types of storage: home directories, group space, scratch file systems, and a new archive storage system. All storage types except the archive storage system are accessible from all CHPC resources. Data on the archive storage system must be moved to one of the other spaces before it can be used on CHPC resources. Home directories and group spaces can also be mounted on local desktops. See the Data Transfer Services page for information on mounting CHPC file systems on local machines, along with details on moving data to and from CHPC file systems.

In addition, we have limited tape backup systems for both home directories and group spaces.

Note that the information below is specific to the general environment. All four types of storage also exist in the protected environment (PE); however, the nature of the storage, the pricing, and the policies differ in the PE. See the Protected Environment page for more details.

***Remember that you should always have an additional copy, or possibly multiple copies, on independent storage systems, for any crucial/critical data. While storage systems built with data resiliency mechanisms (such as the RAID and erasure coding mentioned in the offerings listed below, or other similar technologies) allow for multiple component failures, they do not offer any protection against large-scale hardware failures, software failures leading to corruption, or accidental deletion or overwriting of data. Please take the necessary steps to protect your data to the level you deem necessary.***

Home Directories

CHPC provides several options for home directory file systems.

General HPC home directories

The General HPC home directory storage system is the default home directory file system that is available to groups free of charge. If your group has not purchased storage or you do not fit into one of the other categories listed below, then this is the space where your CHPC home directory will be provisioned. This file system has an enforced 50 GB per-user quota. CHPC can provide temporary increases to this space. THIS SPACE IS NOT BACKED UP. It is assumed that important data will be copied to a departmental file server or another location.
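
As a quick check of how close you are to the 50 GB limit, something like the following can be run from a CHPC login node. This is only a sketch: quota -s reports a figure only if quota reporting is enabled on the file server, and du simply totals the contents of the directory.

    quota -s          # per-user quota report, where the file server exposes quotas
    du -sh $HOME      # total size of everything in your home directory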

Owner home directories - New solution as of August 2017

CHPC currently allows CHPC PIs with sponsored research projects to buy in to storage at a price based on cost recovery. The current limit for this space is 1 TB/group, and all members of the research group will have home directories on this space. The hardware for the current home directory solution was purchased in the summer of 2017. Initially, it was sold at a price of $1250/TB for the 5-year warranty period of the hardware. The current price is prorated to $1000/TB for the remaining warranty lifetime.

This solution is based on an offering from Dell known as their Compellent solution. In this solution there are two RAID disk-based copies, one of which is the primary storage with which users will normally interact. The second copy is used as a failover and is effectively a replicated copy of the primary side. In the case of a hardware issue with the primary copy, the failover will become the active working copy until repairs can be made. For increased performance, solid state drives are used in a transparent manner as a first tier for I/O to this space, in front of the larger-capacity traditional spinning drives. Note that these features, including the failover copy, are present for all home directories, including the default 50 GB ones provided to users whose groups do not purchase the larger home directory space.

Note that having a redundant copy is not a backup solution, as any changes on the primary side, for example deleting or overwriting a file, will be synced to the secondary side. THIS SPACE IS BACKED UP. The price of this solution also includes a backup, with nightly incremental and weekly full backups and a two-week retention window.

We will continue to prorate the cost of this storage based on the remaining warranty time.  When it is time to refresh the hardware, CHPC will contact all groups who have purchased space about the new pricing policy.   Please contact us by emailing helpdesk@chpc.utah.edu and request to meet with us to discuss your needs and timing. 

Group Level Storage File Systems

CHPC currently allows CHPC PIs with sponsored research projects to buy in to file storage at a price based on cost recovery. A more detailed description of this storage offering is available. The current pricing is $150/TB for the lifetime of the hardware, which is purchased with a 5-year warranty. CHPC purchases the hardware for this storage in bulk and then sells it to individual groups in TB quantities, so depending on the amount of group storage space you are interested in purchasing, CHPC may have the storage to meet your needs on hand. Please contact us by emailing helpdesk@chpc.utah.edu and request to meet with us to discuss your needs and timing. BY DEFAULT THIS SPACE IS NOT BACKED UP; HOWEVER, CHPC PROVIDES A BACKUP OPTION.

NOTE: March 2019. We are no longer offering backup of NEW group spaces to tape. We will continue to provide tape backups for group spaces whose owners have already purchased tapes until those group spaces are retired. Details of the new options for backup of group spaces are given in CHPC's Spring 2019 Newsletter as well as in the Backup section below.

New archive backups of group level storage will go to the Archive Storage discussed below. CHPC will perform the backups on a quarterly basis, provided the group purchases enough space on pando to allow for two copies of the data. Contact us at helpdesk@chpc.utah.edu to set up any group space backup. CHPC also provides information on a number of user driven alternatives to this service; see the section on User Driven Backup Options below.

Scratch File Systems

There are various scratch file systems available on the HPC clusters. THE SCRATCH FILE SYSTEMS ARE NOT BACKED UP. This space is provided for users to store intermediate files needed for the duration of a job on one of the HPC clusters. On these scratch file systems, files that have not been accessed for 60 days are automatically scrubbed. There is no charge for this service. A sketch of typical scratch use in a batch job follows the list of scratch file systems below.

The current scratch file systems are:

  • /scratch/general/lustre - a 700TB lustre parallel file system accessible from all CHPC resources
  • /scratch/kingspeak/serial - a 175 TB NFS system accessible from all CHPC resources
  • /scratch/general/nfs1 - a 595 TB NFS system accessible from all CHPC resources
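
As a rough sketch of typical use, a batch job can create its own directory on one of these scratch file systems, run there, and copy results back before it ends. The Slurm directives, directory layout, and file names below are illustrative, not a CHPC requirement:

    #!/bin/bash
    #SBATCH --time=01:00:00

    # Per-job working directory on a scratch file system (path layout is illustrative)
    SCRDIR=/scratch/general/lustre/$USER/$SLURM_JOB_ID
    mkdir -p $SCRDIR

    cp $HOME/myinput.dat $SCRDIR/             # stage input data
    cd $SCRDIR
    ./my_program myinput.dat > myoutput.dat   # run the calculation

    cp myoutput.dat $HOME/results/            # copy results home before the job ends
    rm -rf $SCRDIR                            # clean up; unused files are scrubbed after 60 days anyway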

Temporary File Systems

Linux defines temporary file systems at /tmp and /var/tmp where temporary user and system files are stored. CHPC cluster nodes set up these temporary file systems as a RAM disk with limited capacity. All interactive and compute nodes also have local spinning disk storage at /scratch/local. If a user program is known to need temporary storage, it is advantageous to set the TMPDIR environment variable, which defines the location of temporary storage, and point it to /scratch/local. Local disk drives range from 40 to 500 GB depending on the node, which is much more than the default /tmp size. /scratch/local can also be used for storing intermediate files during a calculation; however, be aware that getting to these files after the job finishes will be difficult, since they are local to the (compute) node and not directly accessible from the cluster interactive nodes.
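
A minimal sketch of pointing TMPDIR at local disk in a batch script follows; the per-job subdirectory naming is an assumption for illustration, not a CHPC convention:

    # Send temporary files to local disk instead of the small RAM-backed /tmp
    export TMPDIR=/scratch/local/$USER/$SLURM_JOB_ID   # per-job subdirectory is illustrative
    mkdir -p $TMPDIR

    ./my_program      # programs that honor TMPDIR write their temporary files there

    rm -rf $TMPDIR    # remove the local files before the job releases the node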

Archive Storage 

CHPC now has a new archive storage solution based on object storage, specifically Ceph, a distributed object store suite originally developed at UC Santa Cruz. We are offering a 6+3 erasure coding configuration, which results in a price of $140/TB of usable capacity for the 5-year lifetime of the hardware. As we currently do with our group space, we will operate this space in a condominium model by reselling it in TB chunks.

This space is a stand-alone entity and will not be mounted on other CHPC resources.

One of the key features of the archive system is that users manage the archive directly, unlike the tape archive option. Users can move data in and out of the archive storage as needed -- they can archive milestone moments in their research, store an additional copy of crucial instrument data, and retrieve data as needed. This archive storage solution is accessible via applications that use Amazon’s S3 API. GUI tools such as Transmit (for Mac) as well as command-line tools such as s3cmd and rclone can be used to move the data. In addition, Globus can be used to access this space; however, note that the Globus Ceph plugin is a new tool that is still being developed and should be treated as such.
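
As an illustration, rclone can talk to an S3-compatible Ceph gateway once CHPC has provided an endpoint and access keys for your allocation. The remote name, bucket name, endpoint URL, and paths below are placeholders, not CHPC-specific values:

    # One-time setup of an S3-type remote named "archive" (name is arbitrary);
    # substitute the endpoint and keys supplied by CHPC
    rclone config create archive s3 provider Ceph \
        access_key_id YOUR_ACCESS_KEY secret_access_key YOUR_SECRET_KEY \
        endpoint https://your-archive-endpoint.example

    # Copy a directory into a bucket on the archive, then list what landed there
    rclone copy $HOME/project_results archive:mybucket/project_results
    rclone ls archive:mybucket/project_results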

It should also be noted that this archive storage space is for use in the general environment, and is not for use with regulated data;  there is a separate archive space in the protected environment.

Backup Policy

The backup policy of the individual file systems is mentioned above.

Note: March 2019. CHPC is migrating the backup of group home directories from tape to the disk-based archive storage mentioned above. At the same time, we started the process of phasing out the backup of group spaces to tape by moving the CHPC-managed quarterly archives of any newly purchased spaces to the archive storage.

For additional information on user driven backup options see the next section.

User Driven Backup Options

Owner backup to Google Drive: There is a University agreement with Google that provides for unlimited storage, and this is an option that a number of CHPC users already use for backup, using rclone. Please keep in mind that Google Drive is only suitable for public data; it is NOT suitable for sensitive or restricted data. Details can be found on the University’s Google Drive page and CHPC's rclone page. One other consideration is that the Google Drive storage is owned by an individual, not by a group.
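
A sketch of the rclone workflow for Google Drive is shown below; the remote name "gdrive" and the destination folder are illustrative, and the one-time rclone config step walks you through the Google OAuth authorization interactively:

    # One-time, interactive setup: add a remote of type "drive" (Google Drive)
    rclone config

    # Preview, then run, a sync of a group space directory to Drive
    rclone sync /path/to/your/group/space gdrive:chpc-backup --dry-run
    rclone sync /path/to/your/group/space gdrive:chpc-backup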

Owner backup to Box: This is an option suitable for sensitive/restricted data. However, there is a file size limitation of 15 GB. In addition, if using rclone, the credentials expire and have to be reset periodically.

Owner backup to pando: This choice, mentioned in the Archive Storage section above, is a good option if a group wishes not to use Google Drive, especially if only a subset of the data needs to be backed up or if a different backup frequency is desired.

Owner backup to other storage external to CHPC: Some groups have access to other storage resources, external to CHPC, whether at the University or at other sites. The tools that can be used for this depend on the nature of the target storage.

There are a number of tools, mentioned on our Data Transfer Services page, that can be used. In several places above we mentioned rclone, which is the tool best suited for transfers to object storage file systems; others are fpsync, a parallel version of rsync suited for transfers between typical Linux "POSIX-like" file systems, and Globus, which is best suited for transfers to and from resources outside of CHPC.
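
As a brief illustration of the difference, fpsync fans a large directory tree out across several parallel rsync workers, while plain rsync runs a single stream; the worker count and paths below are illustrative only:

    # Parallel copy: split the tree into chunks and run 8 concurrent rsync workers
    fpsync -n 8 /path/to/group/space/ /path/to/backup/target/

    # Single-stream equivalent with plain rsync
    rsync -a --progress /path/to/group/space/ /path/to/backup/target/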

In addition we have a page that presents a number of considerations and tips for user driven backups.

Mounting CHPC Storage

For making direct mounts of home and group space on your local machine see the instructions provided on our Data Transfer Services page. 

Additional Information

For more information on CHPC Data policies, visit: File Storage Policies

Last Updated: 5/15/19