CHPC provides several options for home directory file systems.
General HPC home directories
The General HPC home directory storage system is the default home directory file system that is available to groups free of charge. If your group has not purchased storage or you do not fit into one of the other categories listed below, then this is the space where your CHPC home directory will be provisioned. This file system has a 50 GB per user quota which is enforced. CHPC can provide temporary increases to this space. THIS SPACE IS NOT BACKED UP. It is assumed important data will be copied to a departmental file server or another location.
INSCC home directories
This file system is available free of charge to groups residing in the INSCC building. Quotas are set at the group level rather than per user, so that space can be shared flexibly among a group's users. This file system is backed up to tape, with nightly incremental and weekly full backups and a two-week retention window.
Owner home directories
CHPC currently allows CHPC PIs with sponsored research projects to buy in to storage at a price based on cost recovery. The current limit for this space is 1 TB per group, and all members of the research group will have home directories on this space. The current pricing is $500/TB for the 5-year warranty period of the hardware. We prorate the charge based on the remaining warranty time. When it is time to refresh the hardware, CHPC will contact all groups who have purchased space about the new pricing policy. This file system is backed up to tape, with nightly incremental and weekly full backups and a two-week retention window. Please contact us by emailing firstname.lastname@example.org and request to meet with us to discuss your needs and timing.
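The exact proration formula is not given here. As a rough sketch only, assuming the charge scales linearly with the number of warranty months remaining (an assumption, not CHPC's stated method), an estimate could be computed as follows. The function name and parameters are hypothetical and exist only for illustration.

```python
def prorated_charge(size_tb, months_remaining, price_per_tb=500.0, warranty_months=60):
    """Estimate a prorated storage charge.

    ASSUMPTION: linear proration by remaining warranty months. The actual
    method CHPC uses is not described in this document.
    """
    if size_tb < 0:
        raise ValueError("size_tb must be non-negative")
    if not 0 <= months_remaining <= warranty_months:
        raise ValueError("months_remaining must be between 0 and warranty_months")
    # Full price for the whole warranty period, scaled by the fraction remaining.
    return size_tb * price_per_tb * months_remaining / warranty_months


# 1 TB bought with 3 of the 5 warranty years (36 of 60 months) remaining.
print(prorated_charge(1, 36))  # 300.0
```

Contact CHPC for the actual charge; this is only a way to get a ballpark figure before that conversation.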
NOTE -- January 2017
CHPC is starting to explore options for the refresh of the current home directory file systems as the hardware that is currently being used for home directories goes out of warranty in August 2017. Note that this includes all home directories in the general environment (NOT the protected environment home directories) – both the default 50GB/user spaces (that do not have tape backup) and the larger home directory spaces purchased by groups, which do have backup support.
The explorations have focused on ways to: (1) improve the overall performance, (2) improve the reliability and availability, (3) increase the amount of home directory space in order to accommodate existing and new groups, and (4) reach a price point that balances cost against improved reliability and performance over our previous generation.
Our preferred solution is one that addresses all of these items. It is based on an offering from Dell known as their Compellent solution. In this solution there are two RAID disk based copies, one of which is the primary storage with which users will normally interact. The second copy is used as a failover, and is effectively a replicated copy of the primary side. In case of a hardware issue with the primary copy, the failover copy will become the active working copy until repairs can be made. For increased performance, solid state drives will be used in a transparent manner as a first tier for I/O to this space, in front of the larger capacity traditional spinning drives. Note that these features, including the failover copy, will be present for all home directories, including the default 50GB ones provided to users whose groups do not purchase the larger home directory space.
For those groups that purchase space, the cost of this proposed option is $1000/TB, and this is the price that we will charge the PI/group. This is a one-time charge for the warranty lifetime of the hardware, which is 5 years from the date of purchase, and this price includes a backup solution (details of this solution are still being considered). Note that a redundant copy is not a backup: any change on the primary side, such as deleting or overwriting a file, is synced to the secondary side. Initially groups will be limited to their current size, and any new purchases are limited to 1TB.
Group Level Storage File Systems
CHPC currently allows CHPC PIs with sponsored research projects to buy in to file storage at a price based on cost recovery. A more detailed description of this storage offering is available. The current pricing is $150/TB for the lifetime of the hardware, which is purchased with a 5-year warranty. CHPC purchases the hardware for this storage in bulk (currently 320TB at a time) and then sells it to individual groups in TB quantities, so depending on the amount of group storage space you are interested in purchasing, CHPC may already have the storage on hand to meet your needs. Please contact us by emailing email@example.com and request to meet with us to discuss your needs and timing.
Archive backups of group level storage are available by request for the cost of the backup tapes. These backups are performed quarterly. We recommend that groups purchase enough tapes for two copies, so that backups can be alternated between the two sets. Contact us at firstname.lastname@example.org for current pricing and to request an archive of your group space.
Scratch File Systems
There are various scratch file systems available on the HPC clusters. THESE FILE SYSTEMS ARE NOT BACKED UP. This space is provided for users to store intermediate files required for the duration of a job on one of the HPC clusters. On these scratch file systems, files that have not been accessed for 60 days are automatically scrubbed. There is no charge for this service.
The current scratch file systems are:
- /scratch/general/lustre - a 700 TB Lustre parallel file system accessible from all CHPC resources
- /scratch/kingspeak/serial - a 175 TB NFS system accessible from all CHPC resources except lonepeak
- /scratch/lonepeak/serial - a 33 TB NFS system accessible from all interactive nodes and from the compute nodes of Lonepeak
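To see which of your scratch files are approaching the 60-day scrub window, a short script can walk a directory and report files whose last access time is older than a cutoff. The following is a minimal sketch, not a CHPC tool: the function name is hypothetical, and it assumes the access time reported by the file system matches what the scrubber uses, which may not hold on file systems mounted with relaxed atime updates.

```python
import os
import time

SCRUB_AGE_DAYS = 60  # scrub threshold stated in the policy above


def files_at_risk(root, max_age_days=SCRUB_AGE_DAYS, now=None):
    """Yield (path, days_since_access) for files under root whose last
    access time is more than max_age_days in the past.

    ASSUMPTION: st_atime reflects the access time the scrubber looks at.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat does not follow symlinks and does not update atime.
                atime = os.lstat(path).st_atime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if atime < cutoff:
                yield path, (now - atime) / 86400
```

For example, `for path, age in files_at_risk("/path/to/your/scratch/dir", max_age_days=50): print(path, round(age))` would list files that have gone unaccessed for more than 50 days, giving some warning before the 60-day limit. The directory path is a placeholder; substitute your own directory on one of the scratch file systems listed above.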
CHPC now has a new archive storage solution based around object storage, specifically Ceph, a distributed object store suite developed at UC Santa Cruz. We have an initial raw capacity of 1.15PB, with a cost of $80/TB of raw space. To calculate the cost per TB of usable space you must consider the replication configuration. Initially, we will be offering a 6+3 erasure coding configuration (six data chunks plus three coding chunks, so 9 TB of raw space holds 6 TB of data), which results in a price of $120/TB of usable capacity for the 5-year lifetime of the hardware. As we currently do with our group space, we will operate this space in a condominium model by reselling this space in TB chunks. This space is a standalone entity, and will not be mounted on other CHPC resources.
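The usable-space price follows directly from the erasure coding overhead. As a small worked sketch (the function names are illustrative, and treating 1.15PB as 1150 TB is an assumption about units):

```python
def usable_price_per_tb(raw_price_per_tb, data_chunks, coding_chunks):
    """Price per usable TB under a k+m erasure coding scheme.

    Each usable TB consumes (k + m) / k TB of raw space.
    """
    overhead = (data_chunks + coding_chunks) / data_chunks
    return raw_price_per_tb * overhead


def usable_capacity_tb(raw_capacity_tb, data_chunks, coding_chunks):
    """Usable capacity left after erasure coding overhead."""
    return raw_capacity_tb * data_chunks / (data_chunks + coding_chunks)


# 6+3 erasure coding at $80 per raw TB: 80 * 9 / 6
print(usable_price_per_tb(80, 6, 3))  # 120.0

# ASSUMPTION: 1.15 PB raw taken as 1150 TB; gives roughly 767 TB usable.
print(f"{usable_capacity_tb(1150, 6, 3):.0f} TB")
```

The same functions can be used to compare other configurations if the offering changes.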
One of the key features of the archive system is that users manage the archive directly, unlike the tape archive option. Users can move data in and out of the archive storage as needed: they can archive milestone moments in their research, store an additional copy of crucial instrument data, or retrieve data as needed. This archive storage solution will be accessible via applications that use Amazon’s S3 API. GUI tools such as Transmit (for Mac) as well as command-line tools such as s3cmd and rclone can be used to move the data. In addition, Globus can be used to access this space; note, however, that the Globus Ceph plugin is a new tool that is still being developed and should be treated as such.
It should also be noted that this archive storage space is for use in the general environment, and is not for use with regulated data; CHPC is actively working on vetting this solution for human genomic data that is covered by NIH’s dbGaP policies.
The backup policies of each type of storage have been described above.
Mounting CHPC Storage
For making direct mounts of home and group space on your local machine see the instructions provided on our Data Transfer Services page.
For more information on CHPC Data policies, visit: File Storage Policies