
3.1 File Storage Policies

  1. CHPC Home/Group/Project Directory File Systems
    Many of the CHPC file systems are based on NFS (Network File System), and proper management of files is critical to the performance of applications and of the entire network. All files in home directories are NFS mounted from a fileserver, so every request for data must go over the network. Therefore, it is advised that all executables and input files be copied to a scratch directory before running a job on the clusters (see the example job script under Local Scratch in the Scratch Disk Space section below).
    1. Default home directory space in the general environment
      1. The general environment CHPC home directory file system (hpc_home) is available to users who have a CHPC account and do not have a department or group home directory file system maintained by CHPC (see Department or Group owned storage below).
      2. This file system enforces quotas set at 50 GB per user. If you need a temporary increase to this limit, please let us know (helpdesk@chpc.utah.edu) and we may be able to provide the increase. (A simple way to check your current usage is sketched at the end of this item.)
      3. This file system is not backed up.
      4. Users are encouraged to move important data back to a file system that is backed up, such as a department file server.
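      A minimal way to check how much of the home directory quota you are using is to total up the space under your home directory. The commands below are a generic Linux sketch and assume you are logged in to a CHPC interactive node; the exact quota-reporting tools CHPC provides may differ:
        # Total space used under your home directory (compare against the 50 GB quota)
        du -sh ~
        # If NFS quota reporting is enabled on the fileserver, this may also show your limit
        quota -s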
    2. Department or Group owned storage in the general environment
      1. Departments or PIs with sponsored research projects can work with CHPC to procure storage to be used as CHPC Home Directory or Group Storage
      2. Home directory space purchases include full backup as described in the Backup Policies below.
      3. The owner of group storage can arrange for archival backup as described in the Backup Policies below.
      4. Usage Policies of this storage will be set by the owning department/group.
      5. When using shared infrastructure to support this storage it is still expected that all groups be 'good citizens'. By good citizens we mean that utilization should be moderate and not impact other users of the file server.
      6. Quotas
        1. User and/or group quotas can be used to control usage.
        2. The quota layer will be enabled, allowing for reporting of usage even if quota limits are not set.
      7. Any backups of owner home directory space run regularly by CHPC have a two-week retention period; see Backup Policies below.
      8. Life Cycle
        1. CHPC will support storage for the duration of the warranty period of the storage hardware that supports the group space.
        2. Shortly before the end of the warranty period, CHPC will reach out to let groups know of the upcoming retirement of the file system, giving the groups time to either purchase new group space or move their data outside of CHPC.
    3. Archive storage in the general environment
      1. CHPC maintains a Ceph object file system in the general environment.
      2. The archive storage is NOT mounted on any of the CHPC systems.
      3. This space is used for storage of CHPC-run backups.
      4. Groups can purchase storage on this file system to use for owner-driven backups as well as for sharing of data.
    4. Home directory space in the protected environment (PE)
      1. A PE home directory is provided for all users in the PE
      2. This file system enforces quotas set at 50 GB per user. CHPC will NOT increase the size of any PE home directory space.
      3. CHPC provides backup of the PE home directories as described in the Backup Policies below.
    5. Project space in the protected environment (PE)
      1. Each PE project is provided with 250 GB of project space 
      2. Groups can purchase additional project space as needed
      3. The project space is not backed up by default; however, groups can arrange for archival backup as described in the Backup Policies below.
      4. Life Cycle
        1. CHPC will support storage for the duration of the warranty period of the storage hardware that supports the project space.
        2. Shortly before the end of the warranty period, CHPC will reach out to let groups know of the upcoming retirement of the file system, giving the groups time to either purchase new project space or move their data outside of CHPC.
    6. Archive storage in the protected environment (PE)
      1. CHPC maintains a Ceph object file system in the protected environment (PE).
      2. The archive storage is NOT mounted on any of the CHPC systems.
      3. This space is used for storage of CHPC-run backups of PE file systems.
      4. Groups can purchase storage on this file system to use for owner-driven backups as well as for sharing of data.
    7. Web Support from home directories
      1. Place HTML files in a public_html directory in your home directory (see the example below this list).
      2. The published URL is "http://home.chpc.utah.edu/~<uNID>".
      3. Users may request a more human-readable URL that redirects to something like "http://www.chpc.utah.edu/~<my_name>".
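      As a sketch of the steps above, assuming a typical userdir setup in which the web server reads files as a user other than the owner (so permissions along the path must allow world read/traverse); the file name index.html is a placeholder:
        mkdir -p ~/public_html                 # create the web directory in your home directory
        chmod o+x ~                            # allow the web server to traverse your home directory
        chmod -R o+rX ~/public_html            # make the content world-readable
        cp index.html ~/public_html/           # pages then appear at http://home.chpc.utah.edu/~<uNID>/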
  2. Backup Policies
    1. Scratch file systems are not backed up
    2. The general HPC home directory file system (hpc_home) in the general environment is not backed up
    3. Owned home directory space in the general environment and all home directories in the Protected Environment (PE):
      1. The backup of this space is included in the price and occurs on the schedule of weekly full backups with daily incremental backups. The retention window is two weeks.
      2. The backup is done to the archive storage (pando in the general environment, elm in the PE).
    4. Group spaces in the general environment and project spaces in the protected environment are not backed up by CHPC unless the group requests backup and purchases the necessary space on the CHPC archive storage system of that environment (see item 5 below). Note that CHPC documentation also provides information about user-driven backup options on our storage page.
    5. Backup Service: 

       The time it takes to back up a space depends on several factors, including the size of the space and the number of files. With incremental backups, the time required and the space the incremental backup needs depend on the turnover rate of the data, i.e., the amount of the space that has changed since the last backup. When backing up a given group space takes longer than six days, CHPC will work with the group to develop a feasible backup strategy. Typically, this involves determining which files are static (such as raw data files) and therefore only need to be backed up once and then saved (never overwritten in the backup location), which files change regularly and need to be backed up on a regularly scheduled basis, and which files do not need to be backed up at all. Note that this policy was updated 31-Jan-2024, effective 1-April-2024, changing the frequency of backups for group/project spaces that are under 5 TB from a monthly full backup to a quarterly full backup.

      1. For group or project spaces, CHPC will perform a quarterly full backup with weekly incremental backups. Once a new full backup is completed, the previous period's backup is deleted.
      2. For group or project spaces that cannot be backed up within six days, CHPC will also reach out to the group to determine a feasible backup strategy.
      3. Groups interested in a different backup schedule should reach out via helpdesk@chpc.utah.edu to discuss.
    6. To schedule this service, please:
      1. send email to helpdesk@chpc.utah.edu
      2. purchase the necessary archive space
      3. CHPC will perform the archive backup
      4. The archive space must be twice the capacity of the group space being archived, so that we still have a copy of the previous backup for protection if a disaster were to happen mid-archive run.
  3. Scratch Disk Space: Scratch space for each HPC system is architected differently. CHPC offers no guarantee on the amount of /scratch disk space available at any given time.
    1. Local Scratch (/scratch/local):
      1. This space is on the local hard drive of the node and therefore is unique to each individual node and is not accessible from any other node. 
      2. This space is encrypted; each time the node is rebooted the encryption is reset from scratch, which in effect purges the content of this space.
      3. /scratch/local on compute nodes is set such that users cannot create a directory under the top level /scratch/local space.
        1. As part of the Slurm job prolog (before the job is started), a job-level directory, /scratch/local/$USER/$SLURM_JOB_ID, is created and set such that only the job owner has access to the directory. At the end of the job, in the Slurm job epilog, this directory is deleted. (See the example job script at the end of this item.)
      4. There is no access to /scratch/local outside of a job. 
      5. This space will be the fastest, but not necessarily the largest. 
      6. Users should use this space at their own risk.
      7. This space is not backed up
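      As a minimal sketch of staging a run through local scratch (the executable my_app and the file names are placeholders, and required #SBATCH options such as account and partition are site- and group-specific, so they are omitted here):
        #!/bin/bash
        #SBATCH --nodes=1
        #SBATCH --time=01:00:00

        # Job-level directory created by the Slurm prolog and deleted by the epilog
        SCRDIR=/scratch/local/$USER/$SLURM_JOB_ID

        # Stage the executable and input files to local scratch so the run does not go over NFS
        cp $HOME/my_app $HOME/input.dat $SCRDIR/
        cd $SCRDIR

        ./my_app input.dat > output.dat

        # Copy results back to home (or group) space before the job ends; the epilog removes $SCRDIR
        cp output.dat $HOME/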
    2. NFS Scratch:
      1. /scratch/kingspeak/serial is mounted on all general environment interactive nodes and compute nodes.
      2. /scratch/general/pe-nfs1 is mounted on protected environment interactive and compute nodes.
      3. /scratch/general/nfs1 is mounted on all general environment interactive and compute nodes.
      4. Scratch file systems are not intended for use as storage beyond the data's use in batch jobs.
      5. Scratch file systems are scrubbed weekly of files that have not been accessed for over 60 days (see the check sketched at the end of this item).
      6. Each user will be responsible for creating directories and cleaning up after their jobs.
      7. They are not backed up.
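      As a rough check of which of your files would be candidates for the weekly scrub, you can list files not accessed in more than 60 days; this only approximates the criteria CHPC applies, and the path below assumes your data sits under a scratch directory named after your user name:
        # List files under your scratch directory last accessed more than 60 days ago
        find /scratch/general/nfs1/$USER -type f -atime +60 -ls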
    3. Parallel Scratch (/scratch/general/lustre):
      1. This general environment scratch file system is available on all general environment cluster interactive and compute nodes
      2. It is scrubbed weekly of files that have not been accessed for over 60 days.
      3. It is not intended for use as storage beyond the data's use in batch jobs.
      4. It is not backed up.
    4. Owner Scratch Storage
      1. It is configured and made available as per the owner group's requirements.
      2. It is not subject to the general scrub policies that CHPC enforces on CHPC-provided scratch space.
      3. Owners/groups can request automatic scrub scripts to be run per their specifications on their scratch spaces.
      4. It is not backed up.
      5. The quota layer is enabled to facilitate usage reporting.
      6. Quota limits can be configured per the owner's/group's needs.
  4. File Transfer Services

3.2 Guest File Transfer Policy
Last Updated: 1/31/24