Apexarch User Guide
Note: the Apexarch cluster was retired and replaced with the Redwood cluster in early 2018.
Apexarch is an HPC cluster designed and built for researchers whose data contains Protected Health Information (PHI) and other sensitive data. This cluster is considered HIPAA Compliant.
Apexarch cluster hardware overview
- 16 nodes (320 total cores)
- 10 nodes with 28 cores with 128G of memory
- 6 nodes with 8 cores with 24G memory
- Mellanox QDR Infiniband interconnect
- Gigabit Ethernet interconnect for management
- 5.2TB General Scratch server
Important Differences between apexarch and Other CHPC Clusters - NEW!
The design of the protected environment is fundamentally different from that of other CHPC clusters. The general clusters (ember, kingspeak, lonepeak, etc.) were designed to be open (within reason, taking a balanced approach to security). Apexarch was designed primarily to mitigate risk and protect data. If you have used the general clusters, the first thing you will notice when you log in is that the home directory is on a completely different file system, whereas your other CHPC home directory is mounted on all of the general, more open clusters.
FAQ section - NEW!
Please refer to the Protected Environment Frequently Asked Questions (FAQ).
Apexarch Cluster Usage
CHPC resources are available to qualified faculty, students (under faculty supervision), and researchers from any Utah institution of higher education. Users can request accounts for CHPC computer systems by filling out the account request form.
Because Apexarch is part of the CHPC Protected Environment, users must also have permission to access that environment. See the Protected Environment Frequently Asked Questions (FAQ) for how to apply and qualify for access.
Apexarch does not currently use allocations of wall clock core hours for priority access. This may change in the future.
Apexarch Cluster Access and Environment
As part of the CHPC protected environment, Apexarch requires that you undergo authentication more rigorous than is needed for the other CHPC clusters. See the Protected Environment Frequently Asked Questions (FAQ).
Once you are able to connect to the Protected Environment, the Apexarch cluster can be accessed via ssh (secure shell) at the following addresses:
- apex.chpc.utah.edu (general PE users; round robins between apex1.chpc.utah.edu and apex2.chpc.utah.edu)
- poet.chpc.utah.edu (owned by Dr. Hurdle; others may be granted access on request)
Apexarch does not mount the same account directory as the unprotected clusters do. If you have files on your regular CHPC account that you wish to use on Apexarch, you must copy them using a secure protocol such as scp.
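As a minimal sketch of such a copy, the commands below build an scp invocation and print it for review before anything is transferred. The uNID u0123456 and the directory mydata are hypothetical placeholders, and the sketch assumes you run it from a machine that is already allowed to reach the Protected Environment.

```shell
#!/bin/bash
# Hypothetical values: replace with your own uNID and source directory.
UNID="u0123456"
SRC_DIR="$HOME/mydata"
DEST_HOST="apex.chpc.utah.edu"

# Build the command as an array so it can be inspected before it is run.
# The trailing ":" means "the home directory on the remote host".
copy_cmd=(scp -r "$SRC_DIR" "${UNID}@${DEST_HOST}:")

# Print the command for review.
echo "${copy_cmd[@]}"

# Uncomment the next line to perform the copy.
# "${copy_cmd[@]}"
```

Printing the command first is only a convenience; running scp directly works just as well once the names are correct.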
You may also have access to a project directory on the homerfs filesystem. It contains data common to the users of that project, who have been vetted by the Institutional Review Board or another relevant authority as having the right to access it. Project members may also create and modify files there.
At the present time, the CHPC supports two types of shells: tcsh and bash. Tcsh shell users need to select the .tcshrc login script. Users whose shell is bash need the .bashrc file to log in.
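To see which of the two startup files applies to your account, a small check along the following lines may help. The function name startup_file_for is made up for this sketch, and it assumes the SHELL environment variable reflects your login shell.

```shell
#!/bin/bash
# Print the startup file that corresponds to a given login shell path.
# Returns non-zero for shells other than tcsh and bash.
startup_file_for() {
  case "$(basename "$1")" in
    tcsh) echo "$HOME/.tcshrc" ;;
    bash) echo "$HOME/.bashrc" ;;
    *)    echo "unsupported shell: $1" >&2; return 1 ;;
  esac
}

# Report whether the expected startup file exists for the current login shell.
if rc_file="$(startup_file_for "${SHELL:-unknown}")"; then
  if [ -f "$rc_file" ]; then
    echo "found $rc_file"
  else
    echo "missing $rc_file"
  fi
fi
```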
Your environment is set up through the use of modules. Please see the User Environment section of the General Cluster Information page for details on setting up your environment for batch and other applications.
Using the Batch System on Apexarch
The batch implementation on all CHPC systems is Slurm.
Creating a batch script on the Apexarch cluster
A shell script is a bundle of shell commands which are fed one after another to a shell (bash, tcsh, ...). As soon as the first command has finished, the second command is executed. This process continues until the script exits (for example, on an error that terminates the shell) or all of the individual shell commands have been executed. A batch script is a shell script which defines the tasks a particular job has to execute on a cluster.
Below is an example batch script for running under Slurm on the Apexarch cluster. The lines at the top of the file that begin with #SBATCH are interpreted by the shell as comments, but they pass options to Slurm.
Example Slurm Script for Apexarch:
#!/bin/csh
#SBATCH --time=1:00:00        # walltime, abbreviated by -t
#SBATCH --nodes=2             # number of cluster nodes, abbreviated by -N
#SBATCH -o slurm-%j.out-%N    # name of the stdout, using the job number (%j) and the first node (%N)
#SBATCH --ntasks=16           # number of MPI tasks, abbreviated by -n
# additional information for allocated clusters
#SBATCH --account=baggins     # account - abbreviated by -A
#SBATCH --partition=apexarch  # partition, abbreviated by -p
#
# set data and working directories
setenv WORKDIR $HOME/mydata
setenv SCRDIR /scratch/<path>/UNID/myscratch
mkdir -p $SCRDIR
cp -r $WORKDIR/* $SCRDIR
cd $SCRDIR
# load appropriate modules, in this case Intel compilers, MPICH2
module load intel mpich2
# for MPICH2 over Ethernet, set communication method to TCP - for general lonepeak nodes
# see above for network interface selection options for other MPI distributions
setenv MPICH_NEMESIS_NETMOD tcp
# run the program
# see above for other MPI distributions
mpirun -np $SLURM_NTASKS my_mpi_program > my_program.out
For more details and example scripts, please see our Slurm documentation. For help with specifying your job and the instructions in your Slurm script, please also review CHPC Policy 2.3.1 Apexarch Job Scheduling Policy.
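Since bash is also a supported shell, the csh example above could be written for bash roughly as follows. This is a sketch rather than an official CHPC template: the file name slurmjob.apexarch.bash is hypothetical, the account, partition, module names and program name are copied from the csh example, and the <path> and UNID placeholders still have to be replaced with real values. The snippet writes the job script to a file and then asks bash to parse it without running it.

```shell
#!/bin/bash
# Write a bash variant of the example job script, then check its syntax.
# The file name is hypothetical.
job_script="slurmjob.apexarch.bash"

# The quoted 'EOF' keeps variables such as $HOME unexpanded in the file.
cat > "$job_script" <<'EOF'
#!/bin/bash
#SBATCH --time=1:00:00
#SBATCH --nodes=2
#SBATCH -o slurm-%j.out-%N
#SBATCH --ntasks=16
#SBATCH --account=baggins
#SBATCH --partition=apexarch

# set data and working directories (replace <path> and UNID with real values)
export WORKDIR="$HOME/mydata"
export SCRDIR="/scratch/<path>/UNID/myscratch"
mkdir -p "$SCRDIR"
cp -r "$WORKDIR"/* "$SCRDIR"
cd "$SCRDIR" || exit 1

# load the same modules as the csh example
module load intel mpich2

# same MPICH2 network setting as the csh example
export MPICH_NEMESIS_NETMOD=tcp

# run the program
mpirun -np "$SLURM_NTASKS" my_mpi_program > my_program.out
EOF

# Parse the script without executing it; this catches shell syntax errors only.
bash -n "$job_script" && echo "syntax OK: $job_script"
```

The main differences from the csh version are export in place of setenv and the quoting around variables; the #SBATCH lines are unchanged.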
Job Submission on Apexarch
In order to submit a job on Apexarch, you must first log in to an interactive node (see the addresses listed above).
To submit a script named slurmjob.apexarch, just type:
sbatch slurmjob.apexarch
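If you want to keep the job ID for later commands, sbatch can print it in a machine-readable form with its --parsable option. The sketch below wraps that in two helper functions whose names (parse_job_id, submit_job) are made up for illustration; it assumes it is run on an interactive node where sbatch is available.

```shell
#!/bin/bash
# With --parsable, sbatch prints "jobid" or "jobid;cluster".
# Keep only the job ID part.
parse_job_id() {
  echo "${1%%;*}"
}

# Submit a job script and print its job ID; fail if sbatch is unavailable.
submit_job() {
  local script="$1" out
  if ! command -v sbatch >/dev/null 2>&1; then
    echo "sbatch not found; log in to an interactive node first" >&2
    return 1
  fi
  out="$(sbatch --parsable "$script")" || return 1
  parse_job_id "$out"
}

# Example use with the script name from the text above.
jobid="$(submit_job slurmjob.apexarch)" && echo "submitted job $jobid"
```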
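Checking the status of your job in Slurm
To check the status of your job, use the "squeue" command. The "sinfo" command reports the state of partitions and nodes rather than of individual jobs.
squeue -u $USER
sinfo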
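The following sketch puts the two kinds of status query side by side: squeue for your own jobs and sinfo for the nodes of the apexarch partition. The function name show_my_jobs is hypothetical, and the commands only produce output on a node where the Slurm client tools are installed.

```shell
#!/bin/bash
# List the current user's jobs, then the node states of the apexarch partition.
# Returns non-zero when the Slurm client tools are not available.
show_my_jobs() {
  local user="${USER:-$(id -un)}"
  if ! command -v squeue >/dev/null 2>&1; then
    echo "squeue not found; run this on a cluster node" >&2
    return 1
  fi
  squeue -u "$user"      # pending and running jobs owned by this user
  sinfo -p apexarch      # state of the nodes in the apexarch partition
}

show_my_jobs || echo "no Slurm status available here"
```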
For information on compiling on the clusters at CHPC, please see our Programming Guide.