Slurm Interactive Sessions with salloc
This documentation page provides instructions for launching and managing interactive computing sessions on CHPC clusters. It explains how to use the salloc command to request immediate access to compute nodes, highlights a dedicated partition
(notchpeak-shared-short) designed for shorter interactive tasks, and details methods for monitoring real-time
job performance.
Slurm Directives
To request interactive compute resources through Slurm, you must pass Slurm directives as flags to the salloc command. These Slurm directives will define the computational requirements of your work, which Slurm then uses to determine which resources to assign to your job.
The most commonly used directives are:
--time (-t): the maximum wall-clock time for the job
--nodes (-N): the number of nodes requested
--ntasks (-n): the number of tasks (cores) requested
--account (-A): the Slurm account to charge the job to
--partition (-p): the partition (queue) to run the job in
--qos: the quality of service (QoS) to run the job under
Unsure which Slurm account, partition, and QoS combination to use? Run the command mychpc batch to list the options available to you. You can also try our tool that helps users find which accounts, partitions, and qualities of service you can use when submitting jobs on Center for High Performance Computing systems.
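If you prefer standard Slurm tooling, the sacctmgr command can also list your account associations (a minimal sketch; the columns available depend on the site configuration):
sacctmgr show associations user=$USER format=Account,Partition,QOS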
Starting an Interactive Session with salloc
Submitting an interactive job via Slurm's salloc command works by passing Slurm directives as flags. For reference, these are the same directives you can use in a Slurm batch
script; with salloc, they are simply passed on the command line rather than on #SBATCH lines.
Below is an example where someone in the baggins account requests interactive access to lonepeak with 2 cores across 1 node.
salloc --time=02:00:00 --ntasks=2 --nodes=1 --account=baggins --partition=lonepeak
The salloc flags can be abbreviated as:
salloc -t 02:00:00 -n 2 -N 1 -A baggins -p lonepeak
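For comparison, the same resource request expressed at the top of a batch script would look like this (a sketch using the same illustrative account and partition names):
#!/bin/bash
#SBATCH --time=02:00:00
#SBATCH --ntasks=2
#SBATCH --nodes=1
#SBATCH --account=baggins
#SBATCH --partition=lonepeak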
Running the salloc command above will automatically ssh you into the compute node, allowing you to complete your work interactively.
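From that shell you can run your work directly, or launch parallel tasks with srun inside the allocation (a minimal sketch; ./my_program is a placeholder for your own executable):
srun -n 2 ./my_program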
Notchpeak-Shared-Short and Redwood-Shared-Short Partitions
CHPC cluster queues tend to be very busy; it may take some time for an interactive
job to start. For this reason, we have added two nodes to a special partition, notchpeak-shared-short, on the notchpeak cluster. These nodes are geared more towards interactive work.
Job limits on this partition are 8 hours of wall time, a maximum of ten submitted jobs
per user, and a maximum of two running jobs per user with a combined total of at most
32 tasks and 128 GB of memory.
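You can query the current partition settings yourself with a standard Slurm command (the fields shown depend on the site configuration):
scontrol show partition notchpeak-shared-short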
There is an equivalent partition in the Protected Environment called redwood-shared-short.
To access these special partitions, request both the account and the partition of the same name, e.g.:
salloc -N 1 -n 2 -t 2:00:00 -A notchpeak-shared-short -p notchpeak-shared-short
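In the Protected Environment, the equivalent request follows the same pattern with the redwood-shared-short account and partition:
salloc -N 1 -n 2 -t 2:00:00 -A redwood-shared-short -p redwood-shared-short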
Logging Onto Computational Nodes: Checking Job Stats
Sometimes it is useful to connect to the node(s) where a job is running to monitor the executable
and determine whether it is running correctly and efficiently. For this reason, we allow users
with active jobs to ssh to the compute nodes hosting those jobs. To determine the
name of your compute node(s), run the squeue -u $USER command, and then ssh to the node(s) listed.
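For example (a sketch; notch081 is a hypothetical node name, so substitute the value shown in the NODELIST column of your own squeue output):
squeue -u $USER
ssh notch081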
Once logged onto the compute node, you can run the top command to view CPU and memory usage of the node. If using GPUs, you can view GPU
usage through the nvidia-smi command.
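For example, to limit top to your own processes and to refresh the GPU statistics every few seconds (standard Linux and NVIDIA utilities; the 5-second interval is arbitrary):
top -u $USER
watch -n 5 nvidia-smi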