New Enforcement of Interactive Node Usage Policy

Posted: April 4, 2019

In the Fall 2018 CHPC Newsletter there was an article on a new version of Arbiter, the service that monitors usage of the general environment cluster login nodes and the frisco nodes and emails users when they make excessive use of these resources. The new version, Arbiter2, not only monitors usage and emails users but also applies penalties for excessive usage (see details below). Over the last couple of months we have been adjusting this new version while running it in a mode where messages about policy violations went only to CHPC staff and the penalties for violating policy were not applied.

We are now ready to take Arbiter2 live, and will do so tomorrow, Thursday, April 4, 2019, sometime in the morning.

Major changes:

  1. Users are limited, via cgroups, to 4 cores and 8 GB of memory on the cluster interactive nodes and to 12 cores and 24 GB of memory on the frisco nodes. Aggregate CPU usage can never exceed the cgroup core limit. When aggregate memory usage reaches the cgroup memory limit, the out-of-memory (OOM) killer will start killing processes to reduce memory usage.
  2. Usage above the threshold (or trigger) levels of 1 core and 4 GB of memory on the cluster interactive nodes and about 2 cores and 8 GB of memory on the frisco nodes will be tracked.
  3. The goal is to limit CPU usage to the equivalent of 15 core minutes at or under 4 GB of memory usage on the cluster interactive nodes, or 120 core minutes at or under 12 GB of memory usage on the frisco nodes.
  4. When these levels are reached, you will be put into a penalty condition and your usage will be throttled to progressively lower levels.
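
To make the numbers above easier to compare against your own usage, here is a minimal Python sketch. It is not Arbiter2's code and does not reflect how Arbiter2 computes penalties; the names `NodePolicy`, `classify`, and `minutes_to_budget` are hypothetical. It assumes that exceeding either the core threshold or the memory threshold counts as tracked usage, takes the frisco core threshold as exactly 2 (the text says "about 2"), and reads "core minutes" naively as cores multiplied by minutes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodePolicy:
    core_limit: float         # cgroup hard cap on cores
    mem_limit_gb: float       # cgroup hard cap on memory (GB)
    core_threshold: float     # usage above this is tracked
    mem_threshold_gb: float   # usage above this is tracked
    budget_core_minutes: float


# Values copied from the list above; the dictionary keys are illustrative.
POLICIES = {
    "interactive": NodePolicy(4, 8, 1, 4, 15),
    "frisco": NodePolicy(12, 24, 2, 8, 120),
}


def classify(node_type: str, cores_used: float, mem_used_gb: float) -> str:
    """Return 'oom-risk', 'tracked', or 'ok' for a usage snapshot."""
    policy = POLICIES[node_type]
    if mem_used_gb >= policy.mem_limit_gb:
        # At the cgroup memory limit the OOM killer may start killing processes.
        return "oom-risk"
    # Assumption: exceeding either threshold is enough to be tracked.
    if cores_used > policy.core_threshold or mem_used_gb > policy.mem_threshold_gb:
        return "tracked"
    return "ok"


def minutes_to_budget(node_type: str, cores_used: float) -> float:
    """Naive estimate: minutes of steady usage until the core-minute budget is used up."""
    return POLICIES[node_type].budget_core_minutes / cores_used


if __name__ == "__main__":
    print(classify("interactive", 2, 2))
    print(minutes_to_budget("interactive", 4))
```

Under this naive reading, a job holding all 4 cores on a cluster interactive node would use up the 15 core minute budget in under four minutes, which is why anything beyond short tests belongs on the compute nodes.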

For more detailed information, please see the CHPC General Login Node Policy.

Once the service is running, if you have any questions about the messages you receive, please send them to helpdesk@chpc.utah.edu.

Last Updated: 6/11/21