2004 CHPC Downtimes and History

Delicatearch and Tunnelarch rebooted 12/10/04

Posted: December 11, 2004

After our systems group had stabilized /scratch/serial, they noticed a problem with one of the main administrative nodes, which was causing serious issues with both Delicatearch and Landscapearch. It was determined that a reboot of those clusters was necessary. All running jobs were lost.

As of about 1:30 am this morning, jobs are running through the queue again. We have a snapshot of the queue taken right before the reboot, so if you need a priority boost, please let us know. We will issue allocation refunds for the jobs that were running at the time of the reboot. We apologize for the inconvenience.


Arches /scratch/serial down, queues suspended

Posted: December 9, 2004

Arches /scratch/serial was down and queues were suspended throughout the afternoon. Delicatearch and Landscapearch were rebooted (all running jobs on those clusters were lost). Queues resumed about 1:30 am on 12/11/04.


Arches /scratch/serial down

Posted: December 9, 2004

Unstable throughout the day.


Emergency Downtime: ALL CHPC systems - Security Breach (11/12/04)

Posted: December 9, 2004

Emergency Downtime: ALL CHPC Systems, Friday, November 12th, 2004, beginning at noon due to security break-ins.

All CHPC NIS passwords reset 11/16/2004

Arches: Available Wednesday 11/17/2004 approx 1:00 pm.

Sierra: Available Wednesday 11/17/2004 approx 5:00 pm.

Icebox: Will begin rebuild over next several weeks.


Network Outage Thursday, December 9th, 2004

Posted: December 6, 2004

The INSCC networking staff would like to schedule a brief outage for this Thursday from 6:00 to 6:30 PM. The outage will affect network ports 1025A through 1048B on the first floor, ports 2121A through 2122B on the second floor, and the #2 port on any splitters on the first and second floors. Only hosts directly connected to the specified ports should be affected. The downtime is needed to replace two closet switches that have hardware defects.

If you have any questions or concerns, please contact the networking staff.


Sierra Cluster Rebooted 12/3/04

Posted: December 3, 2004

The Sierra cluster was rebooted at about 5 pm on 12/3/04. Problems persisted, so it was rebooted again and was functional on 12/7/04.


Icebox, the IA-32 cluster, rebuilt. Down 11/22/04 thru 02/02/05.

Posted: November 22, 2004

updated: January 28th, 2005
updated: February 2nd, 2005

The IA-32 cluster was rebuilt due to the security breach in November 2004. Icebox was made available February 2, 2005.

