On June 13th, maintenance will be performed to configure the networking on Chinook and to help address an observed slowdown in multi-node jobs running on the cluster.
Any jobs that are scheduled to run during this time will remain in the queue.
Our maintenance on Chinook to address the slowdown in multi-node jobs is now complete. As a result of this change to how multi-node jobs run, the following lines should be added to your job submission script:
ulimit -l unlimited
ulimit -s unlimited
If these commands are not added, your job may fail.
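As a rough illustration of where these lines might go, here is a minimal sketch of a submission script. It assumes a Slurm-style batch script; the job name, node count, task count, time limit, and `./my_program` are placeholders chosen for this example, not Chinook-specific settings, and the `report_limits` helper is only an illustrative addition.

```shell
#!/bin/bash
# Placeholder job settings (assumed values, adjust for your own job).
#SBATCH --job-name=multinode_example
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4
#SBATCH --time=00:10:00

# Raise the locked-memory and stack size limits, as requested above.
ulimit -l unlimited
ulimit -s unlimited

# Illustrative helper: print the limits now in effect so they
# appear in the job output and can be checked if a job fails.
report_limits() {
    echo "max locked memory: $(ulimit -l)"
    echo "stack size: $(ulimit -s)"
}
report_limits

# Launch the application (placeholder name; replace with your own program).
# srun ./my_program
```

Placing the `ulimit` commands before the launch line means the raised limits are in effect in the shell that starts your program. If either command cannot raise a limit, the shell prints an error and the reported value shows what is actually in effect.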
On May 20 and 21, 2017, the Butrovich data center will undergo the annual University Fire Alarm and Safety Systems Test (FAST).
During this time, OIT and RCS will perform preventative maintenance on systems and services.
***This outage has been extended.***
The Linux workstations hosted in WRRB 004 will be taken offline from May 20 to May 31, 2017. During this time, RCS staff will reconfigure them to align with chinook.alaska.edu.
Please note the following changes:
The $CENTER1 Lustre filesystem became temporarily unavailable to the Chinook compute nodes on May 11th around 3pm AKDT, causing some submitted jobs to fail immediately. To resolve this issue, the job partitions were taken down, and any submitted jobs were held in the queue until the partitions were brought back online.
Any jobs that were already running during that timeframe should be unaffected.