Chinook will be offline from 9 AM on November 1, 2017, to 5 PM on November 2, 2017, to facilitate its expansion from 1892 cores (73 nodes) to 2816 cores (106 nodes).
This outage is required to upgrade the high-speed Lustre filesystem software and to implement a new management structure for storage services.
Following this outage, all user directories in $CENTER1 will be located in subdirectories under the project(s) of which you are a member. Each project will receive a 1 TB unpurged quota. User directories currently in $CENTER1 will be moved to the new $CENTER1 project directory.
On July 27th there will be maintenance on the $CENTER1 file system to address slowdowns and errors involving file operations. The system will need to be taken offline to apply a patch that we anticipate will resolve these issues. Jobs that are scheduled to run during this downtime will stay in the queue and will run after the downtime reservation ends. The Chinook HPC cluster and the Linux workstations will be affected by this outage.
Chinook00 is being rebooted on July 12, 2017, at 3 PM AKDT. This should be a brief outage, and logins attempted while chinook00.alaska.edu is offline should be redirected to the other login nodes.
Due to continued system troubles from the June 28 unplanned power outage, some web services may be experiencing glitches. RCS is currently troubleshooting to restore services.
RCS systems are still being restored. Access to RCS systems may be available, but we are still investigating issues that are causing unexpected behavior and impacting work on these systems.
As of 8 AM on June 29th, RCS systems are steadily coming back online. It is currently unknown when all systems will be fully operational, and restoration may extend past the previous estimate of 9 AM on June 29th.
We will distribute further notifications as we assess our systems and can give a concrete estimate of when each system will be back online.
There was an unplanned power outage in the UAF Butro Data Center this morning. OIT and Facilities Services have replaced the critical equipment.
This was a hard power failure and Research Computing Systems (RCS) is currently assessing the impacts to our hardware and services.
Network service on the UAF campus has been restored, and all RCS HPC, storage, and web services are planned to be back online by 9 AM AKDT, June 29, 2017.
We will distribute notifications as more information is available.
The $ARCHIVE filesystem is experiencing heavy utilization, impacting file transfers to and from $ARCHIVE. Users may experience slow file transfers throughout the day as the process of archiving files to tape finishes.
***This outage has been extended.***
The Linux workstations hosted in WRRB 004 will be taken offline from May 20-31, 2017. During this time, RCS staff will reconfigure them to align with chinook.alaska.edu.
Please note the following changes: