Chinook Outage and Filesystem Change

This outage is required to upgrade the high-speed Lustre filesystem software and to implement a new management structure for storage services.

Following this outage, all user directories in $CENTER1 will be located in subdirectories under the project(s) of which you are a member. Each project will receive a 1 TB unpurged quota. User directories currently in $CENTER1 will be moved to the new $CENTER1 project directory.

July 27th Chinook Scheduled Outage

On July 27th, the $CENTER1 file system will undergo maintenance to address slowdowns and errors involving file operations. The system will be taken offline to apply a patch that we anticipate will resolve these issues. Jobs scheduled to run during this downtime will stay in the queue and run after the downtime reservation ends. The Chinook HPC cluster and the Linux workstations will be affected by this outage.

Chinook00 Reboot

Chinook00 is being rebooted on July 12, 2017 at 3pm AKDT. This should be a brief outage, and logins attempted while chinook00.alaska.edu is offline should be redirected to the other login nodes.

Web Services

Due to continued system troubles from the June 28 unplanned power outage, some web services may be experiencing glitches. RCS is currently troubleshooting to restore services.

RCS System Outage

As of 8am on June 29th, RCS systems are steadily coming back online. It is currently unknown when all systems will be fully operational, and recovery may extend past the previous estimate of 9am on June 29th.

We will distribute further notifications as we assess our systems and can give a concrete estimate of when each system will be back online.

RCS System Outage

There was an unplanned power outage in the UAF Butro Data Center this morning. OIT and Facilities Services have replaced the critical equipment.

This was a hard power failure and Research Computing Systems (RCS) is currently assessing the impacts to our hardware and services.
The network on the UAF campus has been restored, and all RCS HPC, storage, and web services are planned to be back online by 9 AM AKST, June 29, 2017.

We will distribute notifications as more information becomes available.
