New HPC Cluster Expands Twofold

February 1, 2017

UAF's new HPC cluster, chinook.alaska.edu, will be offline for the week of February 6-10, 2017. During this time RCS will expand the cluster from 912 to 1,892 cores and increase the interconnect speed from 40 Gb/s to 100 Gb/s with the deployment of the following (the core-count arithmetic is sketched after the list):

  • 6 racks with ISObases, PDUs, & XDVs
  • 1 Relion 2900 head node
  • 2 Relion 1900v2 login nodes + 2 legacy login nodes
  • 35 Relion 1900v2 compute nodes, each with 28 cores, 128 GB of memory, and an EDR interface
  • 38 Relion 1900v1 compute nodes, each with 24 cores, 128 GB of memory, and an FDR interface
  • 11 Arctica ethernet switches
  • 11 Mellanox EDR IB switches
  • QDR (40 Gb/s) to FDR (56 Gb/s) and EDR (100 Gb/s) interconnect
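
For reference, the post-expansion core count follows directly from the node mix above; a minimal Python sketch of that arithmetic, using only the figures from the list:

    # Core totals for the expanded Chinook cluster, from the node counts listed above.
    new_v2_cores = 35 * 28        # Relion 1900v2: 35 nodes x 28 cores = 980 cores
    existing_v1_cores = 38 * 24   # Relion 1900v1: 38 nodes x 24 cores = 912 cores (the pre-expansion size)

    total_cores = new_v2_cores + existing_v1_cores
    print(total_cores)            # 1892, matching the announced post-expansion count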

The new compute nodes will be distributed across Chinook's partitions, resulting in the following counts (cross-checked in the sketch after the list):

  • debug: 2 nodes (52 cores)
  • t1small/t2small: 5 nodes (120 cores)
  • t1standard/t2standard: 66 nodes (1,720 cores)
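
As a quick cross-check, the partition breakdown above sums back to the same totals; a small sketch using only the figures listed:

    # Partition layout after the expansion: (nodes, cores) for each partition group above.
    partitions = {
        "debug": (2, 52),
        "t1small/t2small": (5, 120),
        "t1standard/t2standard": (66, 1720),
    }

    total_nodes = sum(nodes for nodes, _ in partitions.values())  # 73 nodes (35 new + 38 existing)
    total_cores = sum(cores for _, cores in partitions.values())  # 1892 cores, as above
    print(total_nodes, total_cores)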

In 2016, UAF received support from the M.J. Murdock Charitable Trust to expand Chinook by approximately 1,300 cores over two years. The February 2017 outage marks the first major expansion, and we anticipate additional growth of more than 1,000 cores in the months ahead.

RCS looks forward to supporting the scientific computing needs of the UA research community and will send notifications as system and service changes take effect. Please contact us with any questions or suggestions at (907) 450-8602 or uaf-rcs@alaska.edu, or visit us on the web at www.gi.alaska.edu/rcs.