Research Computing and Data

Upcoming Spring 2024 Maintenance Work

The RCD team has scheduled a maintenance window to complete major changes to the Palmetto Cluster and other systems at the end of the Spring semester.

This work will begin on Monday, May 6th, 2024, at 9:00 am. While maintenance work is in progress, all RCD services, including the Palmetto Cluster and the Indigo Data Lake, will be unavailable.

During this maintenance window, the RCD team will complete the following updates, some of which may affect users:

  1. The Palmetto 2 (Slurm) cluster will move into general availability.
  2. Additional nodes will move into Palmetto 2:
    • All nodes in owner queues
    • All nodes from HDR phases
  3. Our new allocation management system, ColdFront, will become available.
    • Current Palmetto 1 (PBS) accounts do not grant access to Palmetto 2.
    • Current Palmetto 1 users must use ColdFront to request new allocations to make use of Palmetto 2.
    • No new accounts will be added to Palmetto 1 (PBS).
  4. /scratch1 and /fastscratch will be decommissioned and will no longer be available.
    • All data on /scratch1 and /fastscratch will be erased.
  5. /scratch will be re-initialized.
    • All data on /scratch will be erased.
  6. ZFS systems will be decommissioned.
    • ZFS storage owners have been contacted about transitioning to the new storage system.
    • All data stored on ZFS file systems will be migrated to the Indigo Data Lake, so no data will be lost.
    • If you are a ZFS storage owner and have not received an email from us, please reach out to us.
  7. A new software module system will be introduced for Palmetto 2 (Slurm). This system will provide a more user-friendly and efficient way to manage software installations and versions.
  8. A refreshed Open OnDemand interface will be available for Slurm, providing a more modern and user-friendly experience for Slurm users.
  9. A new job monitoring and visualization tool, jobstats, will be deployed across the cluster. This tool will let users monitor their jobs more easily and will replace many existing monitoring methods.

Users should expect that services will be restored no earlier than Friday, May 13th, 2024, at 5:00 pm and should monitor their email for updates from RCD.

We understand that these changes are significant and want to help users transition smoothly. RCD will make updated documentation, training and tutorial sessions, and additional support resources available once maintenance is complete.

If you have any questions or concerns about the maintenance work, please reach out to RCD by submitting a support ticket. We would love to hear from you!