- The command uptime_remaining displays the amount of time remaining before the systems are taken offline for maintenance.
- If one of your jobs is held with the Reason code ReqNodeNotAvail, Reserved for maintenance, the job's walltime overlaps with an upcoming maintenance period. Run uptime_remaining to see when the systems will be taken offline.
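
To check which of your own jobs are affected, you can filter the scheduler's pending-job listing by Reason. The sketch below assumes the cluster runs Slurm, which the ReqNodeNotAvail reason suggests. The function name maintenance_held_jobs is hypothetical and not a command provided on the systems.

```shell
# Hypothetical helper: list your pending jobs whose Reason mentions ReqNodeNotAvail.
# Assumes Slurm's squeue is on your PATH.
maintenance_held_jobs() {
    # -t PENDING : only jobs waiting to run
    # -h         : suppress the header line
    # -o         : %i = job ID, %l = requested time limit, %r = pending reason
    squeue -u "${USER:-$(id -un)}" -t PENDING -h -o "%i %l %r" \
        | awk '/ReqNodeNotAvail/'
}
```

Running maintenance_held_jobs prints one line per affected job, showing the requested time limit next to the reason. It prints nothing if none of your jobs are waiting on the maintenance reservation.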
Most maintenance is performed during regular hours with no interruption to service. System-wide maintenance is usually planned ahead of time and scheduled for Wednesdays from 8AM to 5PM, with at least 10 days' notice. These maintenance windows are planned to occur four times per year.
During these maintenance windows, UITS may choose to drain the queues of running jobs and suspend access to the cluster for HPC maintenance purposes.
The notification will describe the nature and extent (partial or full) of the interruptions of HPC services.
Batch queues will also be modified prior to scheduled downtimes to hold jobs which request more wallclock time than remains before the shutdown. Held jobs will be released to run once maintenance concludes.
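
The hold rule above comes down to comparing a job's requested wallclock time with the time left before the shutdown. The following is only a minimal sketch of that comparison, not the scheduler's actual implementation. The function name fits_before_maintenance is hypothetical, and both values are assumed to be supplied by you in whole minutes.

```shell
# Hypothetical helper: would a job be held ahead of a scheduled downtime?
#   $1 = wallclock time the job requests, in minutes
#   $2 = time remaining before the shutdown, in minutes
fits_before_maintenance() {
    if [ "$1" -le "$2" ]; then
        echo "runs"   # the request fits within the remaining time
    else
        echo "held"   # the request exceeds the remaining time
    fi
}

fits_before_maintenance 120 300    # 2-hour job, 5 hours left: prints "runs"
fits_before_maintenance 2880 300   # 2-day job, 5 hours left: prints "held"
```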