Upcoming Maintenance + Downtime: June 25-28, 2021

Due to further work required for the final phase of the USC Data Center’s power supply upgrade, CARC systems will be down for maintenance at the end of June. Maintenance is scheduled to begin at 8:00 PM PDT on Friday, June 25, and is expected to be complete by 12:00 PM PDT on Monday, June 28. We will take advantage of this necessary downtime to perform additional maintenance on our systems, including:

• A BeeGFS configuration change to improve metadata performance and fair-share use among users
• Setting up a temporary directory (TMPDIR) with Slurm
• Creating a new GPU partition (possibly)

During this downtime, you will be unable to run jobs or access the compute nodes on Discovery and Endeavour. Slurm has been configured to prevent jobs from being scheduled during the maintenance period. You will also be unable to access CARC file systems (/home1, /project, /scratch, and /scratch2) during the maintenance.

We apologize for the downtime and thank you for your continued cooperation and flexibility.

Weekend maintenance is now complete, and both CARC clusters (Discovery and Endeavour) and all file systems (/home1, /project, /scratch, and /scratch2) are running normally. Thank you for your patience during this maintenance.

NEW: From now on, to use one or more GPU nodes for a job on Discovery, you must include the line #SBATCH --partition=gpu in your job submission script, because we have created a separate GPU partition on Discovery.
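
For illustration, a minimal job script that targets the GPU partition might look like the sketch below. Only the --partition=gpu line comes from this announcement; the GPU count (--gres=gpu:1), the time limit, and the commands in the body are assumptions chosen for the example and should be adjusted to your own job.

```shell
#!/bin/bash
#SBATCH --partition=gpu      # required for GPU jobs on Discovery (from this announcement)
#SBATCH --gres=gpu:1         # assumption: request one GPU; adjust to your needs
#SBATCH --time=00:10:00      # assumption: short time limit for a quick test job

# Slurm sets SLURM_JOB_PARTITION inside a running job; fall back to a
# placeholder so the script also runs outside Slurm for a quick check.
partition="${SLURM_JOB_PARTITION:-not running under Slurm}"
echo "Partition: ${partition}"

# Show the allocated GPU(s) if the NVIDIA tools are available on the node.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi || echo "nvidia-smi could not query a GPU"
else
    echo "nvidia-smi not found on this node"
fi
```

Saved under a hypothetical name such as gpu_test.sh, the script would be submitted with sbatch gpu_test.sh.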