Major CARC System Updates - January 13, 2025

There have been several major updates during the last CARC maintenance period. Please refer to this page for more information and for instructions on adjusting to these changes. We will also be updating the user guides to reflect them.

Rocky 8 Linux operating system

CARC systems have transitioned from CentOS 7 to Rocky 8 Linux. With CentOS 7 discontinued, Rocky 8 was developed as an open-source platform to fill the role of CentOS. Rocky 8 is based on RHEL 8 (Red Hat Enterprise Linux 8) in the same way that CentOS 7 was based on RHEL 7. You will likely not encounter any major changes, and data storage is not affected by the migration. The main impact is on software; see below for details.
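To confirm which operating system a node is running after the migration, a quick check from any shell:

cat /etc/os-release    # should report Rocky Linux 8.x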

Software modules

We have upgraded our default compiler from GCC 11.3.0 to GCC 13.3.0, and the default module loaded upon login is now usc/13.3.0. The software modules have been reorganized under the new default module, but the overall experience will be similar to before.
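To confirm the new default and browse the reorganized tree, the usual Lmod commands apply:

module list          # should show usc/13.3.0 loaded by default
module avail         # browse the modules available under the current tree
module spider gcc    # search for a package across all module trees and versions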

Apptainer software container

We switched from SingularityCE to Apptainer for two useful new features:

  • Building container images unprivileged directly on CARC clusters
  • Running containers within containers

Existing container images and scripts should continue to work without any changes. To load the corresponding module, run:

module purge
module load apptainer
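Once the module is loaded, an image can be built directly on the cluster without root privileges; a minimal sketch (the image source here is illustrative):

apptainer build ubuntu.sif docker://ubuntu:22.04
apptainer exec ubuntu.sif cat /etc/os-release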

SSH keys

Upon login, all users will see a message like the following:

C:\Users\user>ssh user@endeavour2.usc.edu
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
...

Do not be alarmed; this is expected because the cluster host keys changed during the migration. Run the following commands to remove the outdated host keys from your known_hosts file; the new keys will be accepted the next time you connect.

Mac/Linux

Mac and Linux users should run these commands on their computers before connecting to CARC:

ssh-keygen -f ~/.ssh/known_hosts -R discovery.usc.edu
ssh-keygen -f ~/.ssh/known_hosts -R discovery1.usc.edu
ssh-keygen -f ~/.ssh/known_hosts -R discovery2.usc.edu
ssh-keygen -f ~/.ssh/known_hosts -R endeavour.usc.edu
ssh-keygen -f ~/.ssh/known_hosts -R endeavour1.usc.edu
ssh-keygen -f ~/.ssh/known_hosts -R endeavour2.usc.edu
ssh-keygen -f ~/.ssh/known_hosts -R hpc-transfer.usc.edu
ssh-keygen -f ~/.ssh/known_hosts -R hpc-transfer1.usc.edu
ssh-keygen -f ~/.ssh/known_hosts -R hpc-transfer2.usc.edu
ssh-keygen -f ~/.ssh/known_hosts -R knoll.usc.edu
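Equivalently, the whole cleanup can be done in one loop (a bash/zsh convenience sketch):

for host in discovery discovery1 discovery2 endeavour endeavour1 endeavour2 \
            hpc-transfer hpc-transfer1 hpc-transfer2 knoll; do
    ssh-keygen -f ~/.ssh/known_hosts -R "$host.usc.edu"
done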

Windows

Windows users should run these commands in PowerShell:

ssh-keygen -f $env:USERPROFILE/.ssh/known_hosts -R discovery.usc.edu
ssh-keygen -f $env:USERPROFILE/.ssh/known_hosts -R discovery1.usc.edu
ssh-keygen -f $env:USERPROFILE/.ssh/known_hosts -R discovery2.usc.edu
ssh-keygen -f $env:USERPROFILE/.ssh/known_hosts -R endeavour.usc.edu
ssh-keygen -f $env:USERPROFILE/.ssh/known_hosts -R endeavour1.usc.edu
ssh-keygen -f $env:USERPROFILE/.ssh/known_hosts -R endeavour2.usc.edu
ssh-keygen -f $env:USERPROFILE/.ssh/known_hosts -R hpc-transfer.usc.edu
ssh-keygen -f $env:USERPROFILE/.ssh/known_hosts -R hpc-transfer1.usc.edu
ssh-keygen -f $env:USERPROFILE/.ssh/known_hosts -R hpc-transfer2.usc.edu
ssh-keygen -f $env:USERPROFILE/.ssh/known_hosts -R knoll.usc.edu

New /project2 storage system

We have introduced a new storage system called /project2, an all-flash system with over 10 PB of total storage space. It has twice as many file servers as /project, taking full advantage of the increased performance of flash storage.

As before, PIs will receive 10 TB for free across their projects. /project is very near capacity and will only be available for a few more years, so all new projects and requests for increased storage allocations will be placed on /project2 going forward. Because /project2 offers better performance, we also encourage users to proactively transfer their data from /project to the new system.
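If you decide to move data, a transfer might look like the sketch below (the directory names are illustrative; substitute your own allocation paths):

rsync -a --info=progress2 /project/ttrojan_123/dataset/ /project2/ttrojan_123/dataset/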

Note that the rate for storage allocations over 10 TB on /project2 is higher, $60/TB/year, reflecting the improved performance; please keep this in mind when deciding whether to transfer data or request new allocations. PIs who choose to keep their current /project allocations may do so at the original $40/TB/year rate.


Can we still use VS Code through HPC OnDemand? I found it disappeared after the maintenance.

Is Endeavour still down? I’m not yet able to submit jobs:

[kmilner@endeavour2 ~]$ salloc -N 1 --ntasks=1 --cpus-per-task=20 --mem 2gb -t 00:30:00 -p scec
salloc: error: Job submit/allocate failed: Invalid account or account/partition combination specified

And there don’t seem to be any jobs running for any user, which makes me think the problem isn’t just with me:

[kmilner@endeavour2 ~]$ squeue
       JOBID PARTITION                           NAME     USER ST       TIME  NODES NODELIST(REASON)
[kmilner@endeavour2 ~]$ 

Same here. Cannot allocate any resources.

There was an issue coupling users with their Slurm accounts on Endeavour. You should now be able to submit jobs. We did notice that some users have an unrelated accounting issue that may prevent them from running jobs on Endeavour, so please let us know if you are affected and we can resolve that too.
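If you want to check which account/partition combinations your user is associated with, a standard Slurm query works (the format fields here are just a convenient selection):

sacctmgr show associations where user=$USER format=account,user,partition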

Hello! I’m getting this error when I try to activate a Conda environment that used to work for me:

Has there been a change to the syntax required to activate a Conda env? Thank you!

-Adam

You’ll have to load the conda module and may need to run conda init.
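A minimal sequence, assuming a bash shell (the environment name my_env is illustrative):

module purge
module load conda
conda init bash        # one-time setup; restart your shell or source ~/.bashrc afterward
conda activate my_env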

That worked, thank you! I also stopped receiving emails when my jobs start or stop. Has the syntax here changed as well? This is what I’m putting in the sbatch file:

The syntax is correct but this feature is not working right now. We’re working on getting it back up.

Hello,

Got a long error without much useful information when running conda init after loading the conda module. Is there still some problem there?

Also, I previously used the commands nodeinfo and myqueue a lot; both now return “command not found”. Is there a problem with the Slurm system?

Thanks!

When I try to open a browser-based terminal in OnDemand, I get the warning message that was posted at the top. Clearing known hosts on my machine doesn’t help (I guess because this is an emulated terminal in a browser tab). Any solutions?

Can you please share the error that you encountered after running conda init?

We also made some changes, so you should see myqueue and nodeinfo now.

Thanks for the response. The commands work for me now. Still running into conda issues, though. Attached is the error report I got when forced to run conda init before using conda activate.

# >>>>>>>>>>>>>>>>>>>>>> ERROR REPORT <<<<<<<<<<<<<<<<<<<<<<

    Traceback (most recent call last):
      File "/apps/conda/miniforge3/24.3.0/lib/python3.10/site-packages/conda/exception_handler.py", line 17, in __call__
        return func(*args, **kwargs)
      File "/apps/conda/miniforge3/24.3.0/lib/python3.10/site-packages/conda/cli/main.py", line 83, in main_subshell
        exit_code = do_call(args, parser)
      File "/apps/conda/miniforge3/24.3.0/lib/python3.10/site-packages/conda/cli/conda_argparse.py", line 199, in do_call
        result = getattr(module, func_name)(args, parser)
      File "/apps/conda/miniforge3/24.3.0/lib/python3.10/site-packages/conda/cli/main_init.py", line 161, in execute
        return initialize(
      File "/apps/conda/miniforge3/24.3.0/lib/python3.10/site-packages/conda/core/initialize.py", line 139, in initialize
        run_plan_elevated(plan2)
      File "/apps/conda/miniforge3/24.3.0/lib/python3.10/site-packages/conda/core/initialize.py", line 856, in run_plan_elevated
        result = subprocess_call(
      File "/apps/conda/miniforge3/24.3.0/lib/python3.10/site-packages/conda/gateways/subprocess.py", line 128, in subprocess_call
        raise CalledProcessError(rc, command, output=formatted_output)
    subprocess.CalledProcessError: Command '['sudo', '/apps/conda/miniforge3/24.3.0/bin/python', '-m', 'conda.core.initialize']' returned non-zero exit status 1.

`$ /apps/conda/miniforge3/24.3.0/bin/conda init`

  environment variables:
                 CIO_TEST=<not set>
        CMAKE_PREFIX_PATH=/apps/spack/2406/apps/linux-rocky8-x86_64_v3/gcc-13.3.0/openmpi-5.0.5-
                          mufqd73:/apps/spack/2406/apps/linux-rocky8-x86_64_v3/gcc-
                          13.3.0/openblas-0.3.28-mgrljin:/apps/spack/2406/apps/linux-rocky8-
                          x86_64_v3/gcc-13.3.0/python-3.11.9-x74mtjf:/apps/generic/gcc/13.3.0
               CONDA_ROOT=/apps/conda/miniforge3/24.3.0
       CPLUS_INCLUDE_PATH=/apps/spack/2406/apps/linux-rocky8-x86_64_v3/gcc-13.3.0/openmpi-5.0.5-
                          mufqd73/include:/apps/spack/2406/apps/linux-rocky8-x86_64_v3/gcc-
                          13.3.0/openblas-0.3.28-mgrljin/include:/apps/spack/2406/apps/linux-
                          rocky8-x86_64_v3/gcc-13.3.0/python-3.11.9-x74mtjf/include
           CURL_CA_BUNDLE=<not set>
           C_INCLUDE_PATH=/apps/spack/2406/apps/linux-rocky8-x86_64_v3/gcc-13.3.0/openmpi-5.0.5-
                          mufqd73/include:/apps/spack/2406/apps/linux-rocky8-x86_64_v3/gcc-
                          13.3.0/openblas-0.3.28-mgrljin/include:/apps/spack/2406/apps/linux-
                          rocky8-x86_64_v3/gcc-13.3.0/python-3.11.9-x74mtjf/include
          LD_LIBRARY_PATH=/apps/conda/miniforge3/24.3.0/lib:/apps/spack/2406/apps/linux-rocky8-
                          x86_64_v3/gcc-13.3.0/openmpi-5.0.5-
                          mufqd73/lib:/apps/spack/2406/apps/linux-rocky8-x86_64_v3/gcc-
                          13.3.0/openblas-0.3.28-mgrljin/lib64:/apps/spack/2406/apps/linux-rocky
                          8-x86_64_v3/gcc-13.3.0/python-3.11.9-
                          x74mtjf/lib:/apps/generic/gcc/13.3.0/lib64
               LD_PRELOAD=<not set>
             LIBRARY_PATH=/apps/spack/2406/apps/linux-rocky8-x86_64_v3/gcc-13.3.0/openmpi-5.0.5-
                          mufqd73/lib:/apps/spack/2406/apps/linux-rocky8-x86_64_v3/gcc-
                          13.3.0/openblas-0.3.28-mgrljin/lib64:/apps/spack/2406/apps/linux-
                          rocky8-x86_64_v3/gcc-13.3.0/python-3.11.9-x74mtjf/lib
                  MANPATH=/apps/conda/miniforge3/24.3.0/share/man:/apps/spack/2406/apps/linux-ro
                          cky8-x86_64_v3/gcc-13.3.0/openmpi-5.0.5-
                          mufqd73/share/man:/apps/spack/2406/apps/linux-rocky8-x86_64_v3/gcc-
                          13.3.0/python-3.11.9-
                          x74mtjf/share/man:/apps/generic/gcc/13.3.0/share/man::
               MODULEPATH=/apps/spack/2406/apps/lmod/linux-rocky8-x86_64/openmpi/5.0.5-
                          mufqd73/gcc/13.3.0:/apps/lmod/modules/gcc/13.3.0:/apps/spack/2406/apps
                          /lmod/linux-rocky8-
                          x86_64/gcc/13.3.0:/apps/lmod/modules/compilers:/apps/lmod/modules/util
                          s:/apps/lmod/modules/misc
                     PATH=/apps/conda/miniforge3/24.3.0/bin:/apps/spack/2406/apps/linux-rocky8-
                          x86_64_v3/gcc-13.3.0/openmpi-5.0.5-
                          mufqd73/bin:/apps/spack/2406/apps/linux-rocky8-x86_64_v3/gcc-
                          13.3.0/python-3.11.9-
                          x74mtjf/bin:/apps/generic/gcc/13.3.0/bin:/apps/utilities:/usr/local/bi
                          n:/usr/bin:/usr/local/sbin:/usr/sbin:/home1/weizhech/.local/bin:/home1
                          /weizhech/bin
          PKG_CONFIG_PATH=/apps/conda/miniforge3/24.3.0/lib/pkgconfig:/apps/spack/2406/apps/linu
                          x-rocky8-x86_64_v3/gcc-13.3.0/openmpi-5.0.5-
                          mufqd73/lib/pkgconfig:/apps/spack/2406/apps/linux-rocky8-
                          x86_64_v3/gcc-13.3.0/openblas-0.3.28-
                          mgrljin/lib64/pkgconfig:/apps/spack/2406/apps/linux-
                          rocky8-x86_64_v3/gcc-13.3.0/python-3.11.9-x74mtjf/lib/pkgconfig
              PYTHON_ROOT=/apps/spack/2406/apps/linux-
                          rocky8-x86_64_v3/gcc-13.3.0/python-3.11.9-x74mtjf
       REQUESTS_CA_BUNDLE=<not set>
            SSL_CERT_FILE=<not set>
__LMOD_REF_COUNT_CMAKE_PREFIX_PATH=/apps/spack/2406/apps/linux-rocky8-x86_64_v3/gcc-13.3.0/openmpi-5.0.5-
                          mufqd73:1;/apps/spack/2406/apps/linux-rocky8-x86_64_v3/gcc-
                          13.3.0/openblas-0.3.28-mgrljin:1;/apps/spack/2406/apps/linux-rocky8-
                          x86_64_v3/gcc-13.3.0/python-3.11.9-
                          x74mtjf:1;/apps/generic/gcc/13.3.0:1
__LMOD_REF_COUNT_CPLUS_INCLUDE_PATH=/apps/spack/2406/apps/linux-rocky8-x86_64_v3/gcc-13.3.0/openmpi-5.0.5-
                          mufqd73/include:1;/apps/spack/2406/apps/linux-rocky8-x86_64_v3/gcc-
                          13.3.0/openblas-0.3.28-mgrljin/include:1;/apps/spack/2406/apps/linux-
                          rocky8-x86_64_v3/gcc-13.3.0/python-3.11.9-x74mtjf/include:1
__LMOD_REF_COUNT_C_INCLUDE_PATH=/apps/spack/2406/apps/linux-rocky8-x86_64_v3/gcc-13.3.0/openmpi-5.0.5-
                          mufqd73/include:1;/apps/spack/2406/apps/linux-rocky8-x86_64_v3/gcc-
                          13.3.0/openblas-0.3.28-mgrljin/include:1;/apps/spack/2406/apps/linux-
                          rocky8-x86_64_v3/gcc-13.3.0/python-3.11.9-x74mtjf/include:1
__LMOD_REF_COUNT_LD_LIBRARY_PATH=/apps/conda/miniforge3/24.3.0/lib:1;/apps/spack/2406/apps/linux-rocky8
                          -x86_64_v3/gcc-13.3.0/openmpi-5.0.5-
                          mufqd73/lib:1;/apps/spack/2406/apps/linux-rocky8-x86_64_v3/gcc-
                          13.3.0/openblas-0.3.28-mgrljin/lib64:1;/apps/spack/2406/apps/linux-roc
                          ky8-x86_64_v3/gcc-13.3.0/python-3.11.9-
                          x74mtjf/lib:1;/apps/generic/gcc/13.3.0/lib64:1
__LMOD_REF_COUNT_LIBRARY_PATH=/apps/spack/2406/apps/linux-rocky8-x86_64_v3/gcc-13.3.0/openmpi-5.0.5-
                          mufqd73/lib:1;/apps/spack/2406/apps/linux-rocky8-x86_64_v3/gcc-
                          13.3.0/openblas-0.3.28-mgrljin/lib64:1;/apps/spack/2406/apps/linux-
                          rocky8-x86_64_v3/gcc-13.3.0/python-3.11.9-x74mtjf/lib:1
 __LMOD_REF_COUNT_MANPATH=/apps/conda/miniforge3/24.3.0/share/man:1;/apps/spack/2406/apps/linux-
                          rocky8-x86_64_v3/gcc-13.3.0/openmpi-5.0.5-
                          mufqd73/share/man:1;/apps/spack/2406/apps/linux-rocky8-x86_64_v3/gcc-
                          13.3.0/python-3.11.9-
                          x74mtjf/share/man:1;/apps/generic/gcc/13.3.0/share/man:1;:2
__LMOD_REF_COUNT_MODULEPATH=/apps/spack/2406/apps/lmod/linux-rocky8-x86_64/openmpi/5.0.5-
                          mufqd73/gcc/13.3.0:1;/apps/lmod/modules/gcc/13.3.0:1;/apps/spack/2406/
                          apps/lmod/linux-rocky8-
                          x86_64/gcc/13.3.0:1;/apps/lmod/modules/compilers:1;/apps/lmod/modules/
                          utils:1;/apps/lmod/modules/misc:1
    __LMOD_REF_COUNT_PATH=/apps/conda/miniforge3/24.3.0/bin:1;/apps/spack/2406/apps/linux-rocky8
                          -x86_64_v3/gcc-13.3.0/openmpi-5.0.5-
                          mufqd73/bin:1;/apps/spack/2406/apps/linux-rocky8-x86_64_v3/gcc-
                          13.3.0/python-3.11.9-
                          x74mtjf/bin:1;/apps/generic/gcc/13.3.0/bin:1;/apps/utilities:1;/usr/lo
                          cal/bin:1;/usr/bin:1;/usr/local/sbin:1;/usr/sbin:1;/home1/weizhech/.lo
                          cal/bin:1;/home1/weizhech/bin:1
__LMOD_REF_COUNT_PKG_CONFIG_PATH=/apps/conda/miniforge3/24.3.0/lib/pkgconfig:1;/apps/spack/2406/apps/li
                          nux-rocky8-x86_64_v3/gcc-13.3.0/openmpi-5.0.5-
                          mufqd73/lib/pkgconfig:1;/apps/spack/2406/apps/linux-rocky8-
                          x86_64_v3/gcc-13.3.0/openblas-0.3.28-
                          mgrljin/lib64/pkgconfig:1;/apps/spack/2406/apps/linux-
                          rocky8-x86_64_v3/gcc-13.3.0/python-3.11.9-x74mtjf/lib/pkgconfig:1

     active environment : None
       user config file : /home1/weizhech/.condarc
 populated config files : /apps/conda/miniforge3/24.3.0/.condarc
                          /home1/weizhech/.condarc
          conda version : 24.3.0
    conda-build version : not installed
         python version : 3.10.14.final.0
                 solver : libmamba (default)
       virtual packages : __archspec=1=cascadelake
                          __conda=24.3.0=0
                          __glibc=2.28=0
                          __linux=4.18.0=0
                          __unix=0=0
       base environment : /apps/conda/miniforge3/24.3.0  (read only)
      conda av data dir : /apps/conda/miniforge3/24.3.0/etc/conda
  conda av metadata url : None
           channel URLs : https://conda.anaconda.org/conda-forge/linux-64
                          https://conda.anaconda.org/conda-forge/noarch
                          https://repo.anaconda.com/pkgs/main/linux-64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/r/linux-64
                          https://repo.anaconda.com/pkgs/r/noarch
          package cache : /home1/weizhech/.conda/pkgs
       envs directories : /home1/weizhech/.conda/envs
                          /spack/conda/miniconda3/4.12.0/envs
                          /project/skoenig_523/zhihan/conda_envs
                          /apps/conda/miniforge3/24.3.0/envs
               platform : linux-64
             user-agent : conda/24.3.0 requests/2.31.0 CPython/3.10.14 Linux/4.18.0-553.22.1.el8_10.x86_64 rocky/8.10 glibc/2.28 solver/libmamba conda-libmamba-solver/24.1.0 libmambapy/1.5.8
                UID:GID : 601009:92503
             netrc file : /home1/weizhech/.netrc
           offline mode : False


An unexpected error has occurred. Conda has prepared the above report.
If you suspect this error is being caused by a malfunctioning plugin,
consider using the --no-plugins option to turn off plugins.

Example: conda --no-plugins install <package>

Alternatively, you can set the CONDA_NO_PLUGINS environment variable on
the command line to run the command without plugins enabled.

Example: CONDA_NO_PLUGINS=true conda install <package>

If submitted, this report will be used by core maintainers to improve
future releases of conda.

Has VS Code been removed from OnDemand? I cannot see it in the apps.

How can we clear the SSH warning when using the OnDemand Cluster Shell Login to access the command line?

Take the command

ssh-keygen -f ~/.ssh/known_hosts -R discovery.usc.edu

as an example: replace ~/.ssh/known_hosts with the path shown below the warning (“Add correct host key in /xxx/xxx/.ssh/known_hosts to get rid of this message.”). That’s how I did it, and it worked.
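For example, the adjusted command would look like this (substituting the actual path from your warning):

ssh-keygen -f /xxx/xxx/.ssh/known_hosts -R discovery.usc.edu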

Thank you. I also tried clearing known_hosts and that worked as well.

Hi, upon logging into Discovery/Endeavour I’m faced with this error:

Lmod has detected the following error:  The following module(s) are unknown: "intel"

Please check the spelling or version number. Also try "module spider ..."
It is also possible your cache file is out-of-date; it may help to try:
  $ module --ignore_cache load "intel"

Also make sure that all modulefiles written in TCL start with the string #%Module


Lmod has detected the following error:  The following module(s) are unknown: "intel-mkl"

Please check the spelling or version number. Also try "module spider ..."
It is also possible your cache file is out-of-date; it may help to try:
  $ module --ignore_cache load "intel-mkl"

Also make sure that all modulefiles written in TCL start with the string #%Module


Error:  Environment variable MKLROOT is not set.  Source the
        Intel Compiler setup script and re-execute this command.

My connection to the server is then terminated. Has anyone else come across this problem?
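Not sure if this is the cause, but the error suggests something is trying to load the old intel modules at login, most likely a shell startup file. If you can get a session, one way to check (a hedged sketch):

grep -n "module load" ~/.bashrc ~/.bash_profile   # look for stale intel/intel-mkl loads
module spider intel                               # see what the new module tree provides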
