Avoid using 'module load <package>' commands in your bashrc file

The login node is a sensitive entry point to a shared resource such as the Discovery and Endeavour clusters. To avoid saturating the applications server, which serves the large software stack made available to all HPC users, the USC CARC recommends that users avoid adding "module load package" commands to their bashrc file.

One alternative is for users to create a bash alias. A bash alias is a method of supplementing or overriding bash commands with new ones. Bash aliases make it easy for users to customize their experience and environment in a terminal session.

A custom module environment is a good fit for users who rely on a consistent set of modules in their computational workflow. Users may define such a custom environment via an alias in the $HOME/.bashrc file.

As an example, a user may use the following modules under the gcc/8.3.0 programming environment in their daily computational workflow:

relion/3.1.1-cuda-11.1
cuda/11.1-1
r/4.0.3

The alias can be given any name of the user's choosing:

alias load_relion_gcc="module purge ; module load usc ; module load relion/3.1.1-cuda-11.1 ; module load r/4.0.3"

Then, the user would issue the alias command in the terminal via:

$ load_relion_gcc

The ‘load_relion_gcc’ command may also be used in a Slurm job script for batch job submission.
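A sketch of such a job script is below. The #SBATCH directives and resource values are illustrative, not CARC defaults. Note that bash does not expand aliases in non-interactive shells by default, so the script must enable alias expansion and source the file that defines the alias:

```shell
#!/bin/bash
#SBATCH --job-name=relion_run   # illustrative job name
#SBATCH --ntasks=1              # illustrative resource request
#SBATCH --time=01:00:00

# Non-interactive bash shells do not expand aliases by default;
# enable expansion, then source the file defining load_relion_gcc.
shopt -s expand_aliases
source "$HOME/.bashrc"

load_relion_gcc     # loads relion/3.1.1-cuda-11.1 and r/4.0.3

# ... run the RELION workflow here ...
```

Alternatively, defining a shell function instead of an alias in .bashrc avoids the expand_aliases step, since functions are inherited by sourced scripts without special handling.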


If you alias the module commands and type the alias every time you log in, how is this different from just having it in the .bashrc?

Does the shell SLURM spawns for every job read the user’s .bashrc?

So if I follow this approach, every time I log in to the cluster I need to type the alias? Is there any way the modules can be loaded automatically, so there's no extra step at every login?

There are many HPC use cases that do not require the loading of modules. For example:

(1) login to USC CARC HPC to transfer files (e.g. via rsync)

(2) login to USC CARC HPC to monitor running jobs (e.g. ssh into compute nodes)

(3) login to USC CARC HPC to submit jobs via sbatch

Some use cases which do require the loading of modules are:

(1) login to compile and debug code

(2) login to run an interactive session via salloc

Upon login to Discovery and Endeavour, the bashrc file is automatically sourced. For the many use cases that do not require loading modules, having ‘module load package’ commands in the bashrc creates unnecessary network traffic on the applications server. It is more efficient to load modules only when required by a user’s computational workflow.

The modulefiles system used on Discovery/Endeavour is Lmod. The most demanding operation in Lmod is ‘module load’, which causes Lmod to walk the entire directory tree to search all of the modulefiles in your MODULEPATH.

The default MODULEPATH on Discovery/Endeavour can be checked with:

$ echo $MODULEPATH
/spack/apps/lmod/linux-centos7-x86_64/openmpi/4.0.2-ipm3dnv/openblas/0.3.8-2no6mfz/gcc/8.3.0:/spack/apps/lmod/linux-centos7-x86_64/openmpi/4.0.2-ipm3dnv/gcc/8.3.0:/spack/apps/lmod/linux-centos7-x86_64/openblas/0.3.8-2no6mfz/gcc/8.3.0:/spack/apps/lmod/linux-centos7-x86_64/gcc/8.3.0:/spack/apps/lmod/linux-centos7-x86_64/Core
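Since the colon-separated value is hard to read, each entry can be printed on its own line (a generic shell trick, not CARC-specific):

```shell
# Print each MODULEPATH entry on its own line for readability
echo "$MODULEPATH" | tr ':' '\n'
```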

The above MODULEPATH arises from the default software stack loaded upon login (gcc/8.3.0; openmpi/4.0.2; openblas/0.3.8; pmix/3.1.3). The default modules incur a smaller network footprint due to a pre-load Lmod feature and are provided as a general, robust programming environment for users. Additional ‘module load package’ commands tax the applications server by forcing a walk of the multiple directory trees in MODULEPATH. When included in the bashrc, this traversal is performed immediately upon login, for every user who logs in.

The main issue underlying the recent “saturation” of the applications server network is the large software stack associated with the gcc/8.3.0 programming environment. USC CARC HPC users have reported that the large, encompassing software stack is a major plus, and the CARC HPC team is intent on providing a fast module-based software stack for all HPC users. To address the recent congestion of the applications server, the corresponding network service was upgraded from a 1 Gb/s to a 10 Gb/s transfer rate.

Furthermore, the CARC HPC team is working on implementing a continuously updated system cache build. Lmod will use the spider cache file as a replacement for walking the directory tree to find all modulefiles in your MODULEPATH. Spider cache file(s) provide a way for Lmod to know which modules exist and any properties that a modulefile may have. Reading a single file will be much faster than walking the directory tree.
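As a rough illustration of why a single cache file beats a directory walk, here is a toy comparison (the paths and file names are made up, and this only emulates the idea; it is not Lmod's actual cache format):

```shell
# Build a toy modulefile tree (illustrative paths only)
root=$(mktemp -d)
mkdir -p "$root/gcc/8.3.0" "$root/r"
touch "$root/gcc/8.3.0/relion.lua" "$root/r/4.0.3.lua"

# Without a cache: every lookup must walk the whole tree
find "$root" -name '*.lua'

# With a cache: the walk happens once, and later lookups
# read one flat file instead of traversing directories
find "$root" -name '*.lua' > "$root/spider_cache"
wc -l < "$root/spider_cache"   # number of known modulefiles
```

On the real system, Lmod's `module spider` command is what consults this cached index of available modules.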
