There are many HPC use cases that do not require the loading of modules. For example:
(1) login to USC CARC HPC to transfer files (e.g. via rsync)
(2) login to USC CARC HPC to monitor running jobs (e.g. ssh into compute nodes)
(3) login to USC CARC HPC to submit jobs via sbatch
Some use cases which do require the loading of modules are:
(1) login to compile and debug code
(2) login to run an interactive session via salloc
Upon login to Discover and Endeavour, the bashrc file is automatically sourced. For the many use cases that do not require loading modules, having ‘module load package’ commands in the bashrc will create network traffic on the applications server unnecessarily. It is more efficient to load modules only when required by a users computational workflow.
The modulefiles system used on Discovery/Endeavour is Lmod. The most demanding task in Lmod is the 'module load ’ function, which actions Lmod to walk the entire directory tree to search all of the modulefiles in your MODULEPATH.
The default MODULEPATH on Discovery/Endeavour can be checked with:
$ echo $MODULEPATH
The above MODULEPATH arises from the default software stack loaded upon login (gcc/8.3.0; openmpi/4.0.2; openblas/0.3.8; pmix/3.1.3). The default modules incur a smaller network footprint due to a pre-load lmod feature and are provided as a general, robust programming environment for users. Additional ‘module load package’ commands tax the application server by invoking a walking of the multiple directory trees in MODULEPATH. When included in the bashrc, this traversal is performed immediately upon login for multiple users.
The main issue underlying the recent “saturation” of the application server network is the large software stack associated with the gcc/8.3.0 programming environment. USC CARC HPC users have reported that the large and encompassing software stack is a major plus and the CARC HPC team is intent on providing a fast module-based software stack for all HPC users. To address the recent issue with the congestion of the application server, the corresponding application network service was upgraded from a 1GB to a 10GB transfer rate.
Furthermore, the CARC HPC team is working on implementing a continously updated system cache build. Lmod will use the spider cache file as a replacement for walking the directory tree to find all modulefiles in your MODULEPATH. Spider cache file(s) provide a way for Lmod to know which modules exist and any properties that a modulefile may have. Reading a single file will be much faster than walking the directory tree.