Is anyone else having difficulty connecting to the discovery clusters?

Since the last maintenance, I have been having trouble logging into the discovery clusters using ssh. When I connect to either one, there is a chance it will trigger a bug with the following error message:
“/spack/apps/linux-centos7-x86_64/gcc-4.8.5/lua-5.3.5-rzebuo27twsh7lddxyfysshtj52ycsmy/bin/lua: …zl4aoqzz6fgqaoszbd65/lmod/lmod/libexec/MasterControl.lua:152: attempt to index a nil value (local ‘result’)
stack traceback:
…zl4aoqzz6fgqaoszbd65/lmod/lmod/libexec/MasterControl.lua:152: in upvalue ‘l_error_on_missing_loaded_modules’
…zl4aoqzz6fgqaoszbd65/lmod/lmod/libexec/MasterControl.lua:834: in function ‘MasterControl.mustLoad’
…g7wfwzl4aoqzz6fgqaoszbd65/lmod/lmod/libexec/cmdfuncs.lua:457: in function ‘Load_Usr’
…-qzzv4g6g7wfwzl4aoqzz6fgqaoszbd65/lmod/lmod/libexec/lmod:512: in function ‘main’
…-qzzv4g6g7wfwzl4aoqzz6fgqaoszbd65/lmod/lmod/libexec/lmod:570: in main chunk
[C]: in ?”

It happens even more often when I try to create a new tmux session or an interactive Slurm session. Does anyone have an idea why this is happening?


There’s a new process limit on the login nodes, so if you have too many processes running you may see an error like that. SSH connections, tmux sessions, salloc, etc. all create processes and threads, so they add up quickly.
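
If you want to see how many processes and threads you currently have on a login node, a rough sketch like the one below may help. It assumes a Linux node with /proc mounted; `count_user_tasks` is just a name made up for this example. The thread doesn't say how the limit is enforced or what its value is, so the `RLIMIT_NPROC` value printed at the end is only a hint and may not be the limit that actually applies.

```python
import os
import resource


def count_user_tasks(uid, proc_root="/proc"):
    """Return (processes, threads) owned by uid, read from a /proc-style tree."""
    processes = 0
    threads = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-PID entries such as "meminfo"
        pid_dir = os.path.join(proc_root, entry)
        try:
            if os.stat(pid_dir).st_uid != uid:
                continue  # process belongs to someone else
            # Each subdirectory of task/ is one thread of this process.
            threads += len(os.listdir(os.path.join(pid_dir, "task")))
            processes += 1
        except OSError:
            continue  # process exited while we were scanning
    return processes, threads


if __name__ == "__main__":
    procs, thr = count_user_tasks(os.getuid())
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    print(f"processes: {procs}, threads: {thr}")
    # Assumption: the site limit may be enforced some other way than ulimit.
    print(f"RLIMIT_NPROC soft={soft} hard={hard}")
```

If the counts look high, closing old tmux sessions, idle SSH connections, and finished salloc sessions should bring them down.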