Currently, the tmux we get from module load tmux sets its temp dir to TMUX_TMPDIR=/scratch2/$USER.
The issue is that this value is the same on both clusters, discovery and endeavour.
The tmux processes on these head nodes try to use the same socket file /scratch2/$USER/tmux-<some_num>/default because the file system is shared.
It looks like one process overwrites the socket of the other.
For instance, if I use tmux on endeavour1, my tmux on discovery1 fails to reconnect, and vice versa.
Thanks for looking into this. There’s a lot of flexibility in what we can do thanks to Lmod. For example, we added a custom environment setting to the module that sets TMUX_TMPDIR=/scratch2/$USER.
It’s trivial to change it to TMUX_TMPDIR=/scratch2/$USER/tmux/$HOSTNAME, but that might break tmux sessions that are currently running. We’ll have to send out an announcement first. For now, you can manually add the hostname to TMUX_TMPDIR yourself.
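A minimal sketch of that manual workaround, assuming a POSIX shell and that it runs after module load tmux (so the module does not reset the variable). The function names per_host_tmux_dir and use_per_host_tmux_dir are hypothetical, not part of the module. The directory is created up front on the assumption that tmux does not create missing parent directories itself.

```shell
# Hypothetical helper: print a per-host tmux socket directory under a base path.
# The domain part of the hostname is dropped, so endeavour1.usc.edu -> endeavour1.
per_host_tmux_dir() {
    base="$1"   # e.g. /scratch2/$USER
    host="$2"   # e.g. endeavour1 or endeavour1.usc.edu
    printf '%s/tmux/%s\n' "$base" "${host%%.*}"
}

# Hypothetical helper: point TMUX_TMPDIR at the per-host directory for this
# shell and make sure the directory exists before tmux is started.
use_per_host_tmux_dir() {
    TMUX_TMPDIR="$(per_host_tmux_dir "/scratch2/$USER" "$(hostname)")"
    export TMUX_TMPDIR
    mkdir -p "$TMUX_TMPDIR"
}
```

Calling use_per_host_tmux_dir after loading the module, on each head node, keeps the discovery and endeavour sockets in separate directories. Sessions started before the change live under the old path and will not be visible from the new one.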
Well, that is the wrong configuration. If I log in to endeavour, I want to reconnect to my tmux session no matter whether I end up on endeavour1 or endeavour2. With a per-hostname directory, sessions are created in different directories depending on the node, and I can’t possibly keep track of whether I worked on endeavour1 or endeavour2. Can you name that dir plain endeavour and plain discovery? I am not sure if that would work.
If your tmux server is running on endeavour1 and the client is also on the same machine, then tmux works out of the box. I haven’t tried connecting a tmux client to a tmux server on another machine. (Well, I have tried in the past using the iTerm client to connect to a tmux server on a remote machine, but that’s a different setting.)
Easy fix: always choose which node to log in to, either endeavour1.usc.edu or endeavour2.usc.edu, instead of letting endeavour.usc.edu pick one for you at random.
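One way to make the pinned-node habit stick is a small shell function on your local machine. This is only a sketch under several assumptions: the function name tmux_endeavour and the session name main are hypothetical, your local username matches your cluster username, and a login shell on the node (bash -lc) is needed for the module command to be available.

```shell
# Assumption: endeavour1 is the node you have decided to always use.
ENDEAVOUR_NODE="endeavour1.usc.edu"

# Hypothetical helper: log in to the pinned node and attach to the tmux
# session named "main", creating it if it does not exist yet
# (new-session -A attaches when the session already exists).
tmux_endeavour() {
    ssh -t "$ENDEAVOUR_NODE" "bash -lc 'module load tmux && tmux new-session -A -s main'"
}
```

Because the hostname is fixed, the tmux client always runs on the same machine as the server that owns the session.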