Cannot access compute node during training

After I submitted a job, I cannot access the assigned compute server.

The error is ‘Permission denied (publickey).’

Hi,

This can happen when the cluster and cluster.pub SSH key files are missing from your ~/.ssh directory. To regenerate them, delete that directory (create a backup first if necessary) and log in again.

I recommend logging in again from a new terminal session and keeping the original connection open, just in case something goes wrong.
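
In case it helps, here is a rough sketch of those steps as shell functions. The function names and the backup naming scheme are my own and only illustrative. It moves ~/.ssh aside rather than deleting it, so nothing is lost, and it assumes you can still log in to the login node without anything stored in ~/.ssh (for example with your password).

```shell
# Hypothetical helpers, not part of any cluster tooling.

# Succeeds only if both key files named in this thread exist.
cluster_keys_present() {
    ssh_dir="${1:-$HOME/.ssh}"
    [ -f "$ssh_dir/cluster" ] && [ -f "$ssh_dir/cluster.pub" ]
}

# Moves the SSH directory to a timestamped sibling and prints the new path.
# Moving instead of deleting keeps a backup of everything that was in it.
backup_ssh_dir() {
    ssh_dir="${1:-$HOME/.ssh}"
    backup_dir="${ssh_dir}.bak.$(date +%Y%m%d%H%M%S)"
    if [ ! -d "$ssh_dir" ]; then
        echo "No directory at $ssh_dir; nothing to back up." >&2
        return 1
    fi
    mv "$ssh_dir" "$backup_dir" && echo "$backup_dir"
}
```

You could run `cluster_keys_present || backup_ssh_dir` in your existing session, then log in from a second terminal and check whether ~/.ssh has been recreated. If you had other keys or a config file in the old directory, copy them back from the backup afterwards.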

Thanks for your help, the problem is solved.

I also want to mention that only discovery1.usc.edu regenerated ~/.ssh for me. Logging in to discovery2.usc.edu did not do the same, but I don’t think that is a big problem.