Error running code after maintenance

zgou · May 27, 2021, 7:31am

After the maintenance of the Discovery cluster, when I use salloc to request a core and run my code, an error like this pops up. This did not happen before the maintenance. Does anyone know why this happens?

dstrong · May 27, 2021, 10:28pm

@zgou What modules did you have loaded? What does your shell script do? Could you share if possible?

zgou · May 27, 2021, 10:46pm

I load the usc module. The shell script looks like this:

AEROS is software that I compiled and installed myself in my own folder. In fact, I can submit a Slurm job to run AEROS with 6 cores, but I cannot run AEROS in an interactive job started with the salloc command. The Slurm script looks like this:
#!/bin/bash

#SBATCH --ntasks=1
#SBATCH --cpus-per-task=6
#SBATCH --time=1:00:00
#SBATCH --mem=5000MB

cd /home1/zgou/RareEvent/simulation/Hexbar.ymtt/NewMesh/
module load gcc/8.3.0 openmpi/4.0.2 pmix/3.1.3
export OMP_NUM_THREADS=6

./run.sh

dstrong · May 27, 2021, 11:24pm

And AEROS was compiled with openmpi/4.0.2? I see you’re only requesting 1 CPU for the interactive job. Try salloc --cpus-per-task=6 and see if it works then. There was a change to salloc with our recent Slurm update, so this issue is perhaps related to that.

zgou · May 27, 2021, 11:45pm

Yes, AEROS is compiled with openmpi/4.0.2. Just now I used salloc to request 6 CPUs and ran it again, but it resulted in the same error. I didn’t encounter this error before the maintenance.

dstrong · May 28, 2021, 12:06am

Okay, I was able to reproduce this, but running the command with srun solved it. Try running srun ./run.sh.

zgou · May 28, 2021, 12:27am

It’s not working in my case. As the picture below shows, I requested 6 CPUs and used “srun ./run.sh”, and then a new error appeared: “exec format error”. But when I exit the interactive session and run “./run.sh” on the login node, it works, although I am not supposed to do this on the login node. However, it does show that there is nothing wrong with the format of the executable itself.

dstrong · May 28, 2021, 12:34am

Ah, right. You’ll need to add #!/bin/bash to the top of your shell script.
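
For context, the likely reason: srun starts the file directly instead of going through an interactive shell, so the kernel has to work out how to run it. Without a #! line it cannot, and the launch fails with "Exec format error". An interactive bash session falls back to interpreting such a file as a shell script, which would explain why ./run.sh still worked on the login node. Below is a minimal sketch of a pre-flight check. The helper name has_shebang and the two demo scripts are made up for illustration and are not part of Slurm or of the original run.sh.

```shell
#!/bin/bash
# Sketch: check whether a script starts with a shebang ("#!") before
# handing it to srun. has_shebang is a hypothetical helper name.

has_shebang() {
    # Compare the first two bytes of the file against "#!".
    # A missing or unreadable file yields an empty string, so the check fails.
    [ "$(head -c 2 -- "$1" 2>/dev/null)" = '#!' ]
}

# Demonstration on two throwaway scripts in a temporary directory.
demo_dir=$(mktemp -d)
printf 'echo hello\n' > "$demo_dir/no_shebang.sh"
printf '#!/bin/bash\necho hello\n' > "$demo_dir/with_shebang.sh"
chmod +x "$demo_dir/no_shebang.sh" "$demo_dir/with_shebang.sh"

for script in "$demo_dir/no_shebang.sh" "$demo_dir/with_shebang.sh"; do
    if has_shebang "$script"; then
        echo "$(basename "$script"): has a shebang"
    else
        echo "$(basename "$script"): no shebang, a direct exec may fail with 'Exec format error'"
    fi
done

# Remove the throwaway scripts.
rm -rf "$demo_dir"
```

Running has_shebang ./run.sh before srun ./run.sh would flag the problem from this thread; the fix itself is still just making #!/bin/bash the first line of run.sh.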

zgou · May 28, 2021, 12:37am

Cool, it works! Thank you so much.