Out-of-memory problem with Slurm


I am running an algorithm with sbatch, but it doesn’t work and the slurm.out file has the following error message:

slurmstepd: error: Detected 1 oom-kill event(s) in StepId=5262200.0 cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.

srun: error: d11-16: task 0: Out Of Memory

Any idea on how to fix this?


Are you submitting your job from your home directory? That could be a problem. Try submitting it from your project directory instead.

@luisalbe The out-of-memory error means the job exceeded the memory it was allocated, so you’ll have to increase your memory request, using either the --mem-per-cpu option or the --mem (per-node) option. Alternatively, you could reduce the memory your job actually uses, if that’s possible. Memory profiling can help you identify the parts of your code that use the most memory.
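As a sketch, here is what raising the memory request might look like in a batch script, followed by SLURM commands for checking how much memory a finished job actually used. The job name, script name, and the 8G figure are placeholders; adjust them to your job, and note that the exact fields available to sacct can vary with your site’s SLURM configuration:

```
#!/bin/bash
#SBATCH --job-name=myjob          # placeholder job name
#SBATCH --mem=8G                  # total memory per node; raise this if you hit OOM
## or, instead of --mem, request memory per allocated CPU:
##SBATCH --mem-per-cpu=4G

srun ./my_algorithm               # placeholder for your actual command
```

After the job finishes (or is killed), you can compare requested vs. used memory:

```
# MaxRSS shows the peak resident memory of the job steps
sacct -j 5262200 --format=JobID,ReqMem,MaxRSS,State

# seff (if installed at your site) gives a memory-efficiency summary
seff 5262200
```

If MaxRSS is close to or above ReqMem, the request was too small; bump --mem (or --mem-per-cpu) above the observed peak with some headroom.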