GPU jobs pending despite idle resources

Hi there,

I am trying to understand the Slurm allocation algorithm. In particular, I have submitted some jobs that require an a40 GPU, and they are shown as queued, waiting for resources. However, the output of nodeinfo shows that there are multiple (12?) idle GPUs of the type I’ve requested, so I was wondering what I am missing and why my allocations are not being fulfilled. I read on the CARC page that the limit on GPU usage is 36, and I am only allocated 3 right now.

Thanks!
Shushan

Hi Shushan,

It appears that the jobs you are referring to are no longer in the queue as of this writing, so it’s hard to know why they were pending at that time.

Generally you see this when the idle GPUs are being held for a higher-priority job that will start soon. Slurm will still allow other jobs to run on these nodes (this is backfill scheduling), but only if they can be scheduled to complete before the higher-priority job begins.
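
To make that rule concrete, here is a rough sketch in Python of the check described above. It is only an illustration, not Slurm's actual scheduler code: the function name can_backfill and the times used are made up for the example, and the real scheduler also weighs other factors.

```python
from datetime import datetime, timedelta


def can_backfill(now, time_limit, reserved_start):
    """Toy version of the rule: a waiting job may use idle resources only if
    its requested time limit lets it finish before the higher-priority job
    is due to start. (Hypothetical helper, not part of Slurm.)"""
    return now + time_limit <= reserved_start


# Hypothetical scenario: the idle GPUs are held for a job starting in 4 hours.
now = datetime(2024, 1, 1, 12, 0)
reserved_start = now + timedelta(hours=4)

# A job requesting 2 hours fits in the gap; one requesting 48 hours does not.
print(can_backfill(now, timedelta(hours=2), reserved_start))
print(can_backfill(now, timedelta(hours=48), reserved_start))
```

In practice this means a job with a shorter requested time limit (the --time option) is more likely to fit into such a gap. While a job is still pending, squeue -u $USER --start should show its estimated start time and the reason it is waiting.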

-Cesar