Slurm commands for monitoring jobs now available

Following up on our last post, we have now made three commands available for monitoring your jobs on CARC systems: myqueue, jobhist, and jobinfo. All three are available on the login nodes.


myqueue prints job queue information for your jobs.

$ myqueue
      Job ID     Job Name  Partition    State     Elapsed     Nodelist(Reason)
------------ ------------ ---------- -------- ----------- --------------------
      487418                    main  PENDING        0:00         (Dependency)
      486794                    main  RUNNING  1-12:13:20               d05-06


jobhist prints a compact history of your jobs with basic job information.

$ jobhist
 Startdate        Job ID     Job Name  Partition      State    Elapsed Nodes CPUs  Memory
---------- ------------- ------------ ---------- ---------- ---------- ----- ---- -------
2021-06-25        484933                   debug     FAILED   00:00:05     1    4    30Gn
2021-06-25        484934                   debug  COMPLETED   00:00:19     1    4    30Gn
2021-06-26        486288  interactive       main  COMPLETED   00:20:53     1   16     2Gc
2021-06-27        486290  interactive      debug    TIMEOUT   00:30:02     1   16     2Gc
2021-06-28        486624                oneweek  COMPLETED 3-00:30:22     1   16   120Gn

Add an integer argument to the command to print the job history for that many days in the past (for example, jobhist 14 for the past 14 days). Without an argument, the default is 7 days.
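
Because jobhist prints one row per job, its output can be filtered with standard text tools. The following is a minimal sketch, not part of the CARC commands: the helper name unsuccessful_jobs is hypothetical, and the sample rows are taken from the output above so the snippet runs without access to the cluster.

```shell
# Hypothetical helper: keep only rows that contain FAILED or TIMEOUT as a whole word.
# Note: a job whose name is literally FAILED or TIMEOUT would also match.
unsuccessful_jobs() { grep -E -w 'FAILED|TIMEOUT'; }

# Sample rows taken from the jobhist output above.
sample='2021-06-25 484933      debug     FAILED   00:00:05     1    4    30Gn
2021-06-25 484934      debug  COMPLETED   00:00:19     1    4    30Gn
2021-06-26 486288         interactive       main  COMPLETED   00:20:53     1   16     2Gc
2021-06-27 486290         interactive      debug    TIMEOUT   00:30:02     1   16     2Gc
2021-06-28 486624        oneweek  COMPLETED 3-00:30:22     1   16   120Gn'

result=$(printf '%s\n' "$sample" | unsuccessful_jobs)
printf '%s\n' "$result"

# On a login node, the same filter would be applied to live output, e.g.:
#   jobhist 14 | unsuccessful_jobs
```

With the sample rows, this prints the two rows for jobs 484933 and 486290.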


jobinfo prints detailed information for a pending, running, or completed job, given its job ID.

$ jobinfo 483699
Job ID               : 483699
Job name             :
User                 : ttrojan
Account              : ttrojan_123
Cluster              : discovery
Partition            : oneweek
Nodes                : 1
Nodelist             : e01-76
CPUs                 : 16
GPUs                 : 0
State                : FAILED
Exit code            : 1:0
Submit time          : 2021-06-22T12:38:06
Start time           : 2021-06-22T12:38:15
End time             : 2021-06-25T12:40:53
Wait time            :   00:00:09
Reserved walltime    : 3-07:00:00
Used walltime        : 3-00:02:38
Used CPU time        : 3-14:00:01
% User (computation) : 93.82%
% System (I/O)       :  6.18%
Memory reserved      : 248G/node
Max memory used      : 218.60G (e01-76)
Max disk write       : 143.41G (e01-76)
Max disk read        : 352.60G (e01-76)

We will also add job efficiency information to this output at a later date.
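
The time fields in the jobinfo example are related by simple subtraction: the wait time is the start time minus the submit time, and the used walltime is the end time minus the start time. The sketch below reproduces both values from the timestamps shown. It is only an illustration of the arithmetic, not how jobinfo itself is implemented; the helper names to_epoch and fmt are hypothetical, and a date command that supports the -d option (such as GNU date) is assumed.

```shell
# Timestamps copied from the jobinfo example.
submit="2021-06-22T12:38:06"
start="2021-06-22T12:38:15"
end="2021-06-25T12:40:53"

# Hypothetical helper: convert a timestamp to seconds since the epoch.
# Assumes a date command with -d (e.g. GNU date); -u sidesteps DST shifts.
# The "T" separator is replaced by a space for wider compatibility.
to_epoch() { date -u -d "$(printf '%s' "$1" | tr 'T' ' ')" +%s; }

# Hypothetical helper: format seconds as [D-]HH:MM:SS, omitting zero days.
fmt() {
  local s=$1
  local d=$(( s / 86400 ))
  if [ "$d" -gt 0 ]; then printf '%d-' "$d"; fi
  printf '%02d:%02d:%02d\n' $(( s % 86400 / 3600 )) $(( s % 3600 / 60 )) $(( s % 60 ))
}

wait_s=$(( $(to_epoch "$start") - $(to_epoch "$submit") ))
used_s=$(( $(to_epoch "$end") - $(to_epoch "$start") ))

echo "Wait time     : $(fmt "$wait_s")"
echo "Used walltime : $(fmt "$used_s")"
```

The results, 00:00:09 and 3-00:02:38, match the Wait time and Used walltime fields in the example.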