Job monitoring/management commands
There are a number of commands to monitor jobs on Artemis. A brief set of useful commands is shown below. For more commands, see the PBS Professional user manual.
| Command | Description |
|---|---|
| qstat -u abcd1234 | show status of abcd1234’s jobs |
| qdel 1234567 | delete job 1234567 from queue |
| qstat | show status of all jobs |
| qstat -f 1234567 | show detailed stats for job 1234567 |
| qstat -xf 1234567 | show detailed stats for job 1234567, even after it has finished |
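
The detailed output of qstat -f can be long. As a minimal sketch, the filter below keeps only the lines that usually matter when checking on a job. The function name job_summary is hypothetical, and the attribute names job_state, Exit_status and resources_used are assumptions based on typical PBS Professional output; check them against what qstat -f actually prints on Artemis.

```shell
# Hypothetical helper: read `qstat -f` output on standard input and keep
# only the job state, exit status and resource usage lines.
# The attribute names below are assumed, not taken from the Artemis docs.
job_summary() {
    grep -E 'job_state|Exit_status|resources_used\.'
}

# Example use on a login node (not run here, as it needs PBS):
#   qstat -xf 1234567 | job_summary
```

Because the helper only filters text, it works the same way with qstat -f for running jobs and qstat -xf for finished ones.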
When a job finishes, it produces three output files: one for standard output, one for standard error, and a resource usage file. The files are named as follows:
<JobName>.o<JobID> – Standard output file
<JobName>.e<JobID> – Standard error file
<JobName>.o<JobID>_usage – Resource usage file
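
To illustrate the naming pattern above, here is a small sketch that prints the three expected file names for a given job name and job ID. The function name job_output_files is hypothetical, and the example values TestJob and 1234567 are placeholders.

```shell
# Hypothetical helper: print the three output file names for a job,
# following the <JobName>.o<JobID> pattern described above.
job_output_files() {
    job_name="$1"
    job_id="$2"
    printf '%s.o%s\n' "$job_name" "$job_id"        # standard output
    printf '%s.e%s\n' "$job_name" "$job_id"        # standard error
    printf '%s.o%s_usage\n' "$job_name" "$job_id"  # resource usage
}

job_output_files TestJob 1234567
```

The printed names can be passed to commands such as cat or less once the job has finished.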
If you don’t redirect standard output or standard error to a file, they will be written to the .o or the .e files, which only appear after your job finishes. These files may contain useful information about why your job terminated before it finished.
The resource usage file contains details about how long your job ran for and also the memory used by your job. You can use the information in the resource usage file to optimise your walltime and memory requests for future jobs. An example resource usage file is shown below:
Job Id: 1050977.pbsserver for user abcd1234 in queue small
Job Name: TestJob
Project: RDS-ICT-PANDORA-RW
Exit Status: 0
Walltime requested: 00:03:00 : Walltime used: 00:01:36
Cpus requested: 48 : Cpu Time: 00:36:38 : Cpu percent: 3102
Mem requested: 8gb : Mem used: 2342348kb
VMem requested: None : VMem used: 2342348kb
PMem requested: None : PMem used: None
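
As a rough sketch of how these numbers can guide future requests, the helpers below convert the walltime and memory values from the example into easier units. The function names to_seconds, walltime_percent and kb_to_gb are hypothetical, the values are copied by hand from the example file rather than parsed from it, and 1 gb is assumed to mean 1024 × 1024 kb.

```shell
# Convert an HH:MM:SS string to a number of seconds.
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Percentage of the requested walltime that was used (rounded down).
# Arguments: walltime used, walltime requested.
walltime_percent() {
    used=$(to_seconds "$1")
    requested=$(to_seconds "$2")
    echo $(( used * 100 / requested ))
}

# Convert a value such as 2342348kb to gb with two decimal places,
# assuming 1 gb = 1024 * 1024 kb.
kb_to_gb() {
    awk -v kb="${1%kb}" 'BEGIN { printf "%.2f\n", kb / 1024 / 1024 }'
}

walltime_percent 00:01:36 00:03:00   # prints 53
kb_to_gb 2342348kb                   # prints 2.23
```

In this example the job used about half of its requested walltime and roughly 2.23 gb of the 8 gb requested, so a similar future job could probably request less memory while keeping a safety margin.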