The National Institute for Computational Sciences

Running Jobs


When you log in, you will be directed to one of the login nodes. The login nodes should only be used for basic tasks such as file editing, code compilation, and job submission. Please do not run production jobs on the login nodes. If you run a production job directly on a login node, it will be administratively terminated. Instead, use the ACF’s compute resources for production jobs. In this document, you will learn how to execute, monitor, and modify jobs on ACF resources.

Job Submission

Batch Scripts

Batch scripts are used to submit jobs to the ACF. Batch scripts allow users to run non-interactive batch jobs. These jobs submit a group of commands, run them through the queue, and then output the results for review.

All non-interactive jobs must be submitted to the ACF using job scripts via the qsub command. Batch scripts are shell scripts that contain PBS flags and commands to be interpreted by the shell. The batch script is submitted to the system’s resource manager to be parsed, queued, and executed. At the time of this writing, up to 2500 non-interactive jobs may be submitted by a single user at one time.

To write a batch script, use your preferred text editor, such as vim, nano, or emacs. Insert the following information into the editor in the order presented. Deviating from this order will produce undesirable results.

  1. First, specify the script’s interpreter. If you do not specify the interpreter, the system will use its default shell. At the time of this writing, the default shell on the ACF is bash. Use the #PBS -S /path/to/shell option as the first line of your batch script. Table 2.1 shows the shells available on the ACF. Any of the interpreters listed in the table may be used, but only one interpreter may be specified in a single batch script.

    Table 2.1 - Options for the script interpreter
    /bin/sh         /bin/bash
    /usr/bin/csh    /usr/bin/ksh

  2. Second, specify the necessary PBS submission options. Each option must be preceded by #PBS.

  3. Third, specify the shell commands. These commands are the executable content of the batch script and must follow the final #PBS option. It is best to execute cd $PBS_O_WORKDIR as the first command in the script so that your job is executed within Lustre space. Switch to Lustre space before submitting the job so that this environment variable will reference your Lustre space. You can change your directory to Lustre space by executing cd $SCRATCHDIR. Additionally, you may use the mpirun command to specify how many MPI ranks will be used by your job. In Figure 2.1, mpirun uses 28 MPI tasks, one per core. The ./job_file target is the file that the job will execute. For more information on MPI and the ranking process, review the Using mpirun section of this document.

Figure 2.1 depicts a basic batch script. The specifics of this example will be explained in subsequent sections.

#PBS -S /bin/bash
#PBS -l partition=general,feature=sigma_bigcore,nodes=1,walltime=24:00:00

mpirun -n 28 ./job_file
Figure 2.1 - Basic Batch Script
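
The three steps above can be combined into a slightly fuller script. The following is a minimal sketch, not an official ACF template: the file name example_job.pbs is hypothetical, the ACF-UTK0011 project should be replaced with a project you belong to, and the qsub call is skipped when qsub is not on your PATH.

```shell
#!/bin/bash
# Sketch: write a batch script to disk, then submit it if qsub is available.
script=example_job.pbs   # hypothetical file name

# The quoted 'EOF' keeps $PBS_O_WORKDIR unexpanded until the job runs.
cat > "$script" <<'EOF'
#PBS -S /bin/bash
#PBS -A ACF-UTK0011
#PBS -l partition=general,feature=sigma_bigcore,nodes=1,walltime=24:00:00

# Run from the directory where qsub was invoked (ideally Lustre space).
cd "$PBS_O_WORKDIR"

mpirun -n 28 ./job_file
EOF

if command -v qsub >/dev/null 2>&1; then
    qsub "$script"
else
    echo "qsub not found; wrote $script without submitting"
fi
```

Run this from your Lustre space (cd $SCRATCHDIR) so that $PBS_O_WORKDIR points there when the job starts.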

Altering Batch Jobs

After you submit a job, you may need to modify it. Several commands facilitate these modifications. If you need to modify a running job, please contact the ACF staff. Certain modifications can only be performed by administrators. For further information on the commands presented here, use man <command-name> on the ACF for documentation on the command, including the options and arguments you may use with it.

Remove a Job from the Queue

Jobs in the queue in any state can be stopped and removed from the queue using the qdel command. For example, qdel 1234 would remove the job with that identifier. Note that job identifiers can be viewed with the qstat -a command.

Hold a Queued Job

Jobs in the queue that are not running may be placed on hold using the qhold command. For example, to move the job with the identifier of 1234 into a hold state, use qhold 1234. Jobs placed on hold remain in the queue, but they will not be executed.

Release a Held Job

When you place a job on hold, it will not execute until it is released. To release a job in a held state, use the qrls command. For example, to release the job with the identifier of 1234 from a held state, use qrls 1234.

Modify Job Details

Non-running jobs or jobs in a held state can be modified with the qalter command. The various uses of this command are presented in Table 2.2. For walltime modifications, please note that you cannot specify a new walltime that exceeds the maximum walltime of the queue your job is in.

Table 2.2 - qalter options
Command                                   Description
qalter -N <newname> <jobid>               Modifies a job's name
qalter -l nodes=<numnodes> <jobid>        Modifies the number of requested nodes
qalter -l walltime=<hh:mm:ss> <jobid>     Modifies the job's walltime
qalter -W depend=type:argument <jobid>    Sets dependencies for a job
qalter -W depend=type <jobid>             Removes dependencies from a job

To verify that the changes to your job completed successfully, use qstat -a <jobid>.
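
As an illustration of how these commands fit together, the sketch below holds a queued job, changes its walltime, releases it, and then checks the result. The run helper and the DRY_RUN variable are hypothetical conveniences added for this example: by default the commands are only printed, so you can review them before setting DRY_RUN=0 on the ACF. The job identifier 1234 and the new walltime are placeholders.

```shell
#!/bin/bash
# Sketch: hold, alter, release, and verify a queued job.
DRY_RUN=${DRY_RUN:-1}    # hypothetical switch: 1 = print only, 0 = execute

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

jobid=${1:-1234}         # placeholder job identifier

run qhold "$jobid"                          # keep the job from starting
run qalter -l walltime=02:00:00 "$jobid"    # must not exceed the queue maximum
run qrls "$jobid"                           # make the job eligible again
run qstat -a "$jobid"                       # confirm the change
```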

Interactive Batch Jobs

Interactive batch jobs allow users to directly manipulate compute resources. One common use for interactive batch jobs is debugging. This section demonstrates how to run interactive jobs and provides common usage tips.

Users are not allowed to run interactive workloads directly on the login nodes. Any such work started on a login node will be administratively terminated. Instead, request an interactive session on the compute nodes with the qsub -I command. Figure 2.2 shows the syntax used to run an interactive job. Table 2.3 explains the interactive submission options. Note that the -I option is an upper-case "i," not a lower-case "l."

qsub -I -A ACF-UTK0011 -l nodes=1,walltime=1:00:00
Figure 2.2 - Syntax for an Interactive Job

Table 2.3 - Options for Interactive Jobs
Option      Argument      Description
-I          N/A           Start an interactive session
-A          <account>     Change to another account
-l nodes    <numnodes>    Request the specified number of nodes

After running this command, you must wait until enough compute nodes are available. Once the job starts, the standard input and standard output of this terminal will be linked directly to the head node of the allocated resource. The executable should then be placed on the same line after the mpirun command. Figure 2.3 provides an example of what to type when the interactive job starts.

mpirun -n 16 ./job_file
Figure 2.3 - Commands to Run When the Interactive Job Begins

Issuing the exit command will end the interactive job.

PBS Usage

This section gives an overview of common PBS options. Table 3.1 lists the PBS options necessary to run a job. Table 3.2 lists PBS options that might be useful to your job.

PBS Options

Table 3.1 - Necessary PBS Options
Option  Usage                                      Description
S       #PBS -S <shell>                            Sets the shell to interpret the job script. Please refer to the Batch Scripts section for a list of the available shells.
A       #PBS -A <account>                          Causes the job time to be charged to <account>. The account string is typically composed of three letters followed by three digits and optionally followed by a subproject identifier. If no account is specified, the system will use your default project. Please refer to the Projects section for more information.
l       #PBS -l nodes=<numnodes>:ppn=<numcores>    Number of requested nodes and the number of requested cores on each node. The ppn option is not required, but it allows you to control process placement.
        #PBS -l mem=<requested-memory>             Amount of requested memory for the job. Specify the amount in MB, GB, or TB. This option is not required, but it enables you to specify the amount of memory your job will need.
        #PBS -l walltime=<time>                    Maximum wall-clock time. <time> is in HH:MM:SS format. The default walltime is one hour.

Table 3.2 - Useful PBS Options
Option  Usage                           Description
o       #PBS -o <name>                  Writes standard output to <name> instead of <job script>.o$PBS_JOBID. $PBS_JOBID is an environment variable created by PBS that contains the PBS job identifier.
e       #PBS -e <name>                  Writes standard error to <name> instead of <job script>.e$PBS_JOBID.
j       #PBS -j {oe,eo}                 Combines standard output and standard error into the standard error file (eo) or the standard out file (oe).
m       #PBS -m a                       Sends email to the submitter when the job aborts.
        #PBS -m b                       Sends email to the submitter when the job begins.
        #PBS -m e                       Sends email to the submitter when the job ends.
M       #PBS -M <address>               Specifies email address to use for -m options.
N       #PBS -N <name>                  Sets the job name to <name> instead of the name of the job script.
q       #PBS -q <queue>                 Directs the job to run in the specified queue. This option is not required to run in the default queue.
l       #PBS -l feature=<feature>       Selects the desired node feature set.
        #PBS -l partition=<partition>   Selects the desired partition to use for your job.

When you use the -l (lower-case “l”) option with nodes and ppn, they must be separated with colons (:). For example, if you request two 24-core nodes and 12 cores per node, you would use #PBS -l nodes=2:ppn=12 in your batch script. Be aware that the -l option does not tolerate spaces; each additional argument must be separated by commas.
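
The separator rules can be easy to mistype, so here is a small sketch that assembles the argument of the -l option. The pbs_resource_list helper is hypothetical and exists only for this example; it joins nodes and ppn with a colon and appends walltime after a comma, with no spaces anywhere.

```shell
#!/bin/bash
# Sketch: build a -l argument with the correct separators.
# nodes and ppn are joined with ":"; further arguments are joined with ",".
pbs_resource_list() {
    local nodes=$1 ppn=$2 walltime=$3
    printf 'nodes=%s:ppn=%s,walltime=%s\n' "$nodes" "$ppn" "$walltime"
}

resources=$(pbs_resource_list 2 12 01:00:00)
echo "#PBS -l $resources"
```

The printed line can be pasted into a batch script, or the string can be passed to qsub -l on the command line.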

If you need to pass arguments to your batch script, the -F option of the qsub command provides this capability. For instance, to pass arguments to a batch script named main_job_script from within your working directory, use qsub -F "arguments" main_job_script. Use straight quotation marks around the argument string. The batch script must accept positional parameters for this to work. Please refer to the Positional Parameters section of the Advanced Bash-Scripting Guide for more information.
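
A batch script that accepts positional parameters might look like the sketch below. The parameter meanings (an input file and a rank count) and their default values are assumptions made for this example, and the mpirun line only runs when both mpirun and ./job_file are present.

```shell
#PBS -S /bin/bash
#PBS -l nodes=1,walltime=01:00:00

# Sketch of main_job_script. Arguments arrive as $1, $2, ... when submitted
# with, for example: qsub -F "input.dat 16" main_job_script
input_file=${1:-input.dat}   # hypothetical default input file
ranks=${2:-16}               # hypothetical default rank count

echo "Running ./job_file on $input_file with $ranks ranks"

if command -v mpirun >/dev/null 2>&1 && [ -x ./job_file ]; then
    mpirun -n "$ranks" ./job_file "$input_file"
fi
```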

Please do not use the PBS -V option. This can propagate large numbers of environment variable settings from the submitting shell into a job which may cause problems for the batch environment. Instead of using PBS -V, please pass only necessary environment variables using -v <comma_separated_list_of_needed_envars>. You can also include module load statements in the job script. Figure 3.1 shows an example of this process.

Figure 3.1 - Using PBS -v for Environment Variables
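
One way to apply this advice is sketched below. The variable names OMP_NUM_THREADS and INPUT_DIR and the script name job_script are assumptions chosen for illustration. Listing a variable name without a value after -v passes its current value from the submitting shell.

```shell
#!/bin/bash
# Sketch: pass only the variables the job needs instead of using -V.
export OMP_NUM_THREADS=4          # hypothetical variable used by the job
export INPUT_DIR="$PWD/data"      # hypothetical variable used by the job

needed_vars="OMP_NUM_THREADS,INPUT_DIR"   # comma-separated, no spaces

if command -v qsub >/dev/null 2>&1; then
    qsub -v "$needed_vars" job_script
else
    echo "would run: qsub -v $needed_vars job_script"
fi
```

Software that the job depends on is better handled with module load statements inside the job script itself than by exporting the submitting shell's environment.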

Environment Variables

This section gives an overview of useful environment variables within PBS jobs. These variables contain useful information that can simplify your batch scripts. Table 3.3 lists and describes these variables.

Table 3.3 - PBS Environment Variables
Variable          Description
$PBS_O_WORKDIR    The directory from which the batch job was submitted.
$PBS_JOBID        The job's identifier.
$PBS_NNODES       The number of logical cores requested by the job.

To refer to the directory from which the batch job was submitted, use the $PBS_O_WORKDIR environment variable. Figure 3.2 shows how to use this variable with the builtin shell command cd. If you executed this command during an interactive job, your working directory would change to the directory from which you submitted the batch job.

cd $PBS_O_WORKDIR
Figure 3.2 - Changing to the PBS Submission Directory

In situations where you wish to append a job’s ID to the standard output and standard error files, the $PBS_JOBID environment variable is useful. For example, placing the PBS option shown in Figure 3.3 in a batch script would write the job’s output to the specified file.

#PBS -o scriptname.$PBS_JOBID
Figure 3.3 - Using $PBS_JOBID to redirect standard output to a separate file

The $PBS_NNODES variable is most useful when used with mpirun. For example, rather than specifying the number of ranks manually, you could instead use the command shown in Figure 3.4.

mpirun -n $PBS_NNODES ./job_file
Figure 3.4 - Using $PBS_NNODES with mpirun
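
The sketch below shows the three variables used together at the top of a job. The fallback values after :- are assumptions added so that the snippet also runs outside of a PBS job (for example, while testing on your own machine); inside a job, PBS supplies the real values.

```shell
#!/bin/bash
# Sketch: use PBS environment variables, with fallbacks for testing outside PBS.
workdir=${PBS_O_WORKDIR:-$PWD}      # submission directory
jobid=${PBS_JOBID:-interactive}     # job identifier
ranks=${PBS_NNODES:-1}              # logical cores requested by the job

cd "$workdir" || exit 1
echo "job $jobid: $ranks ranks in $workdir"

# Inside a real job, the rank count can be handed straight to mpirun:
#   mpirun -n "$ranks" ./job_file
```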

Using mpirun

The mpirun command facilitates the execution of MPI programs. These programs execute in parallel across multiple nodes to enhance performance and resource utilization. When you use mpirun, you can specify the total number of ranks the program should use, in addition to the number of processes to run on each node. Specifying both gives you greater control over the execution of your jobs on the ACF.

Before you use mpirun in your job, please review the Systems Overview document to familiarize yourself with the core counts of each node. Understanding the number of cores at your disposal is critical to using mpirun correctly.

To specify the number of ranks for your MPI program, use the -n option of mpirun. For instance, if you execute mpirun -n 16 ./test_job on a single Beacon node, one rank will be placed on each core because one Beacon node has a total of sixteen cores between two processors.

mpirun is not limited to one rank per core, however; nodes can be oversubscribed. To oversubscribe a node is to specify more ranks than the node has cores. By default, additional ranks will not be placed until all the cores on each node are filled. To illustrate this process, consider a job that has requested four Rho nodes. Each Rho node has sixteen cores, so the job has 64 cores allocated to it. If this job executes mpirun -n 256 ./rho_job, the first 64 ranks will be placed one per core across all four nodes. After all 64 cores have received a rank, the next 64 ranks will be placed one per core in the same way. This process continues until all 256 ranks have been placed, which leaves four ranks on each core.
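
The arithmetic behind the Rho example can be checked with a few lines of shell. This is only a worked calculation using the numbers from the paragraph above; it does not call mpirun.

```shell
#!/bin/bash
# Worked example: 256 ranks on four 16-core Rho nodes.
nodes=4
cores_per_node=16
ranks=256

total_cores=$(( nodes * cores_per_node ))     # 4 * 16 = 64
ranks_per_core=$(( ranks / total_cores ))     # 256 / 64 = 4

echo "$total_cores cores, $ranks_per_core ranks per core"
```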

If the number of ranks is fewer than the available cores on a node, the ranks are evenly spread across processors. As mentioned previously, one Beacon node has sixteen cores between two processors. If a job executes mpirun -n 8 on one of these nodes, four ranks will be placed on the first processor and four ranks will be placed on the second processor.

For greater control over rank placement, mpirun provides the -ppn option. ppn (processes per node) defines how many ranks should execute on each node. By default, ranks are placed based upon the number of cores each node contains, and -ppn alone does not change this. As an example, using mpirun -n 45 -ppn 15 ./ppn_job across three Beacon nodes would still place sixteen ranks on each of the first two nodes and thirteen ranks on the last one. To override this behavior, use the -f $PBS_NODEFILE option with mpirun so that it can apply the -ppn option properly. If you execute mpirun -n 45 -ppn 15 -f $PBS_NODEFILE ./ppn_job, it will place fifteen ranks on each of the three Beacon nodes.
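
To make the two placements concrete, the sketch below models them with simple arithmetic. The place_ranks function is a hypothetical illustration of the behavior described above, not part of mpirun: it fills each node up to a per-node limit and moves on to the next node. It does not model oversubscription, so any ranks that do not fit are simply left unplaced.

```shell
#!/bin/bash
# Sketch: model how ranks fill nodes up to a per-node limit.
# Usage: place_ranks <total ranks> <ranks per node> <number of nodes>
place_ranks() {
    local remaining=$1 per_node=$2 nodes=$3 out="" n i
    for ((i = 0; i < nodes; i++)); do
        # Each node takes the per-node limit, or whatever is left.
        n=$(( remaining < per_node ? remaining : per_node ))
        out+="${out:+ }$n"
        remaining=$(( remaining - n ))
    done
    echo "$out"
}

echo "default (16 cores per node): $(place_ranks 45 16 3)"
echo "with -ppn 15 and -f:         $(place_ranks 45 15 3)"
```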

Before you attempt to run an MPI program, verify that you have loaded the appropriate compiler and MPI implementation with the module list command. By default, Intel’s MPI implementation is loaded into your environment. You can switch to other implementations with the module swap command. Please refer to the Modules document for more information on how to use the module commands. If you intend to use a Python MPI program, load the mpi4py module.

Job Monitoring

Before and during job execution, you may wish to monitor the job’s status. The ACF features several commands that enable such monitoring. Several of these commands are listed and described below.

To view the status of your submitted jobs, use the qstat -a command. The output of the command shown in Figure 5.1 is explained in Table 5.1. Additionally, the possible statuses for a job are listed in Table 5.2.

> qstat -a 
Job ID   Username  Queue  Jobname  SessID  NDS  TSK  Memory  Time      S  Time
-------  --------  -----  -------  ------  ---  ---  ------  --------  -  --------
102903   lucio     batch  STDIN    9317    --   16   --      01:00:00  C  00:06:17
102904   lucio     batch  STDIN    9590    --   16   --      01:00:00  R  --
Figure 5.1 - Output of qstat -a

Table 5.1 - Explanation of qstat -a Output
Column              Description
Job ID              The first column gives the PBS-assigned job identifier.
Username            The second column gives the submitting user's login name.
Queue               The third column gives the queue into which the job has been submitted.
Jobname             The fourth column gives the PBS job name. This is specified by the PBS -N option in the PBS batch script. If the -N option is not used, PBS will use the name of the batch script.
SessID              The fifth column gives the associated session ID.
NDS                 The sixth column gives the PBS node count.
Tasks               The seventh column gives the number of logical cores requested by the job's -size option.
Requested Memory    The eighth column gives the job's requested memory.
Requested Time      The ninth column gives the job's requested wall time.
S (Status)          The tenth column gives the job's current status. See Table 5.2 for status types.
Elapsed Time        The eleventh column gives the job's time spent in a running status. If the job has not yet entered a run state, the field will be blank.

Table 5.2 - Status Values for Jobs
Status  Description
E       The job has finished running and is exiting.
H       The job is being held.
Q       The job is queued.
R       The job is running.
S       The job is suspended.
T       The job is being transferred to a new location.
W       The job is waiting for execution.
C       The job was completed within the last five minutes.
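
When you have many jobs, a per-status summary can be easier to read than the full listing. The sketch below counts jobs by the status column (the tenth field) of qstat -a output. The count_states function is a hypothetical helper, and it assumes the column layout shown in Figure 5.1, with job lines that begin with a numeric job identifier and job names that contain no spaces.

```shell
#!/bin/bash
# Sketch: count jobs per status from qstat -a output read on standard input.
count_states() {
    # Only lines that start with a digit are job lines; field 10 is the status.
    awk '$1 ~ /^[0-9]/ { count[$10]++ }
         END { for (s in count) print s, count[s] }' | sort
}

# Typical use on the ACF:
#   qstat -a | count_states
```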

To determine the current state of your submitted jobs, use the showq utility. If you execute showq without any options, it will return a list of all jobs on the ACF. To only see jobs that belong to you, execute showq -w user=$USER. Table 5.3 shows the possible states of your jobs.

Table 5.3 - Possible Job States
State      Description
Running    These jobs are currently running.
Idle       These jobs are queued and waiting to be assigned resources by the scheduler. A user is allowed five jobs in the Idle state to be considered by the scheduler.
Blocked    These jobs are ineligible to be considered by the scheduler. Common reasons are that the specified resources are not available or that the user or system has placed a hold on the job.
BatchHold  These jobs are in the queue but are held from consideration by the scheduler, usually because the requested resources are not available in the system or because the resource manager has repeatedly failed to start the job.

To see the status of a specific job in the queue, use the checkjob utility. For example, using checkjob 1234 would return the status of the job with the identifier of 1234. This can be helpful to determine whether a job is blocked and the reason why.

To estimate when a submitted job will start, use the showstart utility. For example, showstart 1234 returns the estimated start time of the job with the identifier of 1234. The estimate is subject to dramatic change, so rerun the command periodically to get a clearer picture of when the job will start. Because showstart cannot reliably predict the actual start time of your jobs, it is best to use the utility only for diagnostic purposes.

Scheduling Policy

Several factors influence the priority of a given job. The major factors are listed and described below.

  1. Jobs that request more nodes get a higher priority.
  2. Priority increases along with a job’s queue wait time. Blocked jobs are not counted because the scheduler does not see them as queued.
  3. The number of jobs submitted by a user influences the priority of those jobs. At the time of this writing, only ten jobs can be executed by a single user at a time. Single core jobs submitted by the same user are generally scheduled on the same node. Users on the same project can share nodes with written permission of the PI.

In certain cases, the priority of a job may be manually increased upon request. To request a priority change for one of your jobs, please submit a ticket to the ACF staff. Include the job ID and the reason for the request.


The ACF uses a condo-based model for scheduling. In a condo-based model, nodes are grouped into logical units consisting of several compute nodes that are scheduled as an independent cluster. Condos are provided for institutional or individual investments. All faculty, staff, and students will have access to an institutional condo provided by funding from their respective institution. Individual investors will be provided exclusive access to a given number of nodes commensurate with their investment level. Investor projects will have exclusive use of the nodes in their condo.

The scheduler uses projects, queues, partitions, features, and quality-of-service attributes to control access to condos. In most cases, the project ID will place a job in the correct condo and no other attributes are needed.


By default, UTK users belong to the ACF-UTK0011 project. UTHSC users belong to the ACF-UTHSC0001 project. Because these projects are opportunistic, jobs are executed as resources become available. At the time of this writing, only 48 jobs across 24 nodes can run at one time for a single opportunistic user. As explained in the PBS Usage section, the -A option specifies the project the job will use. If you need to determine to which projects you belong, navigate to the NICS User Portal and review the “Projects” section. Ensure that your jobs use the appropriate project to guarantee that they will be accepted and run by the scheduler.


Queues are used by the scheduler to aid in the organization of jobs. There are currently two queues: “batch” and “debug.” By default, all jobs are submitted to the "batch" queue and users do not have to indicate that they wish to run in the batch queue.

The ACF has set aside four Rho nodes for debug jobs. Debug jobs are limited to one hour of walltime. To access the debug queue, add #PBS -q debug to your batch script for non-interactive jobs and -q debug for interactive jobs. For example, to submit an interactive debug job to the scheduler, use qsub -I -q debug. Be aware that the -I option in this example is an upper-case “i.” Add any other options you require for the debug job to the command. Please refer to the Interactive Jobs and PBS Usage sections for more information.


Partitions are used to group similar nodes together. Nodes in the ACF are grouped into the partitions listed below. Note that the UTK Institutional Condo uses the general, beacon, and rho partitions by default while individual condos use the general partition.

  • general (consists of skylake, sigma, and sigma_bigcore nodes)
  • beacon
  • rho
  • monster

To request a partition other than the default, use the #PBS -l partition=<partition> option in your batch script for non-interactive jobs and qsub -I -A <project> -l partition=<partition> for interactive jobs.


Features are an attribute that applies to individual nodes. Features are used to explicitly request nodes based on their feature attribute. For example, to request monster nodes, use the #PBS -l feature=monster option in your batch script. For interactive jobs, use qsub -I -A <project> -l feature=<feature>.

QoS (Quality of Service)

Jobs are assigned a quality of service, or QoS, attribute. Jobs are given a specific QoS based on investment type. The QoS defines the minimum and maximum node allocation a given user can have for jobs. It also defines the wall clock limitations. Table 6.1 outlines the parameters of each investment type.

  • Condo: instructs the scheduler to place a job in an individual condo; default QoS for individual condos
  • Campus: instructs the scheduler to place a job in the institutional condo; default QoS for the institutional condo
  • Overflow: instructs the scheduler to place a job first on an individual condo and then overflow into the user's institutional condo
  • Long: instructs the scheduler to place a job in the institutional condo and allow it to run for up to six days; only available to UTHSC projects

To change the default QoS, use the #PBS -l qos=<qos> option in a batch script for non-interactive jobs. For interactive jobs, use qsub -I -A <project> -l qos=<qos>.

Table 6.1 - Investment Types and Limitations
QoS                         Min. Size  Max. Size   Wall Clock Limit
Condo                       1 Node     Condo Max.  28 Days
Campus                      1 Node     24 Nodes    24 Hours
Overflow                    1 Node     24 Nodes    24 Hours
Long (UTHSC Projects Only)  1 Node     24 Nodes    6 Days
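
Before submitting, it can be useful to check a requested walltime against the limits in Table 6.1. The sketch below is one way to do that. The helper functions are hypothetical, and the lower-case QoS names other than campus (condo, overflow, long) are assumed spellings; confirm the names that apply to your project before relying on them.

```shell
#!/bin/bash
# Sketch: check a walltime (HH:MM:SS) against the Table 6.1 wall clock limits.

# Convert HH:MM:SS to seconds; hours may exceed 24. 10# forces base 10
# so that values with leading zeros, such as 08, are parsed correctly.
to_seconds() {
    local h m s
    IFS=: read -r h m s <<< "$1"
    echo $(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

# Wall clock limit in hours for each QoS (28 days, 24 hours, 6 days).
qos_limit_hours() {
    case "$1" in
        condo)           echo $(( 28 * 24 )) ;;
        campus|overflow) echo 24 ;;
        long)            echo $(( 6 * 24 )) ;;
        *)               return 1 ;;
    esac
}

# Returns 0 if the walltime fits, 1 if it is too long, 2 for an unknown QoS.
fits_qos() {
    local limit
    limit=$(qos_limit_hours "$1") || return 2
    [ "$(to_seconds "$2")" -le $(( limit * 3600 )) ]
}

if fits_qos campus 24:00:00; then
    echo "24:00:00 fits within the campus QoS"
fi
```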

Targeting GPU Nodes

If your job(s) require GPUs (graphics processing units), the process for job submission differs. For non-interactive jobs, you specify a partition and feature set that contains GPU nodes in your batch script. For interactive jobs, you specify these options with the -l (lower-case “l”) option of the qsub -I command. You must also load the relevant modules that will use the GPUs, such as tensorflow-gpu. For more information on modules and the commands to manipulate them, please refer to the Modules document.

If you intend to use the Beacon GPU nodes, use either the ACF-UTK0011 or ACF-UTHSC0001 project for both interactive and non-interactive jobs. You may also use the campus QoS policy. At the time of this writing, the Beacon GPU nodes are the only GPU nodes available to all ACF users. Otherwise, specify a project or QoS policy to which you belong that provides access to GPU nodes. Please refer to the Systems Overview document for more information on which condos have access to GPUs.

Figure 7.1 shows a sample batch script that targets GPUs on the ACF using the -A option. Figure 7.2 shows a similar batch script that adds the qos argument of the -l option. If necessary, replace the tensorflow-gpu module with the modulefile you require or add additional modulefiles. For the other options, please refer to the PBS Usage and Batch Scripts sections of this document for more information. Note that ./gpu_job refers to the code that will execute on the nodes allocated to your job. Verify that it is in the same directory as the batch script if you use these examples as they appear.

#PBS -S /bin/bash
#PBS -A ACF-UTK0011
#PBS -l partition=beacon,feature=beacon_gpu,nodes=1,walltime=24:00:00

module load tensorflow-gpu


mpirun -n 16 ./gpu_job
Figure 7.1 – Sample Batch Script to Target GPU Nodes Using the -A Option

#PBS -S /bin/bash
#PBS -l partition=beacon,feature=beacon_gpu,qos=campus,nodes=2,walltime=24:00:00

module load tensorflow-gpu


mpirun -n 32 ./gpu_job
Figure 7.2 – Sample Batch Script to Target GPU Nodes Using the qos Argument

Once you write a batch script that targets GPU nodes, use the qsub command to submit the job to the scheduler. It will enter the queue and run when resources become available.

To target GPU nodes with an interactive job, follow the general process described in the Interactive Jobs section of this document. For the -A option, specify ACF-UTK0011 or ACF-UTHSC0001 if you intend to use Beacon GPU nodes. If not, specify a project to which you belong that provides GPU resources. You may also use the qos argument with the -l (lower-case "l") option. Place the qos argument after the feature argument. For example, using the qsub -I -A ACF-UTK0011 -l partition=beacon,feature=beacon_gpu,qos=campus,nodes=1,walltime=1:00:00 command submits an interactive job to the scheduler that will use a single Beacon GPU node on the default campus project. Note that there cannot be any spaces between the arguments of the -l option. For more information on the other options, please refer to the PBS Usage section of this document.

After the interactive job starts, load the relevant modulefiles with the module load command so that you can utilize the allocated GPUs. Please refer to the Modules document for more information. To query for the GPU and its available driver, execute the nvidia-smi command after the interactive job starts. Figure 7.3 shows a possible output of this command.

[...@acf-bk003 ~]$ nvidia-smi
Tue Dec 31 16:30:13 2019
| NVIDIA-SMI 418.67       Driver Version: 418.67       CUDA Version: 10.1     |
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|   0  Tesla K20Xm         On   | 00000000:81:00.0 Off |                    0 |
| N/A   18C    P8    29W / 235W |      0MiB /  5700MiB |      0%      Default |
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|  No running processes found                                                 |
Figure 7.3 – Sample Output of the nvidia-smi Command


Last Updated: 03 / 03 / 2020