Batch Jobs
- commands: qsub, qstat, qdel
- qsub
- qstat
- qdel
- examples
- serial programme
- parallel: MPI
- parallel: OpenMP
commands: qsub, qstat, qdel
Within the alibaba cluster, the batch queuing system torque is used, an open-source resource manager that provides control over batch jobs running on the compute nodes.
The most important commands are qsub for submitting a job, qstat for monitoring its status, and qdel for deleting a job.
For a description of these and other related commands:
qalter, pbs_alterjob, pbs_statjob, pbs_statque, pbs_statserver, pbs_submit, pbs_job_attributes, pbs_queue_attributes, pbs_server_attributes, pbs_resources_*
see http://www.clusterresources.com/wiki/doku.php?id=torque:torque_wiki or the corresponding man-pages.
qsub
The qsub command is usually called with the filename of a script as its argument. That script holds the job parameters as well as the call of the actual programme. Parameters are placed in comment lines ("#") at the top of the script and start with the directive prefix "PBS" followed by the parameter setting, e.g.
...
#PBS -l walltime=6:10:00
...
to set the maximum amount of real time during which the job can be in the running state (here 6 hours and 10 minutes). Parameters can also be specified as command-line arguments, e.g.
> qsub -l nodes=12
to request 12 nodes. Command line arguments take precedence over parameters set in the script.
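As a rough illustration of the walltime format used above, the following sketch converts a value such as 6:10:00 into seconds. The function name walltime_to_seconds is hypothetical and not part of torque, and the sketch assumes the value has at most three colon-separated fields (hours, minutes, seconds); a plain number is taken as seconds.

```shell
# Hypothetical helper, not part of torque: convert [[hh:]mm:]ss to seconds.
# Assumption: at most three colon-separated fields.
walltime_to_seconds() {
    # awk reads each field as a decimal number, so "08" is treated as 8
    printf '%s\n' "$1" |
        awk -F: '{ s = 0; for (i = 1; i <= NF; i++) s = s * 60 + $i; print s }'
}

walltime_to_seconds 6:10:00   # prints 22200
walltime_to_seconds 100       # prints 100
```

Under these assumptions, a setting such as walltime=100 corresponds to 100 seconds.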
Important options are:
The important resource parameters are:
At run time the following environment variables are set:
qstat
To monitor submitted jobs, the qstat command is used. Although not all job information is shown to a normal user, one can get information such as the job ID, name, queuing status etc. The output can be given in different formats and levels of verbosity.
To get an overview in table form, type qstat without any argument.
Status can be
C  job is completed after having run
E  job is exiting after having run
H  job is held
Q  job is queued, eligible to run or routed
R  job is running
T  job is being moved to new location
W  job is waiting for its execution time (-a option) to be reached
More detailed information can be requested by using the "-f" option:
> qstat -f [job_id]
For more information, see the qstat man page or its online version in the torque documentation.
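The status letters can be translated back into words with a small lookup, which may be handy when post-processing qstat output in a script. This is only a sketch: describe_state is a hypothetical helper, and it covers just the letters named in this section.

```shell
# Hypothetical helper: translate a qstat status letter into a description.
# Only the letters named in this section are covered.
describe_state() {
    case "$1" in
        C) echo "completed" ;;
        E) echo "exiting" ;;
        H) echo "held" ;;
        Q) echo "queued" ;;
        R) echo "running" ;;
        T) echo "being moved" ;;
        W) echo "waiting for its execution time" ;;
        *) echo "unknown status: $1"; return 1 ;;
    esac
}

describe_state R   # prints: running
```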
qdel
The qdel command is used to delete a job, which has to be specified by its job-identifier, that is, type
> qdel <job_id>
to delete the job with the id <job_id>. After submission of the command, a "Delete Job batch request" will be sent to the batch server that owns the job. See the man-page for more information.
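qsub prints the identifier of the new job on standard output, so a script can store it and hand it to qstat or qdel later. The following is a minimal sketch, assuming identifiers of the form <number>.<server>; the identifier 1234.alibaba and the script name job.sh are made up for illustration, and job_number is a hypothetical helper.

```shell
# Hypothetical helper: strip the server part from a job identifier.
# Assumption: identifiers look like <number>.<server>.
job_number() {
    echo "${1%%.*}"
}

# In a real session the identifier would come from qsub, for example:
#   job_id=$(qsub job.sh)
#   qstat -f "$job_id"
#   qdel "$job_id"
job_id=1234.alibaba   # made-up identifier for illustration
job_number "$job_id"  # prints 1234
```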
examples
Simple examples are given for serial and parallel batch jobs.
serial programme
#PBS -N test1
#PBS -j oe
#PBS -o /home/user/test/test1.log
#PBS -l walltime=100
#
set -x
cd /work/user
/home/user/test/a.out
exit
The batch job executes /home/user/test/a.out in the directory /work/user and writes a log file to /home/user/test/test1.log.
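When many similar serial jobs are needed, the script can be generated instead of written by hand. The sketch below prints a job script with the same directives as the serial example; make_serial_job and its argument order are hypothetical, chosen only for this illustration.

```shell
# Hypothetical helper: print a serial job script to standard output.
# Arguments: job name, log file, walltime, working directory, programme.
make_serial_job() {
    cat <<EOF
#PBS -N $1
#PBS -j oe
#PBS -o $2
#PBS -l walltime=$3
cd $4
$5
EOF
}

make_serial_job test1 /home/user/test/test1.log 100 /work/user /home/user/test/a.out
```

The output could be saved to a file and passed to qsub, or piped into qsub directly, since qsub reads the script from standard input when no filename is given.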
parallel: MPI
#PBS -N test2
#PBS -j oe
#PBS -o /home/user/test/test2.log
#PBS -l nodes=3:ppn=8
#PBS -l walltime=100
#
set -x
cd /work/user
/home/user/test/a.out -np 24
exit
The job executes on 3 nodes using all 8 cores of each node. One has to make sure that the "-l nodes=...:ppn=..." and "-np ..." specifications match (here 3 × 8 = 24). In parallel jobs one should always request 8 cores per node (i.e. ppn=8); otherwise one would share nodes with other users, which should be avoided.
A special case is ppn=4 (or smaller). In that case one should also specify the interconnect one wants to use. This can be done by adding "#PBS -q gbe" for gigabit ethernet or "#PBS -q ib" for InfiniBand. (For ppn larger than 4 the system will automatically use the large nodes and InfiniBand.)
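The requirement that the node request and the "-np" value match can be checked before submission. The sketch below computes the process count from a resource string such as nodes=3:ppn=8; procs_from_resource is a hypothetical helper, and it assumes that a missing ppn means one process per node.

```shell
# Hypothetical helper: print nodes * ppn for a resource string such as
# "nodes=3:ppn=8". Assumption: ppn counts as 1 when it is not given.
procs_from_resource() {
    spec=$1
    nodes=$(printf '%s\n' "$spec" | sed -n 's/^nodes=\([0-9][0-9]*\).*/\1/p')
    ppn=$(printf '%s\n' "$spec" | sed -n 's/.*ppn=\([0-9][0-9]*\).*/\1/p')
    # Give up when the string does not start with nodes=<number>
    [ -n "$nodes" ] || return 1
    echo $(( nodes * ${ppn:-1} ))
}

procs_from_resource nodes=3:ppn=8   # prints 24, the value to pass as -np
```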
parallel: OpenMP
#PBS -N test3
#PBS -j oe
#PBS -o /home/user/test/test3.log
#PBS -l nodes=1:ppn=T
#PBS -l walltime=100
#
set -x
cd /work/user
export OMP_NUM_THREADS=T
/home/user/test/a.out
exit
Here T stands for the number of threads. In parallel jobs one should always request T = 8 cores per node (i.e. ppn=8); otherwise one would share nodes with other users, which should be avoided. As a consequence, OMP_NUM_THREADS should be set to 8.
An alternative is to request 4 threads. In that case one should explicitly request small nodes by adding "#PBS -q gbe".
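To avoid keeping the ppn value and OMP_NUM_THREADS in sync by hand, the thread count can be derived inside the job from the node file that torque provides through the PBS_NODEFILE environment variable, which lists one line per allocated core. This is only a sketch: threads_from_nodefile is a hypothetical helper, and the fallback to a single thread outside a batch job is an assumption made here.

```shell
# Hypothetical helper: count the allocated cores listed in a node file
# (one line per core). Falls back to 1 when the file is not readable,
# for example when the script is run outside a batch job.
threads_from_nodefile() {
    if [ -r "$1" ]; then
        grep -c . "$1"   # number of non-empty lines
    else
        echo 1
    fi
}

OMP_NUM_THREADS=$(threads_from_nodefile "${PBS_NODEFILE:-}")
export OMP_NUM_THREADS
echo "using $OMP_NUM_THREADS thread(s)"
```

With nodes=1:ppn=8 the node file would hold 8 lines, so the job would run with 8 threads.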