SLURM and OpenMPI

Source: Internet  Editor: 程序博客网  Time: 2024/05/23 11:53

1) The MpiDefault configuration parameter in slurm.conf establishes the system default MPI to be supported. 

The srun option --mpi= (or the equivalent environment variable SLURM_MPI_TYPE) can be used to specify a different MPI implementation for an individual job.
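
As a rough illustration of the two selection mechanisms just described, the sketch below only prints the srun command lines instead of running them, so it also works on a machine without SLURM. The plugin name pmi2, the task count -n4 and the program ./a.out are placeholder assumptions; `srun --mpi=list` reports the MPI types a given installation actually supports.

```shell
#!/bin/sh
# Sketch only: builds (does not execute) the srun command for a job.
# "pmi2", "-n4" and "./a.out" are placeholder values, not from the article.

# Form 1: pass the MPI type on the command line.
cmd_with_option() {
    # $1 = MPI type, $2 = program to launch
    echo "srun --mpi=$1 -n4 $2"
}

# Form 2: set SLURM_MPI_TYPE for this one command;
# srun consults it when --mpi is not given.
cmd_with_env() {
    echo "SLURM_MPI_TYPE=$1 srun -n4 $2"
}

cmd_with_option pmi2 ./a.out
cmd_with_env pmi2 ./a.out
```

Either form overrides the MpiDefault value from slurm.conf for that job only.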



2) SLURM creates a resource allocation for the job and then mpirun launches tasks using SLURM's infrastructure (OpenMPI, LAM/MPI and HP-MPI).


3) The current versions of SLURM and Open MPI support task launch using the srun command.

This mode relies upon SLURM version 2.0 (or higher) managing reservations of communication ports for use by Open MPI version 1.5 (or higher). The system administrator must specify the range of ports to be reserved in the slurm.conf file using the MpiParams parameter. For example:

MpiParams=ports=12000-12999


Launch tasks using the srun command plus the option --resv-ports. The ports reserved on every allocated node will be identified in an environment variable available to the tasks as shown here: 
SLURM_STEP_RESV_PORTS=12000-12015
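
A task can inspect this variable to see how many ports it was given. The following is a minimal sketch that assumes the value is a single range of the form FIRST-LAST, as in the example above; the helper name port_count is hypothetical, and the fallback value exists only so the sketch runs outside a job step.

```shell
#!/bin/sh
# Sketch: count the ports in a reserved range of the form FIRST-LAST.
# Assumes one contiguous range, e.g. 12000-12015.

port_count() {
    first="${1%-*}"   # text before the last '-'
    last="${1#*-}"    # text after the first '-'
    echo $((last - first + 1))
}

# Inside a job step srun sets SLURM_STEP_RESV_PORTS;
# the default here is only a stand-in for testing.
port_count "${SLURM_STEP_RESV_PORTS:-12000-12015}"
```

For the example range 12000-12015 the function reports 16 ports.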


If the ports reserved for a job step are found by the Open MPI library to be in use, a message of this form will be printed and the job step will be re-launched:
srun: error: sun000: task 0 unable to claim reserved port, retrying
After three failed attempts, the job step will be aborted. Repeated failures should be reported to your system administrator in order to rectify the problem by cancelling the processes holding those ports.


Note: Older releases


Older versions of Open MPI and SLURM rely upon SLURM to allocate resources for the job and then mpirun to initiate the tasks. For example:


$ salloc -n4 sh    # allocates 4 processors and spawns shell for job
> mpirun a.out
> exit          # exits shell spawned by initial salloc command