Getting started with LAM
Reposted from: http://www.lam-mpi.org/tutorials/one-step/lam.php
LAM is a simple yet powerful environment for running and monitoring MPI applications on clusters. The few essential steps in LAM operations are covered below.
Booting LAM
The user creates a file listing the participating machines in the cluster.
shell$ cat lamhosts
# a 2-node LAM
node1.cluster.example.com
node2.cluster.example.com
Each machine will be given a node identifier (nodeid) starting with 0 for the first listed machine, 1 for the second, etc.
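The numbering rule above can be previewed before booting. The following is a minimal sketch, not part of LAM: list_nodeids is a hypothetical helper that reads a boot schema on standard input and assumes the hostname is the first field of each non-comment line.

```shell
# Hypothetical helper (not a LAM tool): print "n<id> <hostname>" for each
# host in a boot schema read from standard input.
list_nodeids() {
    awk '
        { sub(/#.*/, "") }                       # strip comments
        NF > 0 { printf "n%d %s\n", id++, $1 }   # number remaining hosts from 0
    '
}
```

Running `list_nodeids < lamhosts` on the two-node file shown earlier should list node1.cluster.example.com as n0 and node2.cluster.example.com as n1.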
The recon tool verifies that the cluster is bootable:
shell$ recon -v lamhosts
recon: -- testing n0 (node1.cluster.example.com)
recon: -- testing n1 (node2.cluster.example.com)
The lamboot tool actually starts LAM on the specified cluster.
shell$ lamboot -v lamhosts
LAM 7.1.4 - Indiana University
Executing hboot on n0 (node1.cluster.example.com - 1 CPU)...
Executing hboot on n1 (node2.cluster.example.com - 1 CPU)...
lamboot returns to the UNIX shell prompt. LAM does not force a canned environment or a "LAM shell". The tping command offers a quick check that the cluster and LAM are running.
shell$ tping -c1 N
  1 byte from 1 remote node and 1 local node: 0.008 secs
1 message, 1 byte (0.001K), 0.008 secs (0.246K/sec)
roundtrip min/avg/max: 0.008/0.008/0.008
Compiling MPI Programs
Refer to "MPI: It's Easy to Get Started" to see a simple MPI program. mpicc, mpiCC, and mpif77 are wrappers for the C, C++, and Fortran 77 compilers, respectively. Each wrapper passes the underlying compiler all the command-line switches needed to find the LAM include files, the relevant LAM libraries, etc.
shell$ mpicc -o foo foo.c
shell$ mpif77 -o foo foo.f
Executing MPI Programs
An MPI application is started by one invocation of the mpirun command. An SPMD (single program, multiple data) application can be started directly on the mpirun command line.
shell$ mpirun -v -np 2 foo
2445 foo running on n0 (o)
361 foo running on n1
An application with multiple programs must be described in an application schema, a file that lists each program and its target node(s).
shell$ cat appfile
# 1 master, 2 slaves
n0 master
n0-1 slave
shell$ mpirun -v appfile
3292 master running on n0 (o)
3296 slave running on n0 (o)
412 slave running on n1
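For clusters of other sizes, a schema with the same shape as the one above can be generated rather than typed by hand. This is only a sketch: make_appfile is a hypothetical helper, the program names master and slave are taken from the example, and it assumes the highest node identifier is at least 1.

```shell
# Hypothetical helper (not a LAM tool): print an application schema with one
# master on n0 and one slave on each of the nodes n0 through n<last>.
# Usage: make_appfile LAST_NODEID
make_appfile() {
    last=${1:-0}
    # Reject missing, non-numeric, or too-small arguments.
    [ "$last" -ge 1 ] 2>/dev/null || return 1
    printf '# 1 master, %d slaves\n' "$((last + 1))"
    printf 'n0 master\n'
    printf 'n0-%d slave\n' "$last"
}
```

With the two-node cluster from this tutorial, `make_appfile 1 > appfile` should reproduce the schema shown above, ready to pass to mpirun.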
Monitoring MPI Applications
The full MPI synchronization status of all processes and messages can be displayed at any time. This includes the source and destination ranks, the message tag, count and datatype, the communicator, and the function invoked.
shell$ mpitask
TASK (G/L)    FUNCTION    PEER|ROOT  TAG  COMM   COUNT  DATATYPE
0/0 master    Recv        ANY        ANY  WORLD  1      INT
1 slave       <running>
2 slave       <running>
Process rank 0 is blocked receiving a message consisting of a single integer from any source rank and any message tag, using the MPI_COMM_WORLD communicator. The other processes are running.
shell$ mpimsg
SRC (G/L)  DEST (G/L)  TAG  COMM   COUNT  DATATYPE  MSG
0/0        1/1         7    WORLD  4      INT       n0,#0
Later, we see that a message sent by process rank 0 to process rank 1 is buffered and waiting to be received. It was sent with tag 7 using the MPI_COMM_WORLD communicator and contains 4 integers.
Cleaning LAM
All user processes and messages can be removed, without rebooting.
shell$ lamclean -v
killing processes, done
sweeping messages, done
closing files, done
sweeping traces, done
It is typical for users to mpirun a program, lamclean when it finishes, and then mpirun another program. It is not necessary to run lamboot again before each MPI program.
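The boot-once, run-many pattern can be wrapped in a small script. The sketch below uses only the commands introduced in this tutorial; the function names run and lam_session and the DRY_RUN variable are hypothetical conveniences for this example, not LAM features.

```shell
# Run a command, or only print it when DRY_RUN is set.
# DRY_RUN is a hypothetical variable for this sketch, not a LAM setting.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Boot LAM once, run each program with a lamclean after it, then halt.
# Usage: lam_session BOOT_SCHEMA NP PROGRAM...
lam_session() {
    schema=$1
    np=$2
    shift 2
    run lamboot "$schema" || return 1   # give up if the cluster does not boot
    for prog in "$@"; do
        run mpirun -np "$np" "$prog"    # a failed run does not stop the session
        run lamclean                    # clear leftover processes and messages
    done
    run lamhalt
}
```

Setting DRY_RUN=1 before calling `lam_session lamhosts 2 foo bar` prints the command sequence without touching the cluster, which is a cheap way to review it first.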
Terminating LAM
The lamhalt tool removes all traces of the LAM session on the network. This is only performed when LAM/MPI is no longer needed (i.e., no more mpirun/lamclean commands will be issued).
shell$ lamhalt
In the case of a catastrophic failure (e.g., one or more LAM nodes crash), the lamhalt utility will hang. In this case, the wipe tool is necessary. The same boot schema that was used with lamboot is necessary to list each node where the LAM run-time environment is running:
shell$ wipe -v lamhosts
Executing tkill on n0 (node1.cluster.example.com)...
Executing tkill on n1 (node2.cluster.example.com)...
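Because lamhalt can hang after a node crash, a shutdown script may want to bound how long it waits before falling back to wipe. The following is a sketch under assumptions: shutdown_lam is a hypothetical helper, the 30-second default is arbitrary, and it relies on the timeout utility from GNU coreutils, which is not part of LAM and may be missing on some systems.

```shell
# Hypothetical helper: try lamhalt first; if it fails or does not return
# within the time limit, fall back to wipe with the original boot schema.
# Usage: shutdown_lam BOOT_SCHEMA [SECONDS]
shutdown_lam() {
    schema=$1
    limit=${2:-30}                      # arbitrary default, adjust as needed
    if timeout "$limit" lamhalt; then   # timeout comes from GNU coreutils
        return 0
    fi
    echo "lamhalt failed or timed out; running wipe" >&2
    wipe -v "$schema"
}
```

A call such as `shutdown_lam lamhosts 60` would wait up to a minute for lamhalt before running wipe on the same boot schema that was given to lamboot.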