mpi4py in Practice
Source: Internet  Editor: 程序博客网  Time: 2024/06/07 23:38
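1. Overview
MPI (Message Passing Interface) is a standardized, portable message-passing system that runs on a wide variety of parallel computers. Message passing means that each of the processes executing in parallel has its own stack and code segment and runs as an independent program; all information exchange between processes happens through explicit calls to communication functions.
mpi4py is a third-party Python library built on top of MPI that lets Python data be passed between processes.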
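2. MPI execution model
A parallel program is a set of independent, identical processes:
- all processes contain the same code;
- processes may run on different nodes or on different computers.
With Python, n Python interpreters are launched, for example:
mpirun -np 32 python parallel_script.py
[Figure: parallel execution model]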
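2.1 Basic MPI concepts
rank: the id assigned to each process;
- a process can query its own rank;
- processes can perform different tasks depending on their rank.
Communicator: a group of processes;
- the basic object in mpi4py, through which the communication methods are called;
- MPI_COMM_WORLD contains all processes (in mpi4py it is MPI.COMM_WORLD).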
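The idea that a process chooses its work from its rank can be sketched as follows. This is only an illustration: the helper indices_for_rank and the item count of 10 are hypothetical names and values, not part of mpi4py, and the ImportError fallback is an assumption added so the sketch also runs as a single process where mpi4py is not installed.

```python
def indices_for_rank(rank, size, n_items):
    """Return the item indices handled by process `rank` out of `size` processes.

    Hypothetical helper: items are dealt out round-robin, so rank r gets
    r, r + size, r + 2 * size, ...
    """
    return list(range(rank, n_items, size))


try:
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
except ImportError:
    # Assumption: fall back to a single "process" when mpi4py is unavailable.
    rank, size = 0, 1

my_indices = indices_for_rank(rank, size, 10)
print("rank %d of %d handles items %s" % (rank, size, my_indices))
```

Launched with mpirun, every process runs the same script but computes a different my_indices, because only the value of rank differs between them.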
2.2 Data model
All variables and data structures are local to each process;
processes exchange data by sending and receiving messages.
2.3 Using mpi4py
    from mpi4py import MPI

    comm = MPI.COMM_WORLD   # communicator object containing all processes
    size = comm.Get_size()  # total number of processes
    rank = comm.Get_rank()  # id of this process
    print("rank = %d, size = %d" % (rank, size))
2.4 Installing mpi4py
Setting up an MPI Python environment
Setting up an MPI cluster environment on Windows
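3. Communication modes
There are two main modes: point-to-point and collective communication. Point-to-point communication is one-to-one; collective communication involves all processes in a communicator (one-to-many or many-to-one).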
3.1 Point-to-point
example 1
Point-to-point transfer of a built-in Python dict object:
    # Point-to-point communication of a Python dict
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    if rank == 0:
        data = {"a": 7, "b": 3.14}
        comm.send(data, dest=1, tag=11)
        print("send data =", data)
    elif rank == 1:
        data = comm.recv(source=0, tag=11)
        print("recv data =", data)
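Any picklable Python object can be communicated with send and recv; the destination rank, the source rank, and the tag must match between the two calls.
send(data,dest,tag)
- data: the Python object to send;
- dest: the destination rank;
- tag: the id of the message.
recv(source,tag)
- source: the source rank;
- tag: the id of the message.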
example 2
Point-to-point transfer of a built-in Python dict object with non-blocking communication:
    # Point-to-point communication of Python objects, non-blocking
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    if rank == 0:
        data = {"a": 7, "b": 3.14}
        req = comm.isend(data, dest=1, tag=11)
        req.wait()  # block until the send has completed
        print("send data =", data)
    elif rank == 1:
        req = comm.irecv(source=0, tag=11)
        data = req.wait()  # wait() returns the received object
        print("recv data =", data)
example 3
Sending a NumPy array:
    # Point-to-point communication of NumPy arrays
    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    # Automatic MPI datatype discovery.
    # np.int was removed in NumPy 1.24, so an explicit dtype is used here.
    if rank == 0:
        data = np.arange(100, dtype=np.int64)
        comm.Send(data, dest=1, tag=13)
        print("send data =", data)
    elif rank == 1:
        data = np.empty(100, dtype=np.int64)
        comm.Recv(data, source=0, tag=13)
        print("recv data =", data)
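When a message is sent with the lowercase methods, the Python object is converted (pickled) into a byte stream;
when the message is received, the byte stream is converted back into a Python object.
Send(data,dest,tag), Recv(data,source,tag): contiguous arrays (buffers), fast;
send(data,dest,tag), recv(source,tag): general Python objects, slower.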
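The conversion to and from a byte stream described above can be illustrated with the standard pickle module, which is what mpi4py's lowercase methods rely on. This is a minimal single-process sketch of the round trip only; no message is actually sent, and the variable names payload and restored are illustrative.

```python
import pickle

data = {"a": 7, "b": 3.14}

# Sender side: the object is serialized into a byte stream.
payload = pickle.dumps(data)

# Receiver side: the byte stream is rebuilt into a new, equal object.
restored = pickle.loads(payload)

print(type(payload).__name__, restored == data, restored is data)
```

The receiver ends up with an equal copy rather than the same object, which matches the data model in section 2.2: every process only ever holds its own local data. This serialization step is also why the lowercase methods are slower than Send and Recv on contiguous arrays.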
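3.2 Collective communication
Collective communication covers both distribution and collection: distribution sends data to all processes in one call, and collection retrieves the results from all processes in one call.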
example 1
The root process creates a data dict and broadcasts it to all processes, so that every process ends up with its own copy of the dict:
    # Broadcasting a Python dict
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    if rank == 0:
        data = {"key1": [7, 2.72, 2 + 3j], "key2": ("abc", "xyz")}
    else:
        data = None
    data = comm.bcast(data, root=0)
    print("rank =", rank, " data =", data)
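example 2
The root process creates a list and scatters it to all processes. This partitions the list: each process receives an equal share, here one number from the list (the split follows the list index, so element i is sent to process i). For a matrix, the rows are split evenly so that each process receives the same number of rows to work on.
In MPI every process executes all of the code, so every process calls scatter. Only the root process acts as both sender and receiver (it also receives its own share of the data); all other processes are receivers only.
    # Scattering Python objects
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    size = comm.Get_size()
    rank = comm.Get_rank()

    if rank == 0:
        data = [(i + 1) ** 2 for i in range(size)]  # one element per process
    else:
        data = None
    data = comm.scatter(data, root=0)
    assert data == (rank + 1) ** 2
    print("rank =", rank, " data =", data)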
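comm.scatter expects the root to pass a sequence with exactly one element per process, so a longer list (or the rows of a matrix) has to be grouped into chunks first. The following is a minimal sketch of such a partition in plain Python; split_evenly is a hypothetical helper name, not an mpi4py function, and it lets chunk lengths differ by at most one when the list does not divide evenly.

```python
def split_evenly(items, size):
    """Split `items` into `size` contiguous chunks whose lengths differ by at most 1."""
    base, extra = divmod(len(items), size)
    chunks = []
    start = 0
    for i in range(size):
        # The first `extra` chunks take one additional element.
        stop = start + base + (1 if i < extra else 0)
        chunks.append(items[start:stop])
        start = stop
    return chunks


chunks = split_evenly(list(range(10)), 3)
print(chunks)
```

On the root process the result could then be passed as comm.scatter(chunks, root=0), so that process i receives chunks[i]; the other processes would pass None, as in the example above.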
example 3
gather collects the data from all processes and combines it into a list on the root process:
    # Gathering Python objects
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    size = comm.Get_size()
    rank = comm.Get_rank()

    data = (rank + 1) ** 2
    data = comm.gather(data, root=0)
    if rank == 0:
        for i in range(size):
            assert data[i] == (i + 1) ** 2
        print("data =", data)
    else:
        assert data is None  # only the root receives the gathered list
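A common use of gather is to combine partial results computed by each process. The sketch below sums the squares of 1 to 100: each process works on its own share and the root adds up the gathered partial sums. The function partial_sum_of_squares and the round-robin split are illustrative choices, and the ImportError fallback is an assumption added so the sketch also runs as a single process without mpi4py.

```python
try:
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
except ImportError:
    # Assumption: behave like a single process when mpi4py is unavailable.
    comm, rank, size = None, 0, 1


def partial_sum_of_squares(numbers):
    """Sum of squares of this process's share of the numbers."""
    return sum(x * x for x in numbers)


numbers = list(range(1, 101))
my_numbers = numbers[rank::size]  # each rank selects its share locally
local = partial_sum_of_squares(my_numbers)

if comm is not None:
    partials = comm.gather(local, root=0)  # list on root, None elsewhere
else:
    partials = [local]

if rank == 0:
    total = sum(partials)
    print("sum of squares =", total)
```

Because every process can build the full list itself, no scatter is needed here; only the small partial results travel over the network.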
example 4
Broadcasting a NumPy array:
    # Broadcasting a NumPy array
    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    if rank == 0:
        data = np.arange(100, dtype='i')
    else:
        data = np.empty(100, dtype='i')  # receive buffer, filled by Bcast
    comm.Bcast(data, root=0)
    for i in range(100):
        assert data[i] == i
    print("rank =", rank, " data =", data)
example 5
Scattering a NumPy array:
    # Scattering NumPy arrays
    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD
    size = comm.Get_size()
    rank = comm.Get_rank()

    sendbuf = None
    if rank == 0:
        sendbuf = np.empty([size, 100], dtype='i')
        sendbuf.T[:, :] = range(size)  # row i is filled with the value i
    recvbuf = np.empty(100, dtype='i')
    comm.Scatter(sendbuf, recvbuf, root=0)
    assert np.allclose(recvbuf, rank)
    print("rank =", rank, " recvbuf =", recvbuf)
example 6
Gathering NumPy arrays:
    # Gathering NumPy arrays
    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD
    size = comm.Get_size()
    rank = comm.Get_rank()

    sendbuf = np.zeros(100, dtype='i') + rank
    recvbuf = None
    if rank == 0:
        recvbuf = np.empty([size, 100], dtype='i')  # one row per process
    comm.Gather(sendbuf, recvbuf, root=0)
    if rank == 0:
        for i in range(size):
            assert np.allclose(recvbuf[i, :], i)
4. References
mpi4py tutorial
Python多核编程mpi4py实践 (Multi-core programming in Python: mpi4py in practice)