Open MPI cross-node error: tcp_peer_send_blocking: send() to socket 9 failed: Broken pipe (32)
Source: Internet  Editor: 程序博客网  Date: 2024/06/11 07:13
A customer reported that jobs could not run across nodes. The test command was:
mpirun -np 8 -hostfile hostfilt.txt sleep 5
Running it produced the following error:
[test02:01719] [[24772,0],1] tcp_peer_send_blocking: send() to socket 9 failed: Broken pipe (32)
--------------------------------------------------------------------------
ORTE was unable to reliably start one or more daemons.
This usually is caused by:
* not finding the required libraries and/or binaries on one or more nodes. Please check your PATH and LD_LIBRARY_PATH settings, or configure OMPI with --enable-orterun-prefix-by-default
* lack of authority to execute on one or more specified nodes. Please verify your allocation and authorities.
* the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base). Please check with your sys admin to determine the correct location to use.
* compilation of the orted with dynamic libraries when static are required (e.g., on Cray). Please check your configure cmd line and consider using one of the contrib/platform definitions for your system type.
* an inability to create a connection back to mpirun due to a lack of common network interfaces and/or no route found between them. Please check network connectivity (including firewalls and network routing requirements).
--------------------------------------------------------------------------
A solution found online:
I had this same problem on Cygwin with OpenMPI 1.10.4.
Try adding "-report-uri -" to your mpirun command to see what IP address it's trying to use for connection:
mpirun -report-uri - -np 2 a.exe
It should print out a line that looks something like this:
568328192.0;tcp://192.168.10.103,169.254.247.11,0.0.0.0,0.0.0.0,0.0.0.0:55600
If the first IP address after the "tcp://" is not a current valid address for your machine, that's the problem and things are likely to break (even if the correct IP appears later in the list). Apparently ORTE is not smart enough to order the interfaces based on what is actually enabled and online.
If the wrong IP corresponds to an old/disabled interface, uninstall it (if possible).
Reference: https://stackoverflow.com/questions/34032655/cygwin-error-tcp-peer-send-blocking-send-to-socket
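The check described in the quoted answer can be scripted. The following is a minimal Python sketch, not part of the original answer. It assumes the -report-uri line has the shape shown in the example above (an identifier, a semicolon, then tcp:// followed by comma-separated IPv4 addresses and a port). The function names and the local_addresses argument are hypothetical; the caller is expected to supply the addresses that are actually configured on the machine (for example, copied from the output of ip addr or ipconfig).

```python
import ipaddress

# Example line copied from the -report-uri output shown above.
EXAMPLE_LINE = "568328192.0;tcp://192.168.10.103,169.254.247.11,0.0.0.0,0.0.0.0,0.0.0.0:55600"


def parse_report_uri(line):
    """Split a -report-uri line into (list of IPv4 address strings, port).

    Assumes the format seen in the example: "<id>;tcp://a,b,c:<port>".
    """
    _, _, uri = line.strip().partition(";")
    if not uri.startswith("tcp://"):
        raise ValueError("expected a tcp:// URI after the ';'")
    fields = uri[len("tcp://"):].split(":")
    if len(fields) < 2:
        raise ValueError("no port found in URI")
    addresses = fields[0].split(",")
    port = int(fields[1])
    return addresses, port


def is_suspicious(address):
    """True for addresses that cannot be a usable cross-node endpoint:
    unspecified (0.0.0.0), link-local (169.254.x.x), or loopback."""
    ip = ipaddress.ip_address(address)
    return ip.is_unspecified or ip.is_link_local or ip.is_loopback


def check_first_address(line, local_addresses):
    """Return (first_address, ok). ok is True only if the first advertised
    address is in local_addresses (supplied by the caller) and is not
    suspicious."""
    addresses, _ = parse_report_uri(line)
    first = addresses[0]
    ok = first in local_addresses and not is_suspicious(first)
    return first, ok


if __name__ == "__main__":
    # Hypothetical set of addresses currently configured on this machine.
    configured = {"10.0.0.5"}
    first, ok = check_first_address(EXAMPLE_LINE, configured)
    status = "looks valid" if ok else "is NOT a current address of this machine"
    print(f"first advertised address {first} {status}")
```

If the check fails, the first advertised address is the one to investigate, as the answer above suggests.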