mpirun failed to start when TMPDIR=. is set


linfa's picture

Hi,

We found that when the environment variable TMPDIR is set to the current directory, no matter whether it is '.', './', or the full path, Intel MPI fails to run.

This happens on all Intel MPI versions we tried (including 4.0).

[linfa@babbage testrun]$ setenv TMPDIR .
[linfa@babbage testrun]$ {/opt/intel/impi/3.2.0.011/bin64/mpirun} -n 2
mpdboot_babbage.tx.altair.com (handle_mpd_output 730): Failed to establish a socket connection with babbage:54848 : (111, 'Connection refused')
mpdboot_babbage.tx.altair.com (handle_mpd_output 747): failed to connect to mpd on babbage


Is this a bug? Is there a workaround? Thanks.

Gergana Slavova (Intel)'s picture

Hey linfa,

I would actually recommend upgrading to our newest version: Intel MPI Library 4.0 Update 3. You can grab it from the Intel Registration Center. While I was able to reproduce this with the 4.0 release, I don't see this issue with the 4.0.3 release:

[user@host1:~]> export TMPDIR=.
[user@node1:~]> /opt/intel/impi/4.0.0.025/bin64/mpirun -n 2 hostname
mpdboot_node1 (handle_mpd_output 949): Failed to establish a socket connection with node1:33751 : (111, 'Connection refused')
mpdboot_node1 (handle_mpd_output 969): failed to connect to mpd on node1
[user@node1:~]> /opt/intel/impi/4.0.3/bin64/mpirun -n 2 hostname
node1
node1

We have a new default process manager in 4.0.3. I don't believe we supported the shorthand symbols with our old PM.

Give this a try and let us know how it goes.

Regards,
~Gergana

Gergana Slavova
Technical Consulting Engineer
Intel® Cluster Tools
E-mail: gergana.s.slavova_at_intel.com
linfa's picture

Hi Gergana,

Thanks for your quick reply. I have several questions:

1) What is the "default process manager"? How is it related to this issue? Could you explain a little more?

2) What should I update: the SDK for building executables, or the run-time library only?

3) It is not a problem for me to update, but it is more difficult to ask our customers to do it. So I am wondering if there is a workaround.

Thanks.
Linfa

Gergana Slavova (Intel)'s picture

Hi Linfa,

1) A process manager is the part of the library that launches the MPI ranks, interacts with the job or batch schedulers, makes the physical connections between the nodes (e.g. via ssh), etc. It also parses your command and any environment variables you're using (like TMPDIR) to start that job.
In older versions of our library, we used the Multi-Purpose Daemons (MPDs) as the process manager. In the 4.0.3 version and later, we use the Hydra process manager. Hydra has some advantages over the MPDs - as you can see here.

2) I recommend updating the full SDK - if you have a valid license, that would be free and easy to do. But, if not possible, all of our 4.0.x packages are compatible with each other. So you can simply update the runtimes and be ok.

3) The only workaround would be to use the full path:

[user@node1:~]> export TMPDIR=/home/user
[user@node1:~]> /opt/intel/impi/4.0.0.025/bin64/mpirun -n 2 hostname
node1
node1
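
As a rough sketch of that workaround (not something from the Intel documentation), a wrapper script could turn a relative TMPDIR into an absolute path before calling mpirun. The helper name abs_tmpdir is hypothetical, and the sketch assumes the MPD-based mpirun accepts an absolute TMPDIR as in the transcript above. Note that linfa reported a failure even with the full path of the current directory, so this may not help in every setup.

```shell
#!/bin/sh
# Hypothetical helper: print the absolute form of the directory in $1.
# The subshell keeps the caller's working directory unchanged.
abs_tmpdir() {
    (cd "$1" 2>/dev/null && pwd)
}

# If TMPDIR is set and points at an existing directory, replace it with
# its absolute path; otherwise leave it untouched.
if [ -n "${TMPDIR:-}" ]; then
    resolved=$(abs_tmpdir "$TMPDIR") && TMPDIR=$resolved
    export TMPDIR
fi

# The launch itself would follow here, e.g. (path taken from the thread):
# /opt/intel/impi/4.0.0.025/bin64/mpirun -n 2 hostname
```

This is written for a POSIX shell; under csh (as in the original post, which uses setenv) the same idea would need csh syntax.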

Is your customer just running your application? If yes, they can simply update the runtimes. Those are available as a free download from our website: www.intel.com/go/mpi.

Does that sound reasonable?

Regards,
~Gergana

linfa's picture

Thanks. That's what I want


http://software.intel.com/en-us/forums/topic/277274
