Bind: Address Already in Use

Or How to Avoid This Error When Closing TCP Connections

Normal Closure

In order for a network connection to close, both ends have to send FIN (final) packets, which indicate they will not send any additional data, and both ends must ACK (acknowledge) each other's FIN packets. The FIN packets are initiated by the application performing a close(), a shutdown(), or an exit(). The ACKs are handled by the kernel after the close() has completed. Because of this, it is possible for the process to complete before the kernel has released the associated network resource, and this port cannot be bound to another process until the kernel has decided that it is done.
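
The sequence above can be sketched in C. This is only an illustration, assuming a connected stream socket; the helper name orderly_close is hypothetical and does not come from the original text. It sends the local FIN with shutdown(), waits for the peer's FIN (seen as end-of-file), and only then releases the descriptor.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: send our FIN, wait for the peer's, then release fd.
 * Returns 0 if the peer's end-of-file was seen, -1 on error. */
int orderly_close(int fd)
{
    char buf[512];
    ssize_t n;

    /* shutdown(SHUT_WR) tells the kernel we will send no more data,
     * which queues our FIN. Reading is still allowed afterwards. */
    if (shutdown(fd, SHUT_WR) < 0) {
        close(fd);
        return -1;
    }

    /* Discard incoming data until read() returns 0, meaning the peer has
     * closed its side too. This blocks for as long as the peer stays open,
     * so real code would normally add a timeout. */
    do {
        n = read(fd, buf, sizeof(buf));
    } while (n > 0 || (n < 0 && errno == EINTR));

    close(fd);
    return n == 0 ? 0 : -1;
}
```

Even after orderly_close() returns, the kernel may keep the connection's port tied up for a while, which is the subject of the rest of this article.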

 

Figure 1: TCP State Diagram

Figure 1 shows all of the possible states that can occur during a normal closure, depending on the order in which things happen. Note that if you initiate closure, there is a TIME_WAIT state that is absent from the other side. This TIME_WAIT is necessary in case the ACK you sent wasn't received, or in case spurious packets show up for other reasons. I'm really not sure why this state isn't necessary on the other side, when the remote end initiates closure, but this is definitely the case. TIME_WAIT is the state that typically ties up the port for several minutes after the process has completed. The length of the associated timeout varies on different operating systems, and may be dynamic on some operating systems; however, typical values are in the range of one to four minutes.

If both ends send a FIN before either end receives it, both ends will have to go through TIME_WAIT.

Normal Closure of Listen Sockets

A socket which is listening for connections can be closed immediately if there are no connections pending, and the state proceeds directly to CLOSED. If connections are pending, however, FIN_WAIT_1 is entered, and a TIME_WAIT is inevitable.

Note that it is impossible to completely guarantee a clean closure here. While you can check the connections using a select() call before closure, a tiny but real possibility exists that a connection could arrive after the select() but before the close().
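
A minimal sketch of that select() check follows. The function names has_pending_connection and close_listener_if_idle are hypothetical, chosen for this illustration. As noted above, the check narrows the window but cannot remove the race.

```c
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if listen_fd has a connection waiting to
 * be accepted, 0 if none showed up within timeout_ms, -1 on error. */
int has_pending_connection(int listen_fd, int timeout_ms)
{
    fd_set readable;
    struct timeval tv;
    int rc;

    FD_ZERO(&readable);
    FD_SET(listen_fd, &readable);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* A listening socket selects as readable when accept() would succeed. */
    rc = select(listen_fd + 1, &readable, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return rc > 0 ? 1 : 0;
}

/* Hypothetical helper: closes the listener only if nothing is pending.
 * Returns 1 if it was closed, 0 if it was left open. A connection can
 * still arrive between the check and the close(). */
int close_listener_if_idle(int listen_fd)
{
    if (has_pending_connection(listen_fd, 0) != 0)
        return 0;
    close(listen_fd);
    return 1;
}
```

When close_listener_if_idle() returns 0, the caller would typically accept() and deal with the pending connection before trying again.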

Abnormal Closure

If the remote application dies unexpectedly while the connection is established, the local end will have to initiate closure. In this case TIME_WAIT is unavoidable. If the remote end disappears due to a network failure, or the remote machine reboots (both are rare), the local port will be tied up until each state times out. Worse, some older operating systems do not implement a timeout for FIN_WAIT_2, and it is possible to get stuck there forever, in which case restarting your server could require a reboot.

If the local application dies while a connection is active, the port will be tied up in TIME_WAIT. This is also true if the application dies while a connection is pending.

Strategies for Avoidance

SO_REUSEADDR

You can use setsockopt() to set the SO_REUSEADDR socket option, which explicitly allows a process to bind to a port which remains in TIME_WAIT (it still only allows a single process to be bound to that port). This is both the simplest and the most effective option for reducing the "address already in use" error.
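int ctrl_flag = 1;  /* nonzero enables the option */
/* Call this before bind(); socketFD is the socket about to be bound. */
setsockopt(socketFD, SOL_SOCKET, SO_REUSEADDR, &ctrl_flag, sizeof(ctrl_flag));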
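
For context, here is a sketch of where that call sits when a server sets up its listening socket. The function name listen_reusable and the backlog of 16 are assumptions made for this illustration, not part of the original text.

```c
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns a listening TCP socket on the given port
 * (all local addresses), or -1 on failure with errno set. */
int listen_reusable(unsigned short port)
{
    struct sockaddr_in addr;
    int ctrl_flag = 1;
    int saved_errno;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    /* The option has to be set before bind() to influence the bind. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR,
                   &ctrl_flag, sizeof(ctrl_flag)) < 0)
        goto fail;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 16) < 0)
        goto fail;
    return fd;

fail:
    saved_errno = errno;   /* close() may overwrite errno */
    close(fd);
    errno = saved_errno;
    return -1;
}
```

A server would call something like listen_reusable(8080) at startup and report errno if it returns -1.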

Oddly, using SO_REUSEADDR can actually lead to more difficult "address already in use" errors. SO_REUSEADDR permits you to use a port that is stuck in TIME_WAIT, but you still cannot use that port to establish a connection to the last place it connected to. What? Suppose I pick local port 1010, and connect to foobar.com port 300, and then close locally, leaving that port in TIME_WAIT. I can reuse local port 1010 right away to connect to anywhere except for foobar.com port 300.

A situation where this might be a problem is if my program is trying to find a reserved local port (< 1024) to connect to some service which likes reserved ports. If I used SO_REUSEADDR, then each time I run the program on my machine, I'll keep getting the same local reserved port, even if it is stuck in TIME_WAIT, and I risk getting a "connect: Address already in use" error if I go back to any place I've been to in the last few minutes. The solution here is to avoid SO_REUSEADDR.
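
A port search of this kind, done without SO_REUSEADDR, might look like the sketch below. The helper name bind_first_free_port and the top-down search order are assumptions made for illustration. Because the option is left off, bind() refuses a port that is still tied up and the loop moves on to the next one instead of handing back the same port every time.

```c
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: binds fd to the highest free port in [lo, hi].
 * SO_REUSEADDR is deliberately NOT set on fd.
 * Returns the port, or -1 with errno set if none could be bound. */
int bind_first_free_port(int fd, int lo, int hi)
{
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);

    for (int port = hi; port >= lo; port--) {
        addr.sin_port = htons((unsigned short)port);
        if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0)
            return port;
        /* EADDRINUSE: the port is taken or still tied up; try the next.
         * Any other error (e.g. EACCES for ports below 1024 without
         * privileges) is treated as fatal. */
        if (errno != EADDRINUSE)
            return -1;
    }
    errno = EADDRINUSE;
    return -1;
}
```

For reserved ports, a privileged caller would use something like bind_first_free_port(fd, 512, 1023) before calling connect(); the range here is only an example.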

Some folks don't like SO_REUSEADDR because it has a security stigma attached to it. On some operating systems it allows the same port to be used with a different address on the same machine by different processes at the same time. This is a problem because most servers bind to the port, but they don't bind to a specific address; instead, they use INADDR_ANY (this is why things show up in netstat output as *.8080). So if the server is bound to *.8080, another malicious user on the local machine can bind to local-machine.8080, which will intercept all of your connections since it is more specific. This is only a problem on multi-user machines that don't have restricted logins; it is NOT a vulnerability from outside the machine. And it is easily avoided by binding your server to the machine's address.
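
Binding to a specific address instead of INADDR_ANY could be sketched like this. The helper name bind_to_address is hypothetical, and IPv4 with a dotted-quad string is assumed for brevity.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: binds fd to one specific local IPv4 address
 * (dotted-quad string) and port. Returns 0 on success, -1 on failure. */
int bind_to_address(int fd, const char *ip, unsigned short port)
{
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);

    /* inet_pton() returns 1 only when ip is a valid IPv4 address. */
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -1;

    return bind(fd, (struct sockaddr *)&addr, sizeof(addr));
}
```

A socket bound this way appears in netstat output under that address rather than as *.8080, so there is no more specific binding left for another local process to claim.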

Additionally, others don't like that a busy server may have hundreds or thousands of these TIME_WAIT sockets stacking up and using kernel resources. For these reasons, there's another option for avoiding this problem.

Client Closes First

Looking at the diagram above, it is clear that TIME_WAIT can be avoided if the remote end initiates the closure. So the server can avoid problems by letting the client close first. The application protocol must be designed so that the client knows when to close. The server can safely close in response to an EOF from the client; however, it will also need to set a timeout when it is expecting an EOF in case the client has left the network ungracefully. In many cases simply waiting a few seconds before the server closes will be adequate.
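
One way a server might wait for the client's EOF with a timeout is sketched below, using poll(). The helper name wait_for_peer_close is hypothetical, and for simplicity the timeout restarts whenever more data arrives.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: waits up to timeout_ms for the peer to close.
 * Returns 1 if EOF was seen, 0 on timeout, -1 on error. Data that
 * arrives in the meantime is discarded, and each chunk of data
 * restarts the timeout (a simplification). */
int wait_for_peer_close(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    char buf[512];

    for (;;) {
        int rc = poll(&pfd, 1, timeout_ms);
        if (rc == 0)
            return 0;               /* the peer never closed */
        if (rc < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n == 0)
            return 1;               /* EOF: the peer closed first */
        if (n < 0 && errno != EINTR)
            return -1;
    }
}
```

The server would call wait_for_peer_close(fd, 5000) once it has sent its last reply and then close(fd) regardless of the result. When the function returns 1, the remote end initiated the closure, so the TIME_WAIT state ends up on the client's side.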

It probably makes more sense to call this method "Remote Closes First", because otherwise it depends on what you are calling the client and the server. If you are developing some system where a cluster of client programs sit on one machine and contact a variety of different servers, then you would want to foist the responsibility for closure onto the servers, to protect the resources on the client.

For example, I wrote a script that uses rsh to contact all of the machines on our network, and it does it in parallel, keeping some number of connections open at all times. rsh source ports are arbitrary available ports less than 1024. I initially used "rsh -n", which it turns out causes the local end to close first. After a few tests, every single free port less than 1024 was stuck in TIME_WAIT and I couldn't proceed. Removing the "-n" option causes the remote (server) end to close first (understanding why is left as an exercise for the reader), and should've eliminated the TIME_WAIT problem. However, without the -n, rsh can hang waiting for input. And, if you close input at the local end, this can again result in the port going into TIME_WAIT. I ended up avoiding the system-installed rsh program, and developing my own implementation in perl. My current implementation, multi-rsh, is available for download.

Reduce Timeout

If (for whatever reason) neither of these options works for you, it may also be possible to shorten the timeout associated with TIME_WAIT. Whether this is possible and how it should be accomplished depends on the operating system you are using. Also, making this timeout too short could have negative side-effects, particularly in lossy or congested networks.