TCP connections hanging in the CLOSE_WAIT and FIN_WAIT_2 states

Source: Internet  Editor: 程序博客网  Time: 2024/06/11 10:16

I think this is very useful for understanding TCP/IP, so I have copied it here to share with everyone.

 

The other day I had an interesting case, and since there was very little information about this, I thought I’d share it.

 

The problem, in short: a customer was running a web application that connects to SQL Server, and it was running out of TCP ports/sockets very rapidly.

Normally this type of problem occurs because the web application is opening connections ‘under the covers’, which then puts the ports into TIME_WAIT and thereby exhausts the number of available ports.

I have written about this earlier, in "Nested RecordSet and the port/socket in TIME_WAIT problem by example" and in

"(provider: TCP Provider, error: 0 - Only one usage of each socket address (protocol/network address/port) is normally permitted.)".

 

However, in this case the netstat output showed the following:

 

  TCP    xxx.xxx.xxx.xxx:1240   xxx.xxx.xxx.xxx:1433   CLOSE_WAIT      <pid>

  TCP    xxx.xxx.xxx.xxx:1241   xxx.xxx.xxx.xxx:1433   CLOSE_WAIT      <pid>

  TCP    xxx.xxx.xxx.xxx:1242   xxx.xxx.xxx.xxx:1433   CLOSE_WAIT      <pid>

 

and

 

  TCP    xxx.xxx.xxx.xxx:1433   xxx.xxx.xxx.xxx:1463   FIN_WAIT_2      <pid>

  TCP    xxx.xxx.xxx.xxx:1433   xxx.xxx.xxx.xxx:1464   FIN_WAIT_2      <pid>

  TCP    xxx.xxx.xxx.xxx:1433   xxx.xxx.xxx.xxx:1465   FIN_WAIT_2      <pid>

 

We know that 1433 is most likely the SQL Server, and since the port on the other side is increasing by one, we can assume that this is the web application connecting to it fairly rapidly.
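
To see at a glance how many connections are sitting in each state, the netstat output can also be summarised programmatically. The following is only a minimal sketch, assuming the default SQL Server port 1433 as above; the class name TcpStateCount is made up for illustration. It uses IPGlobalProperties.GetActiveTcpConnections from System.Net.NetworkInformation, which reports states such as CloseWait and FinWait2 but, unlike netstat -aon, does not give the owning process id.

```csharp
using System;
using System.Collections.Generic;
using System.Net.NetworkInformation;

class TcpStateCount
{
    static void Main()
    {
        // Assumption: SQL Server listens on its default port.
        const int sqlPort = 1433;

        var counts = new Dictionary<TcpState, int>();
        TcpConnectionInformation[] connections =
            IPGlobalProperties.GetIPGlobalProperties().GetActiveTcpConnections();

        foreach (TcpConnectionInformation c in connections)
        {
            // Keep only connections that involve the SQL Server port on either end.
            if (c.LocalEndPoint.Port != sqlPort && c.RemoteEndPoint.Port != sqlPort)
                continue;

            int n;
            counts.TryGetValue(c.State, out n); // n stays 0 if the state is new
            counts[c.State] = n + 1;
        }

        foreach (KeyValuePair<TcpState, int> pair in counts)
            Console.WriteLine("{0,-12} {1}", pair.Key, pair.Value);
    }
}
```

A steadily growing CloseWait count on the web server (or FinWait2 on the SQL Server machine) would match the picture above.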

So what do the CLOSE_WAIT and FIN_WAIT_2 states mean?

 

I’m in no way a TCP expert so you have to excuse the lack of depth in this matter, but there are some links at the end which may be helpful.

What I do know is that a normal TCP connection, in terms of protocol, looks as follows. (Remember that ‘server’ and ‘client’ in this case don’t necessarily mean SQL Server and a client to a SQL Server:

the server is the one accepting a connection and the client is the one requesting a connection.)

 

1. The client sends a SYN to the server.

2. The server responds with a SYN and ACK to the client.

3. The client responds with an ACK to the server.

 

The connection is now established and data transfer takes place (the steps above are known as a three-way handshake).

When the server is closing the connection, the following sequence takes place:

 

4. The server sends a FIN and an ACK to the client.

5. The client sends an ACK to the server.

6. The client sends its own FIN and ACK to the server.

7. The server sends an ACK to the client.

 

(NOTE that if it is the client that closes the connection, then you should replace server with client and client with server in steps 4 – 7 since a close

on the connection can be called from both sides).

 

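The connection is now closed. The side that sent the first FIN stays in TIME_WAIT for a short while before its port can be reused; the other side's port is available for the next connection straight away. You will see below a Network Monitor trace that hopefully will illustrate the above better.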

 

So, where do the CLOSE_WAIT and FIN_WAIT_2 states come into play in the scenario above? The ports are in these states after step 5.

 

On the client the port will be in CLOSE_WAIT: TCP    xxx.xxx.xxx.xxx:1242   xxx.xxx.xxx.xxx:1433   CLOSE_WAIT      <pid>

On the server the port will be in FIN_WAIT_2: TCP    xxx.xxx.xxx.xxx:1433   xxx.xxx.xxx.xxx:1465   FIN_WAIT_2      <pid>

 

The client has just sent the first ACK and the server is waiting for the FIN and ACK from the client.

This state is normally very short lived, so it is hard to see or catch, unless something goes wrong and it gets stuck in this state.
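
As a rough aid, the close sequence in steps 4 to 7 can be written down as a small table of states on each side. This is only a sketch: the intermediate state names (FIN_WAIT_1, LAST_ACK, CLOSED) are not mentioned above and are taken from the standard TCP state diagram, and the class name CloseSequence is made up for illustration.

```csharp
using System;

class CloseSequence
{
    static void Main()
    {
        // Columns: event, state of the side that closes first, state of the other side.
        // Each row shows the states once the segment has been received.
        string[,] steps =
        {
            { "connection established",       "ESTABLISHED", "ESTABLISHED" },
            { "step 4: closer's FIN arrives", "FIN_WAIT_1",  "CLOSE_WAIT"  },
            { "step 5: peer's ACK arrives",   "FIN_WAIT_2",  "CLOSE_WAIT"  },
            { "step 6: peer's FIN arrives",   "TIME_WAIT",   "LAST_ACK"    },
            { "step 7: closer's ACK arrives", "TIME_WAIT",   "CLOSED"      },
        };

        Console.WriteLine("{0,-32}{1,-14}{2}", "event", "closing side", "other side");
        for (int i = 0; i < steps.GetLength(0); i++)
        {
            Console.WriteLine("{0,-32}{1,-14}{2}", steps[i, 0], steps[i, 1], steps[i, 2]);
        }
    }
}
```

The row for step 5 is the FIN_WAIT_2 / CLOSE_WAIT pair discussed here: it lasts until the other side's application closes its socket and thereby triggers step 6.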

 

So, in good old style, I’ll do this by example. What we need is a server, a client, two command prompts and a Network Monitor (download site below).

This should be done on two separate machines so that (localhost)/shared memory is not coming into play.

 

First, create a server application (console) on one machine, like so:

using System;
using System.Net;
using System.Net.Sockets;

class Server
{
   static void Main(string[] args)
   {
      try
      {
         const int portNumber = 10000;

         // Replace <Name Of Server> with the host name of this machine.
         IPAddress ipAddress = Dns.GetHostEntry("<Name Of Server>").AddressList[0];
         TcpListener tcpListener = new TcpListener(ipAddress, portNumber);
         tcpListener.Start();

         Console.WriteLine("1. Waiting for client connection...");
         // Blocks until a client connects.
         TcpClient tcpClient = tcpListener.AcceptTcpClient();

         Console.WriteLine("2. We have a client connection, now closing TcpClient....");
         // Closing here sends the server's FIN_ACK to the client.
         tcpClient.Close();
         tcpListener.Stop();

         Console.WriteLine("3. TcpClient is now closed and TcpListener stopped.\nPress any key to exit.");
         Console.ReadKey();
      }
      catch (Exception e)
      {
         Console.WriteLine(e.ToString());
      }
   }
}

 

Second, create a client application (console) on another machine, like so:

using System;
using System.Net;
using System.Net.Sockets;

class Client
{
    static void Main(string[] args)
    {
        try
        {
            TcpClient tcpClient = new TcpClient();

            // Replace <Name Of Server> with the host name of the machine running the server.
            IPAddress ipAddress = Dns.GetHostEntry("<Name Of Server>").AddressList[0];
            tcpClient.Connect(ipAddress, 10000);

            Console.WriteLine("Paused.\nThe server has now closed the TCP connection and is in FIN_WAIT_2; the client is in CLOSE_WAIT.");
            Console.WriteLine("Press any key to close from the client, i.e. to send the client's FIN_ACK.");
            Console.ReadKey();

            // Closing sends the client's FIN_ACK and lets the connection shut down fully.
            tcpClient.Close();
        }
        catch (Exception e)
        {
            Console.WriteLine(e.ToString());
        }
    }
}

                                                         

In the above, the server starts listening on port 10000 and then stops at 1. since it is waiting for a client to connect.

The client connects on port 10000 and the server accepts the client connection and just closes it and stops the listener.

However, we have paused the client just before the close on the client.

 

So, first start Network Monitor on the server, the client, or both, and set the Capture Filter to the following (we know that in this case the communication will occur on port 10000 on the server):

 

Tcp.SrcPort == 10000 || Tcp.DstPort == 10000

 

Without the filter there will be a lot of data, and we do not need that data for this purpose.

Now that Network Monitor is running, start the Server application and then the Client application.

The server should display that it has closed the TcpClient and stopped the TcpListener; the client should display that pressing any key will close the connection from the client side.

Turn to the Network Monitor trace. You should have output that looks as follows (the server is xxx.xxx.xxx.187 on port 10000, the client is xxx.xxx.xxx.111):

 

Frame  Source IP      Port    Destination IP   Port    Protocol  Description

3   xxx.xxx.xxx.111  54868   xxx.xxx.xxx.187  10000   TCP   TCP: Flags=.S......, SrcPort=54868, DstPort=10000, Len=0, Seq=269180009, Ack=0, Win=8192

4   xxx.xxx.xxx.187  10000   xxx.xxx.xxx.111  54868   TCP   TCP: Flags=.S..A..., SrcPort=10000, DstPort=54868, Len=0, Seq=372159794, Ack=269180010

5   xxx.xxx.xxx.111  54868   xxx.xxx.xxx.187  10000   TCP   TCP: Flags=....A..., SrcPort=54868, DstPort=10000, Len=0, Seq=269180010, Ack=372159795

6   xxx.xxx.xxx.187  10000   xxx.xxx.xxx.111  54868   TCP   TCP: Flags=F...A..., SrcPort=10000, DstPort=54868, Len=0, Seq=372159795, Ack=269180010

7   xxx.xxx.xxx.111  54868   xxx.xxx.xxx.187  10000   TCP   TCP: Flags=....A..., SrcPort=54868, DstPort=10000, Len=0, Seq=269180010, Ack=372159796

 

So here we first see the three-way handshake.

Frame 3, the client sends a SYN to the server.

Frame 4, the server responds with the SYN_ACK.

Frame 5, the client responds with the ACK. Now we are up and running, and data transmission would normally take place. But we do not send/receive anything, we just close, so:

Frame 6, the server sends a FIN_ACK to the client, i.e. closing the connection.

Frame 7, the client responds with an ACK.
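
The Seq and Ack numbers in the trace follow a simple rule: a SYN or a FIN each uses up one sequence number, so the acknowledgement for such a segment is its Seq plus one (plus the payload length, which is 0 in every frame here). A small sketch of that arithmetic, using the values from the frames in this post; the helper ExpectedAck is made up for illustration:

```csharp
using System;

class SeqAckCheck
{
    // The ACK number that acknowledges a segment: its sequence number plus
    // the payload length, plus one for each SYN or FIN flag it carries.
    static uint ExpectedAck(uint seq, uint len, bool syn, bool fin)
    {
        uint consumed = len + (syn ? 1u : 0u) + (fin ? 1u : 0u);
        return unchecked(seq + consumed); // sequence numbers wrap at 2^32
    }

    static void Main()
    {
        // Frame 3 (client SYN) is acknowledged by frame 4.
        Console.WriteLine(ExpectedAck(269180009, 0, true, false));  // 269180010
        // Frame 4 (server SYN_ACK) is acknowledged by frame 5.
        Console.WriteLine(ExpectedAck(372159794, 0, true, false));  // 372159795
        // Frame 6 (server FIN_ACK) is acknowledged by frame 7.
        Console.WriteLine(ExpectedAck(372159795, 0, false, true));  // 372159796
        // Frame 8 (client FIN_ACK) is acknowledged by frame 9.
        Console.WriteLine(ExpectedAck(269180010, 0, false, true));  // 269180011
    }
}
```

Each printed value matches the Ack field of the frame that answers it in the trace.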

 

Here the client should send a FIN_ACK to the server, the server should send an ACK to the client, and the connection should be closed.

However, we have stopped the client from sending its FIN_ACK, so start a command prompt on the server and on the client and run ‘netstat -aon’ (no quotes).

This should give the following output.

 

On server:

   TCP   xxx.xxx.xxx.187:10000   xxx.xxx.xxx.111:53961   FIN_WAIT_2      <pid>

On client:

   TCP   xxx.xxx.xxx.111:53961   xxx.xxx.xxx.187:10000   CLOSE_WAIT      <pid>

 

Now press any key on the client and the final conversation should happen in the Network Monitor trace:

 

8   xxx.xxx.xxx.111  54868   xxx.xxx.xxx.187  10000   TCP   TCP: Flags=F...A..., SrcPort=54868, DstPort=10000, Len=0, Seq=269180010, Ack=372159796

9   xxx.xxx.xxx.187  10000   xxx.xxx.xxx.111  54868   TCP   TCP: Flags=....A..., SrcPort=10000, DstPort=54868, Len=0, Seq=372159796, Ack=269180011

 

And the connection is closed.

So, now you know, and can see, what the FIN_WAIT_2 / CLOSE_WAIT states mean and when they happen.

 

The short version is that these states exist when the first FIN_ACK and ACK have been sent but the second FIN_ACK and ACK have not.

On the side that closed the connection you will have FIN_WAIT_2; on the side that is yet to send the final FIN_ACK you will have CLOSE_WAIT.

 

Basically this condition exists when one side has said that it will not send more data, but may still receive data from the other side.
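
This "no more sending, but still receiving" condition can be reproduced on purpose with Socket.Shutdown(SocketShutdown.Send). The following is a minimal sketch, not part of the demo above: it runs both ends in one process over the loopback address with an OS-assigned port purely to stay self-contained, and the class name HalfClose is made up for illustration.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;

class HalfClose
{
    static void Main()
    {
        // Port 0 asks the OS for any free port on the loopback address.
        TcpListener listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        int port = ((IPEndPoint)listener.LocalEndpoint).Port;

        using (Socket client = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp))
        {
            client.Connect(IPAddress.Loopback, port);
            using (Socket server = listener.AcceptSocket())
            {
                // Server side: "I will not send any more data". This sends the FIN.
                server.Shutdown(SocketShutdown.Send);

                // Client side: a receive of 0 bytes means the peer's FIN has arrived.
                byte[] buffer = new byte[64];
                int read = client.Receive(buffer);
                Console.WriteLine("client received {0} bytes", read);

                // The client may still send, and the server can still receive.
                client.Send(Encoding.ASCII.GetBytes("still here"));
                int got = server.Receive(buffer);
                Console.WriteLine("server received: {0}", Encoding.ASCII.GetString(buffer, 0, got));
            }
        }
        listener.Stop();
    }
}
```

Between the Shutdown call and the end of the inner using block, the server's socket has sent its FIN while the client's socket has not, which is the situation described above.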

 

When can it happen that all the ports run out because they hang in these states?

Well, one reason could be that you have a web application that logs into SQL Server, but the connection is refused because the password has changed or expired, or something similar.

You may see this in the logs, for example in the application event log:

 

xx/xx/2008  10:00:00 AM MSSQLSERVER Failure Audit            (4)         18456       N/A            <servername> Login failed for user '<user>'. [CLIENT: xxx.xxx.xxx.139]

xx/xx/2008  10:00:00 AM MSSQLSERVER Failure Audit            (4)         18456       N/A            <servername> Login failed for user '<user>'. [CLIENT: xxx.xxx.xxx.139]

 

You can find out more about 18456 here: Error/Event 18456 explained

But what happens here is that the SYN, SYN_ACK, ACK exchange takes place, and then the server denies the connection since the login failed, i.e. it sends the FIN_ACK to the client.

The client then sends the ACK back, but it does not properly close or dispose of the connection; for example, the login failure could be caught but discarded.

Because the connection is never closed, the client never sends the last bits of the TCP conversation. The ports then hang in these states and block other connection attempts,

forcing each new connection to open on another port. And voila, you will run out of ports.
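
At the socket level, the remedy is for the client to always close its end once the server has closed, even when something went wrong along the way. The following is a minimal sketch of that idea using the same placeholder host name and port 10000 as the demo above; the class name PromptClose is made up for illustration, and this is not the customer's actual code.

```csharp
using System;
using System.IO;
using System.Net.Sockets;

class PromptClose
{
    static void Main()
    {
        try
        {
            // 'using' guarantees Dispose() (and therefore the client's FIN_ACK)
            // even if an exception is thrown inside the block.
            using (TcpClient tcpClient = new TcpClient("<Name Of Server>", 10000))
            {
                NetworkStream stream = tcpClient.GetStream();
                byte[] buffer = new byte[4096];

                // Read() returns 0 once the server has closed its side,
                // which is the moment this end enters CLOSE_WAIT.
                while (stream.Read(buffer, 0, buffer.Length) > 0)
                {
                    // Process received data here.
                }
            } // Leaving the block closes the socket, so CLOSE_WAIT does not linger.
        }
        catch (SocketException e)
        {
            Console.WriteLine("Connection failed: " + e.Message);
        }
        catch (IOException e)
        {
            Console.WriteLine("Connection dropped: " + e.Message);
        }
    }
}
```

The same pattern applies one level up: a SqlConnection is also disposable, so wrapping it in a using block means a failed login that is caught still leads to the connection being cleaned up rather than discarded.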

 

Hmmm, a long one this one, but hopefully it helps someone out there.

More information:

 

Transmission Control Protocol

http://en.wikipedia.org/wiki/Transmission_Control_Protocol

TCP Connection States and Netstat Output

http://support.microsoft.com/kb/137984

The Basics of Reading TCP/IP Traces

http://support.microsoft.com/kb/169292

Explanation of the Three-Way Handshake via TCP/IP

http://support.microsoft.com/kb/172983

Microsoft Network Monitor 3.1

http://www.microsoft.com/downloads/details.aspx?familyid=18b1d59d-f4d8-4213-8d17-2f6dde7d7aac&displaylang=en

TCPView for Windows

http://technet.microsoft.com/en-us/sysinternals/bb897437.aspx
