6-I/O Multiplexing-The 'select' and 'poll' Functions

Source: Internet. Editor: 程序博客网. Time: 2024/04/19 03:52

Please indicate the source: http://blog.csdn.net/gaoxiangnumber1

Welcome to my github: https://github.com/gaoxiangnumber1

6.1 Introduction

Our TCP client handles two inputs at the same time: standard input and a TCP socket. When the client was blocked in fgets (on standard input) and the server process was killed, the server TCP sent a FIN to the client TCP, but since the client was blocked reading from standard input, it never saw the EOF until it read from the socket.
The ability to tell the kernel that we want to be notified if one or more I/O conditions are ready (i.e., input is ready to be read, or a descriptor is capable of taking more output) is called I/O multiplexing and is provided by the select and poll functions.
I/O multiplexing is used in the following scenarios.
When a client handles:
1. multiple descriptors (normally interactive input and a network socket).
2. multiple sockets at the same time.
When a server handles:
3. both a listening socket and its connected sockets (Section 6.8).
4. both TCP and UDP (Section 8.15).
5. multiple services and perhaps multiple protocols (Section 13.5).

6.2 I/O Models

Five I/O models are available under Unix:
1. blocking I/O
2. nonblocking I/O
3. I/O multiplexing (select and poll)
4. signal-driven I/O (SIGIO)
5. asynchronous I/O (the POSIX aio_ functions)
There are normally two phases for an input operation.
1. Waiting for data to arrive(on the network). When data arrives, it is copied into a buffer within the kernel.
2. Copying data from the kernel’s buffer into our application buffer.

Blocking I/O Model

By default, all sockets are blocking.
The process calls recvfrom, and the system call does not return until the datagram arrives and is copied into our application buffer, or an error occurs. The process is blocked the entire time from when it calls recvfrom until it returns. When recvfrom returns successfully, our application processes the datagram.
The most common error is the system call being interrupted by a signal(Section 5.9).

Nonblocking I/O Model

When we set a socket to be nonblocking, we are telling the kernel: when an I/O operation that we request cannot be completed without putting the process to sleep, do not put the process to sleep, but return an error (EWOULDBLOCK) instead.
When an application sits in a loop calling recvfrom on a nonblocking descriptor, it is called polling. The application is continually polling the kernel to see if some operation is ready. This is often a waste of CPU time.
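As a minimal sketch of one polling attempt, the helpers below turn on O_NONBLOCK with fcntl and then try a single read. The names set_nonblocking and try_read are hypothetical, and read on a generic descriptor stands in for the recvfrom call discussed above; the -2 return code is also just a convention chosen for this example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Turn on O_NONBLOCK for fd, preserving the other file status flags.
 * Returns 0 on success, -1 on error (errno set by fcntl). */
int set_nonblocking(int fd)
{
    int flags;

    assert(fd >= 0);
    if ((flags = fcntl(fd, F_GETFL, 0)) < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* One polling attempt: returns the byte count (> 0), 0 on EOF,
 * -2 if the read would have blocked, or -1 on any other error. */
ssize_t try_read(int fd, void *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);

    if (n < 0 && (errno == EWOULDBLOCK || errno == EAGAIN))
        return -2;  /* nothing ready yet: the caller may poll again later */
    return n;
}
```

A loop that calls try_read until it stops returning -2 is exactly the polling pattern described above, which is why it tends to waste CPU time.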

I/O Multiplexing Model

With I/O multiplexing, we call select or poll and block in one of two system calls, instead of blocking in the actual I/O system call.
Advantage: we can wait for more than one descriptor to be ready.
Disadvantage: using select requires two system calls instead of one.
A closely related alternative is multithreading with blocking I/O: instead of using select to block on multiple file descriptors, the program uses multiple threads (one per file descriptor), and each thread is free to call blocking system calls.

Signal-Driven I/O Model

We can tell the kernel to notify us with the SIGIO signal when the descriptor is ready.
We first enable the socket for signal-driven I/O(Section 25.2) and install a signal handler using the sigaction system call. The return from this system call is immediate and our process continues. When the datagram is ready to be read, the SIGIO signal is generated for our process. We can either read the datagram from the signal handler by calling recvfrom and then notify the main loop that the data is ready to be processed, or we can notify the main loop and let it read the datagram.
Advantage: we are not blocked while waiting for the datagram to arrive.

Asynchronous I/O Model

Asynchronous I/O functions work by telling the kernel to start the operation and to notify us when the entire operation(including the copy of the data from the kernel to our buffer) is complete.
We call aio_read and pass the kernel the descriptor, buffer pointer, buffer size, file offset, and how to notify us when the entire operation is complete. aio_read returns immediately and our process is not blocked while waiting for the I/O to complete. Unlike the signal-driven I/O model, the signal is not generated until the data has been copied into our application buffer.

Comparison of the I/O Models

The main difference between the first four models is the first phase, as the second phase in the first four models is the same: the process is blocked in a call to recvfrom while the data is copied from the kernel to the caller's buffer. Asynchronous I/O handles both phases and is different from the first four.

Synchronous I/O versus Asynchronous I/O

POSIX defines these two terms as follows:
1. Synchronous I/O causes the requesting process to be blocked until that I/O operation completes.
2. Asynchronous I/O does not cause the requesting process to be blocked.
The first four I/O models (blocking, nonblocking, I/O multiplexing, and signal-driven I/O) are all synchronous because the actual I/O operation (recvfrom) blocks the process. Only the asynchronous I/O model matches the asynchronous I/O definition.

6.3 ‘select’ Function

select allows the process to instruct the kernel to wait for any one of multiple events to occur and to wake up the process only when one or more of these events occurs or when a specified amount of time has passed. We tell the kernel what descriptors we are interested in (for reading, writing, or an exception condition) and how long to wait. The descriptors in which we are interested are not restricted to sockets; any descriptor can be tested using select.

#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

int select(int maxfdp1, fd_set *readset, fd_set *writeset, fd_set *exceptset, const struct timeval *timeout);

Returns: positive count of ready descriptors, 0 on timeout, -1 on error

timeout tells the kernel how long to wait for one of the specified descriptors to become ready. A timeval structure specifies the number of seconds and microseconds.

struct timeval
{
    long tv_sec;    /* seconds */
    long tv_usec;   /* microseconds */
};

There are three possibilities.
1. timeout = NULL: Wait forever
  Return only when one of the specified descriptors is ready for I/O.
2. Wait up to a fixed amount of time
  Return when one of the specified descriptors is ready for I/O, but do not wait beyond the number of seconds and microseconds specified in the timeval structure pointed to by the timeout argument.
3. Do not wait at all
  Return immediately after checking the descriptors. This is called polling. The timeout argument must point to a timeval structure and the timer value(the number of seconds and microseconds specified by the structure) must be 0.
The wait in the first two scenarios is interrupted if the process catches a signal and returns from the signal handler. Berkeley-derived kernels never automatically restart select, so we must be prepared for select to return an error of EINTR if we are catching signals.
In the prototype above, timeout is const, so it is not modified by select on return. For example, if we specify a time limit of 10 seconds, and select returns before the timer expires with one or more of the descriptors ready or with an error of EINTR, the timeval structure is not updated with the number of seconds remaining when the function returns. If we wish to know this value, we must obtain the system time before calling select, and then again when it returns, and subtract the two. Note that this behavior is not portable: POSIX declares timeout without const and allows select to modify it, and Linux updates the structure with the time remaining, so portable code should reinitialize timeout before every call.
readset, writeset, and exceptset specify the descriptors that we want the kernel to test for reading, writing, and exception conditions. There are only 2 exception conditions supported:
1. The arrival of out-of-band data for a socket(Chapter 24).
2. The presence of control status information to be read from the master side of a pseudo-terminal that has been put into packet mode.
select uses descriptor sets, typically an array of integers with each bit in each integer corresponding to a descriptor. For example, using 32-bit integers, the first element of the array corresponds to descriptors 0 through 31, the second element corresponds to descriptors 32 through 63, and so on.
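The constant FD_SETSIZE in <sys/select.h> is the number of descriptors in the fd_set datatype. The implementation details are hidden in the fd_set datatype and the following four macros: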

void FD_ZERO(fd_set *fdset);            /* clear all bits in fdset */
void FD_SET(int fd, fd_set *fdset);     /* turn on the bit for fd in fdset */
void FD_CLR(int fd, fd_set *fdset);     /* turn off the bit for fd in fdset */
int FD_ISSET(int fd, fd_set *fdset);    /* is the bit for fd on in fdset ? */

We allocate a descriptor set of the fd_set datatype, set and test the bits in the set using these macros, and can assign it to another descriptor set across an equals sign (=) in C.
For example, to define a variable of type fd_set and then turn on the bits for descriptors 1, 4, and 5, we write

fd_set rset;
FD_ZERO(&rset);     /* initialize the set: all bits off */
FD_SET(1, &rset);   /* turn on bit for fd 1 */
FD_SET(4, &rset);   /* turn on bit for fd 4 */
FD_SET(5, &rset);   /* turn on bit for fd 5 */

Any of the three arguments readset, writeset, and exceptset can be specified as a null pointer if we are not interested in that condition. If all three pointers are null, we have a higher precision timer than the sleep function (which sleeps for multiples of a second).
maxfdp1 is the maximum descriptor to be tested plus one: descriptors 0 through maxfdp1 - 1 are tested.
select modifies the descriptor sets pointed to by readset, writeset, and exceptset. We turn on all the bits in which we are interested in all the descriptor sets when we call select, and on return any descriptor that is not ready will have its corresponding bit cleared. We use the FD_ISSET macro to test a specific descriptor in an fd_set structure, and we must turn on the bits we care about again each time we call select.
The return value indicates the total number of bits that are ready across all the descriptor sets. If the same bit is on in multiple sets, say a descriptor is ready for both reading and writing, it is counted twice. If the timer expires before any of the descriptors are ready, a value of 0 is returned. A return value of -1 indicates an error (e.g., the call was interrupted by a caught signal).
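The two uses of select described above (a sub-second timer with all three sets null, and a bounded wait for readability) could be sketched as follows. The function names sleep_us and wait_readable are hypothetical, and the timeval is rebuilt on every call because some systems modify it.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Sleep for roughly usec microseconds by calling select with no sets.
 * Returns 0 when the timer expires, -1 on error (e.g., EINTR). */
int sleep_us(long usec)
{
    struct timeval tv;

    assert(usec >= 0);
    tv.tv_sec = usec / 1000000;
    tv.tv_usec = usec % 1000000;
    return select(0, NULL, NULL, NULL, &tv);
}

/* Wait until fd is readable or timeout_ms milliseconds pass.
 * Returns 1 if readable, 0 on timeout, -1 on error (e.g., EINTR). */
int wait_readable(int fd, long timeout_ms)
{
    fd_set rset;
    struct timeval tv;
    int n;

    assert(fd >= 0 && fd < FD_SETSIZE);  /* select cannot test larger descriptors */
    assert(timeout_ms >= 0);
    FD_ZERO(&rset);
    FD_SET(fd, &rset);                   /* must be set again before every call */
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    n = select(fd + 1, &rset, NULL, NULL, &tv);   /* maxfdp1 = fd + 1 */
    if (n <= 0)
        return n;
    return FD_ISSET(fd, &rset) ? 1 : 0;
}
```

For example, wait_readable(STDIN_FILENO, 500) would wait up to half a second for terminal input.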

Under What Conditions Is a Descriptor Ready?

Ready for reading if:
a. The number of bytes of data in the socket receive buffer is greater than or equal to the current size of the low-water mark for the socket receive buffer.
A read on the socket will not block and will return a value greater than 0. We can set the low-water mark (which defaults to 1 for TCP and UDP sockets) using the SO_RCVLOWAT socket option.
b. The read half of the connection is closed (i.e., the TCP connection has received a FIN).
A read on the socket will not block and will return 0 (i.e., EOF).
c. The socket is a listening socket and the number of completed connections is nonzero.
An accept on the listening socket will normally not block (Section 16.6 describes a timing condition that can cause accept to block).
d. A socket error is pending.
A read on the socket will not block and will return an error (-1) with errno set to the specific error condition. These pending errors can also be fetched and cleared by calling getsockopt and specifying the SO_ERROR socket option.
Ready for writing if:
a. The number of bytes of available space in the socket send buffer is greater than or equal to the current size of the low-water mark for the socket send buffer, and either (i) the socket is connected, or (ii) the socket does not require a connection (e.g., UDP).
If we set the socket to nonblocking (Chapter 16), a write operation will not block and will return a positive value. We can set the low-water mark (which normally defaults to 2048 for TCP and UDP sockets) using the SO_SNDLOWAT socket option.
b. The write half of the connection is closed.
A write on the socket will generate SIGPIPE (Section 5.12).
c. A socket using a nonblocking connect has completed the connection, or the connect has failed.
d. A socket error is pending.
A write on the socket will not block and will return an error (-1) with errno set to the specific error condition. These pending errors can also be fetched and cleared by calling getsockopt with the SO_ERROR socket option.
A socket has an exception condition pending if there is out-of-band data for the socket or the socket is still at the out-of-band mark (Chapter 24).
When an error occurs on a socket, it is marked as both readable and writable by select.
The purpose of the receive and send low-water marks is to give the application control over how much data must be available for reading or how much space must be available for writing before select returns a readable or writable status. For example, if our application has nothing to do unless at least 64 bytes of data are present, we can set the receive low-water mark to 64 to prevent select from waking us up if fewer than 64 bytes are ready for reading.
As long as the send low-water mark for a UDP socket is less than the send buffer size, the UDP socket is always writable, since a connection is not required.
Figure 6.7 summarizes the conditions just described that cause a socket to be ready for select.
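The two socket options mentioned in this section, SO_ERROR and SO_RCVLOWAT, could be exercised with helpers like the ones below. This is only a sketch: the function names pending_socket_error and set_recv_lowat are hypothetical, and whether a given kernel lets select honor the receive low-water mark is something to verify on the target system.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Fetch and clear the pending error on sockfd.
 * Returns the pending errno value (0 if none), or -1 if getsockopt fails. */
int pending_socket_error(int sockfd)
{
    int err = 0;
    socklen_t len = sizeof(err);

    assert(sockfd >= 0);
    if (getsockopt(sockfd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return -1;
    return err;
}

/* Ask that sockfd be reported readable only once at least `bytes`
 * bytes are queued. Returns 0 on success, -1 on error. */
int set_recv_lowat(int sockfd, int bytes)
{
    assert(bytes > 0);
    return setsockopt(sockfd, SOL_SOCKET, SO_RCVLOWAT, &bytes, sizeof(bytes));
}
```

With the 64-byte example from the text, the call would be set_recv_lowat(sockfd, 64).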

Maximum Number of Descriptors for select

/*
 * Select uses bitmasks of file descriptors in longs. These macros manipulate such bit
 * fields (the filesystem macros use chars). FD_SETSIZE may be defined by the user,
 * but the default here should be enough for most uses.
 */
#ifndef FD_SETSIZE
#define FD_SETSIZE 256
#endif

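The comment suggests that we can define FD_SETSIZE to a larger value before including this header, but this normally does not work. The only way to increase the size of the descriptor sets is to increase the value of FD_SETSIZE and then recompile the kernel.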

6.4 ‘str_cli’ Function (Revisited)

Three conditions are handled with the socket. If the peer TCP sends:
1. data: socket becomes readable and read returns greater than 0(the number of bytes of data).
2. FIN(the peer process terminates): socket becomes readable and read returns 0(EOF).
3. RST(the peer host has crashed and rebooted): socket becomes readable, read returns -1, and errno contains the specific error code.

/* Exit, Readline, Fputs, Fgets, and Writen are helper wrappers not shown in this section. */
int Select(int maxfdp1, fd_set *readset, fd_set *writeset, fd_set *exceptset,
           struct timeval *timeout)
{
    int n;
    if((n = select(maxfdp1, readset, writeset, exceptset, timeout)) < 0)
    {
        Exit("select error");
    }
    return n;
}

void str_cli(FILE *fp, int fd)
{
    printf("str_cli Enter:\n");
    int maxfdp1;
    fd_set rset;
    char sendline[MAXLINE], recvline[MAXLINE];

    FD_ZERO(&rset);
    for(;;)
    {
        FD_SET(fileno(fp), &rset);
        FD_SET(fd, &rset);
        maxfdp1 = ((fileno(fp) > fd) ? fileno(fp) : fd) + 1;
        Select(maxfdp1, &rset, NULL, NULL, NULL);
        printf("Select success\n");

        printf("Check fd\n");
        if(FD_ISSET(fd, &rset))
        {
            printf("fd is ready\n");
            if(Readline(fd, recvline, MAXLINE) == 0)
            {
                Exit("str_cli: server terminated prematurely");
            }
            Fputs(recvline, stdout);
        }

        printf("Check fileno(fp)\n");
        if(FD_ISSET(fileno(fp), &rset))
        {
            printf("fileno(fp) is ready\n");
            if(Fgets(sendline, MAXLINE, fp) == NULL)
            {
                return;
            }
            Writen(fd, sendline, strlen(sendline));
        }
    }
}

Call select 8-13

We only need one descriptor set to check for readability. This set is initialized by FD_ZERO and then two bits are turned on using FD_SET: the bit for the standard I/O file pointer (fp) and the bit for the socket (fd in the code above). fileno converts a standard I/O file pointer into its corresponding descriptor.

Handle readable socket 14-18

On return from select, if socket is readable, the echoed line is read with readline and output by fputs.

Handle readable input 19-23

If the standard input is readable, a line is read by fgets and written to the socket using writen.

6.5 Batch Input and Buffering

Figure 5.5 operates in a stop-and-wait mode: It sends a line to the server and then waits for the reply. This amount of time is one RTT plus the server’s processing time(which is close to 0 for echo server).
Since the network between the client and server is a full-duplex pipe, requests go from the client to the server and replies in the reverse direction. Figure 6.10 shows stop-and-wait mode.

Since our client reads from standard input and writes to standard output, we can run it in batch mode by redirecting the input and output. When we do so, however, the resulting output file is always smaller than the input file, although the two should be identical for an echo server.
In batch mode, we can keep sending requests as fast as the network can accept them. The server processes them and sends back the replies at the same rate. This leads to the full pipe at time 7, as shown in Figure 6.11.

Problem with str_cli in Figure 6.9

Assume the input file contains only nine lines. The last line is sent at time 8. But we cannot close the connection after writing this request because there are other requests and replies in the pipe.
In batch mode, an EOF on input does not imply that we have finished reading from the socket: there might still be requests on the way to the server, or replies on the way back from the server.
We need a way to close one-half of the TCP connection: we send a FIN to the server, telling it we have finished sending data, but leave the socket descriptor open for reading. This is done with the shutdown function.

Buffering for performance adds complexity to a network application.

When several lines of input are available from the standard input, select will cause the code at line 20 to read the input using fgets and that will read the available lines into a buffer used by stdio. But fgets only returns a single line and leaves any remaining data sitting in the stdio buffer. The code at line 22 writes that single line to the server and then select is called again to wait for more work, even if there are additional lines to consume in the stdio buffer.
The reason is that select knows nothing of the buffers used by stdio. It will only show readability from the viewpoint of the read system call, not calls like fgets. So, mixing stdio and select is error-prone.
The same problem exists with the call to readline in Figure 6.9: data can be hidden in readline's buffer. One solution is to modify our code to check readline's buffer before calling select, to see if data has already been read but not consumed. But the complexity grows out of hand when we have to handle the case where the readline buffer contains a partial line (meaning we still need to read more) as well as when it contains one or more complete lines (which we can consume).

6.6 ‘shutdown’ Function

The normal way to terminate a network connection is to call close. There are two limitations with close that can be avoided with shutdown:
1. close decrements the descriptor's reference count and closes the socket only if the count reaches 0. With shutdown, we can initiate TCP's normal connection termination sequence (the four segments beginning with a FIN) regardless of the reference count.
2. close terminates both directions of data transfer (reading and writing). Since a TCP connection is full-duplex, there are times when we want to tell the other end that we have finished sending, even though that end might have more data to send us. Figure 6.12 shows the function calls in this scenario.

#include <sys/socket.h>

int shutdown(int sockfd, int howto);

Returns: 0 if OK, -1 on error

The action of the function depends on the value of the howto argument.

howto: Meaning
SHUT_RD: The read half of the connection is closed. No more data can be received on the socket and any data currently in the socket receive buffer is discarded. The process can no longer issue any of the read functions on the socket. Any data received after this call for a TCP socket is acknowledged and then silently discarded. By default, everything written to a routing socket (Chapter 18) loops back as possible input to all routing sockets on the host. Some programs call shutdown with a second argument of SHUT_RD to prevent the loopback copy. An alternative way to prevent this loopback copy is to clear the SO_USELOOPBACK socket option.
SHUT_WR: The write half of the connection is closed. This is called a half-close in TCP. Any data currently in the socket send buffer will be sent, followed by TCP's normal connection termination sequence. The process can no longer issue any of the write functions on the socket.
SHUT_RDWR: The read half and the write half of the connection are both closed. This is equivalent to calling shutdown twice: first with SHUT_RD and then with SHUT_WR.

Figure 7.12 summarizes the different possibilities available to the process by calling shutdown and close. The operation of close depends on the value of the SO_LINGER socket option.
Typical values for howto are 0 (close the read half), 1 (close the write half), and 2 (close the read half and the write half).
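A half-close with SHUT_WR could look like the sketch below: send everything, then signal end of data while keeping the descriptor open for reading. The helper name send_and_half_close is hypothetical, and for brevity the write loop does not retry after EINTR.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* Send all of buf, then half-close: the peer sees EOF after the data,
 * while sockfd stays open for reading its replies.
 * Returns 0 on success, -1 on error. */
int send_and_half_close(int sockfd, const char *buf, size_t len)
{
    assert(sockfd >= 0);
    while (len > 0) {
        ssize_t n = write(sockfd, buf, len);

        if (n < 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return shutdown(sockfd, SHUT_WR);   /* on a TCP socket this sends the FIN */
}
```

After this call, read on sockfd still returns whatever the peer sends, and finally 0 once the peer closes its side.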

6.7 ‘str_cli’ Function (Revisited Again)

This revised version of str_cli uses select and shutdown. The former notifies us as soon as the server closes its end of the connection, and the latter lets us handle batch input correctly. This version also does away with line-centric code and operates instead on buffers, eliminating the complexity concerns raised in Section 6.5.
5-8
stdineof is initialized to 0. As long as it is 0, each time around the main loop, we select on standard input for readability.
17-25
When we read the EOF on the socket, if we have encountered an EOF on standard input, this is normal termination; otherwise, the server process has prematurely terminated.
26-34
When we encounter the EOF on standard input, stdineof is set and we call shutdown with SHUT_WR to send the FIN.
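The logic just described could be sketched as follows. This is not the book's figure: the function name str_cli_fd, the helper write_all, the BUFSZ constant, and the use of plain descriptors for input and output (instead of a FILE pointer) are assumptions made so that the example is self-contained, and its line numbers do not match the ranges quoted above.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

#define BUFSZ 4096  /* hypothetical buffer size */

/* Write all of buf to fd; returns 0 on success, -1 on error. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);

        if (n < 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Relay infd -> sockfd and sockfd -> outfd until the peer closes.
 * Returns 0 on normal termination, -1 on error or premature EOF. */
int str_cli_fd(int infd, int outfd, int sockfd)
{
    int stdineof = 0;   /* becomes 1 once EOF has been read from infd */
    fd_set rset;
    char buf[BUFSZ];
    ssize_t n;

    assert(infd >= 0 && infd < FD_SETSIZE);
    assert(sockfd >= 0 && sockfd < FD_SETSIZE);
    for (;;) {
        int maxfd = sockfd;

        FD_ZERO(&rset);
        FD_SET(sockfd, &rset);
        if (!stdineof) {                /* stop selecting on input after EOF */
            FD_SET(infd, &rset);
            if (infd > maxfd)
                maxfd = infd;
        }
        if (select(maxfd + 1, &rset, NULL, NULL, NULL) < 0)
            return -1;

        if (FD_ISSET(sockfd, &rset)) {  /* socket is readable */
            if ((n = read(sockfd, buf, sizeof(buf))) < 0)
                return -1;
            if (n == 0)                 /* EOF: normal only after input EOF */
                return stdineof ? 0 : -1;
            if (write_all(outfd, buf, (size_t)n) < 0)
                return -1;
        }
        if (!stdineof && FD_ISSET(infd, &rset)) {   /* input is readable */
            if ((n = read(infd, buf, sizeof(buf))) < 0)
                return -1;
            if (n == 0) {
                stdineof = 1;
                if (shutdown(sockfd, SHUT_WR) < 0)  /* send FIN, keep reading */
                    return -1;
            } else if (write_all(sockfd, buf, (size_t)n) < 0) {
                return -1;
            }
        }
    }
}
```

Because the code calls read and write directly, no data can be hidden in a stdio or readline buffer that select does not know about.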

6.8 TCP Echo Server (Revisited)

Rewrite the server as a single process that uses select to handle any number of clients.
Figure 6.14 shows the state of the server before the first client has established a connection.

The server maintains only a read descriptor set, which we show in Figure 6.15.

Assume the server is started in the foreground, so descriptors 0, 1, and 2 are set to standard input, output, and error. The first available descriptor for the listening socket is 3. So maxfdp1 = 4.
The server also maintains an array of integers named client that contains the connected socket descriptor for each client. All elements in this array are initialized to -1.
When the first client establishes a connection with our server, the listening descriptor becomes readable and our server calls accept. The new connected descriptor returned by accept will be 4. Figure 6.16 shows the connection from the client to the server.

From this point on, our server must remember the new connected socket in its client array, and the connected socket must be added to the descriptor set. These updated data structures are shown in Figure 6.17.

Sometime later a second client establishes a connection and we have the scenario shown in Figure 6.18.

The new connected socket(assume 5) must be remembered, giving the data structures shown in Figure 6.19.

Next, assume the first client terminates its connection. The client TCP sends a FIN, which makes descriptor 4 in the server readable. When our server reads this connected socket, read returns 0. We then close this socket and update our data structures accordingly. The value of client[0] is set to -1 and the bit for descriptor 4 in the descriptor set is cleared. This is shown in Figure 6.20. Notice that the value of maxfd does not change.

As clients arrive, we record their connected socket descriptor in the first available entry in the client array(i.e., the first entry with a value of -1). We must also add the connected socket to the read descriptor set.
maxi: the highest index in the client array that is currently in use;
maxfd plus one: the current value of the first argument to select.
The limit on the number of clients that this server can handle is the minimum of the two values FD_SETSIZE and the maximum number of descriptors allowed for this process by the kernel.

Create listening socket and initialize for select 12-24

We initialize our data structures assuming that the only descriptor that we will select on initially is the listening socket.

Block in select 26-27

select waits for something to happen: either the establishment of a new client connection or the arrival of data, a FIN, or an RST on an existing connection.

accept new connections 28-45

If the listening socket is readable, a new connection has been established. We call accept and update our data structures accordingly. We use the first unused entry in the client array to record the connected socket. The number of ready descriptors is decremented, and if it is 0, we can avoid the next for loop.

Check existing connections 46-60

A test is made for each existing client connection as to whether or not its descriptor is in the descriptor set returned by select. If so, a line is read from the client and echoed back to the client. If the client closes the connection, read returns 0 and we update our data structures accordingly.
We never decrement the value of maxi, but we could check for this possibility each time a client closes its connection.
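The four steps above could be sketched as follows. This is not the book's listing: the struct echo_server container and the function names server_init and server_step are hypothetical, the server is split into an initializer and a single-pass step function so that it can be driven from a loop, and it echoes whatever read returns rather than whole lines.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

struct echo_server {            /* hypothetical container for the variables in the text */
    int listenfd;
    int maxfd;                  /* highest descriptor in allset */
    int maxi;                   /* highest index of client[] in use, -1 if none */
    int client[FD_SETSIZE];     /* connected descriptors, -1 marks a free entry */
    fd_set allset;              /* every descriptor we select on */
};

/* Create the listening socket (port 0 lets the kernel choose) and
 * initialize the data structures. Returns 0 on success, -1 on error. */
int server_init(struct echo_server *s, unsigned short port)
{
    struct sockaddr_in addr;
    int i;

    assert(s != NULL);
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if ((s->listenfd = socket(AF_INET, SOCK_STREAM, 0)) < 0)
        return -1;
    if (s->listenfd >= FD_SETSIZE ||
        bind(s->listenfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(s->listenfd, 16) < 0) {
        close(s->listenfd);
        return -1;
    }
    s->maxfd = s->listenfd;     /* only the listening socket at first */
    s->maxi = -1;
    for (i = 0; i < FD_SETSIZE; i++)
        s->client[i] = -1;
    FD_ZERO(&s->allset);
    FD_SET(s->listenfd, &s->allset);
    return 0;
}

/* One pass of the main loop: block in select, then accept and/or echo.
 * Returns 0 on success, -1 if select fails. */
int server_step(struct echo_server *s)
{
    fd_set rset = s->allset;    /* structure assignment; select overwrites rset */
    char buf[4096];
    int i, nready;

    if ((nready = select(s->maxfd + 1, &rset, NULL, NULL, NULL)) < 0)
        return -1;

    if (FD_ISSET(s->listenfd, &rset)) {     /* new client connection */
        int connfd = accept(s->listenfd, NULL, NULL);

        if (connfd >= 0) {
            for (i = 0; i < FD_SETSIZE && s->client[i] >= 0; i++)
                ;                           /* find the first free entry */
            if (i == FD_SETSIZE || connfd >= FD_SETSIZE) {
                close(connfd);              /* too many clients */
            } else {
                s->client[i] = connfd;
                FD_SET(connfd, &s->allset);
                if (connfd > s->maxfd)
                    s->maxfd = connfd;
                if (i > s->maxi)
                    s->maxi = i;
            }
        }
        if (--nready <= 0)
            return 0;                       /* no more ready descriptors */
    }

    for (i = 0; i <= s->maxi && nready > 0; i++) {  /* check all clients */
        int fd = s->client[i];
        ssize_t n;

        if (fd < 0 || !FD_ISSET(fd, &rset))
            continue;
        nready--;
        n = read(fd, buf, sizeof(buf));
        if (n <= 0 || write(fd, buf, (size_t)n) < 0) {  /* EOF or error */
            close(fd);
            FD_CLR(fd, &s->allset);
            s->client[i] = -1;
        }
    }
    return 0;
}
```

A complete server would call server_init once and then call server_step in an endless loop. As in the text, neither maxi nor maxfd is decremented when a client leaves.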

Denial-of-Service Attacks

Consider what happens if a client connects to the server, sends one byte of data(other than a newline), and then goes to sleep. The server will call read, which will read the single byte of data from the client and then block in the next call to read, waiting for more data from this client. The server is then blocked by this one client and will not service any other clients(either new client connections or existing clients’ data) until the client either sends a newline or terminates.
The basic concept here is that when a server is handling multiple clients, it must never block in a function call related to a single client. Doing so can hang the server and deny service to all other clients. This is called a denial-of-service attack: it does something to the server that prevents it from servicing other legitimate clients.
Solutions:
(i) Use nonblocking I/O(Chapter 16),
(ii) Have each client serviced by a separate thread of control(e.g., either spawn a process or a thread to service each client),
(iii) Place a timeout on the I/O operations(Section 14.2).

6.9 ‘pselect’ Function

#include <sys/select.h>
#include <signal.h>
#include <time.h>
#include <sys/types.h>
#include <unistd.h>

int pselect(int maxfdp1, fd_set *readset, fd_set *writeset, fd_set *exceptset,
            const struct timespec *timeout, const sigset_t *sigmask);

Returns: count of ready descriptors, 0 on timeout, -1 on error

pselect contains two changes from select:
1. pselect uses the timespec structure instead of the timeval structure.

struct timespec
{
    time_t  tv_sec;     // seconds
    long    tv_nsec;    // nanoseconds
};

2. pselect adds a sixth argument: a pointer to a signal mask. This allows the program to disable the delivery of certain signals, test some global variables that are set by the handlers for these now-disabled signals, and then call pselect, telling it to reset the signal mask.
Consider the following example (discussed on pp. 308-309 of APUE). Our program's signal handler for SIGINT sets the global intr_flag and returns. If our process is blocked in a call to select, the return from the signal handler causes the function to return with errno set to EINTR. When select is called, the code looks like:

if (intr_flag)
    handle_intr();      /* handle the signal */
if ((nready = select( ... )) < 0)
{
    if (errno == EINTR)
    {
        if (intr_flag)
            handle_intr();
    }
    ...
}

The problem is that if the signal occurs between the test of intr_flag and the call to select, it will be lost if select blocks forever. With pselect, we can code this reliably as

sigset_t newmask, oldmask, zeromask;

sigemptyset(&zeromask);
sigemptyset(&newmask);
sigaddset(&newmask, SIGINT);
sigprocmask(SIG_BLOCK, &newmask, &oldmask);     /* block SIGINT */
if (intr_flag)
    handle_intr();      /* handle the signal */
if ((nready = pselect( ... , &zeromask)) < 0)
{
    if (errno == EINTR)
    {
        if (intr_flag)
            handle_intr();
    }
    ...
}

Before testing intr_flag, we block SIGINT. When pselect is called, it replaces the signal mask of the process with an empty set(i.e., zeromask) and then checks the descriptors, possibly going to sleep. But when pselect returns, the signal mask of the process is reset to its value before pselect was called(i.e., SIGINT is blocked).

6.10 ‘poll’ Function

#include <poll.h>

int poll(struct pollfd *fdarray, unsigned long nfds, int timeout);

Returns: count of ready descriptors, 0 on timeout, -1 on error

fdarray is a pointer to the first element of an array of structures. Each element of the array is a pollfd structure that specifies the conditions to be tested for a given descriptor fd.

struct pollfd
{
    int fd;         // descriptor to check
    short events;   // events of interest on fd
    short revents;  // events that occurred on fd
};

The conditions to be tested are specified by the events member, and the function returns the status for that descriptor in the corresponding revents member. Figure 6.23: Constants used to specify the events flag and to test the revents flag against.

poll identifies three classes of data: normal, priority band, and high-priority. These terms come from the STREAMS-based implementations (Figure 31.5). POLLIN = (POLLRDNORM | POLLRDBAND). POLLOUT = POLLWRNORM.
With regard to TCP and UDP sockets, the following conditions cause poll to return the specified revents value.
1. All regular TCP data and all UDP data is considered normal.
2. TCP’s out-of-band data(Chapter 24) is considered priority band.
3. When the read half of a TCP connection is closed(e.g., FIN is received), this is considered normal data and a subsequent read operation will return 0.
4. An error(receipt of RST, timeout…) for TCP connection can be considered either normal data or an error(POLLERR). In either case, a subsequent read will return -1 with errno set to the appropriate value.
5. The availability of a new connection on a listening socket can be considered either normal data or priority data. Most consider this normal data.
6. The completion of a nonblocking connect is considered to make a socket writable.
nfds: the number of elements in the array of structures.
timeout: how long the function is to wait before returning. A positive value specifies the number of milliseconds to wait. Figure 6.24 shows the values for timeout.

The constant INFTIM is defined to be a negative value. If the system does not provide a timer with millisecond accuracy, the value is rounded up to the nearest supported value. As with select, any timeout is limited by the implementation’s clock resolution(often 10 ms).
The return value from poll is -1 if an error occurred, 0 if no descriptors are ready before the timer expires, and otherwise the number of descriptors that have a nonzero revents member.
If we are no longer interested in a particular descriptor, we set the fd member of the pollfd structure to a negative value. Then the events member is ignored and the revents member is set to 0 on return.
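A small sketch of these rules: the hypothetical helper poll_two waits on two descriptors and reports which one became readable. Passing a negative descriptor shows how an entry is skipped, and POLLIN is used here as the portable spelling of the read events.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Wait up to timeout_ms milliseconds for fd0 or fd1 to become readable.
 * A negative descriptor is ignored by poll.
 * Returns the index (0 or 1) of the first ready entry, or -1 on timeout or error. */
int poll_two(int fd0, int fd1, int timeout_ms)
{
    struct pollfd fds[2];
    int i;

    assert(fd0 >= 0 || fd1 >= 0);       /* at least one descriptor to test */
    fds[0].fd = fd0;
    fds[0].events = POLLIN;
    fds[1].fd = fd1;
    fds[1].events = POLLIN;
    if (poll(fds, 2, timeout_ms) <= 0)  /* 0 = timeout, -1 = error */
        return -1;
    for (i = 0; i < 2; i++) {
        /* POLLERR and POLLHUP are reported even though we did not ask for them */
        if (fds[i].revents & (POLLIN | POLLERR | POLLHUP))
            return i;
    }
    return -1;
}
```

Unlike select, poll does not overwrite the events member, so the same array can be reused across calls without reinitializing it.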

6.11 TCP Echo Server (Revisited Again)

With poll, we must allocate an array of pollfd structures to maintain the client information. We handle the fd member of this array the same way we handled the client array in Figure 6.15: a value of -1 means the entry is not in use; otherwise, it is the descriptor value.

Allocate array of pollfd structures 11

We declare OPEN_MAX elements in our array of pollfd structures.

Initialize 20-24

We use the first entry in the client array for the listening socket and set the descriptor for the remaining entries to -1. We set the POLLRDNORM event for this descriptor, to be notified by poll when a new connection is ready to be accepted. The variable maxi contains the largest index of the client array currently in use.

Call poll, check for new connection 26-42

We call poll to wait for either a new connection or data on an existing connection.
When a new connection is accepted, we find the first available entry in the client array by looking for the first one with a negative descriptor. When an available entry is found, we save the descriptor and set the POLLRDNORM event.

Check for data on an existing connection 43-63

The two return events that we check for are POLLRDNORM and POLLERR. We did not set POLLERR in the events member because it is always returned when the condition is true.
The reason we check for POLLERR is that some implementations return this event when an RST is received for a connection, while others return POLLRDNORM. In either case, we call read and if an error has occurred, it will return an error. When an existing connection is terminated by the client, we just set the fd member to -1.

6.12 Summary

Exercises 6.1

We said that a descriptor set can be assigned to another descriptor set across an equals sign in C. How is this done if a descriptor set is an array of integers? (Hint: Look at your system’s <sys/select.h> or <sys/types.h> header.)

The array of integers is contained within a structure and C allows structures to be assigned across an equals sign.
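A minimal sketch of that structure assignment, using a hypothetical helper named copy_fd_set:

```c
#include <sys/select.h>

/* Returns an independent copy of *src. A bare array could not be copied
 * with '=', but fd_set is a structure (wrapping the array), so plain
 * assignment copies every bit. */
fd_set copy_fd_set(const fd_set *src)
{
    fd_set dst = *src;   /* structure assignment */
    return dst;
}
```

This is the same mechanism behind the usual pattern of keeping a master set and assigning it to a working set before each call to select, since select overwrites the sets it is given.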

Exercises 6.2

When describing the conditions for which select returns “writable” in Section 6.3, why did we need the qualifier that the socket had to be nonblocking for a write operation to return a positive value?

If select tells us that the socket is writable, the socket send buffer has room for 8,192 bytes, but when we call write for this blocking socket with a buffer length of 8,193 bytes, write can block, waiting for room for the final byte.
Read operations on a blocking socket will always return a short count if some data is available, but write operations on a blocking socket will block until all the data can be accepted by the kernel. Therefore, when using select to test for writability, we must set the socket to nonblocking to avoid blocking.

Exercises 6.3

What happens in Figure 6.9 if we prepend the word “else” before the word “if” on line 19?

If both descriptors are readable, only the first test is performed, the test of the socket. But this does not break the client; it just makes it less efficient.
That is, if select returns with both descriptors readable, the first if is true, causing a readline from the socket followed by an fputs to standard output. The next if is skipped, but select is then called again and immediately finds standard input readable and returns immediately.
The key concept here is that what clears the condition of “standard input being readable” is not select returning, but reading from the descriptor.
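This behavior is easy to observe with a zero-timeout select. The helper name readable_now is hypothetical; it is only a thin wrapper around select.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Returns 1 if fd is readable right now, 0 if it is not, -1 on error.
 * The zero timeout makes select return immediately. */
int readable_now(int fd)
{
    fd_set rset;
    struct timeval tv = {0, 0};

    FD_ZERO(&rset);
    FD_SET(fd, &rset);
    return select(fd + 1, &rset, NULL, NULL, &tv);
}
```

If one byte is written to a pipe, readable_now on the read end returns 1 no matter how many times it is called. It returns 0 only after the byte has been consumed with read.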

Exercises 6.4

In our example in Figure 6.21 add code to allow the server to be able to use as many descriptors as currently allowed by the kernel. (Hint: Look at the setrlimit function.)

Use the getrlimit function to fetch the values for the RLIMIT_NOFILE resource and then call setrlimit to set the current soft limit (rlim_cur) to the hard limit (rlim_max).
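A minimal sketch of that answer follows. The function name raise_nofile_limit and its out-parameter are hypothetical; getrlimit, setrlimit, and RLIMIT_NOFILE are the standard interfaces.

```c
#include <sys/resource.h>

/* Raises the soft limit on open descriptors to the hard limit.
 * On success stores the new soft limit in *soft and returns 0;
 * returns -1 if either call fails. */
int raise_nofile_limit(rlim_t *soft)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) < 0)
        return -1;
    rl.rlim_cur = rl.rlim_max;           /* soft limit := hard limit */
    if (setrlimit(RLIMIT_NOFILE, &rl) < 0)
        return -1;
    *soft = rl.rlim_cur;
    return 0;
}
```

The server would then size its pollfd array from the returned value (for example with calloc) instead of a compile-time constant. Some systems may reject a soft limit of RLIM_INFINITY for this resource, so the -1 return should be handled rather than assumed away.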

Exercises 6.5

Let’s see what happens when the second argument to shutdown is SHUT_RD. Start with the TCP client in Figure 5.4 and make the following changes: Change the port number from SERV_PORT to 19, the chargen server (Figure 2.18); then, replace the call to str_cli with a call to the pause function. Run this program specifying the IP address of a local host that runs the chargen server. Watch the packets with a tool such as tcpdump (Section C.5). What happens?

The server application continually sends data to the client, which the client TCP acknowledges and throws away.

Exercises 6.6

Why would an application call shutdown with an argument of SHUT_RDWR instead of just calling close?

shutdown with SHUT_WR or SHUT_RDWR always sends a FIN, while close sends a FIN only if the descriptor reference count is 1 when close is called.

Exercises 6.7

What happens in Figure 6.22 when the client sends an RST to terminate the connection?

read returns an error, and our Read wrapper function terminates the server. Servers must be more robust than this. Notice that we handle this condition in Figure 6.26, although even that code is inadequate. Consider what happens if connectivity is lost between the client and server and one of the server’s responses eventually times out. The error returned could be ETIMEDOUT.
In general, a server should not abort for errors like these. It should log the error, close the socket, and continue servicing other clients. Aborting on an error of this type is unacceptable in a server such as this one, where one process handles all clients. But if the server were a child handling just one client, having that one child abort would not affect the parent (which we assume handles all new connections and spawns the children) or any of the other children that are servicing other clients.
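A read wrapper that follows this advice, instead of terminating the process, might look like the sketch below. The name read_or_drop is hypothetical, and logging to stderr is only a placeholder for whatever logging the server uses.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Reads from *fdp, retrying if interrupted by a signal.
 * On an error the error is logged; on an error or EOF the descriptor
 * is closed and *fdp is set to -1. The process is never terminated.
 * Returns the byte count, 0 on EOF, or -1 on error. */
ssize_t read_or_drop(int *fdp, void *buf, size_t len)
{
    ssize_t n;

    do {
        n = read(*fdp, buf, len);
    } while (n < 0 && errno == EINTR);

    if (n < 0)   /* e.g. ECONNRESET or ETIMEDOUT on a TCP socket */
        fprintf(stderr, "read error on fd %d: %s\n", *fdp, strerror(errno));
    if (n <= 0) {
        close(*fdp);
        *fdp = -1;   /* the caller's table entry becomes unused */
    }
    return n;
}
```

Passing the address of a client table entry lets the wrapper mark the slot unused in the same step, so the main loop keeps serving the remaining clients.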
