April 27th, Friday


Raw Socket Output

  Output on a raw socket is governed by the following rules:

  Normal output is performed by calling sendto or sendmsg and specifying the destination IP address. write, writev,
or send can also be called if the socket has been connected.

If the IP_HDRINCL option is not set, the starting address of the data for the kernel to send specifies the first byte
following the IP header because the kernel will build the IP header and prepend it to the data from the process. The
kernel sets the protocol field of the IPv4 header that it builds to the third argument from the call to socket.

If the IP_HDRINCL option is set, the starting address of the data for the kernel to send specifies the first byte of
the IP header. The amount of data to write must include the size of the caller's IP header. The process builds the
entire IP header, except: (i) the IPv4 identification field can be set to 0, which tells the kernel to set this value;
(ii) the kernel always calculates and stores the IPv4 header checksum; and (iii) IP options may or may not be included.

The kernel fragments raw packets that exceed the outgoing interface MTU.
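
The IP_HDRINCL rule above can be illustrated with a minimal sketch that fills in a 20-byte IPv4 header by hand. The helper name build_ipv4_header, the TTL of 64, and the byte-buffer layout are assumptions made for illustration. The identification and checksum fields are left at 0 so that the kernel supplies them. The total length is written in network byte order, which is an assumption that matches Linux; some Berkeley-derived kernels have historically expected the length and fragment offset fields in host byte order.

```c
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IPV4_HDR_LEN 20   /* header without options */

/*
 * Hypothetical helper: write a minimal IPv4 header (no options) at buf.
 * src and dst are IPv4 addresses already in network byte order.
 * Returns the header length, or 0 if the arguments are unusable.
 */
size_t build_ipv4_header(uint8_t *buf, size_t buflen, uint8_t protocol,
                         uint32_t src, uint32_t dst, uint16_t payload_len)
{
    uint16_t total;

    if (buf == NULL || buflen < IPV4_HDR_LEN ||
        payload_len > 65535 - IPV4_HDR_LEN)
        return 0;

    total = htons((uint16_t)(IPV4_HDR_LEN + payload_len));

    memset(buf, 0, IPV4_HDR_LEN);
    buf[0] = 0x45;                 /* version 4, header length 5 words */
    memcpy(buf + 2, &total, 2);    /* total length (assumed network order) */
    /* bytes 4-5: identification left 0 so the kernel chooses it */
    buf[8] = 64;                   /* TTL, an arbitrary choice */
    buf[9] = protocol;             /* for example IPPROTO_UDP */
    /* bytes 10-11: header checksum left 0; the kernel stores it */
    memcpy(buf + 12, &src, 4);     /* source address */
    memcpy(buf + 16, &dst, 4);     /* destination address */
    return IPV4_HDR_LEN;
}
```

With IP_HDRINCL enabled on the socket (setsockopt at level IPPROTO_IP), the buffer handed to sendto would start at the first byte written here, and the length would be the header length plus the payload length.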


IPv6 Differences

There are a few differences with raw IPv6 sockets (RFC 3542 [Stevens et al. 2003]):

All fields in the protocol headers sent or received on a raw IPv6 socket are in network byte order.

There is nothing similar to the IPv4 IP_HDRINCL socket option with IPv6. Complete IPv6 packets (including the IPv6
header or extension headers) cannot be read or written on an IPv6 raw socket. Almost all fields in an IPv6 header and
all extension headers are available to the application through socket options or ancillary data (see Exercise 28.1).
Should an application need to read or write complete IPv6 datagrams, datalink access must be used.

Checksums on raw IPv6 sockets are handled differently, as will be described shortly.

IPV6_CHECKSUM Socket Option

For an ICMPv6 raw socket, the kernel always calculates and stores the checksum in the ICMPv6 header. This differs from
an ICMPv4 raw socket, where the application must do this itself (compare Figures 28.14 and 28.16). While ICMPv4 and
ICMPv6 both require the sender to calculate the checksum, ICMPv6 includes a pseudoheader in its checksum. One of the
fields in this pseudoheader is the source IPv6 address, and normally the application lets the kernel choose this value.
To prevent the application from having to try to choose this address just to calculate the checksum, it is easier to
let the kernel calculate the checksum.
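
For the ICMPv4 case, where the application computes the checksum itself, the calculation is the standard 16-bit one's-complement Internet checksum. The following is a minimal sketch, not the routine from the figures cited above. The function name inet_checksum is an assumption, and it reads the data as big-endian 16-bit words so that the result does not depend on the host byte order. It is intended for packet-sized inputs (at most 65535 bytes).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Sketch of the Internet checksum over len bytes.
 * The checksum field inside data must be 0 while summing.
 * The return value is a host-order number; store its high byte first
 * (or pass it through htons) when writing it into the header.
 */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    /* add up the data as big-endian 16-bit words */
    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }

    /* an odd trailing byte is padded with a zero byte on the right */
    if (len == 1)
        sum += (uint32_t)data[0] << 8;

    /* fold the carries back into the low 16 bits */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

A useful property for checking the result: once the checksum has been stored in the header, running the same function over the whole message again yields 0.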

For other raw IPv6 sockets (i.e., those created with a third argument to socket other than IPPROTO_ICMPV6), a socket
option tells the kernel whether to calculate and store a checksum in outgoing packets and verify the checksum in
received packets. By default, this option is disabled, and it is enabled by setting the option value to a nonnegative
value, as in

int offset = 2;

if (setsockopt(sockfd, IPPROTO_IPV6, IPV6_CHECKSUM,
               &offset, sizeof(offset)) < 0) {
    perror("setsockopt IPV6_CHECKSUM");   /* handle the error */
}

This not only enables checksums on this socket but also tells the kernel the byte offset of the 16-bit checksum:
2 bytes from the start of the application data in this example. To disable the option, it must be set to -1.
When enabled, the kernel will calculate and store the checksum for outgoing packets sent on the socket and also
verify the checksums for packets received on the socket.


Raw Socket Input

The first question that we must answer regarding raw socket input is: Which received IP datagrams does the kernel
pass to raw sockets? The following rules apply:

Received UDP packets and received TCP packets are never passed to a raw socket. If a process wants to read IP
datagrams containing UDP or TCP packets, the packets must be read at the datalink layer.

Most ICMP packets are passed to a raw socket after the kernel has finished processing the ICMP message.
Berkeley-derived implementations pass all received ICMP packets to a raw socket other than echo request,
timestamp request, and address mask request (pp. 302-303 of TCPv2). These three ICMP messages are processed
entirely by the kernel.

All IGMP packets are passed to a raw socket after the kernel has finished processing the IGMP message.

All IP datagrams with a protocol field that the kernel does not understand are passed to a raw socket.
The only kernel processing done on these packets is the minimal verification of some IP header fields:
the IP version, IPv4 header checksum, header length, and destination IP address.

If the datagram arrives in fragments, nothing is passed to a raw socket until all fragments have arrived and
have been reassembled.

When the kernel has an IP datagram to pass to the raw sockets, all raw sockets for all processes are examined,
looking for all matching sockets. A copy of the IP datagram is delivered to each matching socket. The following
tests are performed for each raw socket and only if all three tests are true is the datagram delivered to the
socket:

If a nonzero protocol is specified when the raw socket is created (the third argument to socket), then the
received datagram's protocol field must match this value or the datagram is not delivered to this socket.

If a local IP address is bound to the raw socket by bind, then the destination IP address of the received
datagram must match this bound address or the datagram is not delivered to this socket.

If a foreign IP address was specified for the raw socket by connect, then the source IP address of the received
datagram must match this connected address or the datagram is not delivered to this socket.

Notice that if a raw socket is created with a protocol of 0, and neither bind nor connect is called, then
that socket receives a copy of every raw datagram the kernel passes to raw sockets.
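
The three tests can be sketched as a small predicate. This is only a model of the rules as stated above, not kernel code; the structure names, field names, and the function raw_socket_matches are all hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of what the kernel remembers about a raw socket. */
struct raw_sock_model {
    int      protocol;      /* third argument to socket; 0 is a wildcard */
    int      has_local;     /* nonzero if bind was called */
    uint32_t local_addr;    /* address given to bind */
    int      has_foreign;   /* nonzero if connect was called */
    uint32_t foreign_addr;  /* address given to connect */
};

/* Fields of a received datagram that take part in the tests. */
struct dgram_model {
    int      protocol;      /* protocol field of the IP header */
    uint32_t src_addr;
    uint32_t dst_addr;
};

/* Returns 1 if a copy of the datagram would be delivered, 0 otherwise. */
int raw_socket_matches(const struct raw_sock_model *s,
                       const struct dgram_model *d)
{
    /* test 1: a nonzero protocol must match the datagram's protocol */
    if (s->protocol != 0 && s->protocol != d->protocol)
        return 0;

    /* test 2: a bound local address must match the destination */
    if (s->has_local && s->local_addr != d->dst_addr)
        return 0;

    /* test 3: a connected foreign address must match the source */
    if (s->has_foreign && s->foreign_addr != d->src_addr)
        return 0;

    return 1;
}
```

A socket modeled with a protocol of 0 and neither address set passes all three tests for every datagram, which is the wildcard case noted above.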

Whenever a received datagram is passed to a raw IPv4 socket, the entire datagram, including the IP header,
is passed to the process. For a raw IPv6 socket, only the payload is passed to the socket.
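
Because a raw IPv4 socket returns the IP header along with the data, a reader normally has to skip the header before looking at the payload. A minimal sketch follows, assuming the received bytes are in a plain buffer; the helper name ipv4_payload_offset is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: given a datagram read from a raw IPv4 socket,
 * return the offset of the first payload byte, or -1 if the buffer
 * cannot hold a valid IPv4 header.
 */
int ipv4_payload_offset(const uint8_t *buf, size_t len)
{
    size_t hlen;

    if (buf == NULL || len < 20)       /* shorter than a minimal header */
        return -1;
    if ((buf[0] >> 4) != 4)            /* high nibble is the IP version */
        return -1;

    /* low nibble is the header length in 32-bit words (options included) */
    hlen = (size_t)(buf[0] & 0x0f) * 4;
    if (hlen < 20 || hlen > len)
        return -1;

    return (int)hlen;
}
```

For a raw IPv6 socket no such step is needed, since the buffer already starts at the payload.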


ICMPv6 Type Filtering

A raw ICMPv4 socket receives most ICMPv4 messages received by the kernel. But ICMPv6 is a superset of ICMPv4,
including the functionality of ARP and IGMP. Therefore, a raw ICMPv6 socket can potentially receive many more
packets compared to a raw ICMPv4 socket. But most applications using a raw socket are interested in only a small
subset of all ICMP messages.

To reduce the number of packets passed from the kernel to the application across a raw ICMPv6 socket, an
application-specified filter is provided. A filter is declared with a datatype of struct icmp6_filter, which is
defined by including <netinet/icmp6.h>. The current filter for a raw ICMPv6 socket is set and fetched using
setsockopt and getsockopt with a level of IPPROTO_ICMPV6 and an optname of ICMP6_FILTER.

#include <netinet/icmp6.h>
 
void ICMP6_FILTER_SETPASSALL (struct icmp6_filter *filt);
 
void ICMP6_FILTER_SETBLOCKALL (struct icmp6_filter *filt);
 
void ICMP6_FILTER_SETPASS (int msgtype, struct icmp6_filter *filt);
 
void ICMP6_FILTER_SETBLOCK (int msgtype, struct icmp6_filter *filt);
 
int ICMP6_FILTER_WILLPASS (int msgtype, const struct icmp6_filter *filt);
 
int ICMP6_FILTER_WILLBLOCK (int msgtype, const struct icmp6_filter *filt);
 
 /* Both return: 1 if filter will pass (block) message type, 0 otherwise */

The filt argument to all the macros is a pointer to an icmp6_filter variable that is modified by the first four
macros and examined by the final two macros. The msgtype argument is a value between 0 and 255 and specifies the
ICMP message type.

The SETPASSALL macro specifies that all message types are to be passed to the application, while the SETBLOCKALL
macro specifies that no message types are to be passed. By default, when an ICMPv6 raw socket is created, all
ICMPv6 message types are passed to the application.

The SETPASS macro enables one specific message type to be passed to the application while the SETBLOCK macro
blocks one specific message type. The WILLPASS macro returns 1 if the specified message type is passed by the
filter, or 0 otherwise; the WILLBLOCK macro returns 1 if the specified message type is blocked by the filter,
or 0 otherwise.
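
Since the macros operate only on the icmp6_filter variable, a filter can be built and inspected without any socket. The sketch below assumes a system that provides <netinet/icmp6.h> with these macros; the two function names are hypothetical.

```c
#include <string.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <netinet/icmp6.h>

/*
 * Hypothetical helper: build a filter that passes only echo replies
 * and router advertisements.
 */
void make_reply_and_ra_filter(struct icmp6_filter *filt)
{
    ICMP6_FILTER_SETBLOCKALL(filt);                 /* start from "block everything" */
    ICMP6_FILTER_SETPASS(ICMP6_ECHO_REPLY, filt);   /* then open two types */
    ICMP6_FILTER_SETPASS(ND_ROUTER_ADVERT, filt);
}

/* Hypothetical helper: count how many of the 256 message types pass. */
int count_passed_types(const struct icmp6_filter *filt)
{
    int type, n = 0;

    for (type = 0; type <= 255; type++)
        if (ICMP6_FILTER_WILLPASS(type, filt))
            n++;
    return n;
}
```

Checking a filter this way before handing it to setsockopt is one option for catching a mistake such as calling SETPASS before SETBLOCKALL, which would discard the earlier setting.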

As an example, consider the following application, which wants to receive only ICMPv6 router advertisements:

struct icmp6_filter myfilt;
int fd;

fd = socket (AF_INET6, SOCK_RAW, IPPROTO_ICMPV6);

ICMP6_FILTER_SETBLOCKALL (&myfilt);
ICMP6_FILTER_SETPASS (ND_ROUTER_ADVERT, &myfilt);
Setsockopt (fd, IPPROTO_ICMPV6, ICMP6_FILTER, &myfilt, sizeof (myfilt));

We first block all message types (since the default is to pass all message types) and then pass only router
advertisements. Despite our use of the filter, the application must be prepared to receive all types of ICMPv6
packets since any ICMPv6 packets that arrive between the calls to socket and setsockopt will be added to the receive
queue. The ICMP6_FILTER option is simply an optimization.
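
Because the filter is only an optimization, the receive path still needs its own check of the message type. A minimal sketch of such a check follows, assuming the buffer holds what was read from the raw ICMPv6 socket, which starts at the ICMPv6 header. The helper name is_router_advert is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <netinet/icmp6.h>

/*
 * Hypothetical helper: returns 1 if buf looks like an ICMPv6 router
 * advertisement, 0 for anything else (including truncated input).
 */
int is_router_advert(const uint8_t *buf, size_t len)
{
    /* a router advertisement is at least its fixed-size header */
    if (buf == NULL || len < sizeof(struct nd_router_advert))
        return 0;

    /* the first byte of the ICMPv6 header is the message type */
    return buf[0] == ND_ROUTER_ADVERT;
}
```

A receive loop would call this on each message and silently discard anything for which it returns 0.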