High Load And Linux Server. Part 1. Router And NAT-server


http://www.erazer.org/high-load-and-linux-server-part-1-router-and-nat-server/


I have worked for various ISPs for more than 10 years. At the moment we operate an urban network with more than 30,000 subscribers, so I think I have something to share with readers, and what is set out here may well be useful to someone. This is only the first part of a series of articles on heavily loaded Linux servers for various purposes.


Router

On small traffic flows tuning does not play a big role. However, if you have a large network under heavy load, you simply have to tune the network stack.

First of all, if you have Gigabit Ethernet interfaces in your network, it makes sense to pay attention to the MTU on your servers and switches. In a nutshell, the MTU (Maximum Transmission Unit) is the largest packet that can be transmitted over the network without fragmentation, i.e. how much data one router can transfer to another in a single frame. When the volume of network traffic grows significantly, it is much more efficient to transmit fewer, larger packets than to send small packets more often.

Increasing the MTU on Linux

/sbin/ifconfig eth0 mtu 9000
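
A quick sanity check that jumbo frames actually pass end-to-end is to ping the neighbour with the don't-fragment bit set. A minimal sketch, assuming an MTU of 9000 and using 10.0.0.2 as a placeholder address for the neighbouring router:

```shell
# ICMP payload = MTU minus 20-byte IP header and 8-byte ICMP header
mtu=9000
payload=$(( mtu - 28 ))
echo "$payload"    # prints 8972
# verify end-to-end with fragmentation forbidden (-M do):
#   ping -M do -s "$payload" 10.0.0.2
```

If the ping fails with "message too long", some hop on the path is still running with the default 1500-byte MTU.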

Increasing the MTU on switches

On switching equipment this feature is usually called jumbo frames. In particular, on a Cisco Catalyst 3750:

3750(config)# system mtu jumbo 9000
3750(config)# exit
3750# reload

Note that the switch must be reloaded afterwards. By the way, the mtu jumbo command affects only gigabit links; 100 Mbit links are not affected.

Increasing the transmit queue on Linux

/sbin/ifconfig eth0 txqueuelen 10000

The default value is 1000, but for gigabit links 10000 is recommended. In a nutshell, this is the length of the interface's transmit queue, measured in packets: outgoing packets wait in this queue until the driver can put them on the wire.
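
On modern distributions ifconfig is considered legacy; both of the above settings can also be applied with iproute2. A sketch, assuming the interface is named eth0:

```shell
# set MTU and transmit queue length with iproute2 (requires root; eth0 assumed)
ip link set dev eth0 mtu 9000
ip link set dev eth0 txqueuelen 10000
# verify both values:
ip link show dev eth0
```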

Keep in mind that if you change the MTU on the interface of some network node, you have to do the same on the interface of its neighbor. So, if you increase the MTU to 9000 on the Linux router's interface, you must enable jumbo frames on the switch port the router is plugged into. Otherwise the network will work, but very badly: packets going through the network will be fragmented along the way.

Results

As a result of all these changes you will notice that ping times increase, but overall throughput grows significantly and the load on the active equipment decreases.

NAT server

NAT (Network Address Translation) is one of the most expensive (that is, resource-intensive) operations. Therefore, if you have a large network, you should pay attention to tuning the NAT server.

Increasing the number of tracked connections

To do its job, the NAT server has to "remember" every connection that passes through it. Whether it is a ping or someone's ICQ session, the NAT server records and tracks all of these sessions in a special table in memory. When a session closes, its entry is deleted from the connection tracking table. The size of this table is fixed, which is why, if there is a lot of traffic through the server but the table is too small, the NAT server starts to drop packets and simply break sessions. To avoid such horrors, you need to increase the size of the connection tracking table appropriately, in accordance with the traffic passing through the NAT:

/sbin/sysctl -w net.netfilter.nf_conntrack_max=524288

Such a large value is not recommended if your NAT server has less than 1 gigabyte of RAM. To show the current value, you can use something like this:

/sbin/sysctl net.netfilter.nf_conntrack_max

To see how full the connection tracking table currently is:

/sbin/sysctl net.netfilter.nf_conntrack_count
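
With both values at hand, it is easy to keep an eye on how close the table is to overflowing. A small sketch (usage_pct is a hypothetical helper, not a standard tool):

```shell
# usage_pct COUNT MAX -> table usage as an integer percentage
usage_pct() {
    echo $(( 100 * $1 / $2 ))
}

# on a live NAT server (nf_conntrack module loaded) one would call:
#   usage_pct "$(/sbin/sysctl -n net.netfilter.nf_conntrack_count)" \
#             "$(/sbin/sysctl -n net.netfilter.nf_conntrack_max)"
usage_pct 131072 524288    # prints 25
```

Anything that stays near 100% is a sign that nf_conntrack_max needs to be raised or the timeouts lowered (see below for the latter).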

Increasing the size of the hash table

The hash table that stores the lists of conntrack entries should be increased proportionally.

echo 65536 > /sys/module/nf_conntrack/parameters/hashsize

The rule is simple: hashsize = nf_conntrack_max / 8
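
The same rule can be spelled out in shell arithmetic, which also makes it easy to keep the two values in sync whenever nf_conntrack_max changes:

```shell
# rule of thumb: hashsize = nf_conntrack_max / 8
conntrack_max=524288
hashsize=$(( conntrack_max / 8 ))
echo "$hashsize"    # prints 65536
# apply it (requires root and the nf_conntrack module loaded):
#   echo "$hashsize" > /sys/module/nf_conntrack/parameters/hashsize
```

Note that a hashsize written this way is lost on reboot; to make it permanent it can be set as a module option (e.g. in a file under /etc/modprobe.d/).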

Decreasing timeout values

The NAT server only tracks "live" sessions that pass through it. When a session is closed, its entry is removed so that the connection tracking table does not overflow. Entries are also removed by timeout: if a session sees no traffic for a long time, it expires and its entry is simply removed from the connection tracking table.

However, the default timeout values are quite large. Therefore, with large traffic flows, even if you stretch nf_conntrack_max to the limit, you still risk quickly overflowing the table and breaking connections. To prevent this, you must set the connection tracking timeouts on the NAT server correctly. The current values can be seen, for example, like this:

sysctl -a | grep conntrack | grep timeout

As a result, you’ll see something like this:

net.netfilter.nf_conntrack_generic_timeout = 600
net.netfilter.nf_conntrack_tcp_timeout_syn_sent = 120
net.netfilter.nf_conntrack_tcp_timeout_syn_recv = 60
net.netfilter.nf_conntrack_tcp_timeout_established = 432000
net.netfilter.nf_conntrack_tcp_timeout_fin_wait = 120
net.netfilter.nf_conntrack_tcp_timeout_close_wait = 60
net.netfilter.nf_conntrack_tcp_timeout_last_ack = 30
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 120
net.netfilter.nf_conntrack_tcp_timeout_close = 10
net.netfilter.nf_conntrack_tcp_timeout_max_retrans = 300
net.netfilter.nf_conntrack_tcp_timeout_unacknowledged = 300
net.netfilter.nf_conntrack_udp_timeout = 30
net.netfilter.nf_conntrack_udp_timeout_stream = 180
net.netfilter.nf_conntrack_icmp_timeout = 30
net.netfilter.nf_conntrack_events_retry_timeout = 15

These are the timeout values in seconds. As you can see, net.netfilter.nf_conntrack_generic_timeout is 600 (10 minutes). That is, the NAT server keeps an entry for a session as long as at least one packet passes through it every 10 minutes.

At first glance that seems fine, but in fact it is very, very bad. If you look at net.netfilter.nf_conntrack_tcp_timeout_established, you will see the value 432000. In other words, your NAT server will keep an idle TCP session in its table as long as a single packet passes through it at least once every 5 days (!).

Put even more simply, such a NAT server is easy to DDoS: its connection tracking table (nf_conntrack_max) overflows under a simple flood, so it starts breaking connections and in the worst case quickly turns into a black hole.

It is recommended to set the timeouts within 30-120 seconds. This is quite sufficient for normal users, and it is quite sufficient for timely clearing of the NAT table, which prevents it from overflowing. And do not forget to add the corresponding changes to /etc/rc.local and /etc/sysctl.conf.
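
As a sketch of what such a configuration might look like (the exact numbers are illustrative, chosen from the 30-120 second range recommended above):

```shell
# shorten the most important conntrack timeouts (requires root)
/sbin/sysctl -w net.netfilter.nf_conntrack_generic_timeout=60
/sbin/sysctl -w net.netfilter.nf_conntrack_tcp_timeout_established=120
/sbin/sysctl -w net.netfilter.nf_conntrack_udp_timeout=30
# to survive a reboot, put the same "key = value" pairs into
# /etc/sysctl.conf and reload them with: /sbin/sysctl -p
```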

Results

After this tuning you will get a viable and productive NAT server. Of course, this is only the basic tuning; we have not touched, for example, kernel tuning and similar things. However, in most cases even such simple actions are sufficient for the normal operation of a fairly large network. As I said earlier, our network has more than 30 thousand subscribers, whose traffic is handled by 4 NAT servers.

In the following articles:

  • large flows and a high-performance shaper;
  • large flows and a high-performance firewall.

