Switching Performance – Connecting Linux Network Namespaces


In a previous article on opencloudblog (http://www.opencloudblog.com) I showed several solutions to interconnect Linux network namespaces. Four different solutions can be used – but which one is the best with respect to performance and resource usage? This is quite interesting when you are running Openstack Neutron networking together with the Openvswitch. The tests were run on two software setups: Ubuntu 14.04 with kernel 3.13 and Openvswitch 2.0.1, and Ubuntu 13.04 with kernel 3.8.0.30 and Openvswitch 1.9. The MTU of the IP interfaces is kept at the default value of 1500.

A test script has been used to collect the performance numbers. The script creates two network namespaces, connects them using the different Linux network technologies and measures throughput and CPU usage with iperf. The iperf server process is started in one namespace, the iperf client process in the other namespace.
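A minimal sketch of such a test run for the "one veth pair" case could look as follows (namespace and interface names are invented for illustration, the original test script is not reproduced here):

    # create two network namespaces
    ip netns add ns1
    ip netns add ns2

    # connect them directly with one veth pair
    ip link add veth-ns1 type veth peer name veth-ns2
    ip link set veth-ns1 netns ns1
    ip link set veth-ns2 netns ns2
    ip netns exec ns1 ip addr add 10.0.0.1/24 dev veth-ns1
    ip netns exec ns2 ip addr add 10.0.0.2/24 dev veth-ns2
    ip netns exec ns1 ip link set veth-ns1 up
    ip netns exec ns2 ip link set veth-ns2 up

    # iperf server in ns1, iperf client with e.g. 4 threads in ns2
    ip netns exec ns1 iperf -s &
    ip netns exec ns2 iperf -c 10.0.0.1 -P 4 -t 30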

IMPORTANT REMARK: The results are confusing – a deeper analysis shows that the configuration of the virtual network devices has a major impact on the performance. The offload settings TSO and GSO play a very important role when using virtual network devices. I'll show an analysis in an upcoming article.

[Figure: Perf test setup]

Single CPU (i5-2500 [3.3 GHz]) Kernel 3.13

The test system has a Desktop i5-2500 CPU @ 3.30 GHz providing 4 CPU cores and 32 GByte of DDR3-10600 RAM providing around 160 GBit/s of memory throughput.

The test system is running Ubuntu 14.04 with kernel 3.13.0.24 and Openvswitch 2.0.1.

The results are shown in the table below. iperf has been run with one, two, four, eight and sixteen threads. In the end the limiting factor is CPU usage. The column "Efficiency" is defined as the network throughput in GBit/s per Gigahertz available on the CPUs.

    Switch and connection type               | iperf threads | Throughput [GBit/s]   | Efficiency           | Throughput [GBit/s]
                                             |               | (tso/gso/lro/gro on)  | [GBit/s per CPU GHz] | (tso/gso/lro/gro off)
    -----------------------------------------+---------------+-----------------------+----------------------+----------------------
    one veth pair                            |       1       | 37.8                  | 6.3                  |  3.7
    one veth pair                            |       2       | 65.0                  | 5.4                  |  7.9
    one veth pair                            |       4       | 54.6                  | 4.4                  | 11.0
    one veth pair                            |       8       | 40.7                  | 3.2                  | 11.4
    one veth pair                            |      16       | 37.4                  | 2.9                  | 11.7
    linuxbridge with two veth pairs          |       1       | 33.3                  | 5.5                  |  2.7
    linuxbridge with two veth pairs          |       2       | 54.3                  | 4.4                  |  5.6
    linuxbridge with two veth pairs          |       4       | 43.9                  | 3.4                  |  6.9
    linuxbridge with two veth pairs          |       8       | 32.1                  | 2.5                  |  7.9
    linuxbridge with two veth pairs          |      16       | 34.0                  | 2.6                  |  7.9
    openvswitch with two veth pairs          |       1       | 35.0                  | 5.9                  |  3.2
    openvswitch with two veth pairs          |       2       | 51.5                  | 4.2                  |  6.7
    openvswitch with two veth pairs          |       4       | 47.3                  | 3.8                  |  8.7
    openvswitch with two veth pairs          |       8       | 36.0                  | 2.8                  |  7.5
    openvswitch with two veth pairs          |      16       | 36.5                  | 2.8                  |  9.4
    openvswitch with two internal ovs ports  |       1       | 37.0                  | 6.2                  |  3.3
    openvswitch with two internal ovs ports  |       2       | 65.6                  | 5.5                  |  6.4
    openvswitch with two internal ovs ports  |       4       | 74.3                  | 5.7                  |  6.3
    openvswitch with two internal ovs ports  |       8       | 74.3                  | 5.7                  | 10.9
    openvswitch with two internal ovs ports  |      16       | 73.4                  | 5.6                  | 12.6

The numbers show a drastic effect when TSO, GSO, LRO and GRO are switched off. TSO performs the TCP segmentation at the NIC level. In this software-only environment without a physical NIC no segmentation is done, and the large packets (on average 40 kBytes) are handed to the other side as one packet. The limiting factor is the packet rate.
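The offload settings can be inspected and toggled per device with ethtool; a hedged example for one end of the veth pair from the sketch above (LRO is usually fixed to off on veth devices):

    # show the current offload settings inside the namespace
    ip netns exec ns1 ethtool -k veth-ns1 | grep -E 'segmentation|receive-offload'

    # switch TSO, GSO and GRO off for a test run
    ip netns exec ns1 ethtool -K veth-ns1 tso off gso off gro off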

Single CPU (i5-2500 [3.3 GHz]) Kernel 3.8

The test system is running Ubuntu 13.04 with kernel 3.8.0.30 and Openvswitch 1.9.

The results are shown in the table below. iperf has been run with one, two, four, eight and sixteen threads. In the end the limiting factor is CPU usage. The column "Efficiency" is defined as the network throughput in GBit/s per Gigahertz available on the CPUs.

    Switch and connection type               | iperf threads | Throughput [GBit/s] | Efficiency [GBit/s per CPU GHz]
    -----------------------------------------+---------------+---------------------+--------------------------------
    one veth pair                            |       1       |  7.4                | 1.21
    one veth pair                            |       2       | 13.5                | 1.15
    one veth pair                            |       4       | 14.2                | 1.14
    one veth pair                            |       8       | 15.3                | 1.17
    one veth pair                            |      16       | 14.0                | 1.06
    linuxbridge with two veth pairs          |       1       |  3.9                | 0.62
    linuxbridge with two veth pairs          |       2       |  8.5                | 0.70
    linuxbridge with two veth pairs          |       4       |  8.8                | 0.69
    linuxbridge with two veth pairs          |       8       |  9.5                | 0.72
    linuxbridge with two veth pairs          |      16       |  9.1                | 0.69
    openvswitch with two veth pairs          |       1       |  4.5                | 0.80
    openvswitch with two veth pairs          |       2       |  9.7                | 0.82
    openvswitch with two veth pairs          |       4       | 10.7                | 0.85
    openvswitch with two veth pairs          |       8       | 11.3                | 0.86
    openvswitch with two veth pairs          |      16       | 10.7                | 0.81
    openvswitch with two internal ovs ports  |       1       | 41.9                | 6.91
    openvswitch with two internal ovs ports  |       2       | 69.1                | 5.63
    openvswitch with two internal ovs ports  |       4       | 75.5                | 5.74
    openvswitch with two internal ovs ports  |       8       | 67.0                | 5.08
    openvswitch with two internal ovs ports  |      16       | 74.3                | 5.63

The results show huge differences. The openvswitch using two internal openvswitch ports has the best throughput and the best efficiency.

The short summary is:

  • Use Openvswitch and Openvswitch internal ports (see the sketch after this list) – in the case of one iperf thread you get 6.9 GBit/s of throughput per CPU GHz. But this solution does not allow applying iptables rules on the link.
  • If you like the old linuxbridge and veth pairs, you get only 0.7 GBit/s of throughput per CPU GHz. With this solution it is possible to filter the traffic on the network namespace links.
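For comparison, a hedged sketch of the Openvswitch internal port variant (bridge and port names are invented for illustration):

    # create an ovs bridge with two internal ports
    ovs-vsctl add-br br-test
    ovs-vsctl add-port br-test port-ns1 -- set interface port-ns1 type=internal
    ovs-vsctl add-port br-test port-ns2 -- set interface port-ns2 type=internal

    # move the internal ports into the namespaces and configure them
    ip link set port-ns1 netns ns1
    ip link set port-ns2 netns ns2
    ip netns exec ns1 ip addr add 10.0.0.1/24 dev port-ns1
    ip netns exec ns2 ip addr add 10.0.0.2/24 dev port-ns2
    ip netns exec ns1 ip link set port-ns1 up
    ip netns exec ns2 ip link set port-ns2 up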

The table shows some interesting effects, e.g.:

  • The test with the ovs and two internal ovs ports shows a drop in performance between 4 and 16 threads. The CPU analysis shows that in the case of 8 threads the CPU time used by softirqs doubles compared to the case of 4 threads. The softirq time used by 16 threads is the same as for 4 threads.
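The softirq share of the CPU time can be watched during a test run, e.g. with mpstat from the sysstat package (the %soft column shows the softirq load per core):

    # print one sample per second for all CPUs while iperf is running
    mpstat -P ALL 1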

Openstack

If you are running Openstack Neutron, you should use the Openvswitch and avoid linuxbridges. When connecting the Neutron Router/LBaaS/DHCP namespaces, DO NOT enable ovs_use_veth.
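For the Neutron agents this corresponds roughly to the following settings (a hedged excerpt; option names should be checked against the Neutron release in use):

    # /etc/neutron/l3_agent.ini and /etc/neutron/dhcp_agent.ini
    [DEFAULT]
    interface_driver = neutron.agent.linux.interface.OVSInterfaceDriver
    ovs_use_veth = False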
