Facebook Open-Sources the WDT Project

Source: Internet  Editor: 程序博客网  Time: 2024/05/27 18:17


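As a world-famous social networking site, Facebook has long had close ties to open source. It has now launched more than 200 open-source projects. These projects help developers at large and have also brought Facebook substantial returns of its own. Recently Facebook announced one more: Warp speed Data Transfer (WDT), a very high-speed data transfer tool. This article gives a brief introduction to the WDT project.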

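Facebook has grown rapidly since it went live in 2004. Today it has more than 2 billion monthly active users, and on Halloween the number of photos it receives in a single day exceeds 2 billion. To meet this demand, the company has built data centers in several locations, including Oregon and North Carolina in the United States. Moving data efficiently between hosts inside a data center, and between data centers, therefore became a pressing problem, and Facebook started the WDT project to solve it.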

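WDT can be used either as an embeddable library or as a command-line tool. Its goal is to transfer files between two systems as efficiently as possible over multiple TCP paths while keeping resource consumption (CPU, memory, and so on) to a minimum. To keep the code portable, Facebook minimized WDT's dependencies, which also cuts compile time considerably and keeps the project lightweight. In addition, WDT does not use exceptions, for the sake of transfer efficiency and ease of integration.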

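For its transfer mechanism, WDT uses blocking, thread-based I/O, so that at any point in time there are threads both reading and writing. Data is therefore buffered along the path in both directions, and every subsystem stays busy with a minimum of kernel/user-space switches. Keeping both directions busy in this way is what maximizes system throughput.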

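The WDT code is now hosted on GitHub. It includes a small command-line tool, wcp.sh, for testing transfer performance. According to Facebook, when transferring RocksDB snapshots between its internal systems, WDT delivers up to 600 MB/s over long-distance, high-latency links. That is roughly 3 times the rate of the previous, highly optimized HTTP-based solution, with lower system resource consumption. Without throttling, WDT can easily saturate a 40 Gb/s NIC and reach close to the theoretical link speed (above 4 GB/s).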
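The 40 Gb/s and 4 GB/s figures use different units, so a quick conversion helps put them side by side. The sketch below assumes decimal units and 8 bits per byte; the function names are made up for this illustration.

```shell
# Convert a link speed in Gbit/s to Gbyte/s (8 bits per byte, decimal units assumed).
gbit_to_gbyte() {
    awk -v g="$1" 'BEGIN { printf "%.1f\n", g / 8 }'
}

# Express an achieved rate ($1, Gbyte/s) as a percentage of a link rate ($2, Gbit/s).
percent_of_line_rate() {
    awk -v r="$1" -v g="$2" 'BEGIN { printf "%.0f\n", r / (g / 8) * 100 }'
}

echo "40 Gb/s NIC = $(gbit_to_gbyte 40) GB/s theoretical"
echo "4 GB/s is $(percent_of_line_rate 4 40)% of that line rate"
```

So "above 4 GB/s" means more than four fifths of the raw capacity of a 40 Gb/s link.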

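Going forward, Facebook will continue to improve WDT with the help of the open-source community. Areas of focus include restructuring the code to use a zero-copy stream/buffer pipeline and handling out-of-order packets.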


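1. Overview

A few days ago Facebook open-sourced the WDT project. The project lives at: https://github.com/facebook/wdt

WDT can be used as an embeddable library or as a command-line tool; its goal is to transfer files between two systems efficiently over multiple TCP paths while using as few resources (CPU, memory, and so on) as possible. This article mainly tries the project out by installing and testing it on Ubuntu 14.04. One prerequisite is the GCC version: CentOS 6.3/6.5 ships GCC 4.4, which may produce errors during compilation. GCC 4.9 is used here. I did not try compiling with GCC 4.8, so I am not sure whether it works.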
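Since the compiler version matters, it can be checked before starting. The following is a minimal sketch, assuming `sort -V` (version sort, as in GNU coreutils) is available. The helper name `version_ge` is made up here, and the 4.9 threshold is simply the version this article built with, not an official minimum.

```shell
# Succeeds when version $1 is greater than or equal to version $2.
# Relies on `sort -V`: the smaller of the two versions sorts first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumption: 4.9 is the version tested in this article, not a documented requirement.
required=4.9
if command -v g++ >/dev/null 2>&1; then
    current=$(g++ -dumpversion)
    if version_ge "$current" "$required"; then
        echo "g++ $current is at least $required"
    else
        echo "g++ $current is older than $required; the build may fail"
    fi
else
    echo "g++ not found"
fi
```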

2. Installation

1.       First, install gcc/g++ on Ubuntu 14.04.

sudo add-apt-repository ppa:ubuntu-toolchain-r/test

sudo apt-get update

sudo apt-get install g++-4.9             

Run g++ -v to see the installed g++ version:

g++ -v

gcc version 4.9.2 (Ubuntu 4.9.2-0ubuntu1~14.04)

2.        The following command ensures that a C compiler can be found when building CMake.

sudo apt-get install build-essential

On Ubuntu, build-essential must be installed first; otherwise, running bootstrap in the CMake source directory fails with the error below.

./bootstrap --prefix=/usr --parallel=16

---------------------------------------------

CMake 3.2.3, Copyright 2000-2015 Kitware, Inc.

---------------------------------------------

Error when bootstrapping CMake:

Cannot find appropriate C compiler on this system.

Please specify one using environment variable CC.

See cmake_bootstrap.log for compilers attempted.

 

---------------------------------------------

Log of errors: /home/ubuntu/cmake-3.2.3/Bootstrap.cmk/cmake_bootstrap.log

3.       wget http://www.cmake.org/files/v3.2/cmake-3.2.3.tar.gz

tar xvfz cmake-3.2.3.tar.gz

cd cmake-3.2.3

./bootstrap --prefix=/usr --parallel=16 && make -j && sudo make install

4.       sudo apt-get install libgoogle-glog-dev libboost-system-dev libdouble-conversion-dev

5.       sudo apt-get install git subversion

6.       git clone https://github.com/floitsch/double-conversion.git

cd double-conversion; cmake . ; make -j && sudo make install

7.       git clone https://github.com/schuhschuh/gflags.git

mkdir gflags/build

cd gflags/build

cmake -D GFLAGS_NAMESPACE=google -D BUILD_SHARED_LIBS=on ..

make -j && sudo make install

8.       git clone https://github.com/facebook/folly.git

9.       svn checkout http://google-glog.googlecode.com/svn/trunk/ glog

cd glog

./configure --with-gflags=/usr/local

10.   git clone https://github.com/facebook/wdt.git

cmake /home/ubuntu/wdt/ -DBUILD_TESTING=on

make -j

make test

sudo make install
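The CMake bootstrap error shown in step 2 comes from a missing C compiler, so a small preflight check run before step 3 can catch it early. This is only a sketch: the helper name `first_available` and the candidate list (`$CC`, `cc`, `gcc`, `clang`) are assumptions for illustration, not the exact list that CMake's bootstrap script probes.

```shell
# Print the first command from the argument list that exists on PATH.
# Returns non-zero when none of them is found.
first_available() {
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
            return 0
        fi
    done
    return 1
}

# Honor $CC if it is set, then fall back to common compiler names (assumed list).
if compiler=$(first_available ${CC:+"$CC"} cc gcc clang); then
    echo "C compiler found: $compiler"
else
    echo "No C compiler found; install build-essential before running ./bootstrap"
fi
```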

3.  Verification test

On the receiving side, run the following command: sudo wdt -directory /home/ubuntu/test/

ubuntu@10-8-7-191:~/test$ sudo wdt -directory /home/ubuntu/test/

I0802 12:26:45.662655 31270 wdtCmdLine.cpp:130] Running WDT 1.15.1507290 p 15

I0802 12:26:45.662818 31270 WdtBase.cpp:304] Generated a transfer id 1616988683

I0802 12:26:45.662849 31270 WdtBase.cpp:272] using wdt protocol version 15

I0802 12:26:45.663175 31270 Receiver.cpp:151] Registered 8 sockets

I0802 12:26:45.663211 31270 Receiver.cpp:163] Transfer id 1616988683

I0802 12:26:45.663229 31270 wdtCmdLine.cpp:161] Starting receiver with connection url

wdt://10-8-7-191?ports=22356,22357,22358,22359,22360,22361,22362,22363&protocol=15&id=1616988683

I0802 12:26:45.663298 31270 wdtCmdLine.cpp:82] Setting up abort 0 seconds.

I0802 12:26:45.663318 31270 Receiver.cpp:404] Starting (receiving) server on ports [ 22356 22357 22358 22359 22360 22361 22362 22363 ] Target dir : /home/ubuntu/test/

I0802 12:26:45.663354 31270 FileCreator.cpp:215] dir already exists /

I0802 12:26:45.663379 31270 FileCreator.cpp:215] dir already exists /home/

I0802 12:26:45.663391 31270 FileCreator.cpp:215] dir already exists /home/ubuntu/

I0802 12:26:45.663419 31270 FileCreator.cpp:215] dir already exists /home/ubuntu/test/

I0802 12:26:45.663478 31270 WdtBase.cpp:292] Throttling not enabled

I0802 12:26:45.664497 31279 Receiver.cpp:361] Progress reporter updating every 20 ms

I0802 12:26:55.560704 31272 Receiver.cpp:504] New transfer started 1

[================================================>] 99% 162.8 5.6 Mbytes/s  I0802 12:26:56.284689 31274 Receiver.cpp:463] Received done for all threads. Transfer session 1 finished

I0802 12:26:56.285032 31274 Receiver.cpp:1268] Thread[3, port: 22359]  got ack for DONE. Transfer finished

I0802 12:26:56.285162 31278 Receiver.cpp:1268] Thread[7, port: 22363]  got ack for DONE. Transfer finished

I0802 12:26:56.285157 31275 Receiver.cpp:1268] Thread[4, port: 22360]  got ack for DONE. Transfer finished

I0802 12:26:56.285118 31277 Receiver.cpp:1268] Thread[6, port: 22362]  got ack for DONE. Transfer finished

I0802 12:26:56.285444 31272 Receiver.cpp:1268] Thread[1, port: 22357]  got ack for DONE. Transfer finished

I0802 12:26:56.285712 31276 Receiver.cpp:1268] Thread[5, port: 22361]  got ack for DONE. Transfer finished

I0802 12:26:56.285836 31271 Receiver.cpp:1268] Thread[0, port: 22356]  got ack for DONE. Transfer finished

I0802 12:26:56.285913 31273 Receiver.cpp:1268] Thread[2, port: 22358]  got ack for DONE. Transfer finished

W0802 12:26:56.286119 31273 Receiver.cpp:1386] Last thread finished. Duration of the transfer 0.725431

[=================================================] 100% 159.5 Mbytes/s         

W0802 12:26:56.286365 31270 Receiver.cpp:268] WDT receiver's transfer has been finished

I0802 12:26:56.286382 31270 Receiver.cpp:269] Transfer status = OK. Number of blocks transferred = 632. Data Mbytes = 115.742. Header kBytes = 13.8848 (0.0117151% overhead). Total bytes = 121378462. Wasted bytes due to failure = 0 (0% overhead).
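The receiver log above prints a connection URL whose query string carries the listening ports, the protocol version, and the transfer id. A small helper can pull these values out, for example to pass the id to the sender. This is a sketch that only does string handling on the URL shown in the log; the function name `wdt_url_param` is made up and is not part of WDT.

```shell
# Print the value of query parameter $2 from a wdt:// connection URL $1.
wdt_url_param() {
    # Drop everything up to the '?', split on '&', keep the value of the wanted key.
    printf '%s\n' "${1#*\?}" | tr '&' '\n' | sed -n "s/^$2=//p"
}

# URL copied from the receiver log above.
url='wdt://10-8-7-191?ports=22356,22357,22358,22359,22360,22361,22362,22363&protocol=15&id=1616988683'
echo "transfer id: $(wdt_url_param "$url" id)"
echo "protocol:    $(wdt_url_param "$url" protocol)"
echo "ports:       $(wdt_url_param "$url" ports | tr ',' ' ')"
```

The eight ports correspond to the eight receiver threads (Thread[0] to Thread[7]) that appear in the log.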

On the sending side, run wdt -directory /usr/bin -destination 10.8.7.191 -transfer_id 1616988683. When the transfer completes, the statistics appear at the bottom of the output.

wdt -directory /usr/bin -destination 10.8.7.191 -transfer_id 1616988683

I0802 12:26:55.557087  5427 wdtCmdLine.cpp:130] Running WDT 1.15.1507290 p 15

I0802 12:26:55.557574  5427 WdtBase.cpp:272] using wdt protocol version 15

I0802 12:26:55.557641  5427 wdtCmdLine.cpp:200] Starting sender with details wdt://10.8.7.191?ports=22356,22357,22358,22359,22360,22361,22362,22363&protocol=15&dir=/usr/bin&id=1616988683

I0802 12:26:55.557668  5427 wdtCmdLine.cpp:82] Setting up abort 0 seconds.

I0802 12:26:55.557696  5427 Sender.cpp:357] Client (sending) to 10.8.7.191, Using ports [ 22356 22357 22358 22359 22360 22361 22362 22363 ]

I0802 12:26:55.557796  5427 WdtBase.cpp:292] Throttling not enabled

I0802 12:26:55.557886  5428 DirectorySourceQueue.cpp:139] Exploring root dir /usr/bin/ include_pattern :  exclude_pattern :  prune_dir_pattern :

I0802 12:26:55.560621  5429 Sender.cpp:489] Connection took 1 attempt(s) and 0.00210126 seconds. port 22356

I0802 12:26:55.560621  5430 Sender.cpp:489] Connection took 1 attempt(s) and 0.00209491 seconds. port 22357

I0802 12:26:55.561148  5431 Sender.cpp:489] Connection took 1 attempt(s) and 0.000271586 seconds. port 22358

I0802 12:26:55.561575  5432 Sender.cpp:489] Connection took 1 attempt(s) and 0.000139581 seconds. port 22359

I0802 12:26:55.561785  5433 Sender.cpp:489] Connection took 1 attempt(s) and 0.000302113 seconds. port 22360

I0802 12:26:55.570382  5437 Sender.cpp:1277] Progress reporter tracking every 20 ms

I0802 12:26:55.570552  5436 Sender.cpp:489] Connection took 1 attempt(s) and 0.00024346 seconds. port 22363

I0802 12:26:55.571169  5435 Sender.cpp:489] Connection took 1 attempt(s) and 0.00111626 seconds. port 22362

I0802 12:26:55.573040  5428 DirectorySourceQueue.cpp:294] Number of files explored: 632, errors: false

I0802 12:26:55.574348  5434 Sender.cpp:489] Connection took 1 attempt(s) and 0.00426591 seconds. port 22361

[================================================>] 99% 160.6 171.7 Mbytes/s  I0802 12:26:56.285174  5432 Sender.cpp:1106] Port 22359 done. Transfer status = OK. Number of blocks transferred = 51. Data Mbytes = 15.3743. Header kBytes = 1.82031 (0.0115624% overhead). Total bytes = 16123023. Wasted bytes due to failure = 0 (0% overhead). Total throughput = 21.2451 Mbytes/sec

I0802 12:26:56.285308  5433 Sender.cpp:1106] Port 22360 done. Transfer status = OK. Number of blocks transferred = 37. Data Mbytes = 9.76006. Header kBytes = 1.39941 (0.0140021% overhead). Total bytes = 10235597. Wasted bytes due to failure = 0 (0% overhead). Total throughput = 13.4855 Mbytes/sec

I0802 12:26:56.285414  5436 Sender.cpp:1106] Port 22363 done. Transfer status = OK. Number of blocks transferred = 131. Data Mbytes = 16.1024. Header kBytes = 4.19922 (0.025467% overhead). Total bytes = 16888874. Wasted bytes due to failure = 0 (0% overhead). Total throughput = 22.5229 Mbytes/sec

I0802 12:26:56.285537  5435 Sender.cpp:1106] Port 22362 done. Transfer status = OK. Number of blocks transferred = 52. Data Mbytes = 12.5765. Header kBytes = 1.80859 (0.0140436% overhead). Total bytes = 13189316. Wasted bytes due to failure = 0 (0% overhead). Total throughput = 17.5835 Mbytes/sec

I0802 12:26:56.285713  5429 Sender.cpp:1106] Port 22356 done. Transfer status = OK. Number of blocks transferred = 61. Data Mbytes = 16.361. Header kBytes = 2.10254 (0.0125498% overhead). Total bytes = 17157861. Wasted bytes due to failure = 0 (0% overhead). Total throughput = 22.5007 Mbytes/sec

I0802 12:26:56.285908  5434 Sender.cpp:1106] Port 22361 done. Transfer status = OK. Number of blocks transferred = 78. Data Mbytes = 14.3567. Header kBytes = 2.74512 (0.0186727% overhead). Total bytes = 15056863. Wasted bytes due to failure = 0 (0% overhead). Total throughput = 20.0654 Mbytes/sec

I0802 12:26:56.285987  5430 Sender.cpp:1106] Port 22357 done. Transfer status = OK. Number of blocks transferred = 78. Data Mbytes = 17.14. Header kBytes = 2.52637 (0.0143941% overhead). Total bytes = 17975215. Wasted bytes due to failure = 0 (0% overhead). Total throughput = 23.5733 Mbytes/sec

I0802 12:26:56.286065  5431 Sender.cpp:1106] Port 22358 done. Transfer status = OK. Number of blocks transferred = 144. Data Mbytes = 14.071. Header kBytes = 4.63965 (0.0322004% overhead). Total bytes = 14759246. Wasted bytes due to failure = 0 (0% overhead). Total throughput = 19.4189 Mbytes/sec

I0802 12:26:56.286252  5431 Sender.cpp:1075] Last thread finished 0.728525

[=================================================] 100% 158.9 Mbytes/s         

I0802 12:26:56.286648  5427 Sender.cpp:330] Total sender time = 0.728582 seconds (0.0155556 dirTime). Transfer summary : Transfer status = OK. Number of files transferred = 632. Data Mbytes = 115.742. Header kBytes = 21.2412 (0.0179221% overhead). Total bytes = 121385995. Wasted bytes due to failure = 0 (0% overhead).

Total sender throughput = 158.885 Mbytes/sec (162.351 Mbytes/sec pure transf rate)
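The per-port lines in the sender log can be cross-checked against the final summary by adding up their block counts and data sizes. The sketch below assumes the log format shown above; the function name `sum_port_stats` is made up, and the two demo lines are abridged copies of the port 22359 and 22360 entries.

```shell
# Sum "Number of blocks transferred" and "Data Mbytes" over all "Port N done" lines on stdin.
sum_port_stats() {
    awk '
    /Port [0-9]+ done/ {
        if (match($0, /blocks transferred = [0-9]+/)) {
            s = substr($0, RSTART, RLENGTH); sub(/.*= /, "", s); blocks += s
        }
        if (match($0, /Data Mbytes = [0-9]+(\.[0-9]+)?/)) {
            s = substr($0, RSTART, RLENGTH); sub(/.*= /, "", s); mbytes += s
        }
    }
    END { printf "%d blocks, %.3f Mbytes\n", blocks, mbytes }'
}

# Demo with two abridged lines from the log above.
sum_port_stats <<'EOF'
Sender.cpp:1106] Port 22359 done. Transfer status = OK. Number of blocks transferred = 51. Data Mbytes = 15.3743.
Sender.cpp:1106] Port 22360 done. Transfer status = OK. Number of blocks transferred = 37. Data Mbytes = 9.76006.
EOF
```

Fed with all eight per-port lines, the totals come to 632 blocks and about 115.742 Mbytes, which agrees with the transfer summary line.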

3.1      Using wcp.sh

WDT ships a script that works much like scp: it uses wdt underneath to emulate scp. Below is a file transfer performed with wcp.sh.

./wcp.sh -n /data/bigfile  ubuntu@10.8.7.191:/data

Copying bigfile (/data/bigfile) to 10.8.7.191 (using ubuntu@10.8.7.191 in /data)

Starting destination side server

Starting at Sun Aug  2 13:56:28 CST 2015 (1438494988043)

1438494988.047310974

[                                                 ] 0% 0.0 0.0 Mbytes/s  wdt://10-8-7-191?ports=22356,22357,22358,22359,22360,22361,22362,22363&protocol=15&id=7825

[=================================================] 100% 86.8 Mbytes/s

Complete!

 

real  1m55.222s

user 0m2.858s

sys   0m10.572s

Sun Aug  2 13:58:23 CST 2015

Dst checksum

Succesfull transfer

All done client side! cleanup...

Overall transfer @ 86 Mbytes/sec (115228 ms for 10485760000 uncompressed)

Source checksum:

26f56024ac39cdc54b228820107f040d  /data/bigfile

         For comparison, the same transfer with scp:

scp /data/bigfile ubuntu@10.8.7.191:/data

bigfile                                                                                             100%   10GB  87.7MB/s   01:54
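The two rates can be recomputed from the raw numbers in the outputs above: 10485760000 bytes in 115228 ms for wcp.sh, and the same file in 01:54 (114 s) for scp. The sketch assumes 1 Mbyte = 1048576 bytes, which reproduces the figures the tools report; the function name `mbytes_per_sec` is made up for this illustration.

```shell
# Throughput in Mbytes/s for $1 bytes transferred in $2 seconds (1 Mbyte = 1048576 bytes assumed).
mbytes_per_sec() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / 1048576 / s }'
}

echo "wcp.sh: $(mbytes_per_sec 10485760000 115.228) Mbytes/s"
echo "scp:    $(mbytes_per_sec 10485760000 114) Mbytes/s"
```

Both tools finished within about a second of each other in this test.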

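4. Summary

 In the tests above, copying the files under /usr/bin was indeed fast (about 159 Mbytes/s). When transferring a single large file, however, wcp.sh (86 Mbytes/s) showed no real advantage over scp (87.7 MB/s). My test method may be at fault; I will look into it further when I get the chance, and I would welcome guidance from anyone with hands-on experience.
References: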
http://www.infoq.com/cn/news/2015/07/facebook-wdt

