How does RAMCloud achieve 5 µs latency?

Source: Internet. Editor: 程序博客网. Time: 2024/05/21 17:15

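I have been studying RAMCloud recently, and one of its metrics caught my attention: RAMCloud claims a read latency of 5 µs and a write latency of 15 µs. A typical HDD has latency in the millisecond range, roughly 1000 times higher. What techniques does RAMCloud use to reach such low latency?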

I read the RAMCloud papers, and "The RAMCloud Storage System" contains the following passage: "When used with leading-edge networking, RAMCloud offers exceptionally low latency for remote access. In our 80-node development cluster, a client can read any 100-byte object in less than 5 µs, and durable writes take about 15 µs."

Reaching such low latency comes with preconditions: 1. leading-edge networking; 2. small objects of about 100 bytes.

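What does "leading-edge networking" mean? Going by the dictionary, it roughly means using the most advanced networking technology available. So which advanced networking techniques are actually used? Let's look at them in detail.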

I. First, the current state of network latency in data centers:

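1) In today's data centers, one round trip for a data packet typically takes 200~500 µs. I measured this with the ping tool (4 KB packets): about 0.2 ms on 1G Ethernet, and about 0.15 ms on 10G Ethernet.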

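The original figure (not reproduced here) shows the time spent in each component during TCP communication. "Delay" is the time for one pass through a single component, and "Round-Trip" is the total over a full round trip, which in a large data center typically crosses 5 switches and passes through the OS network stack 4 times. The 4 passes through the OS network stack are easy to understand: the communication is end to end, and each endpoint both sends and receives, which makes 2 passes per endpoint.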

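2) To reduce packet loss, NICs and switches usually have fairly large buffer queues. As a result, network latency rises sharply when the network is heavily congested.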

3) To handle more packets per interrupt, NICs usually delay interrupt processing by about 30 µs, which also adds to network latency.
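To make the arithmetic behind the three points above concrete, here is a minimal sketch of a round-trip latency budget. The per-component delays are assumed placeholder values (the original figure's numbers are not available here), and counting 5 switch hops in each direction is one interpretation of the text, so treat the output as illustrative only. The 30 µs interrupt delay is the figure quoted in point 3.

```python
# Illustrative per-traversal delays in microseconds.
# These are ASSUMED placeholder values, not measurements from the article.
ASSUMED_DELAY_US = {
    "switch": 20.0,    # one switch traversal
    "os_stack": 15.0,  # one pass through an OS network stack
    "nic": 10.0,       # one NIC traversal
}


def round_trip_us(delay_us, switch_hops_each_way=5, interrupt_delay_us=0.0):
    """Sum component delays over one request/response round trip."""
    # Assumption: the request and the response each cross the same number of switches.
    switch_total = delay_us["switch"] * switch_hops_each_way * 2
    # Client send, server receive, server send, client receive.
    os_total = delay_us["os_stack"] * 4
    nic_total = delay_us["nic"] * 4
    # Interrupt coalescing can delay each of the two receive events.
    interrupt_total = interrupt_delay_us * 2
    return switch_total + os_total + nic_total + interrupt_total


baseline = round_trip_us(ASSUMED_DELAY_US)
with_coalescing = round_trip_us(ASSUMED_DELAY_US, interrupt_delay_us=30.0)
print(f"baseline: {baseline:.0f} us, with interrupt delay: {with_coalescing:.0f} us")
```

With these assumed inputs the total lands inside the 200~500 µs range quoted above, and the switch traversals dominate, which is why faster switches are the first item in the next section.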

II. Which advanced networking techniques does RAMCloud use?

1) The latest 10G switches use cut-through switching chips. Switches from Arista, for example, use chips from Fulcrum Microsystems and provide a switching latency of less than 1 µs.

2) The NICs use the latest NIC chips.

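3) RDMA, which allows direct access to the memory of another server. A colleague at my company measured RDMA latency at around a few µs. It also offers high reliability.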

4) Polling mode, which eliminates interrupt latency and process-switching latency.
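Two of the points above can be sketched in a few lines. The first function computes how long a store-and-forward switch must wait to receive a whole frame before forwarding it, which is the per-hop cost that cut-through switching avoids. The second shows the polling idea from point 4 using an ordinary non-blocking socket. Both function names are hypothetical, and the polling loop is only an analogy: it still issues a system call on every poll, and it is not how RAMCloud itself is implemented.

```python
import socket


def store_and_forward_delay_us(frame_bytes, link_gbps):
    """Time (in us) to receive a full frame before it can be forwarded."""
    # bits / (Gbit/s) gives nanoseconds; divide by 1000 for microseconds.
    return frame_bytes * 8 / (link_gbps * 1000.0)


def busy_poll_recv(sock, nbytes, max_spins=1_000_000):
    """Spin on a non-blocking socket instead of sleeping until woken up."""
    sock.setblocking(False)
    for spins in range(max_spins):
        try:
            return sock.recv(nbytes), spins
        except BlockingIOError:
            continue  # nothing yet: poll again without yielding the CPU
    raise TimeoutError("no data arrived within max_spins polls")


# A 1500-byte frame on 1G versus 10G Ethernet.
print(store_and_forward_delay_us(1500, 1), "us per hop at 1 Gbit/s")
print(store_and_forward_delay_us(1500, 10), "us per hop at 10 Gbit/s")

# Polling demo on a local socket pair with a 100-byte payload.
left, right = socket.socketpair()
right.sendall(b"x" * 100)
payload, spins = busy_poll_recv(left, 100)
print(len(payload), "bytes received by polling")
left.close()
right.close()
```

The trade-off with polling is that the spinning thread occupies a CPU core even when no requests arrive, in exchange for never paying the wake-up cost.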

An aside:

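I have recently been working on storage performance optimization and looked into how long a system call and a process context switch take. The following blog post measures both: http://blog.tsunanet.net/2010/11/how-long-does-it-take-to-make-context.html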

System call: roughly 50~100 ns

Intel 5150: 105ns/syscall
Intel E5440: 87ns/syscall
Intel E5520: 58ns/syscall
Intel X5550: 52ns/syscall
Intel L5630: 58ns/syscall
Intel E5-2620: 67ns/syscall

Context switch: roughly 3 µs to 4.5 µs

Intel 5150: ~4300ns/context switch
Intel E5440: ~3600ns/context switch
Intel E5520: ~4500ns/context switch
Intel X5550: ~3000ns/context switch
Intel L5630: ~3000ns/context switch
Intel E5-2620: ~3000ns/context switch
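To relate these numbers to the 5 µs read latency discussed earlier, the short sketch below computes what fraction of that budget a single system call or a single context switch would consume. The figures are copied from the lists above; the helper name `budget_share` is made up for illustration.

```python
READ_BUDGET_NS = 5_000  # the 5 us read latency quoted for RAMCloud

# Values copied from the lists above (nanoseconds).
SYSCALL_NS = {
    "5150": 105, "E5440": 87, "E5520": 58,
    "X5550": 52, "L5630": 58, "E5-2620": 67,
}
CONTEXT_SWITCH_NS = {
    "5150": 4300, "E5440": 3600, "E5520": 4500,
    "X5550": 3000, "L5630": 3000, "E5-2620": 3000,
}


def budget_share(cost_ns, budget_ns=READ_BUDGET_NS):
    """Fraction of the latency budget used by one operation."""
    return cost_ns / budget_ns


for cpu in SYSCALL_NS:
    syscall = budget_share(SYSCALL_NS[cpu])
    switch = budget_share(CONTEXT_SWITCH_NS[cpu])
    print(f"Intel {cpu}: syscall {syscall:.1%}, context switch {switch:.0%} of budget")
```

Even on the fastest CPUs listed, one context switch uses more than half of a 5 µs budget, while a system call uses only a percent or two.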

If the process is pinned to a CPU, that is, it always runs on the same CPU, a context switch takes roughly 1.3 µs to 1.9 µs, which is about twice as fast or better.

Intel 5150: ~1900ns/process context switch, ~1700ns/thread context switch
Intel E5440: ~1300ns/process context switch, ~1100ns/thread context switch
Intel E5520: ~1400ns/process context switch, ~1300ns/thread context switch
Intel X5550: ~1300ns/process context switch, ~1100ns/thread context switch
Intel L5630: ~1600ns/process context switch, ~1400ns/thread context switch
Intel E5-2620: ~1600ns/process context switch, ~1300ns/thread context switch
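The effect of pinning can be quantified directly from the two sets of numbers above. The sketch below computes the per-CPU speedup for process context switches, and also includes a small helper showing one way to pin the current process from Python. `pin_to_cpu` is a hypothetical helper name; `os.sched_setaffinity` is only available on some platforms (such as Linux), so the helper checks for it first and is not called here.

```python
import os

# Values copied from the lists above (nanoseconds per process context switch).
UNPINNED_NS = {
    "5150": 4300, "E5440": 3600, "E5520": 4500,
    "X5550": 3000, "L5630": 3000, "E5-2620": 3000,
}
PINNED_NS = {
    "5150": 1900, "E5440": 1300, "E5520": 1400,
    "X5550": 1300, "L5630": 1600, "E5-2620": 1600,
}


def pinning_speedup(cpu):
    """How many times faster a pinned process context switch is."""
    return UNPINNED_NS[cpu] / PINNED_NS[cpu]


def pin_to_cpu(cpu_index=0):
    """Pin the current process to one CPU where the platform supports it."""
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, {cpu_index})  # pid 0 means the calling process
        return True
    return False


for cpu in UNPINNED_NS:
    print(f"Intel {cpu}: {pinning_speedup(cpu):.1f}x faster when pinned")
```

Across the listed CPUs the speedup ranges from a little under 2x to a little over 3x.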

0 0