Source: Internet  Editor: 程序博客网  Time: 2024/06/02 04:22

**TCP RTT Calculation**

1. RTT and the timestamp option

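RTT is the interval between the moment a TCP sender transmits a segment and the moment it receives the acknowledgment for that segment. It consists mainly of link transmission time, router processing time, and end-host processing time. The transmission and end-host processing times are relatively fixed, whereas the queuing and forwarding time in routers depends on how congested the network is and varies widely.
RTT can be measured in two ways: with the timestamp option, or with the TCP control block of the packets in the retransmission queue. The first method requires the timestamp option to be enabled when the connection is established. For each measurement the sender carries a timestamp in the segment, and the receiver echoes that timestamp back in its acknowledgment, so that:
    RTT = current time - echoed timestamp
The second method applies when timestamps are not enabled. The retransmission queue holds the segments that have been sent but not yet acknowledged, and the TCP control block of each packet (skb) contains a field, tcp_skb_cb->when, that records the segment's send time. When the acknowledgment arrives, the RTT can be computed as:
    RTT = current time - when
If the when field already exists, why add a timestamp option? The answer lies in how retransmissions are handled. The when field holds a single send time per segment, so once a segment has been retransmitted after a timeout, an acknowledgment can no longer be matched to one particular transmission; Karn's algorithm therefore discards the RTT sample for such segments. A segment that carries a timestamp is stamped with the current time when it is retransmitted, and the echoed value identifies the transmission being acknowledged, so no sample needs to be discarded.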
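The two measurement paths described above can be sketched in a few lines of C. This is a minimal illustration under stated assumptions, not kernel code: the struct `seg_info`, its fields, and both function names are hypothetical, and times are plain 32-bit millisecond counters that may wrap around.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-segment bookkeeping, loosely modelled on the idea of
 * tcp_skb_cb->when; this is not the kernel's actual structure. */
struct seg_info {
    uint32_t when;          /* send time recorded for the segment (ms) */
    bool     retransmitted; /* true once the segment has been resent */
};

/* Method 1: timestamp option. The peer echoes the value we sent, so the
 * sample is usable even for retransmitted segments. Unsigned subtraction
 * keeps the result correct across counter wrap-around. */
static uint32_t rtt_from_timestamp(uint32_t now, uint32_t echoed)
{
    return now - echoed;
}

/* Method 2: no timestamps. Following Karn's algorithm, a retransmitted
 * segment yields no sample. Returns true and stores the sample in *rtt
 * only when the measurement is unambiguous. */
static bool rtt_from_when(uint32_t now, const struct seg_info *seg,
                          uint32_t *rtt)
{
    if (seg->retransmitted)
        return false;
    *rtt = now - seg->when;
    return true;
}
```

In this sketch a segment sent at 1000 ms and acknowledged at 1180 ms gives a 180 ms sample with either method, but if it was retransmitted in between, only the timestamp path still produces a sample.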

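2. The effect of the Nagle algorithm on RTT calculation

The Nagle algorithm mainly addresses the waste of resources caused by applications that generate many small segments. Given the fixed length of the headers, the more payload each segment carries, the more efficient each transmission is. Under the Nagle algorithm the first segment is sent as soon as it arrives; segments that arrive during the following RTT are buffered, the next segment is sent when the acknowledgment comes back, and so on. As long as acknowledgments are not delayed, the algorithm therefore has no effect on the RTT.
From the formula RTT = ACK arrival time - send time, it follows that a delayed acknowledgment lengthens the measured RTT.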
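A minimal sketch of the simplified Nagle rule described above, and of how a delayed acknowledgment inflates the measured RTT. The function names and parameters are hypothetical, and real stacks apply additional conditions (for example, full-sized segments are normally sent without waiting).

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified Nagle decision: a small segment may go out only when no
 * previously sent data is still waiting for an ACK; otherwise it is
 * buffered until that ACK arrives. */
static bool nagle_can_send(uint32_t unacked_bytes)
{
    return unacked_bytes == 0;
}

/* Measured RTT = ACK arrival time - send time. Any time the receiver
 * holds the ACK back (ack_delay) is added on top of the path RTT. */
static uint32_t measured_rtt(uint32_t send_time, uint32_t path_rtt,
                             uint32_t ack_delay)
{
    uint32_t ack_arrival = send_time + path_rtt + ack_delay;
    return ack_arrival - send_time;
}
```

With a 100 ms path and no ACK delay the sender measures 100 ms; if the receiver holds the ACK for 40 ms, the same path is measured as 140 ms.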

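3. RTT calculation in practice, RTT variance, and the relationship to the RTO (Retransmission Timeout)

As noted above, several factors influence the RTT, and network instability makes the result vary from one measurement to the next. On today's Internet the variation is so large that a single measurement cannot serve as the basis for the retransmission timeout. Most implementations therefore use a smoothed RTT, called RTTs, while the measured RTT is called RTTm. RTTs is a weighted average, where nRTTs is the new value and oRTTs the previous one:

nRTTs = (1-α) × oRTTs + α × RTTm

The variance term (RTTd), which is in fact a mean deviation, is computed as follows. There is no RTTd at the start; after the first measurement RTTd = RTTm/2, and from the second measurement on the formula is:
RTTd = (1-β) × RTTd + β × abs(RTTs-RTTm)
Implementations usually take α = 1/8 and β = 1/4.
The RTO is derived from the smoothed RTT and the variance. At the start of a connection no RTT has been measured yet, so the RTO is simply assigned an initial value; after any RTT measurement it is computed as:
RTO = RTTs + 4 × RTTd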
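The formulas above can be turned into a small floating-point estimator. This is a sketch, not any particular stack's implementation: the struct and function names are hypothetical, and updating RTTd before RTTs (so that the deviation uses the previous smoothed value) is an assumption taken from the order given in RFC 6298.

```c
#include <stdbool.h>

struct rtt_state {
    bool   has_sample; /* false until the first measurement arrives */
    double rtts;       /* smoothed RTT (RTTs) */
    double rttd;       /* RTT deviation (RTTd) */
};

static const double ALPHA = 1.0 / 8.0;
static const double BETA  = 1.0 / 4.0;

/* Feed one measurement RTTm into the estimator and return the new RTO. */
static double rtt_update(struct rtt_state *s, double rttm)
{
    if (!s->has_sample) {
        /* First measurement: RTTs = RTTm, RTTd = RTTm / 2. */
        s->rtts = rttm;
        s->rttd = rttm / 2.0;
        s->has_sample = true;
    } else {
        double err = s->rtts - rttm;
        if (err < 0)
            err = -err;                 /* abs(RTTs - RTTm) */
        /* RTTd is updated first so that it uses the previous RTTs. */
        s->rttd = (1.0 - BETA) * s->rttd + BETA * err;
        s->rtts = (1.0 - ALPHA) * s->rtts + ALPHA * rttm;
    }
    return s->rtts + 4.0 * s->rttd;     /* RTO = RTTs + 4 * RTTd */
}
```

With a first sample of 100 ms the sketch yields RTO = 300 ms; a second sample of 120 ms gives RTTs = 102.5 ms, RTTd = 42.5 ms, and RTO = 272.5 ms.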

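4. Linux optimizations for RTO and RTT

The Linux implementation follows the standard RFC documents. The main difference is that the computations involving α and β are carried out largely with shift operations, which makes them more efficient.
The code is as follows: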

static void tcp_rcv_rtt_update(struct tcp_sock *tp, u32 sample, int win_dep)
{
    u32 new_sample = tp->rcv_rtt_est.rtt;
    long m = sample;

    if (m == 0)
        m = 1;

    if (new_sample != 0) {
        /* If we sample in larger samples in the non-timestamp
         * case, we could grossly overestimate the RTT especially
         * with chatty applications or bulk transfer apps which
         * are stalled on filesystem I/O.
         *
         * Also, since we are only going for a minimum in the
         * non-timestamp case, we do not smooth things out
         * else with timestamps disabled convergence takes too
         * long.
         */
        if (!win_dep) {
            m -= (new_sample >> 3);
            new_sample += m;
        } else if (m < new_sample)
            new_sample = m << 3;
    } else {
        /* No previous measure. */
        new_sample = m << 3;
    }

    if (tp->rcv_rtt_est.rtt != new_sample)
        tp->rcv_rtt_est.rtt = new_sample;
}
RTO calculation code:
static void tcp_rtt_estimator(struct sock *sk, const __u32 mrtt)
{
    struct tcp_sock *tp = tcp_sk(sk);
    long m = mrtt; /* RTT */

    /*  The following amusing code comes from Jacobson's
     *  article in SIGCOMM '88.  Note that rtt and mdev
     *  are scaled versions of rtt and mean deviation.
     *  This is designed to be as fast as possible
     *  m stands for "measurement".
     *
     *  On a 1990 paper the rto value is changed to:
     *  RTO = rtt + 4 * mdev
     *
     * Funny. This algorithm seems to be very broken.
     * These formulae increase RTO, when it should be decreased, increase
     * too slowly, when it should be increased quickly, decrease too quickly
     * etc. I guess in BSD RTO takes ONE value, so that it is absolutely
     * does not matter how to _calculate_ it. Seems, it was trap
     * that VJ failed to avoid. 8)
     */
    if (m == 0)
        m = 1;
    if (tp->srtt != 0) {
        m -= (tp->srtt >> 3);   /* m is now error in rtt est */
        tp->srtt += m;          /* rtt = 7/8 rtt + 1/8 new */
        if (m < 0) {
            m = -m;             /* m is now abs(error) */
            m -= (tp->mdev >> 2);   /* similar update on mdev */
            /* This is similar to one of Eifel findings.
             * Eifel blocks mdev updates when rtt decreases.
             * This solution is a bit different: we use finer gain
             * for mdev in this case (alpha*beta).
             * Like Eifel it also prevents growth of rto,
             * but also it limits too fast rto decreases,
             * happening in pure Eifel.
             */
            if (m > 0)
                m >>= 3;
        } else {
            m -= (tp->mdev >> 2);   /* similar update on mdev */
        }
        tp->mdev += m;          /* mdev = 3/4 mdev + 1/4 new */
        if (tp->mdev > tp->mdev_max) {
            tp->mdev_max = tp->mdev;
            if (tp->mdev_max > tp->rttvar)
                tp->rttvar = tp->mdev_max;
        }
        if (after(tp->snd_una, tp->rtt_seq)) {
            if (tp->mdev_max < tp->rttvar)
                tp->rttvar -= (tp->rttvar - tp->mdev_max) >> 2;
            tp->rtt_seq = tp->snd_nxt;
            tp->mdev_max = tcp_rto_min(sk);
        }
    } else {
        /* no previous measure. */
        tp->srtt = m << 3;  /* take the measured time to be rtt */
        tp->mdev = m << 1;  /* make sure rto = 3*rtt */
        tp->mdev_max = tp->rttvar = max(tp->mdev, tcp_rto_min(sk));
        tp->rtt_seq = tp->snd_nxt;
    }
}
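The shift arithmetic in tcp_rtt_estimator is easier to follow in isolation. The sketch below is a simplified user-space illustration, not the kernel code: it keeps only the basic srtt/mdev update (no mdev_max, no rttvar, and no finer gain when the RTT decreases), and the struct and function names are hypothetical. Here srtt8 stores 8 × RTTs and mdev4 stores 4 × RTTd, so α = 1/8 and β = 1/4 turn into right shifts by 3 and 2.

```c
/* Scaled estimator state (hypothetical names):
 * srtt8 = 8 * RTTs, mdev4 = 4 * RTTd. */
struct scaled_rtt {
    long srtt8;
    long mdev4;
};

static void scaled_rtt_update(struct scaled_rtt *s, long m)
{
    if (m == 0)
        m = 1;
    if (s->srtt8 != 0) {
        m -= s->srtt8 >> 3;   /* m = RTTm - RTTs (the error) */
        s->srtt8 += m;        /* 8*RTTs += error, i.e. RTTs += error/8 */
        if (m < 0)
            m = -m;           /* m = abs(error) */
        m -= s->mdev4 >> 2;   /* m = abs(error) - RTTd */
        s->mdev4 += m;        /* 4*RTTd += m, i.e. RTTd += m/4 */
    } else {
        s->srtt8 = m << 3;    /* first sample: RTTs = RTTm */
        s->mdev4 = m << 1;    /* first sample: RTTd = RTTm / 2 */
    }
}

/* RTO = RTTs + 4 * RTTd; the factor 4 is already folded into mdev4. */
static long scaled_rto(const struct scaled_rtt *s)
{
    return (s->srtt8 >> 3) + s->mdev4;
}
```

Feeding samples of 100 and then 120 (in any time unit) gives srtt8 = 820 and mdev4 = 170, that is RTTs = 102.5 and RTTd = 42.5, and an RTO of 272 after integer truncation.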

/* Calculate rto without backoff.  This is the second half of Van Jacobson's
 * routine referred to above.
 */
