An analysis of netperf's time measurements

Source: Internet  Editor: 程序博客网  Time: 2024/03/29 17:11

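Recently, in a project, I found that the network latency measured with netperf fluctuated widely. While looking into the cause, I also found a problem with how netperf itself takes the measurement.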

The netperf results were:

ELAPSED_TIME=60.03
MIN_LATENCY=2
MAX_LATENCY=20888
MEAN_LATENCY=399.72
P90_LATENCY=6
P99_LATENCY=11800
STDDEV_LATENCY=2084.31
THROUGHPUT=655.23
LOCAL_CPU_UTIL=1.13
REMOTE_CPU_UTIL=1.86
LOCAL_SD=1.132
REMOTE_SD=5.575
REQUEST_SIZE=-1
RESPONSE_SIZE=-1
ELAPSED_TIME=60.03

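The minimum is extremely small: judging by the size of the packets being transferred, the latency should be far greater than 2 us. So I read the netperf code.

The send_omni_inner function contains the following: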

    HIST_get_stats(time_hist,
                   &min_latency,
                   &max_latency,
                   &mean_latency,
                   &stddev_latency);
    p50_latency = HIST_get_percentile(time_hist, 0.50);
    p90_latency = HIST_get_percentile(time_hist, 0.90);
    p99_latency = HIST_get_percentile(time_hist, 0.99);

Stepping into HIST_get_stats:

void
HIST_get_stats(HIST h, int *min, int *max, double *mean, double *stddev){
  *min = h->hmin;
  *max = h->hmax;
  if (h->total){
    *mean = (double)h->sum / (double)h->total;
    *stddev = (h->sumsquare * h->total - pow((double)h->sum, 2)) /
      pow(h->total, 2);
    *stddev = sqrt(*stddev);
  }
  else{
    *mean = 0;
    *stddev = 0;
  }
}
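To make the arithmetic in HIST_get_stats concrete, here is a minimal sketch of the same running-sum approach. The struct and function names are hypothetical stand-ins, not netperf's real definitions, and the sketch stops at the variance so that it needs no math library; HIST_get_stats takes the square root of this value to get the standard deviation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the HIST fields that HIST_get_stats reads;
   this is NOT netperf's real struct definition. */
struct running_stats {
    long long total;      /* number of samples */
    long long sum;        /* sum of the samples */
    double    sumsquare;  /* sum of the squared samples */
};

/* Accumulate one sample, as HIST_add does for total/sum/sumsquare. */
void stats_add(struct running_stats *s, int sample)
{
    assert(s != NULL);
    s->total++;
    s->sum += sample;
    s->sumsquare += (double)sample * (double)sample;
}

/* Population mean and variance from the running sums:
     variance = (N * sum(x^2) - (sum x)^2) / N^2
   The standard deviation is the square root of the variance. */
void stats_get(const struct running_stats *s, double *mean, double *variance)
{
    assert(s != NULL && mean != NULL && variance != NULL);
    if (s->total == 0) {
        *mean = 0.0;
        *variance = 0.0;
        return;
    }
    double n = (double)s->total;
    double sum = (double)s->sum;
    *mean = sum / n;
    *variance = (s->sumsquare * n - sum * sum) / (n * n);
}
```

Because only the count, the sum and the sum of squares are kept, a handful of very large samples is enough to pull the mean and the standard deviation far away from the typical value, which would fit the results above (P90 of 6 against a mean of 399.72).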

So the HIST h that is passed in is the key: every reported value comes from this structure.

Next, let's look at the function that sends TCP data, send_tcp_stream:

time_hist = HIST_new(); initializes the histogram first.

#ifdef WANT_HISTOGRAM
      if (verbosity > 1) {
        /* timestamp just before we go into send and then again just
           after we come out raj 8/94 */
        /* but lets only do this if there is going to be a histogram
           displayed */
        HIST_timestamp(&time_one);
      }
#endif /* WANT_HISTOGRAM */

      if((len=send(send_socket,
                   send_ring->buffer_ptr,
                   send_size,
                   0)) != send_size) {
        if ((len >=0) || SOCKET_EINTR(len)) {
          /* the test was interrupted, must be the end of test */
          break;
        }
        perror("netperf: data send error");
        printf("len was %d\n",len);
        exit(1);
      }

      local_bytes_sent += send_size;

#ifdef WANT_HISTOGRAM
      if (verbosity > 1) {
        /* timestamp the exit from the send call and update the histogram */
        HIST_timestamp(&time_two);
        HIST_add(time_hist,delta_micro(&time_one,&time_two));
      }
#endif /* WANT_HISTOGRAM */

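Things are starting to make sense. HIST_timestamp is a function that takes a timestamp, and it has different implementations on different platforms. A timestamp, time_one, is taken just before the send call, and another, time_two, is taken right after the send system call returns. The rest is up to HIST_add, so let's see what it does.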

void
HIST_add(register HIST h, int time_delta){
   register float val;
   register int base = HIST_NUM_OF_BUCKET / 10;

   /* check for < 0 added via VMware ESX patches. */

   /* hoisted up to the top because we do not want to count any
      ridiculous values in the actual statistics. right? raj
      2011-07-28 */
   if (time_delta < 0) {
     h->ridiculous++;
     return;
   }

   if (!h->total)
      h->hmin = h->hmax = time_delta;
   h->total++;
   h->sum += time_delta;
   /* am I just being paranoid about the overhead of pow() when we
      aren't all that interested in the statistics derived from it?
      raj 20100914 */
   if (keep_statistics) {
     h->sumsquare += pow(time_delta, 2);
   }
   h->hmin = ((h->hmin < time_delta) ? h->hmin : time_delta);
   h->hmax = ((h->hmax > time_delta) ? h->hmax : time_delta);
   val = (float) time_delta;
   if(val < 10) h->unit_usec[(int)(val * base)]++;
   else {
     val /= 10;
     if(val < 10) h->ten_usec[(int)(val * base)]++;
     else {
       val /= 10;
       if(val < 10) h->hundred_usec[(int)(val * base)]++;
       else {
         val /= 10;
         if(val < 10) h->unit_msec[(int)(val * base)]++;
         else {
           val /= 10;
           if(val < 10) h->ten_msec[(int)(val * base)]++;
           else {
             val /= 10;
             if(val < 10) h->hundred_msec[(int)(val * base)]++;
             else {
               val /= 10;
               if(val < 10) h->unit_sec[(int)(val * base)]++;
               else {
                 val /= 10;
                 if(val < 10) h->ten_sec[(int)(val * base)]++;
                 else h->ridiculous++;
               }
             }
           }
         }
       }
     }
   }
}
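The nested if/else chain at the end of HIST_add can be read as a loop over decades. The sketch below is a hypothetical helper, not part of netperf, that reports which row and which bucket a sample would land in. It assumes HIST_NUM_OF_BUCKET is 100, so base is 10; that value is my assumption and should be checked against netperf's hist.h.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: netperf defines HIST_NUM_OF_BUCKET as 100. */
#define SKETCH_NUM_OF_BUCKET 100

/* Hypothetical helper (not in netperf). Reports the row
   (0 = unit_usec, 1 = ten_usec, 2 = hundred_usec, 3 = unit_msec,
    4 = ten_msec, 5 = hundred_msec, 6 = unit_sec, 7 = ten_sec)
   and the bucket index that HIST_add would increment.
   Returns 0 on success, or -1 when HIST_add would count the sample
   as "ridiculous" (negative, or 100 seconds and above). */
int hist_locate(int time_delta, int *row, int *bucket)
{
    assert(row != NULL && bucket != NULL);
    if (time_delta < 0)
        return -1;

    float val = (float)time_delta;
    int base = SKETCH_NUM_OF_BUCKET / 10;

    for (int r = 0; r < 8; r++) {
        if (val < 10) {
            *row = r;
            *bucket = (int)(val * base);
            return 0;
        }
        val /= 10;  /* move on to the next decade */
    }
    return -1;
}
```

Under that assumption, the MIN_LATENCY of 2 above maps to bucket 20 of unit_usec, and the MAX_LATENCY of 20888 maps to bucket 20 of ten_msec.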

For HIST_add I will only paste the code; what it does is easy to see. In one sentence: the time netperf records is the duration of the send call.
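The measurement pattern boils down to two timestamps around one send call. Here is a minimal sketch of that pattern, assuming a POSIX system with clock_gettime; delta_usec and timed_send_usec are hypothetical names used for illustration, not netperf functions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>

/* Microseconds elapsed from a to b, in the spirit of delta_micro(). */
long delta_usec(const struct timespec *a, const struct timespec *b)
{
    assert(a != NULL && b != NULL);
    long long ns = (long long)(b->tv_sec - a->tv_sec) * 1000000000LL
                 + (long long)(b->tv_nsec - a->tv_nsec);
    return (long)(ns / 1000);
}

/* Hypothetical helper: times a single send() call the way the excerpt
   brackets it with HIST_timestamp(). Returns the elapsed microseconds,
   or -1 if send() did not accept exactly len bytes. */
long timed_send_usec(int fd, const void *buf, size_t len)
{
    struct timespec t1, t2;

    clock_gettime(CLOCK_MONOTONIC, &t1);   /* "time_one" */
    ssize_t n = send(fd, buf, len, 0);
    clock_gettime(CLOCK_MONOTONIC, &t2);   /* "time_two" */

    if (n != (ssize_t)len)
        return -1;
    return delta_usec(&t1, &t2);
}
```

Nothing in this interval waits for the peer to receive or acknowledge the data, so the value describes the local call only.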

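The send call goes through sock_sendmsg -> inet_sendmsg -> tcp_sendmsg. Sending a message simply copies the data into the send buffer and returns. The only special case is a full send buffer: send then blocks (except in non-blocking mode) until memory becomes available. When there is not enough memory, the following code runs: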

wait_for_memory:
        if (copied)
                tcp_push(sk, flags & ~MSG_MORE, mss_now, TCP_NAGLE_PUSH);

        if ((err = sk_stream_wait_memory(sk, &timeo)) != 0)
                goto do_error;

        mss_now = tcp_send_mss(sk, &size_goal, flags);
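The buffer-full condition can also be observed from user space. The sketch below is a hypothetical helper that switches a socket to non-blocking mode and keeps sending until the kernel refuses more data, which is the point where a blocking send would start to wait. For a self-contained demo it can be run on an AF_UNIX socketpair whose other end is never read; that does not go through tcp_sendmsg, so it only illustrates the behavior, not the TCP code path above.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: makes fd non-blocking, then sends 4 KiB chunks
   until the send buffer is full. Returns the number of bytes the kernel
   accepted, or -1 on an unexpected error. */
long fill_send_buffer(int fd)
{
    assert(fd >= 0);

    char chunk[4096] = {0};
    long accepted = 0;

    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;

    for (;;) {
        ssize_t n = send(fd, chunk, sizeof chunk, 0);
        if (n > 0) {
            accepted += n;          /* data was copied into the buffer */
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;               /* interrupted, just retry */
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            return accepted;        /* buffer full: a blocking send would wait here */
        return -1;
    }
}
```

Once the function returns, any further blocking send on that socket would stall until the receiver drains some data, and that stall is what ends up in the measured time.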

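So the time spent in send depends mainly on sk_stream_wait_memory. To measure how long this function blocks, I used systemtap and wrote an stp script that computes the time a send call spends blocked in sk_stream_wait_memory.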

stap sk_stream_wait_memory.stp

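I then summarized the output with a script: min 3 us, max 20880 us, which is in line with netperf's figures (2 us and 20888 us). So to measure the real network time, you need to enable the TCP timestamp option, which can be set under /proc/sys/net/ipv4/, then capture packets with tcpdump and analyze them with wireshark.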
