Hunting I/O Bottlenecks with iostat


http://www.linuxquestions.org/linux/articles/Jeremys_Magazine_Articles/Hunting_I_O_Bottlenecks_with_iostat

Tech Support
Written by Jeremy Garcia

The October 2004 “Tech Support” column showed you how to track down performance bottlenecks using vmstat. This month, let’s take a closer look at input/output (I/O) issues that you may have identified using vmstat.

The iostat command monitors system I/O device loading by observing the time devices are active in relation to their average transfer rates. The iostat command generates reports that can be used to modify your system configuration to better balance the I/O load between physical disks or to let you know when you have reached the threshold of your current disk subsystem.
Running iostat with no arguments generates a report that contains information since the system was booted. You can provide two optional parameters to change this:

Code:

$ iostat [ interval [ count ] ]

The interval parameter specifies the amount of time in seconds between each report. You can specify the count parameter in conjunction with the interval parameter and control how many reports are generated before iostat exits. When using these arguments, the first report contains information since the system was booted, while each subsequent report covers the time period since the last report.
By default, iostat generates two reports, one for CPU utilization and one for device utilization. You can use the -c option to get just the CPU report or the -d option to get just the device report. Here is the default output from iostat:

Code:

avg-cpu:     %user   %nice    %sys %iowait   %idle
              4.92    0.00    0.78    0.77   93.53

Device:        tps  Blk_read/s  Blk_wrtn/s   Blk_read   Blk_wrtn
sda           1.22        0.19       33.09     690042  121113184
sda1          0.82        0.04       23.16     146122   84755712
sda2          0.40        0.15        9.93     543450   36357472
sdb           1.25        0.27       33.09     986982  121113184
sdb1          0.84        0.04       23.16     160592   84755712
sdb2          0.41        0.23        9.93     825960   36357472

The first three columns of the CPU report show the percentage of CPU utilization that occurred while executing at the user level (applications), at the user level with a nice priority, and at the system level (kernel), respectively. The last two columns show the percentage of time the CPU was idle while it had an outstanding disk I/O request and while it did not have an outstanding disk I/O request.
The columns in the device report section are transfers per second, number of blocks per second read, number of blocks per second written, total number of blocks read, and total number of blocks written. You can use the -k parameter to change the last four columns to be expressed in kilobytes instead of blocks, which makes the report a little more readable. You can get additional extended statistics using the -x parameter, but you must be running either a 2.6 kernel or a patched 2.4 kernel (such as the one shipped with Red Hat Enterprise Linux 3).
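
To make the relationship between the block and kilobyte columns concrete, here is a minimal sketch that converts two device lines from the sample report by hand. It assumes one block is a 512-byte sector (so two blocks make one kilobyte), which is how the sysstat documentation describes blocks on 2.4 and later kernels; the variable names SAMPLE and KB_REPORT are only illustrative.

```shell
#!/bin/sh
# Device lines copied from the sample report above:
# name, tps, Blk_read/s, Blk_wrtn/s, Blk_read, Blk_wrtn
SAMPLE='sda 1.22 0.19 33.09 690042 121113184
sdb 1.25 0.27 33.09 986982 121113184'

# Assumption: one block is one 512-byte sector, so kB = blocks / 2.
# Output columns: name, kB_read/s, kB_wrtn/s, kB_read, kB_wrtn
KB_REPORT=$(printf '%s\n' "$SAMPLE" | awk '{
    printf "%s %.3f %.3f %d %d\n", $1, $3 / 2, $4 / 2, $5 / 2, $6 / 2
}')
printf '%s\n' "$KB_REPORT"
```

In practice iostat -k does this conversion for you; the sketch only shows what the option changes.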

Now that you know what each column means, how do you use this information? First, make sure you run iostat when your system is running slowly or you suspect a problem. Redirect the output of iostat to a file, setting the interval to about 15 seconds and the count to 12-16. This provides a quick snapshot of what’s happening over the span of a few minutes. You don’t want to run iostat too often, as it will start to actually contribute to the load and skew your numbers.
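
The capture described above might look like the following sketch. The log file name and the RUN_CAPTURE switch are hypothetical, added so the script can be read and tried without starting a multi-minute run. Because the first report is printed immediately, a run of COUNT reports spans roughly INTERVAL * (COUNT - 1) seconds.

```shell
#!/bin/sh
INTERVAL=15   # seconds between reports, as suggested above
COUNT=12      # number of reports, the low end of the 12-16 range
LOG="iostat-$(date +%Y%m%d-%H%M%S).log"   # hypothetical file name

# The first report appears at once, so only COUNT - 1 intervals elapse.
DURATION=$(( INTERVAL * (COUNT - 1) ))
echo "Capture would span about ${DURATION} seconds"

# Hypothetical switch: set RUN_CAPTURE=1 to actually start the capture.
if [ "${RUN_CAPTURE:-0}" = "1" ] && command -v iostat >/dev/null 2>&1; then
    iostat -k "$INTERVAL" "$COUNT" > "$LOG"
    echo "Report saved to $LOG"
fi
```

With these values the capture covers a little under three minutes, which matches the "few minutes" the text recommends.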

Now that you have the output, how do you interpret the numbers? As you might guess, reading iostat output takes a bit of experience and an understanding of the underlying principles behind the numbers.
The first thing you should look at is iowait. If the CPU spends a high percentage of its time idle while waiting on disk I/O, that’s a good indicator that you have an I/O bottleneck. Moving on to the device section, you should be able to easily see how I/O is being distributed between disks. Do you have a lot of activity on one disk while another one is sitting idle? If so, you should see if you can move some of the activity from the active disk to the idle disk. You may have a case where all of your available disks are being utilized or you can’t evenly distribute the load among the existing disks. In that case, you need to either add additional disks (if you have the capacity) or replace the current disks with ones that have a faster spindle speed, higher throughput, and lower seek times.
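
As a rough illustration of that reading order, the sketch below scans a saved report, flags any interval whose %iowait exceeds a limit, and lists the per-device transfer rates for just those intervals so an uneven split stands out. The 20 percent limit is an arbitrary assumption, not a figure from this column, and the second report in the sample is invented for illustration (the first repeats the sample output shown earlier).

```shell
#!/bin/sh
# First report: the sample shown earlier. Second report: invented values.
REPORT='avg-cpu: %user %nice %sys %iowait %idle
4.92 0.00 0.78 0.77 93.53

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 1.22 0.19 33.09 690042 121113184
sdb 1.25 0.27 33.09 986982 121113184

avg-cpu: %user %nice %sys %iowait %idle
3.10 0.00 1.20 42.50 53.20

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 95.00 12.00 4100.00 180 61500
sdb 5.00 0.00 80.00 0 1200
'

LIMIT=20   # hypothetical threshold for "high" %iowait, in percent

SUMMARY=$(printf '%s' "$REPORT" | awk -v limit="$LIMIT" '
    /^avg-cpu:/ {
        # Find %iowait by name; the values line has no leading label,
        # so its field index is one less than in the header.
        for (i = 2; i <= NF; i++) if ($i == "%iowait") col = i - 1
        expect_cpu = 1; in_dev = 0; report++; next
    }
    expect_cpu {
        expect_cpu = 0
        high = ($col + 0 > limit)
        if (high) printf "report %d: %%iowait %s exceeds %s\n", report, $col, limit
        next
    }
    /^Device:/ { in_dev = 1; next }
    NF == 0    { in_dev = 0; next }
    in_dev && high { printf "  %s %s transfers/s\n", $1, $2 }
')
printf '%s\n' "$SUMMARY"
```

In the invented second report nearly all transfers land on sda while sdb is close to idle, which is the kind of imbalance the paragraph above suggests correcting.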

Once you are comfortable with iostat you can use the -x parameter to get useful information such as average request size, average wait time for requests and average service time for requests.
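
The sketch below shows one way to pull those three figures out of extended output by looking columns up by name rather than by position. The sample line is invented, and the column names (avgrq-sz, await, svctm) follow the layout of older sysstat releases; other versions name or order the columns differently, so treat the names as assumptions to check against your own iostat -x header.

```shell
#!/bin/sh
# Invented sample in the older sysstat "iostat -x" layout.
XSAMPLE='Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 12.40 0.50 8.20 4.00 164.80 19.40 0.35 40.10 6.20 5.39'

XSUMMARY=$(printf '%s\n' "$XSAMPLE" | awk '
    /^Device/ {
        # Remember where each named column sits in this header.
        for (i = 1; i <= NF; i++) idx[$i] = i
        ok = ("avgrq-sz" in idx) && ("await" in idx) && ("svctm" in idx)
        next
    }
    # Only report when all three assumed columns are present.
    ok && NF > 1 {
        printf "%s avgrq-sz=%s await=%s svctm=%s\n",
               $1, $(idx["avgrq-sz"]), $(idx["await"]), $(idx["svctm"])
    }
')
printf '%s\n' "$XSUMMARY"
```

To run it against live data, replace the printf of XSAMPLE with the output of iostat -x on your own system.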
With a little work, iostat lets you identify I/O bottlenecks and leads you to potential solutions. The numbers may seem overwhelming at first, but with some patience, you’ll be able to use iostat productively in no time. Happy hunting.