latencytop深度了解你的Linux系统的延迟
来源:互联网 发布:福师大网络教育平台 编辑:程序博客网 时间:2024/06/05 10:05
原创文章,转载请注明: 转载自系统技术非业余研究
本文链接地址: latencytop深度了解你的Linux系统的延迟
我们在系统调优或者定位问题的时候,经常会发现多线程程序的效率很低,但是又不知道问题出在哪里,就知道上下文切换很多,但是为什么上下文切换,是谁导致切换,我们就不知道了。上下文切换可以用dstat这样的工具查看,比如:
$dstat
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq|
read
writ| recv send|
in
out | int csw
9 2 87 2 0 1|7398k 31M| 0 0 | 9.8k 11k| 16k 64k
20 4 69 3 0 4| 26M 56M| 34M 172M| 0 0 | 61k 200k
21 5 64 6 0 3| 26M 225M| 35M 175M| 0 0 | 75k 216k
21 5 66 4 0 4| 25M 119M| 34M 173M| 0 0 | 66k 207k
19 4 68 5 0 3| 23M 56M| 33M 166M| 0 0 | 60k 197k
#或者用systemtap脚本来看
$
sudo
stap -e
'global cnt; probe scheduler.cpu_on {cnt<<<1;} probe timer.s(1){printf("%d\n", @count(cnt)); delete cnt;}'
217779
234141
234759
每秒高达200k左右的的上下文切换, 谁能告诉我发生了什么? 好吧,latencytop来救助了!
它的官网:http://www.latencytop.org/
Skipping audio, slower servers, everyone knows the symptoms of latency. But to know what’s going on in the system, what’s causing the latency, how to fix it… that’s a hard question without good answers right now.
LatencyTOP is a Linux* tool for software developers (both kernel and userspace), aimed at identifying where in the system latency is happening, and what kind of operation/action is causing the latency to happen so that the code can be changed to avoid the worst latency hiccups.
它是Intel贡献的另外一个性能查看器,还有一个是powertop,都是很不错的工具.
Latencytop通过在内核上下文切换的时候,记录被切换的进程的内核栈,然后通过匹配内核栈的函数来判断是什么原因导致上下文切换,同时他把几十种容易引起切换的场景的函数都记录起来,这样在判断系统问题的时候能容易定位到问题。
latencytop分成2个部分,内核部分和应用部分。内核部分负责调用栈的收集并且通过/proc来暴露, 应用部分负责显示.
工作界面截图如下:
latencytop在2.6.256后被内核吸收成为其中一部分,只要编译的时候打开该选项就好,如何确认呢?
$
cat
/proc/latency_stats
Latency Top version : v0.1
看到这个就好了, 遗憾的是RHEL6竟然带了latencytop应用部分,而没有打开编译选项,让我们情何以堪呢?
在ubuntu下可以这么安装:
$
uname
-r
2.6.38-yufeng
$ apt-get
install
latencytop
$
sudo
latencytop
#就可以使用了
但是latencytop比较傻的是默认是开图像界面的,我们很不习惯,我们要文本界面, 自己动手把!
$ apt-get
source
latencytop
$
diff
-up Makefile.orig Makefile
--- Makefile.orig 2011-03-29 20:10:29.025845447 +0800
+++ Makefile 2011-03-28 14:48:11.232318002 +0800
@@ -1,5 +1,5 @@
# FIXME: Use autoconf ?
-HAS_GTK_GUI = 0
+
#HAS_GTK_GUI = 0
DESTDIR =
SBINDIR = /usr/sbin
重新make下就好了, 文本界面出现了. 具体使用参看 man latencytop。
fcicq同学说:
加个 –nogui 参数就好了. 不需要重新编译.
谢谢!
好了,那么latencytop支持多少种的延迟原因呢?让latencytop.trans告诉你,我们也可以自己修改这个文件,把新的延迟原因加上去。
$
cat
/usr/share/latencytop/latencytop.trans
#
1 vfs_read Reading from
file
1 vfs_write Writing to
file
1 __mark_inode_dirty Marking inode dirty
1 vfs_readdir Reading directory content
1 vfs_unlink Unlinking
file
1 blocking_notifier_call_chain Blocking notifier
1 lock_super Superblock lock contention
1 vfs_create Creating a
file
1 KAS_ScheduleTimeout Binary AMD driver delay
1 firegl_lock_device Binary AMD driver delay
#
2 __bread Synchronous buffer
read
2 do_generic_mapping_read Reading
file
data
2 sock_sendmsg Sending data over socket
2 do_sys_open Opening
file
2 do_sys_poll Waiting
for
event (poll)
2 core_sys_select Waiting
for
event (
select
)
2 proc_reg_read Reading from /proc
file
2 __pollwait Waiting
for
event (poll)
2 sys_fcntl FCNTL system call
2 scsi_error_handler SCSI error handler
2 proc_root_readdir Reading /proc directory
2 ksoftirqd Waking ksoftirqd
2 worker_thread .
2 do_unlinkat Unlinking
file
2 __wait_on_buffer Waiting
for
buffer IO to complete
2 pdflush pdflush() kernel thread
2 kjournald kjournald() kernel thread
2 blkdev_ioctl block device IOCTL
2 kauditd_thread kernel audit daemon
2 tty_ioctl TTY IOCTL
2 __filemap_fdatawrite_range fdatasync system call
2 do_sync_write synchronous write
2 kthreadd kthreadd kernel thread
2 usb_port_resume Waking up USB device
2 usb_autoresume_device Waking up USB device
2 kswapd kswapd() kernel thread
2 md_thread Raid resync kernel thread
2 i915_wait_request Waiting
for
GPU
command
to complete
2 request_module Loading a kernel module
#
3 tty_wait_until_sent Waiting
for
TTY to finish sending
3 pipe_read Reading from a pipe
3 pipe_write Writing to a pipe
3 pipe_wait Waiting
for
pipe data
3 read_block_bitmap Reading EXT3 block bitmaps
3 scsi_execute_req Executing raw SCSI
command
3 sys_wait4 Waiting
for
a process to die
3 sr_media_change Checking
for
media change
3 sr_do_ioctl SCSI cdrom ioctl
3 sd_ioctl SCSI disk ioctl
3 sr_cd_check Checking CDROM media present
3 ext3_read_inode Reading EXT3 inode
3 htree_dirblock_to_tree Reading EXT3 directory htree
3 ext3_readdir Reading EXT3 directory
3 ext3_bread Synchronous EXT3
read
3 ext3_free_branches Unlinking
file
on EXT3
3 ext3_get_branch Reading EXT3 indirect blocks
3 ext3_find_entry EXT3: Looking
for
file
3 __ext3_get_inode_loc Reading EXT3 inode
3 ext3_delete_inode EXT3 deleting inode
3 sync_page Writing a page to disk
3 tty_poll Waiting
for
TTY data
3 tty_read Waiting
for
TTY input
3 tty_write Writing data to TTY
3 update_atime Updating inode atime
3 page_cache_sync_readahead Pagecache
sync
readahead
3 do_fork Fork() system call
3 sys_mkdirat Creating directory
3 lookup_create Creating
file
3 inet_sendmsg Sending TCP/IP data
3 tcp_recvmsg Receiving TCP/IP data
3 link_path_walk Following
symlink
3 path_walk Walking directory tree
3 sys_getdents Reading directory content
3 unix_stream_recvmsg Waiting
for
data on unix socket
3 ext3_mkdir EXT3: Creating a directory
3 journal_get_write_access EXT3: Waiting
for
journal access
3 synchronize_rcu Waiting
for
RCU
3 input_close_device Closing input device
3 mousedev_close_device Closing mouse device
3 mousedev_release Closing mouse device
3 mousedev_open Opening mouse device
3 kmsg_read Reading from dmesg
3 sys_futex Userspace lock contention
3 do_futex Userspace lock contention
3 vt_waitactive vt_waitactive IOCTL
3 acquire_console_sem Waiting
for
console access
3 filp_close Closing a
file
3 sync_inode (f)syncing an inode to disk
3 ata_exec_internal_sg Executing internal ATA
command
3 writeback_inodes Writing back inodes
3 ext3_orphan_add EXT3 adding orphan
3 ext3_mark_inode_dirty EXT3 marking inode dirty
3 ext3_unlink EXT3 unlinking
file
3 ext3_create EXT3 Creating a
file
3 log_do_checkpoint EXT3 journal checkpoint
3 generic_delete_inode Deleting an inode
3 proc_delete_inode Removing /proc
file
3 do_truncate Truncating
file
3 sys_execve Executing a program
3 journal_commit_transaction EXT3: committing transaction
3 __stop_machine_run Freezing the kernel (
for
module load)
3 sys_munmap unmapping memory
3 sys_mmap mmaping memory
3 sync_buffer Writing buffer to disk (synchronous)
3 inotify_inode_queue_event Inotify event
3 proc_lookup Looking up /proc
file
3 generic_make_request Creating block layer request
3 get_request_wait Creating block layer request
3 alloc_page_vma Allocating a VMA
#3 __d_lookup Looking up a dentry
3 blkdev_direct_IO Direct block device IO
3 sys_mprotect mprotect() system call
3 shrink_icache_memory reducing inode cache memory footprint
3 vfs_stat_fd stat() operation
3 cdrom_open opening cdrom device
3 sys_epoll_wait Waiting
for
event (epoll)
3 sync_sb_inodes Syncing inodes
3 tcp_connect TCP/IP connect
3 ata_scsi_ioctl ATA/SCSI disk ioctl
3 do_rmdir Removing directory
3 vfs_rmdir Removing directory
3 sys_flock flock() on a
file
3 usbdev_open opening USB device
3 lock_kernel Big Kernel Lock contention
3 blk_execute_rq Submitting block IO
3 scsi_cmd_ioctl SCSI ioctl
command
3 acpi_ec_transaction ACPI hardware access
3 journal_get_undo_access Waiting
for
EXT3 journal undo operation
3 i915_irq_wait Waiting
for
GPU interrupt
3 i915_gem_throttle_ioctl Throttling GPU
while
waiting
for
commands
#
#
5 do_page_fault Page fault
5 handle_mm_fault Page fault
5 filemap_fault Page fault
5 sync_filesystems Syncing filesystem
5 sys_nanosleep Application requested delay
5 sys_pause Application requested delay
5 evdev_read Reading keyboard/mouse input
5 do_fsync fsync() on a
file
(
type
'F'
for
details)
5 __log_wait_for_space Waiting
for
EXT3 journal space
延迟原因非常的详细.
本来到这里,我要介绍的要介绍了,但是且慢,由于这个东西要在2.6.26后的系统上使用,我们的线上系统大部分是RHEL 5U4, 2.6.18的, 我们如何使用呢?
这时候 systemtap 一如既往的前来救助了!
systemtap 1.4版本以后带了个latencytop.stp, 也是intel的贡献. 那我们试验下穷人家的latencytop.
它在那里呢?
$
uname
-r
2.6.18-164.el5
$ stap -V
Systemtap translator/driver (version 1.5 /0.137 non-git sources)
Copyright (C) 2005-2011 Red Hat, Inc. and others
This is
free
software; see the
source
for
copying conditions.
enabled features: AVAHI LIBRPM LIBSQLITE3 NSS BOOST_SHARED_PTR TR1_UNORDERED_MAP NLS
$
ls
-al /usr/share/doc/systemtap/examples/profiling/latencytap.stp
-rwxr-xr-x 1 chuba
users
16240 Feb 17 22:02/usr/share/doc/systemtap/examples/profiling/latencytap.stp
$
sudo
stap -t --all-modules /usr/share/doc/systemtap/examples/profiling/latencytap.stp
ERROR: Skipped too many probes, check MAXSKIPPED or try again with stap -t
for
more
details.
WARNING: Number of errors: 0, skipped probes: 101
WARNING: Skipped due to global
'dequeue'
lock timeout: 2
WARNING: Skipped due to global
'this_sleep'
lock timeout: 99
----- probe hit report:
kernel.trace(
"deactivate_task"
)!, (/usr/share/doc/systemtap/examples/profiling/latencytap.stp:47:1), hits: 254, cycles: 680min/43327avg/2248467max, from: kernel.trace(
"deactivate_task"
)
kernel.trace(
"activate_task"
)!, (/usr/share/doc/systemtap/examples/profiling/latencytap.stp:58:1), hits: 255, cycles: 890min/502549avg/2271568max, from: kernel.trace(
"activate_task"
)
kernel.
function
(
"finish_task_switch@kernel/sched.c:1969"
)?, (/usr/share/doc/systemtap/examples/profiling/latencytap.stp:78:7), hits: 509, cycles: 213min/1002207avg/5382852max, from: kernel.
function
(
"finish_task_switch"
) from: scheduler.cpu_on
begin, (/usr/share/doc/systemtap/examples/profiling/latencytap.stp:123:1), hits: 1, cycles: 1802min/1802avg/1802max, from: begin
begin, (/usr/share/doc/systemtap/examples/profiling/latencytap.stp:131:1), hits: 1, cycles: 227979min/227979avg/227979max, from: begin
Pass 5: run failed. Try again with another
'--vp 00001'
option.
出错了! 原因是lock timeout, 原来stap的全局变量是用锁保护的,现在超时了!知道原因好办,打个patch吧!
$
diff
-up translate.cxx.orig translate.cxx
--- translate.cxx.orig 2011-03-22 21:26:52.000000000 +0800
+++ /translate.cxx 2011-03-29 20:31:28.000000000 +0800
@@ -5802,10 +5802,10 @@ translate_pass (systemtap_session& s)
s.
op
->newline() <<
"#define MAXACTION_INTERRUPTIBLE (MAXACTION * 10)"
;
s.
op
->newline() <<
"#endif"
;
s.
op
->newline() <<
"#ifndef TRYLOCKDELAY"
;
- s.
op
->newline() <<
"#define TRYLOCKDELAY 10 /* microseconds */"
;
+ s.
op
->newline() <<
"#define TRYLOCKDELAY 50 /* microseconds */"
;
s.
op
->newline() <<
"#endif"
;
s.
op
->newline() <<
"#ifndef MAXTRYLOCK"
;
- s.
op
->newline() <<
"#define MAXTRYLOCK 100 /* 1 millisecond total */"
;
+ s.
op
->newline() <<
"#define MAXTRYLOCK 500 /* 1 millisecond total */"
;
s.
op
->newline() <<
"#endif"
;
s.
op
->newline() <<
"#ifndef MAXMAPENTRIES"
;
s.
op
->newline() <<
"#define MAXMAPENTRIES 2048"
;
#编译安装后再来一次
$
sudo
stap --all-modules /usr/share/doc/systemtap/examples/profiling/latencytap.stp
ERROR: probe overhead exceeded threshold
WARNING: Number of errors: 1, skipped probes: 0
Pass 5: run failed. Try again with another
'--vp 00001'
option.
#又错了,这次原因是probe overhead exceeded threshold, 看下代码我们知道,脚本的开销太大了,超过正常的负载,通过查看代码可以用STP_NO_OVERLOAD来解除这个限制
#再来一次
$
sudo
stap -DSTP_NO_OVERLOAD --all-modules -DMAXSKIPPED=1024 /usr/share/doc/systemtap/examples/profiling/latencytap.stp
Reason Count Average(us) Maximum(us) Percent%
Userspace lock contention 345 16409195 83258717 45%
1453 867513 60231852 10%
95 7391754 33821926 5%
migration() kernel thread 1733 402701 3571412 5%
7239 87993 401552 5%
Reading from a pipe 212 2922207 52151180 4%
142 2267850 17990214 2%
108 2457247 7494331 2%
Waking ksoftirqd 16 16082822 59266312 2%
Waiting
for
event (
select
) 99 2113310 28510974 1%
kjournald() kernel thread 148 1313447 13983084 1%
Application requested delay 94 1059898 10011409 0%
41 2391993 7618788 0%
Waiting
for
event (
select
) 38 2259444 29057362 0%
719 92947 584944 0%
Waiting
for
event (poll) 1 57582711 57582711 0%
Application requested delay 3 19030709 36000553 0%
Waiting
for
event (
select
) 39 1341880 5847683 0%
34 936628 6649350 0%
5 6163603 10008484 0%
...
这次看到结果了,哈哈,小高兴一把。但是在繁忙的系统上这个脚本的资源占用特别多,也是不爽的。 幸运的是这个脚本支持查看某个进程的延迟情况, 就是在 latencytap.stp 后面加个-x 参数。
这个脚本设计应该是支持进程ID, 但是结果写成了线程ID,属于bug!!!
动手改下吧:
$
diff
-u latencytap.stp.orig latencytap.stp
--- latencytap.stp.orig 2011-02-17 22:02:40.000000000 +0800
+++ latencytap.stp 2011-03-29 20:43:51.000000000 +0800
@@ -15,7 +15,7 @@
global this_sleep;
global debug = 0;
-
function
log_event:long (p:long) {
return
(!traced_pid || traced_pid == p) }
+
function
log_event:long (p:long) {
return
(!traced_pid || traced_pid == task_pid(p)) }
#func names from hex addresses
function
func_backtrace:string (ips:string)
@@ -50,14 +50,14 @@
# check to see if task is in appropriate state:
# TASK_INTERRUPTIBLE 1
# TASK_UNINTERRUPTIBLE 2
-
if
(log_event($p->pid) && (s & 3)) {
+
if
(log_event($p) && (s & 3)) {
dequeue[$p] = gettimeofday_us();
}
}
probe kernel.trace(
"activate_task"
) !,
kernel.
function
(
"activate_task"
) {
-
if
(!log_event($p->pid)) next
+
if
(!log_event($p)) next
a = gettimeofday_us()
d = dequeue[$p]
#再来一次
$
sudo
stap --all-modules /usr/share/doc/systemtap/examples/profiling/latencytap.stp -x $$
...
#如果发现出来的Reason是空行, 就把latencytap.stp里面的debug=0, 改成debug=1
这下终于爽了,旧内核用systemtap版本的,新内核用内核版本的,世界和谐!
通过对线上MySQL的诊断发现大部分时间花在mutex锁的竞争上来,我说过了,我会收拾你的,等着瞧!
玩得开心!
- latencytop深度了解你的Linux系统的延迟
- latencytop深度了解你的Linux系统的延迟
- latencytop深度了解你的Linux系统的延迟
- latencytop深度了解你的Linux系统的延迟
- Cenos 6 下通过latencytop了解系统延迟
- 用日志了解你的 Linux 系统
- blktrace 深度了解linux系统的IO运作
- blktrace 深度了解linux系统的IO运作
- 用系统日志了解你的 Linux 系统
- Linux系统编程的学习路线,你了解吗?
- 分布式系统,你真的了解吗?
- 让你系统的了解shell
- 深度学习你需要了解的名词(中英文对照)
- Shell脚本编程——了解你的Linux系统必须掌握的20个命令
- 你需要了解的嵌入式Linux
- 深入了解Linux的系统目录
- 了解linux下的系统调用
- NO.3 linux的系统延迟机制
- ContentProvider跨APP读取数据失败
- Binder的理解
- MATLAB之极坐标绘图
- 《思维导图》总结
- 修改Android系统目录的读写权限命令
- latencytop深度了解你的Linux系统的延迟
- mysql计算两点经纬度之间的距离
- UIGestureRecognizer 手势识别
- 13. Roman to Integer
- Android中JNI起源及存在的必要性
- python实现二分查找
- Python 高级编程笔记之低于类别
- 安卓文件上传
- 文件操作