Linux system down: three go-to troubleshooting moves

Source: Internet  Published: 阿里云 oa  Editor: 程序博客网  Time: 2024/05/16 06:40
Disk: focus on the %util metric (a value that stays close to 100% means the device is saturated)
[root@xxxxxx ~]# iostat  -x
Linux 2.6.32-431.23.3.el6.x86_64 (xxx)         07/20/2017      _x86_64_        (8 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.08    0.00    0.43    0.26    0.00   97.23

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
xvda              0.00    11.00    1.74   10.20    85.66   169.63    21.38     0.26   22.09   1.03   1.23
xvdb              0.00    16.59    2.17   15.16   156.69   253.94    23.70     0.33   18.99   0.80   1.38
dm-0              0.00     0.00    0.00    0.00     0.00     0.00     8.00     0.00    0.00   0.00   0.00
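
The %util check above can be scripted. The following is a minimal sketch, assuming the `iostat -x` layout shown in the transcript, where %util is the last column of each device row. The function name `busy_disks` and the default threshold of 80 are illustrative assumptions, not part of the original post.

```shell
# Print "device %util" for every device whose %util is at or above a limit.
# Reads `iostat -x` output on stdin. Assumes %util is the last column.
busy_disks() {
  awk -v limit="${1:-80}" '
    /^Device/ { in_dev = 1; next }        # header row starts a device table
    NF == 0   { in_dev = 0 }              # a blank line ends the table
    in_dev && $NF + 0 >= limit { print $1, $NF }
  '
}

# Usage (run by hand): sample every second, three times, flag disks >= 80%
#   iostat -x 1 3 | busy_disks 80
```

Passing an interval such as `1 3` is usually more informative than a bare `iostat -x`, whose single report shows averages since boot.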


Network
[root@xxxx ~]# netstat -n |awk '/^tcp/{++S[$NF]}END{for(a in S)print a,S[a]}'
TIME_WAIT 56
CLOSE_WAIT 399
FIN_WAIT2 1
ESTABLISHED 418

Mainly watch ESTABLISHED. If it exceeds 1000, check the nginx or tomcat logs and block the IPs that are sending malicious requests.
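
To decide which IPs to look up in the logs, it can help to rank remote addresses by their number of ESTABLISHED connections. This is a minimal sketch, assuming the `netstat -n` column layout where the foreign address is field 5 and the state is the last field. The function name `top_established_ips` is a hypothetical helper.

```shell
# Rank remote IPs by number of ESTABLISHED TCP connections.
# Reads `netstat -n` output on stdin; $1 = how many IPs to show (default 10).
top_established_ips() {
  awk '
    /^tcp/ && $NF == "ESTABLISHED" {
      ip = $5
      sub(/:[0-9]+$/, "", ip)             # strip the trailing :port
      count[ip]++
    }
    END { for (ip in count) print count[ip], ip }
  ' | sort -rn | head -n "${1:-10}"
}

# Usage (run by hand):
#   netstat -n | top_established_ips 10
```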

If TIME_WAIT exceeds 1000, tune the kernel TCP parameters in /etc/sysctl.conf.
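
The original post does not say which parameters to change. The fragment below is one possible starting point. The parameter names are real kernel settings, but the values are illustrative assumptions and should be validated against your own workload.

```conf
# /etc/sysctl.conf (apply with: sysctl -p)

# Allow reuse of TIME_WAIT sockets for new outgoing connections
net.ipv4.tcp_tw_reuse = 1

# Cap the number of TIME_WAIT sockets the kernel keeps (assumed value)
net.ipv4.tcp_max_tw_buckets = 5000

# Widen the ephemeral port range for outgoing connections (assumed value)
net.ipv4.ip_local_port_range = 1024 65000
```

`net.ipv4.tcp_tw_recycle` is deliberately left out: it breaks clients behind NAT and was removed in Linux 4.12.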


Memory
[root@xxxxxx ~]# free -m
             total       used       free     shared    buffers     cached
Mem:         15948      15411        537          0        159       5876
-/+ buffers/cache:       9375       6572
Swap:            0          0          0
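
In the output above, the "Mem:" row counts buffers and page cache as used, so the "-/+ buffers/cache" row is the one that reflects what applications actually hold. This is a minimal sketch that turns that row into a percentage. It assumes the older `free` layout shown here (newer procps versions drop this row and print an "available" column instead), and `real_mem_used_pct` is a hypothetical name.

```shell
# Print the percentage of RAM used by applications, excluding buffers/cache.
# Reads old-style `free -m` output on stdin.
real_mem_used_pct() {
  awk '
    $1 == "Mem:" { total = $2 }
    $1 == "-/+"  { used = $3 }            # row: "-/+ buffers/cache: used free"
    END {
      if (total > 0 && used != "") {
        printf "%d\n", used * 100 / total # truncated to a whole percent
      } else {
        print "unsupported free output" > "/dev/stderr"
        exit 1
      }
    }
  '
}

# Usage (run by hand):
#   free -m | real_mem_used_pct
```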


CPU
top
top - 16:55:07 up 35 days,  3:44,  9 users,  load average: 0.33, 0.26, 0.21
Tasks: 251 total,   2 running, 249 sleeping,   0 stopped,   0 zombie
Cpu(s): 12.0%us,  0.9%sy,  0.0%ni, 86.7%id,  0.0%wa,  0.1%hi,  0.3%si,  0.0%st
Mem:  16330912k total, 15796064k used,   534848k free,   163380k buffers
Swap:        0k total,        0k used,        0k free,  6021828k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
16941 root      20   0 8206m 1.0g  12m S 64.8  6.6  13:02.45 java
17198 root      20   0 6719m 461m  14m S 16.6  2.9   5:26.95 java
16690 root      20   0 6747m 438m  13m S  5.3  2.7   3:41.55 java
18915 nobody    20   0 70672  26m 1080 S  4.0  0.2   0:10.54 nginx
18917 nobody    20   0 70672  25m 1060 S  3.0  0.2   0:10.63 nginx
16753 root      20   0 6246m 460m  12m S  2.0  2.9   1:01.34 java
17173 root      20   0 6785m 454m  12m S  2.0  2.9   1:18.24 java
 2666 rabbitmq  20   0 2635m 136m 2032 S  1.7  0.9 379:51.28 beam.smp
17256 root      20   0 6719m 443m  12m S  1.3  2.8   1:17.54 java
17902 root      20   0 6720m 326m  11m S  1.0  2.0   0:37.88 java
16649 root      20   0 6655m 545m  12m S  0.7  3.4   1:23.51 java
16804 root      20   0 6766m 402m  12m S  0.7  2.5   1:06.29 java
16938 root      20   0  4120  596  496 S  0.7  0.0   0:20.26 cronolog
17225 root      20   0 6719m 391m  12m S  0.7  2.5   1:22.49 java
25279 root      20   0  162m  10m 4172 S  0.7  0.1 122:14.72 AliYunDun
   39 root      20   0     0    0    0 R  0.3  0.0 133:28.05 events/4
 1144 root      20   0 2380m  64m 3256 S  0.3  0.4 324:17.22 java
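
A load average is easier to judge when divided by the number of CPUs: the 0.33 above is spread over the 8 CPUs that the iostat header reports. This is a minimal sketch. `load_per_core` is a hypothetical helper, and treating a sustained value above roughly 1.0 as a sign of saturation is a rule of thumb, not a hard limit.

```shell
# Print the load average per CPU core, to two decimals.
# $1 = load average, $2 = number of CPUs
load_per_core() {
  awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.2f\n", load / cpus }'
}

# Usage (run by hand, Linux only):
#   load_per_core "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
```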


System brought down by crawlers
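# Collect the IPs of requests whose log line contains "Spider".
# $10 depends on your log_format; adjust it to the field that holds the client IP.
# sort -u removes all duplicates (plain uniq only drops adjacent ones).
awk -F "\"" '/Spider/{print $10}' access.log | sort -u | tee -a /root/IP.txt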

In nginx.conf, add:

deny 42.236.10.114;
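
Writing one deny line by hand does not scale once the IP list grows. This is a minimal sketch that turns a file with one IP per line into nginx deny rules. The function name `make_deny_rules` and the include file path in the usage comment are assumptions, not part of the original setup.

```shell
# Convert a list of IPs (one per line, on stdin) into nginx "deny" rules.
# Duplicates and blank lines are dropped.
make_deny_rules() {
  sort -u | awk 'NF { print "deny " $1 ";" }'
}

# Usage (run by hand; blockips.conf is a hypothetical file that nginx.conf
# would pull in with an `include` directive):
#   make_deny_rules < /root/IP.txt > /etc/nginx/blockips.conf
#   nginx -t && nginx -s reload
```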




Network bandwidth

Monitor overall bandwidth usage: run nload -u K, then press Enter to switch to the next network device

A "10M" bandwidth plan is measured in bits: 10 MBit/s = 10 Mbps, which is about 1.25 MByte/s
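
The bit/byte conversion above, and the question of how full the link is, can both be worked out with a little arithmetic. This is a minimal sketch in which both function names are hypothetical helpers.

```shell
# Convert MBit/s to MByte/s (8 bits per byte).
mbit_to_mbyte() {
  awk -v mbit="$1" 'BEGIN { printf "%.2f\n", mbit / 8 }'
}

# Percentage of link capacity in use.
# $1 = observed rate in MBit/s, $2 = link capacity in MBit/s
link_usage_pct() {
  awk -v cur="$1" -v cap="$2" 'BEGIN { printf "%.0f\n", cur * 100 / cap }'
}

# Usage (run by hand):
#   mbit_to_mbyte 10        # a 10 MBit/s plan expressed in MByte/s
#   link_usage_pct 5.62 10  # an average rate of 5.62 MBit/s on a 10 MBit/s link
```

A peak rate at or slightly above the plan's capacity suggests the link itself has become the bottleneck.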

Device eth0 [10.139.40.112] (2/4):
================================================================================
Incoming:
                                                # .
                                               ## #
                                               ## #
                                               ## #
                                               ## ##
                                               ##.##   Curr: 485.60 kBit/s
                                             | #####   Avg: 5.62 MBit/s
                                             #######   Min: 485.60 kBit/s
                                             #######   Max: 10.11 MBit/s
                                             #######.  Ttl: 855.63 GByte
Outgoing:


                                                 ...
                                             |   ###
                                             #. |###
                                             ## ####   Curr: 3.11 MBit/s
                                             ## ####   Avg: 5.03 MBit/s
                                             ## #####  Min: 1.51 MBit/s
                                             ##.#####  Max: 7.70 MBit/s
                                             ########  Ttl: 804.93 GByte

Per-process bandwidth usage: nethogs
NetHogs version 0.8.5

    PID USER     PROGRAM                    DEV        SENT      RECEIVED
  13481 root     /opt/t111/jdk7/bin/java   eth0       29.119     178.401 KB/sec
  13275 root     /opt/t111/jdk7/bin/java   eth0        2.920     148.093 KB/sec
  15137 nobody   nginx: worker process      eth0      225.527      84.252 KB/sec
  13793 root     /opt/t111/jdk7/bin/java   eth1       69.167       3.456 KB/sec
  14642 root     /opt/t111/jdk7/bin/java   eth0        0.495       2.273 KB/sec
  18639 root     sshd: root@pts/0           eth0        0.928       0.045 KB/sec
  25279 root     ..sr/local/aegis/aegis_cl  eth1        0.022       0.023 KB/sec
      ? root     ...139.40.111:35440-10.25              0.000       0.000 KB/sec
      ? root     ...139.40.111:35433-10.25              0.000       0.000 KB/sec
  15135 nobody   nginx: worker process      eth0        0.000       0.000 KB/sec
      ? root     ...139.40.111:35434-10.25              0.000       0.000 KB/sec
      ? root     ...139.40.111:35421-10.25              0.000       0.000 KB/sec
      ? root     ...139.40.111:35419-10.25              0.000       0.000 KB/sec
      ? root     ...139.40.111:80-10.158.2              0.000       0.000 KB/sec
  17902 root     /opt/t111/jdk7/bin/java   eth1        0.000       0.000 KB/sec
      ? root     ...139.40.111:35412-10.25              0.000       0.000 KB/sec
      ? root     ...139.40.111:35401-10.25              0.000       0.000 KB/sec
      ? root     ...139.40.111:35375-10.25              0.000       0.000 KB/sec
      ? root     ...139.40.111:35374-10.25              0.000       0.000 KB/sec
      ? root     ...139.40.111:35372-10.25              0.000       0.000 KB/sec
  TOTAL                                               328.178     416.543 KB/sec
