Monitoring MySQL with Nagios

Source: Internet | Published by: 茅场晶彦知乎 | Editor: 程序博客网 | Date: 2024/05/21 23:33

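There are quite a few Nagios plugins for monitoring MySQL. My personal favorite is check_mysql_health, which this post introduces.

URL: https://labs.consol.de/nagios/check_mysql_health

The plugin supports the following monitoring modes: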

connection-time        (Time to connect to the server)
uptime                 (Time the server is running)
threads-connected      (Number of currently open connections)
threadcache-hitrate    (Hit rate of the thread-cache)
threads-created        (Number of threads created per sec)
threads-running        (Number of currently running threads)
threads-cached         (Number of currently cached threads)
connects-aborted       (Number of aborted connections per sec)
clients-aborted        (Number of aborted connections (because the client died) per sec)
slave-lag              (Seconds behind master)
slave-io-running       (Slave io running: Yes)
slave-sql-running      (Slave sql running: Yes)
qcache-hitrate         (Query cache hitrate)
qcache-lowmem-prunes   (Query cache entries pruned because of low memory)
keycache-hitrate       (MyISAM key cache hitrate)
bufferpool-hitrate     (InnoDB buffer pool hitrate)
bufferpool-wait-free   (InnoDB buffer pool waits for clean page available)
log-waits              (InnoDB log waits because of a too small log buffer)
tablecache-hitrate     (Table cache hitrate)
table-lock-contention  (Table lock contention)
index-usage            (Usage of indices)
tmp-disk-tables        (Percent of temp tables created on disk)
table-fragmentation    (Show tables which should be optimized)
open-files             (Percent of opened files)
slow-queries           (Slow queries)
long-running-procs     (Long running processes)
cluster-ndbd-running   (ndbd nodes are up and running)
sql                    (Any SQL command returning a single number)

As the list shows, check_mysql_health not only monitors whether MySQL is running properly, it also exposes the metrics needed for tuning MySQL.

Here we only monitor the basic runtime items: the number of connections, replication status on the slave, and long-running processes.

1. Installation

[root@nagios ~]# cd /usr/local/src
[root@nagios src]# wget https://labs.consol.de/assets/downloads/nagios/check_mysql_health-2.2.1.tar.gz
[root@nagios src]# tar -zxvf check_mysql_health-2.2.1.tar.gz
[root@nagios src]# cd check_mysql_health-2.2.1
[root@nagios check_mysql_health-2.2.1]# ./configure --prefix=/usr/local/nagios --with-nagios-user=nagios --with-nagios-group=nagios --with-perl=/usr/bin/perl
[root@nagios check_mysql_health-2.2.1]# make && make install

The check_mysql_health plugin is now installed in /usr/local/nagios/libexec.

2. Granting privileges

To monitor MySQL, check_mysql_health needs a dedicated monitoring account on the MySQL server. The official documentation suggests:

GRANT USAGE ON *.* TO 'nagios'@'10.10.1.73' IDENTIFIED BY 'nagiospassword';
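
However, testing showed that when the slave-lag, slave-io-running, or slave-sql-running modes are run against a slave, the plugin reports "CRITICAL - unable to get replication info". This happens because an account that holds only the USAGE privilege cannot run SHOW SLAVE STATUS.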

The monitoring account on the slave therefore needs the REPLICATION CLIENT or SUPER privilege; on the master, USAGE is sufficient. For example:

grant replication client ON *.* TO 'nagios'@'X.X.X.X' IDENTIFIED BY 'test';
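
Before wiring the plugin into Nagios, it can help to run it once by hand and confirm that the new grant is enough for the replication modes. The following is a minimal sketch rather than part of the original setup: the address 192.0.2.10 is a hypothetical slave host, the plugin path is the install location from step 1, and the account and password are the ones from the GRANT statement above. The exit-code mapping follows the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```shell
#!/bin/sh
# Translate a plugin exit code into the state name Nagios would display.
state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;   # 3 is the conventional UNKNOWN code; other values are treated the same here
    esac
}

PLUGIN=/usr/local/nagios/libexec/check_mysql_health   # install path from step 1
DB_HOST=${DB_HOST:-192.0.2.10}                        # hypothetical slave address, override via environment

if [ -x "$PLUGIN" ]; then
    # Uses the monitoring account created by the GRANT statement above.
    "$PLUGIN" --hostname "$DB_HOST" --username nagios --password test --mode slave-io-running
    rc=$?
    echo "exit code $rc => $(state_name "$rc")"
else
    echo "plugin not found at $PLUGIN; install it first"
fi
```

If the account still lacks the replication privilege, the run should end with the "unable to get replication info" message described above and a CRITICAL state.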

3. Configuring the check command

define command {
    command_name check_mysql_health
    command_line $USER1$/check_mysql_health --hostname $HOSTADDRESS$ --username $USER3$ --password $USER4$ --mode $ARG1$ --warning $ARG2$ --critical $ARG3$
}
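
To make the role of $ARG1$, $ARG2$ and $ARG3$ concrete, here is a small illustration of how a check_command value such as check_mysql_health!slave-lag!10!20 maps onto the command line: the value is split on "!", the first field names the command, and the remaining fields fill the $ARGn$ macros in order. This is only a sketch of the substitution, not Nagios' own implementation; the function names replace_macro and expand_check_command are made up for the example, and the template uses the plugin's long option names --mode, --warning and --critical.

```shell
#!/bin/sh
# Replace every occurrence of one $NAME$ macro in a string (helper for the sketch).
replace_macro() {
    # $1 = text, $2 = macro name without dollar signs (e.g. ARG1), $3 = value
    text=$1 macro="\$$2\$" value=$3
    out=
    while :; do
        case $text in
            *"$macro"*)
                out=$out${text%%"$macro"*}$value   # keep the part before the macro, append the value
                text=${text#*"$macro"}             # continue with the part after the macro
                ;;
            *)  break ;;
        esac
    done
    printf '%s\n' "$out$text"
}

# Fill $ARG1$..$ARG3$ in a command_line template from a check_command value.
expand_check_command() {
    # $1 = command_line template, $2 = check_command value such as name!arg1!arg2
    line=$1
    old_ifs=$IFS
    IFS='!'
    set -f              # no filename globbing while splitting
    set -- $2           # split the check_command on "!"
    set +f
    IFS=$old_ifs
    shift               # drop the command name; $1.. now hold ARG1..
    n=1
    while [ "$n" -le 3 ]; do
        line=$(replace_macro "$line" "ARG$n" "${1:-}")   # missing arguments expand to nothing
        [ "$#" -gt 0 ] && shift
        n=$((n + 1))
    done
    printf '%s\n' "$line"
}

template='$USER1$/check_mysql_health --hostname $HOSTADDRESS$ --username $USER3$ --password $USER4$ --mode $ARG1$ --warning $ARG2$ --critical $ARG3$'
expand_check_command "$template" 'check_mysql_health!slave-lag!10!20'
```

The call prints the template with "--mode slave-lag --warning 10 --critical 20" at the end, while $USER1$, $HOSTADDRESS$, $USER3$ and $USER4$ are left untouched because Nagios resolves them from other sources. A check_command without thresholds, such as check_mysql_health!slave-io-running, leaves $ARG2$ and $ARG3$ empty.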

The $USER3$ and $USER4$ macros, which hold the MySQL monitoring account and its password, are defined in resource.cfg:

vim /usr/local/nagios/etc/resource.cfg

# Sets $USER1$ to be the path to the plugins
$USER1$=/usr/local/nagios/libexec

# Sets $USER2$ to be the path to event handlers
#$USER2$=/usr/local/nagios/libexec/eventhandlers

# Store some usernames and passwords (hidden from the CGIs)
# Credentials check_mysql_health uses to monitor MySQL
$USER3$=nagios
$USER4$=test

4. Configuring the services

define service{
    use                     local-service
    service_description     MYSQL slave-io
    host_name               mysql-test-slave
    check_command           check_mysql_health!slave-io-running
    service_groups          mysql_services
    check_interval          10
    notifications_enabled   1
    notification_interval   0
    contact_groups          admins
}

define service{
    use                     local-service
    service_description     MYSQL slave-sql
    host_name               mysql-test-slave
    check_command           check_mysql_health!slave-sql-running
    service_groups          mysql_services
    check_interval          10
    notifications_enabled   1
    notification_interval   0
    contact_groups          admin
}

define service{
    use                     local-service,srv-pnp
    service_description     MYSQL slave-lag
    host_name               mysql-test-slave
    check_command           check_mysql_health!slave-lag!10!20
    service_groups          mysql_services
    check_interval          10
    notifications_enabled   1
    notification_interval   0
    contact_groups          admin
}

define service{
    use                     local-service,srv-pnp
    service_description     MYSQL long-running-procs
    host_name               mysql-test-slave,mysql-test-master
    check_command           check_mysql_health!long-running-procs!10!20
    service_groups          mysql_services
    check_interval          10
    notifications_enabled   1
    notification_interval   0
    contact_groups          admin
}

define service{
    use                     local-service,srv-pnp
    service_description     MYSQL threads-connected
    host_name               mysql-test-slave,mysql-test-master
    check_command           check_mysql_health!threads-connected!2000!2500
    service_groups          mysql_services
    check_interval          10
    notifications_enabled   1
    notification_interval   0
    contact_groups          admin
}
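
New object definitions only take effect after Nagios reloads its configuration, so it is worth validating them first. The sketch below wraps Nagios' built-in pre-flight check (the -v option of the nagios binary). Both default paths are assumptions derived from the /usr/local/nagios prefix used during installation, and the function name verify_nagios_config is made up for the example.

```shell
#!/bin/sh
# Run Nagios' pre-flight check on the main configuration file.
# Default paths are assumptions based on the /usr/local/nagios prefix used earlier.
verify_nagios_config() {
    nagios_bin=${1:-/usr/local/nagios/bin/nagios}
    nagios_cfg=${2:-/usr/local/nagios/etc/nagios.cfg}
    if [ ! -x "$nagios_bin" ]; then
        echo "nagios binary not found: $nagios_bin" >&2
        return 2
    fi
    # -v parses all object definitions and reports errors without starting the daemon.
    "$nagios_bin" -v "$nagios_cfg"
}

if verify_nagios_config; then
    echo "configuration OK; restart Nagios to load the new services"
else
    echo "fix the reported problems before restarting Nagios"
fi
```

Typical findings at this stage are references to objects that do not exist, for example a contact group or service template name that has no matching definition.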

OK, at this point the monitoring setup is complete.

0 0