Learning Monit configuration file commands

Source: Internet. Editor: 程序博客网. Time: 2024/05/17 17:42

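Monit can monitor Unix processes, programs, files, directories, and filesystems. For example, it can detect changes to a timestamp, a checksum, or a size.

Monit provides an HTTP interface, so you can access it from a browser. Its behavior is controlled through a configuration file.

I finally understand why there are two monitrc configuration files. The default one is /usr/local/monit/etc/monitrc; if that file does not exist, Monit falls back to /etc/monit/monitrc. In other words, the former is the default configuration file and the latter is the fallback.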

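You can also specify the path of a configuration file manually:

 $ monit -c /var/monit/monitrc

You can set options on the command line as well, but for simplicity it is recommended to set them in the configuration file.

sudo /usr/local/monit/bin/monit -t checks the configuration file for errors. If there are none, it prints the following:

 laicb@laicb-HP-ProBook-4416s:~$ sudo /usr/local/monit/bin/monit -t
 [sudo] password for laicb:
 Control file syntax OK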

~/.monit.state stores Monit's state. Monit uses this file to recover after a crash.

~/.monit.id stores Monit's own unique ID.


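Daemons

A daemon is a special kind of process that runs in the background. It is detached from the controlling terminal and either performs some task periodically or waits for events to handle. Daemons are very useful. Most Linux servers are implemented as daemons, for example the Internet server inetd and the web server httpd. Daemons also carry out many system tasks, for example the job scheduler crond and the print daemon lpd.

Once started, Monit runs as a daemon.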

This is Monit version 5.3.2

Monit options

The following options are recognized by Monit. However, it is recommended that you set options (when applicable) directly in the .monitrc control file.

-c file Use this control file

-d n Run Monit as a daemon once per n seconds. Or use "set daemon" in monitrc.

-g name Set group name for start, stop, restart, monitor and unmonitor action.

-l logfile Print log information to this file. Or use "set logfile" in monitrc.

-p pidfile Use this lock file in daemon mode. Or use "set pidfile" in monitrc.

-s statefile Write state information to this file. Or use "set statefile" in monitrc.

-I Do not run in background (needed for run from init)

-t Run syntax check for the control file

-v Verbose mode, work noisy (diagnostic output)

-vv Very verbose mode, same as -v plus log stack-trace on error

-H [filename] Print MD5 and SHA1 hashes of the file or of stdin if the filename is omitted; Monit will exit afterwards

-V Print version number and patch level

-h Print a help text
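The -H option above prints the MD5 and SHA1 hashes of a file. To make concrete what those two values are, here is a minimal Python sketch that computes the same kinds of digests with the standard hashlib module. It is not Monit's code; the function name file_hashes and the chunk size are made up for this illustration.

```python
import hashlib


def file_hashes(path, chunk_size=65536):
    """Return (md5_hex, sha1_hex) for the file at path.

    The file is read in chunks so that large files do not have to fit
    in memory. file_hashes is a hypothetical helper, not part of Monit.
    """
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()
```

Changing even one byte of the file yields different digests, which is why a checksum test can detect modified content.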


Once Monit is running, you can use the following commands to control the daemon.

Once you have Monit running as a daemon process, you can call Monit with one of the following arguments. Monit will then connect to the Monit daemon (on TCP port 127.0.0.1:2812 by default) and ask the Monit daemon to perform the requested action. In other words; calling monit without arguments starts the Monit daemon, and calling monit with arguments enables you to communicate with the Monit daemon process.

start all    Start all services listed in the control file and enable monitoring for them. If the group option is set (-g), only start and enable monitoring of services in the named group ("all" is not required in this case).

start name    Start the named service and enable monitoring for it. The name is a service entry name from the monitrc file.

stop all    Stop all services listed in the control file and disable their monitoring. If the group option is set, only stop and disable monitoring of the services in the named group ("all" is not required in this case).

stop name    Stop the named service and disable its monitoring. The name is a service entry name from the monitrc file.

restart all    Stop and start all services. If the group option is set, only restart the services in the named group ("all" is not required in this case).

restart name    Restart the named service. The name is a service entry name from the monitrc file.

monitor all    Enable monitoring of all services listed in the control file. If the group option is set, only start monitoring of services in the named group ("all" is not required in this case).

monitor name    Enable monitoring of the named service. The name is a service entry name from the monitrc file. Monit will also enable monitoring of all services this service depends on.

unmonitor all    Disable monitoring of all services listed in the control file. If the group option is set, only disable monitoring of services in the named group ("all" is not required in this case).

unmonitor name    Disable monitoring of the named service. The name is a service entry name from the monitrc file. Monit will also disable monitoring of all services that depend on this service.

status    Print status information of each service.

summary    Print a short status summary.

reload    Reinitialize a running Monit daemon, the daemon will reread its configuration, close and reopen log files.

quit    Kill the Monit daemon process

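What can Monit do?

You can use Monit to monitor processes. It is especially useful for monitoring daemons, such as those started at boot time from /etc/init.d, for example sendmail, ssh, apache, and mysql.

1. You can use Monit to monitor files, directories, and filesystems. Monit can detect changes to these items, such as a changed timestamp, checksum, or file size. This is useful for security: if someone changes a file's content, its MD5 or SHA1 checksum changes too, and Monit notices.

2. Monit can monitor network connections to various servers, local or remote. TCP, UDP, and Unix domain sockets are all supported.

3. Monit can run programs or scripts at certain times and test their exit value. Based on the result it can take the necessary steps, such as executing an action or sending an alert.

4. Monit can monitor general system resources, such as CPU usage, memory, and load average.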

  {

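 Load average (CPU load): this does not describe CPU utilization. It is a statistic of the number of processes that the CPU is running or that are waiting for the CPU over a period of time, that is, a statistic of the length of the CPU run queue.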

}
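To see the numbers the note above refers to, here is a small Python sketch that reads the system load averages and relates them to the number of CPU cores. The helper names load_per_core and describe_load are made up for this illustration, and dividing by the core count is just one common rule of thumb, not something Monit prescribes.

```python
import os


def load_per_core(load, cores):
    """Normalise a load average by the number of CPU cores."""
    return load / max(cores, 1)


def describe_load():
    """Return the 1-, 5- and 15-minute load averages per core (Unix only)."""
    # os.getloadavg() returns a tuple of three floats: the average run-queue
    # length over the last 1, 5 and 15 minutes.
    one, five, fifteen = os.getloadavg()
    cores = os.cpu_count() or 1
    return {
        "1min": load_per_core(one, cores),
        "5min": load_per_core(five, cores),
        "15min": load_per_core(fifteen, cores),
    }


if __name__ == "__main__":
    print(describe_load())
```

A per-core value that stays well above 1.0 suggests that processes are queuing for the CPU, even if CPU utilization alone would not show it.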


LOGGING

Monit will log status and error messages to a log file. Use the set logfile statement in the monitrc control file. To setup Monit to log to its own logfile, use e.g. set logfile /var/log/monit.log. If syslog is given as a value for the -l command-line switch (or the keyword set logfile syslog is found in the control file) Monit will use the syslog system daemon to log messages with a priority assigned to each message based on the context. To turn off logging, simply do not set the logfile in the control file (and of course, do not use the -l switch)


 Daemon mode (DAEMON MODE)

Use

   set daemon n (where n is a number in seconds)

If you do not specify set daemon, Monit runs once and then exits. This can be useful in some situations, but Monit was designed to run as a daemon.


INIT SUPPORT

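 set init prevents Monit from turning itself into a background daemon and keeps it running as a foreground process. You still need set daemon in the configuration file to set the polling interval.

Starting Monit from init is the best approach, because it guarantees that there is always a Monit process running on your system.

Alternatively, you can start Monit from crontab.

To start Monit from init, either use the set init statement in the configuration file or pass the -I option on the command line.

Here is the entry to add to /etc/inittab:

  # Run Monit in standard run-levels
  mo:2345:respawn:/usr/local/bin/monit -Ic /etc/monitrc

After modifying init's configuration file, run the following command to make init re-examine /etc/inittab and start Monit: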

telinit q

On systems without telinit, use:

kill -1 1

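One problem can arise when Monit is started from init: if some of the services Monit monitors start more slowly than Monit itself, Monit considers them not running and sends error alerts.

To solve this problem, see the FAQ.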


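Monitoring mode (MONITORING MODE)

Monit supports three monitoring modes:

active -- Monit monitors the service and, when problems occur, acts on them: it raises alerts and stops, starts, or restarts the service. This is the default mode.

passive -- Monit monitors the service but does not try to fix problems. It still sends alerts.

manual -- Monit enters active mode for the service only when the service is brought under Monit's control, for example by running a command in the console such as monit start sybase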

  (Monit will call sybase's start method and enable monitoring)

ALERT MESSAGES

Monit sends an email alert in the following situations:

 o A service timed out
 o A service does not exist
 o A service related data access problem
 o A service related program execution problem
 o A service is of invalid object type
 o A program status failed
 o A icmp problem
 o A port connection problem
 o A resource statement match
 o A file checksum problem
 o A file size problem
 o A file/directory timestamp problem
 o A file/directory/filesystem permission problem
 o A file/directory/filesystem uid problem
 o A file/directory/filesystem gid problem
 o An action is done per administrator's request

Monit also sends an alert whenever a monitored object changes. These changes include:

 o Monit started, stopped or reloaded
 o A file checksum changed
 o A file size changed
 o A file content match
 o A file/directory timestamp changed
 o A filesystem mount flags changed
 o A process PID changed
 o A process PPID changed

Alert statements come in two forms:

Global -- common for all services

Local -- per service

In either form you can use several alert statements. In other words, you can send different alerts to different addresses.

Setting a global alert statement

{

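   If an event occurs on a monitored service, Monit sends an alert to every recipient in the global list. The syntax of a global alert statement is:

   SET ALERT mail-address [[NOT]{EVENTS}]  [MAIL-FORMAT {mail-format}] [REMINDER number]

   Simple usage: set alert foo@bar

  EVENTS, MAIL-FORMAT, and REMINDER are described below.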

 Setting a local alert statement

 Each service can have its own recipient list:

ALERT mail-address [[NOT]{EVENTS}]  [MAIL-FORMAT {mail-format}] [REMINDER number]

Without SET, the statement is local.

or: NOALERT mail-address

If you only want to receive certain alerts for a service, for example only timeout or nonexist events, you can write:

 check process myproc with pidfile /var/run/my.pid
   alert foo@bar only on { timeout, nonexist }
   ...

You can also ask for alerts on every event except some. For example, to be notified of all events except instance, you can write:

 check system myserver
   alert foo@bar but not on { instance }
   ...

which is equivalent to

   alert foo@bar on { action
                      checksum
                      connection
                      content
                      data
                      exec
                      fsflags
                      gid
                      icmp
                      invalid
                      nonexist
                      permission
                      pid
                      ppid
                      resource
                      size
                      status
                      timeout
                      timestamp
                      uid
                      uptime }

An instance event means that the Monit program itself started or stopped.

You can also send different events to different addresses:

 alert foo@bar { nonexist, timeout, resource, icmp, connection }
 alert security@bar on { checksum, permission, uid, gid }
 alert manager@bar

The events that can be used in the mail filter are:

action, checksum, connection, content, data, exec, fsflags, gid, icmp, instance, invalid, nonexist, permission, pid, ppid, resource, size, status, timeout, timestamp, uid, uptime

You can use

 noalert appadmin@bar

to stop that address from receiving alerts.

 set alert foo@bar

 check process myfoo with pidfile /var/run/myfoo.pid
   ...
 check process mybar with pidfile /var/run/mybar.pid
   alert foo@bar only on { timeout }

With the configuration above, all alerts are sent to foo@bar, except for the mybar service, which alerts foo@bar only on timeout. This is how a local statement overrides the global one.
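The filtering and override rules described above can be modelled in a few lines of Python. This is only a sketch of the behavior as this text describes it, not Monit's implementation. The names ALL_EVENTS, only_on, but_not_on, and recipients are invented for the example; the event names are the ones listed above.

```python
# The 22 event names that can appear in an alert filter.
ALL_EVENTS = frozenset({
    "action", "checksum", "connection", "content", "data", "exec",
    "fsflags", "gid", "icmp", "instance", "invalid", "nonexist",
    "permission", "pid", "ppid", "resource", "size", "status",
    "timeout", "timestamp", "uid", "uptime",
})


def only_on(*events):
    """Model 'alert addr only on { ... }'."""
    return frozenset(events)


def but_not_on(*events):
    """Model 'alert addr but not on { ... }' as a set difference."""
    return ALL_EVENTS - frozenset(events)


def recipients(event, global_alerts, local_alerts):
    """Return the addresses that would be mailed for an event.

    Both arguments map an address to the set of events it wants.
    A local entry for an address replaces the global entry for that
    address (assumption: this is how the override works). An empty
    local set models 'noalert addr'.
    """
    merged = dict(global_alerts)
    merged.update(local_alerts)
    return sorted(addr for addr, events in merged.items() if event in events)
```

For the configuration above, the global table is {"foo@bar": ALL_EVENTS}, myfoo has no local entries, and mybar has {"foo@bar": only_on("timeout")}. A checksum event on mybar then reaches nobody, while a timeout event still reaches foo@bar.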


    $EVENT
     A string describing the event that occurred. The values are
     fixed and are:

     Event:    | Failure state:           | Success state:
     -------------------------------------------------------------------
     ACTION    | "Action done"            | "Action done"
     CHECKSUM  | "Checksum failed"        | "Checksum succeeded"
     CONNECTION| "Connection failed"      | "Connection succeeded"
     CONTENT   | "Content failed"         | "Content succeeded"
     DATA      | "Data access error"      | "Data access succeeded"
     EXEC      | "Execution failed"       | "Execution succeeded"
     FSFLAG    | "Filesystem flags failed"| "Filesystem flags succeeded"
     GID       | "GID failed"             | "GID succeeded"
     ICMP      | "ICMP failed"            | "ICMP succeeded"
     INSTANCE  | "Monit instance changed" | "Monit instance changed not"
     INVALID   | "Invalid type"           | "Type succeeded"
     NONEXIST  | "Does not exist"         | "Exists"
     PERMISSION| "Permission failed"      | "Permission succeeded"
     PID       | "PID failed"             | "PID succeeded"
     PPID      | "PPID failed"            | "PPID succeeded"
     RESOURCE  | "Resource limit matched" | "Resource limit succeeded"
     SIZE      | "Size failed"            | "Size succeeded"
     STATUS    | "Status failed"          | "Status succeeded"
     TIMEOUT   | "Timeout"                | "Timeout recovery"
     TIMESTAMP | "Timestamp failed"       | "Timestamp succeeded"
     UID       | "UID failed"             | "UID succeeded"
     UPTIME    | "Uptime failed"          | "Uptime succeeded"

    $SERVICE
     The service entry name in monitrc

    $DATE
     The current time and date (RFC 822 date style).

    $HOST
     The name of the host Monit is running on

    $ACTION
     The name of the action which was done. Action names are fixed
     and are:

     Action:  | Name:
     --------------------
     ALERT    | "alert"
     EXEC     | "exec"
     RESTART  | "restart"
     START    | "start"
     STOP     | "stop"
     UNMONITOR| "unmonitor"

    $DESCRIPTION
     The description of the error condition
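The variables above are placeholders that get replaced with concrete values when an alert mail is generated. Python's string.Template happens to use the same $NAME notation, so it gives a convenient way to preview what a subject or message line could look like. This is only an illustration: the function render_alert_line and the sample values are made up, and Monit performs its own expansion internally.

```python
from string import Template


def render_alert_line(line, values):
    """Expand $EVENT, $SERVICE, ... in one line of a mail format.

    safe_substitute leaves unknown placeholders untouched instead of
    raising, which is convenient when only some values are known.
    """
    return Template(line).safe_substitute(values)


# Hypothetical sample values for a process that has disappeared.
sample = {
    "EVENT": "Does not exist",
    "SERVICE": "myproc",
    "ACTION": "restart",
    "HOST": "example-host",
}

subject = render_alert_line("monit alert -- $EVENT $SERVICE", sample)
body = render_alert_line("$EVENT Service $SERVICE on $HOST at $DATE", sample)
```

Here subject becomes "monit alert -- Does not exist myproc", and body keeps the literal $DATE because no value was supplied for it.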

Setting an error reminder

ALERT ... [WITH] REMINDER [ON] number [CYCLES]

For example, if you want to be notified every tenth cycle while a service remains in a failed state, you can use the following (that is, a mail is sent each time the service has stayed in the failed state for 10 polling cycles):

  alert foo@bar with reminder on 10 cycles

Likewise, if you want to be notified on each failed cycle, you can use:

  alert foo@bar with reminder on 1 cycle
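The effect of the reminder setting can be pictured with a short Python sketch that lists the cycles at which a mail would go out while a service stays failed. The function alert_cycles is made up for this illustration, and the exact offset is an assumption: the sketch sends the first alert on the first failed cycle and then a reminder every N cycles after it, which may differ from the cycle Monit actually picks.

```python
def alert_cycles(failed_cycles, reminder):
    """Return the 1-based failed cycles at which a mail would be sent.

    failed_cycles: how many consecutive cycles the service stays failed.
    reminder: the number from 'reminder on N cycles'; 0 means no reminder.
    Assumption: the initial alert goes out on cycle 1 and reminders
    follow every `reminder` cycles after that.
    """
    sent = []
    for cycle in range(1, failed_cycles + 1):
        is_first = cycle == 1
        is_reminder = reminder > 0 and (cycle - 1) % reminder == 0
        if is_first or is_reminder:
            sent.append(cycle)
    return sent
```

Under these assumptions, alert_cycles(25, 10) gives [1, 11, 21], while a reminder of 1 produces a mail on every failed cycle.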

Setting a mail server for alert messages

SET MAILSERVER {hostname|ip-address [PORT port]
                [USERNAME username] [PASSWORD password]
                [using SSLV2|SSLV3|TLSV1] [CERTMD5 checksum]}+
                [with TIMEOUT X SECONDS]
                [using HOSTNAME hostname]

set mailserver mail.tildeslash.com, mail.foo.bar port 10025
    username "Rabbi" password "Loew" using tlsv1, localhost
    with timeout 15 seconds

Sending through the QQ mail server

Mail servers such as 163 and QQ require more security credentials, whereas a locally installed sendmail service has fewer requirements. First set the mail server:

  set mailserver smtp.qq.com USERNAME "530765863" PASSWORD "*********"

Note that the QQ mail user name is written without @qq.com. Some posts online say that VIP accounts need @vip.qq.com appended; I have not verified this.

This setting alone is not enough. You also need to set the mail format: the from field, that is, the sender, must be 530765863@qq.com

set mail-format {
      from: 530765863@qq.com
   subject: monit alert --  $EVENT $SERVICE
   message: $EVENT Service $SERVICE
            Date:        $DATE
            Action:      $ACTION
            Host:        $HOST
            Description: $DESCRIPTION

            Your faithful employee,
            Monit
}

Check your QQ mailbox: a pile of mails is on its way...



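You can configure several mail servers, separated by commas. If the first mail server does not respond within 15 seconds (the timeout used in the example above), Monit tries the second one, and then the third.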
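The fallback order described above amounts to "try each server in turn until one accepts the message". A minimal Python sketch of that idea follows. It is not Monit's code: the function send_with_failover and the send callable are made up for illustration, and a real sender would open an SMTP connection with the configured timeout.

```python
def send_with_failover(servers, send):
    """Try each mail server in order and return the one that worked.

    servers: server names in the order given in the configuration.
    send: a callable taking a server name; it must return normally on
          success and raise OSError (timeouts are OSError subclasses)
          on failure. Returns None if every server failed.
    """
    for server in servers:
        try:
            send(server)
            return server
        except OSError:
            # This server is unreachable or timed out; try the next one.
            continue
    return None
```

With the three servers from the example above, a timeout on mail.tildeslash.com makes the sketch move on to mail.foo.bar, and only if that also fails does it try localhost.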

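By default, Monit uses the local host name in the SMTP HELO/EHLO command and in the Message-ID header. Some mail servers, as an anti-spam measure, reject the message if the host name used in the transaction does not match DNS. The fix is to set the host name explicitly with [using HOSTNAME hostname].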


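Setting the event queue

  set eventqueue
      basedir /var/monit
      slots 5000

As the syntax below shows, basedir is required and slots is optional, so you can leave out the number of slots:

SET EVENTQUEUE BASEDIR <path> [SLOTS <number>]

Why set up a queue? Sometimes the mail server cannot be reached when Monit tries to send an alert. In that case the events are stored in the queue and sent again once the mail server becomes available.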


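Service timeout

Monit provides a service timeout mechanism: a service that refuses to start, or that does not respond for a long time, is considered timed out.

IF <number> RESTART <number> CYCLE(S) THEN <action>

If the service is restarted y times within x polling cycles, the given action is executed.

For example:

 if 2 restarts within 3 cycles then unmonitor

If there are 2 restarts within 3 cycles, stop monitoring the service.

 if 5 restarts within 5 cycles then exec "/foo/bar"

If there are 5 restarts within 5 cycles, execute the given program.

 if 7 restarts within 10 cycles then stop

If there are 7 restarts within 10 cycles, stop the service.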
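The "y restarts within x cycles" rule above is essentially a sliding-window counter. The following Python sketch models it. It is an illustration of the idea rather than Monit's implementation; the class name RestartTimeout and its method are invented for this example.

```python
from collections import deque


class RestartTimeout:
    """Model of 'if <max_restarts> restarts within <cycles> cycles'."""

    def __init__(self, max_restarts, cycles):
        self.max_restarts = max_restarts
        # One flag per recent cycle; older cycles fall out automatically
        # because the deque has a fixed maximum length.
        self.window = deque(maxlen=cycles)

    def record_cycle(self, restarted):
        """Record one polling cycle; return True when the rule fires."""
        self.window.append(bool(restarted))
        return sum(self.window) >= self.max_restarts


# 'if 2 restarts within 3 cycles then unmonitor'
rule = RestartTimeout(max_restarts=2, cycles=3)
history = [rule.record_cycle(r) for r in (True, False, True)]
```

In this run the rule fires on the third cycle, because two of the last three cycles included a restart. Had the second restart come one cycle later, the first would already have left the window and the rule would not fire.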



Service tests (SERVICE TESTS)

  Monit provides several kinds of tests inside a "check service" entry. There are two classes of tests: variable tests and constant tests. In other words, the value a test compares against can be constant, such as a number, or it can vary.

Constant test syntax:

IF <TEST> [[<X>] [TIMES WITHIN] <Y> CYCLES] THEN ACTION [ELSE IF SUCCEEDED [[<X>] [TIMES WITHIN] <Y> CYCLES] THEN ACTION]
Variable test syntax:

IF CHANGED <TEST> [[<X>] [TIMES WITHIN] <Y> CYCLES] THEN ACTION
Available actions (ACTION)

In each test you must select the action to be executed from this list:

    ALERT sends the user an alert event on each state change (for constant tests) or on each change (for variable tests).

    RESTART restarts the service and sends an alert. Restart is conducted by first calling the service's registered stop method and then the service's start method.

    START starts the service by calling the service's registered start method and send an alert.

    STOP stops the service by calling the service's registered stop method and send an alert. If Monit stops a service it will not be checked by Monit anymore nor restarted again later. To reactivate monitoring of the service again you must explicitly enable monitoring from the web interface or from the console, e.g. 'monit monitor apache'.

    EXEC can be used to execute an arbitrary program and send an alert. If you choose this action you must state the program to be executed and if the program require arguments you must enclose the program and its arguments in a quoted string. You may optionally specify the uid and gid the executed program should switch to upon start. For instance:

     exec "/usr/local/tomcat/bin/startup.sh"
          as uid nobody and gid nobody

    The uid and gid switch can be useful if the program to be started cannot change to a lesser privileged user and group. This is typically needed for Java Servers. Remember, if Monit is run by the superuser, then all programs executed by Monit will be started with superuser privileges unless the uid and gid extension was used.

    UNMONITOR will disable monitoring of the service and send an alert. The service will not be checked by Monit anymore nor restarted again later. To reactivate monitoring of the service you must explicitly enable monitoring from monit's web interface or from the console using the monitor argument.




}

 
