AIX System Log Study Notes, Part 1

Source: Internet  Editor: 程序博客网  Time: 2024/06/05 03:53


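Once an AIX system is in production, errors are inevitable. To deal with them, AIX provides a number of error-handling tools and a logging mechanism that make it easier to troubleshoot and repair the system.

errdemon is an AIX daemon. It continuously monitors the /dev/error device file for new entries, compares each one against the system error templates, and writes the resulting error information to the system error log.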

 

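The errdemon daemon starts automatically at system boot. It can also be started manually:

# /usr/lib/errdemon

Stop the errdemon daemon:

# /usr/lib/errstop

Check whether the daemon is running:

# ps -ef | grep errdemon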

 

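The AIX error log is stored in /var/adm/ras/errlog.

The following command shows the location of the error log file, the log file size, the buffer settings, and so on:

/usr/lib/errdemon -l

The following command changes the log file size (in bytes):

/usr/lib/errdemon -s 2097153

Set the error log buffer size:

/usr/lib/errdemon -B 16384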

 

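Once errors have been logged, AIX provides the errpt command to view the error log. Another diagnostic command, diag, diagnoses and analyzes hardware errors, whereas errpt only prints the error records.

1. The errpt command

# errpt --h

errpt: Not a recognized flag: -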

Usage:   errpt -@ wpar_name -actgDP -s startdate -e enddate
         -N resource_name_list -S resource_class_list -R resource_type_list
         -T err_type_list -d err_class_list -j id_list -k id_list
         -J label_list -K label_list -l seq_no_list -F flags_list
         -m machine_id -n node_id -i filename -y filename -z filename
         -I filename

 

Process error log entries from the supplied file(s).

-i filename  Uses the error log file specified by the filename parameter.
-y filename  Uses the error record template file specified by the filename
             parameter.
-z filename  Uses the error logging message catalog specified by the filename
             parameter.
-I filename  Uses the diagnostics error log specified by the filename
             parameter.

Output formatted error log entries sorted chronologically.

(Note: -a shows the detailed information for all error log entries.)

-a         Print a detailed listing. Default is a summary listing.
-c         Concurrent mode. Display error log entries as they arrive.
-t         Print error templates instead of error log entries.
-g         Output raw ascii error record structures.
-D         Consolidate duplicate errors.
-P         Show only duplicates from the error device driver.

 

Error log entry qualifiers:

-@ wpar_name  Select entries for the wpar name.

(Note: the next two flags set the start and end dates.)

-s startdate  Select entries posted later   than date. (MMddhhmmyy)
-e enddate    Select entries posted earlier than date. (MMddhhmmyy)
-N list       Select resource_names   in 'list'.
-S list       Select resource_classes in 'list'.
-R list       Select resource_types   in 'list'.
-T list       Select types            in 'list'.
-d list       Select classes          in 'list'.

(Note: -j selects entries by error ID.)

-j list       Select ids              in 'list'.
-k list       Select ids  NOT         in 'list'.
-J list       Select labels           in 'list'.
-K list       Select labels NOT       in 'list'.
-l list       Select sequence_numbers in 'list'.
-F list       Select templates according to the value of the
              Alert, Log, or Report field.
-m machine_id Select entries for the machine id as output by uname -m.
-n node_id    Select entries for the node id    as output by uname -n.

'list' is a list of entries separated by commas.

Error types (severity):

error_type  = PERM,TEMP,PERF,PEND,UNKN,INFO

Error classes:

error_class = H (HARDWARE), S (SOFTWARE), O (errlogger MESSAGES), U (UNDETERMINED)

 

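Commonly used commands:

1. List a brief summary of the errors

errpt | more

2. List all hardware errors

errpt -d H

3. List all software errors

errpt -d S

4. List detailed error information

errpt -a

5. Query by error ID

errpt -aj ERROR_ID

6. List permanent hardware errors

errpt -T PERM -d H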
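
The -s and -e qualifiers listed above expect timestamps in MMddhhmmyy form, which is easy to get wrong when typed by hand. As a minimal sketch, the stamp for midnight today can be built with date. The function name midnight_stamp is hypothetical, and the errpt call is guarded so that the snippet does nothing on a machine without errpt.

```shell
#!/bin/sh
# Hypothetical helper: print today's midnight as MMddhhmmyy,
# the timestamp layout that errpt -s and -e expect.
midnight_stamp() {
    date +%m%d0000%y
}

# Hardware errors logged since midnight; runs only where errpt exists.
if command -v errpt >/dev/null 2>&1; then
    errpt -d H -s "$(midnight_stamp)"
fi
```

The same stamp can be passed to -e to list only the entries posted before today.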

 

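2. Handling the error log

# errclear                   Delete entries from the error log

# errstop / errdemon         Stop / start the error logging daemon

Running errclear without arguments prints its usage:

# errclear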

0315-136 Number of days is required, and must be zero or greater.

Usage:

errclear -@ wpar_name -J err_label_list -K err_label_list -N resource_name_list

        -R resource_type_list -S resource_class_list -T err_type_list

        -d err_class_list -i filename -m machine_id -n node_id

        -j id_list -k id_list -l seq_no_list -y filename number_of_days

 

Delete error log entries in the specified list that are older than

number_of_days specified. Number_of_days refers to the number of twenty

four hour periods from command invocation time.

-@ wpar_name  Delete entries for the wpar name.
-J list       Select only error_labels     in 'list'.
-K list       Select only error_labels not in 'list'.
-N list       Select only resource_names   in 'list'.
-S list       Select only resource_classes in 'list'.
-R list       Select only resource_types   in 'list'.
-T list       Select only error_types      in 'list'.
-d list       Select only error_classes    in 'list'.
-i filename   Uses the error log file specified by the filename parameter.
-j list       Select only error_ids        in 'list'.
-k list       Select only error_ids  not   in 'list'.
-l list       Select sequence_numbers in 'list'.
-m machine_id Delete entries for the machine id as output by uname -m.
-n node_id    Delete entries for the node id    as output by uname -n.
-y filename   Uses the error record template file specified by the filename

              parameter.

'list' is a list of entries separated by commas.

error_type = PERM,TEMP,PERF,PEND,UNKN,INFO

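error_class = H (HARDWARE), S (SOFTWARE), O (errlogger MESSAGES), U (UNDETERMINED)

 

Commonly used errclear commands

To delete all entries from the error log, enter:

errclear 0

To delete all software-class entries from the error log:

errclear -d S 0

To delete all hardware-class entries from the error log:

errclear -d H 0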
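
The examples above all pass 0 and therefore delete every matching entry. Because number_of_days is required and must be zero or greater (the 0315-136 message shown earlier), a small wrapper can validate the argument and save a detailed report before deleting anything. The following is only a sketch: the function name clear_old_errors and the backup path under /tmp are assumptions, not part of AIX.

```shell
#!/bin/sh
# Hypothetical wrapper: save a detailed errpt report, then delete
# error log entries older than the given number of days.
clear_old_errors() {
    days=$1
    # errclear needs a day count that is zero or greater.
    case "$days" in
        ''|*[!0-9]*)
            echo "usage: clear_old_errors number_of_days" >&2
            return 1
            ;;
    esac
    if command -v errclear >/dev/null 2>&1; then
        # Assumed backup location; adjust to local conventions.
        errpt -a > "/tmp/errpt.$(date +%Y%m%d).txt"
        errclear "$days"
    else
        # Not on AIX: only report what would have been run.
        echo "errclear not found; would run: errclear $days"
    fi
}
```

For example, clear_old_errors 30 would keep the last 30 days of entries and remove anything older.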