Troubleshooting a CRS That Fails to Start

Source: Internet  Editor: 程序博客网  Time: 2024/05/21 17:33

A customer reported that CRS would not start on one node of their RAC cluster. I connected remotely over VPN and checked CRS:

# crsctl check crs
Failure 1 contacting CSS daemon
Cannot communicate with CRS
Cannot communicate with EVM 

Check the CRS processes:

# ps -ef | grep css

root      6929     1  0 19:56 ?        00:00:00 /bin/sh /etc/init.d/init.cssd fatal
root      6960  6928  0 19:56 ?        00:00:00 /bin/sh /etc/init.d/init.cssd startcheck
root      6963  6929  0 19:56 ?        00:00:00 /bin/sh /etc/init.d/init.cssd startcheck
root      7064  6935  0 19:56 ?        00:00:00 /bin/sh /etc/init.d/init.cssd startcheck

The output shows that init.cssd is stuck in startcheck and that the ocssd.bin daemon was never started.
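The same reading of the process list can be sketched as a small helper. This is only an illustration: the function name css_state and its three labels are made up here, and it simply pattern-matches `ps -ef` output the way the paragraph above does by eye.

```shell
# Classify the CSS state from `ps -ef` output read on stdin.
# Prints one of: running, stuck-in-startcheck, not-started.
# (css_state is a hypothetical helper, not part of Oracle Clusterware.)
css_state() {
    ps_out=$(cat)
    if printf '%s\n' "$ps_out" | grep -q 'ocssd\.bin'; then
        echo running
    elif printf '%s\n' "$ps_out" | grep -q 'init\.cssd startcheck'; then
        echo stuck-in-startcheck
    else
        echo not-started
    fi
}

# Typical use on a cluster node:
#   ps -ef | css_state
```

Fed the process list shown above, it would report stuck-in-startcheck, since no ocssd.bin line is present.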

crsd.log and ocssd.log contained nothing useful, but the logs in $ORA_CRS_HOME/log/rac1/client showed:

[root@rac1 client]# more css339.log
Oracle Database 10g CRS Release 10.2.0.1.0 Production Copyright 1996, 2005 Oracle.  All rights reserved.
2009-06-09 20:00:28.799: [ CSSCLNT][2541220896]clsssInitNative: connect failed, rc 9

[root@rac1 client]# more  clsc790.log
Oracle Database 10g CRS Release 10.2.0.1.0 Production Copyright 1996, 2005 Oracle.  All rights reserved.
2009-06-09 20:39:27.940: [ COMMCRS][2541220896]clsc_connect: (0x66c3d0) no listener at (ADDRESS=(PROTOCOL=IPC)(KEY=CRSD_UI_SOCKET))
2009-06-09 20:39:27.941: [ COMMCRS][2541220896]clsc_connect: (0x6116e0) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth))
2009-06-09 20:39:27.941: [ default][2541220896]Terminating clsd session

The system log showed the following within the time window in which the customer reported the failure:

Jun  9 17:19:39 rac1 logger: Oracle Cluster Ready Services starting up automatically.
Jun  9 17:19:39 rac1 init.crs: Startup will be queued to init within 90 seconds.
Jun  9 17:19:39 rac1 rc: Starting init.crs:  succeeded
Jun  9 17:19:39 rac1 readahead: Starting background readahead: 
Jun  9 17:19:39 rac1 rc: Starting readahead:  succeeded
Jun  9 17:19:41 rac1 logger: Cluster Ready Services waiting on dependencies. Diagnostics in /tmp/crsctl.4528.
Jun  9 17:19:41 rac1 logger: Cluster Ready Services waiting on dependencies. Diagnostics in /tmp/crsctl.4531.
Jun  9 17:19:41 rac1 logger: Cluster Ready Services waiting on dependencies. Diagnostics in /tmp/crsctl.4603.

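I went to /tmp to look at the crsctl.* files, but there were none. Strange. So where do these messages come from? Looking at the /etc/init.d/init.cssd script, the message is generated here: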

    # Wait for additional filesystems and objects to become available
    # crsctl should print out a message indicating cause of failure.

    $SU $ORACLE_USER -c "$CRSCTL check boot > $CRSCTLOUT"
    RC=$?
    while [ "$RC" != "0" ]
    do
      $LOGMSG Cluster Ready Services waiting on dependencies. Diagnostics in $CRSCTLOUT.
      $SLEEP $DEP_CHECK_WAIT
      $SU $ORACLE_USER -c "$CRSCTL check boot > $CRSCTLOUT"
      RC=$?
    done

This code runs in the startcheck branch of the script. The variable $CRSCTLOUT is defined as:

# Temp file for crsctl output
CRSCTLOUT=/tmp/crsctl.$$
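In a shell script, `$$` expands to the process ID of the shell running it, which would explain why the syslog lines name three different files (crsctl.4528, crsctl.4531, crsctl.4603): each startcheck invocation computes its own path. A minimal sketch of that behaviour, using a hypothetical helper name:

```shell
# Print the diagnostics path that a freshly started shell would compute,
# mirroring CRSCTLOUT=/tmp/crsctl.$$ from init.cssd.
# (crsctl_out_for_new_shell is an illustrative name, not part of the script.)
crsctl_out_for_new_shell() {
    sh -c 'CRSCTLOUT=/tmp/crsctl.$$; echo "$CRSCTLOUT"'
}

# Two separate invocations normally yield two different file names.
crsctl_out_for_new_shell
crsctl_out_for_new_shell
```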

That is the /tmp directory. Could something be wrong with /tmp? Running the check by hand:

[root@rac1 /]# /etc/init.d/init.cssd startcheck
-bash: /tmp/crsctl.31969: No such file or directory
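This also suggests why the loop in init.cssd never exits: when the output file cannot be created, the shell reports a failure for the whole command, whatever the command itself would have returned, so RC never becomes 0. The sketch below shows that general shell behaviour with a command that always succeeds; redirect_status and the /nonexistent-dir path are placeholders for illustration only.

```shell
# Run a command that always succeeds (true), redirecting its output to $1,
# and print the exit status the caller would see.
redirect_status() {
    rc=0
    sh -c 'true > "$1"' sh "$1" 2>/dev/null || rc=$?
    echo "$rc"
}

redirect_status /dev/null                      # target is writable: prints 0
redirect_status /nonexistent-dir/crsctl.12345  # target cannot be created: prints a non-zero status
```

In the init.cssd loop the same thing happens to `$CRSCTL check boot > $CRSCTLOUT`: as long as the file under /tmp cannot be written, the check is treated as failed and the script keeps sleeping and retrying.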

Sure enough, /tmp is the problem. Checking its permissions:

drwxr-xr-x    7   1003 dba   4096 Jun 10 04:02 tmp

The ownership and permissions had been changed. They should be:

drwxrwxrwt   15 root root  4096 Jun 10 09:51 tmp
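A check like this can be scripted so that it is easy to repeat on other nodes. The following is a minimal sketch, assuming GNU stat (the `-c` option) as found on Linux; check_tmp_mode is a hypothetical helper name, and it only checks the mode, not the owner.

```shell
# Print "ok" if directory $1 has mode 1777 (rwxrwxrwt),
# otherwise report the mode actually found.
check_tmp_mode() {
    mode=$(stat -c '%a' "$1") || return 1
    if [ "$mode" = "1777" ]; then
        echo ok
    else
        echo "unexpected mode $mode on $1"
    fi
}

check_tmp_mode /tmp
```

With the broken directory shown above (drwxr-xr-x), this would report mode 755 instead of ok.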

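So I fixed the ownership and permissions of /tmp. Note that -R also rewrites everything beneath /tmp; changing only the directory itself (the same two commands without -R) would have been enough and is safer: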

#chown -R root:root /tmp
#chmod -R 1777 /tmp

Then restart CRS:

[root@rac1 tmp]# /etc/init.d/init.crs stop
Shutting down Oracle Cluster Ready Services (CRS):
Stopping resources.
Successfully stopped CRS resources
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.
Shutdown has begun. The daemons should exit soon.

[root@rac1 tmp]# /etc/init.d/init.crs start
Startup will be queued to init within 90 seconds.

A little later I checked the CRS processes again: CRS was up and running. Problem solved.
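Because startup is queued to init and takes a while, the "check again a little later" step can be automated with a small polling loop. This is a generic sketch, not something from the Oracle scripts; the retry helper and the attempt and delay values in the example are arbitrary choices.

```shell
# Retry a command up to $1 times, sleeping $2 seconds between attempts.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Sleep only if another attempt will follow.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Example on a cluster node: wait up to about five minutes for ocssd.bin to appear.
#   retry 30 10 pgrep -f ocssd.bin
```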

Customer environment: RHEL AS 4 + Oracle 10g RAC

