11gR2: Resolving CRS-4530: Communications failure contacting Cluster Synchronization Services daemon


Author: 阿九 [When reposting, please credit the original source and author with a hyperlink]


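Oracle 11gR2 (version 11.2.0.1) on RHEL 6.3 could not start the CSS service after a server reboot. This had not happened on earlier reboots. Checking the CSS status returned this error: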
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon

I resolved the problem by following an article from another user. The article's author and address:
Author: tomszrp
Address: http://tomszrp.itpub.net/post/11835/493005
[grid@vm11gr2] /home/grid> sqlplus "/as sysasm"
SQL*Plus: Release 11.2.0.1.0 Production on Sun Oct 25 10:16:21 2009
Copyright (c) 1982, 2009, Oracle. All rights reserved.
Connected to an idle instance.
SQL> startup
ORA-01078: failure in processing system parameters
ORA-29701: unable to connect to Cluster Synchronization Service
SQL>

The instance cannot connect to the CSS service. Check at the operating system level:

[grid@vm11gr2] /home/grid> crsctl check css
CRS-4530: Communications failure contacting Cluster Synchronization Services daemon
[grid@vm11gr2] /home/grid>
[grid@vm11gr2] /home/grid> ps -ef|grep cssd

Sure enough, there is no CSS daemon process. Next, check the state of HAS (High Availability Service):

[grid@vm11gr2] /home/grid> crsctl check has
CRS-4638: Oracle High Availability Services is online
[grid@vm11gr2] /home/grid> ps -ef|grep d.bin
grid 5886 1 0 10:06 ? 00:00:01 /u01/app/grid/product/11.2/grid/bin/ohasd.bin reboot
[grid@vm11gr2] /home/grid>

HAS is indeed running, and the ora.cssd and ora.diskmon resources both depend on HAS to manage them.

Look further at the state of each resource:

[grid@vm11gr2] /home/grid> crs_stat -t
Name                Type                 Target    State     Host
--------------------------------------------------------------
ora.FLASH_DATA.dg   ora.diskgroup.type   OFFLINE   OFFLINE   vm11gr2
ora.SYS_DATA.dg     ora.diskgroup.type   OFFLINE   OFFLINE   vm11gr2
ora.asm             ora.asm.type         OFFLINE   OFFLINE   vm11gr2
ora.cssd            ora.cssd.type        OFFLINE   OFFLINE   vm11gr2
ora.diskmon         ora.diskmon.type     OFFLINE   OFFLINE   vm11gr2
[grid@vm11gr2] /home/grid>

[grid@vm11gr2] /home/grid> crsctl status resource -t
--------------------------------------------------------------------------------
NAME                 TARGET    STATE     SERVER    STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.FLASH_DATA.dg    OFFLINE   OFFLINE   vm11gr2
ora.SYS_DATA.dg      OFFLINE   OFFLINE   vm11gr2
ora.asm              OFFLINE   OFFLINE   vm11gr2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd        1    OFFLINE   OFFLINE
ora.diskmon     1    OFFLINE   OFFLINE

Now look at the attributes of ora.cssd and ora.diskmon:

[grid@vm11gr2] /home/grid> crs_stat -p ora.cssd
NAME=ora.cssd
TYPE=ora.cssd.type
ACTION_SCRIPT=
ACTIVE_PLACEMENT=0
AUTO_START=never
CHECK_INTERVAL=30
DESCRIPTION="Resource type for CSSD"
FAILOVER_DELAY=0
FAILURE_INTERVAL=3
FAILURE_THRESHOLD=5
HOSTING_MEMBERS=
PLACEMENT=balanced
RESTART_ATTEMPTS=5
SCRIPT_TIMEOUT=600
START_TIMEOUT=600
STOP_TIMEOUT=900
UPTIME_THRESHOLD=1m
[grid@vm11gr2] /home/grid> crs_stat -p ora.diskmon
NAME=ora.diskmon
TYPE=ora.diskmon.type
ACTION_SCRIPT=
ACTIVE_PLACEMENT=0
AUTO_START=never
CHECK_INTERVAL=20
DESCRIPTION="Resource type for Diskmon"
FAILOVER_DELAY=0
FAILURE_INTERVAL=3
FAILURE_THRESHOLD=5
HOSTING_MEMBERS=
PLACEMENT=balanced
RESTART_ATTEMPTS=10
SCRIPT_TIMEOUT=60
START_TIMEOUT=60
STOP_TIMEOUT=60
UPTIME_THRESHOLD=5s
[grid@vm11gr2] /home/grid>

This is essentially the cause. The AUTO_START attribute of both resources defaults to never, so they do not start automatically when HAS starts, even though HAS itself starts at boot by default. So start them manually:

[grid@vm11gr2] /home/grid> crsctl start resource ora.cssd
CRS-2672: Attempting to start 'ora.cssd' on 'vm11gr2'
CRS-2679: Attempting to clean 'ora.diskmon' on 'vm11gr2'
CRS-2681: Clean of 'ora.diskmon' on 'vm11gr2' succeeded
CRS-2672: Attempting to start 'ora.diskmon' on 'vm11gr2'
CRS-2676: Start of 'ora.diskmon' on 'vm11gr2' succeeded
CRS-2676: Start of 'ora.cssd' on 'vm11gr2' succeeded
[grid@vm11gr2] /home/grid>

Note: ora.cssd and ora.diskmon depend on each other, so starting either one brings up both.

[grid@vm11gr2] /home/grid> crs_stat -t
Name                Type                 Target    State     Host
--------------------------------------------------------------
ora.FLASH_DATA.dg   ora.diskgroup.type   OFFLINE   OFFLINE   vm11gr2
ora.SYS_DATA.dg     ora.diskgroup.type   OFFLINE   OFFLINE   vm11gr2
ora.asm             ora.asm.type         OFFLINE   OFFLINE   vm11gr2
ora.cssd            ora.cssd.type        ONLINE    ONLINE    vm11gr2
ora.diskmon         ora.diskmon.type     ONLINE    ONLINE    vm11gr2
[grid@vm11gr2] /home/grid>

The CSS service is up. Now start the ASM instance again:

[grid@vm11gr2] /home/grid> sqlplus "/as sysasm"
SQL*Plus: Release 11.2.0.1.0 Production on Sun Oct 25 10:30:03 2009
Copyright (c) 1982, 2009, Oracle. All rights reserved.
Connected to an idle instance.
SQL> startup
ASM instance started
Total System Global Area 284565504 bytes
Fixed Size 1336036 bytes
Variable Size 258063644 bytes
ASM Cache 25165824 bytes
ASM diskgroups mounted
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Automatic Storage Management option
[grid@vm11gr2] /home/grid> crs_stat -t
Name                Type                 Target    State     Host
--------------------------------------------------------------
ora.FLASH_DATA.dg   ora.diskgroup.type   ONLINE    ONLINE    vm11gr2
ora.SYS_DATA.dg     ora.diskgroup.type   ONLINE    ONLINE    vm11gr2
ora.asm             ora.asm.type         ONLINE    ONLINE    vm11gr2
ora.cssd            ora.cssd.type        ONLINE    ONLINE    vm11gr2
ora.diskmon         ora.diskmon.type     ONLINE    ONLINE    vm11gr2
[grid@vm11gr2] /home/grid>

tips
  1) By default HAS (High Availability Service) starts automatically. The following commands disable and enable automatic startup:
        crsctl disable has
        crsctl enable has
  2) Start and stop HAS manually:
        crsctl start has
        crsctl stop has
  3) Check the state of HAS:
        crsctl check has
  4) To have ora.cssd and ora.diskmon start automatically when HAS starts, modify the AUTO_START attribute of these two resources:
        crsctl modify resource "ora.cssd" -attr "AUTO_START=1"
        or
        crsctl modify resource "ora.diskmon" -attr "AUTO_START=1"
  5) To turn automatic startup of ora.cssd and ora.diskmon back off:
        crsctl modify resource "ora.cssd" -attr "AUTO_START=never"
        crsctl modify resource "ora.diskmon" -attr "AUTO_START=never"
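
The manual check-and-start sequence above can be wrapped in a small shell function. This is only a sketch under stated assumptions: the function name ensure_css and the CRSCTL override variable are made up for illustration, and the function matches on the CRS-4530 message text rather than on the exit status of crsctl, since the exit status is not shown in the transcript.

```shell
#!/bin/bash
# Sketch: start ora.cssd only when "crsctl check css" reports CRS-4530.
# CRSCTL is a hypothetical override so the function can be tried out
# with a stub; by default it resolves crsctl from the grid user's PATH.
CRSCTL="${CRSCTL:-crsctl}"

ensure_css() {
    local out
    # Capture the message text; do not rely on the exit status.
    out="$("$CRSCTL" check css 2>&1 || true)"
    case "$out" in
        *CRS-4530*)
            echo "CSS not reachable, starting ora.cssd"
            # Starting ora.cssd also brings up ora.diskmon (see the note above).
            "$CRSCTL" start resource ora.cssd
            ;;
        *)
            echo "CSS check returned: $out"
            ;;
    esac
}
```

Run ensure_css as the grid user once HAS is online. It does not change AUTO_START, so the resource settings stay as they were.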

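However, I found that on RHEL 6.3 the command crsctl start has is not enough on its own to start HAS. You also have to run
/etc/init.d/init.ohasd run as root before HAS will start; otherwise crsctl start has just hangs and never starts the service.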
[grid@rh6-ora11g ~]$ crsctl stop has
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rh6-ora11g'
CRS-2673: Attempting to stop 'ora.ora11g.db' on 'rh6-ora11g'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'rh6-ora11g'
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'rh6-ora11g' succeeded
CRS-2677: Stop of 'ora.ora11g.db' on 'rh6-ora11g' succeeded
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'rh6-ora11g'
CRS-2677: Stop of 'ora.DATA.dg' on 'rh6-ora11g' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'rh6-ora11g'
CRS-2677: Stop of 'ora.asm' on 'rh6-ora11g' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'rh6-ora11g'
CRS-2677: Stop of 'ora.cssd' on 'rh6-ora11g' succeeded
CRS-2673: Attempting to stop 'ora.diskmon' on 'rh6-ora11g'
CRS-2677: Stop of 'ora.diskmon' on 'rh6-ora11g' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rh6-ora11g' has completed
CRS-4133: Oracle High Availability Services has been stopped.
[grid@rh6-ora11g ~]$ 
[grid@rh6-ora11g ~]$ crsctl start has


[root@rh6-ora11g ~]# /etc/init.d/init.ohasd run
mkfifo: cannot create fifo "/var/tmp/.oracle/npohasd": File exists






^C
[root@rh6-ora11g ~]#

[grid@rh6-ora11g ~]$ crsctl start has
CRS-4123: Oracle High Availability Services has been started.
[grid@rh6-ora11g ~]$ crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora.DATA.dg    ora....up.type ONLINE    ONLINE    rh6-ora11g  
ora....ER.lsnr ora....er.type ONLINE    ONLINE    rh6-ora11g  
ora.asm        ora.asm.type   ONLINE    ONLINE    rh6-ora11g  
ora.cssd       ora.cssd.type  ONLINE    ONLINE    rh6-ora11g  
ora.diskmon    ora....on.type ONLINE    ONLINE    rh6-ora11g  
ora.ora11g.db  ora....se.type ONLINE    ONLINE    rh6-ora11g  
[grid@rh6-ora11g ~]$
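
In the session above, /etc/init.d/init.ohasd run occupies the root terminal until it is interrupted. One possible variation, shown here only as a sketch, is to launch it in the background. The function name run_init_ohasd_bg and the INIT_OHASD override variable are assumptions made for illustration, and this was not part of the original test on RHEL 6.3.

```shell
#!/bin/bash
# Sketch, to be run as root: launch "init.ohasd run" in the background
# so it does not tie up the terminal. INIT_OHASD is a hypothetical
# override; the default is the path used in this article.
INIT_OHASD="${INIT_OHASD:-/etc/init.d/init.ohasd}"

run_init_ohasd_bg() {
    if [ ! -x "$INIT_OHASD" ]; then
        echo "init.ohasd not found or not executable: $INIT_OHASD" >&2
        return 1
    fi
    # Detach from the terminal and discard output.
    nohup "$INIT_OHASD" run >/dev/null 2>&1 </dev/null &
    # Print the background PID so the caller can track it.
    echo "$!"
}
```

After run_init_ohasd_bg prints a PID, run crsctl start has as the grid user, as in the transcript.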
I had actually already hit this problem while installing the software. When installing 11gR2 on RHEL 6.3, running the root.sh script fails with:

Adding daemon to inittab

CRS-4124: Oracle High Availability Services startup failed.

CRS-4000: Command Start failed, or completed with errors.

ohasd failed to start: Inappropriate ioctl for device

ohasd failed to start: Inappropriate ioctl for device at /grid/base/grid/crs/install/roothas.pl line 296.

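The ways to fix this error on RHEL 6.3 are:

Official fixes (I tested them and they did not help, but they are listed here anyway)

1. There is no perl file in the /usr/local/bin directory

Fix:

ln -s /usr/bin/perl /usr/local/bin/perl

2. The directories /tmp/.oracle, /var/tmp/.oracle, or /usr/tmp/.oracle already exist

Fix:

rm -rf /tmp/.oracle/* /usr/tmp/.oracle/* /var/tmp/.oracle/*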


Unofficial fix (from an article on the blog “语虎店”, link: http://yuhushop.com/?p=453)

While root.sh runs, it creates the file /var/tmp/.oracle/npohasd. The fix is to run the following command immediately after /var/tmp/.oracle/npohasd has been created during the root.sh run:

dd if=/var/tmp/.oracle/npohasd of=/dev/null bs=1024 count=1

Then run root.sh again. Before rerunning root.sh, first run a deconfig operation

from the GRID_HOME/crs/install directory:

./roothas.pl -deconfig -force -verbose
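
Because the dd command has to run right after the pipe appears, it can be easier to have a second root terminal wait for it. The following is a minimal sketch, assuming a one-second polling interval and a 300-second default timeout. The function name drain_npohasd and the NPOHASD variable are made up for illustration; only the dd command itself comes from the article.

```shell
#!/bin/bash
# Sketch: wait until the npohasd named pipe exists, then read from it
# once with the dd command from the article.
# NPOHASD and the timeout default are assumptions for illustration.
NPOHASD="${NPOHASD:-/var/tmp/.oracle/npohasd}"

drain_npohasd() {
    local pipe="${1:-$NPOHASD}"
    local timeout="${2:-300}"
    local waited=0
    # Poll once per second until the path exists and is a named pipe.
    while [ ! -p "$pipe" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "timed out waiting for $pipe" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    # Blocks until a writer opens the pipe, then reads one block.
    dd if="$pipe" of=/dev/null bs=1024 count=1
}
```

Start drain_npohasd as root in one terminal, then run root.sh in another. The function returns once dd has read from the pipe or the timeout expires.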

