Errors encountered with Oracle RAC

Source: Internet. Editor: 程序博客网. Date: 2024/05/01 09:19
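Over the past two days I installed RAC in a production environment. The installation succeeded, although I ran into quite a few minor problems along the way. Today a colleague rebooted the servers, and afterwards CRS would not start.
Database: Oracle 11.2.0.1; operating system: Oracle Linux 6.3.
Below is my record of the errors and how I handled them.
If anyone knows a better way to handle this, please let me know.
Symptoms:
After the server reboot, CRS fails to start.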
[grid@RAC1 ~]$ crs_stat -t -v
CRS-0184: Cannot communicate with the CRS daemon.
[grid@RAC1 ~]$ crsctl start crs
CRS-4563: Insufficient user privileges.
CRS-4000: Command Start failed, or completed with errors.
[grid@RAC1 ~]$ crsctl check crs
CRS-4639: Could not contact Oracle High Availability Services
CRS-4000: Command Start failed, or completed with errors.
According to what I found online, this is a bug in database version 11.2.0.1.
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
PL/SQL Release 11.2.0.1.0 - Production
CORE    11.2.0.1.0      Production
TNS for Linux: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production
My solution:
Run the following on both nodes:
[root@RAC1 bin]# dd if=/var/tmp/.oracle/npohasd of=/dev/null bs=1024 count=1
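The dd command above reads one block from the named pipe /var/tmp/.oracle/npohasd. As a minimal sketch, assuming a POSIX-style shell and that the read should only be attempted when the pipe actually exists, the check could be wrapped as follows. The function name read_ohasd_pipe is hypothetical and is not part of Oracle's tooling.

```shell
# Hypothetical helper: run the dd workaround only if the named pipe exists.
# The default path is the one used in this post; pass another path to override.
read_ohasd_pipe() {
    local pipe="${1:-/var/tmp/.oracle/npohasd}"
    if [ ! -p "$pipe" ]; then
        echo "not a named pipe: $pipe" >&2
        return 1
    fi
    # Same dd invocation as above, with the pipe path parameterised.
    dd if="$pipe" of=/dev/null bs=1024 count=1
}
```

Run as root on each node, for example `read_ohasd_pipe`. A non-zero return means the pipe was not there yet, so the read was skipped.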
Check the ohasd log:
[grid@RAC1 log]$ tail -f /u01/11.2.0/grid/log/rac1/ohasd/ohasd.log
2013-08-14 17:08:18.621: [UiServer][3376371456] S(0x7f1f7c000b70): Accepted client connection: saddr =(ADDRESS=(PROTOCOL=ipc)(DEV=29)(KEY=OHASD_UI_SOCKET))daddr = (ADDRESS=(PROTOCOL=ipc)(KEY=OHASD_UI_SOCKET))
2013-08-14 17:08:18.632: [UiServer][3378472704] processMessage called
2013-08-14 17:08:18.632: [UiServer][3378472704] Sending message to PE. ctx= 0x7f1f7c002f20
2013-08-14 17:08:18.632: [UiServer][3378472704] Sending command to PE: 43
2013-08-14 17:08:18.632: [   CRSPE][3382675200] Processing PE command id=144. Description: [Stat Resource : 0x7f1f8407d710]
2013-08-14 17:08:18.633: [   CRSPE][3382675200] Expression Filter : (((NAME == ora.crsd) OR (NAME == ora.cssd)) OR (NAME == ora.evmd))
2013-08-14 17:08:18.636: [   CRSPE][3382675200] PE Command [ Stat Resource : 0x7f1f8407d710 ] has completed
2013-08-14 17:08:18.636: [   CRSPE][3382675200] UI Command [Stat Resource : 0x7f1f8407d710] is replying to sender.
2013-08-14 17:08:18.636: [UiServer][3378472704] Done for ctx=0x7f1f7c002f20
2013-08-14 17:08:18.642: [UiServer][3376371456] Closed: remote end failed/disc.
2013-08-14 18:41:19.320: [ default][3993569056] OHASD Daemon Starting. Command string :reboot
2013-08-14 18:41:19.320: [ default][3600992032] OHASD Daemon Starting. Command string :reboot
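The last lines of the log show the "OHASD Daemon Starting" message that appears once the daemon comes up. As a small sketch, assuming the log line format shown above, the timestamp of the most recent start could be extracted like this. The function name last_ohasd_start is made up for illustration.

```shell
# Hypothetical helper: print the timestamp of the most recent
# "OHASD Daemon Starting" entry in the given ohasd.log file.
# Assumes each line begins with "YYYY-MM-DD HH:MM:SS.mmm: [" as in the excerpt above.
last_ohasd_start() {
    grep 'OHASD Daemon Starting' "$1" | tail -n 1 | sed 's/: \[.*//'
}
```

For the log in this post the call would be `last_ohasd_start /u01/11.2.0/grid/log/rac1/ohasd/ohasd.log`. Empty output means no start message was found.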
Then restart the CRS services:
[grid@RAC1 log]$ crs_start -all
CRS-5702: Resource 'ora.DATA.dg' is already running on 'rac1'
CRS-5702: Resource 'ora.LISTENER.lsnr' is already running on 'rac1'
CRS-5702: Resource 'ora.LISTENER_SCAN1.lsnr' is already running on 'rac1'
CRS-5702: Resource 'ora.OCR.dg' is already running on 'rac1'
CRS-5702: Resource 'ora.RCY.dg' is already running on 'rac1'
CRS-5702: Resource 'ora.VOTE.dg' is already running on 'rac1'
CRS-5702: Resource 'ora.asm' is already running on 'rac1'
CRS-5702: Resource 'ora.eons' is already running on 'rac1'
CRS-5702: Resource 'ora.gsd' is already running on 'rac1'
CRS-5702: Resource 'ora.net1.network' is already running on 'rac1'
CRS-5702: Resource 'ora.oc4j' is already running on 'rac1'
CRS-5702: Resource 'ora.ons' is already running on 'rac1'
CRS-5702: Resource 'ora.asm' is already running on 'rac1'
CRS-5702: Resource 'ora.LISTENER.lsnr' is already runnin
......
[grid@RAC1 log]$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
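The four "is online" messages above are what a healthy stack reports. As a rough sketch, assuming the message codes are exactly the ones shown in this post (CRS-4638, CRS-4537, CRS-4529, CRS-4533), a script could verify them instead of reading the output by eye. The function name all_crs_online is hypothetical.

```shell
# Hypothetical helper: reads `crsctl check crs` output on stdin and succeeds
# only if all four "online" messages seen in this post are present.
all_crs_online() {
    local online
    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
    online=$(grep -E -c '^CRS-(4638|4537|4529|4533):' || true)
    [ "$online" -eq 4 ]
}
```

A possible use is `crsctl check crs | all_crs_online && echo "stack is up"`.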
All services are running normally:
[grid@RAC1 log]$ crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora.DATA.dg    ora....up.type ONLINE    ONLINE    rac1        
ora....ER.lsnr ora....er.type ONLINE    ONLINE    rac1        
ora....N1.lsnr ora....er.type ONLINE    ONLINE    rac1        
ora.OCR.dg     ora....up.type ONLINE    ONLINE    rac1        
ora.RCY.dg     ora....up.type ONLINE    ONLINE    rac1        
ora.VOTE.dg    ora....up.type ONLINE    ONLINE    rac1        
ora.asm        ora.asm.type   ONLINE    ONLINE    rac1        
ora.eons       ora.eons.type  ONLINE    ONLINE    rac1        
ora.gsd        ora.gsd.type   ONLINE    ONLINE    rac1        
ora....network ora....rk.type ONLINE    ONLINE    rac1        
ora.oc4j       ora.oc4j.type  ONLINE    ONLINE    rac1        
ora.ons        ora.ons.type   ONLINE    ONLINE    rac1        
ora.orcl.db    ora....se.type ONLINE    ONLINE    rac1        
ora....SM1.asm application    ONLINE    ONLINE    rac1        
ora....C1.lsnr application    ONLINE    ONLINE    rac1        
ora.rac1.gsd   application    ONLINE    ONLINE    rac1        
ora.rac1.ons   application    ONLINE    ONLINE    rac1        
ora.rac1.vip   ora....t1.type ONLINE    ONLINE    rac1        
ora....SM2.asm application    ONLINE    ONLINE    rac2        
ora....C2.lsnr application    ONLINE    ONLINE    rac2        
ora.rac2.gsd   application    ONLINE    ONLINE    rac2        
ora.rac2.ons   application    ONLINE    ONLINE    rac2        
ora.rac2.vip   ora....t1.type ONLINE    ONLINE    rac2        
ora.scan1.vip  ora....ip.type ONLINE    ONLINE    rac1  
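With this many resources it is easy to overlook one that is not ONLINE. The sketch below assumes the column layout of the crs_stat -t table above (Name, Type, Target, State, Host) and prints the name of every resource whose State is anything other than ONLINE. The function name not_online_resources is illustrative only, and the names it prints are the abbreviated ones crs_stat displays.

```shell
# Hypothetical helper: reads `crs_stat -t` output on stdin and prints the
# (abbreviated) names of resources whose State column is not ONLINE.
# Rows are recognised by their Target column, which skips the header and ruler.
not_online_resources() {
    awk '$3 ~ /^(ONLINE|OFFLINE)$/ && $4 != "ONLINE" { print $1 }'
}
```

For example, `crs_stat -t | not_online_resources` should print nothing for the table above, since every row shows ONLINE in the State column.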

 

INS-35423: the installer cannot obtain the cluster nodes when installing the database

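The operating system is Red Hat Linux 6.4. After installing Oracle Grid 11.2.0.4.0, I started installing the matching version of the database software as the oracle user. At the Grid Installation Options screen (Oracle Database 11g Release 2 Installer, Step 4 of 10), the cluster node list was empty and the installer reported error INS-35423.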

After confirming that all the configuration was correct, I searched online and found the following solution on an overseas forum:

The cluster nodes could not be retrieved because of a problem in an XML file inside the inventory directory under the grid user's ORACLE_BASE. Here is the file:

[root@qjdb1 ContentsXML]# cat inventory.xml 
<?xml version="1.0" standalone="yes" ?>
<!-- Copyright (c) 1999, 2013, Oracle and/or its affiliates.
All rights reserved. -->
<!-- Do not modify the contents of this file by hand. -->
<INVENTORY>
<VERSION_INFO>
   <SAVED_WITH>11.2.0.4.0</SAVED_WITH>
   <MINIMUM_VER>2.1.0.6.0</MINIMUM_VER>
</VERSION_INFO>
<HOME_LIST>
<HOME NAME="Ora11g_gridinfrahome1" LOC="/u01/app/11.2.0/grid" TYPE="O" IDX="1">   <-- the CRS="true" attribute is missing here
   <NODE_LIST>
      <NODE NAME="qjdb1"/>
      <NODE NAME="qjdb2"/>
   </NODE_LIST>
</HOME>
</HOME_LIST>
<COMPOSITEHOME_LIST>
</COMPOSITEHOME_LIST>
</INVENTORY>
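The problem comes down to whether the HOME entry for the grid home carries CRS="true". As a minimal sketch, assuming each HOME element sits on a single line as in the listing above, that check can be scripted without editing the file. The function name home_has_crs_flag and its arguments are hypothetical.

```shell
# Hypothetical helper: succeeds if the HOME entry whose LOC equals the given
# path also carries CRS="true". Read-only; it never modifies inventory.xml.
# Usage: home_has_crs_flag <path to inventory.xml> <grid home path>
home_has_crs_flag() {
    local inventory="$1" home="$2"
    # Fixed-string match on the LOC attribute, then look for the flag on that line.
    grep -F "LOC=\"$home\"" "$inventory" | grep -q 'CRS="true"'
}
```

Going by the paths in this post, the call would be something like `home_has_crs_flag /u01/app/oraInventory/ContentsXML/inventory.xml /u01/app/11.2.0/grid`. The file location is an assumption based on the inventory path the installer reports.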

Fix it by running the following command:
[grid@qjdb1 ~]$ /u01/app/11.2.0/grid/oui/bin/runInstaller -updateNodeList ORACLE_HOME="/u01/app/11.2.0/grid" CRS=true
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB.   Actual 32000 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /u01/app/oraInventory
'UpdateNodeList' was successful.


After the command completes successfully, the contents of inventory.xml have changed to what we expect:

[root@qjdb1 ContentsXML]# cat inventory.xml 
<?xml version="1.0" standalone="yes" ?>
<!-- Copyright (c) 1999, 2013, Oracle and/or its affiliates.
All rights reserved. -->
<!-- Do not modify the contents of this file by hand. -->
<INVENTORY>
<VERSION_INFO>
   <SAVED_WITH>11.2.0.4.0</SAVED_WITH>
   <MINIMUM_VER>2.1.0.6.0</MINIMUM_VER>
</VERSION_INFO>
<HOME_LIST>
<HOME NAME="Ora11g_gridinfrahome1" LOC="/u01/app/11.2.0/grid" TYPE="O" IDX="1" CRS="true">
   <NODE_LIST>
      <NODE NAME="qjdb1"/>
      <NODE NAME="qjdb2"/>
   </NODE_LIST>
</HOME>
</HOME_LIST>
<COMPOSITEHOME_LIST>
</COMPOSITEHOME_LIST>
</INVENTORY>

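Note that inventory.xml should only be updated through the command. If you edit inventory.xml by hand, you may not even be able to open the database installer; for example, the runInstaller command may fail to run.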