RAC VOTING DISK and OCR


OCR (Oracle Cluster Registry) maintains the configuration information for the entire cluster, covering both RAC and Clusterware resources: node membership, databases, instances, services, listeners, applications, and so on.
The "amnesia" problem arises when each node keeps its own copy of the configuration and a change made on one node is not synchronized to the others. The best solution is for the whole cluster to keep a single shared copy of the configuration: no matter which node the change is made on, the same configuration file is modified, so no change is ever lost.
OCR records configuration as key-value pairs; when you make changes with OEM, DBCA, or SRVCTL, this is the file that gets updated.
To modify OCR contents, the local node's OCR process submits the request to the OCR process on the master node; the master OCR process performs the physical read/write and then synchronizes the OCR cache contents on all nodes.
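The key-value model mentioned above can be illustrated with a tiny sketch. The key names here are hypothetical, purely to show the idea of hierarchical keys mapped to values; real OCR keys and storage are internal to Clusterware:

```shell
# Toy illustration of a key-value registry (NOT real OCR keys or storage).
ocr_set() { printf '%s=%s\n' "$1" "$2" >> /tmp/ocr_demo.txt; }
ocr_get() { grep "^$1=" /tmp/ocr_demo.txt | tail -1 | cut -d= -f2; }

: > /tmp/ocr_demo.txt                      # start with an empty "registry"
ocr_set DATABASE.devdb.INSTANCE devdb1     # hypothetical key name
ocr_get DATABASE.devdb.INSTANCE            # prints the stored value: devdb1
```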

Voting disk: manages cluster node membership; its records determine which nodes belong to the cluster. When a split-brain occurs, it arbitrates which partition keeps control of the cluster, while the other partitions are evicted. The voting disk uses a majority rule: with multiple voting disks configured, more than half of them must be accessible for Clusterware to work. For example, with 4 voting disks the cluster keeps running if one fails; if two fail, the more-than-half requirement is no longer met, the cluster goes down immediately, and all nodes reboot. So when adding voting disks, add two rather than just one. This differs from OCR, which only needs a single configured copy.
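The more-than-half rule in the 4-disk example above can be checked with a short sketch: with `total` voting disks configured, the cluster survives only while strictly more than half remain usable.

```shell
# Sketch of the voting-disk majority rule from the text:
# survive only while 2 * usable > total.
total=4
for failed in 0 1 2; do
  usable=$((total - failed))
  if [ $((2 * usable)) -gt "$total" ]; then
    echo "$failed failed: cluster survives"
  else
    echo "$failed failed: cluster goes down"
  fi
done
```

With 4 disks, losing 2 leaves exactly half, which does not satisfy "more than half", matching the behavior described above.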

Clearly this information is critical and should be backed up as part of routine operations (OCR is also backed up automatically). When the OCR or voting disk has a problem, restore from a backup if one exists; without a backup, the only option is to rebuild.

Voting disk backup and restore test:
1. Find the location of the voting disk:

crsctl query css votedisk
 0.     0    /ocfs/clusterware/votingdisk


located 1 votedisk(s).

2. Back up the voting disk:

dd if=/ocfs/clusterware/votingdisk of=/home/oracle/rman/voting_disk.bak
20000+0 records in
20000+0 records out
10240000 bytes (10 MB) copied, 1.01971 seconds, 10.0 MB/s


3. Restore the voting disk:

dd if=/home/oracle/rman/voting_disk.bak of=/ocfs/clusterware/votingdisk
20000+0 records in
20000+0 records out
10240000 bytes (10 MB) copied, 0.590575 seconds, 17.3 MB/s


The node shut down automatically after this restore test. The reason: the restore was performed while the cluster was running normally, rather than after an actual voting disk failure.

4. Examine the contents of the voting disk:

strings voting_disk.bak
SslcLlik
SslcLlik
SslcLlik
SslcLlik
SslcLlik
SslcLlik
SslcLlik
SslcLlik
SslcLlik


OCR backup and restore test:
Because the OCR contents are so important, Oracle backs them up automatically every 4 hours, keeping the last 3 backups plus the last backup of the previous day and of the previous week. The backup is performed by the CRSD process on the master node; the default location is the $CRS_HOME/crs/cdata/<cluster_name> directory, which can be changed with the ocrconfig -backuploc <directory_name> command. After each backup the file names are rotated to reflect backup order, and the most recent backup is named backup00.ocr. Besides keeping these files locally, the DBA should also keep a copy on other storage to guard against unexpected storage failure.
Oracle recommends taking an OCR backup before making cluster changes such as adding or deleting nodes or changing the RAC IP addresses; you can use export to back up to a named file. After a replace or restore operation, Oracle recommends running cluvfy comp ocr -n all for a full check. OCR backup and restore are done with the ocrconfig command.
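The name rotation described above (newest backup always becomes backup00.ocr) can be sketched on plain files. This is only an illustration of the naming scheme; the real rotation is performed internally by the master node's CRSD:

```shell
# Hedged sketch of the backup00/backup01/backup02 name rotation (names only;
# the actual 4-hourly backups are managed by CRSD, not by this script).
cd "$(mktemp -d)"
touch backup00.ocr backup01.ocr backup02.ocr   # existing backups, newest first
mv backup01.ocr backup02.ocr                   # shift older names down
mv backup00.ocr backup01.ocr
touch backup00.ocr                             # the new backup takes backup00.ocr
ls backup0*.ocr
```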

1. Look at the OCR backups:

ls -l
total 13500
-rw-r--r-- 1 root root 4595712 May  7 19:50 backup00.ocr
-rw-r--r-- 1 root root 4595712 May  7 19:50 day.ocr
-rw-r--r-- 1 root root 4595712 May  7 19:50 week.ocr
rac1-> pwd
/home/oracle/product/10.2.0/crs_1/cdata/crs


2. ocrconfig help:

ocrconfig --help
Name:
        ocrconfig - Configuration tool for Oracle Cluster Registry.
Synopsis:
        ocrconfig [option]
        option:
                -export <filename> [-s online]
                                                    - Export cluster register contents to a file
                -import <filename>                  - Import cluster registry contents from a file
                -upgrade [<user> [<group>]]
                                                    - Upgrade cluster registry from previous version
                -downgrade [-version <version string>]
                                                    - Downgrade cluster registry to the specified version
                -backuploc <dirname>                - Configure periodic backup location
                -showbackup                         - Show backup information
                -restore <filename>                 - Restore from physical backup
                -replace ocr|ocrmirror [<filename>] - Add/replace/remove a OCR device/file
                -overwrite                          - Overwrite OCR configuration on disk
                -repair ocr|ocrmirror <filename>    - Repair local OCR configuration
                -help                               - Print out this help information
Note:
        A log file will be created in
        $ORACLE_HOME/log/<hostname>/client/ocrconfig_<pid>.log. Please ensure
        you have file creation privileges in the above directory before
        running this tool.


3. Restore the OCR:

/etc/init.d/init.crs stop        # CRS must be stopped on both nodes
ocrconfig -showbackup
ocrconfig -restore filename_location:

[root@rac1 ~]# /home/oracle/product/10.2.0/crs_1/bin/ocrconfig -restore /home/oracle/product/10.2.0/crs_1/cdata/crs/backup00.ocr
[root@rac1 bin]# ./ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     262144
         Used space (kbytes)      :       4304
         Available space (kbytes) :     257840
         ID                       : 1278044310
         Device/File Name         : /ocfs/clusterware/ocr
                                    Device/File integrity check succeeded
                                    Device/File not configured
         Cluster registry integrity check succeeded


Start CRS on both nodes.
Export the OCR contents:

[root@rac1 ~]# cd /home/oracle/product/10.2.0/crs_1/bin/
[root@rac1 bin]# ./ocrconfig -export /home/oracle/rman/ocr.exp

Check the CRS status:
./crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy


Corrupt the OCR contents:

dd if=/dev/zero of=/ocfs/clusterware/ocr bs=1024 count=102400
102400+0 records in
102400+0 records out
104857600 bytes (105 MB) copied, 15.1351 seconds, 6.9 MB/s


Check OCR consistency:

[root@rac1 bin]# ./ocrcheck
PROT-601: Failed to initialize ocrcheck


Check consistency with the cluvfy tool:

/home/oracle/clusterware/cluvfy/runcluvfy.sh comp ocr -n all

Verifying OCR integrity
Unable to retrieve nodelist from Oracle clusterware.
Verification cannot proceed.


Restore the OCR contents with import:

./ocrconfig -import /home/oracle/rman/ocr.exp
[root@rac1 bin]#


Check the OCR again:

./ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     262144
         Used space (kbytes)      :       4304
         Available space (kbytes) :     257840
         ID                       :   86171642
         Device/File Name         : /ocfs/clusterware/ocr
                                    Device/File integrity check succeeded
                                    Device/File not configured
         Cluster registry integrity check succeeded


Check with the cluvfy tool:

/home/oracle/clusterware/cluvfy/runcluvfy.sh comp ocr -n all

Verifying OCR integrity
Checking OCR integrity...
Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations.
Uniqueness check for OCR device passed.
Checking the version of OCR...
OCR of correct Version "2" exists.
Checking data integrity of OCR...
Data integrity check for OCR passed.
OCR integrity check passed.
Verification of OCR integrity was successful.


Verification passed.

crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.....CRM.cs application    ONLINE    ONLINE    rac1
ora....db1.srv application    ONLINE    ONLINE    rac1
ora.devdb.db   application    ONLINE    ONLINE    rac2
ora....b1.inst application    ONLINE    ONLINE    rac1
ora....b2.inst application    ONLINE    ONLINE    rac2
ora....SM1.asm application    ONLINE    ONLINE    rac1
ora....C1.lsnr application    ONLINE    ONLINE    rac1
ora.rac1.gsd   application    ONLINE    ONLINE    rac1
ora.rac1.ons   application    ONLINE    ONLINE    rac1
ora.rac1.vip   application    ONLINE    ONLINE    rac1
ora....SM2.asm application    ONLINE    ONLINE    rac2
ora....C2.lsnr application    ONLINE    ONLINE    rac2
ora.rac2.gsd   application    ONLINE    ONLINE    rac2
ora.rac2.ons   application    ONLINE    ONLINE    rac2
ora.rac2.vip   application    ONLINE    ONLINE    rac2


A dd backup can also be used:

dd if=/ocfs/clusterware/ocr of=/home/oracle/rman/ocr.bak
204800+0 records in
204800+0 records out
104857600 bytes (105 MB) copied, 40.2978 seconds, 2.6 MB/s
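After a dd backup it is worth confirming the copy is byte-identical before relying on it. A minimal sketch, using temp files as stand-ins for the OCR device and its backup:

```shell
# Verify a dd copy with cmp (temp files stand in for the real device paths).
src=$(mktemp); bak=$(mktemp)
head -c 1048576 /dev/urandom > "$src"   # stand-in for /ocfs/clusterware/ocr
dd if="$src" of="$bak" 2>/dev/null      # same style of dd backup as above
cmp -s "$src" "$bak" && echo "backup matches source"
```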



That covers backup and restore; now let's look at rebuilding:
1. Take a backup of the OCR and the voting disk:

dd if=/ocfs/clusterware/votingdisk of=/home/oracle/rman/voting_disk.bak
20000+0 records in
20000+0 records out
10240000 bytes (10 MB) copied, 6.65133 seconds, 1.5 MB/s


2. Stop the database-related services:

rac1-> srvctl stop instance -d devdb -i devdb1
rac1-> srvctl stop instance -d devdb -i devdb2
rac1-> srvctl stop asm -n rac1
rac1-> srvctl stop asm -n rac2
rac1-> srvctl stop nodeapps -n rac1
rac1-> srvctl stop nodeapps -n rac2


3. Stop CRS on all nodes:
Node 1:

[root@rac1 bin]# ./crsctl stop crs
Stopping resources.
Successfully stopped CRS resources
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.


Node 2:

./crsctl stop crs
Stopping resources.
Successfully stopped CRS resources
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.


4. Back up the Clusterware home on each node:

 cp -rf crs_1 crs_1_bak


5. Run the <ORA_CRS_HOME>/install/rootdelete.sh command on all nodes:

pwd
/home/oracle/product/10.2.0/crs_1/install
[root@rac1 install]# ls
cluster.ini         install.excl  paramfile.crs  rootaddnode.sbs   rootdeletenode.sh  rootlocaladd
cmdllroot.sh        install.incl  preupdate.sh   rootconfig        rootdelete.sh      rootupgrade
envVars.properties  make.log      readme.txt     rootdeinstall.sh  rootinstall        templocal
[root@rac1 install]# ./rootde
rootdeinstall.sh   rootdeletenode.sh  rootdelete.sh
[root@rac1 install]# ./rootdelete.sh
Shutting down Oracle Cluster Ready Services (CRS):
Stopping resources.
Error while stopping resources. Possible cause: CRSD is down.
Stopping CSSD.
Unable to communicate with the CSS daemon.
Shutdown has begun. The daemons should exit soon.
Checking to see if Oracle CRS stack is down...
Oracle CRS stack is not running.
Oracle CRS stack is down now.
Removing script for Oracle Cluster Ready services
Updating ocr file for downgrade
Cleaning up SCR settings in '/etc/oracle/scls_scr'


6. Run the <ORA_CRS_HOME>/install/rootdeinstall.sh command on the installing node. Remember: this script runs only on the node where the installation was performed:

./rootdeinstall.sh
Removing contents from OCR device
2560+0 records in
2560+0 records out
10485760 bytes (10 MB) copied, 1.05943 seconds, 9.9 MB/s


7. Check that no CRS processes remain; if the commands return nothing, continue to the next step.

[root@rac1 install]# ps -elf | grep -i 'ocs[s]d'
[root@rac1 install]# ps -elf | grep -i 'cr[s]d.bin'
[root@rac1 install]# ps -elf | grep -i 'ev[m]d.bin'
[root@rac1 install]#
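The three checks above can be wrapped in one helper that succeeds only when none of the clusterware daemons are running. The grep patterns are the same ones used above; the `[s]`/`[m]` bracket trick keeps grep from matching its own command line. The function name is my own, added for illustration:

```shell
# Hypothetical helper: returns success (0) only when no ocssd, crsd.bin,
# or evmd.bin process is found.
crs_stack_down() {
  for p in 'ocs[s]d' 'cr[s]d.bin' 'ev[m]d.bin'; do
    ps -elf | grep -i "$p" > /dev/null && return 1
  done
  return 0
}

crs_stack_down && echo "CRS stack is down"
```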


8. Run ORA_CRS_HOME/root.sh on the installing node:
 

./root.sh
WARNING: directory '/home/oracle/product/10.2.0' is not owned by root
WARNING: directory '/home/oracle/product' is not owned by root
WARNING: directory '/home/oracle' is not owned by root
Checking to see if Oracle CRS stack is already configured
Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/home/oracle/product/10.2.0' is not owned by root
WARNING: directory '/home/oracle/product' is not owned by root
WARNING: directory '/home/oracle' is not owned by root
assigning default hostname rac1 for node 1.
assigning default hostname rac2 for node 2.
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node <nodenumber>: <nodename> <private interconnect name> <hostname>
node 1: rac1 rac1-priv rac1
node 2: rac2 rac2-priv rac2
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Now formatting voting device: /ocfs/clusterware/votingdisk
Format of 1 voting devices complete.
Startup will be queued to init within 90 seconds.
/etc/profile: line 57: ulimit: pipe size: cannot modify limit: Invalid argument
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
        rac1
CSS is inactive on these nodes.
        rac2
Local node checking complete.
Run root.sh on remaining nodes to start CRS daemons.


9. Run root.sh under ORA_CRS_HOME on the other node as well.
At the end it reports: Oracle CRS stack installed and running under init(1M)

Running vipca(silent) for configuring nodeapps
Error 0(Native: listNetInterfaces:[3])
  [Error 0(Native: listNetInterfaces:[3])]


The fix:

[root@rac2 bin]# ./oifcfg getif
[root@rac2 bin]# ./oifcfg iflist
eth0  192.168.1.0
eth1  10.10.10.0
[root@rac2 bin]# ./oifcfg setif -global eth0/192.168.1.0:public
[root@rac2 bin]# ./oifcfg setif -global eth1/10.10.10.0:cluster_interconnect
You have new mail in /var/spool/mail/root
[root@rac2 bin]# ./oifcfg getif
eth0  192.168.1.0  global  public
eth1  10.10.10.0  global  cluster_interconnect


Then run vipca from the graphical interface.
Once that configuration is complete, continue.

10. Check the CRS status:
 

crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.rac1.gsd   application    ONLINE    ONLINE    rac1
ora.rac1.ons   application    ONLINE    ONLINE    rac1
ora.rac1.vip   application    ONLINE    ONLINE    rac1
ora.rac2.gsd   application    ONLINE    ONLINE    rac2
ora.rac2.ons   application    ONLINE    ONLINE    rac2
ora.rac2.vip   application    ONLINE    ONLINE    rac2


11. Configure the listeners (netca):

rac1-> mv /home/oracle/product/10.2.0/db_1/network/admin/listener.ora /tmp/listener.ora.bak
rac2-> mv /home/oracle/product/10.2.0/db_1/network/admin/listener.ora /tmp/listener.ora.bak


Run netca.
After configuration, checking the services again shows that the listener services have been added:

rac1-> crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora....C1.lsnr application    ONLINE    ONLINE    rac1
ora.rac1.gsd   application    ONLINE    ONLINE    rac1
ora.rac1.ons   application    ONLINE    ONLINE    rac1
ora.rac1.vip   application    ONLINE    ONLINE    rac1
ora....C2.lsnr application    ONLINE    ONLINE    rac2
ora.rac2.gsd   application    ONLINE    ONLINE    rac2
ora.rac2.ons   application    ONLINE    ONLINE    rac2
ora.rac2.vip   application    ONLINE    ONLINE    rac2


12. Add the remaining resources back into the OCR:

rac1-> srvctl add asm -n rac1 -i +ASM1 -o /home/oracle/product/10.2.0/db_1
rac1-> srvctl add asm -n rac2 -i +ASM2 -o /home/oracle/product/10.2.0/db_1
rac1-> srvctl add database -d devdb -o /home/oracle/product/10.2.0/db_1
rac1-> srvctl add instance -d devdb -i devdb1 -n rac1
rac1-> srvctl add instance -d devdb -i devdb2 -n rac2
rac1-> srvctl add service -d devdb -s oltp -r devdb1,devdb2 -P BASIC


13. When done, look at the services again:

crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.devdb.db   application    OFFLINE   OFFLINE
ora....b1.inst application    OFFLINE   OFFLINE
ora....b2.inst application    OFFLINE   OFFLINE
ora....oltp.cs application    OFFLINE   OFFLINE
ora....db1.srv application    OFFLINE   OFFLINE
ora....db2.srv application    OFFLINE   OFFLINE
ora....SM1.asm application    OFFLINE   OFFLINE
ora....C1.lsnr application    ONLINE    ONLINE    rac1
ora.rac1.gsd   application    ONLINE    ONLINE    rac1
ora.rac1.ons   application    ONLINE    ONLINE    rac1
ora.rac1.vip   application    ONLINE    ONLINE    rac1
ora....SM2.asm application    OFFLINE   OFFLINE
ora....C2.lsnr application    ONLINE    ONLINE    rac2
ora.rac2.gsd   application    ONLINE    ONLINE    rac2
ora.rac2.ons   application    ONLINE    ONLINE    rac2
ora.rac2.vip   application    ONLINE    ONLINE    rac2
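When a listing mixes ONLINE and OFFLINE resources like this, a quick filter shows which ones still need starting. A sketch over sample lines in the crs_stat -t column layout (state is the fourth column):

```shell
# Filter crs_stat -t style output for resources whose State is not ONLINE.
# The sample lines mimic the listing above; field 4 is the State column.
sample='ora.devdb.db   application    OFFLINE   OFFLINE
ora....C1.lsnr application    ONLINE    ONLINE    rac1
ora.rac1.vip   application    ONLINE    ONLINE    rac1'

echo "$sample" | awk '$4 != "ONLINE" { print $1 }'   # prints: ora.devdb.db
```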


14. Start the resources:

rac1-> srvctl start asm -n rac1
rac1-> srvctl start asm -n rac2
rac1-> srvctl start database -d devdb
rac1-> srvctl start service -d devdb


15. Check again:

rac1-> crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.devdb.db   application    ONLINE    ONLINE    rac1
ora....b1.inst application    ONLINE    ONLINE    rac1
ora....b2.inst application    ONLINE    ONLINE    rac2
ora....oltp.cs application    ONLINE    ONLINE    rac1
ora....db1.srv application    ONLINE    ONLINE    rac1
ora....db2.srv application    ONLINE    ONLINE    rac2
ora....SM1.asm application    ONLINE    ONLINE    rac1
ora....C1.lsnr application    ONLINE    ONLINE    rac1
ora.rac1.gsd   application    ONLINE    ONLINE    rac1
ora.rac1.ons   application    ONLINE    ONLINE    rac1
ora.rac1.vip   application    ONLINE    ONLINE    rac1
ora....SM2.asm application    ONLINE    ONLINE    rac2
ora....C2.lsnr application    ONLINE    ONLINE    rac2
ora.rac2.gsd   application    ONLINE    ONLINE    rac2
ora.rac2.ons   application    ONLINE    ONLINE    rac2
ora.rac2.vip   application    ONLINE    ONLINE    rac2


16. Run an environment check:

cluvfy stage -post crsinst -n rac1,rac2

Performing post-checks for cluster services setup

Checking node reachability...
Node reachability check passed from node "rac1".
Checking user equivalence...
User equivalence check passed for user "oracle".
Checking Cluster manager integrity...
Checking CSS daemon...
Daemon status check passed for "CSS daemon".
Cluster manager integrity check passed.
Checking cluster integrity...
Cluster integrity check passed
Checking OCR integrity...
Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations.
Uniqueness check for OCR device passed.
Checking the version of OCR...
OCR of correct Version "2" exists.
Checking data integrity of OCR...
Data integrity check for OCR passed.
OCR integrity check passed.
Checking CRS integrity...
Checking daemon liveness...
Liveness check passed for "CRS daemon".
Checking daemon liveness...
Liveness check passed for "CSS daemon".
Checking daemon liveness...
Liveness check passed for "EVM daemon".
Checking CRS health...
CRS health check passed.
CRS integrity check passed.
Checking node application existence...
Checking existence of VIP node application (required)
Check passed.
Checking existence of ONS node application (optional)
Check passed.
Checking existence of GSD node application (optional)
Check passed.
Post-check for cluster services setup was successful.



Rebuild complete.

This article draws on http://blog.csdn.net/tianlesoftware/article/details/6050606 and the book《大话RAC》. The OCR and voting disk management, recovery, and rebuild tests described above all passed. Thanks also to 21大湿 for the guidance!
