Oracle 11g Data Guard Broker Switchover Fast_start Failover

来源:互联网 发布:数学必修三算法 编辑:程序博客网 时间:2024/05/19 17:11
DG Broker 官方文档
Oracle data guard broker - E40771-05
How to switchover and failover by DG broker

http://docs.oracle.com/cd/E11882_01/server.112/e40771/sofo.htm#DGBKR394
Enterprise Manager  manage DG broker(11.1) , 11.2版本已经没有讲解EM管理DG broker的详细内容了。
http://docs.oracle.com/cd/B28359_01/server.111/b28295/gui.htm#g1062576
DGMGRL manage DG broker(11.2)
http://docs.oracle.com/cd/E11882_01/server.112/e40771/cli.htm#DGBKR495

配置环境
主库:10.253.52.42 BKFSF
备库:10.253.52.107 BKFSST
Observer:10.253.52.39


#################################################
/*DG+DG Broker+Flashback+Fast_start Failover搭建步骤*/
#################################################

--DG+DG Broker+Flashback+Fast_start Failover搭建步骤
1, DG标准配置max performance
2, DG模式转换max availability
3, DG Broker准备工作
4, DG Broker配置工作
5, 启动Fast_start Failover

1,配置DG,请参考文档/*data guard physical standby database oracle 11gr2.sql*/http://docs.oracle.com/cd/E11882_01/server.112/e41134/create_ps.htm#SBYDB4718
这里,我测试了一下,利用冷备份的方式恢复了standby数据库,除pfile,orapwd,standby controlfile需要重建,其他文件只需要在主库冷拷贝至从库即可。
这里,出现了2个问题,主库LNS服务没有起来,日志显示,主库与备库通信异常,有2点原因,a)备库防火墙挡住了1541端口,b)备库密码文件名称未修改。

2,模式转换:将默认的最大保护模式转换成最高可用模式。

3,配置DG Broker准备.
a, 主备库启用spfile参数文件
b, 主备库设置DG_BROKER_START=TRUE
   注意:一旦该参数设置为TRUE,DMON进程立马启动
c, 主备库修改listener.ora,并且运行lsnrctl reload重新加载生效。
主库:listener.ora
LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = q1ebsdb.imc.com)(PORT = 1541))
      (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1541))
    )
  )

# Added for DG Broker
SID_LIST_LISTENER=(SID_LIST=(SID_DESC=(SID_NAME=BKFSF)
     (GLOBAL_DBNAME=BKFSF_DGMGRL)
     (ORACLE_HOME=/u02/oracle/product/11.2.0/db_1)
     (ENVS="TNS_ADMIN=/u02/oracle/product/11.2.0/db_1/network/admin")))

ADR_BASE_LISTENER = /u02/oracle

备库:listener.ora
LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = itnm)(PORT = 1541))
      (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1541))
    )
  )

# Added for DG Broker
SID_LIST_LISTENER=(SID_LIST=(SID_DESC=(SID_NAME=BKFSFST)
     (GLOBAL_DBNAME=BKFSFST_DGMGRL)
     (ORACLE_HOME=/u02/oracle/product/11.2.0/db_1)
     (ENVS="TNS_ADMIN=/u02/oracle/product/11.2.0/db_1/network/admin")))

ADR_BASE_LISTENER = /u02/oracle

4, 配置DG broker步骤
http://docs.oracle.com/cd/E11882_01/server.112/e40771/cli.htm#DGBKR495
主库
DGMGRL> create configuration 'DRSolution' as primary database is 'BKFSF' connect identifier is BKFSF;
DGMGRL> show configuration;
DGMGRL> add database 'BKFSFST' as connect identifier is BKFSFST;
DGMGRL> show configuration;
DGMGRL> show database verbose 'BKFSF';
DGMGRL> show database verbose 'BKFSFST';
BKFSF> !lsnrctl status --此时主库的监听已经出现了BKFSF_DGB的service。
BKFSFST> !lsnrctl status --此时备库的监听还未出现BKFSFST_DGB的service。
DGMGRL> enable configuration;--启用DG Broker配置在主备服务器。
DGMGRL> show configuration;
BKFSFST> !lsnrctl status --此时备库的监听已经出现BKFSFST_DGB的service。
此时主备库的DG Broker算是配置完成了。

5, 此时数据库可以启用Fast-start Failover的功能。
a, 设置faststartfailovertarget
b,另外主备库配置flashback功能,可以帮助failover之后对原主库的状态恢复,使其成为新的备库。
c,启用第三方observer服务器。
d,启用Fast_start Failover
请参考http://docs.oracle.com/cd/E11882_01/server.112/e40771/cli.htm#DGBKR3430的6.6章节

a, 设置faststartfailovertarget,按下面的测试来看估计是需要在主备库分别设置。
主库
DGMGRL> edit database 'BKFSF' set property FastStartFailoverTarget='BKFSFST';--FastStartFailoverTarget
DGMGRL> show fast_start failover;--此时显示
Fast-Start Failover: DISABLED

  Threshold:        30 seconds
  Target:           (none)
  Observer:         (none)
  Lag Limit:        30 seconds
  Shutdown Primary: TRUE
  Auto-reinstate:   TRUE

Configurable Failover Conditions
  Health Conditions:
    Corrupted Controlfile          YES
    Corrupted Dictionary           YES
    Inaccessible Logfile            NO
    Stuck Archiver                  NO
    Datafile Offline               YES

  Oracle Error Conditions:
    (none)
DGMGRL> show database 'BKFSF' FastStartFailoverTarget;--此时显示
  FastStartFailoverTarget = 'BKFSFST'
备库
DGMGRL> show fast_start failover;--此时备库显示

Fast-Start Failover: DISABLED

  Threshold:        30 seconds
  Target:           (none)
  Observer:         (none)
  Lag Limit:        30 seconds
  Shutdown Primary: TRUE
  Auto-reinstate:   TRUE

Configurable Failover Conditions
  Health Conditions:
    Corrupted Controlfile          YES
    Corrupted Dictionary           YES
    Inaccessible Logfile            NO
    Stuck Archiver                  NO
    Datafile Offline               YES

  Oracle Error Conditions:
    (none)
DGMGRL> show database 'BKFSFST' FastStartFailoverTarget;--此时备库显示
  FastStartFailoverTarget = ''

b, 主备库启动flashback
--启用flashback,启动alter database open read only,启动MRP,备库报错.
BKFSFST> alter database recover managed standby database using current logfile disconnect from session;
alter database recover managed standby database using current logfile disconnect from session
*
ERROR at line 1:
ORA-01153: an incompatible media recovery is active
/*解决方案*/
shutdown immediate数据库,重新启动,并开启MRP。这里主要原因是Fast_start Failover自动开启了MRP,不需要手动开启了。

c, 在第三方装有oracle client的服务器q1ebsdb01上启动observer
DGMGRL> connect sys@BKFSF
DGMGRL> START OBSERVER FILE='/home/oraprod/DG/config.dat';
--这里一直停留在这里,使用nohup
nohup dgmgrl sys/amaxgs@BKFSF "start observer file='/home/oraprod/fsfo.dat'" >> /home/oraprod/dgmgrl.log &
验证如下
col FS_FAILOVER_OBSERVER_HOST for a30
select FS_FAILOVER_STATUS,FS_FAILOVER_CURRENT_TARGET,FS_FAILOVER_THRESHOLD,FS_FAILOVER_observer_PRESENT,FS_FAILOVER_observer_HOST from v$database;

d,DGMGRL> ENABLE FAST_START FAILOVER;
DGMGRL> show fast_start failover;--检查主备库
DGMGRL> show database 'BKFSFST' FastStartFailoverTarget;--此时备库已经有BKFSF的FastStartFailoverTarget值了。
  FastStartFailoverTarget = 'BKFSF'

######################
/*DG Broker修改DG参数*/
######################

修改主备库db_file_name_convert,log_file_name_convert,配置了DG Broker需要通过DGMGRL对该参数进行修改,直接在spfile修改不会同步到DG Broker的配置文件。
DGMGRL> edit database 'BKFSF' set property LogFileNameConvert='/u02/oracle/data,/u02/oracle/data';
DGMGRL> edit database 'BKFSF' set property DbFileNameConvert='/u02/oracle/data,/u02/oracle/data';
DGMGRL> show database verbose 'BKFSF';

DGMGRL> edit database 'BKFSFST' set property LogFileNameConvert='/u02/oracle/data,/u02/oracle/data';
DGMGRL> edit database 'BKFSFST' set property DbFileNameConvert='/u02/oracle/data,/u02/oracle/data';
DGMGRL> show database verbose 'BKFSFST';
 
################################
/*DG Broker 进行swithover操作测试*/
################################

步骤:
主库上操作
$ dgmgrl / --dgmgrl使用本地OS认证登录,后面出现权限不足的告警。
DGMGRL> switchover to 'BKFSFST';
Performing switchover NOW, please wait...
New primary database "BKFSFST" is opening...
Operation requires shutdown of instance "BKFSF" on database "BKFSF"
Shutting down instance "BKFSF"...
ORA-01031: insufficient privileges

Warning: You are no longer connected to ORACLE.

Please complete the following steps to finish switchover:
        shut down instance "BKFSF" of database "BKFSF"
        start up instance "BKFSF" of database "BKFSF"

这里出现权限不足的告警,需要人工干预,关闭BKFSF,再启动BKFSF到mount。
https://forums.oracle.com/message/3229546
DGMGRL-to avoid ORA-01031 insufficient privileges during switchover
login use dgmgrl <username>/<password> NOT O/S authentication (dgmgrl /)

解决方法:--dgmgrl登录,别使用本地OS认证即可。
主库上操作
$ dgmgrl
DGMGRL> connect sys
DGMGRL> switchover to 'BKFSFST';
Performing switchover NOW, please wait...
New primary database "BKFSFST" is opening...
Operation requires shutdown of instance "BKFSF" on database "BKFSF"
Shutting down instance "BKFSF"...
ORACLE instance shut down.
Operation requires startup of instance "BKFSF" on database "BKFSF"
Starting instance "BKFSF"...
ORACLE instance started.
Database mounted.
Switchover succeeded, new primary is "BKFSFST"

DG Broker+Fast_start Failover开启后角色切换有下列情况
1,DG Broker角色切换后,新的备库状态为mount
2,手动SQL*Plus启动备库到会自动启动MRP进程。

切换后:
主库:BKFSFST
备库:BKFSF

######################################
/*DG Broker 进行Fast_start Failover操作测试*/
######################################

主库:BKFSFST
强制关闭主库实例和监听,导致observer发现异常,需要进行failover。
shutdown abort;
!lsnrctl stop

观察BKFSF的日志,发现Failover已经自动开始。
tail -f /u02/oracle/diag/rdbms/bkfsf/BKFSF/trace/drcBKFSF.log
...
12/16/2013 16:52:23
FAILOVER TO BKFSF
Beginning failover to database BKFSF
Notifying Oracle Clusterware to teardown database for FAILOVER
12/16/2013 16:52:28
Notifying DMON of db close
DMON: Old primary "BKFSFST" needs reinstatement
Protection mode set to MAXIMUM AVAILABILITY
Deferring associated archivelog destinations of sites permanently disabled due to Failover
Notifying Oracle Clusterware to buildup primary database after FAILOVER
Posting DB_DOWN alert ...
        ... with reason Data Guard Fast-Start Failover - Primary Disconnected
Command FAILOVER TO BKFSF completed
12/16/2013 16:52:36
Data Guard Broker Status Summary:
  Type                        Name                             Severity  Status
  Configuration               DRSolution                        Warning  ORA-16608
  Primary Database            BKFSF                             Warning  ORA-16817
  Physical Standby Database   BKFSFST                             Error  ORA-16661

验证切换结果
DGMGRL> show configuration;
DGMGRL> show database verbose 'BKFSF';
DGMGRL> show database verbose 'BKFSFST';

完成Failover

#####################################
/*DG Broker 进行reinstat问题主库操作测试*/
#####################################

a, lsnrctl start
b, startup mount;
--至此,发现数据自动完成了BKFSFST的reinstate过程,太好了!!!
--下面的步骤可以省略
c, dgmgrl connect sys
d, reinstate database 'BKFSFST';

itnm | BKFSFST | /home/oracle > tail -f /u02/oracle/diag/rdbms/bkfsfst/BKFSFST/trace/drcBKFSFST.log
...
12/16/2013 17:05:51
>> Starting Data Guard Broker bootstrap <<
Broker Configuration File Locations:
      dg_broker_config_file1 = "/u02/oracle/product/11.2.0/db_1/dbs/dr1BKFSFST.dat"
      dg_broker_config_file2 = "/u02/oracle/product/11.2.0/db_1/dbs/dr2BKFSFST.dat"
12/16/2013 17:05:54
DMON Registering service BKFSFST_DGB with listener(s)
Broker Configuration:       "DRSolution"
      Protection Mode:            Maximum Availability
      Fast-Start Failover (FSFO): Enabled, flags=0x40001, version=18
      Primary Database:           BKFSFST (0x02010000)
      Standby Database:           BKFSF, Enabled Physical Standby (FSFO target) (0x01010000)
12/16/2013 17:05:57
Site BKFSF returned ORA-16623.
Version Check Results:
      Database BKFSF returned ORA-16623
Disable redo transport to all standby databases
RSM0 failed to remove all destinations, error ORA-16718
DMON: Old primary "BKFSFST" needs reinstatement
12/16/2013 17:06:14
Data Guard notifying Oracle Clusterware of database role change
Creating process RSM0
12/16/2013 17:06:17
Physical RSM: Reinstatement... Converting old primary control file to physical standby control file
12/16/2013 17:06:19
Purging diverged redos on resetlogs branch 834082402, starting SCN 1383347
Purged 0 archived logs
12/16/2013 17:06:24
Data Guard Broker shutting down
RSM0 successfully terminated
12/16/2013 17:06:26
>> DMON Process Shutdown <<
12/16/2013 17:06:40
>> Starting Data Guard Broker bootstrap <<
Broker Configuration File Locations:
      dg_broker_config_file1 = "/u02/oracle/product/11.2.0/db_1/dbs/dr1BKFSFST.dat"
      dg_broker_config_file2 = "/u02/oracle/product/11.2.0/db_1/dbs/dr2BKFSFST.dat"
12/16/2013 17:06:43
Database needs to be reinstated or re-created, Data Guard broker ready
12/16/2013 17:06:44
DMON Registering service BKFSFST_DGB with listener(s)
12/16/2013 17:06:47
Apply Instance for Database BKFSFST set to BKFSFST
Data Guard notifying Oracle Clusterware of database role change
Creating process RSM0
12/16/2013 17:07:01
Notifying Oracle Clusterware to buildup after REINSTATEMENT
Command REINSTATE DATABASE BKFSFST completed
12/16/2013 17:07:02
Command EDIT DATABASE BKFSFST SET PROPERTY ActualApplyInstance = BKFSFST completed
0 0