DB2 HADR 恢复灾备数据库

来源：互联网发布：dota选手知乎编辑：程序博客网时间：2024/05/17 06:30

环境介绍：
服务器名 IP HADR端口
--------------------------------------------------------
主数据库 tftpay_db2172.16.88.2 55001
主备数据库 tftpay_db3172.16.88.3 55002
灾备数据库 tftpay_db110.64.0.15 55003

生产问题1描述：
db2inst1用户执行DB2命令超时。

现象：
Initializing instance list :.......Success
ERROR: Program name = db2idbm
Instance home dir = /db/tp/dbhome/oracle, Sysadm group = oinstall
Instance type = 1, Auth type = server

DB21015E The Command Line Processor backend process request queue or input
queue was not created within the timeout period.
Update DBM cfg SYSADM_GROUP errcode = 8
DBI1281E The database manager configuration file could not be
initialized.

Explanation:

An error occurred when attempting to initialize the database manager
configuration file. A DB2 instance could not be created or upgraded.

User response:

Refer to the log file for troubleshooting. Resolve the problem and try
the command again. If the problem persists, contact your IBM service
representative.

db2inst1
ERROR: An error occurred while configuring the instance "oracle".

问题原因：
DB2命令去调起DB2后台服务时，由于路由问题，无法找不到主机名而引起执行命令时
间超时。

解决办法：
http://blog.itpub.net/10990946/viewspace-694277/

vi /etc/netsvc.conf
#hosts=local,bind,nis
hosts=local4,bind4

生产问题2描述：

由于灾备数据库停止了2个多月，在这两个月内，主数据库的数据无法同步至灾备数据库。
当我解决问题1之后，在灾备数据库上激活数据库，db2 activate database tftpay.发现
数据库的灾备HADR状态不稳定，由REMOTE_CATCHUP-->DISCONNECTED状态。大概持续了10分
钟数据库就挂掉了。

现象：
2017-02-13-09.26.17.803659+480 E2269253888A447 LEVEL: Error
PID : 22151356 TID : 16320 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : TFTPAY
HOSTNAME: tpdbbak
EDUID : 16320 EDUNAME: db2hadrs.0.0 (TFTPAY) 0
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrEdu::hdrEduS, probe:21580
MESSAGE : ADM12509E HADR encountered an abnormal condition. Reason code: "1"

解释：ADM12509E 说找不到LOG。
[db2inst1@DB-HADR3 20170214]$ db2 ? ADM12509E

ADM12509E HADR encountered an abnormal condition. Reason code:
"<reason-code>"

Explanation:

The explanation corresponding to the reason code is:

1

The HADR primary database cannot find a log file that was
requested by the standby. HADR will attempt to recover from the
condition by disconnecting and then reconnecting the primary
and the standby. Each reconnection will attempt to access the
file. These retry attempts will help avoid transient errors
such as a conflict with other log file usage.

根据我们的备份脚本，就可以解释清楚了：
由于我们备份数据库脚本中，备份的归档日志为30天，但是灾备数据库已经断掉了两个
多月。当灾备数据库激活数据库后，通过归档日志S00000001.LOG（最新）将两个月前
的数据同步到灾备数据库中，但是在主数据库中对应的S00000001.LOG已经在一个月之前
被删除了，所以就报出了无法找到对应的LOG这个现象。

解决办法：
将灾备的数据库删除，然后将主库上的数据库备份文件上传至灾备数据库恢复，重启HADR。

步骤如下：

1.获取灾备数据库 HADR参数配置

db2 "update db cfg for TFTPAY using
HADR_TARGET_LIST tftpay_db2:55001|tftpay_db3:55002
HADR_REMOTE_HOST 172.16.88.23
HADR_REMOTE_SVC DB2_hadr_1
HADR_LOCAL_HOST tpdbbak
HADR_LOCAL_SVC DB2_hadr_3
HADR_SYNCMODE NEARSYNC
HADR_REMOTE_INST db2inst1"

2.停止灾备数据库上的HADR
db2 stop hadr on db tftpay
3.删除灾备数据库
db2 drop db tftpay

4.恢复灾备数据库

4.1 db2 "restore db tftpay from /db/tp/manage/src taken at 20170213043002 logtarget /db/tp/manage/restorelog redirect generate script redirect-20170213.ddl"

4.2 将redirect-20170213.ddl里面需要的路径创建好：vi redirect-20170213.ddl。

4.3 执行恢复数据库脚本：
nohup db2 -z redirect-20170213.out -tvf redirect-20170213.ddl &
4.4 监控恢复数据库脚本执行情况：
tail -f nohup.out
db2top -d tftpay
db2pd -d tftpay -util -repeat 30

4.5 恢复数据库成功后，切记一定不能回滚数据库。接着配置灾备数据库HADR参数。

tpdbbak$ db2 get db cfg for tftpay| grep -i hadr
HADR database role = STANDARD
HADR local host name (HADR_LOCAL_HOST) = 172.16.88.2
HADR local service name (HADR_LOCAL_SVC) = DB2_hadr_1
HADR remote host name (HADR_REMOTE_HOST) = 172.16.88.3
HADR remote service name (HADR_REMOTE_SVC) = DB2_hadr_2
HADR instance name of remote server (HADR_REMOTE_INST) = db2inst1
HADR timeout value (HADR_TIMEOUT) = 180
HADR target list (HADR_TARGET_LIST) = tftpay_db3:55002|tpdbbak:55003
HADR log write synchronization mode (HADR_SYNCMODE) = NEARSYNC
HADR spool log data limit (4KB) (HADR_SPOOL_LIMIT) = AUTOMATIC(307200)
HADR log replay delay (seconds) (HADR_REPLAY_DELAY) = 0
HADR peer window duration (seconds) (HADR_PEER_WINDOW) = 120
tpdbbak$
tpdbbak$
tpdbbak$
tpdbbak$
tpdbbak$
tpdbbak$ db2 "update db cfg for TFTPAY using
HADR_TARGET_LIST tftpay_db2:55001|tftpay_db3:55002
HADR_REMOTE_HOST 172.16.88.2
HADR_REMOTE_SVC DB2_hadr_1
HADR_LOCAL_HOST tpdbbak
HADR_LOCAL_SVC DB2_hadr_3
HADR_SYNCMODE NEARSYNC
HADR_REMOTE_INST db2inst1"
DB20000I The UPDATE DATABASE CONFIGURATION command completed successfully.

4.6 启动灾备数据库的HADR
tpdbbak$ db2 start hadr on db tftpay as standby
DB20000I The START HADR ON DATABASE command completed successfully.

至此，灾备数据库恢复正常！！

1 0