OGG-01738 BOUNDED RECOVERY

Database version: 11.2.0.3 RAC
GoldenGate version: 11.1.1.1.2
This morning I found that data replication was abnormal. The status on the source side was as follows:
GGSCI (ulecardrac1) 3> info all

Program     Status      Group       Lag           Time Since Chkpt

MANAGER     RUNNING
EXTRACT     RUNNING     EXT232      00:00:00      06:32:33
EXTRACT     RUNNING     PUMP232     00:00:00      00:00:03
The status of EXT232 still shows RUNNING, but its checkpoint has not been updated for six and a half hours. The process had in fact hung.
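
This kind of check can be automated. The following is a minimal Python sketch, assuming the text of the `info all` output has already been captured into a string. It flags any Extract or Replicat group that reports RUNNING while its Time Since Chkpt exceeds a threshold. The function names and the 30-minute threshold are assumptions for illustration and are not part of GoldenGate. A large value is only a hint that the process may be hung, not proof.

```python
from datetime import timedelta


def parse_hms(text):
    """Convert an HH:MM:SS string, as printed by GGSCI, to a timedelta."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def find_stalled_groups(info_all_text, max_since_chkpt=timedelta(minutes=30)):
    """Return groups that report RUNNING but have an old checkpoint."""
    stalled = []
    for line in info_all_text.splitlines():
        fields = line.split()
        # Group rows have five fields: program, status, group, lag,
        # time since checkpoint. Header and MANAGER rows are skipped.
        if len(fields) != 5 or fields[0] not in ("EXTRACT", "REPLICAT"):
            continue
        _program, status, group, _lag, since_chkpt = fields
        if status == "RUNNING" and parse_hms(since_chkpt) > max_since_chkpt:
            stalled.append(group)
    return stalled


# Sample text copied from the "info all" output shown in this post.
sample = """\
Program Status Group Lag Time Since Chkpt

MANAGER RUNNING
EXTRACT RUNNING EXT232 00:00:00 06:32:33
EXTRACT RUNNING PUMP232 00:00:00 00:00:03
"""

print(find_stalled_groups(sample))  # ['EXT232']
```

Splitting on whitespace means the parser works whether the columns are separated by one space or by aligned padding.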
The error log ggserr.log contained the following OGG-01738 messages:
2013-03-07 02:42:28  INFO    OGG-01738  Oracle GoldenGate Capture for Oracle, ext232.prm:  BOUNDED RECOVERY: CHECKPOINT: for object pool 1: p5905_Redo Thread 1: start=SeqNo: 679, RBA: 83280912, SCN: 1.913813052 (5208780348), Timestamp: 2013-03-06 22:00:20.000000, end=SeqNo: 679, RBA: 129051136, SCN: 1.938808049 (5233775345), Timestamp: 2013-03-07 02:42:03.000000.
2013-03-07 02:42:28  INFO    OGG-01738  Oracle GoldenGate Capture for Oracle, ext232.prm:  BOUNDED RECOVERY: CHECKPOINT: for object pool 2: p5905_Redo Thread 2: start=SeqNo: 692, RBA: 103611920, SCN: 1.913812238 (5208779534), Timestamp: 2013-03-06 22:00:16.000000, end=SeqNo: 693, RBA: 93604864, SCN: 1.938808100 (5233775396), Timestamp: 2013-03-07 02:42:15.000000.
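
Each message prints the SCN in two forms: `wrap.base`, and a single decimal number in parentheses. The two are related by wrap × 2^32 + base, which can be verified against the values in the log. The Python sketch below checks that relation and also computes how much time lies between the start and end positions of one bounded recovery checkpoint. The regular expressions are written against the two messages above only. Treat them as an assumption about the general message format, and the function names as hypothetical.

```python
import re
from datetime import datetime

SCN_RE = re.compile(r"SCN: (\d+\.\d+) \((\d+)\)")
TS_RE = re.compile(r"Timestamp: (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6})")


def scn_to_decimal(wrap_base):
    """Convert an SCN printed as 'wrap.base' to its single decimal value."""
    wrap, base = (int(part) for part in wrap_base.split("."))
    return wrap * 2**32 + base


def checkpoint_span(message):
    """Return the time between the start and end timestamps of a message."""
    stamps = [
        datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")
        for text in TS_RE.findall(message)
    ]
    return stamps[1] - stamps[0]


# The start/end portion of the thread 1 message from ggserr.log above.
message = (
    "start=SeqNo: 679, RBA: 83280912, SCN: 1.913813052 (5208780348), "
    "Timestamp: 2013-03-06 22:00:20.000000, "
    "end=SeqNo: 679, RBA: 129051136, SCN: 1.938808049 (5233775345), "
    "Timestamp: 2013-03-07 02:42:03.000000."
)

for wrap_base, decimal in SCN_RE.findall(message):
    # Both printed forms describe the same SCN.
    assert scn_to_decimal(wrap_base) == int(decimal)

print(scn_to_decimal("1.913813052"))  # 5208780348
print(checkpoint_span(message))       # 4:41:43
```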

MOS has a note about this error: Note 1293772.1
The blog of Liu Xiangbing, a well-known Oracle expert in China, also has a write-up on it:
http://www.askmaclean.com/archives/ogg-01738-bounded-recovery.html
The solution is to reset the Bounded Recovery Checkpoint file when restarting the extract like:
GGSCI> start <extract_name> BRRESET


Because the extract process ext232 was hung, it could not be stopped normally. Even 'send ext232 forcestop' and 'stop mgr' failed to stop it.
In the end I had to kill the process from the shell and then run:
GGSCI> start ext232 BRRESET

After the restart, the status was back to normal and replication was running with almost no lag.
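
The kill step requires finding the operating system PID of the hung extract first. The following Python sketch picks it out of `ps -ef` style text by matching the group's parameter file name on the command line. The sample process list, its paths, and its PIDs are hypothetical and were not taken from the system described here. The command-line layout (an `extract` binary followed by `PARAMFILE .../<group>.prm`) is an assumption to verify against your own `ps` output before killing anything.

```python
def find_extract_pids(ps_output, group):
    """Return PIDs of extract processes whose arguments name <group>.prm."""
    needle = "/" + group.lower() + ".prm"
    pids = []
    for line in ps_output.splitlines():
        fields = line.split()
        # Assumed "ps -ef" columns: UID PID PPID C STIME TTY TIME CMD...
        if len(fields) < 8 or not fields[1].isdigit():
            continue
        command = fields[7:]
        is_extract = command[0].endswith("extract")
        names_group = any(arg.lower().endswith(needle) for arg in command)
        if is_extract and names_group:
            pids.append(int(fields[1]))
    return pids


# Hypothetical process list, for illustration only.
sample_ps = """\
UID        PID  PPID  C STIME TTY          TIME CMD
oracle   20000     1  0 Mar06 ?        00:00:12 ./mgr PARAMFILE /ogg/dirprm/mgr.prm
oracle   20001 20000  0 Mar06 ?        00:03:41 /ogg/extract PARAMFILE /ogg/dirprm/ext232.prm
oracle   20002 20000  0 Mar06 ?        00:00:05 /ogg/extract PARAMFILE /ogg/dirprm/pump232.prm
"""

print(find_extract_pids(sample_ps, "EXT232"))  # [20001]

# To terminate a PID found this way (Unix only, use with care):
#   import os, signal
#   os.kill(pid, signal.SIGKILL)
```

Matching on the full `/ext232.prm` suffix rather than a bare substring avoids picking up other groups whose names merely contain the same characters.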
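This bug occurs only on RAC, or on a single instance configured with multiple redo threads, and it has been fixed in later versions. For a permanent fix, consider upgrading OGG to 11.2.1.0.1.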


