Instance Crash & Bug 13846531
来源:互联网 发布:as网络语言什么意思 编辑:程序博客网 时间:2024/06/12 17:07
1. 故障现象
一客户决策支持数据库在3月6日、9日节点1的实例Crash。
2. 故障分析
3月6日下午17点27分,发现决策支持数据库节点1 ASM实例后台报ORA-04031错误,该错误表示共享内存不足,检查ASM实例时发现已HANG,该节点已被集群踢出,分析是在跑批处理加载的时候ASM实例shared pool碎片太多导致ORA-04031错误,该故障现象与Bug 13846531 : ASM INSTANCE CRASHED WITH ORA-4031 ON EXADATA类似,需安装相应的patch或设置几个参数绕过。
3月9日下午14点28分,该数据库节点1再次出现Crash,日志如下:
Sat Mar 09 14:28:35 2013Errors in file/u01/app/oracle/diag/rdbms/bjscj1dw/bjscj1dw1/trace/bjscj1dw1_lms1_8230.trc (incident=384220):ORA-00600: internal error code, arguments: [kjbldrmrpst:pkey],[32767], [1123700], [0], [0], [], [], [], [], [], [], []Incident details in:/u01/app/oracle/diag/rdbms/bjscj1dw/bjscj1dw1/incident/incdir_384220/bjscj1dw1_lms1_8230_i384220.trcUse ADRCI or Support Workbench to package the incident.See Note 411.1 at My Oracle Support for error and packagingdetails.Sat Mar 09 14:28:38 2013Dumping diagnostic data in directory=[cdmp_20130309142838],requested by (instance=1, osid=8230 (LMS1)), summary=[incident=384220].Sat Mar 09 14:28:38 2013Sweep [inc][384220]: completedSweep [inc2][384220]: completedErrors in file/u01/app/oracle/diag/rdbms/bjscj1dw/bjscj1dw1/trace/bjscj1dw1_lms1_8230.trc:ORA-00600: internal error code, arguments: [kjbldrmrpst:pkey],[32767], [1123700], [0], [0], [], [], [], [], [], [], []LMS1 (ospid: 8230): terminating the instance due to error 484System state dump requested by (instance=1, osid=8230 (LMS1)),summary=[abnormal instance termination].System State dumped to trace file/u01/app/oracle/diag/rdbms/bjscj1dw/bjscj1dw1/trace/bjscj1dw1_diag_8214.trcSat Mar 09 14:28:50 2013Instance terminated by LMS1, pid = 8230Sat Mar 09 14:28:56 2013Starting ORACLE instance(normal)LICENSE_MAX_SESSION = 0LICENSE_SESSIONS_WARNING = 0
从日志中可以看出,数据库在Crash前出现ORA-00600 [kjbldrmrpst:pkey] 错误,此时实例被LMS1进程终止,6秒后该实例被集群自动启动。从错误信息来看,数据库很可能遭遇Bug 13397104 : INSTANCE CRASH WITH ORA-600 [KJBLPKEYDRMQSCCHK:PKEY],该BUG与11g Dynamic Remastering新引入的Read-mostly locking功能有关,为了提高性能,该功能默认是开启的,若频繁引起实例重启,建议关闭该功能,绕开这个Bug。
3. 处理总结
关于ORA-04031错误,需在asm spfile增加两个隐含参数,如下所示:
Thefollowing work around has helped in reported incident. (The ideal solution isto fix the application code to use bind variables instead of literals in thecalls it make)
1.Set SGA_TARGET=2G (optional)
2._enable_shared_pool_durations= false
(Thiswill allow unpinned SQLA, KGLH0, and KGLHD memory to be freed to make room fornew permanent allocations)
3._library_cache_advice = false
(To eliminate the allocations from kglsim,this parameter setting will work:)
Pleaseset all the above three change in the ASM SPFILE restart clusterware( We needto restart ASM instance).
关于ORA-00600 [kjbldrmrpst:pkey],若频繁出现实例重启的现象,建议关闭Dynamic Remastering的Read-mostly locking功能,设置下列隐含参数可以关闭。详细请参考Note 13397104.1。
_gc_read_mostly_locking=FALSE
- Instance Crash & Bug 13846531
- BUG 10008092 caused instance crash
- Bug2108072导致Instance Crash
- ORA-27072 instance crash
- ORA-600 [KFDAUDEALLOC2] AND INSTANCE CRASH EVEN WITH THE FIX OF BUG 14467061 (文档 ID 1903273.1)
- crash bug之 select
- iOS crash bug 定位
- iOS 如何查crash bug
- Bug分析:unrecognized selector sent to instance
- How to release IPC resource after oracle Instance crash/abort
- 小萝莉说Crash(一):Unrecognized selector sent to instance xxxx
- 【小萝莉说Crash】第一期:Unrecognized selector sent to instance xxxx
- 被释放遇到的Crash message sent to deallocated instance
- 小萝莉说Crash(一):Unrecognized selector sent to instance xxxx
- 调试经验谈-根据crash dump查找bug
- chrome 插件的一处bug crash
- iOS 捕获Crash 发送Bug邮件
- iOS 捕获Crash 发送Bug邮件
- 将Windows Server 2012改造成桌面操作系统(Windows 8)的方法
- oracle中如何根据一个字段名查找出所属的表名
- The Survey of Programming Language Classification
- 【windowSoftInputMode】android软键盘隐藏
- URL带有其他参数时,如何使用PHP的CI框架分页类?
- Instance Crash & Bug 13846531
- hello CUDA
- Ext.form.RadioGroup 和 Ext.form.Radio 基本定义
- 大学两年总结(计算机专业)
- C++回调函数模式的使用.
- 边用边学shell编程
- 2013-6-28-2-服务端获取客户端html页面内容
- 【layout】Android中merge和include标签的使用
- Padrino 博客开发示例