ORA-00600 [17182] ORA-00600[kghstack_free1] ORA-00600[kghfrempty:ds]故障处理
来源:互联网 发布:淘宝发货途中能退款吗 编辑:程序博客网 时间:2024/06/01 09:38
下午2点30左右,负责测试环境的同事电话说测试数据库中心库挂了,看alert日志发现测试环境的DBA同事已经重启好多次也没有成功,启动后马上就Crash。往前
翻看alert,发现最早是中午11:34的时候报的错,这个错误是一个oracle的bug,见mos的:ORA-600 [kghstack_free1] (文档 ID 285602.1):
TNS-00505: Operation timed out nt secondary err code: 78 nt OS err code: 0 Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=10.252.2.199)(PORT=1941))Tue Oct 21 11:34:03 2014Errors in file /oracle/diag/rdbms/tcen/tcen/trace/tcen_ora_27657688.trc (incident=390803):***************************************************************************************************************ORA-00600: internal error code, arguments: [kghstack_free1], [kdumrp: kduutsto], [], [], [], [], [], [], [], [], [], []***************************************************************************************************************Incident details in: /oracle/diag/rdbms/tcen/tcen/incident/incdir_390803/tcen_ora_27657688_i390803.trcUse ADRCI or Support Workbench to package the incident.See Note 411.1 at My Oracle Support for error and packaging details.Tue Oct 21 11:34:04 2014Thread 1 advanced to log sequence 8998 (LGWR switch) Current log# 3 seq# 8998 mem# 0: /oracen/tcen/redo03.logTue Oct 21 11:34:05 2014Archived Log entry 8977 added for thread 1 sequence 8997 ID 0x55472e0a dest 1:Errors in file /oracle/diag/rdbms/tcen/tcen/trace/tcen_ora_27657688.trc (incident=390804):ORA-00600: internal error code, arguments: [kghfrempty:ds], [0x1108E7B98], [], [], [], [], [], [], [], [], [], []ORA-00600: internal error code, arguments: [kghstack_free1], [kdumrp: kduutsto], [], [], [], [], [], [], [], [], [], []Incident details in: /oracle/diag/rdbms/tcen/tcen/incident/incdir_390804/tcen_ora_27657688_i390804.trcUse ADRCI or Support Workbench to package the incident.See Note 411.1 at My Oracle Support for error and packaging details.Tue Oct 21 11:34:09 2014Dumping diagnostic data in directory=[cdmp_20141021113409], requested by (instance=1, osid=27657688), summary=[incident=390804].Errors in file /oracle/diag/rdbms/tcen/tcen/incident/incdir_390803/tcen_ora_27657688_i390803.trc:
该bug导致数据库宕机,于是测试环境DBA不断尝试重启,一直没有成功,从alert中看问题还是比较明显:
Thread 1 advanced to log sequence 9013 (thread open)Thread 1 opened at log sequence 9013 Current log# 2 seq# 9013 mem# 0: /oracen/tcen/redo02.logSuccessful open of redo thread 1MTTR advisory is disabled because FAST_START_MTTR_TARGET is not setTue Oct 21 14:43:22 2014SMON: enabling cache recovery[26151286] Successfully onlined Undo Tablespace 2.Undo initialization finished serial:0 start:1611833444 end:1611833486 diff:42 (0 seconds)Verifying file header compatibility for 11g tablespace encryption..Verifying 11g file header compatibility for tablespace encryption completedSMON: enabling tx recoveryDatabase Characterset is ZHS16GBKErrors in file /oracle/diag/rdbms/tcen/tcen/trace/tcen_smon_15075330.trc (incident=1248316):ORA-00600: internal error code, arguments: [17182], [0x1109AD7A0], [], [], [], [], [], [], [], [], [], []Incident details in: /oracle/diag/rdbms/tcen/tcen/incident/incdir_1248316/tcen_smon_15075330_i1248316.trcUse ADRCI or Support Workbench to package the incident.See Note 411.1 at My Oracle Support for error and packaging details.No Resource Manager plan activereplication_dependency_tracking turned off (no async multimaster replication found)Starting background process QMNCTue Oct 21 14:43:22 2014QMNC started with pid=74, OS id=11209704Completed: alter database openTue Oct 21 14:43:23 2014Starting background process CJQ0Tue Oct 21 14:43:23 2014CJQ0 started with pid=99, OS id=33226882Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x373104342020200C] [PC:0x1000C2E3C, kghalf()+540] [flags: 0x0, count: 1]Errors in file /oracle/diag/rdbms/tcen/tcen/trace/tcen_smon_15075330.trc (incident=1248317):ORA-07445: exception encountered: core dump [kghalf()+540] [SIGSEGV] [ADDR:0x373104342020200C] [PC:0x1000C2E3C] [Address not mapped to object] []ORA-00600: internal error code, arguments: [17182], [0x1109AD7A0], [], [], [], [], [], [], [], [], [], []Incident details in: /oracle/diag/rdbms/tcen/tcen/incident/incdir_1248317/tcen_smon_15075330_i1248317.trcUse ADRCI or Support Workbench to package the incident.See Note 411.1 at My Oracle Support for error and packaging details.Tue Oct 21 14:43:24 2014
这个问题是因为数据库进行完前滚后,smon进行回滚(SMON: enabling cache recovery)发现有部分block CORRUPTED导致该问题。
解决该问题思路就是:
方法1.设置event禁用smon进行tx recovery
alter system set event='10513 trace name context forever,level 2' scope='spfile';
重启数据库
方法2.设置回滚段相关隐含参数,屏蔽回滚,然后重建undo tablespace以彻底解决该问题
用方法1打开数据库,通过下面的sql可以看到要recovery的回滚段,把这些回滚段添加到隐含参数中
SQL> l 1 select segment_name,status,tablespace_name from dba_rollback_segs 2*SQL> /SEGMENT_NAME STATUS TABLESPACE_NAME------------------------------ ---------------- ------------------------------SYSTEM ONLINE SYSTEM_SYSSMU1_2428882487$ ONLINE UNDOTBS1_SYSSMU2_1520383311$ ONLINE UNDOTBS1_SYSSMU3_1371335976$ ONLINE UNDOTBS1_SYSSMU4_686542768$ ONLINE UNDOTBS1_SYSSMU5_538369555$ ONLINE UNDOTBS1_SYSSMU6_2381545608$ ONLINE UNDOTBS1_SYSSMU7_3252991061$ ONLINE UNDOTBS1_SYSSMU8_676308921$ ONLINE UNDOTBS1_SYSSMU9_256155103$ ONLINE UNDOTBS1_SYSSMU10_585802251$ ONLINE UNDOTBS1SEGMENT_NAME STATUS TABLESPACE_NAME------------------------------ ---------------- ------------------------------_SYSSMU11_426062400$ PARTLY AVAILABLE UNDOTBS1 <------------- 这一行_SYSSMU12_4073093061$ OFFLINE UNDOTBS1_SYSSMU13_3405388771$ OFFLINE UNDOTBS1_SYSSMU14_3005306459$ OFFLINE UNDOTBS1_SYSSMU15_1779671541$ OFFLINE UNDOTBS1_SYSSMU16_4085920032$ OFFLINE UNDOTBS1_SYSSMU17_3236800836$ OFFLINE UNDOTBS1_SYSSMU18_2858917377$ OFFLINE UNDOTBS1_SYSSMU19_1683938663$ OFFLINE UNDOTBS1_SYSSMU20_1078477197$ OFFLINE UNDOTBS1_SYSSMU21_3298272824$ OFFLINE UNDOTBS1SEGMENT_NAME STATUS TABLESPACE_NAME------------------------------ ---------------- ------------------------------_SYSSMU22_1848080707$ OFFLINE UNDOTBS1_SYSSMU23_2369118209$ OFFLINE UNDOTBS1_SYSSMU24_2601897058$ OFFLINE UNDOTBS1_SYSSMU25_3225658959$ OFFLINE UNDOTBS126 rows selected.
因为数据库是异常中断,因此可以通过下面的表可以找到那些Dead事务
SQL> select KTUXECFL, KTUXESQN, KTUXESIZ from x$ktuxe where KTUXECFL='DEAD';KTUXECFL KTUXESQN KTUXESIZ------------------------ ---------- ----------DEAD 4371693 2DEAD 8449154 395 <---KTUXESIZ用来记录事务使用的回滚段块数
新建一个undo表空间
create undo tablespace undotbs_tcen datafile '/oracen/tcendata/undotbs_tcen_01.dbf' size 10240M;
alter system set undo_tablespace=undotbs_tcen;
SQL> select segment_name,status,tablespace_name from dba_rollback_segs;SEGMENT_NAME STATUS TABLESPACE_NAME------------------------------ ---------------- ------------------------------SYSTEM ONLINE SYSTEM_SYSSMU10_585802251$ PARTLY AVAILABLE UNDOTBS1<-----------------_SYSSMU11_426062400$ NEEDS RECOVERY UNDOTBS1 <----------------_SYSSMU12_4073093061$ OFFLINE UNDOTBS1_SYSSMU13_3405388771$ OFFLINE UNDOTBS1_SYSSMU14_3005306459$ OFFLINE UNDOTBS1_SYSSMU15_1779671541$ OFFLINE UNDOTBS1_SYSSMU16_4085920032$ OFFLINE UNDOTBS1_SYSSMU17_3236800836$ OFFLINE UNDOTBS1_SYSSMU18_2858917377$ OFFLINE UNDOTBS1_SYSSMU19_1683938663$ OFFLINE UNDOTBS1SEGMENT_NAME STATUS TABLESPACE_NAME------------------------------ ---------------- ------------------------------_SYSSMU20_1078477197$ OFFLINE UNDOTBS1_SYSSMU21_3298272824$ OFFLINE UNDOTBS1_SYSSMU22_1848080707$ OFFLINE UNDOTBS1_SYSSMU23_2369118209$ OFFLINE UNDOTBS1_SYSSMU24_2601897058$ OFFLINE UNDOTBS1_SYSSMU25_3225658959$ OFFLINE UNDOTBS1_SYSSMU39_4072807697$ ONLINE UNDOTBS_TCEN_SYSSMU40_56486773$ ONLINE UNDOTBS_TCEN_SYSSMU41_3994086030$ ONLINE UNDOTBS_TCEN_SYSSMU42_326209782$ ONLINE UNDOTBS_TCEN_SYSSMU43_980468378$ ONLINE UNDOTBS_TCENSEGMENT_NAME STATUS TABLESPACE_NAME------------------------------ ---------------- ------------------------------_SYSSMU44_2332935711$ ONLINE UNDOTBS_TCEN_SYSSMU45_3152501178$ ONLINE UNDOTBS_TCEN_SYSSMU46_3162242451$ ONLINE UNDOTBS_TCEN_SYSSMU47_67541545$ ONLINE UNDOTBS_TCEN_SYSSMU48_1685486861$ ONLINE UNDOTBS_TCEN_SYSSMU49_2686039235$ ONLINE UNDOTBS_TCEN_SYSSMU50_3551989330$ ONLINE UNDOTBS_TCEN_SYSSMU51_2184994096$ ONLINE UNDOTBS_TCEN30 rows selected.SQL> drop tablespace UNDOTBS1 including contents and datafiles;drop tablespace UNDOTBS1 including contents and datafiles*ERROR at line 1:ORA-01548: active rollback segment '_SYSSMU10_585802251$' found, terminatedropping tablespace
因此需要修改参数文件:
*.diagnostic_dest='/oracle'*.dispatchers=''*.distributed_lock_timeout=700*.event='10513 trace name context forever,level 2'*.fast_start_parallel_rollback='FALSE'*.java_pool_size=134217728*.job_queue_processes=200*.large_pool_size=134217728*.log_archive_dest_1='location=/oracen/tcen/archive_log'*.log_archive_format='archive_%t_%s_%r.log'*.log_archive_max_processes=5*.open_cursors=300*.pga_aggregate_target=10737418240*.processes=6000*.remote_login_passwordfile='EXCLUSIVE'*.sec_case_sensitive_logon=FALSE*.sga_max_size=10737418240*.sga_target=0*.shared_pool_size=2147483648*.streams_pool_size=268435456*.undo_tablespace='UNDOTBS_TCEN'_corrupted_rollback_segments=(_SYSSMU10_585802251$,_SYSSMU11_426062400$)
SQL> shutdown abortORACLE instance shut down.SQL> startup mount pfile='/oracle/zhang_new.ora'ORACLE instance started.Total System Global Area 1.0689E+10 bytesFixed Size 2255784 bytesVariable Size 5570036824 bytesDatabase Buffers 4831838208 bytesRedo Buffers 285343744 bytesDatabase mounted.SQL> alter database open;Database altered.SQL> select segment_name,status,tablespace_name from dba_rollback_segs;SEGMENT_NAME STATUS TABLESPACE_NAME------------------------------ ---------------- ------------------------------SYSTEM ONLINE SYSTEM_SYSSMU10_585802251$ NEEDS RECOVERY UNDOTBS1<--------------------_SYSSMU11_426062400$ NEEDS RECOVERY UNDOTBS1<--------------------_SYSSMU12_4073093061$ OFFLINE UNDOTBS1_SYSSMU13_3405388771$ OFFLINE UNDOTBS1_SYSSMU14_3005306459$ OFFLINE UNDOTBS1_SYSSMU15_1779671541$ OFFLINE UNDOTBS1_SYSSMU16_4085920032$ OFFLINE UNDOTBS1_SYSSMU17_3236800836$ OFFLINE UNDOTBS1_SYSSMU18_2858917377$ OFFLINE UNDOTBS1_SYSSMU19_1683938663$ OFFLINE UNDOTBS1SEGMENT_NAME STATUS TABLESPACE_NAME------------------------------ ---------------- ------------------------------_SYSSMU20_1078477197$ OFFLINE UNDOTBS1_SYSSMU21_3298272824$ OFFLINE UNDOTBS1_SYSSMU22_1848080707$ OFFLINE UNDOTBS1_SYSSMU23_2369118209$ OFFLINE UNDOTBS1_SYSSMU24_2601897058$ OFFLINE UNDOTBS1_SYSSMU25_3225658959$ OFFLINE UNDOTBS1_SYSSMU39_4072807697$ ONLINE UNDOTBS_TCEN_SYSSMU40_56486773$ ONLINE UNDOTBS_TCEN_SYSSMU41_3994086030$ ONLINE UNDOTBS_TCEN_SYSSMU42_326209782$ ONLINE UNDOTBS_TCEN_SYSSMU43_980468378$ ONLINE UNDOTBS_TCENSEGMENT_NAME STATUS TABLESPACE_NAME------------------------------ ---------------- ------------------------------_SYSSMU44_2332935711$ ONLINE UNDOTBS_TCEN_SYSSMU45_3152501178$ ONLINE UNDOTBS_TCEN_SYSSMU46_3162242451$ ONLINE UNDOTBS_TCEN_SYSSMU47_67541545$ ONLINE UNDOTBS_TCEN_SYSSMU48_1685486861$ ONLINE UNDOTBS_TCEN_SYSSMU49_2686039235$ ONLINE UNDOTBS_TCEN_SYSSMU50_3551989330$ ONLINE UNDOTBS_TCEN_SYSSMU51_2184994096$ ONLINE UNDOTBS_TCEN30 rows selected.SQL> drop tablespace UNDOTBS1 including contents and datafiles;Tablespace dropped.
然后去掉参数文件pfile中的下面的参数,重建spfile
_corrupted_rollback_segments=(_SYSSMU10_585802251$,_SYSSMU11_426062400$)
*.event='10513 trace name context forever,level 2'
SQL> shutdown immediateDatabase closed.Database dismounted.ORACLE instance shut down.SQL> create spfile from pfile='/oracle/zhang1.ora';File created.SQL> startup ORACLE instance started.Total System Global Area 1.0689E+10 bytesFixed Size 2255784 bytesVariable Size 5570036824 bytesDatabase Buffers 4831838208 bytesRedo Buffers 285343744 bytesDatabase mounted.Database opened.SQL> SQL> SQL> show parameter SQL> SQL> SQL> select segment_name,status,tablespace_name from dba_rollback_segs;SEGMENT_NAME STATUS TABLESPACE_NAME------------------------------ ---------------- ------------------------------SYSTEM ONLINE SYSTEM_SYSSMU39_4072807697$ ONLINE UNDOTBS_TCEN_SYSSMU40_56486773$ ONLINE UNDOTBS_TCEN_SYSSMU41_3994086030$ ONLINE UNDOTBS_TCEN_SYSSMU42_326209782$ ONLINE UNDOTBS_TCEN_SYSSMU43_980468378$ ONLINE UNDOTBS_TCEN_SYSSMU44_2332935711$ ONLINE UNDOTBS_TCEN_SYSSMU45_3152501178$ ONLINE UNDOTBS_TCEN_SYSSMU46_3162242451$ ONLINE UNDOTBS_TCEN_SYSSMU47_67541545$ ONLINE UNDOTBS_TCEN_SYSSMU48_1685486861$ ONLINE UNDOTBS_TCENSEGMENT_NAME STATUS TABLESPACE_NAME------------------------------ ---------------- ------------------------------_SYSSMU49_2686039235$ ONLINE UNDOTBS_TCEN_SYSSMU50_3551989330$ ONLINE UNDOTBS_TCEN_SYSSMU51_2184994096$ ONLINE UNDOTBS_TCEN14 rows selected.SQL> SQL>
此时数据库可以正常启动!
0 0
- ORA-00600 [17182] ORA-00600[kghstack_free1] ORA-00600[kghfrempty:ds]故障处理
- ORA-00600[kcratr_nab_less_than_odr]故障解决
- ORA-00600[kcratr_nab_less_than_odr]故障解决
- ORA-00600[kcratr_nab_less_than_odr]故障解决
- ORA-00600 ORA-07445
- ora-00607 ora-00600
- ORA-00600 [3020] ORA-10560问题处理
- 【故障处理】偶遇ORA-00600: [keltnfy-ldmInit][46], [1], []
- 【故障处理】偶遇ORA-00600: [keltnfy-ldmInit][46], [1], []
- ORA-01591故障处理
- ORA-01591故障处理
- ora-3136故障处理
- ORA-00600: ORA-12012 ORA-08102解决
- ORA-12560、ORA-01033、ORA-00600
- Oracle ORA-01034,ORA-27101,ORA-00600
- ORA-00600 ORA-01565 ORA-27046
- ora-00600
- ORA-00600
- libevent源码深度剖析四
- Auto Layout 进阶
- boost___smart_ptr
- 对于生活,我是怎么样的人?对于编程,我能力如何?
- 【android】有关scrollview与listview的something
- ORA-00600 [17182] ORA-00600[kghstack_free1] ORA-00600[kghfrempty:ds]故障处理
- Jsoncpp的使用
- libevent源码深度剖析五 ——libevent的核心:事件event
- eclipse 常用插件
- LUA中各种循环
- lua的table库函数列表
- C#操作Excel总结
- Lua获取当前时间
- Android开发之系统相机相册使用