ORA-00600 [17182] ORA-00600[kghstack_free1] ORA-00600[kghfrempty:ds]故障处理

来源:互联网 发布:淘宝发货途中能退款吗 编辑:程序博客网 时间:2024/06/01 09:38

下午2点30左右,负责测试环境的同事电话说测试数据库中心库挂了,看alert日志发现测试环境的DBA同事已经重启好多次也没有成功,启动后马上就Crash。往前

翻看alert,发现最早是中午11:34的时候报的错,这个错误是一个oracle的bug,见mos的:ORA-600 [kghstack_free1] (文档 ID 285602.1)


TNS-00505: Operation timed out    nt secondary err code: 78    nt OS err code: 0  Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=10.252.2.199)(PORT=1941))Tue Oct 21 11:34:03 2014Errors in file /oracle/diag/rdbms/tcen/tcen/trace/tcen_ora_27657688.trc  (incident=390803):***************************************************************************************************************ORA-00600: internal error code, arguments: [kghstack_free1], [kdumrp: kduutsto], [], [], [], [], [], [], [], [], [], []***************************************************************************************************************Incident details in: /oracle/diag/rdbms/tcen/tcen/incident/incdir_390803/tcen_ora_27657688_i390803.trcUse ADRCI or Support Workbench to package the incident.See Note 411.1 at My Oracle Support for error and packaging details.Tue Oct 21 11:34:04 2014Thread 1 advanced to log sequence 8998 (LGWR switch)  Current log# 3 seq# 8998 mem# 0: /oracen/tcen/redo03.logTue Oct 21 11:34:05 2014Archived Log entry 8977 added for thread 1 sequence 8997 ID 0x55472e0a dest 1:Errors in file /oracle/diag/rdbms/tcen/tcen/trace/tcen_ora_27657688.trc  (incident=390804):ORA-00600: internal error code, arguments: [kghfrempty:ds], [0x1108E7B98], [], [], [], [], [], [], [], [], [], []ORA-00600: internal error code, arguments: [kghstack_free1], [kdumrp: kduutsto], [], [], [], [], [], [], [], [], [], []Incident details in: /oracle/diag/rdbms/tcen/tcen/incident/incdir_390804/tcen_ora_27657688_i390804.trcUse ADRCI or Support Workbench to package the incident.See Note 411.1 at My Oracle Support for error and packaging details.Tue Oct 21 11:34:09 2014Dumping diagnostic data in directory=[cdmp_20141021113409], requested by (instance=1, osid=27657688), summary=[incident=390804].Errors in file /oracle/diag/rdbms/tcen/tcen/incident/incdir_390803/tcen_ora_27657688_i390803.trc:


该bug导致数据库宕机,于是测试环境DBA不断尝试重启,一直没有成功,从alert中看问题还是比较明显:

Thread 1 advanced to log sequence 9013 (thread open)Thread 1 opened at log sequence 9013  Current log# 2 seq# 9013 mem# 0: /oracen/tcen/redo02.logSuccessful open of redo thread 1MTTR advisory is disabled because FAST_START_MTTR_TARGET is not setTue Oct 21 14:43:22 2014SMON: enabling cache recovery[26151286] Successfully onlined Undo Tablespace 2.Undo initialization finished serial:0 start:1611833444 end:1611833486 diff:42 (0 seconds)Verifying file header compatibility for 11g tablespace encryption..Verifying 11g file header compatibility for tablespace encryption completedSMON: enabling tx recoveryDatabase Characterset is ZHS16GBKErrors in file /oracle/diag/rdbms/tcen/tcen/trace/tcen_smon_15075330.trc  (incident=1248316):ORA-00600: internal error code, arguments: [17182], [0x1109AD7A0], [], [], [], [], [], [], [], [], [], []Incident details in: /oracle/diag/rdbms/tcen/tcen/incident/incdir_1248316/tcen_smon_15075330_i1248316.trcUse ADRCI or Support Workbench to package the incident.See Note 411.1 at My Oracle Support for error and packaging details.No Resource Manager plan activereplication_dependency_tracking turned off (no async multimaster replication found)Starting background process QMNCTue Oct 21 14:43:22 2014QMNC started with pid=74, OS id=11209704Completed: alter database openTue Oct 21 14:43:23 2014Starting background process CJQ0Tue Oct 21 14:43:23 2014CJQ0 started with pid=99, OS id=33226882Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x373104342020200C] [PC:0x1000C2E3C, kghalf()+540] [flags: 0x0, count: 1]Errors in file /oracle/diag/rdbms/tcen/tcen/trace/tcen_smon_15075330.trc  (incident=1248317):ORA-07445: exception encountered: core dump [kghalf()+540] [SIGSEGV] [ADDR:0x373104342020200C] [PC:0x1000C2E3C] [Address not mapped to object] []ORA-00600: internal error code, arguments: [17182], [0x1109AD7A0], [], [], [], [], [], [], [], [], [], []Incident details in: /oracle/diag/rdbms/tcen/tcen/incident/incdir_1248317/tcen_smon_15075330_i1248317.trcUse ADRCI or Support Workbench to package the incident.See Note 411.1 at My Oracle Support for error and packaging details.Tue Oct 21 14:43:24 2014

这个问题是因为数据库进行完前滚后,smon进行回滚(SMON: enabling cache recovery)发现有部分block CORRUPTED导致该问题。
解决该问题思路就是:

方法1.设置event禁用smon进行tx recovery

alter system set event='10513 trace name context forever,level 2' scope='spfile';

重启数据库
方法2.设置回滚段相关隐含参数,屏蔽回滚,然后重建undo tablespace以彻底解决该问题

用方法1打开数据库,通过下面的sql可以看到要recovery的回滚段,把这些回滚段添加到隐含参数中

SQL> l  1  select segment_name,status,tablespace_name from dba_rollback_segs  2*SQL> /SEGMENT_NAME                   STATUS           TABLESPACE_NAME------------------------------ ---------------- ------------------------------SYSTEM                         ONLINE           SYSTEM_SYSSMU1_2428882487$           ONLINE           UNDOTBS1_SYSSMU2_1520383311$           ONLINE           UNDOTBS1_SYSSMU3_1371335976$           ONLINE           UNDOTBS1_SYSSMU4_686542768$            ONLINE           UNDOTBS1_SYSSMU5_538369555$            ONLINE           UNDOTBS1_SYSSMU6_2381545608$           ONLINE           UNDOTBS1_SYSSMU7_3252991061$           ONLINE           UNDOTBS1_SYSSMU8_676308921$            ONLINE           UNDOTBS1_SYSSMU9_256155103$            ONLINE           UNDOTBS1_SYSSMU10_585802251$           ONLINE           UNDOTBS1SEGMENT_NAME                   STATUS           TABLESPACE_NAME------------------------------ ---------------- ------------------------------_SYSSMU11_426062400$           PARTLY AVAILABLE UNDOTBS1  <------------- 这一行_SYSSMU12_4073093061$          OFFLINE          UNDOTBS1_SYSSMU13_3405388771$          OFFLINE          UNDOTBS1_SYSSMU14_3005306459$          OFFLINE          UNDOTBS1_SYSSMU15_1779671541$          OFFLINE          UNDOTBS1_SYSSMU16_4085920032$          OFFLINE          UNDOTBS1_SYSSMU17_3236800836$          OFFLINE          UNDOTBS1_SYSSMU18_2858917377$          OFFLINE          UNDOTBS1_SYSSMU19_1683938663$          OFFLINE          UNDOTBS1_SYSSMU20_1078477197$          OFFLINE          UNDOTBS1_SYSSMU21_3298272824$          OFFLINE          UNDOTBS1SEGMENT_NAME                   STATUS           TABLESPACE_NAME------------------------------ ---------------- ------------------------------_SYSSMU22_1848080707$          OFFLINE          UNDOTBS1_SYSSMU23_2369118209$          OFFLINE          UNDOTBS1_SYSSMU24_2601897058$          OFFLINE          UNDOTBS1_SYSSMU25_3225658959$          OFFLINE          UNDOTBS126 rows selected.


因为数据库是异常中断,因此可以通过下面的表可以找到那些Dead事务
SQL> select KTUXECFL, KTUXESQN, KTUXESIZ from x$ktuxe where KTUXECFL='DEAD';KTUXECFL                   KTUXESQN   KTUXESIZ------------------------ ---------- ----------DEAD                        4371693          2DEAD                        8449154        395  <---KTUXESIZ用来记录事务使用的回滚段块数


新建一个undo表空间

create undo tablespace undotbs_tcen datafile '/oracen/tcendata/undotbs_tcen_01.dbf' size 10240M;
alter system set undo_tablespace=undotbs_tcen;

此时直接删除原来的undo表空间是无法删除的

SQL> select segment_name,status,tablespace_name from dba_rollback_segs;SEGMENT_NAME                   STATUS           TABLESPACE_NAME------------------------------ ---------------- ------------------------------SYSTEM                         ONLINE           SYSTEM_SYSSMU10_585802251$           PARTLY AVAILABLE UNDOTBS1<-----------------_SYSSMU11_426062400$           NEEDS RECOVERY   UNDOTBS1 <----------------_SYSSMU12_4073093061$          OFFLINE          UNDOTBS1_SYSSMU13_3405388771$          OFFLINE          UNDOTBS1_SYSSMU14_3005306459$          OFFLINE          UNDOTBS1_SYSSMU15_1779671541$          OFFLINE          UNDOTBS1_SYSSMU16_4085920032$          OFFLINE          UNDOTBS1_SYSSMU17_3236800836$          OFFLINE          UNDOTBS1_SYSSMU18_2858917377$          OFFLINE          UNDOTBS1_SYSSMU19_1683938663$          OFFLINE          UNDOTBS1SEGMENT_NAME                   STATUS           TABLESPACE_NAME------------------------------ ---------------- ------------------------------_SYSSMU20_1078477197$          OFFLINE          UNDOTBS1_SYSSMU21_3298272824$          OFFLINE          UNDOTBS1_SYSSMU22_1848080707$          OFFLINE          UNDOTBS1_SYSSMU23_2369118209$          OFFLINE          UNDOTBS1_SYSSMU24_2601897058$          OFFLINE          UNDOTBS1_SYSSMU25_3225658959$          OFFLINE          UNDOTBS1_SYSSMU39_4072807697$          ONLINE           UNDOTBS_TCEN_SYSSMU40_56486773$            ONLINE           UNDOTBS_TCEN_SYSSMU41_3994086030$          ONLINE           UNDOTBS_TCEN_SYSSMU42_326209782$           ONLINE           UNDOTBS_TCEN_SYSSMU43_980468378$           ONLINE           UNDOTBS_TCENSEGMENT_NAME                   STATUS           TABLESPACE_NAME------------------------------ ---------------- ------------------------------_SYSSMU44_2332935711$          ONLINE           UNDOTBS_TCEN_SYSSMU45_3152501178$          ONLINE           UNDOTBS_TCEN_SYSSMU46_3162242451$          ONLINE           UNDOTBS_TCEN_SYSSMU47_67541545$            ONLINE           UNDOTBS_TCEN_SYSSMU48_1685486861$          ONLINE           UNDOTBS_TCEN_SYSSMU49_2686039235$          ONLINE           UNDOTBS_TCEN_SYSSMU50_3551989330$          ONLINE           UNDOTBS_TCEN_SYSSMU51_2184994096$          ONLINE           UNDOTBS_TCEN30 rows selected.SQL> drop tablespace UNDOTBS1 including contents and datafiles;drop tablespace UNDOTBS1 including contents and datafiles*ERROR at line 1:ORA-01548: active rollback segment '_SYSSMU10_585802251$' found, terminatedropping tablespace

因此需要修改参数文件:

*.diagnostic_dest='/oracle'*.dispatchers=''*.distributed_lock_timeout=700*.event='10513 trace name context forever,level 2'*.fast_start_parallel_rollback='FALSE'*.java_pool_size=134217728*.job_queue_processes=200*.large_pool_size=134217728*.log_archive_dest_1='location=/oracen/tcen/archive_log'*.log_archive_format='archive_%t_%s_%r.log'*.log_archive_max_processes=5*.open_cursors=300*.pga_aggregate_target=10737418240*.processes=6000*.remote_login_passwordfile='EXCLUSIVE'*.sec_case_sensitive_logon=FALSE*.sga_max_size=10737418240*.sga_target=0*.shared_pool_size=2147483648*.streams_pool_size=268435456*.undo_tablespace='UNDOTBS_TCEN'_corrupted_rollback_segments=(_SYSSMU10_585802251$,_SYSSMU11_426062400$)

SQL> shutdown abortORACLE instance shut down.SQL> startup mount pfile='/oracle/zhang_new.ora'ORACLE instance started.Total System Global Area 1.0689E+10 bytesFixed Size                  2255784 bytesVariable Size            5570036824 bytesDatabase Buffers         4831838208 bytesRedo Buffers              285343744 bytesDatabase mounted.SQL> alter database open;Database altered.SQL> select segment_name,status,tablespace_name from dba_rollback_segs;SEGMENT_NAME                   STATUS           TABLESPACE_NAME------------------------------ ---------------- ------------------------------SYSTEM                         ONLINE           SYSTEM_SYSSMU10_585802251$           NEEDS RECOVERY   UNDOTBS1<--------------------_SYSSMU11_426062400$           NEEDS RECOVERY   UNDOTBS1<--------------------_SYSSMU12_4073093061$          OFFLINE          UNDOTBS1_SYSSMU13_3405388771$          OFFLINE          UNDOTBS1_SYSSMU14_3005306459$          OFFLINE          UNDOTBS1_SYSSMU15_1779671541$          OFFLINE          UNDOTBS1_SYSSMU16_4085920032$          OFFLINE          UNDOTBS1_SYSSMU17_3236800836$          OFFLINE          UNDOTBS1_SYSSMU18_2858917377$          OFFLINE          UNDOTBS1_SYSSMU19_1683938663$          OFFLINE          UNDOTBS1SEGMENT_NAME                   STATUS           TABLESPACE_NAME------------------------------ ---------------- ------------------------------_SYSSMU20_1078477197$          OFFLINE          UNDOTBS1_SYSSMU21_3298272824$          OFFLINE          UNDOTBS1_SYSSMU22_1848080707$          OFFLINE          UNDOTBS1_SYSSMU23_2369118209$          OFFLINE          UNDOTBS1_SYSSMU24_2601897058$          OFFLINE          UNDOTBS1_SYSSMU25_3225658959$          OFFLINE          UNDOTBS1_SYSSMU39_4072807697$          ONLINE           UNDOTBS_TCEN_SYSSMU40_56486773$            ONLINE           UNDOTBS_TCEN_SYSSMU41_3994086030$          ONLINE           UNDOTBS_TCEN_SYSSMU42_326209782$           ONLINE           UNDOTBS_TCEN_SYSSMU43_980468378$           ONLINE           UNDOTBS_TCENSEGMENT_NAME                   STATUS           TABLESPACE_NAME------------------------------ ---------------- ------------------------------_SYSSMU44_2332935711$          ONLINE           UNDOTBS_TCEN_SYSSMU45_3152501178$          ONLINE           UNDOTBS_TCEN_SYSSMU46_3162242451$          ONLINE           UNDOTBS_TCEN_SYSSMU47_67541545$            ONLINE           UNDOTBS_TCEN_SYSSMU48_1685486861$          ONLINE           UNDOTBS_TCEN_SYSSMU49_2686039235$          ONLINE           UNDOTBS_TCEN_SYSSMU50_3551989330$          ONLINE           UNDOTBS_TCEN_SYSSMU51_2184994096$          ONLINE           UNDOTBS_TCEN30 rows selected.SQL> drop tablespace UNDOTBS1 including contents and datafiles;Tablespace dropped.



然后去掉参数文件pfile中的下面的参数,重建spfile

_corrupted_rollback_segments=(_SYSSMU10_585802251$,_SYSSMU11_426062400$)

*.event='10513 trace name context forever,level 2'


SQL> shutdown immediateDatabase closed.Database dismounted.ORACLE instance shut down.SQL> create spfile from pfile='/oracle/zhang1.ora';File created.SQL> startup ORACLE instance started.Total System Global Area 1.0689E+10 bytesFixed Size                  2255784 bytesVariable Size            5570036824 bytesDatabase Buffers         4831838208 bytesRedo Buffers              285343744 bytesDatabase mounted.Database opened.SQL> SQL> SQL> show parameter SQL> SQL> SQL> select segment_name,status,tablespace_name from dba_rollback_segs;SEGMENT_NAME                   STATUS           TABLESPACE_NAME------------------------------ ---------------- ------------------------------SYSTEM                         ONLINE           SYSTEM_SYSSMU39_4072807697$          ONLINE           UNDOTBS_TCEN_SYSSMU40_56486773$            ONLINE           UNDOTBS_TCEN_SYSSMU41_3994086030$          ONLINE           UNDOTBS_TCEN_SYSSMU42_326209782$           ONLINE           UNDOTBS_TCEN_SYSSMU43_980468378$           ONLINE           UNDOTBS_TCEN_SYSSMU44_2332935711$          ONLINE           UNDOTBS_TCEN_SYSSMU45_3152501178$          ONLINE           UNDOTBS_TCEN_SYSSMU46_3162242451$          ONLINE           UNDOTBS_TCEN_SYSSMU47_67541545$            ONLINE           UNDOTBS_TCEN_SYSSMU48_1685486861$          ONLINE           UNDOTBS_TCENSEGMENT_NAME                   STATUS           TABLESPACE_NAME------------------------------ ---------------- ------------------------------_SYSSMU49_2686039235$          ONLINE           UNDOTBS_TCEN_SYSSMU50_3551989330$          ONLINE           UNDOTBS_TCEN_SYSSMU51_2184994096$          ONLINE           UNDOTBS_TCEN14 rows selected.SQL> SQL> 

此时数据库可以正常启动!





0 0
原创粉丝点击