What is splitbrain? [Notes on an unexpected RAC node crash]
Preface:
A record of a crash analysis, mainly using the CSSD logs to understand how split-brain resolution plays out.
==================================================================================================
Last week I set up a two-node 11.2.0.3 RAC in the test environment for the developers.
Came in today and saw this...
[wh-xn01db1@root]# crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.DATA.dg    ora....up.type ONLINE    ONLINE    wh-xn01db1
ora....ER.lsnr ora....er.type ONLINE    ONLINE    wh-xn01db1
ora....N1.lsnr ora....er.type ONLINE    ONLINE    wh-xn01db1
ora....N2.lsnr ora....er.type ONLINE    ONLINE    wh-xn01db1
ora....N3.lsnr ora....er.type ONLINE    ONLINE    wh-xn01db1
ora.asm        ora.asm.type   ONLINE    ONLINE    wh-xn01db1
ora.cvu        ora.cvu.type   ONLINE    ONLINE    wh-xn01db1
ora.gsd        ora.gsd.type   OFFLINE   OFFLINE
ora....network ora....rk.type ONLINE    ONLINE    wh-xn01db1
ora.oc4j       ora.oc4j.type  ONLINE    ONLINE    wh-xn01db1
ora.ons        ora.ons.type   ONLINE    ONLINE    wh-xn01db1
ora....ry.acfs ora....fs.type ONLINE    ONLINE    wh-xn01db1
ora.scan1.vip  ora....ip.type ONLINE    ONLINE    wh-xn01db1
ora.scan2.vip  ora....ip.type ONLINE    ONLINE    wh-xn01db1
ora.scan3.vip  ora....ip.type ONLINE    ONLINE    wh-xn01db1
ora....SM1.asm application    ONLINE    ONLINE    wh-xn01db1
ora....B1.lsnr application    ONLINE    ONLINE    wh-xn01db1
ora....db1.gsd application    OFFLINE   OFFLINE
ora....db1.ons application    ONLINE    ONLINE    wh-xn01db1
ora....db1.vip ora....t1.type ONLINE    ONLINE    wh-xn01db1
ora....db2.vip ora....t1.type ONLINE    ONLINE    wh-xn01db1
ora.xn01db.db  ora....se.type ONLINE    ONLINE    wh-xn01db1
Everything has failed over to node 1; node 2 is well and truly dead...
Did I simply forget to start it after setting it up last week? Let's try bringing it up.
[wh-xn01db2@oracle]$ sql
SQL*Plus: Release 11.2.0.3.0 Production on Mon Jan 12 10:23:10 2015
Copyright (c) 1982, 2011, Oracle.  All rights reserved.
Connected to an idle instance.
xn01db2@SYS> startup
ORA-01078: failure in processing system parameters
ORA-01565: error in identifying file '+DATA/xn01db/spfilexn01db.ora'
ORA-17503: ksfdopn:2 Failed to open file +DATA/xn01db/spfilexn01db.ora
ORA-15077: could not locate ASM instance serving a required diskgroup
xn01db2@SYS>
[Right... CRS isn't even up, so of course the DB can't start.]
Trying to start CRS also got nowhere for a long time.
Looks like something is genuinely wrong...
errpt does seem to show something:
[wh-xn01db2@public]$ errpt
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
A924A5FC   0109181415 P S SYSPROC        SOFTWARE PROGRAM ABNORMALLY TERMINATED
F7FA22C9   0108150815 I O SYSJ2          UNABLE TO ALLOCATE SPACE IN FILE SYSTEM
F7FA22C9   0108150515 I O SYSJ2          UNABLE TO ALLOCATE SPACE IN FILE SYSTEM
[wh-xn01db2@public]$ errpt -aj A924A5FC | more
---------------------------------------------------------------------------
LABEL:           CORE_DUMP
IDENTIFIER:      A924A5FC
Date/Time:       Fri Jan  9 18:14:58 CST 2015
Sequence Number: 40
Machine Id:      00C5B5F44C00
Node Id:         wh-xn01db2
Class:           S
Type:            PERM
WPAR:            Global
Resource Name:   SYSPROC

Description
SOFTWARE PROGRAM ABNORMALLY TERMINATED

Probable Causes
SOFTWARE PROGRAM

User Causes
USER GENERATED SIGNAL

        Recommended Actions
        CORRECT THEN RETRY

Failure Causes
SOFTWARE PROGRAM

        Recommended Actions
        RERUN THE APPLICATION PROGRAM
        IF PROBLEM PERSISTS THEN DO THE FOLLOWING
        CONTACT APPROPRIATE SERVICE REPRESENTATIVE

Detail Data
SIGNAL NUMBER
           6
USER'S PROCESS ID:
                8323306
FILE SYSTEM SERIAL NUMBER
          14
INODE NUMBER
         0 12354
CORE FILE NAME
/soft/product/11.2.0.3/gih/log/wh-xn01db2/cssd/core
PROGRAM NAME
ocssd.bin
The ocssd process provides CSS (Cluster Synchronization Service), which monitors the heartbeat and protects the cluster.
So I tried pinging the peer node's interconnect IP, and sure enough, the heartbeat network was down!
Why the heartbeat was down is a matter for the network and system administrators. What I became curious about was: how does a lost heartbeat bring a node down, and why was it node 2 that went down?
After carefully comparing the CSSD logs on both nodes and looking up what "splitbrain" actually means, I finally understood how this crash unfolded. Notes follow.
First, what is split-brain?
In a cluster, nodes learn each other's health state through some mechanism (the heartbeat) so they can coordinate their work. Suppose only the heartbeat fails while every node keeps running: each node then believes all the other nodes are down, that it is the sole survivor of the whole cluster, and that it should take control of it. Since storage in a cluster is shared, this spells data disaster. This situation is called "split-brain".

The usual way to resolve it is a quorum (voting) algorithm, which works like this: nodes report their health to one another over the heartbeat, and each report received counts as one vote. In a healthy three-node cluster, every node holds 3 votes. If node A's heartbeat fails while A itself keeps running, the cluster splits into two partitions: A alone on one side, the remaining two nodes on the other. One partition must be evicted to keep the cluster healthy. B and C together form a partition with 2 votes; A has only 1. By the voting rule, the partition formed by B and C takes control and A is evicted.

With only two nodes, the voting algorithm breaks down, because each node holds just 1 vote. This is where a third device comes in: the quorum device, usually a shared disk, also called the quorum disk, which itself counts as one vote. When the heartbeat between the two nodes fails, both race to claim the quorum disk's vote, and the earliest request wins. The node that reaches the quorum disk first therefore holds 2 votes, and the other node is evicted.
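The survival rule above can be sketched in a few lines of Python. This is a toy model, not Oracle's actual implementation; the tie-break for equal-sized cohorts (the cohort containing the lowest node number survives) is the behaviour the CSSD logs in this post demonstrate for the two-node case:

```python
def surviving_cohort(cohorts):
    """Pick the cohort that survives a split-brain.

    cohorts: list of sets of node numbers, one set per partition.
    Returns the winning set; every other cohort must abort itself.
    """
    # Larger cohort wins; on a tie, the cohort holding the lowest
    # node number survives.
    return max(cohorts, key=lambda c: (len(c), -min(c)))

# Three-node cluster, node 3 loses its heartbeat: {1, 2} outvotes {3}.
print(surviving_cohort([{1, 2}, {3}]))   # -> {1, 2}

# Two-node cluster, heartbeat lost: the cohorts tie at one node each,
# so the cohort led by node 1 survives and node 2 evicts itself.
print(surviving_cohort([{1}, {2}]))      # -> {1}
```

This matches what happened here: both one-node cohorts were the same size, so the tie went to the cohort led by node 1, and node 2 aborted.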
There is also a very detailed article on how split-brain resolution works: http://czmmiao.iteye.com/blog/1577531
Here I will only walk through what my own logs recorded.
First, node 2's CSSD log:
2015-01-09 18:14:38.106: [ CSSD][3605]clssnmPollingThread: node wh-xn01db1 (1) at 50% heartbeat fatal, removal in 14.264 seconds  =====> node 1's heartbeats are being missed; eviction in 14 seconds
2015-01-09 18:14:38.106: [ CSSD][3605]clssnmPollingThread: node wh-xn01db1 (1) is impending reconfig, flag 2491406, misstime 15736
2015-01-09 18:14:38.106: [ CSSD][3605]clssnmPollingThread: local diskTimeout set to 27000 ms, remote disk timeout set to 27000, impending reconfig status(1)
2015-01-09 18:14:38.107: [ CSSD][2577]clssnmvDHBValidateNcopy: node 1, wh-xn01db1, has a disk HB, but no network HB, DHB has rcfg 316174449, wrtcnt, 114013, LATS 3029068518, lastSeqNo 114000, uniqueness 1420702946, timestamp 1420798476/3029041859
···
2015-01-09 18:14:45.116: [ CSSD][3605]clssnmPollingThread: node wh-xn01db1 (1) at 75% heartbeat fatal, removal in 7.254 seconds  =====> node 1's heartbeats still missing; eviction in 7 seconds
2015-01-09 18:14:45.118: [ CSSD][2577]clssnmvDHBValidateNcopy: node 1, wh-xn01db1, has a disk HB, but no network HB, DHB has rcfg 316174449, wrtcnt, 114027, LATS 3029075530, lastSeqNo 114026, uniqueness 1420702946, timestamp 1420798484/3029049389
2015-01-09 18:14:45.182: [ CSSD][3862]clssnmSendingThread: sending status msg to all nodes
2015-01-09 18:14:45.182: [ CSSD][3862]clssnmSendingThread: sent 4 status msgs to all nodes
···
2015-01-09 18:14:50.123: [ CSSD][3605]clssnmPollingThread: node wh-xn01db1 (1) at 90% heartbeat fatal, removal in 2.247 seconds,  =====> node 1's heartbeats still missing; eviction in 2 seconds
2015-01-09 18:14:50.124: [ CSSD][2577]clssnmvDHBValidateNcopy: node 1, wh-xn01db1, has a disk HB, but no network HB, DHB has rcfg 316174449, wrtcnt, 114037, LATS 3029080535, lastSeqNo 114036, uniqueness 1420702946, timestamp 1420798489/3029054399
···
2015-01-09 18:14:52.371: [ CSSD][3605]clssnmPollingThread: Removal started for node wh-xn01db1 (1), flags 0x26040e, state 3, wt4c 0
2015-01-09 18:14:52.371: [ CSSD][3605]clssnmMarkNodeForRemoval: node 1, wh-xn01db1 marked for removal  =====> node 2 has decided node 1 is dead and kicks it out of its own view of the cluster
2015-01-09 18:14:52.371: [ CSSD][3605]clssnmDiscHelper: wh-xn01db1, node(1) connection failed, endp (692), probe(0), ninf->endp 692
2015-01-09 18:14:52.371: [ CSSD][3605]clssnmDiscHelper: node 1 clean up, endp (692), init state 5, cur state 5
2015-01-09 18:14:52.371: [GIPCXCPT][3605] gipcInternalDissociate: obj 111255090 [0000000000000692] { gipcEndpoint : localAddr 'gipcha://wh-xn01db2:56be-8234-89af-30b', remoteAdd···
2015-01-09 18:14:52.372: [ CSSD][4119]clssnmSetupAckWait: node(2) is ALIVE
2015-01-09 18:14:52.372: [ CSSD][4119]clssnmSendSync: syncSeqNo(316174449), indicating EXADATA fence initialization complete
2015-01-09 18:14:52.372: [ CSSD][4119]List of nodes that have ACKed my sync: NULL
2015-01-09 18:14:52.372: [ CSSD][4119]clssnmSendSync: syncSeqNo(316174449)
2015-01-09 18:14:52.372: [ CSSD][4119]clssnmWaitForAcks: Ack message type(11), ackCount(1)
2015-01-09 18:14:52.372: [ CSSD][4376]clssnmHandleSync: Node wh-xn01db2, number 2, is EXADATA fence capable
2015-01-09 18:14:52.372: [ CSSD][4376]clssscUpdateEventValue: NMReconfigInProgress val 2, changes 5
2015-01-09 18:14:52.372: [ CSSD][4376]clssnmHandleSync: local disk timeout set to 27000 ms, remote disk timeout set to 27000
2015-01-09 18:14:52.372: [ CSSD][4376]clssnmHandleSync: initleader 2 newleader 2
2015-01-09 18:14:52.372: [ CSSD][4376]clssnmQueueClientEvent: Sending Event(2), type 2, incarn 316174448
2015-01-09 18:14:52.372: [ CSSD][4376]clssnmQueueClientEvent: Node[1] state = 5, birth = 316174446, unique = 1420702946
2015-01-09 18:14:52.372: [ CSSD][4376]clssnmQueueClientEvent: Node[2] state = 3, birth = 316174448, unique = 1420704616
2015-01-09 18:14:52.372: [ CSSD][4376]clssnmHandleSync: Acknowledging sync: src[2] srcName[wh-xn01db2] seq[1] sync[316174449]
2015-01-09 18:14:52.372: [ CSSD][4376]clssnmSendAck: node 2, wh-xn01db2, syncSeqNo(316174449) type(11)
2015-01-09 18:14:52.372: [ CSSD][4376]clssnmHandleAck: Received ack type 11 from node wh-xn01db2, number 2, with seq 0 for sync 316174449, waiting for 0 acks
2015-01-09 18:14:52.372: [ CSSD][1]clssgmStartNMMon: node 1 active, birth 316174446
2015-01-09 18:14:52.372: [ CSSD][1]clssgmStartNMMon: node 2 active, birth 316174448  =====> in fact both nodes are alive, each forming its own independent cluster: split-brain risk, arbitration needed
2015-01-09 18:14:52.372: [ CSSD][1]NMEVENT_SUSPEND [00][00][00][06]
2015-01-09 18:14:52.373: [ CSSD][1]clssgmSuspendAllGrocks: done
2015-01-09 18:14:52.373: [ CSSD][1]clssgmCompareSwapEventValue: changed CmInfo State val 2, from 5, changes 13
2015-01-09 18:14:52.373: [ CSSD][1]clssgmUpdateEventValue: ConnectedNodes val 316174448, changes 5
2015-01-09 18:14:52.373: [ CSSD][1]clssgmCleanupNodeContexts(): cleaning up nodes, rcfg(316174448)
2015-01-09 18:14:52.373: [ CSSD][1]clssgmCleanupNodeContexts(): successful cleanup of nodes rcfg(316174448)
2015-01-09 18:14:52.373: [ CSSD][1]clssgmStartNMMon: completed node cleanup
2015-01-09 18:14:52.374: [ CSSD][2577]clssnmvDHBValidateNcopy: node 1, wh-xn01db1, has a disk HB, but no network HB, DHB has rcfg 316174449, wrtcnt, 114042, LATS 3029082785, lastSeqNo 114041, uniqueness 1420702946, timestamp 1420798491/3029056904
2015-01-09 18:14:52.374: [ CSSD][4119]clssnmSendSync: syncSeqNo(316174449), indicating EXADATA fence initialization complete
2015-01-09 18:14:52.374: [ CSSD][4119]List of nodes that have ACKed my sync: 2
2015-01-09 18:14:52.374: [ CSSD][4119]clssnmWaitForAcks: done, syncseq(316174449), msg type(11)
2015-01-09 18:14:52.374: [ CSSD][4119]clssnmSetMinMaxVersion:node2 product/protocol (11.2/1.4)
2015-01-09 18:14:52.374: [ CSSD][4119]clssnmSetMinMaxVersion: properties common to all nodes: 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17
2015-01-09 18:14:52.374: [ CSSD][4119]clssnmSetMinMaxVersion: min product/protocol (11.2/1.4)
2015-01-09 18:14:52.374: [ CSSD][4119]clssnmSetMinMaxVersion: max product/protocol (11.2/1.4)
2015-01-09 18:14:52.374: [ CSSD][4119]clssnmNeedConfReq: No configuration to change
2015-01-09 18:14:52.374: [ CSSD][4119]clssnmDoSyncUpdate: Terminating node 1, wh-xn01db1, misstime(30001) state(5)
2015-01-09 18:14:52.374: [ CSSD][4119]clssnmDoSyncUpdate: Wait for 0 vote ack(s)
2015-01-09 18:14:52.374: [ CSSD][4119]clssnmCheckDskInfo: Checking disk info...  ==========> voting is done; now check the result on the voting disk
2015-01-09 18:14:52.374: [ CSSD][4119]clssnmCheckSplit: Node 1, wh-xn01db1, is alive, DHB (1420798491, 3029056904) more than disk timeout of 27000 after the last NHB (1420798462, 3029027063)
2015-01-09 18:14:52.374: [ CSSD][4119]clssnmCheckDskInfo: My cohort: 2
2015-01-09 18:14:52.374: [ CSSD][4119]clssnmCheckDskInfo: Surviving cohort: 1
2015-01-09 18:14:52.374: [ CSSD][4119](:CSSNM00008:)clssnmCheckDskInfo: Aborting local node to avoid splitbrain. Cohort of 1 nodes with leader 2, wh-xn01db2, is smaller than cohort of 1 nodes led by node 1, wh-xn01db1, based on map type 2  ===> node 2's cohort has only one supporter and loses to node 1's cohort
2015-01-09 18:14:52.374: [ CSSD][4119]###################################
2015-01-09 18:14:52.374: [ CSSD][4119]clssscExit: CSSD aborting from thread clssnmRcfgMgrThread
2015-01-09 18:14:52.374: [ CSSD][4119]###################################
2015-01-09 18:14:52.374: [ CSSD][4119](:CSSSC00012:)clssscExit: A fatal error occurred and the CSS daemon is terminating abnormally  ==========> and so node 2 met its end
2015-01-09 18:14:52.374: [ CSSD][4119]
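The 50%/75%/90% warnings above follow from the CSS misscount, which defaults to 30 seconds on 11.2 (verify with `crsctl get css misscount`). A small sketch of that timeline, assuming the default misscount; the logged countdowns (14.264 s, 7.254 s, 2.247 s) are slightly below the nominal values because the polling thread only fires periodically:

```python
MISSCOUNT = 30.0  # seconds of missed network heartbeats before eviction

def warning_times(misscount=MISSCOUNT, thresholds=(0.50, 0.75, 0.90)):
    """Return (threshold_percent, nominal_seconds_until_removal) pairs."""
    return [(int(t * 100), round(misscount * (1 - t), 2)) for t in thresholds]

for pct, remaining in warning_times():
    print(f"{pct}% heartbeat fatal, removal in ~{remaining} seconds")
```

Nominally this gives 15 s, 7.5 s, and 3 s remaining at the three warning levels, which lines up with what both nodes logged.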
Now node 1's CSSD log:

2015-01-09 18:14:37.145: [ CSSD][3605]clssnmPollingThread: node wh-xn01db2 (2) at 50% heartbeat fatal, removal in 14.802 seconds  =======> node 1 logs the same messages, preparing to evict node 2
2015-01-09 18:14:37.145: [ CSSD][3605]clssnmPollingThread: node wh-xn01db2 (2) is impending reconfig, flag 2229260, misstime 15198
2015-01-09 18:14:37.145: [ CSSD][3605]clssnmPollingThread: local diskTimeout set to 27000 ms, remote disk timeout set to 27000, impending reconfig status(1)
2015-01-09 18:14:37.147: [ CSSD][2577]clssnmvDHBValidateNcopy: node 2, wh-xn01db2, has a disk HB, but no network HB, DHB has rcfg 316174449, wrtcnt, 113662, LATS 3029042056, lastSeqNo 113659, uniqueness 1420704616, timestamp 1420798476/3029066860
2015-01-09 18:14:37.648: [ CSSD][2577]clssnmvDHBValidateNcopy: node 2, wh-xn01db2, has a disk HB, but no network HB, DHB has rcfg 316174449, wrtcnt, 113663, LATS 3029042557, lastSeqNo 113662, uniqueness 1420704616, timestamp 1420798477/3029067861
2015-01-09 18:14:37.951: [ CSSD][2063]clssnmvDiskPing: Writing with status 0x3, timestamp 3029042860/1420798477
2015-01-09 18:14:38.166: [ CSSD][3862]clssnmSendingThread: sending status msg to all nodes
2015-01-09 18:14:38.166: [ CSSD][3862]clssnmSendingThread: sent 4 status msgs to all nodes
2015-01-09 18:14:38.452: [ CSSD][2063]clssnmvDiskPing: Writing with status 0x3, timestamp 3029043360/1420798478
···
2015-01-09 18:14:58.473: [ CSSD][2577]clssnmReadDskHeartbeat: manual shutdown of nodename wh-xn01db2, nodenum 2 epoch 1420798498 msec -1265878681  ====> node 1 has won the arbitration and finishes off node 2
2015-01-09 18:14:58.473: [ CSSD][4119]clssnmWaitOnEvictions: node 2, undead 1, EXADATA fence handle 0 kill reqest id 0, last DHB (1420798498, 3029088615, 93766)
2015-01-09 18:14:58.473: [ CSSD][4119]clssnmCheckKillStatus: Node 2, wh-xn01db2, down, LATS(3029063382),timeout(0)
2015-01-09 18:14:58.473: [ CSSD][4119]clssnmBldSendUpdate: syncSeqNo(316174449)
2015-01-09 18:14:58.473: [ CSSD][4119]clssnmBldSendUpdate: using msg version 4
2015-01-09 18:14:58.473: [ CSSD][4119]clssnmDoSyncUpdate: Sync 316174449 complete!
2015-01-09 18:14:58.473: [ CSSD][4119]clssnmRcfgMgrThread: Reconfig in progress...
2015-01-09 18:14:58.473: [ CSSD][4376]clssnmHandleUpdate: sync[316174449] src[1], msgvers 4 icin 316174444
2015-01-09 18:14:58.473: [ CSSD][4376]clssnmHandleUpdate: common properties are 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17
2015-01-09 18:14:58.473: [ CSSD][4376]clssnmUpdateNodeState: node wh-xn01db1, number 1, current state 3, proposed state 3, current unique 1420702946, proposed unique 1420702946, prevConuni 0, birth 316174446
2015-01-09 18:14:58.473: [ CSSD][4376]clssnmUpdateNodeState: node wh-xn01db2, number 2, current state 5, proposed state 0, current unique 1420704616, proposed unique 1420704616, prevConuni 1420704616, birth 316174448
2015-01-09 18:14:58.473: [ CSSD][4376]clssnmDeactivateNode: node 2, state 5
2015-01-09 18:14:58.473: [ CSSD][4376]clssnmDeactivateNode: node 2 (wh-xn01db2) left cluster
2015-01-09 18:14:58.473: [ CSSD][4376]clssnmSendAck: node 1, wh-xn01db1, syncSeqNo(316174449) type(15)
2015-01-09 18:14:58.473: [ CSSD][4376]clssnmQueueClientEvent: Sending Event(1), type 1, incarn 316174449
2015-01-09 18:14:58.473: [ CSSD][4376]clssnmQueueClientEvent: Node[1] state = 3, birth = 316174446, unique = 1420702946
2015-01-09 18:14:58.473: [ CSSD][4376]clssnmQueueClientEvent: Node[2] state = 0, birth = 0, unique = 0
2015-01-09 18:14:58.473: [ CSSD][4376]clssnmHandleUpdate: SYNC(316174449) from node(1) completed
2015-01-09 18:14:58.473: [ CSSD][4376]clssnmHandleUpdate: NODE 1 (wh-xn01db1) IS ACTIVE MEMBER OF CLUSTER
2015-01-09 18:14:58.473: [ CSSD][4376]clssscUpdateEventValue: NMReconfigInProgress val -1, changes 12
2015-01-09 18:14:58.473: [ CSSD][4376]clssnmHandleUpdate: local disk timeout set to 200000 ms, remote disk timeout set to 200000
2015-01-09 18:14:58.473: [ CSSD][4376]clssnmHandleAck: node wh-xn01db1, number 1, sent ack type 15 for wrong reconfig; ack is for reconfig 316174449 and we are on reconfig 316174450
2015-01-09 18:14:58.474: [ CSSD][1]clssgmStartNMMon: node 1 active, birth 316174446
2015-01-09 18:14:58.474: [ CSSD][1]clssgmStartNMMon: node 2 failed, birth (316174448, 0) (old/new)
2015-01-09 18:14:58.474: [ CSSD][1]clssgmUpdateEventValue: Reconfig Event val 1, changes 10
That completes the analysis of this crash.
PS: a footnote.
This machine is an LPAR on an AIX box, so the heartbeats of the other partitions go over the same physical network port; they should have been affected too.
Sure enough, all three RACs on this physical machine lost their heartbeats. Two of them each lost one node, but one of them, with its heartbeat just as dead, was completely unaffected.
It turned out:
[wh-uat01db01@root]# oifcfg getif
en0  10.170.42.0  global  cluster_interconnect
en1  10.170.32.0  global  public
The real interconnect is:
the 10.170.42.0 network...... which is a dynamic-LPAR IP.
#for rac
10.170.32.93  wh-uat01db01       wh-uat01db01.srcb.com
10.170.32.94  wh-uat01db02       wh-uat01db02.srcb.com
10.170.32.95  wh-uat01db01-vip   wh-uat01db01-vip.srcb.com
10.170.32.96  wh-uat01db02-vip   wh-uat01db02-vip.srcb.com
10.170.32.97  wh-uat01db0-scan   wh-uat01db0-scan.srcb.com
10.170.32.98  wh-uat01db0-scan   wh-uat01db0-scan.srcb.com
10.170.32.99  wh-uat01db0-scan   wh-uat01db0-scan.srcb.com
10.102.1.1    wh-uat01db01-priv  wh-uat01db01-priv.srcb.com  <=== the -priv entries sit on a different subnet entirely
10.102.1.2    wh-uat01db02-priv  wh-uat01db02-priv.srcb.com  <===
#end
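The mismatch is easy to check mechanically: the -priv entries in /etc/hosts sit on 10.102.1.x, while oifcfg registers 10.170.42.0 as the cluster_interconnect network. A quick sketch using the stdlib ipaddress module; the /24 mask is an assumption, since `oifcfg getif` prints only the subnet address, not the netmask:

```python
import ipaddress

# Subnet registered for cluster_interconnect, from `oifcfg getif`
# (mask assumed to be /24).
interconnect_net = ipaddress.ip_network("10.170.42.0/24")

# The -priv entries from /etc/hosts.
priv_hosts = {
    "wh-uat01db01-priv": "10.102.1.1",
    "wh-uat01db02-priv": "10.102.1.2",
}

for host, addr in priv_hosts.items():
    inside = ipaddress.ip_address(addr) in interconnect_net
    status = "matches" if inside else "does NOT match"
    print(f"{host} {addr}: {status} the registered interconnect subnet")
```

Neither -priv address falls inside the registered interconnect subnet, which is exactly why killing the 10.102.1.x heartbeat port did nothing to this cluster.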
This particular cluster was not set up by me... ahem.
PPS:
This heartbeat outage was caused by a failed physical network port. Clearly, an EtherChannel for production environments really is worth having!