Oracle HANGANALYZE

来源:互联网 发布:淘宝焊接阀门 编辑:程序博客网 时间:2024/06/07 09:55
环境:
OS:Oracle Linux As 5.5
DB:11.2.0.1 RAC


如下我们通过使用hangalyze获取到两个会话更新同一行数据的等待事件情况.

1.session1更新行数据

SQL> create table tb_test (id number,name varchar2(10)) tablespace users;

Table created.

SQL> insert into tb_test values(1,'hang test');

1 row created.

SQL> commit;

Commit complete.

SQL> select userenv('sid') from dual;

USERENV('SID')
--------------
            69

SQL> update tb_test set name='hang' where id=1;

1 row updated.

SQL>

注意:这里不执行commit动作

2.session2同样更新session1更新的行

SQL> conn scott/oracle
Connected.
SQL> select userenv('sid') from dual;

USERENV('SID')
--------------
            33

SQL> update tb_test set name='hang' where id=1;

注意:这个时候已经hang住了

3.session3使用hangalyze生成trace文件
SQL> connect / as sysdba
Connected.

SQL> oradebug -g def hanganalyze 3;
Hang Analysis in /u01/app/oracle/diag/rdbms/racdb/racdb1/trace/racdb1_diag_6042.trc
SQL>

4.查看trace文件的内容

[oracle@host01-racdb1 ~]$ cat /u01/app/oracle/diag/rdbms/racdb/racdb1/trace/racdb1_diag_6042.trc
Trace file /u01/app/oracle/diag/rdbms/racdb/racdb1/trace/racdb1_diag_6042.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
ORACLE_HOME = /u01/app/oracle/product/11.2.0/db_1
System name:    Linux
Node name:      host01.example.com
Release:        2.6.18-194.el5
Version:        #1 SMP Mon Mar 29 22:10:29 EDT 2010
Machine:        x86_64
Instance name: racdb1
Redo thread mounted by this instance: 0 <none>
Oracle process number: 5
Unix process pid: 6042, image: oracle@host01.example.com (DIAG)


*** 2017-09-17 20:25:29.980
*** SESSION ID:(5.1) 2017-09-17 20:25:29.980
*** CLIENT ID:() 2017-09-17 20:25:29.980
*** SERVICE NAME:() 2017-09-17 20:25:29.980
*** MODULE NAME:() 2017-09-17 20:25:29.980
*** ACTION NAME:() 2017-09-17 20:25:29.980
 
Node id: 0
List of nodes: 0,

*** 2017-09-17 20:25:29.981
Reconfiguration starts [incarn=0]

*** 2017-09-17 20:25:29.981
I'm the master node

*** 2017-09-17 20:25:29.981
Reconfiguration completes [incarn=1]
Group reconfiguration cleanup

*** 2017-09-17 20:25:34.375
DIAG attached to DLM

*** 2017-09-17 20:27:18.434
cluster reconfiguration is ongoing: 0, 1,

*** 2017-09-17 20:27:18.467
Reconfiguration starts [incarn=2]

*** 2017-09-17 20:27:18.467
I'm the master node
Group reconfiguration cleanup

*** 2017-09-17 20:27:18.954
A rcfg proposal from node 1 is received

*** 2017-09-17 20:27:18.954
Reconfiguration completes [incarn=2]

*** 2017-09-17 21:30:25.497
DIAG sends out oradebug command to nodes [0,1]

*** 2017-09-17 21:30:26.223
****************************************************************
PORADEBUG BEGIN ORIGINATING INST:1 SERIAL:0 PID:19045
PORADATA COMMAND:hanganalyze 3
session (kjzddbx) switches to a new action
PORADATA TRACEFILE /u01/app/oracle/diag/rdbms/racdb/racdb1/trace/racdb1_diag_6042.trc
PORADEBUG END ORIGINATING INST:1 SERIAL:0 PID:19045
****************************************************************

*** 2017-09-17 21:30:28.293
===============================================================================
HANG ANALYSIS:
  instances (db_name.oracle_sid): racdb.racdb1, racdb.racdb2
  oradebug_node_dump_level: 3
  analysis initiated by oradebug
===============================================================================
 
Chains most likely to have caused the hang:
 [a] Chain 1 Signature: 'SQL*Net message from client'<='enq: TX - row lock contention'
     Chain 1 Signature Hash: 0x38c48850
 [b] Chain 2 Signature: 'Streams AQ: waiting for messages in the queue'
     Chain 2 Signature Hash: 0xa00e2e87
 [c] Chain 3 Signature: 'Streams AQ: waiting for messages in the queue'
     Chain 3 Signature Hash: 0xa00e2e87
 
===============================================================================
Non-intersecting chains:
 
-------------------------------------------------------------------------------
Chain 1:
-------------------------------------------------------------------------------
    Oracle session identified by:
    {
                instance: 1 (racdb.racdb1)
                   os id: 18330
              process id: 50, oracle@host01.example.com (TNS V1-V3)
              session id: 33
        session serial #: 1207
    }
    is waiting for 'enq: TX - row lock contention' with wait info:
    {
                      p1: 'name|mode'=0x54580006
                      p2: 'usn<<16 | slot'=0x3001a
                      p3: 'sequence'=0x7b2
            time in wait: 4 min 6 sec
           timeout after: never
                 wait id: 126
                blocking: 0 sessions
             current sql: update tb_test set name='hang' where id=1
             short stack: ksedsts()+461<-ksdxfstk()+32<-ksdxcb()+1782<-sspuser()+112<-__restore_rt()<-semtimedop()+10<-skgpwwait()+156<-ksliwat()+1821<-kslwaitctx()+162<-kjusuc()+3520<-ksipgetctx()+1413<-ksqcmi()+20639<-ksqgtlctx()+3401<-ksqgelctx()+552<-ktcwit1()+330<-kdddgb()+7099<-kdusru()+490<-updrowFastPath()+1075<-qerupFetch()+2171<-updaul()+1179<-updThreePhaseExe()+328<-updexe()+663<-opiexe()+14377<-kpoal8()+2288<-opiodr()+1149<-ttcpip()+1251<-opitsk()+1633<-opiino()+958<-opiodr()+1149<-opidrv()+570<-sou2o()+103<-opimai_real()+133<-
            wait history:
              * time between current wait and wait #1: 0.000372 sec
              1.       event: 'Disk file operations I/O'
                 time waited: 0.000011 sec
                     wait id: 125             p1: 'FileOperation'=0x2
                                              p2: 'fileno'=0x4
                                              p3: 'filetype'=0x2
              * time between wait #1 and #2: 0.000348 sec
              2.       event: 'SQL*Net message from client'
                 time waited: 17.425030 sec
                     wait id: 124             p1: 'driver id'=0x62657100
                                              p2: '#bytes'=0x1
              * time between wait #2 and #3: 0.000002 sec
              3.       event: 'SQL*Net message to client'
                 time waited: 0.000001 sec
                     wait id: 123             p1: 'driver id'=0x62657100
                                              p2: '#bytes'=0x1
    }
    and is blocked by
 => Oracle session identified by:
    {
                instance: 1 (racdb.racdb1)
                   os id: 17385
              process id: 41, oracle@host01.example.com (TNS V1-V3)
              session id: 69
        session serial #: 1789
    }
    which is waiting for 'SQL*Net message from client' with wait info:
    {
                      p1: 'driver id'=0x62657100
                      p2: '#bytes'=0x1
            time in wait: 4 min 48 sec
           timeout after: never
                 wait id: 839
                blocking: 1 session
             current sql: <none>
             short stack: ksedsts()+461<-ksdxfstk()+32<-ksdxcb()+1782<-sspuser()+112<-__restore_rt()<-read()+14<-ntpfprd()+115<-nsbasic_brc()+338<-nsbrecv()+69<-nioqrc()+485<-__PGOSF24_opikndf2()+978<-opitsk()+824<-opiino()+958<-opiodr()+1149<-opidrv()+570<-sou2o()+103<-opimai_real()+133<-ssthrdmain()+214<-main()+201<-__libc_start_main()+244<-_start()+36
            wait history:
              * time between current wait and wait #1: 0.000008 sec
              1.       event: 'SQL*Net message to client'
                 time waited: 0.000005 sec
                     wait id: 838             p1: 'driver id'=0x62657100
                                              p2: '#bytes'=0x1
              * time between wait #1 and #2: 0.008319 sec
              2.       event: 'SQL*Net message from client'
                 time waited: 1 min 53 sec
                     wait id: 837             p1: 'driver id'=0x62657100
                                              p2: '#bytes'=0x1
              * time between wait #2 and #3: 0.000004 sec
              3.       event: 'SQL*Net message to client'
                 time waited: 0.000004 sec
                     wait id: 836             p1: 'driver id'=0x62657100
                                              p2: '#bytes'=0x1
    }
 
Chain 1 Signature: 'SQL*Net message from client'<='enq: TX - row lock contention'
Chain 1 Signature Hash: 0x38c48850
-------------------------------------------------------------------------------
 
===============================================================================
Sessions in an involuntary wait or not in a wait:
 
-------------------------------------------------------------------------------
Chain 2:
-------------------------------------------------------------------------------
    Oracle session identified by:
    {
                instance: 1 (racdb.racdb1)
                   os id: 6174
              process id: 37, oracle@host01.example.com (TNS V1-V3)
              session id: 41
        session serial #: 1
    }
    is waiting for 'Streams AQ: waiting for messages in the queue' with wait info:
    {
                      p1: 'queue id'=0x3254
                      p2: 'process#'=0x7849d908
                      p3: 'wait time'=0x1
            time in wait: 0.935844 sec
           timeout after: 0.064156 sec
                 wait id: 20756
                blocking: 0 sessions
             current sql: <none>
             short stack: ksedsts()+461<-ksdxfstk()+32<-ksdxcb()+1782<-sspuser()+112<-__restore_rt()<-semtimedop()+10<-skgpwwait()+156<-ksliwat()+1821<-kslwaitctx()+162<-kslwait()+141<-kwqidexfcy()+809<-kwqidsc81i()+1183<-kwqidrdq()+572<-kwqidsfmp()+418<-kwqidafm0()+404<-kwqididqx()+4620<-kpoaqxdq()+1520<-kpoaqxq()+705<-opiodr()+1149<-ttcpip()+1251<-opitsk()+1633<-opiino()+958<-opiodr()+1149<-opidrv()+570<-sou2o()+103<-opimai_real()+133<-ssthrdmain()+214<-main()+201<-__libc_start_main()+244<-_start()+36
            wait history:
              * time between current wait and wait #1: 0.000587 sec
              1.       event: 'SQL*Net message from client'
                 time waited: 0.000130 sec
                     wait id: 20755           p1: 'driver id'=0x62657100
                                              p2: '#bytes'=0x1
              * time between wait #1 and #2: 0.000011 sec
              2.       event: 'SQL*Net message to client'
                 time waited: 0.000001 sec
                     wait id: 20754           p1: 'driver id'=0x62657100
                                              p2: '#bytes'=0x1
              * time between wait #2 and #3: 0.000002 sec
              3.       event: 'SQL*Net break/reset to client'
                 time waited: 0.000061 sec
                     wait id: 20753           p1: 'driver id'=0x62657100
                                              p2: 'break?'=0x0
    }
 
Chain 2 Signature: 'Streams AQ: waiting for messages in the queue'
Chain 2 Signature Hash: 0xa00e2e87
-------------------------------------------------------------------------------
 
-------------------------------------------------------------------------------
Chain 3:
-------------------------------------------------------------------------------
    Oracle session identified by:
    {
                instance: 1 (racdb.racdb1)
                   os id: 9078
              process id: 55, oracle@host01.example.com
              session id: 47
        session serial #: 226
    }
    is waiting for 'Streams AQ: waiting for messages in the queue' with wait info:
    {
                      p1: 'queue id'=0x3037
                      p2: 'process#'=0x784afd88
                      p3: 'wait time'=0x5
            time in wait: 1.428620 sec
           timeout after: 3.571380 sec
                 wait id: 6120
                blocking: 0 sessions
             current sql: <none>
             short stack: ksedsts()+461<-ksdxfstk()+32<-ksdxcb()+1782<-sspuser()+112<-__restore_rt()<-semtimedop()+10<-skgpwwait()+156<-ksliwat()+1821<-kslwaitctx()+162<-kslwait()+141<-kwqidexfcy()+809<-kwqidsc81i()+1183<-kwqidrdq()+572<-kwqidsfmp()+418<-kwqidafm0()+404<-kwqididqx()+4620<-kpoaqdq()+3301<-opiodr()+1149<-ttcpip()+1251<-opitsk()+1633<-opiino()+958<-opiodr()+1149<-opidrv()+570<-sou2o()+103<-opimai_real()+133<-ssthrdmain()+214<-main()+201<-__libc_start_main()+244<-_start()+36
            wait history:
              * time between current wait and wait #1: 0.000281 sec
              1.       event: 'SQL*Net message from client'
                 time waited: 0.000550 sec
                     wait id: 6119            p1: 'driver id'=0x54435000
                                              p2: '#bytes'=0x1
              * time between wait #1 and #2: 0.000016 sec
              2.       event: 'SQL*Net message to client'
                 time waited: 0.000005 sec
                     wait id: 6118            p1: 'driver id'=0x54435000
                                              p2: '#bytes'=0x1
              * time between wait #2 and #3: 0.000566 sec
              3.       event: 'Streams AQ: waiting for messages in the queue'
                 time waited: 4.962538 sec
                     wait id: 6117            p1: 'queue id'=0x3037
                                              p2: 'process#'=0x784afd88
                                              p3: 'wait time'=0x5
    }
 
Chain 3 Signature: 'Streams AQ: waiting for messages in the queue'
Chain 3 Signature Hash: 0xa00e2e87
-------------------------------------------------------------------------------
 
===============================================================================
Extra information that will be dumped at higher levels:
[level  4] :   1 node dumps -- [LEAF] [LEAF_NW]
[level  5] :   3 node dumps -- [NO_WAIT] [INVOL_WT] [SINGLE_NODE] [NLEAF] [SINGLE_NODE_NW]
 
State of ALL nodes
([nodenum]/cnode/sid/sess_srno/session/ospid/state/[adjlist]):
[32]/1/33/1207/0x7879b3d8/18330/NLEAF/[68]
[40]/1/41/1/0x78784058/6174/SINGLE_NODE/
[46]/1/47/226/0x787729b8/9078/SINGLE_NODE/
[68]/1/69/1789/0x78732c18/17385/LEAF/

*** 2017-09-17 21:30:28.295
===============================================================================
END OF HANG ANALYSIS
===============================================================================

*** 2017-09-17 21:30:28.296
===============================================================================
HANG ANALYSIS DUMPS:
  oradebug_node_dump_level: 3
===============================================================================
 
State of LOCAL nodes
([nodenum]/cnode/sid/sess_srno/session/ospid/state/[adjlist]):
[32]/1/33/1207/0x7879b3d8/18330/NLEAF/[68]
[40]/1/41/1/0x78784058/6174/SINGLE_NODE/
[46]/1/47/226/0x787729b8/9078/SINGLE_NODE/
[68]/1/69/1789/0x78732c18/17385/LEAF/
 
 
No processes qualify for dumping.
 
===============================================================================
HANG ANALYSIS DUMPS: END
===============================================================================
session (kjzddbx) switches to a new action
PORADATA SERIAL:0 RINST:2 RDIAG:5548
PORADATA SERIAL:0 RINST:2 REMOTE EXEC OSPID:5548
PORADATA SERIAL:0 RINST:2 TRACEFILE:/u01/app/oracle/diag/rdbms/racdb/racdb2/trace/racdb2_diag_5548.trc
PORADATA SERIAL:0 RINST:2  DONE
[oracle@host01-racdb1 ~]$

对上面的trace文件说明:
1.trace中最重要的部分

State of ALL nodes
([nodenum]/cnode/sid/sess_srno/session/ospid/state/[adjlist]):
[32]/1/33/1207/0x7879b3d8/18330/NLEAF/[68] --这里的NLEAF表示该SID=33被阻塞了,这里adjlist=68,对应的SID=69,即为阻塞者.
[40]/1/41/1/0x78784058/6174/SINGLE_NODE/
[46]/1/47/226/0x787729b8/9078/SINGLE_NODE/
[68]/1/69/1789/0x78732c18/17385/LEAF/  --这里的LEAF表示该SID=69,即为阻塞者.



在上面的实例中的解释如下:
 
nodenum:定义每个session的序列号
sid: session的sid
sess_srno: session的Serial#
ospid: OS的进程ID
state: node的状态
adjlist: 表示blocker node

 
IN_HANG:这表示该node处于死锁状态,通常还有其他node(blocker)也处于该状态
LEAF/LEAF_NW:该node通常是blocker.通过条目的”predecessor”列可以判断这个node是否是blocker。LEAF说明该NODE没有等待其他资源,而LEAF_NW则可能是没有等待其他资源或者是在使用CPU
NLEAF:通常可以看作这些会话是被阻塞的资源.发生这种情况一般说明数据库发生性能问题而不是数据库hang
IGN/IGN_DMP:这类会话通常被认为是空闲会话,除非其adjlist列里存在node。如果是非空闲会话则说明其adjlist里的node正在等待其他node释放资源。
SINGLE_NODE/SINGLE_NODE_NW:近似于空闲会话.

 
 
2.hanganalyze级别的定义
10   Dump all processes (IGN state)
5    Level 4 + Dump all processes involved in wait chains (NLEAF state)
4    Level 3 + Dump leaf nodes (blockers) in wait chains (LEAF,LEAF_NW,IGN_DMP state)
3    Level 2 + Dump only processes thought to be in a hang (IN_HANG state)
1-2  Only HANGANALYZE output, no process dump at all
一般情况下,我们都定义到级别3,级别3以上的使用需要谨慎,因为级别3以上的,dump信息很多,影响I/O.
 
3.单节点和RAC下hanganalyze的使用:
单节点:
ALTER SESSION SET EVENTS 'immediate trace name HANGANALYZE level <level>';
ORADEBUG hanganalyze <level>

Rac:
ORADEBUG setmypid
ORADEBUG setinst all
ORADEBUG -g def hanganalyze <level>


4、单节点和RAC下hanganalyze的使用,举例:
单节点:

SQL> alter session set events 'immediate trace name HANGANALYZE level 3';
Session altered.

--获取trace 文件名称和路径,该脚本会放在最后贴出来

SQL> @/u01/getTracePath.sql
trace_file_name
--------------------------------------------------------------------------------
/u01/app/oracle/admin/rac/udump/dave1_ora_20962.trc
/u01/app/oracle/admin/rac/udump/dave2_ora_20962.trc
 
[oracle@host01-racdb1 ~]$cat getTracePath.sql

SELECT   d.VALUE
         || '/'
         || LOWER (RTRIM (i.INSTANCE, CHR (0)))
         || '_ora_'
         || p.spid
         || '.trc'
        AS "trace_file_name"
  FROM   (SELECT   p.spid  FROM   v$mystat m, v$session s, v$process p
      WHERE   m.statistic# = 1 AND s.SID = m.SID AND p.addr = s.paddr) p,
         (SELECT   t.INSTANCE  FROM   v$thread t, v$parameter v
           WHERE   v.NAME = 'thread'
          AND (v.VALUE = 0 OR t.thread# = TO_NUMBER (v.VALUE))) i,
         (SELECT   VALUE FROM   v$parameter
           WHERE   NAME = 'user_dump_dest') d;
 
 
SQL> oradebug hanganalyze 3;
Hang Analysis in /u01/app/oracle/diag/rdbms/racdb/racdb1/trace/racdb1_ora_19045.trc

Rac:
SQL> oradebug setmypid
Statement processed.

SQL> oradebug setinst all
Statement processed.

SQL> oradebug -g def hanganalyze 3;
Hang Analysis in /u01/app/oracle/diag/rdbms/racdb/racdb1/trace/racdb1_diag_6042.trc
 
文档参考:http://blog.chinaunix.net/uid-77311-id-3245150.html
         http://blog.csdn.net/tianlesoftware/article/details/6321961
原创粉丝点击