PMON failed to acquire latch, see PMON dump

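Database version: 10.2.0.1
At around 6:00 the customer noticed that one of the application's processes was using a lot of CPU. Since business hours had not started yet, he decided to restart the server (he said this is what he had always done before).
At around 6:30 he stopped the listener,
then shut down the database. The shutdown hung and took a long time to complete.
The alert log shows: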
Incremental checkpoint up to RBA [0x1c07.a7e.0], current log tail at RBA [0x1c07.c68.0]
Mon Nov 30 06:26:32 2015
Errors in file /oracle_data/oracle/admin/vbsrun/bdump/vbsrun_j000_31634.trc:
ORA-12012: error on auto execute of job 41
ORA-12008: error in materialized view refresh path
ORA-12541: TNS:no listener
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2255
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2461
ORA-06512: at "SYS.DBMS_IREFRESH", line 683
ORA-06512: at "SYS.DBMS_REFRESH", line 195
ORA-06512: at line 1
Mon Nov 30 06:26:37 2015
Errors in file /oracle_data/oracle/admin/vbsrun/bdump/vbsrun_j000_31634.trc:
ORA-12012: error on auto execute of job 42
ORA-12008: error in materialized view refresh path
ORA-12541: TNS:no listener
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2255
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2461
ORA-06512: at "SYS.DBMS_IREFRESH", line 683
ORA-06512: at "SYS.DBMS_REFRESH", line 195
ORA-06512: at line 1
Mon Nov 30 06:28:37 2015
Errors in file /oracle_data/oracle/admin/vbsrun/bdump/vbsrun_j000_31634.trc:
ORA-12012: error on auto execute of job 41
ORA-12008: error in materialized view refresh path
ORA-12541: TNS:no listener
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2255
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2461
ORA-06512: at "SYS.DBMS_IREFRESH", line 683
ORA-06512: at "SYS.DBMS_REFRESH", line 195
ORA-06512: at line 1
Mon Nov 30 06:28:43 2015
Errors in file /oracle_data/oracle/admin/vbsrun/bdump/vbsrun_j000_31634.trc:
ORA-12012: error on auto execute of job 42
ORA-12008: error in materialized view refresh path
ORA-12541: TNS:no listener
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2255
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2461
ORA-06512: at "SYS.DBMS_IREFRESH", line 683
ORA-06512: at "SYS.DBMS_REFRESH", line 195
ORA-06512: at line 1
Mon Nov 30 06:32:26 2015
Starting background process EMN0
EMN0 started with pid=42, OS id=31831
Mon Nov 30 06:32:26 2015
Shutting down instance: further logons disabled
Mon Nov 30 06:32:26 2015
Stopping background process CJQ0
Mon Nov 30 06:32:26 2015
Stopping background process QMNC
Mon Nov 30 06:32:28 2015
Stopping background process MMNL
Mon Nov 30 06:32:29 2015
Stopping background process MMON
Mon Nov 30 06:32:31 2015
Shutting down instance (immediate)
License high water mark = 657
Mon Nov 30 06:32:31 2015
Stopping Job queue slave processes
Mon Nov 30 06:32:31 2015
Job queue slave processes stopped
Mon Nov 30 06:32:54 2015
All dispatchers and shared servers shutdown
Mon Nov 30 06:32:58 2015
PMON failed to acquire latch, see PMON dump    -- this message repeats many times
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
Mon Nov 30 06:33:10 2015
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
Mon Nov 30 06:33:23 2015
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
Mon Nov 30 06:33:36 2015
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
Mon Nov 30 06:33:49 2015
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
Mon Nov 30 06:34:01 2015
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
Mon Nov 30 06:34:13 2015
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
Mon Nov 30 06:34:26 2015
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
Mon Nov 30 06:34:38 2015
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
Mon Nov 30 06:34:50 2015
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
Mon Nov 30 06:35:02 2015
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
Mon Nov 30 06:35:14 2015
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
Mon Nov 30 06:35:26 2015
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
Mon Nov 30 06:35:38 2015
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
Mon Nov 30 06:35:50 2015
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
Mon Nov 30 06:36:02 2015
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
Mon Nov 30 06:36:14 2015
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
Mon Nov 30 06:36:26 2015
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
Mon Nov 30 06:36:37 2015
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
Mon Nov 30 06:36:49 2015
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
Mon Nov 30 06:37:00 2015
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
Mon Nov 30 06:37:12 2015
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
Mon Nov 30 06:37:23 2015
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
Mon Nov 30 06:37:34 2015
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
Mon Nov 30 06:37:45 2015
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
Mon Nov 30 06:37:55 2015
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
Mon Nov 30 06:38:05 2015
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
Mon Nov 30 06:38:16 2015
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
Mon Nov 30 06:38:20 2015
Incremental checkpoint up to RBA [0x1c07.ef45.0], current log tail at RBA [0x1c07.ef5e.0]
Mon Nov 30 06:38:21 2015
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
Mon Nov 30 06:38:29 2015
ALTER DATABASE CLOSE NORMAL
Mon Nov 30 06:38:30 2015
SMON: disabling tx recovery
Mon Nov 30 06:38:55 2015
SMON: disabling cache recovery
Mon Nov 30 06:38:56 2015
Shutting down archive processes
Archiving is disabled
Mon Nov 30 06:39:01 2015
ARCH shutting down
ARC1: Archival stopped
Mon Nov 30 06:39:06 2015
ARCH shutting down
ARC0: Archival stopped
Mon Nov 30 06:39:07 2015
Thread 1 closed at log sequence 7175
Successful close of redo thread 1
Mon Nov 30 06:40:49 2015
Completed: ALTER DATABASE CLOSE NORMAL
Mon Nov 30 06:40:49 2015
ALTER DATABASE DISMOUNT
Completed: ALTER DATABASE DISMOUNT
ARCH: Archival disabled due to shutdown: 1089
Shutting down archive processes
Archiving is disabled
Archive process shutdown avoided: 0 active
ARCH: Archival disabled due to shutdown: 1089
Shutting down archive processes
Archiving is disabled
Archive process shutdown avoided: 0 active
Mon Nov 30 07:11:01 2015
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Picked latch-free SCN scheme 1
Autotune of undo retention is turned on. 
IMODE=BR
ILAT =242
LICENSE_MAX_USERS = 0
SYS auditing is enabled
ksdpec: called for event 13740 prior to event group initialization
Starting up ORACLE RDBMS Version: 10.2.0.1.0.
System parameters with non-default values:
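
The length of the stall can be read straight off the alert log. The following is a minimal Python sketch, not part of the original investigation, that counts the PMON messages and reports the first and last timestamp under which they appear. It assumes the 10g alert log layout shown above (a timestamp line followed by the messages logged at that time) and an English locale for day and month names; the function name `summarize_pmon_latch_failures` and the short embedded sample are illustrative only.

```python
from datetime import datetime

PMON_MSG = "PMON failed to acquire latch"
TS_FORMAT = "%a %b %d %H:%M:%S %Y"  # e.g. "Mon Nov 30 06:32:58 2015"


def summarize_pmon_latch_failures(lines):
    """Return (count, first_ts, last_ts) for the PMON latch messages."""
    current_ts = None
    count = 0
    first_ts = None
    last_ts = None
    for raw in lines:
        line = raw.strip()
        try:
            # A timestamp line applies to all messages that follow it.
            current_ts = datetime.strptime(line, TS_FORMAT)
            continue
        except ValueError:
            pass  # not a timestamp line
        if line.startswith(PMON_MSG) and current_ts is not None:
            count += 1
            if first_ts is None:
                first_ts = current_ts
            last_ts = current_ts
    return count, first_ts, last_ts


# Shortened sample: only the first and last timestamps from the log above.
sample = """\
Mon Nov 30 06:32:58 2015
PMON failed to acquire latch, see PMON dump
PMON failed to acquire latch, see PMON dump
Mon Nov 30 06:38:21 2015
PMON failed to acquire latch, see PMON dump
""".splitlines()

count, first_ts, last_ts = summarize_pmon_latch_failures(sample)
print(count, "messages over", last_ts - first_ts)
```

In the log above the messages run from 06:32:58 to 06:38:21, so PMON kept failing to get the latch for a little over five minutes before ALTER DATABASE CLOSE NORMAL was finally issued at 06:38:29.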

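The alert log contains many "PMON failed to acquire latch, see PMON dump" messages but does not name a trace file. So I logged on to the server, went into the bdump directory, and found the PMON trace file written at around 06:30 on Nov 30. An excerpt follows: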
/oracle_data/oracle/admin/vbsrun/bdump/vbsrun_pmon_10702.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, OLAP and Data Mining options
ORACLE_HOME = /oracle_data/oracle/product/10.2.0/db
System name:        Linux
Node name:        vm.localdomain
Release:        2.6.18-164.el5
Version:        #1 SMP Tue Aug 18 15:51:48 EDT 2009
Machine:        x86_64
Instance name: vbsrun  -- instance name
Redo thread mounted by this instance: 1
Oracle process number: 2
Unix process pid: 10702, image: oracle@vm.localdomain (PMON)  --PMON


*** 2015-11-30 06:32:57.447
*** SERVICE NAME:(SYS$BACKGROUND) 2015-11-30 06:32:57.443
*** SESSION ID:(2205.1) 2015-11-30 06:32:57.443
PMON unable to acquire latch  60007498 process allocation level=1   -- not sure what 60007498 stands for
        Location from where latch is held: ksukia: 
        Context saved from call: 0
        state=busy, wlstate=free
    gotten 292629 times wait, failed first 1036 sleeps 1343
    gotten 146360 times nowait, failed: 14
  possible holder pid = 19 ospid=31825   -- so the process with Oracle PID 19 / OS PID 31825 appears to be the one blocking PMON
----------------------------------------
SO: 0x7a46174f8, type: 2, owner: (nil), flag: INIT/-/-/0x00
  (process) Oracle pid=19, calls cur/top: (nil)/0x7a4acae68, flag: (0) -
            int error: 0, call error: 0, sess error: 0, txn error 0
  (post info) last post received: 1089 8 12
              last post received-location: ksusig
              last process to post me: 7a46174f8 195 0
              last post sent: 0 0 200
              last post sent-location: kmmpsh
              last process posted by me: 7a4616d10 1 0
  (latch info) wait_event=0 bits=2
    holding    (efd=4) 60007498 process allocation level=1 
        Location from where latch is held: ksukia: 
        Context saved from call: 0
        state=busy, wlstate=free
    Process Group: DEFAULT, pseudo proc: 0x79f723c90
    O/S info: user: oracle, term: pts/1, ospid: 31825
    OSD pid info: Unix process pid: 31825, image: oracle@vm.localdomain (TNS V1-V3)  -- this looks like a client connection, but I was only notified at 12:00, so by then I could no longer find out what it actually was
    Short stack dump: ksdxfstk()+32<-ksdxcb()+1547<-sspuser()+90<-<0x368720e7c0>
Dump of memory from 0x00000007A45D65D8 to 0x00000007A45D67E0
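
When a PMON trace is long, the latch and its suspected holder can be pulled out mechanically. The sketch below is only an illustration: the two regular expressions are written against the exact line shapes in the excerpt above ("PMON unable to acquire latch ..." and "possible holder pid = ... ospid=...") and may need adjusting for other versions, and the function name `extract_latch_blockers` is made up for this example.

```python
import re

# Patterns modelled on the trace excerpt above (an assumption, not a
# documented Oracle trace format).
LATCH_RE = re.compile(
    r"PMON unable to acquire latch\s+([0-9a-fA-F]+)\s+(.+?)\s+level=(\d+)"
)
HOLDER_RE = re.compile(r"possible holder pid\s*=\s*(\d+)\s+ospid\s*=\s*(\d+)")


def extract_latch_blockers(trace_text):
    """Pair each 'unable to acquire latch' line with the next holder line."""
    results = []
    current = None
    for line in trace_text.splitlines():
        m = LATCH_RE.search(line)
        if m:
            current = {
                "latch_addr": m.group(1),
                "latch_name": m.group(2),
                "level": int(m.group(3)),
            }
            continue
        m = HOLDER_RE.search(line)
        if m and current is not None:
            current["holder_pid"] = int(m.group(1))
            current["holder_ospid"] = int(m.group(2))
            results.append(current)
            current = None
    return results


trace = """\
PMON unable to acquire latch  60007498 process allocation level=1
        Location from where latch is held: ksukia:
        state=busy, wlstate=free
  possible holder pid = 19 ospid=31825
"""

blockers = extract_latch_blockers(trace)
print(blockers)
```

Grouping the results by `holder_ospid` across the whole trace would show whether the same process held the latch for the entire stall.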

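I talked to a friend, who has hit the same problem on a 10.2.0.1 database on AIX. A Baidu search turned up Bug 6488694, but it does not quite match my case, and I could not find an exact match on My Oracle Support either.
So what I don't understand is: is this a 10.2.0.1 bug, or something else?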

I asked on ITPUB (http://www.itpub.net/forum.php?mod=viewthread&tid=1943983&page=1#pid23118844),
and Yong Huang replied:
Bug 6488694 is irrelevant. You should look at
Bug 5057695: Shutdown Immediate Very Slow To Close Database (Doc ID 428688.1)
All you need to do is to apply the patch as stated in the note. Or upgrade to 10.2.0.4. Or if I were you, upgrade the database to 11.2.0.4. Version 10.2.0.1 is so ancient and (I think) has long been desupported!


> PMON unable to acquire latch  60007498 process allocation level=1   -- not sure what 60007498 stands for
> Location from where latch is held: ksukia: 


That's the memory location for the process allocation latch. Confirm by:


select addr from v$latch where name = 'process allocation';


I don't have 10.2.0.1 so I can't confirm it for you. The location where the latch is held, function ksukia (a function to kill user processes) is a telltale sign that you're hitting the bug 5057695.


In short: Bug 6488694 is not the right one. The bug to look at is Bug 5057695 (Doc ID 428688.1, reproduced at the end of this post). The fix is to apply that patch or upgrade to 10.2.0.4, and Yong Huang himself would go straight to 11.2.0.4, since 10.2.0.1 is far too old.

The value 60007498 in the trace is the memory address of the process allocation latch, which can be checked with:
select addr from v$latch where name = 'process allocation';

The "Location from where latch is held: ksukia:" line can also be used to identify which latch is involved:
SQL> select parent_name,location from v$latch_misses where lower(location) like '%ksukia%';


PARENT_NAME                                        LOCATION
-------------------------------------------------- ------------------------------
process allocation                                 ksukia_find


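So it looks like PMON was blocked while trying to get the process allocation latch: the latch was being held from ksukia (the function that kills user processes) by the process shown as "OSD pid info: Unix process pid: 31825, image: oracle@vm.localdomain (TNS V1-V3)".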


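As for the patch mentioned in the note, unfortunately it can only be applied on top of 9.2.0.8, 10.2.0.2, or 10.2.0.3, so it is not an option for 10.2.0.1.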




Bug 5057695: Shutdown Immediate Very Slow To Close Database (Doc ID 428688.1)


In this Document
Symptoms
Changes
Cause
Solution
References
APPLIES TO:


Oracle Database - Enterprise Edition - Version 9.2.0.8 to 10.2.0.3 [Release 9.2 to 10.2]
Information in this document applies to any platform.
*** Checked for relevance 25-Sept-2014 ***
SYMPTOMS


Recently upgraded the database to a version between 9.2.0.7 to 10.2.0.3. 


A database with 150 or more connections takes a very long time to shutdown, compared to the same system when using Oracle 9i (9.2.0.6 and prior) where shutdown was very fast. 


Given the fact there is no activity on the connections in question, rollbacks aren't an issue. 


There are no trace files and no indication of a problem in the alert log.


CHANGES

Recently upgraded the database to a version between 9.2.0.7 to 10.2.0.3


CAUSE

SHUTDOWN IMMEDIATE can be slower in 9.2.0.8 and 10.2.0.1/10.2.0.2/10.2.0.3 than in earlier releases due to a change in how processes are terminated which was introduced in these releases.

Previously during a shutdown immediate, the ksukia function would kill each individual user session and check in intervals of 5 seconds that it had died before progressing to the next user. 

With fix for Unpublished Bug 5080775, this time was reduced so that instead of waiting for 5 seconds it now waited 0.05 seconds and this would be repeated 40 times at intervals of 0.05 seconds then it would increase to checking every 5 seconds. 

With a large numbers of users this could involve a lengthy wait for shutdown that needed to be reduced. 

Note this code is relatively new. It was introduced in 9.2.0.7 and 10.2. It was a design change where previously it issues a kill (skgpkill) and then proceeds onto the next process. 
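
To get a feel for why the polling interval matters, here is a rough Python model of the two schemes described in the note. It is a simplification and not Oracle's actual code: it assumes sessions are killed strictly one after another, that each process takes a fixed, hypothetical 100 ms to exit, and that 150 sessions are connected (the figure quoted under Symptoms). The function name `detection_delay_ms` and its parameters are invented for this illustration.

```python
def detection_delay_ms(death_ms, fast_checks=0,
                       fast_interval_ms=50, slow_interval_ms=5000):
    """Time in ms until a poller notices that a killed process is gone.

    The poller checks up to `fast_checks` times every `fast_interval_ms`,
    then falls back to checking every `slow_interval_ms`.
    """
    elapsed = 0
    for _ in range(fast_checks):
        elapsed += fast_interval_ms
        if elapsed >= death_ms:
            return elapsed
    while True:
        elapsed += slow_interval_ms
        if elapsed >= death_ms:
            return elapsed


SESSIONS = 150   # from the note's "150 or more connections"
DEATH_MS = 100   # hypothetical time for one killed process to exit

# 5-second checks only, versus 40 checks at 0.05 s before falling back to 5 s.
slow_total_ms = SESSIONS * detection_delay_ms(DEATH_MS)
fast_total_ms = SESSIONS * detection_delay_ms(DEATH_MS, fast_checks=40)

print("5 s checks only:      ", slow_total_ms / 1000, "seconds")
print("40 x 0.05 s, then 5 s:", fast_total_ms / 1000, "seconds")
```

Under these assumptions, checking only every 5 seconds costs a full 5 seconds per session, while the shorter first interval brings that down to the time the process actually needs to exit. A process that takes longer than the 2 seconds covered by the 40 fast checks falls back to the 5-second interval, so slow-to-die sessions still add up.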




SOLUTION

Workaround :

Wait for the shutdown to finish. 


Solution : 

Apply one off Patch 5057695 available on My Oracle Support on top of 9.2.0.8, 10.2.0.2 or 10.2.0.3.

The fix is included as of the 10.2.0.4 patchset.

One-off patches are available for most platforms, but if a patch is not available for your platform, please contact Oracle Support.


REFERENCES

BUG:5057695 - SHUTDOWN IMMEDIATE SLOW TO CLOSE DOWN DATABASE WITH INACTIVE JDBC THIN SESSIONS