High "gc buffer busy acquire" wait after Upgrade to 11.2 (文档 ID 2047582.1)


In this Document

 Symptoms
 Changes
 Cause
 Solution


APPLIES TO:

Oracle Database - Enterprise Edition - Version 11.2.0.4 and later
Information in this document applies to any platform.

SYMPTOMS

After a database was migrated (using export/import) from a 10.2.0.4 RAC cluster to a new cluster with Grid Infrastructure 12.1.0.2 and RDBMS 11.2.0.4, the customer observed "gc buffer busy acquire" as the top wait event whenever the load increased significantly.

This behaviour was not observed in the "old" 10.2 environment; there was no change in either the amount of data or the application.

AWR report from instance 1 shows the following during times of high load:

             Snap Id  Snap Time           Sessions  Cursors/Session  Instances
Begin Snap:  1282     24-Jul-15 14:30:23  254       .6               3
End Snap:    1283     24-Jul-15 15:01:06  258       1.3              3
Elapsed:     30.71 (mins)
DB Time:     964.02 (mins)

Top 10 Foreground Events by Total Wait Time

Event                          Waits    Total Wait Time (sec)  Wait Avg(ms)  % DB time  Wait Class
gc buffer busy acquire         40,268   17.5K                  434           30.2       Cluster
gc cr block 2-way              490,948  7851.4                 16            13.6       Cluster
enq: SQ - contention           422      4540.5                 10759         7.8        Configuration
enq: TX - row lock contention  21,559   4305.3                 200           7.4        Application
gc current grant busy          86,519   3827.4                 44            6.6        Cluster
gc current block 2-way         300,104  3390.2                 11            5.9        Cluster
gc current retry               504      3027.3                 6007          5.2        Cluster
gc cr block congested          7,529    2587.4                 344           4.5        Cluster
gc cr block 3-way              76,610   2463.8                 32            4.3        Cluster

 

The AWR report from instance 2 shows a similar picture:

             Snap Id  Snap Time           Sessions  Cursors/Session  Instances
Begin Snap:  1282     24-Jul-15 14:30:23  254       .3               3
End Snap:    1283     24-Jul-15 15:00:35  254       1.1              3
Elapsed:     30.20 (mins)
DB Time:     870.22 (mins)

Top 10 Foreground Events by Total Wait Time

Event                          Waits    Total Wait Time (sec)  Wait Avg(ms)  % DB time  Wait Class
gc buffer busy acquire         40,550   19K                    469           36.4       Cluster
gc cr block 2-way              507,218  6006.1                 12            11.5       Cluster
enq: TX - row lock contention  18,038   4072.1                 226           7.8        Application
gc current grant busy          85,561   3656                   43            7.0        Cluster
gc current retry               667      3581.9                 5370          6.9        Cluster
gc current block 2-way         263,347  2811.9                 11            5.4        Cluster
gc cr block 3-way              82,876   2683.7                 32            5.1        Cluster
gc cr block congested          7,244    2648.1                 366           5.1        Cluster
DB CPU                                  1672.6                               3.2
gc current block congested     4,549    1281.5                 282           2.5        Cluster

 

The top SQL by cluster wait time is:

Cluster Wait Time (s)  Executions  %Total  Elapsed Time(s)  %Clu   %CPU  %IO   SQL Id         SQL Module        SQL Text
5,752.81               42,820      12.68   5,814.65         98.94  0.25  0.00  284tprrsaqszg  JDBC Thin Client  SELECT * FROM SPROD.LOKS WHERE...

It is a SELECT ... FOR UPDATE statement, and there are many similar SQL statements:

SELECT *
FROM   PROD.LOKS
WHERE  CAPS_ID IN (SELECT DISTINCT UNIQUE_SELECTOR
                   FROM   (SELECT SN, CAPS_ID AS UNIQUE_SELECTOR
                           FROM   PROD.CAPS
                           WHERE  SN = :1
                           AND    (MN <> 'A' AND MN <> 'B')
                           ORDER  BY SN ASC)
                   WHERE  ROWNUM <= :2)
ORDER  BY SN ASC
FOR UPDATE
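
For illustration only (this helper is not part of the original note), the complete text behind a SQL Id reported in AWR can be pulled from the AWR repository along these lines, assuming SYSDBA access and a Diagnostics Pack license:

#!/bin/sh
# Hypothetical helper: print the full statement text for a given SQL Id
# (here the top SQL Id 284tprrsaqszg reported above) from the AWR repository.
SQL_ID=284tprrsaqszg

sqlplus -s / as sysdba <<EOF
set long 100000 longchunksize 100000 pagesize 0 linesize 200 feedback off
SELECT sql_text FROM dba_hist_sqltext WHERE sql_id = '$SQL_ID';
EOF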

The database alert log shows that NUMA is enabled at the OS level:

CELL IP affinity details:
    NUMA status: NUMA system w/ 2 process groups
    cellaffinity.ora status: cannot find affinity map at '/etc/oracle/cell/network-config/cellaffinity.ora' (see trace file for details)
CELL communication will use 1 IP group(s):
    Grp 0:
Picked latch-free SCN scheme 3
Using LOG_ARCHIVE_DEST_1 parameter default value as USE_DB_RECOVERY_FILE_DEST
ARCH: Warning; less destinations available than specified
by LOG_ARCHIVE_MIN_SUCCEED_DEST init.ora parameter
WARNING: db_recovery_file_dest is same as db_create_online_log_dest_2
Autotune of undo retention is turned on.
LICENSE_MAX_USERS = 0
SYS auditing is disabled
NUMA system with 2 nodes detected
Starting up:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
...


The NUMA structure can also be verified with OS commands:

# numactl --hardware
available: 2 nodes (0-1)              <<< 2 nodes here, without NUMA enabled, it will show available: 1 nodes (0)
node 0 cpus: 0 1 2 3 4 5 12 13 14 15 16 17
node 0 size: 49143 MB
node 0 free: 33241 MB
node 1 cpus: 6 7 8 9 10 11 18 19 20 21 22 23
node 1 size: 49152 MB
node 1 free: 34444 MB
node distances:
node 0 1
0: 10 21
1: 21 10

# dmesg |grep -i numa
NUMA: Initialized distance table, cnt=2         <<< without NUMA enabled, it will show "No NUMA configuration found"
NUMA: Node 0 [0,a0000) + [100000,80000000) -> [0,80000000)
NUMA: Node 0 [0,80000000) + [100000000,c80000000) -> [0,c80000000)
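
A couple of additional Linux-level cross-checks (not part of the original note; exact paths and tools may vary by distribution) report the same information:

# lscpu | grep -i numa                     <<< "NUMA node(s): 2" on a 2-node NUMA system
# ls -d /sys/devices/system/node/node*     <<< one directory per NUMA node; only node0 when NUMA is disabled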

 

 

CHANGES

 Upgraded a database from a 10.2 cluster to a new 11.2 cluster.

CAUSE

The issue was caused by NUMA being enabled at the OS level. This caused the LMS processes to be blocked in status 'D' (uninterruptible sleep, usually I/O) for excessive amounts of time, which in turn led to excessive gc-related waits.

OSWatcher shows that, during this time, lms1 on instance 2 went into "D" status:

zzz ***Fri Jul 24 14:43:05 NZST 2015
USER PID PPID PRI %CPU %MEM VSZ RSS WCHAN S STARTED TIME COMMAND
oracle 24205 1 19 1.3 2.4 31730396 4762712 ? D 10:20:01 00:03:26 ora_lms1_RACDB2  <<<

zzz ***Fri Jul 24 14:43:35 NZST 2015
USER PID PPID PRI %CPU %MEM VSZ RSS WCHAN S STARTED TIME COMMAND
oracle 24205 1 19 1.3 2.4 31751004 4786908 poll_s S 10:20:01 00:03:27 ora_lms1_RACDB2
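
The same spot check can also be scripted outside of OSWatcher; the following is a minimal sketch (an assumption, not taken from the note) that samples the OS state of the LMS background processes every few seconds:

#!/bin/sh
# Sample the process state of all ora_lms* background processes.
# A state of "D" (uninterruptible sleep) that persists for tens of seconds
# matches the symptom described above.
while true
do
    date
    ps -eo pid,stat,wchan:20,etime,cmd | grep '[o]ra_lms'
    sleep 5
done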


RACDB2_lmhb_24292.trc reports that the lms1 process has not moved for 23 seconds, as a consequence of the "D" status:

*** 2015-07-24 14:43:14.358
==============================
LMS1 (ospid: 24205) has not moved for 23 sec (1437705793.1437705770)
kjfmGCR_HBCheckAll: LMS1 (ospid: 24205) has status 2
 : Not in wait; last wait ended 22 secs ago.

*** 2015-07-24 14:43:16.860
loadavg : 7.77 5.32 4.26
Memory (Avail / Total) = 46962.26M / 193658.71M
Swap (Avail / Total) = 16384.00M / 16384.00M
skgpgcmdout: read() for cmd /bin/ps -elf | /bin/egrep 'PID | 24205' | /bin/grep -v grep timed out after 2.500 seconds

*** 2015-07-24 14:43:19.367
Short stack dump: ORA-32516: cannot wait for process 'Unix process pid: 24205, image: oracle@racnode2.com (LMS1)' to finish executing ORADEBUG command 'SHORT_STACK'; wait time exceeds 2500 ms


Many trace files on all instances report messages similar to the following as a result of the lms1 process not moving:

*** MODULE NAME:(JDBC Thin Client) 2015-07-24 14:43:24.700
*** ACTION NAME:() 2015-07-24 14:43:24.700

kcbzwb: try to cancel ongoing current request (tsn 7 rdba 0xe5f547a dataobj# 81085 obj# 81085) if any  because it got timeout for 'gc buffer busy acquire' (timeout 100 csec) 30 times, to resolve potential deadlock

The dataobj# varies in the traces.

Per the ASH data dump, all "gc buffer busy acquire" sessions are blocked by a session that is itself waiting for "gc cr block 2-way", "gc cr block 3-way", "gc cr block congested", etc. There is no blocking session for those gc waits, and sometimes several sessions wait for "gc buffer busy acquire" on the same block and are blocked by the same session.
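
As an illustration (the exact query is not part of the original note), this kind of blocking information can be pulled from ASH along the following lines, assuming a Diagnostics Pack license:

#!/bin/sh
# List ASH samples that waited on "gc buffer busy acquire", together with the
# blocking session and the object involved.
sqlplus -s / as sysdba <<'EOF'
set pagesize 100 linesize 200
SELECT sample_time, inst_id, session_id,
       blocking_inst_id, blocking_session, current_obj#
FROM   gv$active_session_history
WHERE  event = 'gc buffer busy acquire'
ORDER  BY sample_time, inst_id, session_id;
EOF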

Those gc wait spikes are caused by the LMS process intermittently getting stuck in "D" status for 20-40 seconds.
Further OS-level debugging into why a process gets stuck in "D" status confirmed that it is due to OS NUMA activity.

Other test results show that with NUMA enabled at the OS layer, performance is equally slow whether or not NUMA is enabled at the Oracle layer. Only turning off NUMA completely at the OS layer resolves the issue.
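
For reference (this check is not part of the original note), the Oracle-layer NUMA setting referred to here is commonly controlled in 11.2 by the hidden parameter "_enable_NUMA_support"; its current value can be displayed as below, but hidden parameters should only ever be changed under guidance from Oracle Support:

#!/bin/sh
# Show the current value of the hidden NUMA parameter (parameter name assumed for 11.2).
sqlplus -s / as sysdba <<'EOF'
set linesize 200
col parameter  format a25
col value      format a10
col is_default format a10
SELECT i.ksppinm  AS parameter,
       v.ksppstvl AS value,
       v.ksppstdf AS is_default
FROM   x$ksppi i, x$ksppcv v
WHERE  i.indx = v.indx
AND    i.ksppinm = '_enable_NUMA_support';
EOF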

SOLUTION

Engage the system administrator to turn off NUMA at the OS layer. This is platform specific, but in the case of Linux it can be as simple as specifying the boot parameter "numa=off" in grub.conf.
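
On Linux with legacy GRUB, the change and the post-reboot verification might look like the following sketch (the kernel line is a placeholder; the actual file, kernel version, and options depend on the distribution and boot loader):

# Append numa=off to the active kernel line in /boot/grub/grub.conf, e.g.:
#   kernel /vmlinuz-<version> ro root=<root-device> <existing options> numa=off
#
# After rebooting, confirm the change took effect:
# cat /proc/cmdline       <<< should now contain numa=off
# numactl --hardware      <<< should now report "available: 1 nodes (0)"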

In this case, the performance issue was resolved after NUMA was turned off.

OSWatcher no longer shows any LMS process stuck in "D" status, and the lmhb traces no longer report that an LMS process has not moved.

The post-change AWR report shows:

             Snap Id  Snap Time           Sessions  Cursors/Session  Instances
Begin Snap:  1276     29-Jul-15 16:30:43  159       .9               3
End Snap:    1277     29-Jul-15 17:00:47  163       1.0              3
Elapsed:     30.07 (mins)
DB Time:     28.82 (mins)

Top 10 Foreground Events by Total Wait Time

Event                    Waits    Total Wait Time (sec)  Wait Avg(ms)  % DB time  Wait Class
DB CPU                            1559.5                               90.2
gc cr block busy         75,733   66.1                   1             3.8        Cluster
gc current block 2-way   306,483  43.3                   0             2.5        Cluster
gc cr block 2-way        267,052  37.3                   0             2.2        Cluster
gc current block 3-way   131,677  28.9                   0             1.7        Cluster
log file sync            44,557   26.3                   1             1.5        Commit
rdbms ipc reply          299,418  23                     0             1.3        Other
gc cr block 3-way        99,739   22.5                   0             1.3        Cluster
gc current grant busy    93,058   21.2                   0             1.2        Cluster
db file sequential read  19,220   19.1                   1             1.1        User I/O

The previous top SQL now shows:

Cluster Wait Time (s)  Executions  %Total  Elapsed Time(s)  %Clu   %CPU   %IO   SQL Id         SQL Module        SQL Text
12.35                  45,942      5.42    23.76            51.98  46.54  0.33  284tprrsaqszg  JDBC Thin Client  SELECT * FROM SPROD.LOKS WHERE...

The 45K executions now accumulate only about 12 seconds of cluster wait, compared with 5,752 seconds of cluster wait for 42K executions of the same SQL before the change.
