WAITEVENT: "log file sync" Reference Note (Doc ID 34592.1)

"log file sync" Reference Note

This is a reference note for the wait event "log file sync" which includes the following subsections:
  • Brief definition
  • Individual wait details (eg: For waits seen in <>)
  • Systemwide wait details (eg: For waits seen in <>)
  • Reducing waits / wait times
  • Data Guard Perspective
  • Known bugs
See Note:61998.1 for an introduction to Wait Events.

Definition:

  • Versions:7.0 - 11.1 Documentation: 11g 10g
  • When a user session (foreground process) COMMITs (or rolls back), the session's redo information needs to be flushed to the redo logfile. The user session posts LGWR to write all required redo from the log buffer to the redo log file. When LGWR has finished, it posts the user session. The user session waits on this wait event while waiting for LGWR to post it back to confirm that all redo changes are safely on disk.

    Put another way, this is the time the user session/foreground process spends waiting for redo to be flushed to make the commit durable. These waits can therefore be thought of as commit latency as seen by the foreground process (or by the commit client generally).

    See the Reducing Waits section below for a more detailed breakdown of this wait event.

    "log file sync" also applies to ROLLBACK/UNDO: once the rollback/undo is complete, ending the operation requires all changes made by the rollback/undo to be flushed to the redo log.
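
The post/wait handshake described above can be illustrated with a toy model. The sketch below is a simplification under stated assumptions, not Oracle's implementation: a single writer starts a write as soon as it is idle and at least one commit is pending, every write takes a fixed `write_ms`, and one write covers every commit posted before it starts. The function name and its parameters are hypothetical.

```python
def simulate_log_file_sync(commit_times_ms, write_ms):
    """Toy single-writer model of commit latency (assumed, not Oracle internals).

    commit_times_ms: times at which sessions post the writer.
    write_ms: fixed duration of one log write.
    Returns (per-commit waits in ms, number of writes issued).
    """
    commit_times = sorted(commit_times_ms)
    waits = []
    writes = 0
    writer_free_at = 0.0
    i = 0
    while i < len(commit_times):
        # The writer starts when it is idle and a commit is pending.
        start = max(writer_free_at, commit_times[i])
        # Every commit posted by the start time shares this write.
        j = i
        while j < len(commit_times) and commit_times[j] <= start:
            j += 1
        done = start + write_ms
        waits.extend(done - t for t in commit_times[i:j])
        writer_free_at = done
        writes += 1
        i = j
    return waits, writes


waits, writes = simulate_log_file_sync([0, 1, 2, 20], write_ms=5)
print(waits, writes)
```

In this model the commits posted at 1 ms and 2 ms arrive while the first write is in flight, so they share the second write and wait longer than a single write takes. That is why commit latency seen by the foreground can exceed the raw I/O time.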

Individual Waits:

  Parameters:

  • P1 = buffer#
  • P2 = Not used
  • P3 = Not used
  • buffer#
    All changes up to this buffer number (in the log buffer) must be flushed to disk and the writes confirmed to ensure that the transaction is committed and will remain committed after an instance crash. Hence the wait is for LGWR to flush up to this buffer#.

  Wait Time:

The wait is entirely dependent on LGWR writing out the necessary redo blocks and confirming completion back to the user session. The wait time includes the writing of the log buffer and the post. While waiting, the waiter times out and increments the sequence number every second.

  Finding Blockers:

If a session continues to wait on the same buffer# then the SEQ# column of <> should increment every second. If it does not, the local session has a problem with wait event timeouts. If the SEQ# column is incrementing, the blocking process is the LGWR process. Check what LGWR is waiting on, as it may be stuck.
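
The decision rule above can be written down as a small classifier. This is a minimal sketch that assumes you have already collected a few samples of the session's EVENT, P1 and SEQ# values about a second apart. The source elides the view name; V$SESSION_WAIT is the assumed origin of these columns. `WaitSample` and `diagnose` are hypothetical names, and SEQ# wraparound is ignored.

```python
from typing import NamedTuple, Sequence


class WaitSample(NamedTuple):
    event: str  # wait event name
    p1: int     # buffer# for "log file sync"
    seq: int    # SEQ# column value


def diagnose(samples: Sequence[WaitSample]) -> str:
    """Apply the 'Finding Blockers' rule to samples taken ~1 second apart."""
    if len(samples) < 2:
        return "need at least two samples"
    if any(s.event != "log file sync" for s in samples):
        return "not continuously waiting on log file sync"
    if len({s.p1 for s in samples}) > 1:
        return "buffer# changed: waits are completing"
    if samples[-1].seq > samples[0].seq:
        return "SEQ# incrementing: blocker is LGWR, check what LGWR is waiting on"
    return "SEQ# not incrementing: suspect a wait event timeout problem in this session"


# Hypothetical sample data for illustration only.
stuck = [WaitSample("log file sync", 4711, s) for s in (10, 11, 12)]
print(diagnose(stuck))
```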

Systemwide Waits:

Systemwide figures for waits on "log file sync" show the time spent waiting for COMMITs to complete. If this is significant then there may be a problem with LGWR's ability to flush redo out quickly enough. One can also look at:
  • "log file parallel write" waits for LGWR (See Note:34583.1)
  • "user commits" statistic shows the number of commits.
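
To judge whether the systemwide figures are "significant", it helps to turn cumulative counters into per-interval averages. The sketch below assumes two snapshots of cumulative values taken `elapsed_seconds` apart. The keys `total_waits` and `time_waited_micro` mirror the V$SYSTEM_EVENT columns of the same names (an assumption, since the source does not name the view), and the snapshot layout and function names are hypothetical.

```python
def avg_wait_ms(begin, end):
    """Average wait in ms between two cumulative snapshots of one event."""
    waits = end["total_waits"] - begin["total_waits"]
    micros = end["time_waited_micro"] - begin["time_waited_micro"]
    return micros / waits / 1000.0 if waits > 0 else 0.0


def summarize(begin, end, elapsed_seconds):
    """Compare commit latency with LGWR write latency over one interval."""
    commits = end["user commits"] - begin["user commits"]
    return {
        "avg_log_file_sync_ms": avg_wait_ms(
            begin["log file sync"], end["log file sync"]),
        "avg_log_file_parallel_write_ms": avg_wait_ms(
            begin["log file parallel write"], end["log file parallel write"]),
        "commits_per_second": commits / elapsed_seconds,
    }


# Hypothetical snapshot values for illustration only.
begin = {
    "log file sync": {"total_waits": 1000, "time_waited_micro": 5_000_000},
    "log file parallel write": {"total_waits": 900, "time_waited_micro": 1_800_000},
    "user commits": 10_000,
}
end = {
    "log file sync": {"total_waits": 3000, "time_waited_micro": 25_000_000},
    "log file parallel write": {"total_waits": 2900, "time_waited_micro": 5_800_000},
    "user commits": 16_000,
}
print(summarize(begin, end, elapsed_seconds=60))
```

A large gap between the two averages suggests that the time is going somewhere other than the log write I/O itself.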

Reducing Waits / Wait times:

Here are the main general tuning tips to help you reduce waits on "log file sync":

  • Tune LGWR to get good throughput to disk, e.g. do not put redo logs on RAID 5.
  • If there are lots of short-duration transactions, see if it is possible to BATCH transactions together so there are fewer distinct COMMIT operations. Each commit has to have it confirmed that the relevant REDO is on disk. Although commits can be "piggybacked" by Oracle, reducing the overall number of commits by batching transactions can have a very beneficial effect.
  • See if any of the processing can use the COMMIT NOWAIT option (be sure to understand the semantics of this before using it).
  • See if any activity can safely be done with NOLOGGING / UNRECOVERABLE options.
  • Check whether the redo logs are large enough. Enlarge the redo logs so that log switches occur every 15 to 20 minutes.
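
The redo log sizing tip above comes down to simple arithmetic. The sketch below assumes you know how many bytes of redo were generated over a representative busy interval (for example, from the delta of the "redo size" statistic, which is an assumption here). The function name and parameters are hypothetical.

```python
import math


def redo_log_size_mb(redo_bytes, interval_seconds, target_switch_minutes=20):
    """Estimate the redo log size (MB) needed for one switch per target interval."""
    bytes_per_second = redo_bytes / interval_seconds
    needed_bytes = bytes_per_second * target_switch_minutes * 60
    return math.ceil(needed_bytes / (1024 * 1024))


# Hypothetical workload: 3600 MB of redo generated in one hour (1 MB/s).
one_hour_redo = 3600 * 1024 * 1024
print(redo_log_size_mb(one_hour_redo, 3600, target_switch_minutes=15))
print(redo_log_size_mb(one_hour_redo, 3600, target_switch_minutes=20))
```

Sizing from a peak interval rather than a daily average keeps switches within the 15 to 20 minute range during the busiest periods.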

For a more detailed analysis of reducing waits on LOG FILE SYNC, see below:

The overall wait time for LOG FILE SYNC may be broken down into components.
If your system still shows high "log file sync" wait times after you have applied the general tuning tips above, break the total wait time down into its individual components, then tune the components that account for the most time.


The log file sync wait may be broken down into the following components:

1. Wakeup LGWR if idle
2. LGWR gathers the redo to be written and issues the I/O
3. Time for the log write I/O to complete
4. LGWR I/O post processing
5. LGWR posting the foreground/user session that the write has completed
6. Foreground/user session wakeup


Tuning advice based on the log file sync component breakdown above:

Steps 2 and 3 are accumulated in the "redo write time" statistic (as found under the STATISTICS section of Statspack and AWR).
Step 3 is the "log file parallel write" wait event (see Note:34583.1, "log file parallel write" Reference Note).
Steps 5 and 6 may become very significant as the system load increases. This is because even after the foreground has been posted, it may take some time for the OS to schedule it to run. This may require monitoring at the O/S level.
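
The mapping above suggests a rough way to split an average "log file sync" wait into parts. The sketch below assumes all three inputs are already averages in milliseconds over the same interval (converting "redo write time" from its native units is left out). Per-write and per-commit averages are not strictly comparable, so treat the result as a guide only. The function name and labels are hypothetical.

```python
def break_down_log_file_sync(sync_ms, redo_write_ms, parallel_write_ms):
    """Split average log file sync time into rough component groups.

    sync_ms:           average "log file sync" wait
    redo_write_ms:     average "redo write time" per write (steps 2 + 3)
    parallel_write_ms: average "log file parallel write" wait (step 3)
    """
    parts = {
        "step 3 (log write I/O)": parallel_write_ms,
        "step 2 (gather redo, issue I/O)": max(redo_write_ms - parallel_write_ms, 0.0),
        "steps 1, 4, 5, 6 (wakeups, post processing, posting)": max(sync_ms - redo_write_ms, 0.0),
    }
    largest = max(parts, key=parts.get)
    return parts, largest


# Hypothetical averages for illustration only.
parts, largest = break_down_log_file_sync(10.0, 4.0, 3.0)
for name, ms in parts.items():
    print(f"{name}: {ms:.1f} ms")
print("tune first:", largest)
```

With these sample numbers most of the time falls outside the write itself, which points at scheduling and posting (steps 1, 4, 5 and 6) rather than at storage.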

Data Guard Perspective:

For Data Guard with synchronous (SYNC) transport and commit WAIT defaults, the above tuning steps still apply, except that step 3 also includes the time for the network write and the RFS/redo write to the standby redo logs.
How this wait event applies to Data Guard is explained in detail in the MAA OTN white paper:
Note 387174.1: MAA - Data Guard Redo Transport and Network Best Practices.
Known Bugs:

The list below shows bugs affecting any version.

There are 28 bugs listed.
NB | Bug      | Fixed | Description
   | 14823372 | 11.2.0.3.BP23, 11.2.0.4, 12.1.0.1 | Adaptive "log file sync" picks inaccurate polling interval on RAC
   | 13707904 | 11.2.0.4, 12.1.0.1 | LGWR sometimes uses polling, sometimes post/wait
E  | 12614085 | 11.2.0.4, 12.1.0.1 | Diagnostic enhancement to add new statistics for investigating "log file sync" and "log file parallel write" relationship
P  | 16102401 | 11.2.0.3.BP16, 11.2.0.4, 12.1.0.1 | identify correct effective mutiplier for sparc t5 processor
   | 13851769 | 12.1.0.1 | ORA-16198 or 'log file sync' waits in PRIMARY database or 'reliable message' waits in STANDBY
   | 13551402 | 11.2.0.3.9, 11.2.0.3.BP22, 11.2.0.4, 12.1.0.1 | High "log file parallel write" and "log file sync" after upgrading 11.2 with Veritas/Symantec ODM
PE | 12951619 | 11.2.0.4, 12.1.0.1 | Solaris: Enhancement to allow database to use critical threads feature in Solaris
   | 12378147 | 11.2.0.2.7, 11.2.0.2.BP10, 11.2.0.3, 12.1.0.1 | Long broadcast ack warning messages, and/or many Log File Sync timeouts in foregrounds in RAC
   | 9095696  | 11.2.0.3.7, 11.2.0.3.BP08, 11.2.0.4, 12.1.0.1 | "log file sync" wait time spikes with ARCHIVE_LAG_TARGET set
   | 13074706 | 11.2.0.3.BP14, 11.2.0.4, 12.1.0.1 | Long "log file sync" waits in RAC not correlated with slow writes
   | 8490879  | 10.2.0.4.4, 10.2.0.5, 11.1.0.7.3, 11.2.0.1 | Long "log file sync" latencies due to broadcast on commit scheme
   | 8220734  | 10.2.0.4.4, 10.2.0.5, 11.1.0.7.3, 11.2.0.1 | Long "log file sync" wait in RAC
   | 7716356  | 10.2.0.5, 11.2.0.1 | Long "log file sync" latencies with broadcast on commit scheme in RAC
   | 7643632  | 10.2.0.4.1, 10.2.0.5, 11.1.0.7.4, 11.2.0.1 | High log file sync in Data Guard maximum availability (sync) mode
   | 7610362  | 10.2.0.4.4, 10.2.0.5, 11.1.0.7.3, 11.2.0.1 | Long "log file sync" waits in RAC with broadcast on commit in RAC
P  | 7568734  | 10.2.0.5, 11.2.0.1 | AIX: Sporadic spikes of 'log file sync' on AIX with heavy commit concurrency
   | 7452373  | 10.2.0.5, 11.1.0.7.1, 11.2.0.1 | "log file sync" timeout is not configurable
D  | 6319685  | 10.2.0.4, 11.1.0.7, 11.2.0.1 | LGWR posts do not scale on some platforms
   | 6193945  | 10.2.0.4.1, 10.2.0.5, 11.1.0.7, 11.2.0.1 | High LGWR CPU use and long 'log file sync' latency in RAC
   | 9776431  | 11.1.0.7.4 | 11.1.0.7.3 fix for 8220734 is incomplete - "log file sync" timeout set to 1 second
   | 5896963  | 10.2.0.4, 11.1.0.6 | High LGWR CPU and longer "log file sync" with fix for bug 5065930
   | 5147386  | 10.2.0.4.1, 10.2.0.5, 11.1.0.6 | Long waits on "log file sync" /random ORA-27152 "attempt to post process failed"
   | 5087592  | 10.2.0.4, 11.1.0.6 | "log file sync" waits from read only commits
   | 5065930  | 10.2.0.3, 11.1.0.6 | "log file sync" timeouts can occur
   | 5061068  | 10.2.0.3, 11.1.0.6 | RAC using "broadcast on commit" can see delayed commit times
   | 3311210  | 9.2.0.5, 10.1.0.2 | Unnecessary 0.5 seconds waits for "Broadcast on commit" SCN scheme
   | 2663122  | 9.2.0.5, 10.1.0.2 | Unneccessarily long waits on "log file sync" can occur
   | 2640686  | 9.2.0.5, 10.1.0.2 | Long waits for "log file sync" with broadcast SCN in RAC

  • '*' indicates that an alert exists for that issue.
  • '+' indicates a particularly notable issue / bug.
  • 'I' indicates an install issue / bug included for completeness.
  • 'P' indicates a port specific bug.
  • 'E' indicates an "Enhancement" (as opposed to a bug fix).
  • 'D' indicates that the bug fix is disabled by default.
  • "OERI:xxxx" may be used as shorthand for ORA-600 [xxxx].

REFERENCES

NOTE:7452373.8 - Bug 7452373 - "log file sync" timeout is not configurable
NOTE:7568734.8 - Bug 7568734 - AIX: Sporadic spikes of 'log file sync' on AIX with heavy commit concurrency
NOTE:7610362.8 - Bug 7610362 - Long "log file sync" waits in RAC with broadcast on commit in RAC
NOTE:7643632.8 - Bug 7643632 - High log file sync in Data Guard maximum availability (sync) mode
NOTE:7716356.8 - Bug 7716356 - Long "log file sync" latencies with broadcast on commit scheme in RAC
NOTE:8220734.8 - Bug 8220734 - Long "log file sync" wait in RAC
NOTE:8490879.8 - Bug 8490879 - Long "log file sync" latencies due to broadcast on commit scheme
NOTE:9095696.8 - Bug 9095696 - "log file sync" wait time spikes with ARCHIVE_LAG_TARGET set
NOTE:9776431.8 - Bug 9776431 - 11.1.0.7.3 fix for 8220734 is incomplete - "log file sync" timeout set to 1 second
NOTE:2640686.8 - Bug 2640686 - Long waits for "log file sync" with broadcast SCN in RAC
NOTE:2663122.8 - Bug 2663122 - Unneccessarily long waits on "log file sync" can occur
NOTE:12378147.8 - Bug 12378147 - Long broadcast ack warning messages, and/or many Log File Sync timeouts in foregrounds in RAC
NOTE:13074706.8 - Bug 13074706 - Long "log file sync" waits in RAC not correlated with slow writes

NOTE:5087592.8 - Bug 5087592 - "log file sync" waits from read only commits
NOTE:5896963.8 - Bug 5896963 - High LGWR CPU and longer "log file sync" with fix for bug 5065930
NOTE:5147386.8 - Bug 5147386 - Long waits on "log file sync" /random ORA-27152 "attempt to post process failed"
NOTE:6193945.8 - Bug 6193945 - High LGWR CPU use and long 'log file sync' latency in RAC
NOTE:61998.1 - Introduction to Tuning Oracle7 / Oracle8 / 8i / 9i
NOTE:6319685.8 - Bug 6319685 - LGWR posts do not scale on some platforms
NOTE:3311210.8 - Bug 3311210 - Unnecessary 0.5 seconds waits for "Broadcast on commit" SCN scheme

NOTE:34583.1 - WAITEVENT: "log file parallel write" Reference Note
NOTE:387174.1 - MAA - Data Guard Redo Transport and Network Best Practices
NOTE:5061068.8 - Bug 5061068 - RAC using "broadcast on commit" can see delayed commit times
NOTE:5065930.8 - Bug 5065930 - "log file sync" timeouts can occur