WAITEVENT: “log file sync” Reference Note (Doc ID 34592.1)

Versions: 7.0 – 11.1 Documentation: 11g 10g
When a user session (foreground process) COMMITs (or rolls back), the session’s redo information needs to be flushed to the redo log file. The user session posts LGWR to write all required redo from the log buffer to the redo log file. When LGWR has finished, it posts the user session. The user session waits on this event until LGWR posts it back to confirm that all redo changes are safely on disk.

This may be described further as the time a user session/foreground process spends waiting for redo to be flushed to make the commit durable. These waits can therefore be thought of as commit latency as seen by the foreground process (or, more generally, the commit client).


See the Reducing Waits section below for a more detailed breakdown of this wait event.

(“log file sync” also applies to ROLLBACK/UNDO: once the rollback/undo is complete, all changes made to perform it must be flushed to the redo log before the operation can end.)

Individual Waits:

Parameters:

P1 = buffer#
P2 = Not used
P3 = Not used

buffer#
All changes up to this buffer number (in the log buffer) must be flushed to disk and the writes confirmed to ensure that the transaction is committed, and will remain committed upon an instance crash. Hence the wait is for LGWR to flush up to this buffer#.

Wait Time:
The wait is entirely dependent on LGWR to write out the necessary redo blocks and confirm completion back to the user session. The wait time includes the writing of the log buffer and the post. The waiter times out and increments the sequence number every second while waiting.

Finding Blockers:
If a session continues to wait on the same buffer# then the SEQ# column of <> should increment every second. If it does not, the local session has a problem with wait event timeouts. If the SEQ# column is incrementing, the blocking process is LGWR. Check what LGWR is waiting on, as it may be stuck.
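
The decision rule above can be sketched in a few lines of Python. This is only an illustration, assuming the caller has already collected (SEQ#, buffer#) pairs for one waiting session at roughly one-second intervals; the function name, the tuple layout and the returned labels are hypothetical and are not part of any Oracle interface.

```python
from typing import List, Tuple


def diagnose_log_file_sync(samples: List[Tuple[int, int]]) -> str:
    """Classify a waiter from (seq, buffer_no) samples taken about 1 second apart."""
    if len(samples) < 2:
        return "need more samples"

    # If buffer# changes between samples, earlier waits completed and the
    # session has moved on to a new commit, so it is not stuck.
    if len({buffer_no for _, buffer_no in samples}) > 1:
        return "progressing"

    seqs = [seq for seq, _ in samples]
    # Same buffer#, SEQ# rising on every sample: the waiter is timing out
    # normally each second, so the blocker is LGWR.
    if all(later > earlier for earlier, later in zip(seqs, seqs[1:])):
        return "blocked on LGWR"

    # Same buffer#, SEQ# not rising: the session's own wait timeouts
    # are not firing.
    return "local timeout problem"


if __name__ == "__main__":
    print(diagnose_log_file_sync([(10, 5), (11, 5), (12, 5)]))
```

A result of "blocked on LGWR" is the cue to look at what LGWR itself is waiting on.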

Systemwide Waits:
Systemwide figures for waits on “log file sync” show the time spent waiting for COMMITs to complete. If this is significant then there may be a problem with LGWR’s ability to flush redo out quickly enough. One can also look at:
“log file parallel write” waits for LGWR (See Note:34583.1)
“user commits” statistic shows the number of commits.
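
As a rough aid for judging whether the figures are significant, the totals can be turned into averages. The sketch below assumes the caller already has the total wait time (converted to milliseconds), the wait count for each event and the “user commits” value; the function names and the sample numbers are hypothetical.

```python
def average_wait_ms(total_wait_time_ms: float, total_waits: int) -> float:
    """Average time per wait; returns 0.0 when no waits were recorded."""
    if total_waits <= 0:
        return 0.0
    return total_wait_time_ms / total_waits


def waits_per_commit(total_waits: int, user_commits: int) -> float:
    """How many 'log file sync' waits were recorded per user commit."""
    if user_commits <= 0:
        return 0.0
    return total_waits / user_commits


if __name__ == "__main__":
    # Hypothetical interval figures, for illustration only.
    sync_avg = average_wait_ms(total_wait_time_ms=50_000.0, total_waits=10_000)
    write_avg = average_wait_ms(total_wait_time_ms=18_000.0, total_waits=9_000)
    print(f"log file sync avg: {sync_avg:.2f} ms")
    print(f"log file parallel write avg: {write_avg:.2f} ms")
    print(f"waits per commit: {waits_per_commit(10_000, 9_500):.2f}")
```

If the “log file sync” average is far above the “log file parallel write” average, much of the commit latency is probably being spent outside the log write I/O itself.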

Reducing Waits / Wait times:
Here are some general tuning tips to help you reduce waits on “log file sync”:
Tune LGWR to get good throughput to disk, e.g. do not put redo logs on RAID 5.
If there are lots of short duration transactions, see if it is possible to BATCH transactions together so there are fewer distinct COMMIT operations. Each commit must wait for confirmation that the relevant REDO is on disk. Although commits can be “piggybacked” by Oracle, reducing the overall number of commits by batching transactions can have a very beneficial effect.
See if any of the processing can use the COMMIT NOWAIT option (be sure to understand the semantics of this before using it).
See if any activity can safely be done with NOLOGGING / UNRECOVERABLE options.
Check to see if the redo logs are large enough. Enlarge the redo logs so that log switches occur every 15 to 20 minutes.
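
The batching tip can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in so that the example runs anywhere; the table `t`, its columns and the function name are hypothetical, and an Oracle client would follow the same commit-every-N pattern through its own driver.

```python
import sqlite3
from typing import Iterable, Tuple


def insert_rows(conn: sqlite3.Connection,
                rows: Iterable[Tuple[int, str]],
                batch_size: int) -> int:
    """Insert rows, committing once per batch_size rows.

    Returns the number of COMMITs issued.
    """
    commits = 0
    pending = 0
    cur = conn.cursor()
    for row in rows:
        cur.execute("INSERT INTO t (id, val) VALUES (?, ?)", row)
        pending += 1
        if pending == batch_size:
            conn.commit()  # one commit covers the whole batch
            commits += 1
            pending = 0
    if pending:
        conn.commit()  # flush the final partial batch
        commits += 1
    return commits


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, val TEXT)")
    rows = [(i, f"v{i}") for i in range(1000)]
    print(insert_rows(conn, rows, batch_size=100), "commits issued")
```

With `batch_size=1` the same load issues one commit per row. Whether rows may be grouped this way depends on the application's transaction boundaries.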

For more detailed analysis for reducing waits on LOG FILE SYNC please see below:

The overall wait time for LOG FILE SYNC may be broken down into subsections or components.
If your system still shows high “log file sync” wait times after ensuring the general tuning tips above are completed, you should break down the total wait time into the individual components, then tune those components that make up the largest time.

The log file sync wait may be broken down into the following components:
1. Wakeup LGWR if idle
2. LGWR gathers the redo to be written and issues the I/O
3. Time for the log write I/O to complete
4. LGWR I/O post processing
5. LGWR posting the foreground/user session that the write has completed
6. Foreground/user session wakeup

Tuning advice based on log file sync component breakdown above:
Steps 2 and 3 are accumulated in the “redo write time” statistic (as found under the STATISTICS section of Statspack and AWR).
Step 3 is the “log file parallel write” wait event. (See Note:34583.1, “log file parallel write” Reference Note.)
Steps 5 and 6 may become very significant as the system load increases. This is because even after the foreground has been posted, it may take some time for the OS to schedule it to run. This may require monitoring at the O/S level.
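
The mapping above suggests a simple subtraction to see where the time goes. The sketch below assumes the caller has already converted the three figures to average milliseconds per operation; the function and field names are hypothetical. Because “log file sync” is averaged per wait while the LGWR figures are averaged per write, the result is an approximation, not an exact accounting.

```python
from dataclasses import dataclass


@dataclass
class SyncBreakdown:
    gather_and_issue_ms: float  # step 2
    log_write_io_ms: float      # step 3
    other_ms: float             # steps 1, 4, 5 and 6 combined


def break_down(sync_avg_ms: float,
               redo_write_avg_ms: float,
               parallel_write_avg_ms: float) -> SyncBreakdown:
    """Split an average 'log file sync' wait into rough components."""
    # "redo write time" covers steps 2 and 3; "log file parallel write" is step 3.
    gather = max(redo_write_avg_ms - parallel_write_avg_ms, 0.0)
    # Whatever is not covered by "redo write time" belongs to the other steps.
    other = max(sync_avg_ms - redo_write_avg_ms, 0.0)
    return SyncBreakdown(gather, parallel_write_avg_ms, other)


if __name__ == "__main__":
    # Hypothetical averages in milliseconds, for illustration only.
    print(break_down(sync_avg_ms=10.0,
                     redo_write_avg_ms=4.0,
                     parallel_write_avg_ms=3.0))
```

A large `other_ms` relative to `log_write_io_ms` points towards posting and scheduling delays (steps 5 and 6), not towards the storage.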

Data Guard Perspective:
For Data Guard with synchronous (SYNC) transport and commit WAIT defaults, the above tuning steps still apply, except step 3 also includes the time for the network write and the RFS/redo write to the standby redo logs.
This wait event and how it applies to Data Guard is explained in detail in the MAA OTN white paper:
Note 387174.1:MAA – Data Guard Redo Transport and Network Best Practices.