对oracle 热备份的具体过程及讲解

来源：互联网发布：c语言验证哥德巴赫猜想编辑：程序博客网时间：2024/05/16 17:35

ORACLE TABLESPACE HOT BACKUP MODE REVISITED

Here’s a revised version of an old popular article I wrote over ten years ago. I wrote this when I was at Amazon.com,
long before I came to work at remote DBA provider Blue Gecko. Enjoy!

Oracle’s pre-RMAN hot backup mode is the subject of one of the most pervasive and persistent misconceptions about Oracle.

During an Oracle tablespace hot backup, you (or your script) puts a tablespace into backup mode, then copies the datafiles to disk or tape,
then takes the tablespace out of backup mode. These steps are widely understood by most DBAs.

However, there is a popular misconception that datafiles are “quiesced,” “frozen,” “offlined” or “locked” during backup mode.
So many people think it is true, that it appears in some books on Oracle and on numerous websites. Some have even reported that they
learned this from DBA class instructors.

The myth has a couple permutations. One is that while the datafiles are allegedly not writable, changes are stored somewhere in the SGA
, the redologs, the rollback segments or some combination thereof, then written back into the datafile when the tablespace is taken out of
backup mode. There is a passage in the SAMS title Oracle Unleashed describing this supposed mechanism.

/** -- 这里是本文作者引用别人的文章中的话，但是这个文章中的观点是错误的，后面本文作者会对这个文章的观点进行反驳。
When you place a tablespace in backup mode, the Oracle instance notes that a backup is being performed and internally compensates for it.
As you know, it is impossible to make an authentic copy of a database file that is being written to. On receipt of the command to begin the
backup, however, Oracle ceases to make direct changes to the database file. It uses a complex combination of rollback segments, buffers
, redo logs, and archive logs to store the data until the end backup command is received and the database files are brought back in sync.
Simplifying a hot backup in this way is tantamount to classifying the USS Nimitz as a boat. The complexity of the actions taken by the Oracle
RDBMS under a hot backup could consume an entire chapter and is beyond the scope of this book. What you should understand is the
trade-off for taking a hot backup is increased use of rollback segments, redo logs, archive logs, and internal buffer areas within the SGA.
– Oracle Unleashed, Copyright ? SAMS/Macmillan, Inc. 1997, chapter 14)
**/

No No No! Stop making stuff up! Oracle’s tablespace hot backup does not work this way at all. It is actually a simple, elegant and
failure-resistant mechanism. It absolutely does not stop writing to the datafiles. It actually allows continued operation of the database almost
exactly as during normal operation. Contrary to the characterization as “complex” in SAMS Oracle Unleashed, it can be summarized in a few
steps:

§ DBWn checkpoints the tablespace (writes out all dirty blocks as of a given SCN)

§ CKPT stops updating the Checkpoint SCN field in the datafile headers and begins updating the Hot Backup Checkpoint SCN field instead

§ LGWR begins logging full images of changed blocks the first time a block is changed after being written by DBWn

Those three actions are all that is required to guarantee consistency once the file is restored and recovery is applied. By freezing the Checkpoint
SCN, any subsequent recovery on that backup copy of the file will know that it must commence at that SCN. Having an old SCN in the file
header tells recovery that the file is an old one, and that it should look for the archivelog containing that SCN, and apply recovery starting there.

Note that during hot backup mode, checkpoints to datafiles are not suppressed. Only the main Checkpoint SCN flag is frozen, but CKPT
continues to update a Hot Backup Checkpoint SCN in the file header.

There is a confusing side effect of having the Checkpoint SCN frozen at an SCN earlier than the true checkpointed SCN of the database. In the
event of a system crash or a shutdown abort during hot backup of a tablespace, the automatic crash recovery routine during startup will look
at the file headers, think that the files for that tablespace are out of date, and will suggest that you need to apply old archived redologs in order
to bring them back into sync with the rest of the database. Fortunately, no media recovery is necessary. With the database started up in mount
mode:

SQL> alter database end backup;

This action will bring the Checkpoint SCN in the file headers in sync with the Hot Backup Checkpoint SCN (which is a true representation of the
last SCN to which the datafile is checkpointed). Once you do this, normal crash recovery can proceed during ‘alter database open;’.

By initially checkpointing the datafiles that comprise the tablespace and logging full block images to redo, Oracle guarantees that any blocks
changed in the datafile while in hot backup mode will also be present in the archivelogs in case they are ever used for a recovery. Most of the
Oracle user community knows that Oracle generates a greater volume of redo during hot backup mode. This is the result of Oracle logging of
full images of changed blocks in these tablespaces. Normally, Oracle writes a change vector to the redologs for every change, but it does not
write the whole image of the database block. Full block image logging during backup eliminates the possibility that the backup will contain
unresolvable split blocks. To understand this reasoning, you must first understand what a split block is.

Typically, Oracle database blocks are a multiple of O/S blocks. For example, most Unix filesystems have a default block size of 512 bytes, while
Oracle’s default block size is 8k. This means that the filesystem stores data in 512 byte chunks, while Oracle performs reads and writes in 8k
chunks or multiples thereof. While backing up a datafile, your backup script makes a copy of the datafile from the filesystem, using O/S utilities
such as copy, dd, cpio, or OCOPY. As it is making this copy, your process is reading in O/S-block-sized increments. If DBWn happens to be writing
a DB block into the datafile at the same moment that your script is reading that block’s constituent O/S blocks, your copy of the DB block could
contain some O/S blocks from before the database performed the write, and some from after. This would be a split block. By logging the full
block image of the changed block to the redologs, Oracle guarantees that in the event of a recovery, any split blocks that might be in the backup
copy of the datafile will be resolved by overlaying them with the full legitimate image of the block from the archivelogs. Upon completion of a
recovery, any blocks that got copied in a split state into the backup will have been resolved by overlaying them with the block images from the
archivelogs.

All of these mechanisms exist for the benefit of the backup copy of the files and any future recovery. They have very little effect on the current
datafiles and the database being backed up. Throughout the backup, server processes read datafiles DBWn writes them, just as when a backup
is not taking place. The only difference in the open database files is the frozen Checkpoint SCN, and the active Hot Backup Checkopint SCN. To
demonstrate the principle, we can formulate a simple proof:

Create a table and insert a row:

SQL> create table fruit (kind varchar2(32)) tablespace users;
Table created.

SQL> insert into fruit values ('orange');
1 row created.

SQL> commit;
Commit complete.
Force a checkpoint, to flush dirty blocks to the datafiles.

SQL> alter system checkpoint;
System altered.
Get the file name and block number where the row resides:

SQL> select dbms_rowid.rowid_relative_fno(rowid) file_num,
dbms_rowid.rowid_block_number(rowid) block_num,
kind
from fruit;
FILE_NUM BLOCK_NUM KIND
-------- --------- ------
4 183 orange

SQL> select name from v$datafile where file# = 4;
NAME
-----------------------------
/u01/oradata/uw01/users01.dbf
Use the dd utility to skip to block 183 and extract the DB block containing the row:

unixhost% dd bs=8k skip=183 count=1 if=/u01/oradata/uw01/users01.dbf | strings
1+0 records in
16+0 records out
orange
Now we put the tablespace into hot backup mode:

SQL> alter tablespace users begin backup;
Tablespace altered.
Update the row, commit, and force a checkpoint on the database.

SQL> update fruit set kind = 'plum';
1 row updated

SQL> commit;
Commit complete.

SQL> alter system checkpoint;
System altered.
Extract the same block. It shows that the DB block has been written to disk during backup mode:

unixhost% dd bs=8k skip=183 count=1 if=/u01/oradata/uw01/users01.dbf | strings
1+0 records in
16+0 records out
plum
orange
Don’t forget to take the tablespace out of backup mode!

SQL> alter tablespace administrator end backup;
Tablespace altered.
It is quite clear from this demonstration that datafiles receive writes even during hot backup mode!

★ 关于热备的重点归纳：

Note that during hot backup mode, checkpoints to datafiles are not suppressed. Only the main Checkpoint SCN flag is frozen
, but CKPT continues to update a Hot Backup Checkpoint SCN in the file header.
虽然热备会冻结控制文件头和数据文件头，但是只是冻结不能写入 checkpoint scn, 但是 Hot Backup Checkpoint SCN还是能写入的。

§ DBWn checkpoints the tablespace (writes out all dirty blocks as of a given SCN)
热备的时候 DBWn 还是正常写数据的，当发生增量检查点或者完全检查点的时候还是会写（但是数据文件头和控制文件头将被冻结）
§ CKPT stops updating the Checkpoint SCN field in the datafile headers and begins updating the Hot Backup Checkpoint SCN field instead
ckpt进程将不会记录传统的 checkpoint scn ，而是会用记录 HOT Backup Checkpoint SCN 替代
§ LGWR begins logging full images of changed blocks the first time a block is changed after being written by DBWn
lgwr进程会记录每一个被更改的块的完整镜像进日志（包含这个块中那条记录之前的样子和被更改之后的样子）

热备过程中，数据文件头和控制文件头都会被冻结。

但是对数据的DML，依然会通过DBWn记录进数据文件（条件跟正常情况下一样，当redo bufffer到达rba target的时候，触发增量检查点，通知DBWn
来写数据文件，同时CKPT进程更新控制文件中的checkpoint scn（这里是用 Hot Backup Checkpoint SCN 来替代，这里应该是一个全新的记录位置））

如果在热备过程中发生数据库崩溃或者停电或者shutdown abort等情况，只需要在重启的时候，进入mount阶段，然后关闭热备：
SQL> alter database end backup;
/**
This action will bring the Checkpoint SCN in the file headers in sync with the Hot Backup Checkpoint SCN
(which is a true representation of the last SCN to which the datafile is checkpointed).
**/
这一步，数据库会将，最新的 Hot Backup Checkpoint SCN（也是真正的崩溃前记录到的最新SCN）同步写入到 Checkpoint SCN。
随后，正常打开数据库：
SQL> alter database open;
这个时候，数据库会自动做普通的介质恢复。（因为只有最近一次检查点触发到日志on-disk-rba的位置）