Controlfile Structure

The first block of the controlfile is a header block that records just the controlfile block size and the number of blocks in the controlfile. The controlfile block size is the same as the database block size. When mounting a database, Oracle checks that the controlfile block size and the file size recorded in the controlfile header block match the db_block_size parameter and the file size reported by the operating system (if available). Otherwise an error is raised to indicate that the controlfile might have been corrupted or truncated.

After the header block, all controlfile blocks occur in pairs. Each logical block is represented by two physical blocks. This is necessary for the controlfile transaction mechanism.

It is theoretically possible that a hot backup of a controlfile could contain a split block. Therefore all controlfile blocks other than the file header have a cache header and tail that can be compared when mounting a database and whenever a controlfile block is read. The block type is 0 for virgin controlfile blocks and 21 otherwise. The physical controlfile block number is used in place of an RDBA in the cache header, and a controlfile sequence number is used in place of an SCN to record when the block was last changed. An ORA-00227 error is returned if the header and tail do not match, or if the block checksum does not match the checksum recorded in the cache header (if any).

The controlfile contains several different types of records, each in its own record section of one or more logical blocks. Records may span block boundaries within their section. The fixed view V$CONTROLFILE_RECORD_SECTION lists the types of records stored in each record section, together with the size of the record type, and the number of record slots available and in use within that section. The underlying X$KCCRS structure includes the starting logical block number (RSLBN) for each section.

Controlfile Transactions

Sessions must hold an exclusive lock on the CF enqueue for the duration of controlfile transactions. This prevents concurrent controlfile transactions, and in-flux controlfile reads, because a shared lock on the CF enqueue is needed for controlfile reads. However, there is also a need for recoverability should a process, instance or system failure occur during a controlfile transaction.

For the first record section of the controlfile, the database information entry section, this requirement is trivial, because the database information entry only takes about 210 bytes and is therefore guaranteed to always fit into a single controlfile block that can be written atomically. Therefore changes to the database entry can be implicitly committed as they are written, without any recoverability concerns.

Recoverability for changes to the other controlfile record sections is provided by maintaining all the information in duplicate. Each logical block is represented by two physical blocks. One contains the current information, and the other contains either an old copy of the information, or a pending version that is yet to be committed. To keep track of which physical copy of each logical block contains the current information, Oracle maintains a block version bitmap with the database information entry in the first record section of the controlfile.

To read information from the controlfile, a session must first read the block version bitmap to determine which physical block to read. Then, if a change must be made to the logical block, the change is first written to the alternate physical block for that logical block, and then committed by atomically rewriting the block containing the block version bitmap with the bit representing that logical block flipped. When changes need to be made to multiple records in the same controlfile block, such as when updating the checkpoint SCN in all online datafiles, those changes are buffered and then written together. Note that each controlfile transaction requires at least four serial I/O operations against the controlfile, and possibly more if multiple blocks are affected, or if the controlfile is multiplexed and asynchronous I/O is not available. Controlfile transactions are therefore potentially expensive in terms of I/O latency.
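The read and commit protocol described above can be sketched as a toy in-memory model. This is only an illustration, not Oracle's implementation: the class name ToyControlfile, the crash_before_commit switch, and the way the four I/O operations are counted (read bitmap, read current copy, write alternate copy, rewrite bitmap) are assumptions made for the example.

```python
class ToyControlfile:
    """Toy model: every logical block has two physical copies (hypothetical layout)."""

    def __init__(self, n_logical):
        # copies[i] holds the two physical versions of logical block i.
        self.copies = [[b"", b""] for _ in range(n_logical)]
        # bitmap[i] says which physical copy (0 or 1) is current.
        self.bitmap = [0] * n_logical
        self.io_count = 0

    def read(self, logical):
        self.io_count += 1                  # read the block holding the bitmap
        current = self.bitmap[logical]
        self.io_count += 1                  # read the current physical copy
        return self.copies[logical][current]

    def update(self, logical, new_data, crash_before_commit=False):
        old = self.read(logical)            # two I/Os, as counted in read()
        alternate = 1 - self.bitmap[logical]
        self.copies[logical][alternate] = new_data
        self.io_count += 1                  # write the pending version
        if crash_before_commit:
            return old                      # bitmap untouched: old copy stays current
        self.bitmap[logical] = alternate    # flipping the bit is the commit
        self.io_count += 1                  # atomic rewrite of the bitmap block
        return old


cf = ToyControlfile(n_logical=4)
cf.update(2, b"v1")
cf.update(2, b"v2", crash_before_commit=True)   # simulated failure before commit
print(cf.read(2))                               # b'v1'
```

Because the pending version only ever lands in the alternate copy, a failure before the bitmap rewrite leaves the previously committed version readable.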

Whenever a controlfile transaction is committed, the controlfile sequence number is incremented. This number is recorded with the block version bitmap and database information entry in the first record section of the controlfile. It is used in the cache header of each controlfile block in place of an SCN to detect possible split blocks from hot backups. It is also used in queries that perform multiple controlfile reads to ensure that a consistent snapshot of the controlfile has been seen. If not, an ORA-00235 error is returned.
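A minimal sketch of how a sequence number can guard a multi-block read follows. The before/after comparison is an assumption about how such a check could work, and SnapshotChanged merely stands in for the ORA-00235 condition; all names here are hypothetical.

```python
class SnapshotChanged(Exception):
    """Stands in for the ORA-00235 condition in this sketch."""


def consistent_read(read_sequence, read_blocks):
    """Run a multi-block read and verify the sequence number did not move."""
    before = read_sequence()
    data = read_blocks()
    after = read_sequence()
    if before != after:
        raise SnapshotChanged("controlfile changed during multi-block read")
    return data


# Hypothetical state standing in for the controlfile.
state = {"seq": 100, "blocks": ["a", "b"]}

def read_sequence():
    return state["seq"]

def quiet_read():
    return list(state["blocks"])

def racing_read():
    data = list(state["blocks"])
    state["seq"] += 1          # a concurrent transaction commits mid-read
    return data

result = consistent_read(read_sequence, quiet_read)
try:
    consistent_read(read_sequence, racing_read)
    raced = False
except SnapshotChanged:
    raced = True
print(result, raced)           # ['a', 'b'] True
```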

The controlfile transaction mechanism is not used for updates to the checkpoint heartbeat. Instead the size of the checkpoint progress record is overstated as half of the available space in a controlfile block, so that one physical block is allocated to the checkpoint progress record section per thread. Then, instead of using pairs of physical blocks to represent each logical block, each checkpoint progress record is maintained in its own physical block so that checkpoint heartbeat writes can be performed and committed atomically without affecting any other data.
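The sizing trick can be checked with a little arithmetic. The sketch below assumes that a section is allocated enough logical blocks to hold all its records, that each logical block is allocated two physical blocks, and that an 8192-byte block has 8168 usable bytes; the function name and all of these numbers are hypothetical.

```python
def physical_blocks_for_section(n_records, record_size, usable_bytes):
    """Physical blocks allocated if every logical block gets two physical blocks."""
    total_bytes = n_records * record_size
    logical_blocks = -(-total_bytes // usable_bytes)   # integer ceiling division
    return 2 * logical_blocks


USABLE = 8168   # hypothetical: 8192-byte block minus a 20-byte header and 4-byte tail

# Declaring each record as half a block yields one physical block per thread.
for threads in (2, 4, 8):
    print(threads, physical_blocks_for_section(threads, USABLE // 2, USABLE))
```

With the record size declared as half a block, two records fill one logical block, so the two physical blocks behind it can be used as one block per record. For an odd number of threads this simple model rounds up by one block.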

Cache Header and Tail

All datafile blocks are written and read by the cache layer of the Oracle kernel (KCB), generally through the database buffer cache. The cache layer reads and maintains a 20-byte header and 4-byte tail on each data block, called the cache header and tail. The cache header is called the common block header in V$TYPE_SIZE and elsewhere. Controlfile blocks also have a cache header and tail, although not all the fields are used.

This is what the cache header and tail look like in a data block dump. (This is taken from a block dump of the segment header block of the SYSTEM rollback segment.)

    buffer tsn: 0 rdba: 0x00400002 (1/2)
    scn: 0x0000.00e9ffb4 seq: 0x01 flg: 0x04 tail: 0xffb40e01
    frmt: 0x02 chkval: 0xb31e type: 0x0e=KTU UNDO HEADER W/UNLIMITED EXTENTS

The header consists of the following fields.

database block address (4 bytes)
The tablespace relative database block address (RDBA). This is constructed from the tablespace relative file number, and the block number of the data block within that file.

SCN (6 bytes)
The SCN at which the block was last changed. The low-order 4 bytes are called the SCN base, and the high-order 2 bytes are called the SCN wrap.

sequence (1 byte)
A sequence number incremented for each change to a block at the same SCN. If the sequence number wraps, a new SCN must be allocated.
The value 0xff is reserved. When present it indicates that the block has been marked as corrupt by Oracle.

flag (1 byte)
A combination of 1-bit flag values.
1 = virgin block
2 = last change to the block was for a cleanout operation
4 = checksum value is set
8 = temporary data

format (1 byte)
The format of the cache header was changed for Oracle8. Under Oracle8 and 9, the value is always 2. Previously, it was 1.

checksum (2 bytes)
An optional checksum of the block contents. When a block is written, the checksum is either cleared or set depending on the setting of the db_block_checksum parameter. When a block is read, the checksum is verified if present and if the parameter is set to TRUE. Checksums are always calculated and checked for blocks in the SYSTEM tablespace.
The checksum is the XOR of all the other 2-byte pairs in the block. Thus when a block with a checksum is checked, the XOR of all the 2-byte words in the block should be 0.

block type (1 byte)
The most common block type is 6, which is used for all table, index and cluster data blocks.

unused (4 bytes)
Unused space, possibly for backward or forward compatibility.
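The XOR property of the checksum field can be illustrated with a short sketch. The function names are hypothetical, the little-endian word order is an assumption (the XOR-to-zero property holds for either byte order), and offset 16 for the checksum field is taken from the BBED output shown in this article.

```python
import struct


def xor16(block: bytes) -> int:
    """XOR together all 2-byte words in the block."""
    acc = 0
    for (word,) in struct.iter_unpack("<H", block):
        acc ^= word
    return acc


def set_checksum(block: bytearray, offset: int = 16) -> None:
    """Store a checksum so that the XOR over the whole block becomes 0."""
    block[offset:offset + 2] = b"\x00\x00"          # clear the field first
    struct.pack_into("<H", block, offset, xor16(bytes(block)))


def verify(block: bytes) -> bool:
    """A block with a valid checksum XORs to zero."""
    return xor16(block) == 0


block = bytearray(range(256)) * 8      # 2048 bytes of sample content
set_checksum(block)
ok_before = verify(bytes(block))
block[100] ^= 0x01                     # simulate a single-bit corruption
ok_after = verify(bytes(block))
print(ok_before, ok_after)             # True False
```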

The tail consists of the low-order two bytes of the SCN base followed by the block type and the sequence number. The consistency of the header and tail is checked whenever a block is read. This detects most block corruptions, in particular split blocks from hot backups.
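As a worked example, the tail value can be recomputed from the header fields in the block dump above (SCN base 0x00e9ffb4, block type 0x0e, sequence 0x01). The function names below are hypothetical and the sketch only mirrors the field layout described here.

```python
def expected_tail(scn_base: int, block_type: int, seq: int) -> int:
    """Low-order two bytes of the SCN base, then block type, then sequence."""
    return ((scn_base & 0xFFFF) << 16) | (block_type << 8) | seq


def header_matches_tail(scn_base: int, block_type: int, seq: int, tail: int) -> bool:
    """True when the header fields agree with the tail read from the block."""
    return expected_tail(scn_base, block_type, seq) == tail


tail = expected_tail(0x00E9FFB4, 0x0E, 0x01)
print(hex(tail))                                                  # 0xffb40e01
print(header_matches_tail(0x00E9FFB4, 0x0E, 0x02, 0xFFB40E01))    # False
```

The second call models a split block: the header carries a newer sequence number than the tail, so the comparison fails.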

The physical order of the header fields is: block type, format, unused (2 bytes), RDBA, SCN, sequence, flag, checksum, unused (2 bytes). The following output from BBED (a low-level block browser / editor utility) corresponds to the above extract from a block dump of the segment header block of the SYSTEM rollback segment.

    BBED> print kcbh
    struct kcbh, 20 bytes                       @0
       ub1 type_kcbh                            @0        0x0e
       ub1 frmt_kcbh                            @1        0x02
       ub1 spare1_kcbh                          @2        0x00
       ub1 spare2_kcbh                          @3        0x00
       ub4 rdba_kcbh                            @4        0x00400002
       ub4 bas_kcbh                             @8        0x00e9ffb4
       ub2 wrp_kcbh                             @12       0x0000
       ub1 seq_kcbh                             @14       0x01
       ub1 flg_kcbh                             @15       0x04 (KCBHFCKV)
       ub2 chkval_kcbh                          @16       0xb31e
       ub2 spare3_kcbh                          @18       0x0000

    BBED> print tailchk
    ub4 tailchk                                 @2044     0xffb40e01
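The field offsets in the BBED output can be expressed as a struct layout. The sketch below is an illustration only: it assumes a little-endian platform, and it assumes the RDBA packs the relative file number into the top 10 bits and the block number into the low 22 bits, which is consistent with the (1/2) shown in the block dump. The names KCBH, parse_kcbh and split_rdba are hypothetical.

```python
import struct
from collections import namedtuple

# Field names follow the BBED output with the _kcbh suffix dropped.
KCBH = namedtuple("KCBH", "type frmt spare1 spare2 rdba bas wrp seq flg chkval spare3")
KCBH_FORMAT = "<BBBBIIHBBHH"     # assumed little-endian, no padding


def parse_kcbh(raw: bytes) -> KCBH:
    """Decode the first 20 bytes of a block as a cache header."""
    return KCBH._make(struct.unpack(KCBH_FORMAT, raw[:20]))


def split_rdba(rdba: int):
    """Assumed split: 10-bit relative file number, 22-bit block number."""
    return rdba >> 22, rdba & 0x3FFFFF


# Rebuild the header bytes from the values printed by BBED, then parse them.
raw = struct.pack(KCBH_FORMAT, 0x0E, 0x02, 0, 0,
                  0x00400002, 0x00E9FFB4, 0x0000, 0x01, 0x04, 0xB31E, 0)
hdr = parse_kcbh(raw)

print(struct.calcsize(KCBH_FORMAT))              # 20
print(split_rdba(hdr.rdba))                      # (1, 2)
print(f"scn: 0x{hdr.wrp:04x}.{hdr.bas:08x}")     # scn: 0x0000.00e9ffb4
print(bool(hdr.flg & 0x04))                      # True: checksum value is set
```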