11g Active Standby Database Automatic Block Corruption Repair

来源：互联网发布：第五次人口普查数据编辑：程序博客网时间：2024/05/16 05:56

In addition to the real time query capability of the 11g Active Data Guard feature, we can also add to our high availability capability by using the Automatic Block Media Repair feature whereby data block corruptions on the Primary database can be repaired by obtaining those blocks from the standby site – all performed by a background process (ABMR) transparent to the application.

The same functionality can be used to repair block corruptions on the Active Standby site by applying blocks which are conversely now received from the Primary site.

Let us see a test case of the same.

SQL> alter database recover managed standby database cancel;

Database altered.

SQL> alter database open read only;

Database altered.

SQL> alter database recover managed standby database using current logfile disconnect from session;

Database altered.

Table created.

We create a test table and assign it to the Honcho tablespace.
SQL> create tablespace honcho datafile '/u01/app/oracle/oradata/prim/honcho01.dbf' size 1m;

Tablespace created.

SQL> create table honcho (id number) tablespace honcho;

Table created.

SQL> begin
2    for i in 1..70000 loop
3     insert into honcho values(i);
4    end loop;
5    commit;
6 end;
7 /

PL/SQL procedure successfully completed.

SQL> insert into honcho select * from honcho where rownum < 5000;
insert into honcho select * from honcho where rownum < 5000
*
ERROR at line 1:
ORA-01653: unable to extend table SYS.HONCHO by 8 in tablespace HONCHO

Using DBMS_ROWID, we determine the blocks which this table occupies (if you like, just restrict the query to the first 5 blocks in case the table contains many blocks)
SQL> select * from (select dbms_rowid.rowid_block_number(rowid) from honcho) where rownum < 6;

DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)
------------------------------------
                                  11
                                  11
                                  11
                                  11
                                  11

SQL> alter system switch logfile;

System altered.

SQL> /

System altered.

SQL> /

System altered.

SQL> select count(*) from honcho;

COUNT(*)
----------
70000

We can then corrupt any one of these blocks (in our case block 11) to simulate a block corruption - don't do this in production!

SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
[oracle@honcho ~]$ dd of=/u01/app/oracle/oradata/prim/honcho01.dbf bs=8192 seek=11 conv=notrunc count=1 if=/dev/zero
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 4.5e-05 seconds, 182 MB/s

We now run a query on the Primary database (after flushing the buffer cache first to force a new data block read) and even though we have corrupted a data block, the query completes without an ORA-01578 block corruption error – we notice a slight glitch while the blocks are transported over the network.

But if we examine the database alert log, we will see that a block corruption was detected, but a background process (ABMR) was started which repaired the corrupt blocks.

ALTER SYSTEM: Flushing buffer cache
Hex dump of (file 5, block 11) in trace file /u01/app/oracle/diag/rdbms/prim/prim/trace/prim_ora_5295.trc
Corrupt block relative dba: 0x0140000b (file 5, block 11)
Completely zero block found during multiblock buffer read
Reading datafile '/u01/app/oracle/oradata/prim/honcho01.dbf' for corruption at rdba: 0x0140000b (file 5, block 11)
Reread (file 5, block 11) found same corrupt data
Starting background process ABMR
Tue Oct 11 22:41:34 2011
ABMR started with pid=31, OS id=5312
Auto BMR service is active.
Requesting Auto BMR for (file# 5, block# 11)
Waiting Auto BMR response for (file# 5, block# 11)
Auto BMR successful

Let us see how the same scenario pans out on the Active Standby site.

We run the same block corruption ‘dd’ command now on the standby host and when we run the query the first time, we will get an error as shown below.

SQL> alter system flush buffer_cache;

System altered.

SQL> select count(*) from honcho;
select count(*) from honcho
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 5, block # 11)
ORA-01110: data file 5: ‘/u01/app/oracle/oradata/prim/honcho01.dbf’

But if we run the same query again, we will not see any error as the blocks have now been repaired from the Primary database site.

SQL> select count(*) from honcho;

COUNT(*)
----------
70000