Oracle Transparent Application Failover (TAF) Explained


A few days ago a friend and I were discussing the advanced features of Oracle Net Services, so I took a closer look at them.

 

Reference from the official Oracle documentation:

      Enabling Advanced Features of Oracle Net Services

       http://download.oracle.com/docs/cd/B19306_01/network.102/b14212/advcfg.htm#i473297

 

That document covers several Net Services features:

·         Configuring Advanced Network Address and Connect Data Information

·         Configuring Runtime Connection Load Balancing

·         Configuring Transparent Application Failover

·         Configuring Connections to Non-Oracle Database Services

 

This article focuses on TAF.

Configuring Transparent Application Failover

http://download.oracle.com/docs/cd/B19306_01/network.102/b14212/advcfg.htm#i475648

 

TAF is also covered in my earlier article on RAC failover:

       Oracle RAC Failover Explained

       http://blog.csdn.net/xujinyang/article/details/6829647

 

 

I.  Introduction to TAF

 

1.1  The official description of TAF:

Transparent Application Failover (TAF) is a client-side feature that allows for clients to reconnect to surviving databases in the event of a failure of a database instance. Notifications are used by the server to trigger TAF callbacks on the client-side.

TAF is configured using either client-side specified TNS connect string or using server-side service attributes. However, if both methods are used to configure TAF, the server-side service attributes will supersede the client-side settings. The server-side service attributes are the preferred way to set up TAF.

TAF can operate in one of two modes, Session Failover and Select Failover. Session Failover will recreate lost connections and sessions. Select Failover will replay queries that were in progress.

When there is a failure, callback functions will be initiated on the client-side via OCI callbacks. This will work with standard OCI connections as well as Connection Pool and Session Pool connections. Please see the OCI manual for more details on callbacks, Connection Pools, and Session Pools.

TAF will work with RAC. For more details and recommended configurations, please see the RAC Administration Guide.

TAF will operate with Physical Data Guard to provide automatic failover.

 

1.2  Where TAF applies

       TAF works with the following database configurations to effectively mask a database failure:

(1)Oracle Real Application Clusters

(2)Replicated systems

(3)Standby databases

(4)Single instance Oracle database

 

 

1.3  The FAILOVER_MODE parameter

 

The FAILOVER_MODE parameter must be placed in the CONNECT_DATA section of a connect descriptor. It can contain the subparameters described below:

 

FAILOVER_MODE subparameters and their descriptions:

BACKUP: Specify a different net service name for backup connections. A backup should be specified when using preconnect to pre-establish connections.

TYPE: Specify the type of failover. Three types of Oracle Net failover functionality are available by default to Oracle Call Interface (OCI) applications:

·         session: Set to failover the session. If a user's connection is lost, a new session is automatically created for the user on the backup. This type of failover does not attempt to recover selects.

·         select: Set to enable users with open cursors to continue fetching on them after failure. However, this mode involves overhead on the client side in normal select operations.

·         none: This is the default. No failover functionality is used. This can also be explicitly specified to prevent failover from happening.

METHOD: Determines how fast failover occurs from the primary node to the backup node:

·         basic: Set to establish connections at failover time. This option requires almost no work on the backup server until failover time.

·         preconnect: Set to pre-established connections. This provides faster failover but requires that the backup instance be able to support all connections from every supported instance.

RETRIES: Specify the number of times to attempt to connect after a failover. If DELAY is specified, RETRIES defaults to five retry attempts.

Note: If a callback function is registered, then this subparameter is ignored.

DELAY: Specify the amount of time in seconds to wait between connect attempts. If RETRIES is specified, DELAY defaults to one second.

Note: If a callback function is registered, then this subparameter is ignored.
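The subparameters above combine into a single FAILOVER_MODE clause. As a rough sketch only, the Python function below assembles such a clause as text and writes out the RETRIES/DELAY defaults explicitly. The helper name failover_mode_clause and its argument names are hypothetical and not part of any Oracle tool; it formats a string and does nothing else.

```python
VALID_TYPES = {"session", "select", "none"}
VALID_METHODS = {"basic", "preconnect"}


def failover_mode_clause(type_="none", method=None, backup=None,
                         retries=None, delay=None):
    """Format a FAILOVER_MODE clause for a tnsnames.ora connect descriptor.

    Hypothetical helper for illustration: it only builds text from the
    subparameters described above.
    """
    if type_ not in VALID_TYPES:
        raise ValueError(f"TYPE must be one of {sorted(VALID_TYPES)}")
    if method is not None and method not in VALID_METHODS:
        raise ValueError(f"METHOD must be one of {sorted(VALID_METHODS)}")

    # Make the documented defaults explicit: RETRIES defaults to 5 when only
    # DELAY is given, and DELAY defaults to 1 second when only RETRIES is given.
    if delay is not None and retries is None:
        retries = 5
    if retries is not None and delay is None:
        delay = 1

    parts = [("BACKUP", backup), ("TYPE", type_), ("METHOD", method),
             ("RETRIES", retries), ("DELAY", delay)]
    body = "".join(f"({key}={value})" for key, value in parts if value is not None)
    return f"(FAILOVER_MODE={body})"
```

Calling failover_mode_clause("select", "basic", retries=20, delay=15) yields the same clause that appears in the retry example in section 2.3.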

 

 

 

II.  TAF examples

 

2.1 Notes:

       Do not set the GLOBAL_DBNAME parameter in the SID_LIST_listener_name section of listener.ora. A statically configured global database name disables TAF.

      

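       Connect-time failover is enabled by adding FAILOVER=ON to the entry in the client's tnsnames.ora. This parameter defaults to ON, so the client gets connect-time failover even if the entry is omitted.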

 

 

2.2  TAF with Connect-Time Failover and Client Load Balancing

 

sales.us.acme.com=
 (DESCRIPTION=
  (LOAD_BALANCE=on)
  (FAILOVER=on)
  (ADDRESS=
       (PROTOCOL=tcp)
       (HOST=sales1-server)
       (PORT=1521))
  (ADDRESS=
       (PROTOCOL=tcp)
       (HOST=sales2-server)
       (PORT=1521))
  (CONNECT_DATA=
     (SERVICE_NAME=sales.us.acme.com)
     (FAILOVER_MODE=
       (TYPE=select)
       (METHOD=basic))))

 

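       In this example, Oracle Net connects to one of the two addresses chosen at random. If that connection attempt fails, it tries the other address. If the instance fails after the connection has been established, TAF fails the session over to the other node, and SELECT statements in progress continue fetching.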
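The connect-time behaviour described here can be modelled in a few lines. The sketch below is a simplified illustration of client load balancing plus connect-time failover, not Oracle Net's actual implementation; the function connect_with_failover and the connect callable passed to it are hypothetical names introduced for this example.

```python
import random


def connect_with_failover(addresses, connect, load_balance=True,
                          failover=True, rng=random):
    """Pick an address and fall back to the others if it cannot be reached.

    load_balance=True models LOAD_BALANCE=on (try addresses in random order);
    failover=True models FAILOVER=on (move on to the next address on error).
    `connect` is any callable that returns a session or raises ConnectionError.
    """
    candidates = list(addresses)
    if load_balance:
        rng.shuffle(candidates)
    if not failover:
        # Without connect-time failover only the first candidate is tried.
        candidates = candidates[:1]

    last_error = None
    for address in candidates:
        try:
            return address, connect(address)
        except ConnectionError as exc:
            last_error = exc
    raise ConnectionError(f"all addresses failed: {candidates}") from last_error
```

With the two hosts from the example and a connect callable that fails for sales1-server, the function always ends up on sales2-server, whichever address the shuffle puts first.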

 

 

2.3  TAF Retrying a Connection

 
sales.us.acme.com=
(DESCRIPTION=
  (ADDRESS=
       (PROTOCOL=tcp) 
       (HOST=sales1-server) 
       (PORT=1521))
  (CONNECT_DATA=
     (SERVICE_NAME=sales.us.acme.com)
     (FAILOVER_MODE=
       (TYPE=select)
       (METHOD=basic)
       (RETRIES=20)
       (DELAY=15))))

 

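       In this example only one ADDRESS is configured, together with the RETRIES and DELAY subparameters. After a connection failure, Oracle Net waits 15 seconds and then tries to reconnect to the same address, up to 20 times.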
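To make the RETRIES/DELAY interaction concrete, here is a minimal Python sketch of the retry loop as described in this section: wait, try again, and give up after the configured number of attempts. It is an assumption-laden model rather than Oracle Net's real logic (the exact timing of the first attempt may differ), and the names reconnect, connect, and sleep are hypothetical.

```python
import time


def reconnect(connect, retries=20, delay=15, sleep=time.sleep):
    """Retry connect() up to `retries` times, waiting `delay` seconds each time.

    `connect` is any callable that returns a session or raises ConnectionError.
    `sleep` is injectable so the loop can be exercised without real waiting.
    Returns (attempt_number, session) on success.
    """
    for attempt in range(1, retries + 1):
        sleep(delay)  # wait before each reconnect attempt
        try:
            return attempt, connect()
        except ConnectionError:
            continue  # still unreachable, try again after the next delay
    raise ConnectionError(f"gave up after {retries} attempts")
```

Under this model, RETRIES=20 and DELAY=15 give the server up to 20 × 15 = 300 seconds to come back before the client reports an error.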

 

 

2.4  TAF Pre-Establishing a Connection

 

sales1.us.acme.com=
(DESCRIPTION=
  (ADDRESS=
       (PROTOCOL=tcp) 
       (HOST=sales1-server) 
       (PORT=1521))
  (CONNECT_DATA=
     (SERVICE_NAME=sales.us.acme.com)
     (INSTANCE_NAME=sales1)
     (FAILOVER_MODE=
       (BACKUP=sales2.us.acme.com)
       (TYPE=select)
       (METHOD=preconnect))))
sales2.us.acme.com=
(DESCRIPTION=
  (ADDRESS=
       (PROTOCOL=tcp) 
       (HOST=sales2-server) 
       (PORT=1521))
  (CONNECT_DATA=
     (SERVICE_NAME=sales.us.acme.com)
     (INSTANCE_NAME=sales2)
     (FAILOVER_MODE=
       (BACKUP=sales1.us.acme.com)
       (TYPE=select)
       (METHOD=preconnect))))

 

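       Here the method is set to preconnect: when the initial connection is made, a connection to the backup instance is established at the same time, so the session can switch over immediately when a failure occurs.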

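       With BASIC there is a delay at failover time. PRECONNECT avoids that delay, but the redundant connections consume more resources. The choice between the two is a trade-off between time and resources.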

 

       Note that the BACKUP subparameter must be specified when the preconnect method is used.
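A rule like this is easy to check mechanically before a tnsnames.ora entry is deployed. The following Python sketch parses the parenthesised part of a connect descriptor (everything after the "alias=" prefix) and reports any FAILOVER_MODE that uses METHOD=preconnect without BACKUP. The parser handles only the simple nested (KEY=value) form used in this article, and the names parse_descriptor, find, and check_preconnect are hypothetical helpers, not Oracle utilities.

```python
def parse_descriptor(text):
    """Parse '(KEY=value)' or '(KEY=(...)(...))' into a (key, value) tree."""
    text = "".join(text.split())  # whitespace is not significant in this form
    pos = 0

    def parse_node():
        nonlocal pos
        if text[pos:pos + 1] != "(":
            raise ValueError(f"expected '(' at offset {pos}")
        eq = text.index("=", pos)
        key = text[pos + 1:eq].upper()
        pos = eq + 1
        if text[pos:pos + 1] == "(":
            value = []
            while text[pos:pos + 1] == "(":  # one or more nested nodes
                value.append(parse_node())
        else:
            end = text.index(")", pos)  # plain value runs to the closing paren
            value = text[pos:end]
            pos = end
        if text[pos:pos + 1] != ")":
            raise ValueError(f"expected ')' at offset {pos}")
        pos += 1
        return key, value

    return parse_node()


def find(node, key):
    """Yield the value of every node named `key` anywhere in the tree."""
    name, value = node
    if name == key:
        yield value
    if isinstance(value, list):
        for child in value:
            yield from find(child, key)


def check_preconnect(descriptor_text):
    """Return a list of problems; an empty list means the rule is satisfied."""
    problems = []
    for mode in find(parse_descriptor(descriptor_text), "FAILOVER_MODE"):
        if not isinstance(mode, list):
            continue  # FAILOVER_MODE without subparameters, nothing to check
        settings = dict(mode)
        method = settings.get("METHOD", "")
        if isinstance(method, str) and method.lower() == "preconnect" \
                and "BACKUP" not in settings:
            problems.append("METHOD=preconnect without BACKUP")
    return problems
```

Running check_preconnect on the sales1.us.acme.com descriptor above returns an empty list; removing its BACKUP line makes it return one problem.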

 

 

III.  Verifying TAF with Data Guard

       I have tested TAF on RAC many times before, so this time I verify it with Data Guard.

 

Add the following entry to the tnsnames.ora file on the client:

TAFTEST=
 (DESCRIPTION=
  (LOAD_BALANCE=on)
  (FAILOVER=on)
  (ADDRESS= (PROTOCOL=tcp) (HOST=192.168.6.2) (PORT=1521))
  (ADDRESS= (PROTOCOL=tcp) (HOST=192.168.6.3) (PORT=1521))
  (CONNECT_DATA= (SERVICE_NAME=orcl)
     (FAILOVER_MODE=
       (TYPE=select)
       (METHOD=basic))))

 

Test it with tnsping:

C:/Users/Administrator.DavidDai>tnsping taftest

TNS Ping Utility for 32-bit Windows: Version 11.2.0.1.0 - Production on 13-DEC-2010 00:37:08

Copyright (c) 1997, 2010, Oracle.  All rights reserved.

Used parameter files:

D:/app/Administrator/product/11.2.0/dbhome_1/network/admin/sqlnet.ora

 

Used TNSNAMES adapter to resolve the alias

Attempting to contact (DESCRIPTION= (LOAD_BALANCE=on) (FAILOVER=on) (ADDRESS= (PROTOCOL=tcp) (HOST=192.168.6.2) (PORT=1521)) (ADDRESS= (PROTOCOL=tcp) (HOST=192.168.6.3) (PORT=1521)) (CONNECT_DATA= (SERVICE_NAME=orcl) (FAILOVER_MODE= (TYPE=select) (METHOD=basic))))

OK (20 msec)

 

C:/Users/Administrator.DavidDai>sqlplus /nolog

SQL*Plus: Release 11.2.0.1.0 Production on Mon Dec 13 00:40:49 2010

Copyright (c) 1982, 2010, Oracle.  All rights reserved.

SQL>  conn sys/oracle@taftest as sysdba;

Connected.

SQL>  select db_unique_name from v$database;

DB_UNIQUE_NAME

------------------------------

orcl_pd

 

Now shut down the primary database and run the query again:

SQL> select db_unique_name from v$database;

DB_UNIQUE_NAME

------------------------------

orcl_st

 

The session is now connected to the standby database. The standby is in mount standby mode, which we can confirm:

SQL> select open_mode from v$database;

OPEN_MODE

----------

MOUNTED

SQL>

 

The TAF failover succeeded.

 

       We can also verify the TAF configuration by checking the FAILOVER_TYPE, FAILOVER_METHOD, and FAILED_OVER columns of the V$SESSION view.

 

The SQL is as follows:

SQL> SELECT MACHINE, FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER, COUNT(*)

FROM V$SESSION

GROUP BY MACHINE, FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER;

 

MACHINE              FAILOVER_TYPE FAILOVER_M FAI   COUNT(*)

-------------------- ------------- ---------- --- ----------

dg1                  NONE          NONE       NO           2

dg2                  NONE          NONE       NO          15

WORKGROUP/DAVIDDAI   SELECT        BASIC      YES          1

 

 

 

 

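      The purpose of this test is to make Data Guard role changes easier to handle. An application normally connects to a single database instance. Suppose that database is part of a Data Guard configuration: if an incident forces a switch from primary to standby, the IP address changes and the application can no longer connect. That is the point of configuring TAF for Data Guard. With TAF in place, the application can still connect to the database after a switch, without anyone having to change the IP address.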

 

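      Of course, when there are many clients, updating the Net configuration on each of them is tedious. Most systems today, however, connect to the database through middleware, in which case only the connection between the middleware and the database needs to be configured.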

 

 

 

 

 

 
