TimesTen 数据库复制学习:19. 解决复制冲突

来源:互联网 发布:小说下载软件 编辑:程序博客网 时间:2024/04/27 08:19

本文针对的是classic复制的双向复制模式

复制冲突是如何产生的

有update冲突,insert冲突,delete/update冲突,在本节中都有示例说明。基本都是由于两个master同时操作同一条记录所致,导致从两方发出的操作交织在一起,导致不一致,如果是串行的倒也无所谓了。还需强调一点,这些冲突可以检测到,但并不会自动解决,需要应用来处理

Within a single database, update conflicts are prevented by the locking protocol: only one transaction at a time can update a specific row in the database. However, update conflicts can occur in replicated systems due to the ability of each database to operate independently

Update conflicts: This type of conflict occurs when concurrently running transactions at different databases make simultaneous update requests on the same row in the same table, and install different values for one or more columns.

Uniqueness conflicts: This type of conflict occurs when concurrently running transactions at different databases make simultaneous insert requests for a row in the same table that has the same primary or unique key, but different values for one or more other columns.

Delete conflicts: This type of conflict occurs when a transaction at one database deletes a row while a concurrent transaction at another database simultaneously updates or inserts the same row. Currently, TimesTen can detect delete/update conflicts, but cannot detect delete/insert conflicts. TimesTen cannot resolve either type of delete conflict.

在实际的应用中,复制冲突是很难模拟也是很难发现的,后面的例子通过停止复制代理可以模拟和发现此情况。

update冲突

master1> update a1 set v = 101 where k = 1;1 row updated.master2>  update a1 set v = 102 where k = 1;1 row updated.master2> select * from a1;< 1, 102 >1 row found.

更常见的应用是两个应用更改的同一主键记录的不同非主键部分。

insert冲突(唯一性冲突)

insert冲突则是指两方同时插入一条数据,主键一致,但其它列的值不一样

master1> insert into a1 values(2, 11);1 row inserted.master2>  insert into a1 values(2, 12);  907: Unique constraint (A1 on ORACLE.A1) violated at Rowid <BMUFVUAAACZAAAAJBp>The command failed.

delete冲突

master1> delete from a1 where k = 2;1 row deleted.master2> update a1 set v = 102 where k = 2;0 rows updated.

简单来说,以上的冲突都发生在针对同一个主键的同一条记录上。
但也可以不算是冲突,毕竟TimesTen保留最后的一个交易

Although TimesTen cannot ensure synchronization between databases following such a conflict, it does ensure that the most recent transaction is applied to each database.

使用 timestamp 解决冲突

For replicated tables that are subject to conflicts, create the table with a special column of type BINARY(8) to hold a timestamp value that indicates the time the row was inserted or last updated.

简单来说,为复制的表添加一个timestamp列,系统自动产生,或应用写入时间。然后最新的时间所属的交易将胜出。

the conflict is resolved by comparing the two timestamp values and the operation with the larger timestamp wins

配置 timestamp 比较

  1. 为复制的表加一列BINARY(8),必须在建表时建立,不能后续ALTER TABLE来加
  2. 在创建rep scheme时,指定检测冲突的列
    CREATE REPLICATION r1
    ELEMENT elem_accounts_1 TABLE accounts
    CHECK CONFLICTS BY ROW TIMESTAMP
    COLUMN tstamp
    UPDATE BY SYSTEM - 系统自动产生时间戳,也可以用UPDATE BY USER,让应用来产生时间戳
    ON EXCEPTION ROLLBACK WORK

可见保持时间同步在数据复制环境下的重要性了,通过时间戳,就可以决定选择保留哪些交易

汇报冲突

可以以文本,XML的形式产生汇报冲突。

可以设定阈值,当产生冲突太多,停止产生汇报冲突。

即使不用REPORT TO ‘conflict.txt’ FORMAT STANDARD 来设置,也会记录在tterrors.log中

唯一性冲突示例

相对于之前简单的双向复制,添加的主要是下面的语句:

CHECK CONFLICTS BY ROW TIMESTAMP    COLUMN t    UPDATE BY SYSTEM    ON EXCEPTION ROLLBACK WORK

在master1和master2上同时执行以下:

create table a1(k int, v int, t binary(8), primary key(k));CREATE REPLICATION r1ELEMENT elem1 TABLE a1  CHECK CONFLICTS BY ROW TIMESTAMP    COLUMN t    UPDATE BY SYSTEM    ON EXCEPTION ROLLBACK WORK    REPORT TO 'conflict.txt' FORMAT STANDARD  MASTER master1 ON "timesten-hol"  SUBSCRIBER master2 ON "timesten-hol" RETURN RECEIPTELEMENT elem2 TABLE a1  CHECK CONFLICTS BY ROW TIMESTAMP    COLUMN t    UPDATE BY SYSTEM    ON EXCEPTION ROLLBACK WORK    REPORT TO 'conflict.txt' FORMAT STANDARD  MASTER master2 ON "timesten-hol"  SUBSCRIBER master1 ON "timesten-hol" RETURN RECEIPT;call ttrepstart;

查看复制策略, 注意第一部分带Conflict的部分

master2> repschemes;Replication Scheme ORACLE.R1:  Element: ELEM1                            Type: Table ORACLE.A1  Conflict Check Column: T                                Conflict Exception Action: Rollback Work  Conflict Timestamp Update: System  Conflict Report File: /home/oracle/TimesTen/tt1122/info/conflict.txt  Conflict Report Format: Standard  Master Store: MASTER1 on TIMESTEN-HOL Transmit Durable  Subscriber Store: MASTER2 on TIMESTEN-HOL Return Receipt  Element: ELEM2                            Type: Table ORACLE.A1  Conflict Check Column: T                                Conflict Exception Action: Rollback Work  Conflict Timestamp Update: System  Conflict Report File: /home/oracle/TimesTen/tt1122/info/conflict.txt  Conflict Report Format: Standard  Master Store: MASTER2 on TIMESTEN-HOL Transmit Durable  Subscriber Store: MASTER1 on TIMESTEN-HOL Return Receipt  Store: MASTER1 on TIMESTEN-HOL    Port: (auto)    Log Fail Threshold: (none)    Retry Timeout: 120 seconds    Compress Traffic: Disabled    Return Service Wait Time: 10 seconds    Return Service on Replication Stop: Disabled    Return Service Failure Policy: (none)  Store: MASTER2 on TIMESTEN-HOL    Port: (auto)    Log Fail Threshold: (none)    Retry Timeout: 120 seconds    Compress Traffic: Disabled    Return Service Wait Time: 10 seconds    Return Service on Replication Stop: Disabled    Return Service Failure Policy: (none)1 replication scheme found.

看一下正常复制的行为:

master1> insert into a1 values(1, 1);1 row inserted.master1> select * from a1;< 1, 1, 5779F4B0000C132F >1 row found.master2> select * from a1;< 1, 1, 5779F4B0000C132F >1 row found.

模拟一个insert冲突,可以看到在master1上先插入的数据由于时间戳较老,而被抛弃了。

master1> delete from a1;master2> call ttrepstop;master1> select * from a1;0 rows found.master1> insert into a1 values(1, 1);Warning  8170: Receipt or commit acknowledgement not returned in the specified timeout interval for XID:1.1831 row inserted.master1> select * from a1;< 1, 1, 5779F9BB000DEA5C >1 row found.master2> insert into a1 values(1, 2);1 row inserted.master2> select * from a1;< 1, 2, 5779F9E2000C7F90 >1 row found.master2> call ttrepstart;master1> select * from a1;< 1, 2, 5779F9E2000C7F90 >1 row found.master2> select * from a1;< 1, 2, 5779F9E2000C7F90 >1 row found.

查看错误日志:

[oracle@timesten-hol info]$ cat conflict.txt Conflict detected at 22:54:05 on 07-03-2016Datastore : /tmp/master2Transmitting name : MASTER1Table : ORACLE.A1Conflicting insert tuple timestamp: 5779F9BB000DEA5CExisting tuple timestamp: 5779F9E2000C7F90The existing tuple : <                                                              1,                                                               2, 5779F9E2000C7F90>The conflicting tuple: <                                                              1,                                                               1, 5779F9BB000DEA5C>The key columns for the tuple:<K :1>Transaction containing this insert skippedFailed transaction:Insert into table ORACLE.A1 <                                                              1,                                                               1, 5779F9BB000DEA5C>End of failed transaction在tterrors.log中的记录:22:54:05.34 Err : REP:  4741: [1115072832, 0, noAwt] MASTER2:receiver.c(15509): TT16213: Table: ORACLE.A1. Failed to insert row. TimeStamp conflict

update冲突示例

接着上一个例子做, 两个数据库同时更新一条记录

master1> select * from a1;< 1, 2, 5779F9E2000C7F90 >1 row found.master2> select * from a1;< 1, 2, 5779F9E2000C7F90 >1 row found.master2> call ttrepstop;master1> update a1 set v = 11 where k = 1;Warning  8170: Receipt or commit acknowledgement not returned in the specified timeout interval for XID:1.1951 row updated.master2> update a1 set v = 22 where k = 1;1 row updated.master2> call ttrepstart;master1> select * from a1;< 1, 22, 5779FC6600083A21 >1 row found.master2> select * from a1;< 1, 22, 5779FC6600083A21 >1 row found.

查看错误日志:

[oracle@timesten-hol info]$ cat conflict.txt Conflict detected at 23:04:29 on 07-03-2016Datastore : /tmp/master2Transmitting name : MASTER1Table : ORACLE.A1Conflicting update tuple timestamp: 5779FC49000CC649Existing tuple timestamp: 5779FC6600083A21The existing tuple :<                                                              1,                                                              22, 5779FC6600083A21>The conflicting tuple:<V :11, T :5779FC49000CC649>The old values in the conflicting update:<V :2, T :5779F9E2000C7F90>The key columns for the tuple:<K :1>Transaction containing this update skippedFailed transaction:Update table ORACLE.A1 with keys:<K :1>New tuple value: <V :11, T :5779FC49000CC649>End of failed transaction在tterrors.log中的记录:23:04:29.63 Err : REP:  4815: [1105008960, 0, noAwt] MASTER2:receiver.c(15509): TT16215: Table: ORACLE.A1. Failed to update row. TimeStamp conflict

delete冲突示例

接着上一个例子做, 两个数据库同时更新一条记录

master1> select * from a1;< 1, 2, 5779F9E2000C7F90 >1 row found.master2> select * from a1;< 1, 2, 5779F9E2000C7F90 >1 row found.master2> call ttrepstop;master1> delete from a1 where k = 1;Warning  8170: Receipt or commit acknowledgement not returned in the specified timeout interval for XID:1.2011 row deleted.master1> select * from a1;0 rows found.master2> update a1 set v = 111 where k = 1;1 row updated.master2> select * from a1;< 1, 111, 5779FEB5000BB8AE >1 row found.master2> call ttrepstart;master2> select * from a1;< 1, 111, 5779FEB5000BB8AE >1 row found.master1> select * from a1;0 rows found.

问题来了,和之前的两个实验不同,这里两个表的数据是不一致的,也就是说,并非所有的复制冲突都可以通过时间戳解决。delete冲突就必须通过人工干预才能解决。

An update operation that cannot find the updated row is considered a delete conflict, which is reported but cannot be resolved.

查看错误日志:

[oracle@timesten-hol info]$ cat conflict.txt Conflict detected at 23:15:15 on 07-03-2016Datastore : /tmp/master2Transmitting name : MASTER1Table : ORACLE.A1Conflicting delete tuple timestamp: 5779FE80000B58D1Existing tuple timestamp: 5779FEB5000BB8AEThe existing tuple :<                                                              1,                                                             111, 5779FEB5000BB8AE>The key columns for the tuple:<K :1>Transaction containing this delete skippedFailed transaction:Delete table ORACLE.A1 with keys:<K :1>End of failed transaction在tterrors.log中的记录:23:15:15.58 Err : REP:  4885: [1087179072, 0, noAwt] MASTER2:receiver.c(15509): TT16214: Table: ORACLE.A1. Failed to delete row. TimeStamp conflict

接下来我们看一下如何人工让表保持一致, 一种方法是将master2中这条数据删除:

master2> delete from a1 where k = 1;

这时虽然日志中会报警,不过数据毕竟已经删除了

23:25:20.78 Err : REP:  4720: [1097214272, 0, noAwt] MASTER1:receiver.c(11929): TT16080: Table: ORACLE.A1. Failed to lookup row <K :1> from master MASTER2 for 'delete'

总结

  • classic复制中双向复制是典型的配置
  • 双向复制中,由于两个库可以同时更改,可能产生复制冲突
  • 冲突可以通过设置系统自动产生或应用自动写入的时间戳来检测,一些冲突可以自动解决,一些冲突需要人工干预
  • 复制系统的时间同步非常重要
  • 尽量还是不要让两个数据库执行相同的工作负载,最好是各负责不同的应用,也就是双向复制,但不是严格的双活
0 0