Mysql集群的HA原理及配置指南之字段冲突修复（四）

来源：互联网发布：淘宝卖家自动回复语编辑：程序博客网时间：2024/06/03 16:56

在master-master的构架中，一台mysql机器可能会down掉，但是数据依然是在往另一台写的，但是当down掉的mysql起起来了我们会发现很多问题，其中最明显的就是由于AUTO_INCREMENT的主键导致的数据同步ERROR问题，话不多说，直接干货上：

Start:

1.首先准备2台master-master双主构架的机器，之前的文章（二）有讲这个模式的配置方式，先测试2台机器的同步：

数据库a插入信息:

a ~ $ mysql -u rootmysql> INSERT INTO important.stuff SET details='Gift from A to B';Query OK, 1 row affected (0.03 sec)

数据库b同步：

b ~ $ mysql -u rootmysql> SELECT * FROM important.stuff ORDER BY id DESC LIMIT 1\G*************************** 1. row ***************************      id: 7 details: Gift from A to Bhappened: 2013-03-27 04:09:401 rows in set (0.00 sec)

反过来数据库b插入数据：

mysql> INSERT INTO important.stuff SET details='Gift from B to A';Query OK, 1 row affected (0.03 sec)mysql>

数据库a同步：

mysql> INSERT INTO important.stuff SET details='Gift from B to A';Query OK, 1 row affected (0.03 sec)mysql>

2.在a,b上停止I/O thread,即停止数据备份同步：

mysql> STOP SLAVE IO_THREAD;Query OK, 0 rows affected (0.01 sec)mysql> show slave status\G*************************** 1. row ***************************               Slave_IO_State:                  Master_Host: a.example.com                  Master_User: replicator                  Master_Port: 3306                Connect_Retry: 60              Master_Log_File: mysql-bin.000001          Read_Master_Log_Pos: 3415               Relay_Log_File: mysql-relay-bin.000002                Relay_Log_Pos: 803        Relay_Master_Log_File: mysql-bin.000001             Slave_IO_Running: No            Slave_SQL_Running: Yes              Replicate_Do_DB:          Replicate_Ignore_DB:           Replicate_Do_Table:       Replicate_Ignore_Table:      Replicate_Wild_Do_Table:  Replicate_Wild_Ignore_Table:                   Last_Errno: 0                   Last_Error:                 Skip_Counter: 0          Exec_Master_Log_Pos: 3415              Relay_Log_Space: 959              Until_Condition: None               Until_Log_File:                Until_Log_Pos: 0           Master_SSL_Allowed: No           Master_SSL_CA_File:           Master_SSL_CA_Path:              Master_SSL_Cert:            Master_SSL_Cipher:               Master_SSL_Key:        Seconds_Behind_Master: NULLMaster_SSL_Verify_Server_Cert: No                Last_IO_Errno: 0                Last_IO_Error:               Last_SQL_Errno: 0               Last_SQL_Error:  Replicate_Ignore_Server_Ids:             Master_Server_Id: 201 row in set (0.00 sec)mysql>

3.在a中插入数据：

mysql> INSERT INTO important.stuff SET details='Unique data from A';Query OK, 1 row affected (0.03 sec)

在b中也插入数据：

mysql> INSERT INTO important.stuff SET details='Unique data from B';Query OK, 1 row affected (0.03 sec)

4.在a,b上开启slave:

mysql> START SLAVE;Query OK, 0 rows affected (0.00 sec)

5.检查slave状态，我们会发现会有key的错误：

mysql> show slave status\G*************************** 1. row ***************************               Slave_IO_State: Waiting for master to send event                  Master_Host: b.example.com                  Master_User: replicator                  Master_Port: 3306                Connect_Retry: 60              Master_Log_File: mysql-bin.000001          Read_Master_Log_Pos: 4127               Relay_Log_File: mysql-relay-bin.000003                Relay_Log_Pos: 253        Relay_Master_Log_File: mysql-bin.000001             Slave_IO_Running: Yes            Slave_SQL_Running: No              Replicate_Do_DB:          Replicate_Ignore_DB:           Replicate_Do_Table:       Replicate_Ignore_Table:      Replicate_Wild_Do_Table:  Replicate_Wild_Ignore_Table:                   Last_Errno: 1062                   Last_Error: Error 'Duplicate entry '11' for key 'PRIMARY'' on query. Default database: ''. Query: 'INSERT INTO important.stuff SET details='Unique data from B''                 Skip_Counter: 0          Exec_Master_Log_Pos: 3884              Relay_Log_Space: 1585              Until_Condition: None               Until_Log_File:                Until_Log_Pos: 0           Master_SSL_Allowed: No           Master_SSL_CA_File:           Master_SSL_CA_Path:              Master_SSL_Cert:            Master_SSL_Cipher:               Master_SSL_Key:        Seconds_Behind_Master: NULLMaster_SSL_Verify_Server_Cert: No                Last_IO_Errno: 0                Last_IO_Error:               Last_SQL_Errno: 1062               Last_SQL_Error: Error 'Duplicate entry '11' for key 'PRIMARY'' on query. Default database: ''. Query: 'INSERT INTO important.stuff SET details='Unique data from B''  Replicate_Ignore_Server_Ids:             Master_Server_Id: 201 row in set (0.00 sec)mysql>

到这一步大家应该都知道文章开头说的ERROR原因了，由于auto_increment的key导致两张表里面在slave关闭是插入的数据的id是一样的，但是数据内容不一样这就导致了数据库报错。

6.为解决这个Replication ERROR，我们需要在A中修改配置文件:

mysql> exita ~ $ sudoedit /etc/my.cnf

[mysqld]auto_increment_increment = 2auto_increment_offset = 1relay_log = mysql-relay-binlog_slave_updates = 1log_bin = mysql-binserver_id = 10

重启mysql:

a ~ $ sudo service mysqld restartStopping mysqld:                                           [  OK  ]Starting mysqld:                                           [  OK  ]

在b中同样修改配置文件：

mysql> exitb ~ $ sudoedit /etc/my.cnf

[mysqld]auto_increment_increment = 2auto_increment_offset = 2relay_log = mysql-relay-binlog_slave_updates = 1log_bin = mysql-binserver_id = 20

重启Mysql服务：

b ~ $ sudo service mysqld restartStopping mysqld:                                           [  OK  ]Starting mysqld:                                           [  OK  ]

7.在b中查看ERROR：

b ~ $ mysql -u rootmysql> show slave status\G*************************** 1. row ***************************               Slave_IO_State: Waiting for master to send event                  Master_Host: a.example.com......some content not shown  Replicate_Wild_Ignore_Table:                   Last_Errno: 1062                   Last_Error: Error 'Duplicate entry '11' for key 'PRIMARY'' on query. Default database: ''. Query: 'INSERT INTO important.stuff SET details='Unique data from B''                 Skip_Counter: 0          Exec_Master_Log_Pos: 3882

在表中查找这条id = 11的数据:

mysql> select * from important.stuff where id=11;+----+--------------------+---------------------+| id | details            | happened            |+----+--------------------+---------------------+| 11 | Unique data from B | 2013-04-06 19:33:44 |+----+--------------------+---------------------+1 row in set (0.00 sec)mysql>

关闭b数据库中的sql_log_bin参数（具体作用见前面文章），并检查是否关闭，且删除那条冲突的数据

mysql> SET sql_log_bin="OFF";Query OK, 0 rows affected (0.00 sec)mysql> show variables like "sql_log_bin";+---------------+-------+| Variable_name | Value |+---------------+-------+| sql_log_bin   | OFF   |+---------------+-------+1 row in set (0.00 sec)mysql> delete from important.stuff where id=11;

将sql_log_bin打开并插入和之前相同的数据

mysql> SET sql_log_bin="ON";Query OK, 0 rows affected (0.00 sec)mysql> insert into important.stuff VALUES(NULL, 'Unique data from B', '2013-04-06 19:33:44');Query OK, 1 row affected (0.02 sec)

会发现这次插入的数据id = 12

mysql> select * from important.stuff order by id DESC LIMIT 1;+----+--------------------+---------------------+| id | details            | happened            |+----+--------------------+---------------------+| 12 | Unique data from B | 2013-04-06 19:33:44 |+----+--------------------+---------------------+1 row in set (0.00 sec)mysql>

启动b的slave

mysql> slave start;Query OK, 0 rows affected (0.00 sec)mysql> show slave status\G*************************** 1. row ***************************               Slave_IO_State: Waiting for master to send event                  Master_Host: a.example.com                  Master_User: replicator                  Master_Port: 3306                Connect_Retry: 60              Master_Log_File: mysql-bin.000002          Read_Master_Log_Pos: 107               Relay_Log_File: mysql-relay-bin.000006                Relay_Log_Pos: 253        Relay_Master_Log_File: mysql-bin.000002             Slave_IO_Running: Yes            Slave_SQL_Running: Yes              Replicate_Do_DB:          Replicate_Ignore_DB:           Replicate_Do_Table:       Replicate_Ignore_Table:      Replicate_Wild_Do_Table:  Replicate_Wild_Ignore_Table:                   Last_Errno: 0                   Last_Error:                 Skip_Counter: 0          Exec_Master_Log_Pos: 107              Relay_Log_Space: 555              Until_Condition: None               Until_Log_File:                Until_Log_Pos: 0           Master_SSL_Allowed: No           Master_SSL_CA_File:           Master_SSL_CA_Path:              Master_SSL_Cert:            Master_SSL_Cipher:               Master_SSL_Key:        Seconds_Behind_Master: 0Master_SSL_Verify_Server_Cert: No                Last_IO_Errno: 0                Last_IO_Error:               Last_SQL_Errno: 0               Last_SQL_Error:  Replicate_Ignore_Server_Ids:             Master_Server_Id: 101 row in set (0.00 sec)

会发现a的数据同步进来了，b已恢复正常

mysql> select * from important.stuff where id >= 11;+----+--------------------+---------------------+| id | details            | happened            |+----+--------------------+---------------------+| 11 | Unique data from A | 2013-04-06 19:33:44 || 12 | Unique data from B | 2013-04-06 19:33:44 |+----+--------------------+---------------------+2 rows in set (0.00 sec)mysql>

在a数据库中，我们需要避免b中的之前的冲突数据插入进来

a ~ $ mysql -u rootmysql> SET GLOBAL sql_slave_skip_counter = 1;Query OK, 0 rows affected (0.00 sec)mysql>

启动a的slave

mysql> slave start;Query OK, 0 rows affected (0.00 sec)mysql>

会发现a表存进了b中id=12的数据，此时a也恢复正常

mysql> select * from important.stuff where id >= 11;+----+--------------------+---------------------+| id | details            | happened            |+----+--------------------+---------------------+| 11 | Unique data from A | 2013-04-06 19:33:44 || 12 | Unique data from B | 2013-04-06 19:33:44 |+----+--------------------+---------------------+2 rows in set (0.00 sec)mysql>

总结：

由于自动增加的主键在双主模式数据库down掉时发生的数据conflict问题，我们可以通过修改/etc/my.cnf配置文件修改auto increment的步长从而解决ERROR。

暂时博主也还没有发现其他特别好的方法。

0 0