MySQL OSC的实现

来源:互联网 发布:ubuntu 升级安装软件 编辑:程序博客网 时间:2024/04/30 05:34

背景:产品需求经常变更的情况下,造成了线上数据库的表结构需要不停地进行修改。

5.6版本之前,在线变更表结构需要在备库执行alter table操作,然后将备库切为主库;或者使用第三方工具:OAK的oak-online-alter-table或者pt-online-schema-change等实现Online DDL操作。而MySQL 5.6官方提供了一个Online DDL功能。


官方Online DDL

首先介绍下官方的Online DDL,执行方式分为 INPLACE 和 COPY 两种执行方式。

ALGORITHM [=] {DEFAULT|INPLACE|COPY}

DEFAULT:显示地指定该参数与不去指定效果相同。指定为该参数意味着,先尝试使用ALGORITM=INPLACE,如果不被支持则转变为ALGORITHM=COPY

INPLACE:避免重建表。相较INPLACE减少了IO和CPU消耗

COPY:复制原始表。如果是大表复制到临时表会占用buffer pool,内存大量消耗,影响性能。


LOCK [=] {DEFAULT|NONE|SHARED|EXCLUSIVE}

DEFAULT:根据给定的ALGORITHM提供尽可能大的并发性:尽量支持并发读写,如果不被支持则降级为仅支持并发读,如果仍然不被支持则降级为堵塞读写

NONE:可支持并发读写

SHARED:支持并发读,但堵塞写

EXCLUSIVE:堵塞读写(即使ALGORITHM中支持并发操作)


5.6 Online DDL执行过程:

一、prepare阶段

语法检查,合理性、冲突检查

对原表创建临时frm文件

在原表上加表级排他锁(Exclusive-MDL),禁止读写

根据alter table类型确定执行方式:in-replace(Online-rebuild、Online-norebuild),copy

更新数据字典的内存对象,系统表中创建索引

分配row_log对象记录增量日志,增量日志用于记录:DDL操作过程中,记录DML操作对数据的修改。log大小由 innodb_online_alter_log_max_size 决定

执行方式为rebuild,则生成临时ibd文件,提交数据字典操作的事务,释放数据字典的锁


二、DDL执行阶段

降级Exclusive-MDL锁,允许读写

扫描原表的聚簇索引每条记录

遍历新表的聚簇索引和二级索引,逐一处理

根据记录构造对应的索引项

将构造的索引项插入sort_buffer块

利用sort_buffer构造新的索引

如果是rebuild的操作,则还需要处理DDL执行过程中产生的增量,应用row_log,将新数据加入到ibd文件中


三、Commit阶段

升级Exclusive-MDL锁,禁止读写

前一次应用日志到本阶段升级Exclusive-MDL锁这段时间之间的row_log中,可能新产生了日志,再次应用之。

更新innodb的数据字典表

提交事务(刷事务的redo日志)

修改统计信息(数据字典、索引信息等)

rename临时ibd文件、frm文件

变更完成。


几个关键参数:

innodb_online_alter_log_max_size:DDL操作期间产生的日志,保存在内存中,大小由该参数控制,默认128M。可基于会话级别动态调整。

如果产生的日志大于该值,则会抛出如下错误:

Error:1799SQLSTATE:HY000(ER_INNODB_ONLINE_LOG_TOO_BIG)Message: Creating index 'idx_aaa' required more than 'innodb_online_alter_log_max_size' bytes of modification log. Please try again.

old_alter_table:Alter table执行方式(copy、in-place、默认OFF)

tmpdir:DDL执行阶段,构造索引过程中排序时内存空间不足时,需要的临时空间


oak-online-alter-table

oak-online-alter-table采用的是copy的方式执行DDL,执行期间新增的DML产生的数据通过一个触发器同步到临时表。

使用oak-online-alter-table的注意点:

必须为单列唯一索引(联合索引和联合主键不可以,否则触发mysql的一个bug)

不能存在外键、不能存在触发器(对于已有触发器,先备份再删除,再执行oak ddl)

Select * from information_schema.key_column_usage where Table_schema=@dbname and table_name=@tablename and Referenced_table_name is not null ;
Select * from information_schema.key_column_usage where Referenced_table_schema=@dbname and Referenced_table_name=@tablename;

执行前检查是否存在大查询,导致Online DDL失败

执行前预估执行时间,选择业务低谷期执行

执行完之后,需要进行数据校验,检查原表和复制的临时表的数据一致性(是因为DDL如果改变表了字段类型,可能导致数据变化)

把临时表切换为正式表

删除OAK触发器

删除原表


1)使用sysbench创建测试表,表结构如下

mysql> use sysbench;mysql> show tables;+--------------------+| Tables_in_sysbench |+--------------------+| sbtest1            |+--------------------+1 row in set (0.00 sec)mysql> show create table sbtest1 \G*************************** 1. row ***************************       Table: sbtest1Create Table: CREATE TABLE `sbtest1` (  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,  `k` int(10) unsigned NOT NULL DEFAULT '0',  `c` char(120) NOT NULL DEFAULT '',  `pad` char(60) NOT NULL DEFAULT '',  PRIMARY KEY (`id`),  KEY `k_1` (`k`)) ENGINE=InnoDB AUTO_INCREMENT=20001 DEFAULT CHARSET=utf8mb4 MAX_ROWS=1000000
表数据2万行
mysql> select count(*) from sbtest1;+----------+| count(*) |+----------+|    20000 |+----------+1 row in set (0.01 sec)

2)检查外键、触发器情况

mysql> select TRIGGER_SCHEMA,TRIGGER_NAME,EVENT_OBJECT_SCHEMA,EVENT_OBJECT_TABLE from information_schema.TRIGGERS where EVENT_OBJECT_SCHEMA='sysbench';Empty set (0.00 sec)mysql> mysql> Select * from information_schema.key_column_usage where Table_schema="sysbench" and table_name="sbtest1" and Referenced_table_name is not null;Empty set (0.01 sec)mysql> mysql> Select * from information_schema.key_column_usage where Referenced_table_schema="sysbench" and Referenced_table_name="sbtest1";Empty set (0.05 sec)

3)使用OAK工具包内的oak-online-alter-table进行在线DDL操作(以增加表字段:last_update_time 和索引:lut 为例)

每次从原表中取出的行数: -c CHUNK_SIZE, --chunk-size=CHUNK_SIZENumber of rows to act on in chunks. Default: 1000

[root@237_12 ~]# oak-online-alter-table -uroot --ask-pass -S /tmp/mysqld.sock -d sysbench -t sbtest1 -g new_sbtest1 -a "add last_update_time timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,add key lut(last_update_time)" --sleep=300 --skip-delete-pass-- Connecting to MySQLPassword: -- Table sysbench.sbtest1 is of engine innodb-- Checking for UNIQUE columns on sysbench.sbtest1, by which to chunk-- Possible UNIQUE KEY column names in sysbench.sbtest1:-- - id-- Table sysbench.new_sbtest1 has been created-- Table sysbench.new_sbtest1 has been altered-- Checking for UNIQUE columns on sysbench.new_sbtest1, by which to chunk-- Possible UNIQUE KEY column names in sysbench.new_sbtest1:-- - id-- Checking for UNIQUE columns on sysbench.sbtest1, by which to chunk-- - Found following possible unique keys:-- - id (int)-- Chosen unique key is 'id'-- Shared columns: c, pad, k, id-- Created AD trigger-- Created AU trigger-- Created AI trigger-- Attempting to lock tables-- Tables locked WRITE-- id (min, max) values: ([1L], [20000L])-- Tables unlocked-- - Reminder: altering sysbench.sbtest1: add last_update_time timestamp...-- Copying range (1), (1000), progress: 0%-- + Will sleep for 0.3 seconds-- Copying range (1000), (2000), progress: 5%-- + Will sleep for 0.3 seconds-- Copying range (2000), (3000), progress: 10%-- + Will sleep for 0.3 seconds-- Copying range (3000), (4000), progress: 15%-- + Will sleep for 0.3 seconds-- Copying range (4000), (5000), progress: 20%-- + Will sleep for 0.3 seconds-- Copying range (5000), (6000), progress: 25%-- + Will sleep for 0.3 seconds-- Copying range (6000), (7000), progress: 30%-- + Will sleep for 0.3 seconds-- Copying range (7000), (8000), progress: 35%-- + Will sleep for 0.3 seconds-- Copying range (8000), (9000), progress: 40%-- + Will sleep for 0.3 seconds-- Copying range (9000), (10000), progress: 45%-- + Will sleep for 0.3 seconds-- Copying range (10000), (11000), progress: 50%-- + Will sleep for 0.3 seconds-- Copying range (11000), (12000), progress: 55%-- + Will sleep for 0.3 seconds-- Copying range (12000), (13000), progress: 60%-- + Will sleep for 0.3 seconds-- Copying range (13000), (14000), progress: 65%-- + Will sleep for 0.3 seconds-- Copying range (14000), (15000), progress: 70%-- + Will sleep for 0.3 seconds-- Copying range (15000), (16000), progress: 75%-- + Will sleep for 0.3 seconds-- Copying range (16000), (17000), progress: 80%-- + Will sleep for 0.3 seconds-- Copying range (17000), (18000), progress: 85%-- + Will sleep for 0.3 seconds-- Copying range (18000), (19000), progress: 90%-- + Will sleep for 0.3 seconds-- Copying range (19000), (20000), progress: 95%-- + Will sleep for 0.3 seconds-- Copying range 100% complete. Number of rows: 20000-- Ghost table creation completed. Note that triggers on sysbench.sbtest1 were not removed[root@237_12 ~]#

此时模拟DDL操作期间原表有新数据插入

mysql> insert into sbtest1 values(99999,99999,"c99999","pad99999");Query OK, 1 row affected (0.00 sec)mysql> select count(*) from sbtest1;+----------+| count(*) |+----------+|    20001 |+----------+1 row in set (0.01 sec)

待online DDL操作完成之后,查看new_sbtest1表的数据量:

mysql> select count(*) from new_sbtest1;+----------+| count(*) |+----------+|    20001 |+----------+1 row in set (0.01 sec)

我们注意观察oak的输出日志:Copying range 100% complete. Number of rows: 20000

说明在DDL操作期间之前的原表数据是通过COPY操作复制到新表上去,而操作期间的DML带来的数据变更通过触发器来同步到新表中去。

OAK触发器信息如下:

mysql> select TRIGGER_SCHEMA,TRIGGER_NAME,EVENT_OBJECT_SCHEMA,EVENT_OBJECT_TABLE from information_schema.TRIGGERS where EVENT_OBJECT_SCHEMA='sysbench';+----------------+----------------+---------------------+--------------------+| TRIGGER_SCHEMA | TRIGGER_NAME   | EVENT_OBJECT_SCHEMA | EVENT_OBJECT_TABLE |+----------------+----------------+---------------------+--------------------+| sysbench       | sbtest1_AI_oak | sysbench            | sbtest1            || sysbench       | sbtest1_AU_oak | sysbench            | sbtest1            || sysbench       | sbtest1_AD_oak | sysbench            | sbtest1            |+----------------+----------------+---------------------+--------------------+3 rows in set (0.00 sec)

4)数据一致性校验

1、查看表结构及索引信息

mysql> desc sbtest1;+-------+------------------+------+-----+---------+----------------+| Field | Type             | Null | Key | Default | Extra          |+-------+------------------+------+-----+---------+----------------+| id    | int(10) unsigned | NO   | PRI | NULL    | auto_increment || k     | int(10) unsigned | NO   | MUL | 0       |                || c     | char(120)        | NO   |     |         |                || pad   | char(60)         | NO   |     |         |                |+-------+------------------+------+-----+---------+----------------+4 rows in set (0.00 sec)mysql> desc new_sbtest1;+------------------+------------------+------+-----+-------------------+-----------------------------+| Field            | Type             | Null | Key | Default           | Extra                       |+------------------+------------------+------+-----+-------------------+-----------------------------+| id               | int(10) unsigned | NO   | PRI | NULL              | auto_increment              || k                | int(10) unsigned | NO   | MUL | 0                 |                             || c                | char(120)        | NO   |     |                   |                             || pad              | char(60)         | NO   |     |                   |                             || last_update_time | timestamp        | NO   | MUL | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |+------------------+------------------+------+-----+-------------------+-----------------------------+5 rows in set (0.00 sec)

2、新、旧表数据总量的校验,上面的 3)已经展示过

3、对比类型为int的两组字段 id 和 k 的检验和

mysql> select sum(crc32(concat(ifnull(id,'NULL'),ifnull(k,'NULL')))) as sum_old from sbtest1;+----------------+| sum_old        |+----------------+| 42815177029049 |+----------------+1 row in set (0.02 sec)mysql> select sum(crc32(concat(ifnull(id,'NULL'),ifnull(k,'NULL')))) as sum_new from new_sbtest1;+----------------+| sum_new        |+----------------+| 42815177029049 |+----------------+1 row in set (0.02 sec)

5)rename(该阶段虽然会存在锁表的情况,但只需要修改数据字典所以时间非常快)

mysql> use sysbench;Database changedmysql> set names utf8;Query OK, 0 rows affected (0.00 sec)mysql> rename table sbtest1 to old_sbtest1,new_sbtest1 to sbtest1;Query OK, 0 rows affected (0.02 sec)mysql> show tables;+--------------------+| Tables_in_sysbench |+--------------------+| old_sbtest1        || sbtest1            |+--------------------+2 rows in set (0.00 sec)

删除OAK的3个触发器:

mysql> drop trigger sbtest1_AI_oak;Query OK, 0 rows affected (0.00 sec)mysql> drop trigger sbtest1_AU_oak;Query OK, 0 rows affected (0.01 sec)mysql> drop trigger sbtest1_AD_oak;Query OK, 0 rows affected (0.00 sec)

删除原表old_sbtest1

mysql> drop table old_sbtest1;Query OK, 0 rows affected (0.01 sec)mysql> show tables;+--------------------+| Tables_in_sysbench |+--------------------+| sbtest1            |+--------------------+1 row in set (0.00 sec)

至此,使用OAK工具进行Online DDL操作完毕。




(1)官方手册:Online DDL Overview

https://dev.mysql.com/doc/refman/5.6/en/innodb-create-index-overview.html

中文译本:http://blog.csdn.net/paololiu/article/details/53765818

(2)官方手册:pt-online-schema-change

https://www.percona.com/doc/percona-toolkit/2.1/pt-online-schema-change.html

(3)官方手册:oak-online-alter-table

http://openarkkit.googlecode.com/svn/trunk/openarkkit/doc/html/oak-online-alter-table.html

(4)pt-online-schema-change VS oak-online-alter-table:

http://www.cnblogs.com/gomysql/p/3777607.html

(5)Github提供的gh-ost:

http://www.oschina.net/news/76606/gh-ost-github-s-online-migration-tool-for-mysql

http://www.jianshu.com/p/70bc5c06b289