MySQL InnoDB Primary Key Choice: GUID/UUID vs Integer Insert Performance
来源:互联网 发布:后端数据库 编辑:程序博客网 时间:2024/06/08 19:31
When working with MySQL it is common to choose auto incrementing integer primary keys; however, there are situations where using a GUID/UUID is desirable. For example, prior to MySQL 5.0, you were unable to safely use auto incrementing primary keys in a multi-master replicated setup due to potential (and inevitable) duplicate key generation on the multiple masters. (This issue can be avoided in MySQL 5.0 and greater by appropriately configuring auto_increment_increment & auto_increment_offset on each master.) Other reasons GUIDs/UUIDs are desirable include: data set merging ease and non-sequential, unique identification for security (think credit card numbers). Regardless of the motivation for their use, there is a very real consequence I encountered while working on a rapidly increasing MySQL InnoDB data set. In short, the larger the data set became, the longer it took to insert a new record. This behavior is not necessarily unexpected, but the degree to which the insert time grew became rather problematic. So, years later, I've decided to take a closer look to see if I could replicate this issue with MySQL 5.0.45, and if so, determine (to a degree) what was happening.
Code
For the sake of brevity, I am not including the actual Java code used to run this experiment; however, I will explain the general idea. Essentially, the code truncates all tables and then proceeds to loop through each table, inserting 100,000 per transaction, recording the time elapsed. It should be noted that id's for the innodb_integer and innodb_uuid tables are generated in the Java code and passed via a PreparedStatement.
(how to read: it takes 20 minutes to insert 100,000 records into a UUID table which already contains 15 million records)
(how to read: it takes 25 hours to insert 15 million records into an empty UUID table)
(how to read: it takes 20 minutes to insert 100,000 records into a UUID table which already contains 15 million records)
(how to read: it takes 25 hours to insert 15 million records into an empty UUID table)
Experimental Setup
Hardware:PowerEdge SC440 with a Core2Duo 2160 (dedicated for the duration of the experiment)MySQL:5.0.45DDLcreate table innodb_auto_increment (
id integer auto_increment not null,
primary key (id)
) type=innodb;
create table innodb_integer (
id integer not null,
primary key (id)
) type=innodb;
create table innodb_uuid (
id char(36),
primary key (id)
) type=innodb;
id integer auto_increment not null,
primary key (id)
) type=innodb;
create table innodb_integer (
id integer not null,
primary key (id)
) type=innodb;
create table innodb_uuid (
id char(36),
primary key (id)
) type=innodb;
Code
For the sake of brevity, I am not including the actual Java code used to run this experiment; however, I will explain the general idea. Essentially, the code truncates all tables and then proceeds to loop through each table, inserting 100,000 per transaction, recording the time elapsed. It should be noted that id's for the innodb_integer and innodb_uuid tables are generated in the Java code and passed via a PreparedStatement.
And We're Off...
I wasn't sure how long I would have to wait for the expected results to show up, so, I started it up and forgot about it for a couple of days. When I came back here is what I found:(how to read: it takes 20 minutes to insert 100,000 records into a UUID table which already contains 15 million records)
(how to read: it takes 25 hours to insert 15 million records into an empty UUID table)
Initial Thoughts
As you can see both the auto increment and manual increment tables grow pretty gracefully and the UUID table, ya, not so much. So what is going on here? My first guess was the implicit unique index on innodb_uuid's primary key was to blame, but I was not able to say for sure that it simply was not the fact that the id field was bigger on innodb_uuid or that MySQL was configured poorly. So, first I changed the MySQL config a bit, increasing various buffers by a factor of 5 and rerunning for a while. What I found is that increasing the buffers (specifically innodb_buffer_pool_size) did make a difference, but only for a brief period of time. Regardless of how much RAM you provide, this issue will rear its ugly head eventually. In order to narrow down the issue further, the previous experiment was ran with the addition of these tables:create table innodb_uuid_no_key (
id char(36)
) type=innodb;
create table innodb_uuid_no_key_indexed (
id char(36),
index index1 (id)
) type=innodb;
create table innodb_uuid_no_key_unique_indexed (
id char(36),
unique index index1 (id)
) type=innodb;
id char(36)
) type=innodb;
create table innodb_uuid_no_key_indexed (
id char(36),
index index1 (id)
) type=innodb;
create table innodb_uuid_no_key_unique_indexed (
id char(36),
unique index index1 (id)
) type=innodb;
Experiment Run #2
This time around I decided to watch vmstat while the experiment was running. At first, everything was as expected, high IO wait due to heavy writing; however, after a few million records a trend started to appear: when records were being inserted into innodb_uuid the high IO wait persisted but the heavy writing was replaced with very heavy reading and much lower writing. This read/write disparity grew with innodb_uuid's record count. Anyways, I allowed the second run to reach the same 28 million records as the first. The results for all UUID tables can be seen below:(how to read: it takes 20 minutes to insert 100,000 records into a UUID table which already contains 15 million records)
(how to read: it takes 25 hours to insert 15 million records into an empty UUID table)
Conclusion
As you can see innodb_uuid_no_key performs closely to its integer counterparts, innodb_uuid_no_key_indexed exhibits the same trend as innodb_uuid to a much less severe degree and innodb_uuid_no_key_unique_indexed is nearly identical to innodb_uuid. So the unique index appears to be the issue here. But why? Well, given the above information, I'd say that MySQL is unable to buffer enough data to guarantee a value is unique and is therefore caused to perform a tremendous amount of reading for each insert to guarantee uniqueness.That's Great and All, But What About MyISAM?
The MyISAM results were virtually identical (in scale) to that of InnoDB.
源地址:http://kccoder.com/mysql/uuid-vs-int-insert-performance/
- MySQL InnoDB Primary Key Choice: GUID/UUID vs Integer Insert Performance
- INTEGER PRIMARY KEY简介
- INTEGER PRIMARY KEY简介
- mysql primary key key
- iOS UDID VS UUID(GUID)
- INTEGER PRIMARY KEY AUTOINCREMENT 和 INTEGER PRIMARY KEY 区别
- mysql innodb 引擎 key - 和 Primary key的区别 以及 索引的用法总结
- InnoDB的Primary Key不易过大
- Mybatis insert or update return primary key
- Sqlite中INTEGER PRIMARY KEY的修正
- mysql innodb 行锁 insert
- MySql 的unique key 与 primary key
- GUID/UUID
- MySQL - InnoDB vs MyISAM
- DataRow indexing performance (integer vs. string)
- mysql中key 、primary key 、unique key 与index区别
- mysql中key 、primary key 、unique key 与index区别
- mysql中key 、primary key 、unique key 与index区别
- JQuery有用的50段代码
- 百度地图车辆运动轨迹
- 分享非常有用的Java程序 (关键代码) (二)---JavaSript获取上传文件路径的文件名
- pt_in_rect
- libevent杂谈
- MySQL InnoDB Primary Key Choice: GUID/UUID vs Integer Insert Performance
- DelegatingFilterProxy详解
- 开始学习C#-------》关键字
- C#中virtual 方法和abstract方法的区别
- android 从xml文件读取数据初始化,并实现无分级的树状菜单的显示
- 根据点序列的分布来判断点是否孤僻
- LIBEVENT应用举例1
- ASP.NET 缓存概述
- 在win7上安装visual c++ 2008 redistributable 发生错误error 1935