MongoDB pre-splitting sharding test
There is plenty of material online about MongoDB's pre-splitting technique, and its benefit is obvious: it reduces the performance impact of the auto-balancer having to split chunks on the fly. Below are the steps for pre-splitting, intended only as a test.
First, set up the environment, ideally a replica set + sharding deployment. This test was run on my own machine with two shards, each a 3-member replica set. Before pre-splitting, it is best to estimate how large the collection to be sharded will be and how big the average document is. If you are unsure, create a throwaway collection, insert one document, and run db.XXX.stats(); the avgObjSize field gives the size of one document in bytes.
The test below uses a chunk size of 5 MB. The data is inserted (as the last step; shown here for reference) with:
for(var i=1; i<=200000; i++){
db.test.insert({_id:i,name:"wzw",age:i,uid:i+1});}
Each document is about 112 bytes, so 200,000 documents come to roughly 21 MB. The test is run in the test database.
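The sizing arithmetic above can be checked directly. This is a minimal sketch in plain JavaScript (runnable outside the mongo shell); the 112-byte figure is the avgObjSize reported by db.test.stats():

```javascript
// Estimate total data volume from document count and average object size.
const docCount = 200000;
const avgObjSize = 112; // bytes, as reported by db.test.stats().avgObjSize
const totalBytes = docCount * avgObjSize;
console.log(totalBytes);                              // 22400000
console.log((totalBytes / (1024 * 1024)).toFixed(1)); // ~21.4 MB
```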
use test;
mongos> sh.enableSharding("test");
{ "ok" : 1 }
mongos> sh.shardCollection("test.test",{"_id":1});
{ "collectionsharded" : "test.test", "ok" : 1 }
The 200,000 documents will be spread across the two shards, about 10 MB (100,000 documents) per shard, so divide the range as follows:
create 10 chunks of 20,000 documents each;
stop the automatic balancer before doing anything else:
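Why 20,000 documents per chunk is a safe choice: at roughly 112 bytes per document, each chunk holds about 2.1 MB, comfortably below the 5 MB chunksize, so the pre-made chunks will not be re-split immediately. A quick check in plain JavaScript:

```javascript
// Each pre-split chunk holds 20,000 docs of ~112 bytes each.
const docsPerChunk = 20000;
const avgObjSize = 112;                 // bytes
const chunkSizeLimit = 5 * 1024 * 1024; // the 5 MB chunksize used in this test
const bytesPerChunk = docsPerChunk * avgObjSize;
console.log(bytesPerChunk);                  // 2240000 (~2.1 MB)
console.log(bytesPerChunk < chunkSizeLimit); // true
```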
mongos> use test
switched to db test
mongos> sh.stopBalancer();
Waiting for active hosts...
Waiting for the balancer lock...
Waiting again for active hosts after balancer is off...
mongos> use admin
switched to db admin
mongos> for( var y=1; y<10; y++ ) {
... var x=20000
... var prefix = x*y;
... db.runCommand( { split : "test.test" , middle : { "_id": prefix } } );
... }
{ "ok" : 1 }
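The loop above issues one split command per interior chunk boundary. The nine command documents it generates can be sketched in plain JavaScript (the helper name is made up for illustration):

```javascript
// Build the 9 interior split-command documents for 10 chunks of 20,000 _id values.
function buildSplitCommands(ns, docsPerChunk, nChunks) {
  const cmds = [];
  for (let y = 1; y < nChunks; y++) {
    cmds.push({ split: ns, middle: { _id: docsPerChunk * y } });
  }
  return cmds;
}

const cmds = buildSplitCommands("test.test", 20000, 10);
console.log(cmds.length);        // 9
console.log(cmds[0].middle._id); // 20000
console.log(cmds[8].middle._id); // 180000
```

In the mongo shell, each of these documents is what gets passed to db.runCommand(), as in the transcript above.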
mongos> sh.status();
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("55c82589acfaad53c0a64845")
}
shards:
{ "_id" : "shard1", "host" : "shard1/127.0.0.1:11111,127.0.0.1:22222,127.0.0.1:33333" }
{ "_id" : "shard2", "host" : "shard2/127.0.0.1:44444,127.0.0.1:55555,127.0.0.1:60000" }
balancer:
Currently enabled: no
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
85 : Success
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "test", "partitioned" : true, "primary" : "shard1" }
test.test
shard key: { "_id" : 1 }
chunks:
shard1 10
{ "_id" : { "$minKey" : 1 } } -->> { "_id" : 20000 } on : shard1 Timestamp(1, 1)
{ "_id" : 20000 } -->> { "_id" : 40000 } on : shard1 Timestamp(1, 3)
{ "_id" : 40000 } -->> { "_id" : 60000 } on : shard1 Timestamp(1, 5)
{ "_id" : 60000 } -->> { "_id" : 80000 } on : shard1 Timestamp(1, 7)
{ "_id" : 80000 } -->> { "_id" : 100000 } on : shard1 Timestamp(1, 9)
{ "_id" : 100000 } -->> { "_id" : 120000 } on : shard1 Timestamp(1, 11)
{ "_id" : 120000 } -->> { "_id" : 140000 } on : shard1 Timestamp(1, 13)
{ "_id" : 140000 } -->> { "_id" : 160000 } on : shard1 Timestamp(1, 15)
{ "_id" : 160000 } -->> { "_id" : 180000 } on : shard1 Timestamp(1, 17)
{ "_id" : 180000 } -->> { "_id" : { "$maxKey" : 1 } } on : shard1 Timestamp(1, 18)
{ "_id" : "user", "partitioned" : true, "primary" : "shard1" }
user.test
shard key: { "_id" : 1 }
chunks:
shard1 1
{ "_id" : { "$minKey" : 1 } } -->> { "_id" : { "$maxKey" : 1 } } on : shard1 Timestamp(1, 0)
As expected, the system created 10 chunks according to the predefined rules, all on shard1, with the key ranges evenly divided.
mongos> use test
switched to db test
mongos> db.test.stats();
{
"sharded" : true,
"paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.",
"userFlags" : 1,
"capped" : false,
"ns" : "test.test",
"count" : 0,
"numExtents" : 1,
"size" : 0,
"storageSize" : 8192,
"totalIndexSize" : 8176,
"indexSizes" : {
"_id_" : 8176
},
"avgObjSize" : 0,
"nindexes" : 1,
"nchunks" : 10,
"shards" : {
"shard1" : {
"ns" : "test.test",
"count" : 0,
"size" : 0,
"numExtents" : 1,
"storageSize" : 8192,
"lastExtentSize" : 8192,
"paddingFactor" : 1,
"paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.",
"userFlags" : 1,
"capped" : false,
"nindexes" : 1,
"totalIndexSize" : 8176,
"indexSizes" : {
"_id_" : 8176
},
"ok" : 1,
"$gleStats" : {
"lastOpTime" : Timestamp(1439189515, 1),
"electionId" : ObjectId("55c81b008266db61a673c152")
}
}
},
"ok" : 1
}
mongos> sh.startBalancer();
mongos> sh.status();
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("55c82589acfaad53c0a64845")
}
shards:
{ "_id" : "shard1", "host" : "shard1/127.0.0.1:11111,127.0.0.1:22222,127.0.0.1:33333" }
{ "_id" : "shard2", "host" : "shard2/127.0.0.1:44444,127.0.0.1:55555,127.0.0.1:60000" }
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
90 : Success
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "test", "partitioned" : true, "primary" : "shard1" }
test.test
shard key: { "_id" : 1 }
chunks:
shard1 5
shard2 5
{ "_id" : { "$minKey" : 1 } } -->> { "_id" : 20000 } on : shard2 Timestamp(2, 0)
{ "_id" : 20000 } -->> { "_id" : 40000 } on : shard2 Timestamp(3, 0)
{ "_id" : 40000 } -->> { "_id" : 60000 } on : shard2 Timestamp(4, 0)
{ "_id" : 60000 } -->> { "_id" : 80000 } on : shard2 Timestamp(5, 0)
{ "_id" : 80000 } -->> { "_id" : 100000 } on : shard2 Timestamp(6, 0)
{ "_id" : 100000 } -->> { "_id" : 120000 } on : shard1 Timestamp(6, 1)
{ "_id" : 120000 } -->> { "_id" : 140000 } on : shard1 Timestamp(1, 13)
{ "_id" : 140000 } -->> { "_id" : 160000 } on : shard1 Timestamp(1, 15)
{ "_id" : 160000 } -->> { "_id" : 180000 } on : shard1 Timestamp(1, 17)
{ "_id" : 180000 } -->> { "_id" : { "$maxKey" : 1 } } on : shard1 Timestamp(1, 18)
{ "_id" : "user", "partitioned" : true, "primary" : "shard1" }
user.test
shard key: { "_id" : 1 }
chunks:
shard1 1
{ "_id" : { "$minKey" : 1 } } -->> { "_id" : { "$maxKey" : 1 } } on : shard1 Timestamp(1, 0)
The chunks are now distributed evenly across the two shards. Now insert the data:
for(var i=1; i<=200000; i++){
db.test.insert({_id:i,name:"wzw",age:i,uid:i+1});}
and watch from another session:
[root@mongodb ~]# mongostat --port 50000
insert query update delete getmore command flushes mapped vsize res faults qr|qw ar|aw netIn netOut conn set repl time
849 *0 *0 *0 0 850|0 0 263.0M 8.0M 0 0|0 0|0 125k 54k 2 RTR 03:51:22
937 *0 *0 *0 0 938|0 0 263.0M 8.0M 0 0|0 0|0 138k 59k 2 RTR 03:51:23
889 *0 *0 *0 0 890|0 0 263.0M 8.0M 0 0|0 0|0 131k 56k 2 RTR 03:51:24
862 *0 *0 *0 0 863|0 0 263.0M 8.0M 0 0|0 0|0 127k 54k 2 RTR 03:51:25
774 *0 *0 *0 0 776|0 0 263.0M 8.0M 0 0|0 0|0 114k 50k 2 RTR 03:51:26
851 *0 *0 *0 0 852|0 0 263.0M 8.0M 0 0|0 0|0 125k 54k 2 RTR 03:51:27
857 *0 *0 *0 0 858|0 0 263.0M 8.0M 0 0|0 0|0 126k 54k 2 RTR 03:51:28
800 *0 *0 *0 0 801|0 0 263.0M 8.0M 0 0|0 0|0 118k 51k 2 RTR 03:51:29
816 *0 *0 *0 0 817|0 0 263.0M 8.0M 0 0|0 0|0 120k 52k 2 RTR 03:51:30
838 *0 *0 *0 0 840|0 0 263.0M 8.0M 0 0|0 0|0 123k 53k 2 RTR 03:51:31
insert query update delete getmore command flushes mapped vsize res faults qr|qw ar|aw netIn netOut conn set repl time
875 *0 *0 *0 0 876|0 0 263.0M 8.0M 0 0|0 0|0 129k 55k 2 RTR 03:51:32
778 *0 *0 *0 0 779|0 0 263.0M 8.0M 0 0|0 0|0 114k 50k 2 RTR 03:51:33
850 *0 *0 *0 0 851|0 0 263.0M 8.0M 0 0|0 0|0 125k 54k 2 RTR 03:51:34
897 *0 *0 *0 0 898|0 0 263.0M 8.0M 0 0|0 0|0 132k 56k 2 RTR 03:51:35
848 *0 *0 *0 0 850|0 0 263.0M 8.0M 0 0|0 0|0 125k 54k 2 RTR 03:51:36
846 *0 *0 *0 0 847|0 0 263.0M 8.0M 0 0|0 0|0 124k 54k 2 RTR 03:51:37
886 *0 *0 *0 0 887|0 0 263.0M 8.0M 0 0|0 0|0 130k 56k 2 RTR 03:51:38
866 *0 *0 *0 0 867|0 0 263.0M 8.0M 0 0|0 0|0 127k 55k 2 RTR 03:51:39
880 *0 *0 *0 0 881|0 0 263.0M 8.0M 0 0|0 0|0 129k 55k 2 RTR 03:51:40
mongos> for(var i=1; i<=200000; i++){
... db.test.insert({_id:i,name:"wzw",age:i,uid:i+1});}
WriteResult({ "nInserted" : 1 })
mongos> db.test.stats();
{
"sharded" : true,
"paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.",
"userFlags" : 1,
"capped" : false,
"ns" : "test.test",
"count" : 200000,
"numExtents" : 14,
"size" : 22400000,
"storageSize" : 45015040,
"totalIndexSize" : 5625088,
"indexSizes" : {
"_id_" : 5625088
},
"avgObjSize" : 112,
"nindexes" : 1,
"nchunks" : 10,
"shards" : {
"shard1" : {
"ns" : "test.test",
"count" : 100001,
"size" : 11200112,
"avgObjSize" : 112,
"numExtents" : 7,
"storageSize" : 22507520,
"lastExtentSize" : 11325440,
"paddingFactor" : 1,
"paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.",
"userFlags" : 1,
"capped" : false,
"nindexes" : 1,
"totalIndexSize" : 2812544,
"indexSizes" : {
"_id_" : 2812544
},
"ok" : 1,
"$gleStats" : {
"lastOpTime" : Timestamp(1439189515, 1),
"electionId" : ObjectId("55c81b008266db61a673c152")
}
},
"shard2" : {
"ns" : "test.test",
"count" : 99999,
"size" : 11199888,
"avgObjSize" : 112,
"numExtents" : 7,
"storageSize" : 22507520,
"lastExtentSize" : 11325440,
"paddingFactor" : 1,
"paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.",
"userFlags" : 1,
"capped" : false,
"nindexes" : 1,
"totalIndexSize" : 2812544,
"indexSizes" : {
"_id_" : 2812544
},
"ok" : 1,
"$gleStats" : {
"lastOpTime" : Timestamp(1439187391, 1),
"electionId" : ObjectId("55c82518bd50f12a8dd39e0f")
}
}
},
"ok" : 1
}
As the stats show, the test collection is split evenly across the two shards.
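A quick sanity check on the stats above: the per-shard counts and sizes sum exactly to the collection totals. The one-document skew (100,001 vs 99,999) comes from chunk ranges having inclusive lower bounds, so _id 100000 lands in shard1's [100000, 120000) chunk:

```javascript
// Per-shard figures copied from the db.test.stats() output above.
const shard1 = { count: 100001, size: 11200112 };
const shard2 = { count: 99999, size: 11199888 };
console.log(shard1.count + shard2.count); // 200000
console.log(shard1.size + shard2.size);   // 22400000
```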
This concludes the MongoDB pre-splitting test. The procedure itself is simple, but the chunk pre-allocation has to be planned carefully; done badly, it can leave the data unbalanced. In production it is worth creating extra chunks to leave room for future growth, and using a hashed shard key, since otherwise the data can still end up unbalanced. Because this was only a test, just 10 chunks were created and no hashed key was used.
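For reference, a hedged sketch of how the hashed-key variant mentioned above could be pre-split in the mongo shell. The collection name is hypothetical, and numInitialChunks is honored only for hashed shard keys (option support depends on the MongoDB version):

```javascript
// Shard a (hypothetical) collection with a hashed key,
// asking for 10 initial chunks up front.
use admin
db.runCommand({
  shardCollection: "test.user2",   // hypothetical, not the collection above
  key: { _id: "hashed" },
  numInitialChunks: 10
})
```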