Extending the Elasticsearch Azure Plugin to Read/Write Snapshots Across Multiple Azure Storage Accounts
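As a busy 2016 draws to a close, I would like to use this post for a short year-end summary. Every year is busy, and this one especially so. There is plenty to look back on. For example, I started reading the technical book "Modern Authentication with Azure Active Directory for Web applications", though unfortunately I have not finished it yet; next year I need to push harder, spending more time reading and less time on my phone. Also, since moving to the 28th floor I have taken up a new popular sport, billiards: stay fit and work well.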
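The stock Elasticsearch Azure plugin can write cluster snapshot data to, and read it from, only one Azure storage account. Index snapshot data is stored as block blobs in that account's blob storage; I analyze the relevant code in another post, "Which kind of Azure blob does the Elasticsearch-cloud-azure plugin use?". For a large Elasticsearch cluster (for example, terabytes of data and more than 30 data nodes), this limitation can overload the single storage account, so that the snapshot fails outright or finishes in the PARTIAL state. An IndexShardSnapshotFailedException then shows up in the snapshot status information or in the Elasticsearch log file, as in the following examples:
"state": "PARTIAL",
"start_time": "2016-09-21T00:10:09.180Z",
"start_time_in_millis": 1474416609180,
"end_time": "2016-09-21T02:14:36.642Z",
"end_time_in_millis": 1474424076642,
"duration_in_millis": 7467462,
"failures": [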
{
"node_id": "SVT4jVpiTVmiH8K7ctWrOQ",
"index": "my_index_20160913d",
"reason": "IndexShardSnapshotFailedException[[my_index_20160913d][1] Failed to perform snapshot (index files)]; nested: IOException; nested: StorageException[The server encountered an unknown failure: ]; nested: IOException[Error writing to server]; ",
"shard_id": 1,
"status": "INTERNAL_SERVER_ERROR"
},
...
]
[2016-07-22 01:27:36,168][WARN ][snapshots ] [ESNode-ElasticSearchData_IN_53] [[myindex.2016_07_10][4]] [snapshot:001008] failed to create snapshot
org.elasticsearch.index.snapshots.IndexShardSnapshotFailedException: [myindex.2016_07_10][4] Failed to perform snapshot (index files)
at org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardRepository$SnapshotContext.snapshot(BlobStoreIndexShardRepository.java:509)
at org.elasticsearch.index.snapshots.blobstore.BlobStoreIndexShardRepository.snapshot(BlobStoreIndexShardRepository.java:140)
at org.elasticsearch.index.snapshots.IndexShardSnapshotAndRestoreService.snapshot(IndexShardSnapshotAndRestoreService.java:85)
at org.elasticsearch.snapshots.SnapshotsService$5.run(SnapshotsService.java:871)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException
at com.microsoft.azure.storage.core.Utility.initIOException(Utility.java:643)
at com.microsoft.azure.storage.blob.BlobOutputStream.writeBlock(BlobOutputStream.java:444)
at com.microsoft.azure.storage.blob.BlobOutputStream.access$000(BlobOutputStream.java:53)
at com.microsoft.azure.storage.blob.BlobOutputStream$1.call(BlobOutputStream.java:388)
at com.microsoft.azure.storage.blob.BlobOutputStream$1.call(BlobOutputStream.java:385)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
... 3 more
Caused by: com.microsoft.azure.storage.StorageException: The server encountered an unknown failure:
at com.microsoft.azure.storage.StorageException.translateException(StorageException.java:101)
at com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:199)
at com.microsoft.azure.storage.blob.CloudBlockBlob.uploadBlockInternal(CloudBlockBlob.java:1006)
at com.microsoft.azure.storage.blob.CloudBlockBlob.uploadBlock(CloudBlockBlob.java:978)
at com.microsoft.azure.storage.blob.BlobOutputStream.writeBlock(BlobOutputStream.java:438)
... 9 more
Caused by: java.io.IOException: Error writing to server
at sun.net.www.protocol.http.HttpURLConnection.writeRequests(HttpURLConnection.java:666)
at sun.net.www.protocol.http.HttpURLConnection.writeRequests(HttpURLConnection.java:678)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1534)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1441)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
at com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:119)
... 12 more
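A PARTIAL snapshot like the one above is easy to miss when snapshots run unattended. The following is a minimal sketch, assuming the snapshot information has already been fetched and parsed into a dictionary with the same fields as the excerpt; the helper name `summarize_snapshot` is made up for this illustration.

```python
import json


def summarize_snapshot(snapshot):
    """Return (state, failures_by_index) for one snapshot-info dict.

    failures_by_index maps each index name to the list of shard ids
    that reported a failure.
    """
    failures_by_index = {}
    for failure in snapshot.get("failures", []):
        failures_by_index.setdefault(failure["index"], []).append(failure["shard_id"])
    return snapshot.get("state"), failures_by_index


# Trimmed copy of the snapshot status excerpt shown above.
sample = json.loads("""
{
  "state": "PARTIAL",
  "failures": [
    {"node_id": "SVT4jVpiTVmiH8K7ctWrOQ",
     "index": "my_index_20160913d",
     "shard_id": 1,
     "status": "INTERNAL_SERVER_ERROR"}
  ]
}
""")

state, failed = summarize_snapshot(sample)
print(state, failed)
```

Any state other than SUCCESS, or a non-empty failure map, is a signal that the snapshot should be retried or investigated.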
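Extending the Elasticsearch Azure Plugin lets it read and write snapshot data across multiple storage accounts, so that the Lucene index files are spread evenly over them and no single storage account is overloaded to the point of failing the snapshot. The table below summarizes the result of writing a snapshot of a cluster holding 3777 G in 185847 Lucene files to 9 Azure storage accounts: the extended Azure Plugin distributes the snapshot data almost evenly, by both file count and data size, across the 9 configured accounts. Based on this idea, I modified the Elasticsearch Azure Plugin so that multiple storage accounts can be configured and used to read and write snapshot data. The extended Azure Plugin is available for download, and it currently supports the following Elasticsearch versions: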
- 1.7.X
- 2.4.1
- 2.4.2
- 2.4.3
- 5.2.1 (includes the fix for Bug #23483)
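| Azure storage account | Number of files | Data size |
|-----------------------|-----------------|-----------|
| esstorage1            | 20634           | 420 G     |
| esstorage2            | 20597           | 419 G     |
| esstorage3            | 20549           | 418 G     |
| esstorage4            | 20559           | 416 G     |
| esstorage5            | 20637           | 420 G     |
| esstorage6            | 20726           | 420 G     |
| esstorage7            | 20769           | 424 G     |
| esstorage8            | 20697           | 420 G     |
| esstorage9            | 20679           | 417 G     |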
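This post does not spell out how the extended plugin chooses an account for each file. One plausible scheme, assumed here purely for illustration and not taken from the plugin's code, is to hash the blob name with a stable hash and take it modulo the number of accounts. Reads then locate the same account as writes, and files spread out roughly evenly. The function `pick_account` and the blob names below are hypothetical.

```python
import hashlib
from collections import Counter


def pick_account(blob_name, accounts):
    """Map a blob name to one account using a stable hash (SHA-256).

    Python's built-in hash() is randomized per process, so it would not
    give the same answer at snapshot time and at restore time.
    """
    digest = hashlib.sha256(blob_name.encode("utf-8")).digest()
    return accounts[int.from_bytes(digest[:8], "big") % len(accounts)]


accounts = ["esstorage%d" % i for i in range(1, 10)]

# Hypothetical blob names, only to exercise the distribution.
blob_names = ["indices/my_index/0/__%x" % i for i in range(90000)]

counts = Counter(pick_account(name, accounts) for name in blob_names)
for account in accounts:
    print(account, counts[account])
```

A scheme like this balances file counts, not bytes; data size evens out as well only when the number of files is large compared with the number of accounts, which is the case in the table above.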
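In addition, Elasticsearch supports reading and writing snapshots to Read-Access Geo-Redundant (RA-GRS) storage accounts. By writing a snapshot to the primary location and then restoring the data from the secondary location, you can set up Elasticsearch clusters in data centers in different regions that serve as backups for each other, as shown in the figure below:
[Figure: Elasticsearch clusters in different regions backing each other up through RA-GRS storage accounts]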
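To make the primary/secondary distinction concrete: an RA-GRS account exposes a read-only secondary blob endpoint whose host name is the account name with `-secondary` appended. The sketch below only builds the two URLs; the function name `blob_endpoints` is made up, and the default suffix assumes the global Azure cloud (sovereign clouds use a different suffix).

```python
def blob_endpoints(account, suffix="blob.core.windows.net"):
    """Return the blob endpoints of an RA-GRS account, keyed by location mode."""
    return {
        # Read/write endpoint in the primary region.
        "primary_only": "https://%s.%s" % (account, suffix),
        # Read-only endpoint in the paired secondary region.
        "secondary_only": "https://%s-secondary.%s" % (account, suffix),
    }


for mode, url in blob_endpoints("storageaccount1").items():
    print(mode, url)
```

Because the secondary endpoint is read-only, a repository that points at it can be used for restores but not for taking new snapshots.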
Plugin configuration and commands for Elasticsearch 1.7.X
elasticsearch.yml
cloud.azure.storage.account: [storageaccount1,storageaccount2,storageaccount3]
cloud.azure.storage.key: [key1, key2, key3]
Commands
#1: define repository
PUT _snapshot/plugintest160921
{
"type": "azure",
"settings": {
"account": "storageaccount1,storageaccount2,storageaccount3",
"container": "plugintest160921"
}
}
#2: take snapshot
PUT _snapshot/plugintest160921/backup0921?wait_for_completion=true
{
}
#3: restore
POST _snapshot/plugintest160921/backup0921/_restore?wait_for_completion=true
{
"ignore_unavailable": "true",
"include_global_state": false
}
#4: define repository for secondary
PUT _snapshot/plugintest160921
{
"type": "azure",
"settings": {
"account": "storageaccount1,storageaccount2,storageaccount3",
"container": "plugintest160921",
"location_mode": "secondary_only"
}
}
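The repository definitions in steps #1 and #4 differ only in the `location_mode` setting, so they can be generated from one helper. This is a small sketch that builds the JSON bodies shown above without sending them anywhere; the helper name `repository_body` is an assumption of this example, not part of the plugin.

```python
import json


def repository_body(accounts, container, location_mode=None):
    """Build the request body for PUT _snapshot/<repository>."""
    settings = {
        # The extended plugin takes a comma-separated list of accounts.
        "account": ",".join(accounts),
        "container": container,
    }
    if location_mode is not None:
        settings["location_mode"] = location_mode
    return {"type": "azure", "settings": settings}


accounts = ["storageaccount1", "storageaccount2", "storageaccount3"]
primary = repository_body(accounts, "plugintest160921")
secondary = repository_body(accounts, "plugintest160921", "secondary_only")

print(json.dumps(primary, indent=2))
print(json.dumps(secondary, indent=2))
```

The resulting bodies can be sent with any HTTP client to the `_snapshot/plugintest160921` endpoint of the cluster that should write (primary) or restore (secondary).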
Plugin configuration and commands for Elasticsearch 2.4.X
elasticsearch.yml
cloud.azure.storage.my_account1.account: storageaccount1
cloud.azure.storage.my_account1.key: key1
cloud.azure.storage.my_account1.default: true
cloud.azure.storage.my_account2.account: storageaccount2
cloud.azure.storage.my_account2.key: key2
cloud.azure.storage.my_account2.default: true
cloud.azure.storage.my_account3.account: storageaccount3
cloud.azure.storage.my_account3.key: key3
cloud.azure.storage.my_account3.default: true
Commands
#1: define repository
PUT _snapshot/plugintest160921
{
"type": "azure",
"settings": {
"account": "my_account1,my_account2,my_account3",
"container": "plugintest160921"
}
}
#2: take snapshot
PUT _snapshot/plugintest160921/backup0921?wait_for_completion=true
{
}
#3: restore
POST _snapshot/plugintest160921/backup0921/_restore?wait_for_completion=true
{
"ignore_unavailable": "true",
"include_global_state": false
}
#4: define repository for secondary
PUT _snapshot/plugintest160921
{
"type": "azure",
"settings": {
"account": "my_account1,my_account2,my_account3",
"container": "plugintest160921",
"location_mode": "secondary_only"
}
}
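With nine accounts, as in the test above, the 2.4.X settings grow to 27 lines that are easy to mistype. Here is a minimal sketch that generates them in the same pattern as the three-account configuration shown earlier; the function name `azure_storage_settings` is hypothetical, and the keys are placeholders exactly as in the post.

```python
def azure_storage_settings(accounts):
    """Render elasticsearch.yml lines for a list of (alias, account, key) tuples."""
    lines = []
    for alias, name, key in accounts:
        prefix = "cloud.azure.storage.%s" % alias
        lines.append("%s.account: %s" % (prefix, name))
        lines.append("%s.key: %s" % (prefix, key))
        # Mirrors the configuration above, which marks every account as default.
        lines.append("%s.default: true" % prefix)
    return lines


settings = azure_storage_settings(
    [("my_account%d" % i, "storageaccount%d" % i, "key%d" % i) for i in range(1, 4)]
)
print("\n".join(settings))
```

The aliases (`my_account1`, and so on) are what the repository definition's `account` setting refers to, not the real storage account names.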
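I opened Pull Request #22709 against Elasticsearch, hoping to get this feature into Elasticsearch 5.X or 6.X. However, the Elasticsearch team has other plans for the future implementation of snapshot/restore, so my PR was not accepted.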