HBase Region Split Policy: Partitioning, Split Strategies, and Algorithms
Source: Internet | Published: access sql | Editor: 程序博客网 | Time: 2024/06/07 20:09
This article is based on hbase-0.98.6-cdh5.2.0.
Region split policy
HBase provides the following region split policies:
- IncreasingToUpperBoundRegionSplitPolicy
- ConstantSizeRegionSplitPolicy
- DisabledRegionSplitPolicy
- KeyPrefixRegionSplitPolicy
- DelimitedKeyPrefixRegionSplitPolicy
IncreasingToUpperBoundRegionSplitPolicy
The class-level comment at the top of IncreasingToUpperBoundRegionSplitPolicy.java reads:
Split size is the number of regions that are on this server that all are
of the same table, cubed, times 2x the region flush size OR the maximum
region split size, whichever is smaller. For example, if the flush size
is 128M, then after two flushes (256MB) we will split which will make two regions
that will split when their size is 2^3 * 128M * 2 = 2048M. If one of these
regions splits, then there are three regions and now the split size is
3^3 * 128M * 2 = 6912M, and so on until we reach the configured
maximum filesize and then from there on out, we’ll use that.

In other words, the split size formula is regioncount^3 * 128M * 2, and a region is split once it reaches that size.
The getSizeToCheck method inside the class shows the split size more directly:
    /**
     * @return Region max size or <code>count of regions squared * flushsize, which ever is
     * smaller; guard against there being zero regions on this server.
     */
    protected long getSizeToCheck(final int tableRegionsCount) {
      // safety check for 100 to avoid numerical overflow in extreme cases
      return tableRegionsCount == 0 || tableRegionsCount > 100 ? getDesiredMaxFileSize():
        Math.min(getDesiredMaxFileSize(),
          this.initialSize * tableRegionsCount * tableRegionsCount * tableRegionsCount);
    }
According to this method, a region is split once it reaches the following size:
    Math.min(getDesiredMaxFileSize(),
        this.initialSize * tableRegionsCount * tableRegionsCount * tableRegionsCount)
    // getDesiredMaxFileSize() is the value of hbase.hregion.max.filesize, 10GB
    // this.initialSize is 2 * hbase.hregion.memstore.flush.size, i.e. 256MB
    // so the result is Math.min(10GB, 256MB * regioncount^3)
First split: 1^3 * 256MB = 256MB
Second split: 2^3 * 256MB = 2048MB
Third split: 3^3 * 256MB = 6912MB
Fourth split: 4^3 * 256MB = 16384MB > 10GB, so the smaller value, 10GB, is used
Every later split also happens at 10GB.
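The progression above can be reproduced with a small standalone sketch. It is not the HBase class itself: the class name SplitSizeSketch and the two constants are assumptions chosen to match the values quoted in this article (10GB max file size, 128MB flush size), and only the arithmetic of getSizeToCheck is mirrored.

```java
// Simplified, self-contained model of the size check; not the HBase class itself.
final class SplitSizeSketch {
    static final long MB = 1024L * 1024L;

    // Assumed values, taken from the figures quoted in the article.
    static final long MAX_FILE_SIZE = 10L * 1024L * MB; // hbase.hregion.max.filesize = 10GB
    static final long INITIAL_SIZE = 2L * 128L * MB;    // 2 * hbase.hregion.memstore.flush.size

    /** Mirrors the arithmetic of getSizeToCheck for a given number of regions of the table on this server. */
    static long sizeToCheck(int tableRegionsCount) {
        // Same guard as the original: zero regions or more than 100 falls back to the maximum.
        if (tableRegionsCount == 0 || tableRegionsCount > 100) {
            return MAX_FILE_SIZE;
        }
        long cubed = (long) tableRegionsCount * tableRegionsCount * tableRegionsCount;
        return Math.min(MAX_FILE_SIZE, INITIAL_SIZE * cubed);
    }

    public static void main(String[] args) {
        for (int regions = 1; regions <= 5; regions++) {
            System.out.println(regions + " region(s): split at "
                + sizeToCheck(regions) / MB + "MB");
        }
    }
}
```

With these assumed settings the loop prints 256MB, 2048MB, 6912MB, 10240MB, and 10240MB for one to five regions, which matches the list above.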
ConstantSizeRegionSplitPolicy
This is the default split policy. From 0.94.0 on the default split policy has changed to {@link IncreasingToUpperBoundRegionSplitPolicy}
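Before 0.94.0 this was the default region split policy; from 0.94.0 on, the default is IncreasingToUpperBoundRegionSplitPolicy. Under ConstantSizeRegionSplitPolicy, a region is split once its size reaches the value configured in hbase.hregion.max.filesize (10GB by default).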
DisabledRegionSplitPolicy
The class-level comment at the top of DisabledRegionSplitPolicy.java reads:
This should be used with care, since it will disable automatic sharding.
This policy simply disables automatic region splitting.
KeyPrefixRegionSplitPolicy
The class-level comment at the top of KeyPrefixRegionSplitPolicy.java reads:
A custom RegionSplitPolicy implementing a SplitPolicy that groups rows by a prefix of the row-key
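Rows are grouped by a row-key prefix of a fixed length. For example, if all row keys are 16 characters long and the prefix length is set to 5, rows whose first 5 characters are identical end up in the same region when a region splits.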
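One way to see why a fixed-length prefix keeps such rows together is to truncate the candidate split point to the prefix length, so that the region boundary always falls between prefix groups. The sketch below illustrates that idea only; the class and method names (KeyPrefixSplitSketch, prefixSplitPoint) are hypothetical, and it is assumed here, not confirmed by the quoted source, that the policy adjusts the split point this way.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative sketch only; names are hypothetical and this is not the HBase class.
final class KeyPrefixSplitSketch {

    /**
     * Truncates a candidate split point to prefixLength bytes. Every row sharing
     * that prefix sorts at or after the truncated key, so all of them land in the
     * same daughter region.
     */
    static byte[] prefixSplitPoint(byte[] candidate, int prefixLength) {
        if (prefixLength <= 0) {
            return candidate; // no usable prefix length: leave the split point unchanged
        }
        return Arrays.copyOf(candidate, Math.min(prefixLength, candidate.length));
    }

    public static void main(String[] args) {
        // A 16-character row key with a 5-character prefix, as in the example above.
        byte[] candidate = "user1abcdefghijk".getBytes(StandardCharsets.UTF_8);
        byte[] split = prefixSplitPoint(candidate, 5);
        System.out.println(new String(split, StandardCharsets.UTF_8)); // prints user1
    }
}
```

Because row keys are ordered lexicographically, a boundary at user1 places every key beginning with user1 on the same side of the split.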
DelimitedKeyPrefixRegionSplitPolicy
The class-level comment at the top of DelimitedKeyPrefixRegionSplitPolicy.java reads:
A custom RegionSplitPolicy implementing a SplitPolicy that groups rows by a prefix of the row-key with a delimiter. Only the first delimiter for the row key will define the prefix of the row key that is used for grouping. This ensures that a region is not split "inside" a prefix of a row key.
I.e. rows can be co-located in a region by their prefix.
As an example, if you have row keys delimited with _ , like userid_eventtype_eventid, and use prefix delimiter _, this split policy ensures that all rows starting with the same userid, belongs to the same region.

This policy guarantees that rows sharing a prefix stay in the same region. For example, if row keys have the format userid_eventtype_eventid and the delimiter is _, a split keeps all rows with the same userid in the same region.
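The delimiter-based grouping can be sketched the same way: cut the candidate split point at the first occurrence of the delimiter, so the boundary never falls inside a userid group. The names below (DelimitedPrefixSplitSketch, delimitedSplitPoint, indexOf) are hypothetical, and the fallback of leaving the split point unchanged when no delimiter is found is an assumption of this sketch.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative sketch only; names are hypothetical and this is not the HBase class.
final class DelimitedPrefixSplitSketch {

    /** Returns the index of the first occurrence of needle in haystack, or -1 if absent. */
    static int indexOf(byte[] haystack, byte[] needle) {
        if (needle.length == 0) {
            return -1;
        }
        for (int i = 0; i + needle.length <= haystack.length; i++) {
            boolean match = true;
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                return i;
            }
        }
        return -1;
    }

    /** Cuts the candidate split point just before the first delimiter, if one exists. */
    static byte[] delimitedSplitPoint(byte[] candidate, byte[] delimiter) {
        int index = indexOf(candidate, delimiter);
        // Assumption: with no delimiter (or an empty prefix) the split point is left as-is.
        return index > 0 ? Arrays.copyOf(candidate, index) : candidate;
    }

    public static void main(String[] args) {
        byte[] candidate = "user42_click_0007".getBytes(StandardCharsets.UTF_8);
        byte[] delimiter = "_".getBytes(StandardCharsets.UTF_8);
        byte[] split = delimitedSplitPoint(candidate, delimiter);
        System.out.println(new String(split, StandardCharsets.UTF_8)); // prints user42
    }
}
```

Only the first delimiter matters here, which matches the statement in the class comment: the eventtype and eventid parts never influence where the boundary is placed.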