HBase-split
HBase split code analysis
Related classes:
- SplitRequest: the class that actually carries out the split
- CompactSplitThread: manages the compaction and split threads
- MemStoreFlusher: implements memstore flushing
- RSRpcServices: the RegionServer RPC implementation
- TableLock: an interface declared in TableLockManager; implemented in ZKTableLockManager
- IncreasingToUpperBoundRegionSplitPolicy: the default split policy
What triggers a split
- HBaseAdmin: HBaseAdmin.split
- compaction: CompactSplitThread.CompactionRunner
- memstore flush: FlushHandler (in MemStoreFlusher)
Compaction triggering a split
In CompactionRunner.run():

    public void run() {
      // .... some precondition checks
      if (this.compaction == null) {
        .....
      }
      // Finally we can compact something.
      assert this.compaction != null;
      ...
      try {
        ...
        boolean completed = region.compact(compaction, store);
        ...
        if (completed) {
          // degenerate case: blocked regions require recursive enqueues
          if (store.getCompactPriority() <= 0) {
            requestSystemCompaction(region, store, "Recursive enqueue");
          } else {
            // see if the compaction has caused us to exceed max region size;
            // if the region is now larger than the maximum, request a split
            requestSplit(region);
          }
        }
      } catch (IOException ex) {
        ...
        server.checkFileSystem();
      } catch (Exception ex) {
        ...
      } finally {
        LOG...
      }
      this.compaction.getRequest().afterExecute(); // an empty method
    }

What does store.getCompactPriority() <= 0 in the code above mean?
Let's look at getCompactPriority() in HStore.java:

    @Override
    public int getCompactPriority() {
      // Get the compaction priority from the StoreFileManager
      int priority = this.storeEngine.getStoreFileManager().getStoreCompactionPriority();
      if (priority == PRIORITY_USER) {
        LOG.warn("Compaction priority is USER despite there being no user compaction");
      }
      return priority;
    }

It delegates to the StoreFileManager, so let's keep going. In DefaultStoreFileManager, the default StoreFileManager implementation, the code is:

    @Override
    public int getStoreCompactionPriority() {
      // isTooManyStoreFiles: when a MemStore is flushed, HBase checks whether any
      // HStore of the HRegion has too many files. If so, the flush is postponed and
      // compaction goes first, because otherwise the file count would keep growing.
      // Here, the further the current file count is below blockingFileCount, the
      // more the MemStore flush can take precedence, with compaction running after
      // the flush, which makes the best use of resources.
      // BLOCKING_STOREFILES_KEY = "hbase.hstore.blockingStoreFiles"
      // HStore.DEFAULT_BLOCKING_STOREFILE_COUNT = 7 (why 7?)
      int blockingFileCount = conf.getInt(
          HStore.BLOCKING_STOREFILES_KEY, HStore.DEFAULT_BLOCKING_STOREFILE_COUNT);
      // The priority is blockingFileCount minus the current number of store files
      int priority = blockingFileCount - storefiles.size();
      // If the priority equals PRIORITY_USER (1), return 2; otherwise return it unchanged
      return (priority == HStore.PRIORITY_USER) ? priority + 1 : priority;
    }

Back to the question of store.getCompactPriority() <= 0.
If store.getCompactPriority() <= 0, then blockingFileCount (= 7 by default) - storefiles.size() <= 0,
which means storefiles.size() >= 7.
In that case requestSystemCompaction(region, store, "Recursive enqueue") is called,
which in turn schedules another CompactionRunner.
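The priority arithmetic above can be tried out in isolation. The following is a minimal sketch, not HBase code: the class `CompactPrioritySketch` and its method names are made up for illustration, and the two constants simply repeat the values quoted above (blocking count 7, PRIORITY_USER 1).

```java
// Hypothetical stand-in for the priority logic quoted above; not HBase code.
class CompactPrioritySketch {
    static final int PRIORITY_USER = 1;                      // value quoted above
    static final int DEFAULT_BLOCKING_STOREFILE_COUNT = 7;   // value quoted above

    // Priority = blocking file count minus current store file count.
    // A result equal to PRIORITY_USER is bumped by one, as in the excerpt.
    static int storeCompactionPriority(int blockingFileCount, int storeFileCount) {
        int priority = blockingFileCount - storeFileCount;
        return (priority == PRIORITY_USER) ? priority + 1 : priority;
    }

    // Mirrors the branch in CompactionRunner.run(): a non-positive priority
    // leads to another compaction, a positive one to the split check.
    static String nextAction(int storeFileCount) {
        int p = storeCompactionPriority(DEFAULT_BLOCKING_STOREFILE_COUNT, storeFileCount);
        return p <= 0 ? "requestSystemCompaction" : "requestSplit";
    }

    public static void main(String[] args) {
        for (int files = 3; files <= 8; files++) {
            System.out.println(files + " store files -> priority "
                + storeCompactionPriority(DEFAULT_BLOCKING_STOREFILE_COUNT, files)
                + " -> " + nextAction(files));
        }
    }
}
```

With 6 store files the raw priority would be 1, so the sketch reports 2; with 7 or more files the priority is 0 or negative and the compaction is re-enqueued instead of a split being requested.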
requestSplit
So, when storefiles.size() < 7,
CompactionRunner.run() calls requestSplit().
This is the requestSplit() method of CompactSplitThread:

    public synchronized boolean requestSplit(final HRegion r) {
      // 1. shouldSplitRegion(): is the number of regions on this RS still below the configured limit?
      // 2. r.getCompactPriority() >= 1 (Store.PRIORITY_USER)
      if (shouldSplitRegion() && r.getCompactPriority() >= Store.PRIORITY_USER) {
        byte[] midKey = r.checkSplit();
        if (midKey != null) {
          requestSplit(r, midKey);
          return true;
        }
      }
      return false;
    }

What does shouldSplitRegion() check?

    private boolean shouldSplitRegion() {
      // this.regionSplitLimit = conf.getInt(REGION_SERVER_REGION_SPLIT_LIMIT,
      //     DEFAULT_REGION_SERVER_REGION_SPLIT_LIMIT); the default is 1000
      if (server.getNumberOfOnlineRegions() > 0.9 * regionSplitLimit) {
        // With the default limit, more than 900 regions on this RegionServer logs a WARN
        LOG.warn("Total number of regions is approaching the upper limit " + regionSplitLimit + ". "
            + "Please consider taking a look at http://hbase.apache.org/book.html#ops.regionmgt");
      }
      // Return true if regionSplitLimit is greater than the number of online regions on this RS
      return (regionSplitLimit > server.getNumberOfOnlineRegions());
    }

That covers both checks: the number of regions on the RS and the compaction priority.
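Taken together, the two conditions form a simple gate in front of checkSplit(). A minimal sketch of that gate follows; `SplitGateSketch` and `passesGate` are hypothetical names, plain ints stand in for the server and region objects, and the warning is printed instead of logged.

```java
// Hypothetical model of the two checks in requestSplit(); not HBase code.
class SplitGateSketch {
    static final int PRIORITY_USER = 1; // value quoted in the text above

    // True while the server hosts fewer regions than the limit.
    // Prints a warning once the count passes 90% of the limit.
    static boolean shouldSplitRegion(int onlineRegions, int regionSplitLimit) {
        if (onlineRegions > 0.9 * regionSplitLimit) {
            System.out.println("WARN: " + onlineRegions
                + " regions online, approaching the limit of " + regionSplitLimit);
        }
        return regionSplitLimit > onlineRegions;
    }

    // Both conditions must hold before the split point is even computed.
    static boolean passesGate(int onlineRegions, int regionSplitLimit, int compactPriority) {
        return shouldSplitRegion(onlineRegions, regionSplitLimit)
            && compactPriority >= PRIORITY_USER;
    }

    public static void main(String[] args) {
        System.out.println(passesGate(10, 1000, 2));    // few regions, positive priority
        System.out.println(passesGate(950, 1000, 2));   // warns, but still below the limit
        System.out.println(passesGate(1000, 1000, 2));  // limit reached
        System.out.println(passesGate(10, 1000, 0));    // region still blocked on compaction
    }
}
```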
Next, HRegion.checkSplit() is called:

    /**
     * Return the splitpoint. null indicates the region isn't splittable
     * If the splitpoint isn't explicitly specified, it will go over the stores
     * to find the best splitpoint. Currently the criteria of best splitpoint
     * is based on the size of the store.
     */
    public byte[] checkSplit() {
      // hbase:meta and the namespace table cannot be split
      // A region that is recovering cannot be split
      // splitPolicy defaults to IncreasingToUpperBoundRegionSplitPolicy
      if (!splitPolicy.shouldSplit()) {
        return null;
      }
      // Get the actual split point
      byte[] ret = splitPolicy.getSplitPoint();
      if (ret != null) {
        try {
          // Check that the row lies inside this region
          checkRow(ret, "calculated split");
        } catch (IOException e) {
          LOG.error("Ignoring invalid split", e);
          return null;
        }
      }
      return ret;
    }

The default splitPolicy is:
IncreasingToUpperBoundRegionSplitPolicy
Here is its shouldSplit() method:

    @Override
    protected boolean shouldSplit() {
      if (region.shouldForceSplit()) return true;
      boolean foundABigStore = false;
      // Get count of regions that have the same common table as this.region
      // (regions of this table hosted on this RegionServer)
      int tableRegionsCount = getCountOfCommonTableRegions();
      // Get size to check: computed from hbase.hregion.max.filesize, the region
      // count, and hbase.hregion.memstore.flush.size
      long sizeToCheck = getSizeToCheck(tableRegionsCount);
      // Iterate over all stores of the region
      for (Store store : region.getStores().values()) {
        // If any store cannot be split (for example, it contains reference files), return false
        if ((!store.canSplit())) {
          return false;
        }
        // Mark if any store is big enough
        long size = store.getSize();
        // If the store is larger than sizeToCheck, set foundABigStore to true
        if (size > sizeToCheck) {
          LOG.debug("ShouldSplit because " + store.getColumnFamilyName()
              + " size=" + size + ", sizeToCheck=" + sizeToCheck
              + ", regionsWithCommonTable=" + tableRegionsCount);
          foundABigStore = true;
        }
      }
      return foundABigStore;
    }

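The decision loop can be reduced to a few lines once the HBase types are replaced with a plain value class. This is only a sketch under that assumption: `StoreInfo` is a hypothetical stand-in for Store that carries just the two properties the policy reads, and the forced-split shortcut is left out.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the shouldSplit() loop; not HBase code.
class ShouldSplitSketch {
    // Stand-in for Store: only size and splittability matter here.
    static final class StoreInfo {
        final long size;
        final boolean canSplit;
        StoreInfo(long size, boolean canSplit) {
            this.size = size;
            this.canSplit = canSplit;
        }
    }

    // One unsplittable store vetoes the split; otherwise a single store
    // larger than sizeToCheck is enough to ask for one.
    static boolean shouldSplit(List<StoreInfo> stores, long sizeToCheck) {
        boolean foundABigStore = false;
        for (StoreInfo store : stores) {
            if (!store.canSplit) {
                return false;
            }
            if (store.size > sizeToCheck) {
                foundABigStore = true;
            }
        }
        return foundABigStore;
    }

    public static void main(String[] args) {
        List<StoreInfo> stores = Arrays.asList(
            new StoreInfo(50L, true), new StoreInfo(300L, true));
        System.out.println(shouldSplit(stores, 200L)); // one store exceeds 200
        System.out.println(shouldSplit(stores, 500L)); // no store exceeds 500
    }
}
```

Note that the veto applies even when the unsplittable store comes after a big one in iteration order, because the method returns as soon as it sees it.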
IncreasingToUpperBoundRegionSplitPolicy.getSizeToCheck()

    /**
     * @return Region max size or <code>count of regions squared * flushsize</code>, which ever is
     * smaller; guard against there being zero regions on this server.
     */
    protected long getSizeToCheck(final int tableRegionsCount) {
      // safety check for 100 to avoid numerical overflow in extreme cases
      // If the region count is 0 or greater than 100, return hbase.hregion.max.filesize.
      // Otherwise return the smaller of max filesize and
      // initialSize * regionCount * regionCount * regionCount
      // (128M * regionCount^3 when initialSize is 128M).
      // initialSize comes from the table's MEMSTORE_FLUSHSIZE attribute, or else from
      // hbase.hregion.memstore.flush.size (default 128M). Some HBase versions use twice
      // the flush size; check configureForRegion() of this policy in your version.
      return tableRegionsCount == 0 || tableRegionsCount > 100
          ? getDesiredMaxFileSize()
          : Math.min(getDesiredMaxFileSize(),
              this.initialSize * tableRegionsCount * tableRegionsCount * tableRegionsCount);
    }

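A worked example makes the growth of the threshold concrete. The sketch below re-implements the formula with plain longs; `SizeToCheckSketch` is a hypothetical name, and the inputs used in `main` (initialSize of 128 MB, max file size of 10 GB) are the values this post assumes, not something read from a cluster.

```java
// Hypothetical re-implementation of the getSizeToCheck() formula; not HBase code.
class SizeToCheckSketch {
    static final long MB = 1024L * 1024L;
    static final long GB = 1024L * MB;

    static long sizeToCheck(int tableRegionsCount, long initialSize, long desiredMaxFileSize) {
        // Zero regions, or more than 100, falls back to the max file size.
        if (tableRegionsCount == 0 || tableRegionsCount > 100) {
            return desiredMaxFileSize;
        }
        long cubed = (long) tableRegionsCount * tableRegionsCount * tableRegionsCount;
        return Math.min(desiredMaxFileSize, initialSize * cubed);
    }

    public static void main(String[] args) {
        long initialSize = 128 * MB;  // assumption taken from the text above
        long maxFileSize = 10 * GB;   // assumption taken from the text above
        for (int regions = 1; regions <= 6; regions++) {
            System.out.println(regions + " region(s): split when a store exceeds "
                + sizeToCheck(regions, initialSize, maxFileSize) / MB + " MB");
        }
    }
}
```

Under these assumptions the threshold is 128 MB with one region, 1 GB with two, 3456 MB with three, 8 GB with four, and from five regions on it is capped at the 10 GB max file size.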
The getDesiredMaxFileSize() method:
It returns the desiredMaxFileSize field of ConstantSizeRegionSplitPolicy, the class that IncreasingToUpperBoundRegionSplitPolicy extends (hbase.hregion.max.filesize).
How desiredMaxFileSize gets its value:

    @Override
    protected void configureForRegion(HRegion region) {
      super.configureForRegion(region);
      Configuration conf = getConf();
      HTableDescriptor desc = region.getTableDesc();
      if (desc != null) {
        // If the table sets the MAX_FILESIZE attribute, this returns its value; otherwise -1
        this.desiredMaxFileSize = desc.getMaxFileSize();
      }
      // If desc.getMaxFileSize() returned a value <= 0, fall back to the
      // hbase.hregion.max.filesize property, or its default: 10 * 1024 * 1024 * 1024L = 10G
      if (this.desiredMaxFileSize <= 0) {
        this.desiredMaxFileSize = conf.getLong(HConstants.HREGION_MAX_FILESIZE,
            HConstants.DEFAULT_MAX_FILE_SIZE);
      }
    }

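The lookup order (table attribute first, then site configuration, then the built-in default) can be sketched with a plain map in place of Configuration. `MaxFileSizeSketch` and the map-based configuration are assumptions made for illustration; only the property name and the 10G default come from the text above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the desiredMaxFileSize lookup; not HBase code.
class MaxFileSizeSketch {
    static final String HREGION_MAX_FILESIZE = "hbase.hregion.max.filesize";
    static final long DEFAULT_MAX_FILE_SIZE = 10L * 1024 * 1024 * 1024; // 10G

    // tableMaxFileSize: the table-level MAX_FILESIZE, or -1 when the table does not set it.
    // conf: a plain map standing in for the HBase Configuration object.
    static long desiredMaxFileSize(long tableMaxFileSize, Map<String, Long> conf) {
        long desired = tableMaxFileSize;
        if (desired <= 0) {
            desired = conf.getOrDefault(HREGION_MAX_FILESIZE, DEFAULT_MAX_FILE_SIZE);
        }
        return desired;
    }

    public static void main(String[] args) {
        Map<String, Long> conf = new HashMap<>();
        System.out.println(desiredMaxFileSize(-1, conf));       // built-in default
        conf.put(HREGION_MAX_FILESIZE, 5L * 1024 * 1024 * 1024);
        System.out.println(desiredMaxFileSize(-1, conf));       // site configuration
        System.out.println(desiredMaxFileSize(1L << 30, conf)); // table attribute wins
    }
}
```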
To summarize, shouldSplit() of IncreasingToUpperBoundRegionSplitPolicy takes into account
the region count, the max file size, and whether any store of the current region contains reference files.
Now let's see what HRegion.checkSplit() does next:
splitPolicy.getSplitPoint()
IncreasingToUpperBoundRegionSplitPolicy getSplitPoint()
//IncreasingToUpperBoundRegionSplitPolicy getSplitPoint()
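The javadoc of checkSplit() quoted earlier says that, unless a split point is explicitly specified, the stores are searched for the best one and that "best" is based on store size. A minimal sketch of one way to read that rule is given here: prefer an explicit split point, otherwise take the split point offered by the largest store. The class `SplitPointSketch`, the `StoreInfo` type and the method signature are all hypothetical, and this is not the HBase source.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of choosing a split point by store size; not HBase code.
class SplitPointSketch {
    // Stand-in for Store: its size and the split point it proposes (may be null).
    static final class StoreInfo {
        final long size;
        final byte[] splitPoint;
        StoreInfo(long size, byte[] splitPoint) {
            this.size = size;
            this.splitPoint = splitPoint;
        }
    }

    static byte[] splitPoint(byte[] explicitSplitPoint, List<StoreInfo> stores) {
        // An explicitly requested split point takes precedence.
        if (explicitSplitPoint != null) {
            return explicitSplitPoint;
        }
        byte[] best = null;
        long largestSize = 0;
        for (StoreInfo store : stores) {
            // Only stores that can propose a split point are considered.
            if (store.splitPoint != null && store.size > largestSize) {
                best = store.splitPoint;
                largestSize = store.size;
            }
        }
        return best; // null when no store offers a split point
    }

    public static void main(String[] args) {
        List<StoreInfo> stores = Arrays.asList(
            new StoreInfo(10L, "row-100".getBytes()),
            new StoreInfo(30L, "row-500".getBytes()));
        System.out.println(new String(splitPoint(null, stores)));
    }
}
```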
After all the checks above and the midkey lookup, requestSplit is finally executed.
CompactSplitThread.requestSplit()

    public synchronized void requestSplit(final HRegion r, byte[] midKey) {
      ...
      // this.splits is a thread pool; its default size is 1 thread
      this.splits.execute(new SplitRequest(r, midKey, this.server));
      ...
    }

Now the run() method of SplitRequest:

    @Override
    public void run() {
      // Check whether the server is stopping or stopped
      // Increment the split metric
      long startTime = EnvironmentEdgeManager.currentTime();
      SplitTransaction st = new SplitTransaction(parent, midKey);
      try {
        // Acquire a read lock on the table (of type TableLock)
        try {
          tableLock.acquire();
        } catch (IOException ex) {
          tableLock = null;
          throw ex;
        }
        // If prepare does not return true, for some reason -- logged inside in
        // the prepare call -- we are not ready to split just now. Just return.
        // st.prepare() is an important step as well
        if (!st.prepare()) return;
        try {
          st.execute(this.server, this.server);
          success = true;
        } catch (Exception e) {
          ...
          try {
            // The split failed, so roll it back
            if (st.rollback(this.server, this.server)) {
              ...
            } else {
              ...
            }
          } catch (RuntimeException ee) {
            ...
            this.server.abort(msg + " -- Cause: " + ee.getMessage());
          }
          return;
        }
      } catch (IOException ex) {
        ...
        server.checkFileSystem();
      } finally {
        // Coprocessor postCompleteSplit
        if (parent.shouldForceSplit()) {
          parent.clearSplit();
        }
        // Release the TableLock
        releaseTableLock();
        // Log that the split succeeded
      } // end finally
    }

First, prepare():
it creates two new HRegionInfo objects, A and B.

    /**
     * Does checks on split inputs.
     * @return <code>true</code> if the region is splittable else
     * <code>false</code> if it is not (e.g. its already closed, etc.).
     */
    public boolean prepare() {
      // If the parent region cannot be split, return false right away
      // The split row must not be null
      HRegionInfo hri = this.parent.getRegionInfo();
      parent.prepareToSplit();
      // Check splitrow.
      byte [] startKey = hri.getStartKey();
      byte [] endKey = hri.getEndKey();
      if (Bytes.equals(startKey, splitrow) ||
          !this.parent.getRegionInfo().containsRow(splitrow)) {
        LOG.info("Split row is not inside region key range or is equal to " +
            "startkey: " + Bytes.toStringBinary(this.splitrow));
        return false;
      }
      // Build the region id; if it would be smaller than the parent's region id,
      // the parent's id + 1 is used instead (keeps the ordering in the meta table)
      long rid = getDaughterRegionIdTimestamp(hri);
      // Create the two daughter regions, A and B
      this.hri_a = new HRegionInfo(hri.getTable(), startKey, this.splitrow, false, rid);
      this.hri_b = new HRegionInfo(hri.getTable(), this.splitrow, endKey, false, rid);
      this.journal.add(new JournalEntry(JournalEntryType.PREPARED));
      return true;
    }

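The split-row check in prepare() boils down to two byte-array comparisons. The sketch below models it with the JDK only: `SplitRowCheckSketch` is a hypothetical name, `Arrays.compareUnsigned` (Java 9+) stands in for the unsigned lexicographic ordering of row keys, and an empty end key is treated as "no upper bound".

```java
import java.util.Arrays;

// Hypothetical model of the split-row validation in prepare(); not HBase code.
class SplitRowCheckSketch {
    // Row is inside [startKey, endKey); an empty endKey means the range is open-ended.
    static boolean containsRow(byte[] startKey, byte[] endKey, byte[] row) {
        return Arrays.compareUnsigned(row, startKey) >= 0
            && (endKey.length == 0 || Arrays.compareUnsigned(row, endKey) < 0);
    }

    // A split row equal to the start key would leave daughter A empty,
    // and a row outside the range cannot divide the region at all.
    static boolean isValidSplitRow(byte[] startKey, byte[] endKey, byte[] splitRow) {
        return !Arrays.equals(startKey, splitRow) && containsRow(startKey, endKey, splitRow);
    }

    public static void main(String[] args) {
        byte[] start = "a".getBytes();
        byte[] end = "z".getBytes();
        System.out.println(isValidSplitRow(start, end, "m".getBytes())); // inside the range
        System.out.println(isValidSplitRow(start, end, "a".getBytes())); // equals the start key
        System.out.println(isValidSplitRow(start, end, "z".getBytes())); // end key is exclusive
    }
}
```

When the check passes, daughter A covers [startKey, splitRow) and daughter B covers [splitRow, endKey), which is exactly how hri_a and hri_b are constructed above.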
Next, execute():

    /**
     * Run the transaction.
     * @param server Hosting server instance. Can be null when testing
     * @param services Used to online/offline regions.
     * @throws IOException If thrown, transaction failed.
     *          Call {@link #rollback(Server, RegionServerServices)}
     * @return Regions created
     * @throws IOException
     * @see #rollback(Server, RegionServerServices)
     */
    public PairOfSameType<HRegion> execute(final Server server,
        final RegionServerServices services) throws IOException {
      useZKForAssignment = server == null ? true :
          ConfigUtil.useZKForAssignment(server.getConfiguration());
      if (useCoordinatedStateManager(server)) { // state check
        std = ((BaseCoordinatedStateManager) server.getCoordinatedStateManager())
            .getSplitTransactionCoordination().getDefaultDetails();
      }
      PairOfSameType<HRegion> regions = createDaughters(server, services);
      if (this.parent.getCoprocessorHost() != null) {
        this.parent.getCoprocessorHost().preSplitAfterPONR();
      }
      return stepsAfterPONR(server, services, regions);
    }

createDaughters() takes the parent region offline and creates the daughter regions (they are opened later, in stepsAfterPONR()).

    /**
     * Prepares the regions and the region files.
     * @param services Used to online/offline regions.
     * @return The regions created
     */
    /* package */ PairOfSameType<HRegion> createDaughters(final Server server,
        final RegionServerServices services) throws IOException {
      LOG.info("Starting split of region " + this.parent);
      if ((server != null && server.isStopped()) ||
          (services != null && services.isStopping())) {
        throw new IOException("Server is stopped or stopping");
      }
      assert !this.parent.lock.writeLock().isHeldByCurrentThread() :
          "Unsafe to hold write lock while performing RPCs";
      journal.add(new JournalEntry(JournalEntryType.BEFORE_PRE_SPLIT_HOOK));
      // Coprocessor callback
      if (this.parent.getCoprocessorHost() != null) {
        // TODO: Remove one of these
        this.parent.getCoprocessorHost().preSplit();
        this.parent.getCoprocessorHost().preSplit(this.splitrow);
      }
      journal.add(new JournalEntry(JournalEntryType.AFTER_PRE_SPLIT_HOOK));
      // If true, no cluster to write meta edits to or to update znodes in.
      boolean testing = server == null ? true :
          server.getConfiguration().getBoolean("hbase.testing.nocluster", false);
      this.fileSplitTimeout = testing ? this.fileSplitTimeout :
          server.getConfiguration().getLong("hbase.regionserver.fileSplitTimeout",
              this.fileSplitTimeout);
      PairOfSameType<HRegion> daughterRegions = stepsBeforePONR(server, services, testing);
      List<Mutation> metaEntries = new ArrayList<Mutation>();
      if (this.parent.getCoprocessorHost() != null) {
        if (this.parent.getCoprocessorHost().
            preSplitBeforePONR(this.splitrow, metaEntries)) {
          throw new IOException("Coprocessor bypassing region "
              + this.parent.getRegionNameAsString() + " split.");
        }
        try {
          for (Mutation p : metaEntries) {
            HRegionInfo.parseRegionName(p.getRow());
          }
        } catch (IOException e) {
          LOG.error("Row key of mutation from coprossor is not parsable as region name."
              + "Mutations from coprocessor should only for hbase:meta table.");
          throw e;
        }
      }
      // This is the point of no return. Adding subsequent edits to .META. as we
      // do below when we do the daughter opens adding each to .META. can fail in
      // various interesting ways the most interesting of which is a timeout
      // BUT the edits all go through (See HBASE-3872). IF we reach the PONR
      // then subsequent failures need to crash out this regionserver; the
      // server shutdown processing should be able to fix-up the incomplete split.
      // The offlined parent will have the daughters as extra columns. If
      // we leave the daughter regions in place and do not remove them when we
      // crash out, then they will have their references to the parent in place
      // still and the server shutdown fixup of .META. will point to these
      // regions.
      // We should add PONR JournalEntry before offlineParentInMeta,so even if
      // OfflineParentInMeta timeout,this will cause regionserver exit,and then
      // master ServerShutdownHandler will fix daughter & avoid data loss. (See
      // HBase-4562).
      this.journal.add(new JournalEntry(JournalEntryType.PONR));
      // Edit parent in meta. Offlines parent region and adds splita and splitb
      // as an atomic update. See HBASE-7721. This update to META makes the region
      // will determine whether the region is split or not in case of failures.
      // If it is successful, master will roll-forward, if not, master will rollback
      // and assign the parent region.
      // Not in testing mode
      if (!testing && useZKForAssignment) {
        if (metaEntries == null || metaEntries.isEmpty()) {
          MetaTableAccessor.splitRegion(server.getConnection(),
              parent.getRegionInfo(), daughterRegions.getFirst().getRegionInfo(),
              daughterRegions.getSecond().getRegionInfo(), server.getServerName(),
              parent.getTableDesc().getRegionReplication());
        } else {
          // Meta change: offline the parent and write the new regions' information
          offlineParentInMetaAndputMetaEntries(server.getConnection(),
              parent.getRegionInfo(), daughterRegions.getFirst().getRegionInfo(),
              daughterRegions.getSecond().getRegionInfo(), server.getServerName(),
              metaEntries, parent.getTableDesc().getRegionReplication());
        }
      } else if (services != null && !useZKForAssignment) {
        if (!services.reportRegionStateTransition(TransitionCode.SPLIT_PONR,
            parent.getRegionInfo(), hri_a, hri_b)) {
          // Passed PONR, let SSH clean it up
          throw new IOException("Failed to notify master that split passed PONR: "
              + parent.getRegionInfo().getRegionNameAsString());
        }
      }
      return daughterRegions;
    }

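The journal entries and the point of no return (PONR) in the comments above suggest a simple pattern: record each completed step, undo the steps in reverse order on failure, and refuse to undo once PONR has been recorded so that the caller aborts instead. The sketch below illustrates only that pattern. `SplitJournalSketch` is a hypothetical class, the enum reuses just the four journal entry names that appear in the excerpts, and checking for PONR up front is a simplification chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a journal with a point of no return; not HBase code.
class SplitJournalSketch {
    // Only the entry names that appear in the excerpts above.
    enum Step { PREPARED, BEFORE_PRE_SPLIT_HOOK, AFTER_PRE_SPLIT_HOOK, PONR }

    private final List<Step> journal = new ArrayList<>();
    private final List<String> undone = new ArrayList<>();

    void record(Step step) {
        journal.add(step);
    }

    // Returns true if every recorded step was undone.
    // Returns false once PONR is in the journal: nothing is undone and the
    // caller is expected to abort, leaving the cleanup to server-shutdown handling.
    boolean rollback() {
        if (journal.contains(Step.PONR)) {
            return false;
        }
        for (int i = journal.size() - 1; i >= 0; i--) {
            undone.add("undo " + journal.get(i)); // newest step first
        }
        journal.clear();
        return true;
    }

    List<String> undone() {
        return undone;
    }

    public static void main(String[] args) {
        SplitJournalSketch early = new SplitJournalSketch();
        early.record(Step.PREPARED);
        early.record(Step.BEFORE_PRE_SPLIT_HOOK);
        System.out.println(early.rollback() + " " + early.undone());

        SplitJournalSketch late = new SplitJournalSketch();
        late.record(Step.PREPARED);
        late.record(Step.PONR);
        System.out.println(late.rollback() + " " + late.undone());
    }
}
```

This matches the shape of SplitRequest.run() shown earlier, where the two branches after st.rollback(...) correspond to a successful rollback and to a failure that happened past the PONR.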
At the end of execute():
stepsAfterPONR(server, services, regions) opens the new regions and updates the information in ZooKeeper under
/hbase/region-in-transition