Hadoop's DataBlockScanner
Source: Internet · Editor: 程序博客网 · Date: 2024/06/05 18:59
Every DataNode initializes a data block scanner (DataBlockScanner) that periodically checks the integrity of all blocks stored on that DataNode; any corrupt block it finds is reported to the NameNode. Because a DataNode's blocks are organized into block pools, the DataBlockScanner holds one BlockPoolSliceScanner per pool, and each BlockPoolSliceScanner is responsible for verifying the correctness and integrity of the blocks in its assigned pool.
The code below is the block-scanning logic from the 2.5 release; the core scanning method in 2.6 is the same: the block's data is sent through a BlockSender to a null output stream, so verification succeeds if the send completes and fails if a checksum error raises an IOException.
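To make the "send to a null output stream" idea concrete, here is a minimal, self-contained sketch (not HDFS's actual BlockSender; the class and constants below are hypothetical) that verifies per-chunk CRC32 checksums while discarding the data itself, which is exactly the effect of streaming a block to a sink that ignores its input:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.CRC32;

// Illustrative sketch only: verify per-chunk checksums while streaming
// the payload to a discarding sink, mimicking how a block sent to a
// null output stream surfaces corruption without keeping the data.
public class NullStreamVerify {
    static final int CHUNK = 512; // bytes per checksum chunk (HDFS default)

    // Compute the expected CRC32 of every chunk of `data`.
    public static long[] checksumsOf(byte[] data) {
        int n = (data.length + CHUNK - 1) / CHUNK;
        long[] sums = new long[n];
        for (int i = 0; i < n; i++) {
            CRC32 crc = new CRC32();
            crc.update(data, i * CHUNK, Math.min(CHUNK, data.length - i * CHUNK));
            sums[i] = crc.getValue();
        }
        return sums;
    }

    // Returns true if every chunk of `data` matches its stored checksum.
    public static boolean verify(byte[] data, long[] storedChecksums)
            throws IOException {
        OutputStream sink = OutputStream.nullOutputStream(); // discards bytes
        int chunkIdx = 0;
        for (int off = 0; off < data.length; off += CHUNK, chunkIdx++) {
            int len = Math.min(CHUNK, data.length - off);
            CRC32 crc = new CRC32();
            crc.update(data, off, len);
            if (crc.getValue() != storedChecksums[chunkIdx]) {
                return false; // corruption detected in this chunk
            }
            sink.write(data, off, len); // the data itself is thrown away
        }
        return true;
    }
}
```

The real BlockSender does the chunked checksum comparison internally and throws an IOException on mismatch, which is what the retry loop in verifyBlock (shown later) catches.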
DataBlockScanner:
private final TreeMap<String, BlockPoolSliceScanner> blockPoolScannerMap =
    new TreeMap<String, BlockPoolSliceScanner>();

public void run() {
  String currentBpId = "";
  boolean firstRun = true;
  // Except for the first iteration, sleep 5s between iterations
  while (datanode.shouldRun && !Thread.interrupted()) {
    if (!firstRun) {
      try {
        Thread.sleep(SLEEP_PERIOD_MS);
      } catch (InterruptedException ex) {
        // Interrupt itself again to set the interrupt status
        blockScannerThread.interrupt();
        continue;
      }
    } else {
      firstRun = false;
    }
    // Pick the BlockPoolSliceScanner whose last scan is the longest ago
    BlockPoolSliceScanner bpScanner = getNextBPScanner(currentBpId);
    // ...... omitted
    // Call BlockPoolSliceScanner#scanBlockPoolSlice to do the scan
    bpScanner.scanBlockPoolSlice();
  }
  // ...... omitted
}

private BlockPoolSliceScanner getNextBPScanner(String currentBpId) {
  String nextBpId = null;
  while (datanode.shouldRun && !blockScannerThread.isInterrupted()) {
    waitForInit();
    synchronized (this) {
      if (getBlockPoolSetSize() > 0) {
        // Find the BlockPoolSliceScanner with the oldest last scan time
        long lastScanTime = 0;
        for (String bpid : blockPoolScannerMap.keySet()) {
          final long t = getBPScanner(bpid).getLastScanTime();
          if (t != 0L) {
            if (nextBpId == null || t < lastScanTime) {
              lastScanTime = t;
              nextBpId = bpid;
            }
          }
        }
        // ...... omitted
      }
    }
    // ...... omitted
  }
  return null;
}
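The selection logic in getNextBPScanner boils down to "pick the pool with the smallest non-zero last-scan time, so the longest-unscanned pool goes next." A standalone sketch of just that predicate (hypothetical names, not the Hadoop API):

```java
import java.util.Map;

// Illustrative sketch of getNextBPScanner's selection rule: among all
// block pools, choose the one whose last scan time is the oldest.
// A lastScanTime of 0 means "never scanned" and is skipped here, as in
// the loop above (the omitted code handles the never-scanned case).
public class OldestScanPicker {
    public static String pickOldest(Map<String, Long> lastScanTimes) {
        String nextBpId = null;
        long lastScanTime = 0;
        for (Map.Entry<String, Long> e : lastScanTimes.entrySet()) {
            long t = e.getValue();
            if (t != 0L && (nextBpId == null || t < lastScanTime)) {
                lastScanTime = t;
                nextBpId = e.getKey();
            }
        }
        return nextBpId; // null if no pool has a recorded scan time
    }
}
```

This rotation is why a single scanner thread can fairly serve several block pools: each pass it services whichever pool has waited longest.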
BlockPoolSliceScanner:
void scanBlockPoolSlice() {
  // ...... omitted
  // Start scanning
  scan();
}
private void scan() {
  // Adjust the throttler
  adjustThrottler();
  while (datanode.shouldRun
      && !datanode.blockScanner.blockScannerThread.isInterrupted()
      && datanode.isBPServiceAlive(blockPoolId)) {
    long now = Time.monotonicNow();
    synchronized (this) {
      if (now >= (currentPeriodStart + scanPeriod)) {
        startNewPeriod();
      }
    }
    if (((now - getEarliestScanTime()) >= scanPeriod)
        || ((!blockInfoSet.isEmpty()) && !(this.isFirstBlockProcessed()))) {
      // Pick one block to verify
      verifyFirstBlock();
    } else {
      // ...... omitted
    }
  }
}
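The condition guarding verifyFirstBlock can be isolated into a pure predicate: verify now if the oldest recorded scan is past the scan period, or if this period's first block has not been processed yet. A small sketch with hypothetical names:

```java
// Illustrative sketch of scan()'s decision rule, extracted as a pure
// function so the two triggering conditions are easy to see.
public class ScanDecision {
    public static boolean shouldVerify(long now, long earliestScanTime,
            long scanPeriod, boolean blockSetEmpty,
            boolean firstBlockProcessed) {
        return ((now - earliestScanTime) >= scanPeriod)
                || (!blockSetEmpty && !firstBlockProcessed);
    }
}
```

The second clause guarantees progress at the start of a new period even when no block is overdue yet.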
private void verifyFirstBlock() {
  Block block = null;
  synchronized (this) {
    if (!blockInfoSet.isEmpty()) {
      block = blockInfoSet.first();
    }
  }
  if (block != null) {
    // Verify the block
    verifyBlock(new ExtendedBlock(blockPoolId, block));
    processedBlocks.put(block.getBlockId(), 1);
  }
}
void verifyBlock(ExtendedBlock block) {
  BlockSender blockSender = null;
  for (int i = 0; i < 2; i++) {
    boolean second = (i > 0);
    try {
      // Adjust the throttler
      adjustThrottler();
      // Create a BlockSender and send the block to a null output stream
      blockSender = new BlockSender(block, 0, -1, false, true, true,
          datanode, null, CachingStrategy.newDropBehind());
      DataOutputStream out =
          new DataOutputStream(new IOUtils.NullOutputStream());
      blockSender.sendBlock(out, null, throttler);
      if (second) {
        totalTransientErrors++;
      }
      // The block verified correctly; update its scan status
      updateScanStatus(block.getLocalBlock(), ScanType.VERIFICATION_SCAN, true);
      return;
    } catch (IOException e) {
      updateScanStatus(block.getLocalBlock(), ScanType.VERIFICATION_SCAN, false);
      // If the block is no longer in the FsDataset, delete it
      if (!dataset.contains(block)) {
        LOG.info(block + " is no longer in the dataset");
        deleteBlock(block.getLocalBlock());
        return;
      }
      if (e instanceof FileNotFoundException) {
        LOG.info("Verification failed for " + block +
            " - may be due to race with write");
        deleteBlock(block.getLocalBlock());
        return;
      }
      // If the checksum check failed on both attempts, call
      // handleScanFailure to report the corrupt block to the NameNode
      if (second) {
        totalScanErrors++;
        datanode.getMetrics().incrBlockVerificationFailures();
        handleScanFailure(block);
        return;
      }
    } finally {
      IOUtils.closeStream(blockSender);
      datanode.getMetrics().incrBlocksVerified();
      totalScans++;
    }
  }
}
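Note the shape of the loop above: the first IOException is tolerated as possibly transient and the read is retried once; only a failure on the second attempt is reported as corruption. That retry policy in isolation looks like this (a sketch with hypothetical names, not the Hadoop code itself):

```java
import java.io.IOException;

// Illustrative sketch of verifyBlock's two-attempt policy: the first
// failure may be transient (e.g. a disk hiccup), so retry once; only a
// second consecutive failure is treated as real corruption.
public class TwoAttemptVerifier {
    public interface BlockReader { void read() throws IOException; }

    // Returns true if the block verified on either attempt; false means
    // both attempts failed and the caller should report corruption.
    public static boolean verifyWithRetry(BlockReader reader) {
        for (int i = 0; i < 2; i++) {
            boolean second = (i > 0);
            try {
                reader.read();    // analogous to sending to a null stream
                return true;      // success, possibly after a transient error
            } catch (IOException e) {
                if (second) {
                    return false; // failed twice: treat as corrupt
                }
            }
        }
        return false; // unreachable; required by the compiler
    }
}
```

This is why the real code increments totalTransientErrors only when the *second* attempt succeeds: a success after one failure means the first error was transient, not corruption.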