DataNode Block Management (2): DF and DU

Source: Internet; Published by: 火凤凰云计算基地; Editor: 程序博客网; Date: 2024/05/09 09:29

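As a distributed file system, HDFS has to know the storage state of the whole cluster: total capacity, space used, utilization, and space remaining. It obtains these figures by aggregating the storage information of every DataNode. Whenever a DataNode sends a heartbeat to the NameNode, it reports its current storage figures along with it. So how does a DataNode know its own storage usage? Entering the total capacity by hand and summing file sizes to derive the used and remaining space is unreliable, because the sum of the file sizes does not equal the disk space those files actually occupy, and other processes on the same machine may consume disk space as well. To obtain reasonably accurate figures for a DataNode's total capacity, used space, and available space, HDFS wraps the Unix df and du commands in code: df reports the usage of a local disk, and du reports the size of a directory or file.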
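One reason the sum of file sizes differs from the space actually occupied is that file systems allocate space in whole blocks. The sketch below illustrates only that effect. The class name DiskFootprint and the 4096-byte allocation unit are assumptions made for illustration, and real file systems add further differences (metadata, sparse files, and so on) that this sketch ignores.

```java
/** Hypothetical helper: estimates on-disk footprint under a fixed allocation unit. */
final class DiskFootprint {
    private DiskFootprint() {}

    /** Rounds a logical file length up to a whole number of blocks. */
    static long allocatedBytes(long fileLength, long blockSize) {
        if (fileLength < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("invalid length or block size");
        }
        long blocks = (fileLength + blockSize - 1) / blockSize; // ceiling division
        return blocks * blockSize;
    }

    public static void main(String[] args) {
        long blockSize = 4096; // assumed allocation unit, not taken from HDFS
        long[] lengths = {1, 4096, 4097};
        long logical = 0;
        long allocated = 0;
        for (long len : lengths) {
            logical += len;
            allocated += allocatedBytes(len, blockSize);
        }
        System.out.println("logical=" + logical + " allocated=" + allocated);
    }
}
```

With these three sample files the logical total is 8194 bytes while the estimated allocation is 16384 bytes, which is why measuring with du is preferable to adding up file lengths.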

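HDFS wraps the Unix df command in the class org.apache.hadoop.fs.DF and the du command in the class org.apache.hadoop.fs.DU. Both classes do their work by running the shell command from Java, through their common base class Shell. The rest of this article looks at how these classes are implemented.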

1.Shell

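private long    interval;   // refresh interval
private long    lastTime;   // time the command was last executed
private Map<String, String> environment; // environment variables needed by the command
private File dir;           // working directory of the command; defaults to the "user.dir" system property
private Process process;    // child process that runs the command
private int exitCode;       // exit status once the command has finished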

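The Shell class is what actually executes a command. It is designed as an abstract class so that it can run any command: it exposes two abstract methods to the user, getExecString() and parseExecResult(). Through getExecString() the user supplies the shell command to run and its arguments; through parseExecResult() the user parses the output that this particular command returns. In addition, the result of a given command often changes slowly, and some commands consume a lot of system resources or take a long time to respond. To avoid running them more often than necessary, Shell keeps a minimum interval between two executions of the command, the interval field. Users can set this interval to suit their case; for a command whose result changes quickly, the interval can be made small.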
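The two ideas in this paragraph, the pair of abstract hooks and the minimum interval between executions, can be sketched in a few lines. This is a simplified illustration, not the Hadoop class: IntervalCommand, EchoCommand, and runCount() are hypothetical names, stderr is merged into stdout for brevity, and the example assumes a Unix-like system where sh is on the PATH.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

/** Hypothetical, simplified take on the Shell idea: two hooks plus a minimum run interval. */
abstract class IntervalCommand {
    private final long intervalMillis;
    private long lastRunMillis = -1; // -1 means the command has never run
    private int runs;                // number of real executions, for illustration

    IntervalCommand(long intervalMillis) {
        this.intervalMillis = intervalMillis;
    }

    /** Hook 1: the command and its arguments. */
    protected abstract String[] getExecString();

    /** Hook 2: how to interpret the command's output. */
    protected abstract void parseExecResult(BufferedReader out) throws IOException;

    /** Runs the command unless the previous run is still within the interval. */
    final void run() throws IOException {
        long now = System.currentTimeMillis();
        if (lastRunMillis >= 0 && now - lastRunMillis < intervalMillis) {
            return; // cached result is considered fresh enough
        }
        ProcessBuilder builder = new ProcessBuilder(getExecString());
        builder.redirectErrorStream(true); // simplification: merge stderr into stdout
        Process process = builder.start();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            parseExecResult(reader);
            while (reader.readLine() != null) { /* drain whatever the parser left */ }
            int exitCode = process.waitFor();
            if (exitCode != 0) {
                throw new IOException("command exited with code " + exitCode);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IOException(e);
        } finally {
            process.destroy();
            lastRunMillis = System.currentTimeMillis();
            runs++;
        }
    }

    int runCount() {
        return runs;
    }
}

/** Example subclass: runs "echo 42" through sh and keeps the first output line. */
final class EchoCommand extends IntervalCommand {
    private String value;

    EchoCommand(long intervalMillis) {
        super(intervalMillis);
    }

    @Override
    protected String[] getExecString() {
        return new String[] {"sh", "-c", "echo 42"};
    }

    @Override
    protected void parseExecResult(BufferedReader out) throws IOException {
        value = out.readLine();
    }

    String value() throws IOException {
        run(); // re-executes only when the interval has elapsed
        return value;
    }
}
```

Calling value() twice on new EchoCommand(60_000) launches the process once; with an interval of 0 every call launches it again, which is the trade-off the interval is meant to control.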

  /**
   * Uses interval and lastTime to decide whether the command has to run again.
   * If the last run is still within the interval, returns at once; otherwise
   * resets exitCode to 0 and executes the command.
   */
  protected void run() throws IOException {
    if (lastTime + interval > System.currentTimeMillis())
      return;
    exitCode = 0; // reset for next run
    runCommand();
  }

  /** Executes the command once. */
  private void runCommand() throws IOException {
    // build a ProcessBuilder from the command name and its arguments
    ProcessBuilder builder = new ProcessBuilder(getExecString());
    boolean completed = false; // whether the command ran to completion

    if (environment != null) {
      builder.environment().putAll(this.environment); // set the command's environment
    }
    if (dir != null) {
      builder.directory(this.dir); // set the command's working directory
    }

    // start the child process; process is the handle used to manage it
    process = builder.start();
    // stderr of the child: if the command is invalid or fails, its error messages are read from here
    final BufferedReader errReader =
        new BufferedReader(new InputStreamReader(process.getErrorStream()));
    // stdout of the child: the command's result is read from here
    BufferedReader inReader =
        new BufferedReader(new InputStreamReader(process.getInputStream()));
    // buffer that collects the error messages
    final StringBuffer errMsg = new StringBuffer();

    // thread that drains the error stream so the child cannot block on a full stderr buffer
    Thread errThread = new Thread() {
      @Override
      public void run() {
        try {
          String line = errReader.readLine();
          while ((line != null) && !isInterrupted()) {
            errMsg.append(line);
            errMsg.append(System.getProperty("line.separator"));
            line = errReader.readLine();
          }
        } catch (IOException ioe) {
          LOG.warn("Error reading the error stream", ioe);
        }
      }
    };
    try {
      errThread.start(); // start draining the error stream
    } catch (IllegalStateException ise) { }

    try {
      parseExecResult(inReader); // parse the command's output
      // clear the input stream buffer
      String line = inReader.readLine();
      while (line != null) {
        line = inReader.readLine();
      }
      // wait for the child process to finish and record its exit code
      exitCode = process.waitFor();
      try {
        // wait for the error-draining thread to finish
        errThread.join();
      } catch (InterruptedException ie) {
        LOG.warn("Interrupted while reading the error stream", ie);
      }
      completed = true; // the command has completed
      if (exitCode != 0) {
        throw new ExitCodeException(exitCode, errMsg.toString());
      }
    } catch (InterruptedException ie) {
      throw new IOException(ie.toString());
    } finally {
      // close the input stream
      try {
        inReader.close();
      } catch (IOException ioe) {
        LOG.warn("Error while closing the input stream", ioe);
      }
      if (!completed) {
        errThread.interrupt();
      }
      try {
        errReader.close();
      } catch (IOException ioe) {
        LOG.warn("Error while closing the error stream", ioe);
      }
      process.destroy(); // terminate the child process
      lastTime = System.currentTimeMillis(); // record when the command last ran
    }
  }
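The part of runCommand() that is easiest to get wrong is the handling of the two output streams: stderr is drained on its own thread while the caller reads stdout, so the child can never stall on a full pipe. The following is a minimal standalone sketch of that pattern, not the Hadoop code. CommandResult and StderrDrainDemo are hypothetical names, and the usage shown assumes sh is available.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

/** Hypothetical value holder for the outcome of one command. */
final class CommandResult {
    final int exitCode;
    final String stdout;
    final String stderr;

    CommandResult(int exitCode, String stdout, String stderr) {
        this.exitCode = exitCode;
        this.stdout = stdout;
        this.stderr = stderr;
    }
}

final class StderrDrainDemo {
    private StderrDrainDemo() {}

    /** Runs a command, reading stdout on the caller's thread and stderr on a helper thread. */
    static CommandResult exec(String... command) throws IOException, InterruptedException {
        final Process process = new ProcessBuilder(command).start();
        final StringBuilder err = new StringBuilder();
        Thread errThread = new Thread(() -> drain(process.getErrorStream(), err));
        errThread.start();
        try {
            StringBuilder out = new StringBuilder();
            drain(process.getInputStream(), out); // returns when the child closes stdout
            int exitCode = process.waitFor();
            errThread.join(); // after join() it is safe to read what the helper collected
            return new CommandResult(exitCode, out.toString(), err.toString());
        } finally {
            process.destroy(); // no-op if the child has already exited
        }
    }

    /** Copies a stream line by line into the sink; read errors end the copy (best effort). */
    private static void drain(InputStream in, StringBuilder sink) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sink.append(line).append('\n');
            }
        } catch (IOException e) {
            // the stream was closed early, for example because the process was destroyed
        }
    }
}
```

For example, StderrDrainDemo.exec("sh", "-c", "echo out; echo err 1>&2; exit 3") yields exit code 3 with "out" captured from stdout and "err" from stderr.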

2.DF

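private String  dirPath;    // working directory in which df is executed
private String filesystem;  // disk device name
private long capacity;      // total capacity of the disk
private long used;          // space used on the disk
private long available;     // space available on the disk
private int percentUsed;    // disk utilization in percent
private String mount;       // mount point of the disk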

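In practice, DF reports the space status of the disk that holds the path dirPath. The corresponding Unix shell command has the form: df -k path. DF refreshes every 3000 ms by default; the interval can also be set in the DataNode's configuration file through the item dfs.df.interval.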

  /* Builds the df shell command. */
  protected String[] getExecString() {
    // ignoring the error since the exit code is enough
    return new String[] {"bash", "-c", "exec 'df' '-k' '" + dirPath + "' 2>/dev/null"};
  }

  /* Parses the output returned by df. */
  protected void parseExecResult(BufferedReader lines) throws IOException {
    lines.readLine();                         // skip headings

    String line = lines.readLine();
    if (line == null) {
      throw new IOException("Expecting a line not the end of stream");
    }
    StringTokenizer tokens = new StringTokenizer(line, " \t\n\r\f%");

    this.filesystem = tokens.nextToken();
    if (!tokens.hasMoreTokens()) {            // for long filesystem name
      line = lines.readLine();
      if (line == null) {
        throw new IOException("Expecting a line not the end of stream");
      }
      tokens = new StringTokenizer(line, " \t\n\r\f%");
    }

    // update the recorded disk space figures (df -k reports 1 KB units)
    this.capacity = Long.parseLong(tokens.nextToken()) * 1024;
    this.used = Long.parseLong(tokens.nextToken()) * 1024;
    this.available = Long.parseLong(tokens.nextToken()) * 1024;
    this.percentUsed = Integer.parseInt(tokens.nextToken());
    this.mount = tokens.nextToken();
  }
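To see what the parser above deals with, including the case where a long file system name pushes the numbers onto the next line, here is a standalone sketch that applies the same tokenizing approach to a string. DfReport is a hypothetical class and the sample outputs in the usage note are made up for illustration. Like the original, the sketch does not handle mount points that contain spaces.

```java
import java.util.StringTokenizer;

/** Hypothetical value class holding the fields parsed from "df -k" output. */
final class DfReport {
    final String filesystem;
    final long capacity;    // bytes
    final long used;        // bytes
    final long available;   // bytes
    final int percentUsed;
    final String mount;

    private DfReport(String filesystem, long capacity, long used,
                     long available, int percentUsed, String mount) {
        this.filesystem = filesystem;
        this.capacity = capacity;
        this.used = used;
        this.available = available;
        this.percentUsed = percentUsed;
        this.mount = mount;
    }

    /** Parses the header line plus one data record, which may wrap onto a second line. */
    static DfReport parse(String dfOutput) {
        String[] lines = dfOutput.split("\\R");
        if (lines.length < 2) {
            throw new IllegalArgumentException("expected a header and a data line");
        }
        // '%' is a delimiter so that "50%" tokenizes as "50"
        StringTokenizer tokens = new StringTokenizer(lines[1], " \t%");
        String filesystem = tokens.nextToken();
        if (!tokens.hasMoreTokens()) {
            // long file system name: the numbers continue on the next line
            if (lines.length < 3) {
                throw new IllegalArgumentException("record is truncated");
            }
            tokens = new StringTokenizer(lines[2], " \t%");
        }
        long capacity = Long.parseLong(tokens.nextToken()) * 1024;  // KB to bytes
        long used = Long.parseLong(tokens.nextToken()) * 1024;
        long available = Long.parseLong(tokens.nextToken()) * 1024;
        int percentUsed = Integer.parseInt(tokens.nextToken());
        String mount = tokens.nextToken();
        return new DfReport(filesystem, capacity, used, available, percentUsed, mount);
    }
}
```

Given the made-up output "Filesystem 1K-blocks Used Available Use% Mounted on" followed by "/dev/sda1 10485760 5242880 5242880 50% /data", parse() reports a capacity of 10737418240 bytes, 50 percent used, and the mount point /data. The same values come back when the device name sits alone on its own line.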

3.DU

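private String  dirPath;    // path of the directory or file to measure
private AtomicLong used = new AtomicLong();    // disk space currently occupied by the file or directory
private Thread refreshUsed;   // background thread that updates used
private long refreshInterval;   // how often used is updated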

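The DU class wraps the Unix du command and reports how much disk space the file or directory dirPath occupies. In the DataNode's implementation, DU's refresh interval is 0 ms, but DU also starts a background refresh thread internally that updates the value of used periodically, and inside the DataNode that period is fixed at 600000 ms. In short, with the background thread running, used is refreshed every 600000 ms; without it, the interval is 0 ms, so du is run on every query. Note that the cost of running du on a storage directory depends on the total number of files under it. If a DataNode is configured with several storage paths and du takes on the order of 100 s for each directory, the DataNode may well fail to send its heartbeats to the NameNode, and the NameNode will wrongly conclude that the DataNode is dead. In that case the way out is to lengthen the DataNode expiry time heartbeatExpireInterval on the NameNode. This value is derived from heartbeatInterval and heartbeatRecheckInterval:
heartbeatExpireInterval = 2*heartbeatRecheckInterval + 10*heartbeatInterval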

Both heartbeatInterval and heartbeatRecheckInterval can be set in the Hadoop configuration files; the corresponding items are dfs.heartbeat.interval and heartbeat.recheck.interval.
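As a worked example of the formula, the sketch below plugs in sample values. The units are an assumption on my part: dfs.heartbeat.interval is taken to be in seconds and heartbeat.recheck.interval in milliseconds, and the values 3 s and 300000 ms are the defaults commonly cited for Hadoop versions of this era, so check them against your own release. HeartbeatExpiry is a hypothetical helper name.

```java
/** Hypothetical helper that evaluates the expiry formula from the text. */
final class HeartbeatExpiry {
    private HeartbeatExpiry() {}

    /**
     * heartbeatExpireInterval = 2 * heartbeatRecheckInterval + 10 * heartbeatInterval,
     * with both terms expressed in milliseconds.
     *
     * @param recheckIntervalMillis    assumed unit of heartbeat.recheck.interval
     * @param heartbeatIntervalSeconds assumed unit of dfs.heartbeat.interval
     */
    static long expireIntervalMillis(long recheckIntervalMillis, long heartbeatIntervalSeconds) {
        long heartbeatIntervalMillis = heartbeatIntervalSeconds * 1000L;
        return 2 * recheckIntervalMillis + 10 * heartbeatIntervalMillis;
    }

    public static void main(String[] args) {
        long expiry = expireIntervalMillis(300_000L, 3L); // assumed default values
        System.out.println(expiry + " ms = " + (expiry / 60_000.0) + " min");
    }
}
```

Under these assumptions the expiry time is 2 * 300000 + 10 * 3000 = 630000 ms, or 10.5 minutes. Doubling heartbeat.recheck.interval to 600000 ms would raise it to 1230000 ms, which shows that the recheck interval dominates the result.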

  /* Builds the du shell command. */
  protected String[] getExecString() {
    return new String[] {"du", "-sk", dirPath};
  }

  /* Parses the output of du. */
  protected void parseExecResult(BufferedReader lines) throws IOException {
    String line = lines.readLine();
    if (line == null) {
      throw new IOException("Expecting a line not the end of stream");
    }
    String[] tokens = line.split("\t");
    if (tokens.length == 0) {
      throw new IOException("Illegal du output");
    }
    /* update the recorded value (du -sk reports 1 KB units) */
    this.used.set(Long.parseLong(tokens[0]) * 1024);
  }

  /* Returns the disk space occupied by the file or directory. */
  public long getUsed() throws IOException {
    // if the updating thread isn't started, update on demand
    if (refreshUsed == null) {
      run();
    } else {
      synchronized (DU.this) {
        // if an exception was thrown in the last run, rethrow
        if (duException != null) {
          IOException tmp = duException;
          duException = null;
          throw tmp;
        }
      }
    }

    return used.longValue();
  }
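The two modes of getUsed(), measuring on demand when no background thread exists and returning the cached value when one does, can be sketched independently of du. In the illustration below the measurement is passed in as a LongSupplier so that no external command is needed. BackgroundUsage and its method names are hypothetical, and the error propagation through duException is left out.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical sketch of a usage value that is refreshed on demand or in the background. */
final class BackgroundUsage {
    private final AtomicLong used = new AtomicLong();
    private final LongSupplier measure;        // stands in for running "du -sk"
    private final long refreshIntervalMillis;
    private volatile Thread refresher;         // null while no background thread runs

    BackgroundUsage(LongSupplier measure, long refreshIntervalMillis) {
        this.measure = measure;
        this.refreshIntervalMillis = refreshIntervalMillis;
    }

    /** Takes an initial measurement, then starts the periodic refresh thread. */
    synchronized void start() {
        if (refresher != null) {
            return; // already running
        }
        used.set(measure.getAsLong());
        Thread thread = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    Thread.sleep(refreshIntervalMillis);
                    used.set(measure.getAsLong());
                } catch (InterruptedException e) {
                    return; // shutdown() was called
                }
            }
        }, "usage-refresher");
        thread.setDaemon(true); // must not keep the JVM alive
        refresher = thread;
        thread.start();
    }

    /** Stops the background thread; later calls to getUsed() measure on demand again. */
    synchronized void shutdown() {
        if (refresher != null) {
            refresher.interrupt();
            refresher = null;
        }
    }

    /** Cached value when the refresher runs, otherwise a fresh measurement. */
    long getUsed() {
        if (refresher == null) {
            used.set(measure.getAsLong());
        }
        return used.get();
    }
}
```

The design choice mirrors the text: with the refresher running, callers such as a heartbeat path never wait for a slow measurement, at the price of a value that can be up to one refresh interval old.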

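The focus of this article was meant to be how a DataNode obtains its storage space information while managing data blocks. In fact, though, the Shell class matters more, because it abstracts the process of running a shell command from Java into a reusable execution model. Whenever another application needs to run a Unix shell command from Java, it can adopt this Shell class and simply implement its two abstract methods.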