Hadoop 2.6 OIV Source Code Analysis

Source: Internet · Edited by: 程序博客网 · Time: 2024/05/29 11:47


First, let's see how the official documentation introduces OIV:

The Offline Image Viewer is a tool to dump the contents of hdfs fsimage files to a human-readable format and provide read-only WebHDFS API in order to allow offline analysis and examination of an Hadoop cluster’s namespace. The tool is able to process very large image files relatively quickly. The tool handles the layout formats that were included with Hadoop versions 2.4 and up. If you want to handle older layout formats, you can use the Offline Image Viewer of Hadoop 2.3 or oiv_legacy Command. If the tool is not able to process an image file, it will exit cleanly. The Offline Image Viewer does not require a Hadoop cluster to be running; it is entirely offline in its operation.

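So Hadoop 2.6.0 ships two implementations in order to stay compatible with the fsimage format of earlier Hadoop versions. The hdfs script shows that the two commands invoke different main classes: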

elif [ "$COMMAND" = "oiv" ] ; then
  CLASS=org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageViewerPB
elif [ "$COMMAND" = "oiv_legacy" ] ; then
  CLASS=org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageViewer

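oiv invokes OfflineImageViewerPB, and oiv_legacy invokes OfflineImageViewer.
Let's start with OfflineImageViewerPB. The conclusion first: for features that need neither interaction nor the INode directory tree, it is enough to call FSImageUtil.loadSummary(file) to obtain the 10 sections and then parse them. The sections are parsed in a fixed order, because a section such as STRING_TABLE holds the data needed to restore the permission information inside each INode. For what each of the 10 sections contains, see here. Interactive features such as the web service instead use FSImageLoader.load(inputFile) to keep the directory tree and the per-INode details in memory.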

Below I pick two of them, XML and WEB, and walk through each.

OfflineImageViewerPB

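This is the main class behind OIV. It parses the command-line arguments and calls the matching logic. Its most important method, run(), looks like this: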

public static int run(String[] args) throws Exception {
  ...
  String inputFile = cmd.getOptionValue("i");
  String processor = cmd.getOptionValue("p", "Web");
  String outputFile = cmd.getOptionValue("o", "-");
  ...
  Configuration conf = new Configuration();
  try {
    if (processor.equals("FileDistribution")) {
      long maxSize = Long.parseLong(cmd.getOptionValue("maxSize", "0"));
      int step = Integer.parseInt(cmd.getOptionValue("step", "0"));
      new FileDistributionCalculator(conf, maxSize, step, out)
          .visit(new RandomAccessFile(inputFile, "r"));
    } else if (processor.equals("XML")) {
      new PBImageXmlWriter(conf, out).visit(new RandomAccessFile(inputFile,
          "r"));
    } else if (processor.equals("ReverseXML")) {
      try {
        OfflineImageReconstructor.run(inputFile, outputFile);
      } catch (Exception e) {
        System.err.println("OfflineImageReconstructor failed: " +
            e.getMessage());
        e.printStackTrace(System.err);
        System.exit(1);
      }
    } else if (processor.equals("Web")) {
      String addr = cmd.getOptionValue("addr", "localhost:5978");
      WebImageViewer viewer = new WebImageViewer(NetUtils.createSocketAddr
              (addr));
      try {
        viewer.start(inputFile);
      } finally {
        viewer.close();
      }
    } else if (processor.equals("Delimited")) {
      try (PBImageDelimitedTextWriter writer =
          new PBImageDelimitedTextWriter(
              new PrintStream(new WriterOutputStream(out)), delimiter, tempPath)) {
        writer.visit(new RandomAccessFile(inputFile, "r"));
      }
    } else {
      System.err.println("Invalid processor specified : " + processor);
      printUsage();
      return -1;
    }
    return 0;
  } catch (EOFException e) {
    System.err.println("Input file ended unexpectedly. Exiting");
  } catch (IOException e) {
    System.err.println("Encountered exception.  Exiting: " + e.getMessage());
  } finally {
    IOUtils.cleanup(null, out);
  }
  return -1;
}

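As the code shows, OIV supports five processors here: FileDistribution, XML, ReverseXML, Web and Delimited. run() itself is simple: it parses the command line and dispatches to the matching class.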

XML

Converting the contents of an FSImage file to XML is done by the PBImageXmlWriter class. Its most important method is visit():

public void visit(RandomAccessFile file) throws IOException {
  if (!FSImageUtil.checkFileFormat(file)) {
    throw new IOException("Unrecognized FSImage");
  }
  FileSummary summary = FSImageUtil.loadSummary(file);
  FileInputStream fin = null;
  try {
    fin = new FileInputStream(file.getFD());
    out.print("<?xml version=\"1.0\"?>\n<fsimage>");
    out.print("<version>");
    o("layoutVersion", summary.getLayoutVersion());
    o("onDiskVersion", summary.getOndiskVersion());
    // Output the version of OIV (which is not necessarily the version of
    // the fsimage file).  This could be helpful in the case where a bug
    // in OIV leads to information loss in the XML-- we can quickly tell
    // if a specific fsimage XML file is affected by this bug.
    o("oivRevision", VersionInfo.getRevision());
    out.print("</version>\n");
    ArrayList<FileSummary.Section> sections = Lists.newArrayList(summary
        .getSectionsList());
    Collections.sort(sections, new Comparator<FileSummary.Section>() {
      @Override
      public int compare(FileSummary.Section s1, FileSummary.Section s2) {
        SectionName n1 = SectionName.fromString(s1.getName());
        SectionName n2 = SectionName.fromString(s2.getName());
        if (n1 == null) {
          return n2 == null ? 0 : -1;
        } else if (n2 == null) {
          return -1;
        } else {
          return n1.ordinal() - n2.ordinal();
        }
      }
    });
    for (FileSummary.Section s : sections) {
      fin.getChannel().position(s.getOffset());
      InputStream is = FSImageUtil.wrapInputStreamForCompression(conf,
          summary.getCodec(), new BufferedInputStream(new LimitInputStream(
              fin, s.getLength())));
      switch (SectionName.fromString(s.getName())) {
      case NS_INFO:
        dumpNameSection(is);
        break;
      case STRING_TABLE:
        loadStringTable(is);
        break;
      case INODE:
        dumpINodeSection(is);
        break;
      case INODE_REFERENCE:
        dumpINodeReferenceSection(is);
        break;
      case INODE_DIR:
        dumpINodeDirectorySection(is);
        break;
      case FILES_UNDERCONSTRUCTION:
        dumpFileUnderConstructionSection(is);
        break;
      case SNAPSHOT:
        dumpSnapshotSection(is);
        break;
      case SNAPSHOT_DIFF:
        dumpSnapshotDiffSection(is);
        break;
      case SECRET_MANAGER:
        dumpSecretManagerSection(is);
        break;
      case CACHE_MANAGER:
        dumpCacheManagerSection(is);
        break;
      default:
        break;
      }
    }
    out.print("</fsimage>\n");
  } finally {
    IOUtils.cleanup(null, fin);
  }
}

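The first step calls FSImageUtil.checkFileFormat() to verify that the file is a valid FSImage, and FSImageUtil.loadSummary(file) then returns the 10 sections. Next the sections are sorted. This sort matters for the result, because the sections depend on one another: put simply, STRING_TABLE stores a lookup table that works like a map, and the meaning of the permission field on each INode can only be resolved through it.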
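The ordering idea can be tried out in isolation. The following is only a minimal sketch, not Hadoop code: SectionOrder and its nested enum SectionKind are hypothetical stand-ins for the section list and SectionName, and unknown section names are assumed to sort first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SectionOrder {
  // Hypothetical stand-in for SectionName: the declaration order of the
  // constants is the order in which sections should be parsed.
  enum SectionKind {
    NS_INFO, STRING_TABLE, INODE, INODE_DIR;

    // Returns null for a name this enum does not know.
    static SectionKind fromString(String name) {
      for (SectionKind k : values()) {
        if (k.name().equals(name)) {
          return k;
        }
      }
      return null;
    }
  }

  // Returns a copy of the names sorted by enum ordinal; unknown names first.
  static List<String> sortByKind(List<String> names) {
    List<String> sorted = new ArrayList<>(names);
    sorted.sort((a, b) -> {
      SectionKind ka = SectionKind.fromString(a);
      SectionKind kb = SectionKind.fromString(b);
      if (ka == null) {
        return kb == null ? 0 : -1;
      }
      if (kb == null) {
        return 1;
      }
      return Integer.compare(ka.ordinal(), kb.ordinal());
    });
    return sorted;
  }

  public static void main(String[] args) {
    // Whatever order the file lists them in, STRING_TABLE ends up before INODE.
    System.out.println(sortByKind(
        Arrays.asList("INODE_DIR", "INODE", "STRING_TABLE", "NS_INFO")));
  }
}
```

With this ordering, a parser that walks the sorted list has always loaded the string table before it reaches the first inode.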
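Clearly FSImageUtil is an important class. It does the first pass over the FSImage file and hands the result to the feature-specific classes, which refine it further.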

public final class FSImageUtil {
  public static final byte[] MAGIC_HEADER = "HDFSIMG1".getBytes();
  public static final int FILE_VERSION = 1;

  public static boolean checkFileFormat(RandomAccessFile file)
      throws IOException {
    if (file.length() < Loader.MINIMUM_FILE_LENGTH)
      return false;
    byte[] magic = new byte[MAGIC_HEADER.length];
    file.readFully(magic);
    if (!Arrays.equals(MAGIC_HEADER, magic))
      return false;
    return true;
  }

  public static FileSummary loadSummary(RandomAccessFile file)
      throws IOException {
    final int FILE_LENGTH_FIELD_SIZE = 4;
    long fileLength = file.length();
    file.seek(fileLength - FILE_LENGTH_FIELD_SIZE);
    int summaryLength = file.readInt();
    if (summaryLength <= 0) {
      throw new IOException("Negative length of the file");
    }
    file.seek(fileLength - FILE_LENGTH_FIELD_SIZE - summaryLength);
    byte[] summaryBytes = new byte[summaryLength];
    file.readFully(summaryBytes);
    FileSummary summary = FileSummary
        .parseDelimitedFrom(new ByteArrayInputStream(summaryBytes));
    if (summary.getOndiskVersion() != FILE_VERSION) {
      throw new IOException("Unsupported file version "
          + summary.getOndiskVersion());
    }
    if (!NameNodeLayoutVersion.supports(Feature.PROTOBUF_FORMAT,
        summary.getLayoutVersion())) {
      throw new IOException("Unsupported layout version "
          + summary.getLayoutVersion());
    }
    return summary;
  }

  public static InputStream wrapInputStreamForCompression(
      Configuration conf, String codec, InputStream in) throws IOException {
    if (codec.isEmpty())
      return in;
    FSImageCompression compression = FSImageCompression.createCompression(
        conf, codec);
    CompressionCodec imageCodec = compression.getImageCodec();
    return imageCodec.createInputStream(in);
  }
}

This class exposes three static methods: checkFileFormat(), loadSummary() and wrapInputStreamForCompression().
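checkFileFormat() compares the file header with the magic bytes to decide whether the file is an FSImage that can be parsed. wrapInputStreamForCompression() looks at the codec name passed in (the callers take it from the summary); if it is empty the stream is returned unchanged, otherwise the matching codec is created from the configuration and wraps the input stream. loadSummary() reads the length field stored in the last 4 bytes of the file, seeks back to the bytes that hold the summary, and parses them. I won't go into the parsing itself.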
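The "read the footer first" layout that loadSummary() relies on can be sketched without protobuf. This is a minimal sketch under stated assumptions: TrailerDemo and its methods are hypothetical names, the summary is an opaque byte string rather than a FileSummary message, and the file layout is simply magic, body, summary, then a 4-byte big-endian length.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class TrailerDemo {
  static final byte[] MAGIC = "HDFSIMG1".getBytes(StandardCharsets.US_ASCII);
  static final int LENGTH_FIELD_SIZE = 4;

  // Writes: magic | body | summary | int(summary.length)
  static void write(File f, byte[] body, byte[] summary) throws IOException {
    try (RandomAccessFile out = new RandomAccessFile(f, "rw")) {
      out.setLength(0);
      out.write(MAGIC);
      out.write(body);
      out.write(summary);
      out.writeInt(summary.length);
    }
  }

  // Compares the first bytes of the file with the magic header.
  static boolean hasMagic(RandomAccessFile in) throws IOException {
    if (in.length() < MAGIC.length + LENGTH_FIELD_SIZE) {
      return false;
    }
    byte[] head = new byte[MAGIC.length];
    in.seek(0);
    in.readFully(head);
    return Arrays.equals(MAGIC, head);
  }

  // Reads the length from the last 4 bytes, then seeks back to the summary.
  static byte[] readSummary(RandomAccessFile in) throws IOException {
    long fileLength = in.length();
    in.seek(fileLength - LENGTH_FIELD_SIZE);
    int summaryLength = in.readInt();
    if (summaryLength <= 0
        || summaryLength > fileLength - LENGTH_FIELD_SIZE - MAGIC.length) {
      throw new IOException("Bad summary length " + summaryLength);
    }
    in.seek(fileLength - LENGTH_FIELD_SIZE - summaryLength);
    byte[] summary = new byte[summaryLength];
    in.readFully(summary);
    return summary;
  }

  // Writes a temporary file and reads the summary back from it.
  static String roundTrip(String summary) {
    try {
      File f = File.createTempFile("trailer-demo", ".img");
      try {
        write(f, new byte[] {1, 2, 3},
            summary.getBytes(StandardCharsets.US_ASCII));
        try (RandomAccessFile in = new RandomAccessFile(f, "r")) {
          if (!hasMagic(in)) {
            throw new IOException("Missing magic header");
          }
          return new String(readSummary(in), StandardCharsets.US_ASCII);
        }
      } finally {
        f.delete();
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(roundTrip("summary"));
  }
}
```

Storing the length at the very end lets the writer stream all sections out first and append the index afterwards, while a reader can still locate the index with two seeks.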
That is roughly the whole XML flow. Next, WEB.

WEB

The source of WebImageViewer is as follows:

public class WebImageViewer implements Closeable {
  ...
  public void start(String fsimage) throws IOException {
    try {
      initServer(fsimage);
      channel.closeFuture().await();
    } catch (InterruptedException e) {
      close();
    }
  }

  @VisibleForTesting
  public void initServer(String fsimage)
          throws IOException, InterruptedException {
    final FSImageLoader loader = FSImageLoader.load(fsimage);
    bootstrap.childHandler(new ChannelInitializer<SocketChannel>() {
      @Override
      protected void initChannel(SocketChannel ch) throws Exception {
        ChannelPipeline p = ch.pipeline();
        p.addLast(new HttpRequestDecoder(),
          new StringEncoder(),
          new HttpResponseEncoder(),
          new FSImageHandler(loader, allChannels));
      }
    });
    channel = bootstrap.bind(address).sync().channel();
    allChannels.add(channel);
    address = (InetSocketAddress) channel.localAddress();
  }
}

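Its main job is to open a server channel on the given address and call FSImageLoader.load(fsimage), which keeps the INode information and the directory tree from the FSImage file in memory and hands them to FSImageHandler to answer requests. The key point is that FSImageLoader holds everything the interactive queries need in memory, so let's look at how the class is implemented.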

class FSImageLoader {
  ...
  private FSImageLoader(String[] stringTable, byte[][] inodes,
                        Map<Long, long[]> dirmap) {
    this.stringTable = stringTable;
    this.inodes = inodes;
    this.dirmap = dirmap;
  }

  static FSImageLoader load(String inputFile) throws IOException {
    Configuration conf = new Configuration();
    RandomAccessFile file = new RandomAccessFile(inputFile, "r");
    FsImageProto.FileSummary summary = FSImageUtil.loadSummary(file);
    FileInputStream fin = null;
    try {
      // Map to record INodeReference to the referred id
      ImmutableList<Long> refIdList = null;
      String[] stringTable = null;
      byte[][] inodes = null;
      Map<Long, long[]> dirmap = null;
      fin = new FileInputStream(file.getFD());
      ArrayList<FsImageProto.FileSummary.Section> sections =
          Lists.newArrayList(summary.getSectionsList());
      Collections.sort(sections,
          new Comparator<FsImageProto.FileSummary.Section>() {
            @Override
            public int compare(FsImageProto.FileSummary.Section s1,
                               FsImageProto.FileSummary.Section s2) {
              FSImageFormatProtobuf.SectionName n1 =
                  FSImageFormatProtobuf.SectionName.fromString(s1.getName());
              FSImageFormatProtobuf.SectionName n2 =
                  FSImageFormatProtobuf.SectionName.fromString(s2.getName());
              if (n1 == null) {
                return n2 == null ? 0 : -1;
              } else if (n2 == null) {
                return -1;
              } else {
                return n1.ordinal() - n2.ordinal();
              }
            }
          });
      for (FsImageProto.FileSummary.Section s : sections) {
        fin.getChannel().position(s.getOffset());
        InputStream is = FSImageUtil.wrapInputStreamForCompression(conf,
            summary.getCodec(), new BufferedInputStream(new LimitInputStream(
            fin, s.getLength())));
        switch (FSImageFormatProtobuf.SectionName.fromString(s.getName())) {
          case STRING_TABLE:
            stringTable = loadStringTable(is);
            break;
          case INODE:
            inodes = loadINodeSection(is);
            break;
          case INODE_REFERENCE:
            refIdList = loadINodeReferenceSection(is);
            break;
          case INODE_DIR:
            dirmap = loadINodeDirectorySection(is, refIdList);
            break;
          default:
            break;
        }
      }
      return new FSImageLoader(stringTable, inodes, dirmap);
    } finally {
      IOUtils.cleanup(null, fin);
    }
  }

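It relies on FSImageUtil.loadSummary(file) to obtain the summary and then parses four sections: STRING_TABLE, INODE, INODE_REFERENCE and INODE_DIR. stringTable holds the mapping used to restore the permission details of each inode; inodes is a two-dimensional byte array holding the serialized INodes; dirmap is a map whose key is the INode ID of a parent directory and whose value is an array of its child INode IDs. Let's expand on loadINodeSection: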

private static byte[][] loadINodeSection(InputStream in)
        throws IOException {
  FsImageProto.INodeSection s = FsImageProto.INodeSection
      .parseDelimitedFrom(in);
  final byte[][] inodes = new byte[(int) s.getNumInodes()][];
  for (int i = 0; i < s.getNumInodes(); ++i) {
    int size = CodedInputStream.readRawVarint32(in.read(), in);
    byte[] bytes = new byte[size];
    IOUtils.readFully(in, bytes, 0, size);
    inodes[i] = bytes;
  }
  Arrays.sort(inodes, INODE_BYTES_COMPARATOR);
  return inodes;
}

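FsImageProto.INodeSection.parseDelimitedFrom(in) parses the header of the INode section, which gives the total number of INodes. For each INode, CodedInputStream.readRawVarint32(in.read(), in) then returns the number of bytes that INode occupies, and that many bytes are read straight into a byte array. The INodes stay serialized in memory; at the end the array is sorted with INODE_BYTES_COMPARATOR.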
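The record framing used here (a varint length followed by that many bytes) can be reproduced with the JDK alone. The sketch below is not the Hadoop implementation: VarintRecords is a hypothetical class, the varint codec is hand-rolled instead of using protobuf's CodedInputStream so that the example has no dependencies, and the payloads are plain strings rather than serialized INodes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class VarintRecords {
  // Encodes an int as a base-128 varint: 7 bits per byte, low group first,
  // high bit set on every byte except the last.
  static void writeVarint32(ByteArrayOutputStream out, int value) {
    while ((value & ~0x7F) != 0) {
      out.write((value & 0x7F) | 0x80);
      value >>>= 7;
    }
    out.write(value);
  }

  // Decodes a varint of at most 5 bytes.
  static int readVarint32(InputStream in) throws IOException {
    int result = 0;
    for (int shift = 0; shift < 32; shift += 7) {
      int b = in.read();
      if (b < 0) {
        throw new EOFException("Stream ended inside a varint");
      }
      result |= (b & 0x7F) << shift;
      if ((b & 0x80) == 0) {
        return result;
      }
    }
    throw new IOException("Varint is longer than 5 bytes");
  }

  // Writes each record as: varint(length) | payload bytes.
  static byte[] encode(List<String> records) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (String r : records) {
      byte[] payload = r.getBytes(StandardCharsets.UTF_8);
      writeVarint32(out, payload.length);
      out.write(payload, 0, payload.length);
    }
    return out.toByteArray();
  }

  // Reads exactly `count` length-prefixed records back.
  static List<String> decode(byte[] data, int count) {
    List<String> records = new ArrayList<>();
    try (DataInputStream in =
        new DataInputStream(new ByteArrayInputStream(data))) {
      for (int i = 0; i < count; i++) {
        byte[] payload = new byte[readVarint32(in)];
        in.readFully(payload);
        records.add(new String(payload, StandardCharsets.UTF_8));
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return records;
  }

  public static void main(String[] args) {
    byte[] data = encode(Arrays.asList("inode-1", "inode-2"));
    System.out.println(decode(data, 2));
  }
}
```

Because every record carries its own length, a reader can copy the raw bytes without understanding them and defer the actual parsing until a record is needed, which is the approach loadINodeSection takes.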


OfflineImageViewer

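That was a quick pass over OfflineImageViewerPB, and the flow turns out to be quite simple. Now for OfflineImageViewer. Like OfflineImageViewerPB, it parses the command line and picks the class that does the actual work, but here all the processing classes share one abstract base class, ImageVisitor. Again, a quick look at the source: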

public class OfflineImageViewer {
  private final boolean skipBlocks; // whether to skip processing of data blocks
  private final String inputFile;
  private final ImageVisitor processor;

  public OfflineImageViewer(String inputFile, ImageVisitor processor,
              boolean skipBlocks) {
    this.inputFile = inputFile;
    this.processor = processor;
    this.skipBlocks = skipBlocks;
  }

  /**
   * Process image file.
   */
  public void go() throws IOException  {
    DataInputStream in = null;
    PositionTrackingInputStream tracker = null;
    ImageLoader fsip = null;
    boolean done = false;
    try {
      tracker = new PositionTrackingInputStream(new BufferedInputStream(
               new FileInputStream(new File(inputFile))));
      in = new DataInputStream(tracker);
      int imageVersionFile = findImageVersion(in);
      fsip = ImageLoader.LoaderFactory.getLoader(imageVersionFile);
      if(fsip == null)
        throw new IOException("No image processor to read version " +
            imageVersionFile + " is available.");
      fsip.loadImage(in, processor, skipBlocks);
      done = true;
    } finally {
      if (!done) {
        LOG.error("image loading failed at offset " + tracker.getPos());
      }
      IOUtils.cleanup(LOG, in, tracker);
    }
  }

  private int findImageVersion(DataInputStream in) throws IOException {
    in.mark(42); // arbitrary amount, resetting immediately
    int version = in.readInt();
    in.reset();
    return version;
  }

  public static void main(String[] args) throws IOException {
    ...
    boolean skipBlocks = cmd.hasOption("skipBlocks");
    String inputFile = cmd.getOptionValue("i");
    ...
    ImageVisitor v;
    if(processor.equals("Indented")) {
      v = new IndentedImageVisitor(outputFile, printToScreen);
    } else if (processor.equals("XML")) {
      v = new XmlImageVisitor(outputFile, printToScreen);
    } else {
      v = new LsImageVisitor(outputFile, printToScreen);
      skipBlocks = false;
    }
    OfflineImageViewer d = new OfflineImageViewer(inputFile, v, skipBlocks);
    d.go();
  }
}

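Unlike OfflineImageViewerPB, the processing flow here is unified through the abstract class ImageVisitor and polymorphism, and everything is driven by go(). The idea is simple: findImageVersion() first reads the version of the FSImage, LoaderFactory then returns the ImageLoader implementation that can parse that version, and fsip.loadImage(in, processor, skipBlocks) does the rest.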
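The trick in findImageVersion(), reading the leading int and then rewinding so that the loader can read it again, works on any stream that supports mark/reset. A minimal sketch with hypothetical names (PeekVersion, peekInt, demo) and made-up sample values:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

final class PeekVersion {
  // Reads the leading int without consuming it.
  static int peekInt(DataInputStream in) throws IOException {
    if (!in.markSupported()) {
      throw new IOException("mark/reset not supported by this stream");
    }
    in.mark(4);          // remember the position; 4 bytes of read-ahead suffice
    int value = in.readInt();
    in.reset();          // rewind to the marked position
    return value;
  }

  // Returns {peeked value, first int read, second int read}.
  static int[] demo() {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      DataOutputStream writer = new DataOutputStream(bytes);
      writer.writeInt(-47);    // sample layout version (made up)
      writer.writeInt(12345);  // sample field that follows it (made up)
      writer.flush();
      DataInputStream in = new DataInputStream(new BufferedInputStream(
          new ByteArrayInputStream(bytes.toByteArray())));
      int peeked = peekInt(in);
      return new int[] {peeked, in.readInt(), in.readInt()};
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(demo()));
  }
}
```

The peeked value and the first real read are the same, which is why the loader in the original code can call in.readInt() again for the version after the factory has already looked at it.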
Let's look at the abstract class ImageVisitor first:

abstract class ImageVisitor {
  /**
   * Structural elements of an FSImage that may be encountered within the
   * file. ImageVisitors are able to handle processing any of these elements.
   */
  public enum ImageElement {
    FS_IMAGE, IMAGE_VERSION, NAMESPACE_ID, IS_COMPRESSED, COMPRESS_CODEC,
    LAYOUT_VERSION, NUM_INODES, GENERATION_STAMP, GENERATION_STAMP_V2,
    GENERATION_STAMP_V1_LIMIT, LAST_ALLOCATED_BLOCK_ID, INODES, INODE,
    INODE_PATH, REPLICATION, MODIFICATION_TIME, ACCESS_TIME, BLOCK_SIZE,
    NUM_BLOCKS, BLOCKS, BLOCK, BLOCK_ID, NUM_BYTES, NS_QUOTA, DS_QUOTA,
    PERMISSIONS, SYMLINK, NUM_INODES_UNDER_CONSTRUCTION,
    INODES_UNDER_CONSTRUCTION, INODE_UNDER_CONSTRUCTION,
    PREFERRED_BLOCK_SIZE, CLIENT_NAME, CLIENT_MACHINE, USER_NAME, GROUP_NAME,
    PERMISSION_STRING, CURRENT_DELEGATION_KEY_ID, NUM_DELEGATION_KEYS,
    DELEGATION_KEYS, DELEGATION_KEY, DELEGATION_TOKEN_SEQUENCE_NUMBER,
    NUM_DELEGATION_TOKENS, DELEGATION_TOKENS, DELEGATION_TOKEN_IDENTIFIER,
    DELEGATION_TOKEN_IDENTIFIER_KIND, DELEGATION_TOKEN_IDENTIFIER_SEQNO,
    DELEGATION_TOKEN_IDENTIFIER_OWNER, DELEGATION_TOKEN_IDENTIFIER_RENEWER,
    DELEGATION_TOKEN_IDENTIFIER_REALUSER,
    DELEGATION_TOKEN_IDENTIFIER_ISSUE_DATE,
    DELEGATION_TOKEN_IDENTIFIER_MAX_DATE,
    DELEGATION_TOKEN_IDENTIFIER_EXPIRY_TIME,
    DELEGATION_TOKEN_IDENTIFIER_MASTER_KEY_ID, TRANSACTION_ID, LAST_INODE_ID,
    INODE_ID, SNAPSHOT_COUNTER, NUM_SNAPSHOTS_TOTAL, NUM_SNAPSHOTS, SNAPSHOTS,
    SNAPSHOT, SNAPSHOT_ID, SNAPSHOT_ROOT, SNAPSHOT_QUOTA,
    NUM_SNAPSHOT_DIR_DIFF, SNAPSHOT_DIR_DIFFS, SNAPSHOT_DIR_DIFF,
    SNAPSHOT_DIFF_SNAPSHOTID, SNAPSHOT_DIR_DIFF_CHILDREN_SIZE,
    SNAPSHOT_INODE_FILE_ATTRIBUTES, SNAPSHOT_INODE_DIRECTORY_ATTRIBUTES,
    SNAPSHOT_DIR_DIFF_CREATEDLIST, SNAPSHOT_DIR_DIFF_CREATEDLIST_SIZE,
    SNAPSHOT_DIR_DIFF_CREATED_INODE, SNAPSHOT_DIR_DIFF_DELETEDLIST,
    SNAPSHOT_DIR_DIFF_DELETEDLIST_SIZE, SNAPSHOT_DIR_DIFF_DELETED_INODE,
    IS_SNAPSHOTTABLE_DIR, IS_WITHSNAPSHOT_DIR, SNAPSHOT_FILE_DIFFS,
    SNAPSHOT_FILE_DIFF, NUM_SNAPSHOT_FILE_DIFF, SNAPSHOT_FILE_SIZE,
    SNAPSHOT_DST_SNAPSHOT_ID, SNAPSHOT_LAST_SNAPSHOT_ID,
    SNAPSHOT_REF_INODE_ID, SNAPSHOT_REF_INODE, CACHE_NEXT_ENTRY_ID,
    CACHE_NUM_POOLS, CACHE_POOL_NAME, CACHE_POOL_OWNER_NAME,
    CACHE_POOL_GROUP_NAME, CACHE_POOL_PERMISSION_STRING, CACHE_POOL_WEIGHT,
    CACHE_NUM_ENTRIES, CACHE_ENTRY_PATH, CACHE_ENTRY_REPLICATION,
    CACHE_ENTRY_POOL_NAME
  }

  /**
   * Begin visiting the fsimage structure.  Opportunity to perform
   * any initialization necessary for the implementing visitor.
   */
  abstract void start() throws IOException;

  /**
   * Finish visiting the fsimage structure.  Opportunity to perform any
   * clean up necessary for the implementing visitor.
   */
  abstract void finish() throws IOException;

  /**
   * Finish visiting the fsimage structure after an error has occurred
   * during the processing.  Opportunity to perform any clean up necessary
   * for the implementing visitor.
   */
  abstract void finishAbnormally() throws IOException;

  /**
   * Visit non enclosing element of fsimage with specified value.
   *
   * @param element FSImage element
   * @param value Element's value
   */
  abstract void visit(ImageElement element, String value) throws IOException;

  // Convenience methods to automatically convert numeric value types to strings
  void visit(ImageElement element, int value) throws IOException {
    visit(element, Integer.toString(value));
  }

  void visit(ImageElement element, long value) throws IOException {
    visit(element, Long.toString(value));
  }

  /**
   * Begin visiting an element that encloses another element, such as
   * the beginning of the list of blocks that comprise a file.
   *
   * @param element Element being visited
   */
  abstract void visitEnclosingElement(ImageElement element)
     throws IOException;

  /**
   * Begin visiting an element that encloses another element, such as
   * the beginning of the list of blocks that comprise a file.
   *
   * Also provide an additional key and value for the element, such as the
   * number items within the element.
   *
   * @param element Element being visited
   * @param key Key describing the element being visited
   * @param value Value associated with element being visited
   */
  abstract void visitEnclosingElement(ImageElement element,
      ImageElement key, String value) throws IOException;

  // Convenience methods to automatically convert value types to strings
  void visitEnclosingElement(ImageElement element,
      ImageElement key, int value)
     throws IOException {
    visitEnclosingElement(element, key, Integer.toString(value));
  }

  void visitEnclosingElement(ImageElement element,
      ImageElement key, long value)
     throws IOException {
    visitEnclosingElement(element, key, Long.toString(value));
  }

  /**
   * Leave current enclosing element.  Called, for instance, at the end of
   * processing the blocks that compromise a file.
   */
  abstract void leaveEnclosingElement() throws IOException;
}

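The first thing you see is a huge enum. According to its comment, it lists the elements an FSImage may contain, and the concrete visitors decide what to do based on these elements. Of the remaining methods:
abstract void start() throws IOException; starts visiting the FSImage
abstract void finish() throws IOException; is called when visiting is complete
abstract void visit(ImageElement element, String value) throws IOException; handles an element that has no child elements
abstract void visitEnclosingElement(ImageElement element) throws IOException; handles an element that contains child elements, for example BLOCKS, whose children are BLOCK
abstract void leaveEnclosingElement() throws IOException; leaves an element that contains child elements
Here is how XmlImageVisitor implements these methods: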

@Override
void start() throws IOException {
  write("<?xml version=\"1.0\" ?>\n");
}

@Override
void finish() throws IOException {
  close();
}

@Override
void visit(ImageElement element, String value) throws IOException {
  writeTag(element.toString(), value);
}

@Override
void visitEnclosingElement(ImageElement element) throws IOException {
  write("<" + element.toString() + ">\n");
  tagQ.push(element);
}

@Override
void visitEnclosingElement(ImageElement element,
    ImageElement key, String value)
     throws IOException {
  write("<" + element.toString() + " " + key + "=\"" + value +"\">\n");
  tagQ.push(element);
}

@Override
void leaveEnclosingElement() throws IOException {
  if(tagQ.size() == 0)
    throw new IOException("Tried to exit non-existent enclosing element " +
              "in FSImage file");
  ImageElement element = tagQ.pop();
  write("</" + element.toString() + ">\n");
}

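When building the XML, an element without children is written out directly as a tag with its name and value. An element with children first writes its opening tag (with the key and value as an attribute, if given) and is then pushed onto a stack; the last-in-first-out order of the stack produces the correctly nested closing tags.
Each concrete ImageVisitor does whatever its own output format requires when entering and leaving an element.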
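The stack discipline described above fits in a few lines. This is a simplified sketch, not XmlImageVisitor itself: TagStackWriter is a hypothetical class, it writes into a StringBuilder instead of a file, and it does not escape special characters in values.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class TagStackWriter {
  private final StringBuilder out = new StringBuilder();
  private final Deque<String> open = new ArrayDeque<>();

  // An element without children: written in one go, never pushed.
  void leaf(String tag, String value) {
    out.append('<').append(tag).append('>').append(value)
        .append("</").append(tag).append(">\n");
  }

  // An element with children: write the opening tag and remember it.
  void enter(String tag) {
    out.append('<').append(tag).append(">\n");
    open.push(tag);
  }

  // Close the most recently opened element (last in, first out).
  void leave() {
    if (open.isEmpty()) {
      throw new IllegalStateException("No enclosing element to leave");
    }
    out.append("</").append(open.pop()).append(">\n");
  }

  String result() {
    return out.toString();
  }

  // Builds a small nested document the way a loader would drive a visitor.
  static String demo() {
    TagStackWriter w = new TagStackWriter();
    w.enter("BLOCKS");
    w.enter("BLOCK");
    w.leaf("BLOCK_ID", "1");
    w.leave(); // closes BLOCK
    w.leave(); // closes BLOCKS
    return w.result();
  }

  public static void main(String[] args) {
    System.out.print(demo());
  }
}
```

The caller never names the element it is leaving; the stack already knows which tag is innermost, which keeps the loader code free of any output-format detail.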

Finally, let's go through ImageLoader and its implementation:

interface ImageLoader {
  public void loadImage(DataInputStream in, ImageVisitor v,
      boolean enumerateBlocks) throws IOException;

  public boolean canLoadVersion(int version);

  @InterfaceAudience.Private
  public class LoaderFactory {
    static public ImageLoader getLoader(int version) {
      ImageLoader[] loaders = { new ImageLoaderCurrent() };
      for (ImageLoader l : loaders) {
        if (l.canLoadVersion(version))
          return l;
      }
      return null;
    }
  }
}

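The interface uses a nested class with a static method as a factory. The idea is to provide a different ImageLoader for each version of the format, although here there is only one concrete implementation, ImageLoaderCurrent.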
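To see what the factory buys once there is more than one loader, here is a small sketch. Everything in it is hypothetical (LoaderRegistry, Loader, RangeLoader, the loader names and the version ranges); it only mirrors the "ask each candidate whether it can load this version" pattern and returns an Optional instead of null.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

final class LoaderRegistry {
  interface Loader {
    boolean canLoad(int version);
    String name();
  }

  // A loader that accepts an inclusive range of (negative) layout versions.
  static final class RangeLoader implements Loader {
    private final String name;
    private final int lowest;
    private final int highest;

    RangeLoader(String name, int lowest, int highest) {
      this.name = name;
      this.lowest = lowest;
      this.highest = highest;
    }

    @Override
    public boolean canLoad(int version) {
      return version >= lowest && version <= highest;
    }

    @Override
    public String name() {
      return name;
    }
  }

  // Returns the first loader that claims the version, if any.
  static Optional<Loader> find(List<Loader> loaders, int version) {
    for (Loader l : loaders) {
      if (l.canLoad(version)) {
        return Optional.of(l);
      }
    }
    return Optional.empty();
  }

  // Made-up registry with two loaders covering adjacent version ranges.
  static List<Loader> sampleLoaders() {
    return Arrays.<Loader>asList(
        new RangeLoader("older-format", -51, -16),
        new RangeLoader("newer-format", -60, -52));
  }

  public static void main(String[] args) {
    System.out.println(find(sampleLoaders(), -47).map(Loader::name));
    System.out.println(find(sampleLoaders(), -5).map(Loader::name));
  }
}
```

Adding support for another format then means adding one entry to the list; the code that calls find() does not change.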
Here is the source of ImageLoaderCurrent:

/**
 * ImageLoaderCurrent processes Hadoop FSImage files and walks over
 * them using a provided ImageVisitor, calling the visitor at each element
 * enumerated below.
 *
 * The only difference between v18 and v19 was the utilization of the
 * stickybit.  Therefore, the same viewer can reader either format.
 *
 * Versions -19 fsimage layout (with changes from -16 up):
 * Image version (int)
 * Namepsace ID (int)
 * NumFiles (long)
 * Generation stamp (long)
 * INodes (count = NumFiles)
 *  INode
 *    Path (String)
 *    Replication (short)
 *    Modification Time (long as date)
 *    Access Time (long) // added in -16
 *    Block size (long)
 *    Num blocks (int)
 *    Blocks (count = Num blocks)
 *      Block
 *        Block ID (long)
 *        Num bytes (long)
 *        Generation stamp (long)
 *    Namespace Quota (long)
 *    Diskspace Quota (long) // added in -18
 *    Permissions
 *      Username (String)
 *      Groupname (String)
 *      OctalPerms (short -> String)  // Modified in -19
 *    Symlink (String) // added in -23
 * NumINodesUnderConstruction (int)
 * INodesUnderConstruction (count = NumINodesUnderConstruction)
 *  INodeUnderConstruction
 *    Path (bytes as string)
 *    Replication (short)
 *    Modification time (long as date)
 *    Preferred block size (long)
 *    Num blocks (int)
 *    Blocks
 *      Block
 *        Block ID (long)
 *        Num bytes (long)
 *        Generation stamp (long)
 *    Permissions
 *      Username (String)
 *      Groupname (String)
 *      OctalPerms (short -> String)
 *    Client Name (String)
 *    Client Machine (String)
 *    NumLocations (int)
 *    DatanodeDescriptors (count = numLocations) // not loaded into memory
 *      short                                    // but still in file
 *      long
 *      string
 *      long
 *      int
 *      string
 *      string
 *      enum
 *    CurrentDelegationKeyId (int)
 *    NumDelegationKeys (int)
 *      DelegationKeys (count = NumDelegationKeys)
 *        DelegationKeyLength (vint)
 *        DelegationKey (bytes)
 *    DelegationTokenSequenceNumber (int)
 *    NumDelegationTokens (int)
 *    DelegationTokens (count = NumDelegationTokens)
 *      DelegationTokenIdentifier
 *        owner (String)
 *        renewer (String)
 *        realUser (String)
 *        issueDate (vlong)
 *        maxDate (vlong)
 *        sequenceNumber (vint)
 *        masterKeyId (vint)
 *      expiryTime (long)
 *
 */
class ImageLoaderCurrent implements ImageLoader {
  protected final DateFormat dateFormat =
                                      new SimpleDateFormat("yyyy-MM-dd HH:mm");
  private static int[] versions = { -16, -17, -18, -19, -20, -21, -22, -23,
      -24, -25, -26, -27, -28, -30, -31, -32, -33, -34, -35, -36, -37, -38, -39,
      -40, -41, -42, -43, -44, -45, -46, -47, -48, -49, -50, -51 };
  private int imageVersion = 0;
  private final Map<Long, Boolean> subtreeMap = new HashMap<Long, Boolean>();
  private final Map<Long, String> dirNodeMap = new HashMap<Long, String>();

  /* (non-Javadoc)
   * @see ImageLoader#canProcessVersion(int)
   */
  @Override
  public boolean canLoadVersion(int version) {
    for(int v : versions)
      if(v == version) return true;
    return false;
  }

  /* (non-Javadoc)
   * @see ImageLoader#processImage(java.io.DataInputStream, ImageVisitor, boolean)
   */
  @Override
  public void loadImage(DataInputStream in, ImageVisitor v,
      boolean skipBlocks) throws IOException {
    boolean done = false;
    try {
      v.start();
      v.visitEnclosingElement(ImageElement.FS_IMAGE);
      imageVersion = in.readInt();
      if( !canLoadVersion(imageVersion))
        throw new IOException("Cannot process fslayout version " + imageVersion);
      if (NameNodeLayoutVersion.supports(Feature.ADD_LAYOUT_FLAGS, imageVersion)) {
        LayoutFlags.read(in);
      }
      v.visit(ImageElement.IMAGE_VERSION, imageVersion);
      v.visit(ImageElement.NAMESPACE_ID, in.readInt());
      long numInodes = in.readLong();
      v.visit(ImageElement.GENERATION_STAMP, in.readLong());
      if (NameNodeLayoutVersion.supports(Feature.SEQUENTIAL_BLOCK_ID, imageVersion)) {
        v.visit(ImageElement.GENERATION_STAMP_V2, in.readLong());
        v.visit(ImageElement.GENERATION_STAMP_V1_LIMIT, in.readLong());
        v.visit(ImageElement.LAST_ALLOCATED_BLOCK_ID, in.readLong());
      }
      if (NameNodeLayoutVersion.supports(Feature.STORED_TXIDS, imageVersion)) {
        v.visit(ImageElement.TRANSACTION_ID, in.readLong());
      }
      if (NameNodeLayoutVersion.supports(Feature.ADD_INODE_ID, imageVersion)) {
        v.visit(ImageElement.LAST_INODE_ID, in.readLong());
      }
      boolean supportSnapshot = NameNodeLayoutVersion.supports(Feature.SNAPSHOT,
          imageVersion);
      if (supportSnapshot) {
        v.visit(ImageElement.SNAPSHOT_COUNTER, in.readInt());
        int numSnapshots = in.readInt();
        v.visit(ImageElement.NUM_SNAPSHOTS_TOTAL, numSnapshots);
        for (int i = 0; i < numSnapshots; i++) {
          processSnapshot(in, v);
        }
      }
      if (NameNodeLayoutVersion.supports(Feature.FSIMAGE_COMPRESSION, imageVersion)) {
        boolean isCompressed = in.readBoolean();
        v.visit(ImageElement.IS_COMPRESSED, String.valueOf(isCompressed));
        if (isCompressed) {
          String codecClassName = Text.readString(in);
          v.visit(ImageElement.COMPRESS_CODEC, codecClassName);
          CompressionCodecFactory codecFac = new CompressionCodecFactory(
              new Configuration());
          CompressionCodec codec = codecFac.getCodecByClassName(codecClassName);
          if (codec == null) {
            throw new IOException("Image compression codec not supported: "
                + codecClassName);
          }
          in = new DataInputStream(codec.createInputStream(in));
        }
      }
      processINodes(in, v, numInodes, skipBlocks, supportSnapshot);
      subtreeMap.clear();
      dirNodeMap.clear();
      processINodesUC(in, v, skipBlocks);
      if (NameNodeLayoutVersion.supports(Feature.DELEGATION_TOKEN, imageVersion)) {
        processDelegationTokens(in, v);
      }
      if (NameNodeLayoutVersion.supports(Feature.CACHING, imageVersion)) {
        processCacheManagerState(in, v);
      }
      v.leaveEnclosingElement(); // FSImage
      done = true;
    } finally {
      if (done) {
        v.finish();
      } else {
        v.finishAbnormally();
      }
    }
  }
}