Source Code Analysis of the Hadoop Read and Write Process


The Read Process

File I/O entry point
Start with the open(Path f, final int bufferSize) method, which belongs to the DistributedFileSystem class. To produce its return value it creates an anonymous FileSystemLinkResolver, implements that class's two abstract methods, and finally calls resolve(). Both doCall() and next() are used inside resolve(), but next() is only invoked when resolve() catches an exception, so we follow doCall(). The open() call inside doCall() takes three parameters: src, the path of the file to open; buffersize, the buffer size; and verifyChecksum, whether to verify checksums.

```java
@Override
public FSDataInputStream open(Path f, final int bufferSize)
    throws IOException {
  statistics.incrementReadOps(1);
  Path absF = fixRelativePart(f);
  return new FileSystemLinkResolver<FSDataInputStream>() {
    @Override
    public FSDataInputStream doCall(final Path p)
        throws IOException, UnresolvedLinkException {
      final DFSInputStream dfsis =
        dfs.open(getPathName(p), bufferSize, verifyChecksum);
      return dfs.createWrappedInputStream(dfsis);
    }
    @Override
    public FSDataInputStream next(final FileSystem fs, final Path p)
        throws IOException {
      return fs.open(p, bufferSize);
    }
  }.resolve(this, absF);
}
```
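
For orientation, here is a minimal client-side sketch of how an application typically reaches this entry point through the public FileSystem API; the path /tmp/example.txt and the 4096 buffer size are just illustrative values:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class HdfsReadExample {
  public static void main(String[] args) throws Exception {
    // Assumes fs.defaultFS points at an HDFS cluster, so FileSystem.get()
    // returns a DistributedFileSystem instance.
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path path = new Path("/tmp/example.txt");  // hypothetical file
    // This open() call is the one analyzed above; 4096 is the bufferSize argument.
    try (FSDataInputStream in = fs.open(path, 4096)) {
      IOUtils.copyBytes(in, System.out, 4096, false);  // copy file contents to stdout
    }
  }
}
```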

The doCall() method invoked through DistributedFileSystem calls the open() method of the DFSClient class, shown below:

```java
// This is the DFSClient class
public DFSInputStream open(String src, int buffersize, boolean verifyChecksum)
    throws IOException, UnresolvedLinkException {
  checkOpen();
  //    Get block info from namenode
  TraceScope scope = getPathTraceScope("newDFSInputStream", src);
  try {
    return new DFSInputStream(this, src, verifyChecksum);
  } finally {
    scope.close();
  }
}
```

The method above mainly performs a state check via checkOpen() and then constructs a DFSInputStream object:

```java
DFSInputStream(DFSClient dfsClient, String src, boolean verifyChecksum
               ) throws IOException, UnresolvedLinkException {
  this.dfsClient = dfsClient;
  this.verifyChecksum = verifyChecksum;
  this.src = src;
  synchronized (infoLock) {
    this.cachingStrategy = dfsClient.getDefaultReadCachingStrategy();
  }
  openInfo();
}
```

Inside its constructor, DFSInputStream calls getDefaultReadCachingStrategy() under the infoLock lock to obtain the default read caching strategy, and then calls its internal openInfo() method:

```java
/**
 * Grab the open-file info from namenode
 */
void openInfo() throws IOException, UnresolvedLinkException {
  synchronized(infoLock) {
    lastBlockBeingWrittenLength = fetchLocatedBlocksAndGetLastBlockLength();
    int retriesForLastBlockLength = dfsClient.getConf().retryTimesForGetLastBlockLength;
    while (retriesForLastBlockLength > 0) {
      // Getting last block length as -1 is a special case. When cluster
      // restarts, DNs may not report immediately. At this time partial block
      // locations will not be available with NN for getting the length. Lets
      // retry for 3 times to get the length.
      if (lastBlockBeingWrittenLength == -1) {
        DFSClient.LOG.warn("Last block locations not available. "
            + "Datanodes might not have reported blocks completely."
            + " Will retry for " + retriesForLastBlockLength + " times");
        waitFor(dfsClient.getConf().retryIntervalForGetLastBlockLength);
        lastBlockBeingWrittenLength = fetchLocatedBlocksAndGetLastBlockLength();
      } else {
        break;
      }
      retriesForLastBlockLength--;
    }
    if (retriesForLastBlockLength == 0) {
      throw new IOException("Could not obtain the last block locations.");
    }
  }
}
```

openInfo() mainly obtains the file's block information from the NameNode. It first calls fetchLocatedBlocksAndGetLastBlockLength() to fetch the located blocks and the length of the last block being written. If that length comes back as -1, which can happen right after a cluster restart when DataNodes have not yet reported their blocks, the while loop waits and retries a limited number of times before throwing an IOException. The block locations themselves are returned by the NameNode; its server-side entry point looks like this:

```java
public LocatedBlocks getBlockLocations(String src,
                                       long offset,
                                       long length) throws IOException {
  return namesystem.getBlockLocations(getClientMachine(),
                                      src, offset, length);
}
```

On the client side, most of the work is done by fetchLocatedBlocksAndGetLastBlockLength(). The name is long but self-explanatory: it reads the block information and obtains the length of the last block. Why specifically the last block? Because all preceding blocks have a fixed size, 128 MB by default, while the size of the last block is not fixed, so it has to be fetched. Step into fetchLocatedBlocksAndGetLastBlockLength():

```java
private long fetchLocatedBlocksAndGetLastBlockLength() throws IOException {
  final LocatedBlocks newInfo = dfsClient.getLocatedBlocks(src, 0);
  if (DFSClient.LOG.isDebugEnabled()) {
    DFSClient.LOG.debug("newInfo = " + newInfo);
  }
  if (newInfo == null) {
    throw new IOException("Cannot open filename " + src);
  }

  if (locatedBlocks != null) {
    Iterator<LocatedBlock> oldIter = locatedBlocks.getLocatedBlocks().iterator();
    Iterator<LocatedBlock> newIter = newInfo.getLocatedBlocks().iterator();
    while (oldIter.hasNext() && newIter.hasNext()) {
      if (! oldIter.next().getBlock().equals(newIter.next().getBlock())) {
        throw new IOException("Blocklist for " + src + " has changed!");
      }
    }
  }
  locatedBlocks = newInfo;
  long lastBlockBeingWrittenLength = 0;
  if (!locatedBlocks.isLastBlockComplete()) {
    final LocatedBlock last = locatedBlocks.getLastLocatedBlock();
    if (last != null) {
      if (last.getLocations().length == 0) {
        if (last.getBlockSize() == 0) {
          // if the length is zero, then no data has been written to
          // datanode. So no need to wait for the locations.
          return 0;
        }
        return -1;
      }
      final long len = readBlockLength(last);
      last.getBlock().setNumBytes(len);
      lastBlockBeingWrittenLength = len;
    }
  }

  fileEncryptionInfo = locatedBlocks.getFileEncryptionInfo();

  return lastBlockBeingWrittenLength;
}
```
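
To make the point about block sizes concrete: with the default 128 MB block size every block except possibly the last one is full, which is why only the last block's length needs to be asked for. A back-of-the-envelope sketch in plain Java (not HDFS code; the 300 MB file length is hypothetical):

```java
public class LastBlockLengthSketch {
  public static void main(String[] args) {
    long blockSize = 128L * 1024 * 1024;            // default dfs.blocksize: 128 MB
    long fileLength = 300L * 1024 * 1024;           // hypothetical 300 MB file

    long fullBlocks = fileLength / blockSize;       // 2 full 128 MB blocks
    long lastBlockLength = fileLength % blockSize;  // 44 MB remain in the last block
    if (lastBlockLength == 0 && fileLength > 0) {
      // exact multiple: the "last" block is itself a full block
      lastBlockLength = blockSize;
      fullBlocks--;
    }
    System.out.println("full blocks: " + fullBlocks
        + ", last block length: " + lastBlockLength + " bytes");
  }
}
```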

The block information itself is fetched by getLocatedBlocks():

```java
public LocatedBlocks getLocatedBlocks(String src, long start, long length)
    throws IOException {
  TraceScope scope = getPathTraceScope("getBlockLocations", src);
  try {
    // This call obtains the file's block information from the namenode via RPC
    return callGetBlockLocations(namenode, src, start, length);
  } finally {
    scope.close();
  }
}
```

Concrete implementation:

```java
static LocatedBlocks callGetBlockLocations(ClientProtocol namenode,
    String src, long start, long length)
    throws IOException {
  try {
    return namenode.getBlockLocations(src, start, length);
  } catch(RemoteException re) {
    throw re.unwrapRemoteException(AccessControlException.class,
                                   FileNotFoundException.class,
                                   UnresolvedPathException.class);
  }
}
```
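
From the application side, the same block-location information that these RPCs return can be inspected through the public FileSystem API. A minimal sketch (the path is hypothetical):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ShowBlockLocations {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path path = new Path("/tmp/example.txt");  // hypothetical file
    FileStatus status = fs.getFileStatus(path);
    // Ask for the locations of every block covering [0, fileLength)
    BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
    for (BlockLocation block : blocks) {
      System.out.println("offset=" + block.getOffset()
          + " length=" + block.getLength()
          + " hosts=" + String.join(",", block.getHosts()));
    }
  }
}
```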

Summary: during a read we first call FileSystem.open(). That method builds a new FileSystemLinkResolver whose resolve() drives the doCall() and next() methods; next() is only called when an exception occurs, so we leave it aside. doCall() invokes DFSClient.open(), which constructs a DFSInputStream; under a lock the constructor obtains the read caching strategy and then calls openInfo(). openInfo() in turn calls fetchLocatedBlocksAndGetLastBlockLength(), which uses getLocatedBlocks() to ask the NameNode, over RPC, for the locations of the file's blocks. Once all the block location information has been returned, the while loop in openInfo() retries until the length of the last block is known, after which the client can read the data.
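
The retry behaviour seen in openInfo() is driven by client configuration. A minimal tuning sketch, assuming the hdfs-default.xml keys dfs.client.retries.get-last-block-length and dfs.client.retry.interval-ms.get-last-block-length (verify the exact key names against your Hadoop version):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class ReadRetryConfigSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Assumed keys; they correspond to retryTimesForGetLastBlockLength and
    // retryIntervalForGetLastBlockLength used in openInfo().
    conf.setInt("dfs.client.retries.get-last-block-length", 5);              // retries
    conf.setInt("dfs.client.retry.interval-ms.get-last-block-length", 4000); // ms between retries
    FileSystem fs = FileSystem.get(conf);
    System.out.println("Filesystem initialised with custom retry settings: " + fs.getUri());
  }
}
```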


The Write Process

Before writing data, we first call the create() method of DistributedFileSystem to create the target file:

```java
public HdfsDataOutputStream create(final Path f,
    final FsPermission permission, final boolean overwrite,
    final int bufferSize, final short replication, final long blockSize,
    final Progressable progress, final InetSocketAddress[] favoredNodes)
        throws IOException {
  statistics.incrementWriteOps(1);
  Path absF = fixRelativePart(f);
  return new FileSystemLinkResolver<HdfsDataOutputStream>() {
    @Override
    public HdfsDataOutputStream doCall(final Path p)
        throws IOException, UnresolvedLinkException {
      final DFSOutputStream out = dfs.create(getPathName(f), permission,
          overwrite ? EnumSet.of(CreateFlag.CREATE, CreateFlag.OVERWRITE)
              : EnumSet.of(CreateFlag.CREATE),
          true, replication, blockSize, progress, bufferSize, null,
          favoredNodes);
      return dfs.createWrappedOutputStream(out, statistics);
    }
    @Override
    public HdfsDataOutputStream next(final FileSystem fs, final Path p)
        throws IOException {
      if (fs instanceof DistributedFileSystem) {
        DistributedFileSystem myDfs = (DistributedFileSystem)fs;
        return myDfs.create(p, permission, overwrite, bufferSize, replication,
            blockSize, progress, favoredNodes);
      }
      throw new UnsupportedOperationException("Cannot create with" +
          " favoredNodes through a symlink to a non-DistributedFileSystem: "
          + f + " -> " + p);
    }
  }.resolve(this, absF);
}
```
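
For context, a minimal client-side write sketch using the public FileSystem API; the simpler create(Path, boolean) overload used here goes through a different DistributedFileSystem overload than the favoredNodes variant above, but it also ends up in DFSClient.create(). The path and payload are illustrative:

```java
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteExample {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path path = new Path("/tmp/output.txt");  // hypothetical file
    // true = overwrite if the file already exists
    try (FSDataOutputStream out = fs.create(path, true)) {
      out.write("hello hdfs\n".getBytes(StandardCharsets.UTF_8));
      out.hflush();  // push buffered data out to the datanode pipeline
    }
  }
}
```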

Then the create() method of DFSClient is called to create the output stream for that path:

```java
public DFSOutputStream create(String src,
                            FsPermission permission,
                            EnumSet<CreateFlag> flag,
                            boolean createParent,
                            short replication,
                            long blockSize,
                            Progressable progress,
                            int buffersize,
                            ChecksumOpt checksumOpt,
                            InetSocketAddress[] favoredNodes) throws IOException {
  checkOpen();
  if (permission == null) {
    permission = FsPermission.getFileDefault();  // use the default file permission
  }
  FsPermission masked = permission.applyUMask(dfsClientConf.uMask);
  if(LOG.isDebugEnabled()) {
    LOG.debug(src + ": masked=" + masked);
  }
  final DFSOutputStream result = DFSOutputStream.newStreamForCreate(this,
      src, masked, flag, createParent, replication, blockSize, progress,
      buffersize, dfsClientConf.createChecksum(checksumOpt),
      getFavoredNodesStr(favoredNodes));  // create a new output stream
  beginFileLease(result.getFileId(), result);  // start the file lease for this stream
  return result;
}
```

Source analysis of beginFileLease

```java
/** Get a lease and start automatic renewal */
private void beginFileLease(final long inodeId, final DFSOutputStream out)
    throws IOException {
  // register the stream with the LeaseRenewer, keyed by its inode id,
  // so the lease is renewed automatically while the client keeps writing
  getLeaseRenewer().put(inodeId, out, this);
}
```

Source analysis of newStreamForCreate

```java
static DFSOutputStream newStreamForCreate(DFSClient dfsClient, String src,
    FsPermission masked, EnumSet<CreateFlag> flag, boolean createParent,
    short replication, long blockSize, Progressable progress, int buffersize,
    DataChecksum checksum, String[] favoredNodes) throws IOException {
  TraceScope scope =
      dfsClient.getPathTraceScope("newStreamForCreate", src);
  try {
    HdfsFileStatus stat = null;

    // Retry the create if we get a RetryStartFileException up to a maximum
    // number of times
    boolean shouldRetry = true;
    int retryCount = CREATE_RETRY_COUNT;
    while (shouldRetry) {
      shouldRetry = false;
      try {
        // ask the namenode to create the file entry, passing the flags,
        // createParent, replication, blockSize, supported crypto versions, etc.
        stat = dfsClient.namenode.create(src, masked, dfsClient.clientName,
            new EnumSetWritable<CreateFlag>(flag), createParent, replication,
            blockSize, SUPPORTED_CRYPTO_VERSIONS);
        break;
      } catch (RemoteException re) {
        IOException e = re.unwrapRemoteException(
            AccessControlException.class,
            DSQuotaExceededException.class,
            FileAlreadyExistsException.class,
            FileNotFoundException.class,
            ParentNotDirectoryException.class,
            NSQuotaExceededException.class,
            RetryStartFileException.class,
            SafeModeException.class,
            UnresolvedPathException.class,
            SnapshotAccessControlException.class,
            UnknownCryptoProtocolVersionException.class);
        if (e instanceof RetryStartFileException) {
          if (retryCount > 0) {
            shouldRetry = true;
            retryCount--;
          } else {
            throw new IOException("Too many retries because of encryption" +
                " zone operations", e);
          }
        } else {
          throw e;
        }
      }
    }
    Preconditions.checkNotNull(stat, "HdfsFileStatus should not be null!");
    final DFSOutputStream out = new DFSOutputStream(dfsClient, src, stat,
        flag, progress, checksum, favoredNodes);
    out.start();
    return out;
  } finally {
    scope.close();
  }
}
```