Hadoop Source Code Analysis: Creating the DFSClient Object


The previous article, Hadoop Source Code Analysis: DistributedFileSystem, pointed out that the heart of DistributedFileSystem is its member variable DFSClient dfs, which performs the actual file system operations. So let's now study the DFSClient class.

As the HDFS architecture diagram in the previous article shows, DFSClient sits on the HDFS client side, and DistributedFileSystem talks to the NameNode and DataNodes through it. Since these three kinds of nodes usually run on different machines, they communicate via RPC. In Java, RPC could be implemented with RMI, but Hadoop does not use Java RMI; instead it implements its own inter-node communication mechanism, called Hadoop IPC (Inter-Process Communication). Java RMI carries relatively high overhead, while Hadoop needs precise control over communication details such as connections, timeouts, and caching. Hadoop IPC will be covered in a later analysis; for now, let's look at how the DFSClient object is created.
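To make the client-stub idea behind RPC concrete before we dive in, here is a minimal toy sketch (hypothetical names, not Hadoop IPC code): the caller programs against a protocol interface, and a dynamic proxy intercepts every call and forwards it to the "server". In real Hadoop IPC the handler would serialize the method name and arguments over a socket; here it dispatches to a local object so the sketch stays runnable.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class ToyRpcStub {
  // The protocol interface the client programs against, analogous in spirit
  // to ClientProtocol in HDFS. Hypothetical, for illustration only.
  public interface GreetingProtocol {
    String greet(String name);
  }

  // Returns a client-side stub: every call on the returned proxy is routed
  // through the handler, the way a real stub would marshal the call over
  // the network. Here we simply forward to a local "server" object.
  public static <T> T getProxy(Class<T> iface, T server) {
    InvocationHandler handler = (proxy, method, args) ->
        method.invoke(server, args);
    return iface.cast(Proxy.newProxyInstance(
        iface.getClassLoader(), new Class<?>[] { iface }, handler));
  }

  public static void main(String[] args) {
    // The "remote" implementation living on the server side.
    GreetingProtocol server = name -> "hello " + name;
    GreetingProtocol stub = getProxy(GreetingProtocol.class, server);
    System.out.println(stub.greet("hdfs")); // the call goes through the handler
  }
}
```

The caller never sees the handler; it only holds a GreetingProtocol reference, which is exactly how DFSClient holds a ClientProtocol reference without knowing whether it is a raw IPC stub or a retrying wrapper.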

Creating the DFSClient Object

In DistributedFileSystem's initialize() method, the line this.dfs = new DFSClient(namenode, conf, statistics); initializes the dfs field, where namenode is the network address of the NameNode, conf holds the configuration, and statistics records usage statistics. DFSClient has four constructors, but they all ultimately delegate to the same one (the one with the most parameters). The code is as follows:

public DFSClient(Configuration conf) throws IOException {
  this(NameNode.getAddress(conf), conf);
}

public DFSClient(InetSocketAddress nameNodeAddr, Configuration conf
    ) throws IOException {
  this(nameNodeAddr, conf, null);
}

public DFSClient(InetSocketAddress nameNodeAddr, Configuration conf,
                 FileSystem.Statistics stats)
  throws IOException {
  this(nameNodeAddr, null, conf, stats);
}

/**
 * Create a new DFSClient connected to the given nameNodeAddr or rpcNamenode.
 * Exactly one of nameNodeAddr or rpcNamenode must be null.
 */
DFSClient(InetSocketAddress nameNodeAddr, ClientProtocol rpcNamenode,
    Configuration conf, FileSystem.Statistics stats)
  throws IOException {
  this.conf = conf;
  this.stats = stats;
  // Network-related parameters
  // NameNode network address
  this.nnAddress = nameNodeAddr;
  // Socket connection timeout
  this.socketTimeout = conf.getInt("dfs.socket.timeout",
                                   HdfsConstants.READ_TIMEOUT);
  this.datanodeWriteTimeout = conf.getInt("dfs.datanode.socket.write.timeout",
                                          HdfsConstants.WRITE_TIMEOUT);
  this.timeoutValue = this.socketTimeout;
  this.socketFactory = NetUtils.getSocketFactory(conf, ClientProtocol.class);
  // dfs.write.packet.size is an internal config variable
  // Maximum length of a packet written to a DataNode
  this.writePacketSize = conf.getInt("dfs.write.packet.size", 64*1024);
  // Number of retries when acquiring a block fails
  this.maxBlockAcquireFailures = getMaxBlockAcquireFailures(conf);
  this.hdfsTimeout = Client.getTimeout(conf);
  ugi = UserGroupInformation.getCurrentUser(); // user information
  this.authority = nameNodeAddr == null ? "null" :
      nameNodeAddr.getHostName() + ":" + nameNodeAddr.getPort();
  String taskId = conf.get("mapred.task.id", "NONMAPREDUCE");
  this.clientName = "DFSClient_" + taskId + "_" +
      r.nextInt() + "_" + Thread.currentThread().getId();
  defaultBlockSize = conf.getLong("dfs.block.size", DEFAULT_BLOCK_SIZE); // default block size
  defaultReplication = (short) conf.getInt("dfs.replication", 3); // default replication factor
  // Establish the IPC connection to the NameNode
  if (nameNodeAddr != null && rpcNamenode == null) {
    this.rpcNamenode = createRPCNamenode(nameNodeAddr, conf, ugi);
    this.namenode = createNamenode(this.rpcNamenode, conf);
  } else if (nameNodeAddr == null && rpcNamenode != null) {
    // This case is used for testing.
    this.namenode = this.rpcNamenode = rpcNamenode;
  } else {
    throw new IllegalArgumentException(
        "Expecting exactly one of nameNodeAddr and rpcNamenode being null: "
        + "nameNodeAddr=" + nameNodeAddr + ", rpcNamenode=" + rpcNamenode);
  }
  // Read directly from the block file if configured.
  // Whether to use the short-circuit local read optimization when the client
  // and the block are on the same host; defaults to false
  this.shortCircuitLocalReads = conf.getBoolean(
      DFSConfigKeys.DFS_CLIENT_READ_SHORTCIRCUIT_KEY,
      DFSConfigKeys.DFS_CLIENT_READ_SHORTCIRCUIT_DEFAULT);
  if (LOG.isDebugEnabled()) {
    LOG.debug("Short circuit read is " + shortCircuitLocalReads);
  }
  // Whether to connect to DataNodes using their hostnames
  this.connectToDnViaHostname = conf.getBoolean(
      DFSConfigKeys.DFS_CLIENT_USE_DN_HOSTNAME,
      DFSConfigKeys.DFS_CLIENT_USE_DN_HOSTNAME_DEFAULT);
  if (LOG.isDebugEnabled()) {
    LOG.debug("Connect to datanode via hostname is " + connectToDnViaHostname);
  }
  // Local network interfaces used when transferring data with DataNodes
  String localInterfaces[] =
    conf.getStrings(DFSConfigKeys.DFS_CLIENT_LOCAL_INTERFACES);
  if (null == localInterfaces) {
    localInterfaces = new String[0];
  }
  this.localInterfaceAddrs = getLocalInterfaceAddrs(localInterfaces);
  if (LOG.isDebugEnabled() && 0 != localInterfaces.length) {
    LOG.debug("Using local interfaces [" +
        StringUtils.join(",", localInterfaces) + "] with addresses [" +
        StringUtils.join(",", localInterfaceAddrs) + "]");
  }
}
The constructor mostly initializes member variables. The NameNode-related part is the code that creates the remote connection objects to the NameNode:

if (nameNodeAddr != null && rpcNamenode == null) {
  this.rpcNamenode = createRPCNamenode(nameNodeAddr, conf, ugi);
  this.namenode = createNamenode(this.rpcNamenode, conf);
} else if (nameNodeAddr == null && rpcNamenode != null) {
  // This case is used for testing.
  this.namenode = this.rpcNamenode = rpcNamenode;
} else {
  throw new IllegalArgumentException(
      "Expecting exactly one of nameNodeAddr and rpcNamenode being null: "
      + "nameNodeAddr=" + nameNodeAddr + ", rpcNamenode=" + rpcNamenode);
}
When nameNodeAddr is non-null and rpcNamenode is null, the constructor creates the rpcNamenode and namenode objects. Both are member variables of DFSClient used to communicate with the NameNode, and both are built on the Hadoop IPC mechanism. createRPCNamenode() and createNamenode() together establish the IPC connection to the NameNode; createRPCNamenode() first creates the remote connection, with the following code:

/**
 * Establishes the IPC connection to the NameNode. If the connection
 * fails, it is not retried here (maximum retries is 0).
 * @param nameNodeAddr network address of the NameNode
 * @param conf configuration
 * @param ugi user information
 * @return
 * @throws IOException
 */
private static ClientProtocol createRPCNamenode(InetSocketAddress nameNodeAddr,
    Configuration conf, UserGroupInformation ugi)
  throws IOException {
  return (ClientProtocol)RPC.getProxy(ClientProtocol.class,
      ClientProtocol.versionID, nameNodeAddr, ugi, conf,
      NetUtils.getSocketFactory(conf, ClientProtocol.class), 0,
      RetryUtils.getMultipleLinearRandomRetry(
          conf,
          // key that controls whether retries are enabled
          DFSConfigKeys.DFS_CLIENT_RETRY_POLICY_ENABLED_KEY,
          DFSConfigKeys.DFS_CLIENT_RETRY_POLICY_ENABLED_DEFAULT,
          // key that selects the concrete retry specification
          DFSConfigKeys.DFS_CLIENT_RETRY_POLICY_SPEC_KEY,
          DFSConfigKeys.DFS_CLIENT_RETRY_POLICY_SPEC_DEFAULT
          ),
      false);
}
RPC.getProxy() is the IPC call that obtains the connection object to the NameNode; we will examine it in depth when studying the IPC mechanism. createRPCNamenode() simply returns the result of RPC.getProxy(), i.e. the connection object to the NameNode. createNamenode(), by contrast, does not establish a new connection to the NameNode; it reuses the one already created. The code is as follows:

/**
 * Wraps the IPC connection to the NameNode so that, when a call fails,
 * it is retried according to the configured retry policies.
 * @param rpcNamenode
 * @param conf
 * @return
 * @throws IOException
 */
private static ClientProtocol createNamenode(ClientProtocol rpcNamenode,
    Configuration conf) throws IOException {
  // default policy
  @SuppressWarnings("unchecked")
  final RetryPolicy defaultPolicy =
      RetryUtils.getDefaultRetryPolicy(
          conf,
          DFSConfigKeys.DFS_CLIENT_RETRY_POLICY_ENABLED_KEY,
          DFSConfigKeys.DFS_CLIENT_RETRY_POLICY_ENABLED_DEFAULT,
          DFSConfigKeys.DFS_CLIENT_RETRY_POLICY_SPEC_KEY,
          DFSConfigKeys.DFS_CLIENT_RETRY_POLICY_SPEC_DEFAULT,
          SafeModeException.class
          );

  // create policy
  RetryPolicy createPolicy = RetryPolicies.retryUpToMaximumCountWithFixedSleep(
      5, LEASE_SOFTLIMIT_PERIOD, TimeUnit.MILLISECONDS);

  Map<Class<? extends Exception>,RetryPolicy> remoteExceptionToPolicyMap =
    new HashMap<Class<? extends Exception>, RetryPolicy>();
  remoteExceptionToPolicyMap.put(AlreadyBeingCreatedException.class, createPolicy);

  Map<Class<? extends Exception>,RetryPolicy> exceptionToPolicyMap =
    new HashMap<Class<? extends Exception>, RetryPolicy>();
  exceptionToPolicyMap.put(RemoteException.class,
      RetryPolicies.retryByRemoteException(
          defaultPolicy, remoteExceptionToPolicyMap));
  RetryPolicy methodPolicy = RetryPolicies.retryByException(
      defaultPolicy, exceptionToPolicyMap);
  Map<String,RetryPolicy> methodNameToPolicyMap = new HashMap<String,RetryPolicy>();

  methodNameToPolicyMap.put("create", methodPolicy);

  final ClientProtocol cp = (ClientProtocol) RetryProxy.create(ClientProtocol.class,
      rpcNamenode, defaultPolicy, methodNameToPolicyMap);
  RPC.checkVersion(ClientProtocol.class, ClientProtocol.versionID, cp);
  return cp;
}
createNamenode() attaches retry policy objects: when a call from the client to the NameNode fails, these policies determine whether and how to reconnect. The concrete retry strategies are implemented in org.apache.hadoop.io.retry.RetryPolicies. The line in createNamenode() that actually creates the connection object is:
final ClientProtocol cp = (ClientProtocol) RetryProxy.create(ClientProtocol.class,
    rpcNamenode, defaultPolicy, methodNameToPolicyMap);
This line calls RetryProxy.create(), where rpcNamenode is the already-established connection object to the NameNode. Let's step into that method:

public static Object create(Class<?> iface, Object implementation,
    RetryPolicy defaultPolicy, Map<String,RetryPolicy> methodNameToPolicyMap) {
  return Proxy.newProxyInstance(
      implementation.getClass().getClassLoader(),
      new Class<?>[] { iface },
      new RetryInvocationHandler(implementation, defaultPolicy, methodNameToPolicyMap)
      );
}

This method uses Java dynamic proxies to create a proxy object. RetryInvocationHandler implements the InvocationHandler interface that dynamic proxies require, and Proxy.newProxyInstance() creates the proxy class. Under the dynamic-proxy contract, every method call on the proxy is routed to RetryInvocationHandler's invoke() method, so all client-to-NameNode communication passes through RetryInvocationHandler.invoke(). invoke() performs the real method call; if the call fails, it retries according to the retry policy until the call succeeds, or until the policy's shouldRetry() method decides that no further retry is warranted. The code of RetryInvocationHandler.invoke() is as follows:

public Object invoke(Object proxy, Method method, Object[] args)
  throws Throwable {
  RetryPolicy policy = methodNameToPolicyMap.get(method.getName());
  if (policy == null) {
    policy = defaultPolicy;
  }

  int retries = 0;
  while (true) {
    try {
      return invokeMethod(method, args);
    } catch (Exception e) {
      if (!policy.shouldRetry(e, retries++)) {
        LOG.info("Exception while invoking " + method.getName()
                 + " of " + implementation.getClass() + ". Not retrying."
                 + StringUtils.stringifyException(e));
        if (!method.getReturnType().equals(Void.TYPE)) {
          throw e; // non-void methods can't fail without an exception
        }
        return null;
      }
      LOG.debug("Exception while invoking " + method.getName()
               + " of " + implementation.getClass() + ". Retrying."
               + StringUtils.stringifyException(e));
    }
  }
}
With this, DFSClient.createNamenode() is complete: if it succeeds, the HDFS client holds a remote connection object to the NameNode, wrapped in the retrying proxy, which it can use for all subsequent communication with the NameNode.
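As a self-contained sketch of this mechanism (simplified, hypothetical names rather than the actual Hadoop classes), the following boils RetryProxy.create() and RetryInvocationHandler.invoke() down to their essence: a per-method policy lookup with a default fallback, and a loop that reinvokes the target until it succeeds or shouldRetry() gives up. A flaky service that fails on its first two calls demonstrates the behavior.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.Map;

public class RetryProxyDemo {
  // Simplified stand-in for org.apache.hadoop.io.retry.RetryPolicy.
  public interface Policy {
    boolean shouldRetry(Exception e, int retries);
  }

  // Analogous to RetryPolicies.retryUpToMaximumCountWithFixedSleep,
  // minus the sleep, to keep the demo fast.
  public static Policy retryUpToMaximumCount(int max) {
    return (e, retries) -> retries < max;
  }

  // Mirrors RetryProxy.create() + RetryInvocationHandler.invoke():
  // look up a per-method policy (falling back to the default), then loop
  // until the call succeeds or the policy says to stop.
  public static <T> T create(Class<T> iface, T target, Policy defaultPolicy,
                             Map<String, Policy> methodNameToPolicyMap) {
    InvocationHandler handler = (proxy, method, args) -> {
      Policy policy = methodNameToPolicyMap.getOrDefault(
          method.getName(), defaultPolicy);
      int retries = 0;
      while (true) {
        try {
          return method.invoke(target, args);
        } catch (InvocationTargetException wrapped) {
          Exception cause = (Exception) wrapped.getCause();
          if (!policy.shouldRetry(cause, retries++)) {
            throw cause; // out of retries: surface the real exception
          }
        }
      }
    };
    return iface.cast(Proxy.newProxyInstance(
        iface.getClassLoader(), new Class<?>[] { iface }, handler));
  }

  public interface Service {
    String fetch();
  }

  // A flaky service that fails on its first two invocations.
  public static class FlakyService implements Service {
    public int calls = 0;
    @Override public String fetch() {
      if (++calls < 3) throw new IllegalStateException("transient failure");
      return "ok";
    }
  }

  public static void main(String[] args) {
    FlakyService impl = new FlakyService();
    Service proxied = create(Service.class, impl,
        retryUpToMaximumCount(5), Map.of());
    // Succeeds on the third underlying attempt; prints "ok after 3 calls".
    System.out.println(proxied.fetch() + " after " + impl.calls + " calls");
  }
}
```

The caller sees a single successful fetch(); the two transient failures are absorbed inside the handler, which is exactly the transparency DFSClient gets from the proxy returned by createNamenode().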

The rest of DFSClient's constructor is straightforward, so we won't dwell on it.

Summary

DFSClient's constructor mainly initializes its member variables. The connection object to the NameNode is created using Java dynamic proxies on top of the Hadoop IPC mechanism, and the overall flow is fairly simple.

Reference

《Hadoop技术内幕:深入解析Hadoop Common和HDFS架构设计与实现原理》 (Hadoop Internals: In-Depth Analysis of Hadoop Common and HDFS)

