How ToolRunner Runs Hadoop Jobs: An Analysis

Source: Internet  Editor: 程序博客网  Date: 2024/06/06 07:24

The three steps of a run

1. Configurable

Configurable

public interface Configurable {
  /** Set the configuration to be used by this object. */
  void setConf(Configuration conf);

  /** Return the configuration used by this object. */
  Configuration getConf();
}

Tool

public interface Tool extends Configurable {
  /**
   * Execute the command with the given arguments.
   *
   * @param args command specific arguments.
   * @return exit code.
   * @throws Exception
   */
  int run(String[] args) throws Exception;
}


Configured

public class Configured implements Configurable {
  private Configuration conf;

  /** Construct a Configured. */
  public Configured() {
    this(null);
  }

  /** Construct a Configured. */
  public Configured(Configuration conf) {
    setConf(conf);
  }

  // inherit javadoc
  @Override
  public void setConf(Configuration conf) {
    this.conf = conf;
  }

  // inherit javadoc
  @Override
  public Configuration getConf() {
    return conf;
  }
}


MR main (driver) class

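import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MainTest extends Configured implements Tool {
    @Override
    public int run(String[] args) {
        // main body of the MR job goes here
        return 0;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner parses the generic options, sets the conf, then calls run()
        int result = ToolRunner.run(new MainTest(), args);
        System.exit(result);
    }
}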

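UML relationships among Configurable, Tool, and Configured above: Tool extends Configurable, Configured implements Configurable, and the main class MainTest extends Configured and implements Tool.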
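The relationships above can be tried out without a Hadoop installation. The following is only a minimal sketch: MiniConfigurable, MiniTool, MiniConfigured, EchoTool, and the key "job.name" are hypothetical stand-ins invented for illustration, and a plain Map takes the place of Hadoop's Configuration. It shows why the main class extends Configured: setConf and getConf are inherited, so only run has to be written.

```java
import java.util.HashMap;
import java.util.Map;

/** Stand-in types that mirror the shape of Configurable, Tool, and Configured. */
class TypeRelationSketch {

    /** Hypothetical counterpart of Configurable; a Map replaces Configuration. */
    interface MiniConfigurable {
        void setConf(Map<String, String> conf);
        Map<String, String> getConf();
    }

    /** Hypothetical counterpart of Tool. */
    interface MiniTool extends MiniConfigurable {
        int run(String[] args) throws Exception;
    }

    /** Hypothetical counterpart of Configured: it only stores the conf. */
    static class MiniConfigured implements MiniConfigurable {
        private Map<String, String> conf;

        @Override
        public void setConf(Map<String, String> conf) {
            this.conf = conf;
        }

        @Override
        public Map<String, String> getConf() {
            return conf;
        }
    }

    /** Only run() is written here; setConf/getConf come from MiniConfigured. */
    static class EchoTool extends MiniConfigured implements MiniTool {
        @Override
        public int run(String[] args) {
            Map<String, String> conf = getConf();
            // Exit code 0 if the made-up key "job.name" was injected, 1 otherwise.
            return (conf != null && conf.containsKey("job.name")) ? 0 : 1;
        }
    }

    /** The runner injects the configuration first, then delegates to the tool. */
    static int launch(MiniTool tool, Map<String, String> conf, String[] args)
            throws Exception {
        tool.setConf(conf);
        return tool.run(args);
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> conf = new HashMap<>();
        conf.put("job.name", "demo");
        System.out.println(launch(new EchoTool(), conf, args)); // prints 0
    }
}
```

Because the runner is written against the MiniTool interface, any class that combines the base class with the interface in this way can be launched by it.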




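2. Configuration

By default, Hadoop loads the parameters in core-default.xml and core-site.xml. These parameters, together with any Hadoop configuration parameters passed on the command line, are all stored in a Configuration object, which the MR main class can obtain by calling getConf().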
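The layering described above can be modelled in a few lines. This is a simplified sketch, not Hadoop's actual Configuration implementation: LayeredConfSketch and merge are hypothetical names, plain Maps stand in for the XML files, and the site and command-line values are made up. It assumes that a value set in a later layer overrides the same key from an earlier one.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Simplified model of layered configuration; not Hadoop's Configuration class. */
class LayeredConfSketch {

    /** Merges the layers in order; a later layer overrides an earlier one. */
    static Map<String, String> merge(List<Map<String, String>> layers) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> layer : layers) {
            merged.putAll(layer);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Stand-in for core-default.xml.
        Map<String, String> coreDefault = new HashMap<>();
        coreDefault.put("fs.defaultFS", "file:///");
        coreDefault.put("io.file.buffer.size", "4096");

        // Stand-in for core-site.xml (hypothetical cluster address).
        Map<String, String> coreSite = new HashMap<>();
        coreSite.put("fs.defaultFS", "hdfs://namenode.example:8020");

        // Stand-in for a parameter given on the command line.
        Map<String, String> commandLine = new HashMap<>();
        commandLine.put("io.file.buffer.size", "8192");

        Map<String, String> conf =
                merge(Arrays.asList(coreDefault, coreSite, commandLine));
        System.out.println(conf.get("fs.defaultFS"));        // hdfs://namenode.example:8020
        System.out.println(conf.get("io.file.buffer.size")); // 8192
    }
}
```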





3. Running

ToolRunner

public class ToolRunner {

  /**
   * Runs the given <code>Tool</code> by {@link Tool#run(String[])}, after
   * parsing with the given generic arguments. Uses the given
   * <code>Configuration</code>, or builds one if null.
   *
   * Sets the <code>Tool</code>'s configuration with the possibly modified
   * version of the <code>conf</code>.
   *
   * @param conf <code>Configuration</code> for the <code>Tool</code>.
   * @param tool <code>Tool</code> to run.
   * @param args command-line arguments to the tool.
   * @return exit code of the {@link Tool#run(String[])} method.
   */
  public static int run(Configuration conf, Tool tool, String[] args)
      throws Exception {
    if (conf == null) {
      conf = new Configuration();
    }
    GenericOptionsParser parser = new GenericOptionsParser(conf, args);
    // set the configuration back, so that Tool can configure itself
    tool.setConf(conf);

    // get the args w/o generic hadoop args
    String[] toolArgs = parser.getRemainingArgs();
    return tool.run(toolArgs);
  }

  /**
   * Runs the <code>Tool</code> with its <code>Configuration</code>.
   *
   * Equivalent to <code>run(tool.getConf(), tool, args)</code>.
   *
   * @param tool <code>Tool</code> to run.
   * @param args command-line arguments to the tool.
   * @return exit code of the {@link Tool#run(String[])} method.
   */
  public static int run(Tool tool, String[] args)
      throws Exception {
    return run(tool.getConf(), tool, args);
  }

  /**
   * Prints generic command-line arguments and usage information.
   *
   * @param out stream to write usage information to.
   */
  public static void printGenericCommandUsage(PrintStream out) {
    GenericOptionsParser.printGenericCommandUsage(out);
  }

  /**
   * Print out a prompt to the user, and return true if the user
   * responds with "y" or "yes". (case insensitive)
   */
  public static boolean confirmPrompt(String prompt) throws IOException {
    while (true) {
      System.err.print(prompt + " (Y or N) ");
      StringBuilder responseBuilder = new StringBuilder();
      while (true) {
        int c = System.in.read();
        if (c == -1 || c == '\r' || c == '\n') {
          break;
        }
        responseBuilder.append((char) c);
      }

      String response = responseBuilder.toString();
      if (response.equalsIgnoreCase("y") ||
          response.equalsIgnoreCase("yes")) {
        return true;
      } else if (response.equalsIgnoreCase("n") ||
          response.equalsIgnoreCase("no")) {
        return false;
      }
      System.err.println("Invalid input: " + response);
      // else ask them again
    }
  }
}


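As the code above shows, running an MR job through ToolRunner ultimately calls

run(Configuration conf, Tool tool, String[] args)

conf: holds the Hadoop configuration parameters; if it is null, a new Configuration is created

tool: the object whose run method executes the MR job, after setConf(conf) has been called on it

args: the command-line arguments passed in by the user; GenericOptionsParser consumes the generic Hadoop options, and only the remaining arguments are passed on to tool.run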
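To make the handling of args more concrete, here is a rough sketch of the split between generic options and tool arguments. It is a hypothetical stand-in, not the real GenericOptionsParser: GenericArgsSketch, Parsed, and parse are invented names, and only the two-token form "-D key=value" is recognised, whereas the real parser accepts more options than this.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the generic-option split; handles only -D key=value. */
class GenericArgsSketch {

    /** Properties taken from -D options, plus the arguments left for the tool. */
    static final class Parsed {
        final Map<String, String> properties = new LinkedHashMap<>();
        final List<String> remaining = new ArrayList<>();
    }

    static Parsed parse(String[] args) {
        Parsed parsed = new Parsed();
        for (int i = 0; i < args.length; i++) {
            boolean isProperty = args[i].equals("-D")
                    && i + 1 < args.length
                    && args[i + 1].indexOf('=') > 0;
            if (isProperty) {
                String pair = args[++i]; // consume the key=value token as well
                int eq = pair.indexOf('=');
                parsed.properties.put(pair.substring(0, eq), pair.substring(eq + 1));
            } else {
                // Anything else is left untouched for the tool's own run(args).
                parsed.remaining.add(args[i]);
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        Parsed parsed = parse(new String[] {
            "-D", "mapreduce.job.reduces=2", "input", "output"
        });
        System.out.println(parsed.properties); // {mapreduce.job.reduces=2}
        System.out.println(parsed.remaining);  // [input, output]
    }
}
```

In this sketch the properties map plays the role of conf and the remaining list plays the role of the toolArgs handed to tool.run.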






