About Hadoop's GenericOptionsParser…

Source: Internet | Editor: 程序博客网 | Time: 2024/06/16 05:39

Because our cluster has execution queues configured for Hadoop, a program written like this:

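Configuration conf = new Configuration(); // load parameters from the configuration files

String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs(); // generic options such as -D are applied to conf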

...........

FileInputFormat.addInputPath(job, new Path(otherArgs[0]));

FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));

has to be given the -D mapreduce.job.queuename parameter on the command line when we run the MapReduce job:

hadoop jar WordCount.jar WordCount.WordCount -Dmapreduce.job.queuename=root.default xrli/input xrli/output

If you really want to hard-code all of these parameters in the program, write it like this:

Configuration conf = new Configuration(); // load parameters from the configuration files

String[] ioArgs = new String[]{"-Dmapreduce.job.queuename=root.default", "xrli/STJoin_in", "xrli/STJoin_out"};

String[] otherArgs = new GenericOptionsParser(conf, ioArgs).getRemainingArgs();

....................

// Set the input and output directories

FileInputFormat.addInputPath(job, new Path(otherArgs[0]));

FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));

With this version, running the job only takes the following command:

hadoop jar WordCount.jar WordCount.WordCount

Here is a brief look at Hadoop's GenericOptionsParser class.

It is a basic utility class for parsing command-line arguments, and it recognizes a set of standard Hadoop command-line options.

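For example, the -D mapreduce.job.queuename option used here is recognized by the parser and set on the Configuration object (conf). The getRemainingArgs() method then returns the two remaining arguments, "xrli/STJoin_in" and "xrli/STJoin_out", as the array otherArgs.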

The options it recognizes include: -fs, -jt, -libjars, -files, -archives, -D and -tokenCacheFile.
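To make the split between recognized options and remaining arguments concrete, here is a minimal plain-Java sketch. It is a simplified stand-in written for illustration, not Hadoop's actual implementation: the class name GenericArgsSketch and its fields are hypothetical, a Map stands in for the Configuration object, and the sketch assumes that each option other than -D is followed by exactly one value and that parsing stops at the first argument that is not a recognized option.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified stand-in for illustration; NOT Hadoop's GenericOptionsParser.
public class GenericArgsSketch {

    // Options from the list above that are followed by a separate value (assumption of this sketch).
    private static final Set<String> VALUE_OPTIONS = new HashSet<>(Arrays.asList(
            "-fs", "-jt", "-libjars", "-files", "-archives", "-tokenCacheFile"));

    // Properties collected from -D options (stands in for the Configuration object).
    public final Map<String, String> properties = new LinkedHashMap<>();
    // Other recognized options and their values.
    public final Map<String, String> options = new LinkedHashMap<>();
    // Arguments that were not consumed (stands in for getRemainingArgs()).
    public final List<String> remaining = new ArrayList<>();

    public GenericArgsSketch(String[] args) {
        int i = 0;
        while (i < args.length) {
            String arg = args[i];
            if (arg.equals("-D") && i + 1 < args.length) {
                addProperty(args[i + 1]);        // form: -D key=value
                i += 2;
            } else if (arg.startsWith("-D") && arg.contains("=")) {
                addProperty(arg.substring(2));   // form: -Dkey=value
                i += 1;
            } else if (VALUE_OPTIONS.contains(arg) && i + 1 < args.length) {
                options.put(arg, args[i + 1]);   // e.g. -files a.txt
                i += 2;
            } else {
                break;                           // first non-option argument ends option parsing
            }
        }
        // Everything from the first non-option argument onward is left for the application.
        remaining.addAll(Arrays.asList(args).subList(i, args.length));
    }

    // Splits "key=value" at the first '=' and stores the pair; malformed input is ignored.
    private void addProperty(String keyValue) {
        int eq = keyValue.indexOf('=');
        if (eq > 0) {
            properties.put(keyValue.substring(0, eq), keyValue.substring(eq + 1));
        }
    }

    public static void main(String[] args) {
        String[] ioArgs = {"-Dmapreduce.job.queuename=root.default", "xrli/STJoin_in", "xrli/STJoin_out"};
        GenericArgsSketch parsed = new GenericArgsSketch(ioArgs);
        System.out.println(parsed.properties); // {mapreduce.job.queuename=root.default}
        System.out.println(parsed.remaining);  // [xrli/STJoin_in, xrli/STJoin_out]
    }
}
```

With the ioArgs array from the article, the queue name ends up in the property map and the two paths are left over, which mirrors what otherArgs[0] and otherArgs[1] hold in the driver code. Because this sketch stops at the first non-option argument, options such as -D need to appear before the input and output paths.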

http://www.cnblogs.com/caoyuanzhanlang/archive/2013/02/21/2920934.html

0 0