Hadoop Self-Study Guide, Part 2: Setting Up the Development Environment


1. Foreword

This post walks through a simple program to observe how Hadoop actually runs it.

2. Setting Up a Hadoop Development Environment on Windows

Environment:
Hadoop 1.2.1
Eclipse Mars Release (4.5.0)
hadoop-eclipse-plugin-1.2.1 (many copies are available online, so it is not re-posted here)

Copy hadoop-eclipse-plugin-1.2.1 into Eclipse's dropins directory, then start Eclipse.


In the plugin's Map/Reduce Locations view, create a new Hadoop location and fill in the settings according to your own environment (screenshot omitted).

Advanced parameters: a few of these need to be changed as well (screenshot omitted).

When the cluster's file tree shows up under DFS Locations (screenshot omitted), the connection has succeeded.
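For reference, the host and port values entered here have to match what the cluster itself is configured with. A minimal example, assuming a pseudo-distributed cluster on a host named hai with the common default ports (these values are assumptions, not taken from the original screenshots):

Map/Reduce Master : host = hai, port = 9001   (must match mapred.job.tracker)
DFS Master        : host = hai, port = 9000   (must match fs.default.name)
Advanced parameters -> hadoop.tmp.dir : the same value the cluster uses for hadoop.tmp.dir

If these do not line up with the cluster's core-site.xml and mapred-site.xml, the location will never connect.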

Troubleshooting: if it just keeps showing "listening..." forever, check whether the plugin version matches your Hadoop version. If the plugin does not install at all, your Eclipse may be too old; download the latest release from the official site.

3. How Hadoop Runs the Job

Below is the classic "hello world" from hadoop-examples: WordCount.
package hadoop.v3;

import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.TextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

import org.hai.hdfs.utils.HDFSUtils;

/**
 * Hadoop's "hello world": WordCount
 * @author : chenhaipeng
 * @date : 2015-09-05 20:38:13
 */
public class WordCount {

    // Mapper: split each input line into tokens and emit (word, 1)
    public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                output.collect(word, one);
            }
        }
    }

    // Reducer: sum the counts for each word
    public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output,
                Reporter reporter) throws IOException {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            output.collect(key, new IntWritable(sum));
        }
    }

    // Delete the output directory so the job can be re-run
    public static void deletedir(String path) {
        try {
            HDFSUtils.DeleteHDFSFile(path);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("mywordcount");

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        conf.setMapperClass(Map.class);
        conf.setReducerClass(Reduce.class);

        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        deletedir(args[1]);
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}
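The code above depends on the author's own helper class org.hai.hdfs.utils.HDFSUtils, which is not shown in this post. A minimal sketch of an equivalent helper, assuming DeleteHDFSFile simply removes the given path from HDFS recursively so the output directory can be reused between runs:

package org.hai.hdfs.utils;

import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Sketch of the helper used by WordCount.deletedir(); the original class is the
 * author's own and is not shown here, so this is only an assumed equivalent.
 */
public class HDFSUtils {

    // Delete the given HDFS path recursively if it exists, so that
    // FileOutputFormat does not fail on an existing output directory.
    public static void DeleteHDFSFile(String path) throws IOException {
        Configuration conf = new Configuration();
        try {
            FileSystem fs = FileSystem.get(new URI(path), conf);
            Path p = new Path(path);
            if (fs.exists(p)) {
                fs.delete(p, true); // true = recursive
            }
        } catch (URISyntaxException e) {
            throw new IOException(e);
        }
    }
}

The program is then run with two arguments: the input path and the output path on HDFS.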


The input is:
Mary had a little lamb
its fleece very white as snow
and everywhere that Mary went
the lamb was sure to go
The output is as follows (screenshot omitted):
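The original screenshot of the result is not available here. Counting the words of the input above by hand, the job's output file (one word and its count per line, tab-separated, in the key order produced by the shuffle) should look roughly like this:

Mary	2
a	1
and	1
as	1
everywhere	1
fleece	1
go	1
had	1
its	1
lamb	2
little	1
snow	1
sure	1
that	1
the	1
to	1
very	1
was	1
went	1
white	1

("Mary" comes first because Text keys compare by raw bytes, so uppercase letters sort before lowercase ones.)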

Opening http://hai:50030/jobtracker.jsp shows that this job was never run there at all, yet the output was still produced. What is going on?
See "Hadoop Self-Study Guide: A Roundup of Hadoop Problems" for more on this.
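A likely cause, offered here as an assumption rather than a confirmed diagnosis: if the JobConf never names the JobTracker, Hadoop 1.x falls back to the LocalJobRunner, so the job runs inside the Eclipse JVM, reads and writes HDFS as asked, but never appears on the JobTracker page. A sketch of pointing the job at the cluster instead, assuming the NameNode at hai:9000 and the JobTracker at hai:9001 (the host hai comes from the URL above; the ports are assumed defaults):

JobConf conf = new JobConf(WordCount.class);
// Without mapred.job.tracker the job is executed by the LocalJobRunner
// inside the client JVM and never shows up in the JobTracker web UI.
conf.set("fs.default.name", "hdfs://hai:9000"); // DFS Master (NameNode)
conf.set("mapred.job.tracker", "hai:9001");     // Map/Reduce Master (JobTracker)
// ... the rest of the job setup stays the same as in WordCount above ...

When the job really is submitted to the cluster, the classes also need to be packaged into a jar the TaskTrackers can load; running straight from Eclipse's class files is typically not enough.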

