Setting Up a Hadoop 2.7.3 Eclipse Development Environment on Windows and Running the WordCount Example
Source: Internet  Editor: 程序博客网  Date: 2024/06/06 04:45
1. Preparation
● eclipse-jee-luna-SR2-win32-x86_64
● hadoop-eclipse-plugin-2.7.3.jar
● hadoop-2.7.3.tar.gz
● hadoop.dll and winutils.exe
2. Add the Hadoop plugin to Eclipse
Copy the downloaded hadoop-eclipse-plugin-2.7.3.jar into Eclipse's dropins or plugins directory. After Eclipse restarts, DFS Locations appears in the upper-left corner.
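3. Configure the DFS Location
Bring up the Map/Reduce Locations view from Window -> Show View -> Other, then create a new DFS Location.
Click Finish. You can then browse the files in the newly added HDFS location.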
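4. Configure the runtime environment
- Create a HADOOP_HOME environment variable that points to the directory where hadoop-2.7.3.tar.gz was extracted
- Add %HADOOP_HOME%/bin to PATH
- Copy the downloaded hadoop.dll and winutils.exe into the %HADOOP_HOME%/bin directory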
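A missing HADOOP_HOME or a missing winutils.exe is a common reason a job fails to start from Eclipse on Windows, so it can help to check the setup before running anything. The following is a minimal sketch in plain Java; the class name HadoopEnvCheck and the method missingFiles are hypothetical names chosen for illustration, not part of Hadoop.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Hadoop): checks the setup described in step 4.
public class HadoopEnvCheck {

    /** Returns the names of the required files that are missing from <hadoopHome>/bin. */
    static List<String> missingFiles(File hadoopHome) {
        List<String> missing = new ArrayList<>();
        File bin = new File(hadoopHome, "bin");
        for (String name : new String[] {"hadoop.dll", "winutils.exe"}) {
            if (!new File(bin, name).isFile()) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String home = System.getenv("HADOOP_HOME");
        if (home == null || home.isEmpty()) {
            System.err.println("HADOOP_HOME is not set");
            return;
        }
        List<String> missing = missingFiles(new File(home));
        System.out.println(missing.isEmpty()
                ? "HADOOP_HOME/bin contains hadoop.dll and winutils.exe"
                : "Missing from HADOOP_HOME/bin: " + missing);
    }
}
```

Eclipse reads environment variables at startup, so restart it after changing HADOOP_HOME or PATH before running this check from the IDE.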
5. Create a Map/Reduce Project: WordCount
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Mapper: splits each input line into tokens and emits (word, 1) for every token.
    public static class WordCountMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer (also used as the combiner): sums the counts emitted for each word.
    public static class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // The job expects exactly two arguments: the input path and the output path.
        if (args.length != 2) {
            System.err.println("Usage: <in> <out>");
            System.exit(2);
        }
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(WordCountMapper.class);
        job.setCombinerClass(WordCountReducer.class);
        job.setReducerClass(WordCountReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
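6. Configure the run parameters
Create a tmp directory in HDFS: hadoop fs -mkdir /tmp
Change the directory permissions: hadoop fs -chmod -R 777 /tmp
Create a local file named input01 with the following content:
hello world
hello china
hello jiangsu
hello suzhou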
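If you would rather generate the sample file than type it by hand, a small Java program can write it. This is only a sketch: the class name CreateInput and the default target path (input01 in the current working directory) are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: writes the four sample lines from step 6 to a local file.
public class CreateInput {

    static final List<String> LINES = Arrays.asList(
            "hello world",
            "hello china",
            "hello jiangsu",
            "hello suzhou");

    /** Writes the sample lines to the given file (one per line) and returns its path. */
    static Path writeInput(Path target) throws IOException {
        return Files.write(target, LINES, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Assumption: default to "input01" in the working directory if no path is given.
        Path target = Paths.get(args.length > 0 ? args[0] : "input01");
        System.out.println("Wrote " + writeInput(target).toAbsolutePath());
    }
}
```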
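Configure Run Configurations: WordCount expects exactly two program arguments, the input path and the output path (Usage: <in> <out>).
7. Run the program
The run results are as follows: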
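As a cross-check on the job's output, the same tokenize-and-sum logic can be reproduced without Hadoop. The sketch below is plain Java, not the MapReduce job itself: the class LocalWordCount is a hypothetical name, and the sample text is the input01 content from step 6. It uses StringTokenizer, as the mapper does, and a sorted map so the words print in key order, as reducer output is sorted by key.

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical local stand-in for the WordCount job; it does not use Hadoop.
public class LocalWordCount {

    /** Counts whitespace-separated tokens, returning the words in sorted order. */
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        StringTokenizer itr = new StringTokenizer(text);
        while (itr.hasMoreTokens()) {
            String word = itr.nextToken();
            Integer current = counts.get(word);
            counts.put(word, current == null ? 1 : current + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        String input = "hello world\nhello china\nhello jiangsu\nhello suzhou\n";
        for (Map.Entry<String, Integer> e : count(input).entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

For the sample input this prints china 1, hello 4, jiangsu 1, suzhou 1 and world 1, one tab-separated pair per line. Assuming the default text output format, the job's part file in the output directory should contain the same pairs.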