Lao Wang's Big Data Tutorial (2): Hadoop Development in Eclipse
Source: Internet | Editor: 程序博客网 | Date: 2024/06/08 10:45
Hadoop development in Eclipse
(I) Required files
- Eclipse plugin: hadoop-eclipse-plugin-2.7.3.jar
- Hadoop distribution: hadoop-2.7.3.tar.gz
- Hadoop source code: hadoop-2.7.3-src
- hadoop.dll and winutils.exe
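(1) Add the system environment variable HADOOP_HOME, pointing to the directory where you unpacked hadoop-2.7.3.tar.gz on Windows.
(2) Install Eclipse, copy hadoop-eclipse-plugin-2.7.3.jar into eclipse\jee-neon\eclipse\dropins, and restart Eclipse. A Hadoop Map/Reduce page should now appear under Window -> Preferences. Set its path to the directory where you unpacked Hadoop (hadoop-2.7.3.tar.gz) on Windows.
(3) Open Window -> Show View -> Other and select Map/Reduce Locations.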
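(4) Create a new location
In the Map/Reduce Locations view, create a new location.
Host is the IP address of the remote Hadoop host you want to connect to. Continuing from the previous part of this tutorial, this is master; on Windows, add the line 192.168.202.5 master to the hosts file so that the name resolves. The port is the one set for fs.defaultFS in core-site.xml on the cluster (open it with vim /usr/local/hadoop/etc/hadoop/core-site.xml), which is 9000 in this setup:
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
User name: enter your Windows user name.
Save the settings and restart Eclipse. If the HDFS file structure shows up in Eclipse, the connection is configured correctly.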
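If you are unsure which host and port to type into the location dialog, they can be read straight out of the fs.defaultFS value. The following is a minimal sketch using only the JDK; the class name CoreSiteReader and the inline XML string are assumptions made for illustration (in practice you would pass in the contents of your own core-site.xml).

```java
import java.io.ByteArrayInputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not part of Hadoop): extracts fs.defaultFS from core-site.xml text.
final class CoreSiteReader {

    /** Returns the fs.defaultFS URI found in the given core-site.xml content. */
    static URI defaultFs(String coreSiteXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(coreSiteXml.getBytes(StandardCharsets.UTF_8)));
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element prop = (Element) props.item(i);
                NodeList names = prop.getElementsByTagName("name");
                NodeList values = prop.getElementsByTagName("value");
                if (names.getLength() == 0 || values.getLength() == 0) {
                    continue; // skip malformed <property> entries
                }
                if ("fs.defaultFS".equals(names.item(0).getTextContent().trim())) {
                    return URI.create(values.item(0).getTextContent().trim());
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse core-site.xml", e);
        }
        throw new IllegalArgumentException("fs.defaultFS not found");
    }

    public static void main(String[] args) {
        // Inline sample mirroring the snippet in this tutorial.
        String xml = "<configuration><property>"
                + "<name>fs.defaultFS</name><value>hdfs://master:9000</value>"
                + "</property></configuration>";
        URI fs = defaultFs(xml);
        System.out.println("Host: " + fs.getHost() + ", Port: " + fs.getPort());
    }
}
```

With the sample value from this tutorial, the host is master and the port is 9000, which are the two values the location dialog asks for.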
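(5) Download hadoop.dll and winutils.exe and place them in the bin directory of your Windows Hadoop installation (hadoop/bin). Also copy hadoop.dll into the Windows\System32 directory.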
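Missing native files are a common reason jobs fail to start from Eclipse on Windows, so it can help to check for them before running anything. The sketch below is a hypothetical pre-flight check (WinutilsCheck is not part of Hadoop); it assumes HADOOP_HOME is set as in step (1) and only inspects the bin directory, not System32.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical pre-flight check (not part of Hadoop).
final class WinutilsCheck {

    /** Lists the required native files that are missing from <hadoopHome>/bin. */
    static List<String> missingFiles(Path hadoopHome) {
        List<String> missing = new ArrayList<>();
        for (String name : new String[] {"winutils.exe", "hadoop.dll"}) {
            if (!Files.isRegularFile(hadoopHome.resolve("bin").resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String home = System.getenv("HADOOP_HOME");
        if (home == null || home.isEmpty()) {
            System.err.println("HADOOP_HOME is not set");
            return;
        }
        List<String> missing = missingFiles(Paths.get(home));
        System.out.println(missing.isEmpty()
                ? "Native files found in " + home + "/bin"
                : "Missing in bin: " + missing);
    }
}
```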
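(6) Create a project: File -> New -> Other -> Map/Reduce Project. The project name can be anything.
The wizard adds the Hadoop dependency jars to the project automatically.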
(7) Write a WordCount test
The code is as follows:
package testjar;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

    // Mapper: emits (word, 1) for every whitespace-separated token in a line.
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer (also used as combiner): sums the counts for each word.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Strip generic Hadoop options and keep only the program arguments.
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: wordcount <in> <out>");
            System.exit(2);
        }
        // Job.getInstance replaces the deprecated new Job(conf, name) constructor.
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
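To see what the job should produce before involving the cluster, the same tokenize-and-sum logic can be reproduced in plain Java. This is only a local sketch of the counting step, not the MapReduce job itself; the class LocalWordCount and the sample lines are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical local stand-in for the map + reduce logic (no Hadoop required).
final class LocalWordCount {

    /** Counts whitespace-separated tokens across all lines, sorted by word. */
    static Map<String, Integer> count(Iterable<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            // Same tokenization as TokenizerMapper: split on whitespace.
            StringTokenizer itr = new StringTokenizer(line);
            while (itr.hasMoreTokens()) {
                // Same aggregation as IntSumReducer: add 1 per occurrence.
                counts.merge(itr.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("hello hadoop", "hello eclipse");
        for (Map.Entry<String, Integer> e : count(lines).entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

For the two sample lines this prints eclipse 1, hadoop 1 and hello 2, one word per line with a tab between word and count. Comparing such a local result against the job output on a small file is a quick way to confirm the setup works end to end.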
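Create the input directory on HDFS (-p also creates the parent /wlt if it does not exist yet):
hadoop fs -mkdir -p /wlt/input
Upload a local text file (for example README.txt; test.txt is used here) into the input directory:
/usr/local/hadoop/bin/hdfs dfs -put test.txt /wlt/input/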
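Configure the run: Run As -> Run Configurations. The program expects exactly two arguments, so on the Arguments tab enter the HDFS input path and an output path that does not exist yet (for example hdfs://master:9000/wlt/input and hdfs://master:9000/wlt/output; the output directory name is only an illustration).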
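The two program arguments are simply HDFS paths resolved against the fs.defaultFS address. The sketch below shows one way to assemble them with java.net.URI; the class JobArgs is hypothetical, and the /wlt/output directory is an assumed name, not something fixed by this tutorial.

```java
import java.net.URI;

// Hypothetical helper: builds the <in> <out> arguments for the WordCount run configuration.
final class JobArgs {

    /** Resolves absolute HDFS directories against the fs.defaultFS address. */
    static String[] build(String defaultFs, String inputDir, String outputDir) {
        URI base = URI.create(defaultFs);
        return new String[] {
            base.resolve(inputDir).toString(),
            base.resolve(outputDir).toString()
        };
    }

    public static void main(String[] args) {
        // /wlt/output is an assumed name; it must not exist before the job runs.
        String[] jobArgs = build("hdfs://master:9000", "/wlt/input", "/wlt/output");
        System.out.println(String.join(" ", jobArgs));
    }
}
```

The printed line can be pasted into the program arguments field of the run configuration.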
Run the job to generate the results.
The environment is now set up successfully!