Testing a MapReduce Program
Source: Internet | Site: 程序博客网 | Posted: 2024/05/21 19:40
Below is a piece of code from 《Hadoop实战》 (Hadoop in Action). The test steps are as follows.
1. With the Hadoop environment already set up, append the line
export CLASSPATH=$($HADOOP_HOME/bin/hadoop classpath):$CLASSPATH
to the end of /etc/profile, so that the import statements in the code can be resolved. The full code is:
import java.io.IOException;
import java.util.*;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;

public class WordCount {
    // Mapper: emits (word, 1) for every whitespace-delimited token in a line.
    public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                output.collect(word, one);
            }
        }
    }

    // Reducer: sums the counts collected for each word.
    public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            output.collect(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("wordcount");
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);
        conf.setMapperClass(Map.class);
        conf.setReducerClass(Reduce.class);
        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        JobClient.runJob(conf);
    }
}
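The map/reduce logic above can be sanity-checked locally with plain JDK classes, before going anywhere near a cluster. The sketch below is not from the original post: it feeds the same two test lines used later in the walkthrough through a `StringTokenizer` and aggregates counts in a `TreeMap`, which stands in for the sort/shuffle phase (keys reach the reducer in sorted order).

```java
import java.util.StringTokenizer;
import java.util.TreeMap;

public class LocalWordCount {
    public static void main(String[] args) {
        // The same two input lines that are written to file01 and file02 below.
        String[] lines = { "Hello World Bye World", "Hello Hadoop Goodbye Hadoop" };

        // TreeMap stands in for the shuffle: keys arrive at the reducer sorted.
        TreeMap<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            // "map" phase: emit (word, 1) per token, as Map.map() does.
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                // merge() plays the "reduce" role: sum the 1s per key.
                counts.merge(tokenizer.nextToken(), 1, Integer::sum);
            }
        }

        // Print in sorted key order, like TextOutputFormat's tab-separated lines.
        for (java.util.Map.Entry<String, Integer> e : counts.entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

Run with `java LocalWordCount`; the combined counts should match the union of the part files shown in the results section (Bye 1, Goodbye 1, Hadoop 2, Hello 2, World 2).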
2. Save the code as /home/hadoop/codetest/helloworld/WordCount.java.
3. In /home/hadoop/codetest/helloworld/, run the following:
$ javac WordCount.java
$ jar -cvf wordcount.jar -C ./ .
$ hadoop dfs -mkdir /user
$ hadoop dfs -mkdir /user/hadoop
$ hadoop dfs -mkdir /user/hadoop/input
$ echo "Hello World Bye World" > file01
$ echo "Hello Hadoop Goodbye Hadoop" > file02
$ hadoop dfs -put file* /user/hadoop/input/
$ rm -f file*
$ hadoop jar wordcount.jar WordCount input output
When re-running the test, delete the output directory first:
$ hadoop dfs -rm -r /user/hadoop/output
4. Results
16/11/22 16:29:42 INFO client.RMProxy: Connecting to ResourceManager at hadoop-namenode/192.168.137.11:8032
16/11/22 16:29:42 INFO client.RMProxy: Connecting to ResourceManager at hadoop-namenode/192.168.137.11:8032
16/11/22 16:29:43 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
16/11/22 16:29:43 INFO mapred.FileInputFormat: Total input paths to process : 2
16/11/22 16:29:43 INFO mapreduce.JobSubmitter: number of splits:25
16/11/22 16:29:43 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1479800054795_0003
16/11/22 16:29:44 INFO impl.YarnClientImpl: Submitted application application_1479800054795_0003
16/11/22 16:29:44 INFO mapreduce.Job: The url to track the job: http://hadoop-namenode:8088/proxy/application_1479800054795_0003/
16/11/22 16:29:44 INFO mapreduce.Job: Running job: job_1479800054795_0003
16/11/22 16:29:51 INFO mapreduce.Job: Job job_1479800054795_0003 running in uber mode : false
16/11/22 16:29:51 INFO mapreduce.Job: map 0% reduce 0%
16/11/22 16:30:19 INFO mapreduce.Job: map 12% reduce 0%
16/11/22 16:30:21 INFO mapreduce.Job: map 16% reduce 0%
16/11/22 16:30:23 INFO mapreduce.Job: map 20% reduce 0%
16/11/22 16:30:25 INFO mapreduce.Job: map 28% reduce 0%
16/11/22 16:30:26 INFO mapreduce.Job: map 35% reduce 0%
16/11/22 16:30:27 INFO mapreduce.Job: map 36% reduce 0%
16/11/22 16:30:32 INFO mapreduce.Job: map 40% reduce 0%
16/11/22 16:30:37 INFO mapreduce.Job: map 43% reduce 0%
16/11/22 16:30:38 INFO mapreduce.Job: map 44% reduce 0%
16/11/22 16:30:40 INFO mapreduce.Job: map 53% reduce 0%
16/11/22 16:30:41 INFO mapreduce.Job: map 60% reduce 0%
16/11/22 16:30:46 INFO mapreduce.Job: map 68% reduce 0%
16/11/22 16:30:49 INFO mapreduce.Job: map 84% reduce 0%
16/11/22 16:30:54 INFO mapreduce.Job: map 96% reduce 0%
16/11/22 16:31:09 INFO mapreduce.Job: map 100% reduce 0%
16/11/22 16:31:13 INFO mapreduce.Job: map 100% reduce 25%
16/11/22 16:31:14 INFO mapreduce.Job: map 100% reduce 50%
16/11/22 16:31:16 INFO mapreduce.Job: map 100% reduce 100%
16/11/22 16:31:18 INFO mapreduce.Job: Job job_1479800054795_0003 completed successfully
16/11/22 16:31:18 INFO mapreduce.Job: Counters: 50
File System Counters
FILE: Number of bytes read=122
FILE: Number of bytes written=3353307
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=2942
HDFS: Number of bytes written=41
HDFS: Number of read operations=87
HDFS: Number of large read operations=0
HDFS: Number of write operations=8
Job Counters
Launched map tasks=25
Launched reduce tasks=4
Data-local map tasks=23
Rack-local map tasks=2
Total time spent by all maps in occupied slots (ms)=1080282
Total time spent by all reduces in occupied slots (ms)=81957
Total time spent by all map tasks (ms)=1080282
Total time spent by all reduce tasks (ms)=81957
Total vcore-seconds taken by all map tasks=1080282
Total vcore-seconds taken by all reduce tasks=81957
Total megabyte-seconds taken by all map tasks=1106208768
Total megabyte-seconds taken by all reduce tasks=83923968
Map-Reduce Framework
Map input records=2
Map output records=8
Map output bytes=82
Map output materialized bytes=698
Input split bytes=2600
Combine input records=0
Combine output records=0
Reduce input groups=5
Reduce shuffle bytes=698
Reduce input records=8
Reduce output records=5
Spilled Records=16
Shuffled Maps =100
Failed Shuffles=0
Merged Map outputs=100
GC time elapsed (ms)=4020
CPU time spent (ms)=22170
Physical memory (bytes) snapshot=6518628352
Virtual memory (bytes) snapshot=25351811072
Total committed heap usage (bytes)=4766302208
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=342
File Output Format Counters
Bytes Written=41
$ hadoop dfs -cat /user/hadoop/output/part-00000
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Goodbye 1
$ hadoop dfs -cat /user/hadoop/output/part-00001
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Bye 1
Hello 2
World 2
$ hadoop dfs -cat /user/hadoop/output/part-00002
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Hadoop 2
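The counts end up spread across several part files because the run used four reduce tasks (see "Launched reduce tasks=4" in the counters), and each key is routed to a reducer by the default HashPartitioner: non-negative hash of the key, modulo the number of reducers. The sketch below, which is not from the original post, illustrates that routing rule; note it is illustrative only, because Hadoop hashes the serialized Text bytes rather than using String.hashCode(), so the actual bucket a given word lands in may differ.

```java
public class PartitionDemo {
    // Default HashPartitioner rule: non-negative key hash modulo reducer count.
    // Assumption for illustration: String.hashCode() is used here, whereas
    // Hadoop's Text.hashCode() hashes the UTF-8 bytes, so real buckets can differ.
    static int partition(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        String[] words = { "Bye", "Goodbye", "Hadoop", "Hello", "World" };
        for (String w : words) {
            // Each word maps deterministically to one of part-00000..part-00003.
            System.out.println(w + " -> part-0000" + partition(w, 4));
        }
    }
}
```

This also explains why one of the four part files can be empty: no key happened to hash into that reducer's bucket.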