《Hadoop The Definitive Guide》ch05 Developing a MapReduce Application
Source: Internet. Editor: 程序博客网. Date: 2024/06/05 16:38
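1. Introduction
Developing a MapReduce application follows a specific workflow. First, write the map and reduce functions, ideally with unit tests to confirm that they behave as expected. Then write a driver program to run the job; running it from the IDE against a small subset of the data shows whether it works correctly.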
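The workflow above is easier to follow with a concrete case. The sketch below keeps the map and reduce logic in plain static methods so they can be checked without a cluster. It is only an illustration under stated assumptions: the class name MaxTemperatureLogic, the simplified "year,temperature" record format, and the local run() helper are all hypothetical and are not the book's code or the Hadoop API.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java stand-in for map and reduce logic (hypothetical, no Hadoop classes). */
class MaxTemperatureLogic {

    /** Map step: parse one "year,temperature" record (assumed format); null if malformed. */
    static Map.Entry<String, Integer> map(String line) {
        String[] fields = line.split(",");
        if (fields.length != 2) {
            return null;
        }
        try {
            int temperature = Integer.parseInt(fields[1].trim());
            return new AbstractMap.SimpleEntry<>(fields[0].trim(), temperature);
        } catch (NumberFormatException e) {
            return null; // skip records whose temperature is not a number
        }
    }

    /** Reduce step: maximum of the values for one key (expects at least one value). */
    static int reduce(Iterable<Integer> values) {
        int max = Integer.MIN_VALUE;
        for (int value : values) {
            max = Math.max(max, value);
        }
        return max;
    }

    /** Tiny local "driver": group map output by key, then reduce each group. */
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> groups = new TreeMap<>();
        for (String line : lines) {
            Map.Entry<String, Integer> kv = map(line);
            if (kv != null) {
                groups.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : groups.entrySet()) {
            result.put(group.getKey(), reduce(group.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList("1950,0", "1950,22", "1949,111", "bad line", "1949,78");
        System.out.println(run(sample)); // prints {1949=111, 1950=22}
    }
}
```

Because map() and reduce() take and return ordinary Java values, a unit test can call them directly on a handful of sample records before any driver or cluster is involved.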
2. GenericOptionsParser, Tool, and ToolRunner
[ate: /local/nomad2/hadoop/tomwhite-hadoop-book-32dae01 ]>> hadoop ConfigurationPrinter -conf conf/hadoop-localhost.xml | grep mapred.job.tracker=
mapred.job.tracker=localhost:8021
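In the book's example, ConfigurationPrinter is launched through ToolRunner, which passes the command line to GenericOptionsParser so that generic options such as -conf and -D are applied to the configuration before the tool's run() method receives the remaining arguments. The sketch below is only a plain-Java stand-in for that idea, not Hadoop's code: the class GenericOptionsSketch and its fields are hypothetical, it understands only the "-D key=value" form, and it does not read XML files the way -conf does.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for generic-option handling; not the Hadoop implementation. */
class GenericOptionsSketch {
    /** Effective configuration: defaults overlaid with -D overrides. */
    final Map<String, String> conf = new LinkedHashMap<>();
    /** Arguments left over for the application (for example, input and output paths). */
    final List<String> remainingArgs = new ArrayList<>();

    GenericOptionsSketch(Map<String, String> defaults, String[] args) {
        conf.putAll(defaults);
        for (int i = 0; i < args.length; i++) {
            boolean isOverride = args[i].equals("-D")
                    && i + 1 < args.length
                    && args[i + 1].contains("=");
            if (isOverride) {
                // Split on the first '=' only, so values may themselves contain '='.
                String[] kv = args[++i].split("=", 2);
                conf.put(kv[0], kv[1]);
            } else {
                remainingArgs.add(args[i]);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> defaults = new LinkedHashMap<>();
        defaults.put("mapred.job.tracker", "local"); // illustrative default value
        GenericOptionsSketch parsed = new GenericOptionsSketch(defaults,
                new String[] {"-D", "mapred.job.tracker=localhost:8021", "input", "output"});
        // Print every property as key=value, in the style of the ConfigurationPrinter output.
        for (Map.Entry<String, String> entry : parsed.conf.entrySet()) {
            System.out.println(entry.getKey() + "=" + entry.getValue());
        }
        System.out.println("remaining: " + parsed.remainingArgs);
    }
}
```

The point of the separation is that the application code only ever sees its own arguments, while cluster settings can be switched from the command line without recompiling.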
3. Problem: Hadoop keeps complaining that a class cannot be found?
Solution:
1. stop-all.sh
2. rm -rf /tmp/hadoop-nomad2/*
3. hadoop namenode -format
4. start-all.sh
5. jps (confirm that the DataNode process is up)
6. Rerun the program. Note that the jar file here is picked up from HADOOP_CLASSPATH, not from HDFS.
[ate: /local/nomad2/hadoop/tomwhite-hadoop-book-32dae01 ]>> hadoop jar ch05.jar v3.MaxTemperatureDriver input/ncdc/all max-temp
12/07/03 01:33:40 INFO mapred.FileInputFormat: Total input paths to process : 2
12/07/03 01:33:40 INFO mapred.JobClient: Running job: job_201207030133_0001
12/07/03 01:33:41 INFO mapred.JobClient: map 0% reduce 0%
12/07/03 01:33:55 INFO mapred.JobClient: map 100% reduce 0%
12/07/03 01:34:07 INFO mapred.JobClient: map 100% reduce 100%
12/07/03 01:34:12 INFO mapred.JobClient: Job complete: job_201207030133_0001
12/07/03 01:34:12 INFO mapred.JobClient: Counters: 26
12/07/03 01:34:12 INFO mapred.JobClient:   Job Counters
12/07/03 01:34:12 INFO mapred.JobClient:     Launched reduce tasks=1
12/07/03 01:34:12 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=16347
12/07/03 01:34:12 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
12/07/03 01:34:12 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
12/07/03 01:34:12 INFO mapred.JobClient:     Launched map tasks=2
12/07/03 01:34:12 INFO mapred.JobClient:     Data-local map tasks=2
12/07/03 01:34:12 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=10004
12/07/03 01:34:12 INFO mapred.JobClient:   File Input Format Counters
12/07/03 01:34:12 INFO mapred.JobClient:     Bytes Read=147972
12/07/03 01:34:12 INFO mapred.JobClient:   File Output Format Counters
12/07/03 01:34:12 INFO mapred.JobClient:     Bytes Written=18
12/07/03 01:34:12 INFO mapred.JobClient:   FileSystemCounters
12/07/03 01:34:12 INFO mapred.JobClient:     FILE_BYTES_READ=28
12/07/03 01:34:12 INFO mapred.JobClient:     HDFS_BYTES_READ=148184
12/07/03 01:34:12 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=62923
12/07/03 01:34:12 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=18
12/07/03 01:34:12 INFO mapred.JobClient:   Map-Reduce Framework
12/07/03 01:34:12 INFO mapred.JobClient:     Map output materialized bytes=34
12/07/03 01:34:12 INFO mapred.JobClient:     Map input records=13130
12/07/03 01:34:12 INFO mapred.JobClient:     Reduce shuffle bytes=34
12/07/03 01:34:12 INFO mapred.JobClient:     Spilled Records=4
12/07/03 01:34:12 INFO mapred.JobClient:     Map output bytes=118161
12/07/03 01:34:12 INFO mapred.JobClient:     Map input bytes=1777168
12/07/03 01:34:12 INFO mapred.JobClient:     Combine input records=13129
12/07/03 01:34:12 INFO mapred.JobClient:     SPLIT_RAW_BYTES=212
12/07/03 01:34:12 INFO mapred.JobClient:     Reduce input records=2
12/07/03 01:34:12 INFO mapred.JobClient:     Reduce input groups=2
12/07/03 01:34:12 INFO mapred.JobClient:     Combine output records=2
12/07/03 01:34:12 INFO mapred.JobClient:     Reduce output records=2
12/07/03 01:34:12 INFO mapred.JobClient:     Map output records=13129
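The counters in this output can be cross-checked against each other: the 13129 map output records all went through the combiner (Combine input records=13129), which reduced them to 2 records, matching Reduce input records=2. As a rough sketch of how such a check could be automated, the following plain-Java helper extracts name=value counters from JobClient log lines. The class name JobCounterParser is hypothetical, and the code assumes one log entry per line in the format shown above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that pulls name=value counters out of JobClient log lines. */
class JobCounterParser {
    // Captures the counter name (up to the last '=') and its trailing numeric value.
    private static final Pattern COUNTER =
            Pattern.compile("INFO mapred\\.JobClient:\\s*(.+)=(\\d+)\\s*$");

    /** Returns counters in log order; lines without a name=value pair are ignored. */
    static Map<String, Long> parse(Iterable<String> logLines) {
        Map<String, Long> counters = new LinkedHashMap<>();
        for (String line : logLines) {
            Matcher matcher = COUNTER.matcher(line);
            if (matcher.find()) {
                counters.put(matcher.group(1).trim(), Long.parseLong(matcher.group(2)));
            }
        }
        return counters;
    }

    public static void main(String[] args) {
        Map<String, Long> counters = parse(Arrays.asList(
                "12/07/03 01:34:12 INFO mapred.JobClient: Job complete: job_201207030133_0001",
                "12/07/03 01:34:12 INFO mapred.JobClient:     Combine input records=13129",
                "12/07/03 01:34:12 INFO mapred.JobClient:     Combine output records=2",
                "12/07/03 01:34:12 INFO mapred.JobClient:     Map output records=13129"));
        // The combiner should have consumed every map output record.
        boolean consistent = counters.get("Map output records")
                .equals(counters.get("Combine input records"));
        System.out.println("combiner saw all map output: " + consistent);
    }
}
```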
4. MapReduce web UI
http://server:50030
This page shows the detailed information for a job.