Hadoop Streaming Input and Output
Source: Internet  Published: c 语言入门经典  Editor: 程序博客网  Time: 2024/06/05 16:54
StreamJob.java
run() method:
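init(); creates the Environment env_ object
preProcessArgs();
parseArgv(); parses the Hadoop Streaming command-line arguments and assigns them to StreamJob member variables
postProcessArgs(); checks that the input arguments are complete, valid, and sufficient
setJobConf(); configures the MapReduce job's parameters from the command-line arguments parsed above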
JobConf jobConf_: general MapRed job properties
Configuration config_: passed as the parameter when creating the JobConf object
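The call order inside run() can be sketched with a small stand-in class. This is only an illustration of the sequence described above, not Hadoop's code: the class name StreamJobSketch, the options and jobProps maps, and the property keys "sketch.input" and "sketch.output" are all hypothetical, and the argument parsing assumes simple "-flag value" pairs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for StreamJob. It mirrors only the call order of run();
// every method body is a placeholder, not Hadoop's implementation.
class StreamJobSketch {
    private final List<String> steps = new ArrayList<>();
    private final Map<String, String> options = new LinkedHashMap<>();
    private final Map<String, String> jobProps = new LinkedHashMap<>();
    private String[] argv = new String[0];

    int run(String[] args) {
        argv = args.clone();
        init();
        preProcessArgs();
        parseArgv();
        postProcessArgs();
        setJobConf();
        return 0;
    }

    // The real init() creates the Environment object; here we only log the step.
    private void init() { steps.add("init"); }

    private void preProcessArgs() { steps.add("preProcessArgs"); }

    // Assumption: arguments come as "-flag value" pairs.
    private void parseArgv() {
        steps.add("parseArgv");
        for (int i = 0; i + 1 < argv.length; i += 2) {
            options.put(argv[i], argv[i + 1]);
        }
    }

    // Completeness check: fail early if a required argument is missing.
    private void postProcessArgs() {
        steps.add("postProcessArgs");
        for (String required : new String[] {"-input", "-output"}) {
            if (!options.containsKey(required)) {
                throw new IllegalArgumentException("Required argument: " + required);
            }
        }
    }

    // Copy parsed arguments into job properties (hypothetical keys).
    private void setJobConf() {
        steps.add("setJobConf");
        jobProps.put("sketch.input", options.get("-input"));
        jobProps.put("sketch.output", options.get("-output"));
    }

    List<String> steps() { return steps; }

    Map<String, String> jobProps() { return jobProps; }

    public static void main(String[] args) {
        StreamJobSketch job = new StreamJobSketch();
        job.run(new String[] {"-input", "in", "-output", "out"});
        System.out.println(job.steps());
        System.out.println(job.jobProps());
    }
}
```

Putting validation in its own step after parsing means setJobConf() can assume that every required value is present.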
Class fmt = TextInputFormat.class
TextInputFormat implements the InputFormat interface (through its parent class FileInputFormat):
public interface InputFormat<K,V>
InputFormat describes the input-specification for a Map-Reduce job. The Map-Reduce framework relies on the InputFormat of the job to:
- Validate the input-specification of the job.
- Split-up the input file(s) into logical InputSplits, each of which is then assigned to an individual Mapper.
- Provide the RecordReader implementation to be used to glean input records from the logical InputSplit for processing by the Mapper.
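The three responsibilities listed above can be modelled in plain Java without any Hadoop dependency. The sketch below is a simplified illustration, not the Hadoop API: the names LineInputFormatSketch, Split, Rec, getSplits and readRecords are hypothetical, the input is an in-memory ASCII string instead of files, and splits are aligned to line boundaries for simplicity. As with TextInputFormat, each record's key is the offset of the line and its value is the line's text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, Hadoop-free model of what an InputFormat does.
class LineInputFormatSketch {
    // A logical split: the range [start, start + length) of the input.
    static final class Split {
        final int start;
        final int length;
        Split(int start, int length) { this.start = start; this.length = length; }
    }

    // One input record: key = offset of the line, value = the line itself.
    static final class Rec {
        final long offset;
        final String line;
        Rec(long offset, String line) { this.offset = offset; this.line = line; }
    }

    // Responsibilities 1 and 2: validate the input, then cut it into splits.
    static List<Split> getSplits(String input, int numSplits) {
        if (input == null || numSplits < 1) {
            throw new IllegalArgumentException("invalid input specification");
        }
        List<Split> splits = new ArrayList<>();
        int target = Math.max(1, input.length() / numSplits);
        int start = 0;
        while (start < input.length()) {
            int end = Math.min(input.length(), start + target);
            // Extend to just past the next newline so no line straddles two splits.
            int nl = input.indexOf('\n', end - 1);
            end = (nl == -1) ? input.length() : nl + 1;
            splits.add(new Split(start, end - start));
            start = end;
        }
        return splits;
    }

    // Responsibility 3: the record-reader role, turning one split into records.
    static List<Rec> readRecords(String input, Split split) {
        List<Rec> out = new ArrayList<>();
        int pos = split.start;
        int limit = split.start + split.length;
        while (pos < limit) {
            int nl = input.indexOf('\n', pos);
            int lineEnd = (nl == -1 || nl >= limit) ? limit : nl;
            out.add(new Rec(pos, input.substring(pos, lineEnd)));
            pos = lineEnd + 1;
        }
        return out;
    }

    public static void main(String[] args) {
        String data = "a\nbb\nccc\n";
        for (Split s : getSplits(data, 2)) {
            for (Rec r : readRecords(data, s)) {
                System.out.println(r.offset + "\t" + r.line);
            }
        }
    }
}
```

In this model each split would be handed to a separate mapper, which only ever sees the records produced from its own split.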