Hadoop in Action Note

Source: Internet | Editor: 程序博客网 | Time: 2024/05/16 13:05

MapReduce vs. Relational Database

  • scale out vs. scale up
  • key/value pairs vs. relational tables
  • functional programming vs. declarative query
  • offline batch processing vs. online transactions (MapReduce behaves more like a data warehouse)
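
To make the key/value and functional-programming points above concrete, here is a minimal plain-Java sketch of a word count written as a map step, a grouping step, and a reduce step. It uses no Hadoop classes; the class name WordCountSketch and its methods are hypothetical and only illustrate the model. A relational database would express the same result declaratively, with something like SELECT word, COUNT(*) FROM words GROUP BY word (the table name is also made up).

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical plain-Java sketch of the MapReduce model; no Hadoop classes are used.
class WordCountSketch {

    // Map phase: turn one input line into a list of (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Reduce phase: sum all values that share a key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Driver: map every line, group the pairs by key (the "shuffle"), then reduce each group.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            counts.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return counts;
    }
}
```

The programmer writes the map and reduce functions explicitly, while the grouping in between is what a framework such as Hadoop would take over and distribute across machines.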

New API vs. Old API

  • introduced in Hadoop 0.20
  • classes moved from org.apache.hadoop.mapred(.lib) to org.apache.hadoop.mapreduce(.lib)
  • a context object was introduced to unify communication with the framework
    • this way, the map/reduce methods need not change when features are added later
    • only new methods need to be added to the context object
  • JobClient and JobConf are replaced by the Configuration and Job classes
    • the Configuration class configures a job
    • the Job class defines and controls the execution of a job
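
The context-object point above can be sketched in plain Java. In Hadoop's new API the map method receives a Mapper.Context and emits output with context.write(key, value), whereas the old API passed an OutputCollector and a Reporter as separate parameters. The miniature below is not Hadoop code: ContextSketch, its Context class, and the EMPTY_LINES counter are hypothetical stand-ins that only show why routing everything through one object keeps the map signature stable.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical miniature of the context-object idea; these are not Hadoop classes.
class ContextSketch {

    // The framework hands one Context to user code. A new capability
    // (here: counters) is added as a method on it, so map() keeps its signature.
    static class Context {
        final List<String> output = new ArrayList<>();
        final Map<String, Long> counters = new HashMap<>();

        // Emit one key/value pair.
        void write(String key, int value) {
            output.add(key + "\t" + value);
        }

        // Increment a named counter by one.
        void increment(String counter) {
            counters.merge(counter, 1L, Long::sum);
        }
    }

    // User code: the signature depends only on the input record and the Context.
    static void map(long offset, String line, Context context) {
        for (String word : line.trim().split("\\s+")) {
            if (word.isEmpty()) {
                // A blank line splits into a single empty token.
                context.increment("EMPTY_LINES");
                continue;
            }
            context.write(word, 1);
        }
    }

    // Stand-in for the framework loop that feeds records to map().
    static Context run(List<String> lines) {
        Context context = new Context();
        long offset = 0;
        for (String line : lines) {
            map(offset, line, context);
            offset += line.length() + 1; // +1 for the line terminator
        }
        return context;
    }
}
```

If the framework later needs another channel, for example status reporting, it can add a method to Context and existing map implementations continue to compile unchanged.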