2016/10/19


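The MapReduce process: the input data is split into blocks; the map function is applied to each record, one at a time, to produce key-value pairs; the pairs are grouped by key into sets; and these sets are then sorted.
The developer defines four stages: input -> key-value pairs, map, reduce, key-value pairs -> output.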
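
The flow above can be sketched in plain Python. This is only an in-memory illustration of map, group by key, sort, and reduce, not Hadoop's API; the word-count task and the names `map_fn`, `reduce_fn`, and `run_mapreduce` are assumptions made up for this example.

```python
from collections import defaultdict

def map_fn(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in record.split():
        yield word.lower(), 1

def reduce_fn(key, values):
    """Reduce: collapse all values collected for one key into a single pair."""
    return key, sum(values)

def run_mapreduce(records, map_fn, reduce_fn):
    groups = defaultdict(list)
    for record in records:                 # process one record at a time
        for key, value in map_fn(record):
            groups[key].append(value)      # group the pairs by key
    # sort the groups by key, then reduce each group
    return [reduce_fn(key, groups[key]) for key in sorted(groups)]

records = ["the quick fox", "the lazy dog", "The end"]
result = run_mapreduce(records, map_fn, reduce_fn)
print(result)
```

In a real cluster the grouping and sorting happen in the shuffle between the map and reduce tasks; here a dictionary and `sorted` stand in for that step.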

炼数成金

NameNode: metadata
Configuration file
- default file and site file
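
A rough sketch of how the two files combine, assuming the usual Hadoop behaviour that values in the site file override those in the default file. The XML strings and property values below are illustrative stand-ins, not copied from a real cluster, and `parse_properties` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

def parse_properties(xml_text):
    """Read <property><name>/<value> pairs from a Hadoop-style config document."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.findall("property")}

# Illustrative stand-in for a *-default.xml file
DEFAULT_XML = """<configuration>
  <property><name>dfs.replication</name><value>3</value></property>
  <property><name>dfs.blocksize</name><value>134217728</value></property>
</configuration>"""

# Illustrative stand-in for a *-site.xml file
SITE_XML = """<configuration>
  <property><name>dfs.replication</name><value>2</value></property>
</configuration>"""

# Defaults are loaded first; site values replace any default with the same name.
effective = {**parse_properties(DEFAULT_XML), **parse_properties(SITE_XML)}
print(effective)
```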
Secondary NameNode: not a backup; does housekeeping
- merges edits and fsimage
- edits: accumulates the changes since the last checkpoint
- fsimage: the last checkpoint
- fstime: contains the timestamp of the last checkpoint
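
The checkpoint merge can be pictured as replaying the edit log on top of the last image. The sketch below is a toy model only: the real fsimage and edits files are binary, and the dictionary layout, the `create`/`delete` operations, and the `checkpoint` function are all assumptions for illustration.

```python
import time

def checkpoint(fsimage, edits, now=None):
    """Replay the edit log on top of the last image and return the new state."""
    new_image = dict(fsimage)              # leave the old image untouched
    for op, path in edits:
        if op == "create":
            new_image[path] = {}
        elif op == "delete":
            new_image.pop(path, None)
    fstime = time.time() if now is None else now   # timestamp of this checkpoint
    return new_image, [], fstime           # the edit log starts empty again

fsimage = {"/a.txt": {}, "/b.txt": {}}
edits = [("create", "/c.txt"), ("delete", "/a.txt")]
new_image, new_edits, fstime = checkpoint(fsimage, edits, now=1000.0)
print(new_image, new_edits, fstime)
```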
Task Tracker:
- accepts requests for tasks such as map, reduce and shuffle
- slots = cores on the machine
- ??? What is the difference between multiprocessor and multicore ???
- heartbeat: tells the Job Tracker whether it is healthy and how many free slots are available
Job Tracker:
- scheduling: assigns tasks close to the data block
- determines the number of tasks
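
A minimal sketch of how a heartbeat and data locality might interact, assuming the Job Tracker prefers tasks whose input block is stored on the reporting node. The `Heartbeat` class, `assign_tasks`, and the node and task names are hypothetical and do not come from Hadoop's code.

```python
from dataclasses import dataclass

@dataclass
class Heartbeat:
    tracker: str      # host name of the Task Tracker sending the heartbeat
    healthy: bool
    free_slots: int

def assign_tasks(heartbeat, pending, block_locations):
    """Pick tasks for one tracker, preferring those whose block is stored on it."""
    if not heartbeat.healthy or heartbeat.free_slots == 0:
        return []
    local = [t for t in pending if heartbeat.tracker in block_locations[t]]
    remote = [t for t in pending if heartbeat.tracker not in block_locations[t]]
    return (local + remote)[:heartbeat.free_slots]   # data-local tasks go first

block_locations = {"map-0": {"node1", "node2"}, "map-1": {"node3"}, "map-2": {"node1"}}
pending = ["map-0", "map-1", "map-2"]
chosen = assign_tasks(Heartbeat("node1", True, 2), pending, block_locations)
print(chosen)
```

Here `node1` reports two free slots and receives the two tasks whose blocks it already holds, so no input data has to cross the network.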
YARN
- the idea is to have a global resource manager and a per-application Application Master.
- components
  - global resource manager
    - primarily a scheduler
    - ensures optimal cluster utilization
  - node manager
    - local resource manager
    - slave service.
    - takes requests from the resource manager and allocates containers to applications
    - each node has its own node manager
  - application-specific application master
    - is the key differentiator between the older MapReduce v1 framework and YARN
    - each type of application has its own application master
    - improved scalability
    - a more generic framework
  - scheduler
  - container
    - CPU and memory
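
The division of labour among these components might be sketched as follows. This is a simplified model: the first-fit placement is an assumption chosen for brevity rather than how YARN's schedulers actually decide, and the class and method names are made up for illustration, not YARN's API.

```python
from dataclasses import dataclass

@dataclass
class NodeManager:
    name: str
    free_vcores: int
    free_memory_mb: int

    def allocate(self, vcores, memory_mb):
        """Carve a container out of this node's free CPU and memory, if it fits."""
        if vcores > self.free_vcores or memory_mb > self.free_memory_mb:
            return None
        self.free_vcores -= vcores
        self.free_memory_mb -= memory_mb
        return {"node": self.name, "vcores": vcores, "memory_mb": memory_mb}

class ResourceManager:
    """Global scheduler: hands each request to the first node with room."""
    def __init__(self, nodes):
        self.nodes = nodes

    def request_container(self, vcores, memory_mb):
        for node in self.nodes:
            container = node.allocate(vcores, memory_mb)
            if container is not None:
                return container
        return None   # no node can satisfy the request right now

rm = ResourceManager([NodeManager("node1", 2, 2048), NodeManager("node2", 4, 8192)])
# A hypothetical application master asking for three identical containers
containers = [rm.request_container(2, 2048) for _ in range(3)]
print(containers)
```

Each container is just a bundle of CPU and memory on one node; the application master decides what to run inside it.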

0 0