Little Tips for Hadoop

Source: Internet. Editor: 程序博客网. Time: 2024/04/29 23:11

1. Hadoop (HDFS)


[1] NameNode and DataNode

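HDFS has a master/slave architecture.

     NameNode: the central master server; manages the file system (FS) namespace and regulates client access to files.

     DataNode: usually one per node in the cluster; manages the storage attached to the node it runs on.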

  

     A user stores data as files. Each file is split into one or more blocks, and these blocks are stored on a set of DataNodes.

     The NameNode executes FS namespace operations such as opening, closing, and renaming files and directories, and it determines the mapping of blocks to DataNodes. The DataNodes serve read and write requests from clients, and they create, delete, and replicate blocks on instruction from the NameNode.
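To make the split-and-map idea concrete, here is a minimal toy sketch in Python. It is not Hadoop code: the function names `split_into_blocks` and `place_blocks`, the node names, and the round-robin placement are all assumptions made for illustration. Real HDFS chooses replica locations with a rack-aware policy described in the design document.

```python
def split_into_blocks(file_size, block_size):
    """Return the byte sizes of the blocks that a file is cut into."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        sizes.append(remainder)  # the last block holds only the leftover bytes
    return sizes


def place_blocks(num_blocks, datanodes, replication):
    """Toy round-robin placement (an assumption, not the real HDFS policy):
    block i is stored on `replication` consecutive nodes, starting at node i."""
    if replication > len(datanodes):
        raise ValueError("not enough DataNodes for the requested replication")
    return {
        block_id: [datanodes[(block_id + r) % len(datanodes)]
                   for r in range(replication)]
        for block_id in range(num_blocks)
    }


# A 150-byte file with 64-byte blocks, two replicas, four hypothetical nodes.
sizes = split_into_blocks(150, 64)
block_map = place_blocks(len(sizes), ["dn1", "dn2", "dn3", "dn4"], 2)
print(sizes)      # [64, 64, 22]
print(block_map)  # {0: ['dn1', 'dn2'], 1: ['dn2', 'dn3'], 2: ['dn3', 'dn4']}
```

The dictionary plays the role of the block-to-DataNode mapping that the NameNode keeps: a client asks the master where a block lives, then reads the bytes from one of the listed DataNodes.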

 

 

An application can specify the number of replicas of a file.

The blocks of a file are replicated for fault tolerance. The block size and replication factor are configurable per file.
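As a rough illustration of what these two per-file settings mean for storage, the sketch below computes how many blocks and block replicas a file produces. The helper name `storage_footprint` and the example values (a 200 MB file, 64 MB blocks, replication factor 3) are assumptions chosen for the example, not values taken from any particular cluster.

```python
MB = 1024 * 1024


def storage_footprint(file_size, block_size, replication):
    """Estimate block count, replica count, and raw bytes for one file."""
    num_blocks = -(-file_size // block_size)  # ceiling division on integers
    return {
        "blocks": num_blocks,
        "block_replicas": num_blocks * replication,
        # The last block is not padded, so raw usage is size times replication.
        "raw_bytes": file_size * replication,
    }


footprint = storage_footprint(200 * MB, 64 * MB, 3)
print(footprint["blocks"])           # 4
print(footprint["block_replicas"])   # 12
print(footprint["raw_bytes"] // MB)  # 600
```

Raising the replication factor improves fault tolerance at a linear cost in raw disk space, while the block size mainly changes how many blocks the NameNode has to track for the file.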

Files in HDFS are write-once and have strictly one writer at any time.

The NameNode makes all decisions regarding replication of blocks. It periodically receives a Heartbeat and a Blockreport from each of the DataNodes in the cluster. Receipt of a Heartbeat implies that the DataNode is functioning properly. A Blockreport contains a list of all blocks on a DataNode.
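The Heartbeat and Blockreport bookkeeping can be sketched with a small toy model. Everything here is hypothetical: the class `ToyNameNode`, its method names, and the timeout value are made up for illustration and do not mirror the real NameNode implementation. Timestamps are passed in explicitly so the example stays deterministic.

```python
class ToyNameNode:
    """Toy model of how a master could track DataNode liveness and replicas."""

    def __init__(self, heartbeat_timeout):
        self.heartbeat_timeout = heartbeat_timeout
        self.last_heartbeat = {}  # datanode name -> time of last Heartbeat
        self.block_reports = {}   # datanode name -> set of block ids it holds

    def heartbeat(self, datanode, now):
        self.last_heartbeat[datanode] = now

    def block_report(self, datanode, block_ids):
        # A Blockreport replaces the previous view of that DataNode's blocks.
        self.block_reports[datanode] = set(block_ids)

    def live_datanodes(self, now):
        return {dn for dn, seen in self.last_heartbeat.items()
                if now - seen <= self.heartbeat_timeout}

    def under_replicated(self, now, target):
        """Return {block_id: live_replica_count} for blocks below `target`."""
        live = self.live_datanodes(now)
        counts = {}
        for datanode, blocks in self.block_reports.items():
            for block_id in blocks:
                counts.setdefault(block_id, 0)
                if datanode in live:
                    counts[block_id] += 1
        return {b: c for b, c in counts.items() if c < target}


nn = ToyNameNode(heartbeat_timeout=10)
for dn in ("dn1", "dn2", "dn3"):
    nn.heartbeat(dn, now=0)
nn.block_report("dn1", ["b1", "b2"])
nn.block_report("dn2", ["b1", "b2"])
nn.block_report("dn3", ["b1"])

# dn2 stops sending Heartbeats; dn1 and dn3 keep reporting.
nn.heartbeat("dn1", now=20)
nn.heartbeat("dn3", now=20)
print(nn.under_replicated(now=25, target=2))  # {'b2': 1}
```

In this model, a block that falls below its target count is the signal on which a master would schedule a new replica on another live DataNode.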

 


For more information, refer to http://hadoop.apache.org/common/docs/r0.18.2/hdfs_design.html

