An Introduction to Hadoop

Source: the Internet · Editor: Programming Blog Network · Time: 2024/06/05 06:57

Apache Hadoop is a framework for reliable, scalable, distributed computing.

The Apache Hadoop library allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from a single server to thousands of machines, each offering local computation and storage. Rather than relying on hardware to deliver high availability, the library itself is designed to detect and handle failures at the application layer.

The project includes the following modules:

1 Hadoop Common: the common utilities that support the other Hadoop modules.

2 Hadoop Distributed File System (HDFS): a distributed file system that provides high-throughput access to application data.

3 Hadoop YARN: a framework for job scheduling and cluster resource management.

4 Hadoop MapReduce: a YARN-based system for parallel processing of large data sets.
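The MapReduce model named above can be sketched with the classic word-count example. The following is a toy, single-process simulation in plain Python, written only to illustrate the map, shuffle, and reduce phases; real Hadoop jobs are typically written in Java against the Hadoop MapReduce API and run across a cluster:

```python
from collections import defaultdict

def map_phase(line):
    # Map: emit a (word, 1) pair for every word in an input line.
    return [(word.lower(), 1) for word in line.split()]

def shuffle_phase(pairs):
    # Shuffle: group values by key, as the framework does between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Reduce: combine all counts for one word into a total.
    return key, sum(values)

# Hypothetical input lines standing in for files stored in HDFS.
lines = [
    "hadoop stores data in hdfs",
    "yarn schedules jobs",
    "hadoop runs mapreduce jobs",
]
mapped = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle_phase(mapped).items())
print(counts["hadoop"], counts["jobs"])  # 2 2
```

In a real cluster, map tasks run in parallel on the nodes holding the data blocks, and the shuffle moves intermediate pairs over the network, which is what lets the same model scale to thousands of machines.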
