A Summary of the Subprojects Currently Included in Hadoop

Source: Internet Editor: 程序博客网 Time: 2024/05/01 08:18

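The Hadoop project now contains many subprojects. Some were split out of the original Hadoop project, and others evolved on top of Hadoop. This article simply quotes the Hadoop documentation's introduction to these subprojects, for reference.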


The project includes these subprojects:

  • Hadoop Common: The common utilities that support the other Hadoop subprojects.
  • Hadoop Distributed File System (HDFS™): A distributed file system that provides high-throughput access to application data.
  • Hadoop MapReduce: A software framework for distributed processing of large data sets on compute clusters.
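To make the MapReduce entry above more concrete, here is a minimal single-process sketch of the map, shuffle, and reduce steps, using word counting as the example. It is not the Hadoop API and does no distributed work. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group emitted values by key, standing in for the framework's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over an iterable of strings."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["hadoop stores data", "hadoop processes data"]))
```

In a real cluster the framework would run many map and reduce tasks in parallel on different machines and move the intermediate pairs between them. This sketch only mirrors the data flow inside one process.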

Other Hadoop-related projects at Apache include:

  • Avro™: A data serialization system.
  • Cassandra™: A scalable multi-master database with no single points of failure.
  • Chukwa™: A data collection system for managing large distributed systems.
  • HBase™: A scalable, distributed database that supports structured data storage for large tables.
  • Hive™: A data warehouse infrastructure that provides data summarization and ad hoc querying.
  • Mahout™: A scalable machine learning and data mining library.
  • Pig™: A high-level data-flow language and execution framework for parallel computation.
  • ZooKeeper™: A high-performance coordination service for distributed applications.
