A First Look at HBase: Getting to Know HBase

Source: Internet. Editor: 程序博客网. Time: 2024/06/06 15:02

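As the data storage system of the Hadoop ecosystem, HBase holds an important place in the big data technology stack. It is also the open-source counterpart to Bigtable, the system described in one of Google's three landmark papers. Let's start with the introduction on the HBase website: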

Welcome to Apache HBase™
Apache HBase™ is the Hadoop database, a distributed, scalable, big data store.

Use Apache HBase™ when you need random, realtime read/write access to your Big Data. This project’s goal is the hosting of very large tables – billions of rows X millions of columns – atop clusters of commodity hardware. Apache HBase is an open-source, distributed, versioned, non-relational database modeled after Google’s Bigtable: A Distributed Storage System for Structured Data by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, Apache HBase provides Bigtable-like capabilities on top of Hadoop and HDFS.

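HBase is the Hadoop database: a distributed, scalable big data store.
Use HBase when you need random, real-time read/write access to very large data sets.
The project's goal is to host very large tables, with billions of rows and millions of columns, on clusters of commodity hardware.
HBase is an open-source, distributed, versioned, non-relational database modeled after Google's Bigtable.
Just as Bigtable builds on the Google File System, HBase provides Bigtable-like capabilities on top of Hadoop and HDFS.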
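
To make "versioned" and "billions of rows, millions of columns" a little more concrete, here is a minimal sketch of the Bigtable-style data model as a sorted map from row key to column to timestamp to value. It is only an in-memory illustration: the class VersionedTableSketch and its methods put, get, getAsOf and scan are hypothetical names invented for this example, not the HBase client API, and row keys are compared as Java strings rather than as raw bytes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy in-memory model of a Bigtable-style table. All names are hypothetical;
// this is not the HBase client API and it stores nothing on HDFS.
class VersionedTableSketch {
    // row key -> column ("family:qualifier") -> timestamp (newest first) -> value
    private final NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> rows =
            new TreeMap<>();

    // Write one version of a cell. Earlier versions are kept, not overwritten.
    void put(String row, String column, long timestamp, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<String, NavigableMap<Long, String>>())
            .computeIfAbsent(column, c -> new TreeMap<Long, String>(Collections.reverseOrder()))
            .put(timestamp, value);
    }

    // Newest version of a cell, or null if that cell was never written (sparse).
    String get(String row, String column) {
        return getAsOf(row, column, Long.MAX_VALUE);
    }

    // Newest version written at or before the given timestamp, or null.
    String getAsOf(String row, String column, long timestamp) {
        NavigableMap<String, NavigableMap<Long, String>> columns = rows.get(row);
        if (columns == null) return null;
        NavigableMap<Long, String> versions = columns.get(column);
        if (versions == null) return null;
        // Timestamps are sorted newest first, so ceilingEntry finds the
        // largest timestamp that is <= the requested one.
        Map.Entry<Long, String> e = versions.ceilingEntry(timestamp);
        return e == null ? null : e.getValue();
    }

    // Row keys in [startRow, stopRow), returned in sorted order.
    List<String> scan(String startRow, String stopRow) {
        return new ArrayList<>(rows.subMap(startRow, true, stopRow, false).keySet());
    }

    public static void main(String[] args) {
        VersionedTableSketch table = new VersionedTableSketch();
        table.put("row2", "cf:a", 1L, "x");
        table.put("row1", "cf:a", 1L, "old");
        table.put("row1", "cf:a", 2L, "new");
        System.out.println(table.get("row1", "cf:a"));         // newest version
        System.out.println(table.getAsOf("row1", "cf:a", 1L)); // older version
        System.out.println(table.scan("row1", "row3"));        // rows in key order
    }
}
```

Because a row only holds entries for the columns that were actually written, a table can be very wide without every row paying for every column, which is the sense in which such tables are sparse.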

Features
Linear and modular scalability.
Strictly consistent reads and writes.
Automatic and configurable sharding of tables
Automatic failover support between RegionServers.
Convenient base classes for backing Hadoop MapReduce jobs with Apache HBase tables.
Easy to use Java API for client access.
Block cache and Bloom Filters for real-time queries.
Query predicate push down via server side Filters
Thrift gateway and a REST-ful Web service that supports XML, Protobuf, and binary data encoding options
Extensible jruby-based (JIRB) shell
Support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia; or via JMX

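Features, in brief
Scales linearly and modularly as nodes are added.
Reads and writes are strictly consistent.
Tables are sharded automatically, and the sharding is configurable.
RegionServers fail over automatically.
Convenient base classes let Hadoop MapReduce jobs be backed by Apache HBase tables.
An easy-to-use Java API for client access.
Block cache and Bloom filters speed up real-time queries.
Query predicates are pushed down to the server through server-side Filters.
A Thrift gateway and a RESTful web service that supports XML, Protobuf, and binary data encoding options.
An extensible JRuby-based (JIRB) shell.
Metrics can be exported through the Hadoop metrics subsystem to files or Ganglia, or through JMX.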
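
The sharding feature in the list is easier to picture with a small example. HBase splits a table into regions, each covering a contiguous range of row keys and served by a RegionServer. The sketch below models only that lookup idea with a sorted map from region start key to server name. RegionLookupSketch, addRegion, locate and split are hypothetical names made up for illustration, the server names are placeholders, and keys are compared as Java strings rather than bytes; a real client resolves regions through HBase itself.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model of finding the region that holds a row key. All names are
// hypothetical; this is not how the HBase client is implemented.
class RegionLookupSketch {
    // region start key -> name of the server assumed to host that region
    private final TreeMap<String, String> regionsByStartKey = new TreeMap<>();

    // Register a region that begins at startKey. The first region uses "".
    void addRegion(String startKey, String serverName) {
        regionsByStartKey.put(startKey, serverName);
    }

    // A region covers [its start key, the next region's start key), so the
    // owner of rowKey is the region with the greatest start key <= rowKey.
    String locate(String rowKey) {
        Map.Entry<String, String> region = regionsByStartKey.floorEntry(rowKey);
        if (region == null) {
            throw new IllegalStateException("no region covers row key: " + rowKey);
        }
        return region.getValue();
    }

    // Model a split: rows at or after splitKey now belong to a new region.
    void split(String splitKey, String newServerName) {
        regionsByStartKey.put(splitKey, newServerName);
    }

    public static void main(String[] args) {
        RegionLookupSketch table = new RegionLookupSketch();
        table.addRegion("", "rs1");   // first region starts at the empty key
        table.addRegion("m", "rs2");
        System.out.println(table.locate("apple"));
        System.out.println(table.locate("zebra"));
        table.split("t", "rs3");      // the second region grows too large and splits
        System.out.println(table.locate("zebra"));
    }
}
```

Keeping regions sorted by start key is what makes the lookup a single floor search, and it is also why a split only needs one new entry rather than a reshuffle of existing rows.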

Download: http://www.apache.org/dyn/closer.cgi/hbase/
Install it and give it a try first.
