Start HBase

Source: Internet    Editor: 程序博客网    Time: 2024/05/16 10:29
 

Some Research on HBase

 

1.    The Architecture of HBase

 

There are three major components of the HBase architecture:

1.         The HBaseMaster (analogous to the Bigtable master server)

2.         The HRegionServer (analogous to the Bigtable tablet server)

3.         The HBase client, defined by org.apache.hadoop.hbase.client.HTable

Figure: The Architecture of HBase

 

1)     HBase Master duties:

-         Cluster initialization

-         Assigning regions to HRegionServers and unassigning them again (unassignment is used for load balancing)

-         Monitoring the health and load of each HRegionServer

-         Handling table schema changes and other table administrative functions

-         Data localization
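The load-balancing duty listed above can be pictured with a toy model. This is only a sketch under stated assumptions: the function name `rebalance`, the server names and the "move one region at a time until the counts differ by at most one" rule are all hypothetical, and the real HBase master uses its own balancing logic.

```python
def rebalance(assignments):
    """Move regions from the busiest to the idlest server.

    `assignments` maps a (hypothetical) server name to its list of regions.
    Stops when region counts differ by at most one. Returns the moves made.
    """
    moves = []
    while True:
        busiest = max(assignments, key=lambda s: len(assignments[s]))
        idlest = min(assignments, key=lambda s: len(assignments[s]))
        if len(assignments[busiest]) - len(assignments[idlest]) <= 1:
            return moves
        region = assignments[busiest].pop()   # unassign from the busy server
        assignments[idlest].append(region)    # assign to the idle server
        moves.append((region, busiest, idlest))


cluster = {"rs1": ["a", "b", "c", "d"], "rs2": [], "rs3": ["e"]}
moves = rebalance(cluster)
print(moves)
print({server: len(regions) for server, regions in cluster.items()})
```

Counting regions is the simplest possible load measure; a real balancer would also have to weigh request load and data locality.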

 

2)     HBase Region Server duties:

-         Serving the HRegions assigned to it

-         Handling client read and write requests

-         Flushing the cache to HDFS

-         Maintaining the HLog (write-ahead log)

-         Compactions

-         Region splits

 

3)      HBase Client

HBase is a heavy-client system: each client manages its own connections to the appropriate servers.

2.    The process of committing and updating data

 

Figure: The process of committing and updating data

 

Write Requests

 

When a write request is received, it is first written to a write-ahead log called the HLog. Each region server keeps a single HLog, so write requests for all the regions it serves go to the same log. Once the request has been written to the HLog, it is stored in an in-memory cache called the Memcache (renamed MemStore in later HBase releases). There is one Memcache for each HStore.
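The ordering described above (log first, cache second, one log per server, one cache per store) can be sketched as a toy model. The class `ToyRegionServer` and its fields are hypothetical names for illustration, not HBase classes, and a store is assumed to be identified by a (region, column family) pair.

```python
from collections import defaultdict


class ToyRegionServer:
    """Toy model of the write path: one shared log, one cache per store."""

    def __init__(self):
        self.hlog = []                       # single write-ahead log for all regions
        self.memcaches = defaultdict(dict)   # (region, family) -> {(row, qualifier): value}

    def put(self, region, family, row, qualifier, value):
        # 1. Append to the shared log first, so the edit could be replayed after a crash.
        self.hlog.append((region, family, row, qualifier, value))
        # 2. Only then apply the edit to the in-memory cache of the target store.
        self.memcaches[(region, family)][(row, qualifier)] = value


server = ToyRegionServer()
server.put("region-a", "f1", "r1", "c1", "v1")
server.put("region-b", "f1", "r9", "c1", "v2")
print(len(server.hlog))          # edits for both regions land in the same log
print(sorted(server.memcaches))  # but each store has its own cache
```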

 

Read Requests

 

A read first checks the Memcache. If the requested data is not found there, the MapFiles are searched for it.
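A minimal sketch of that lookup order follows. The function `read_cell` is a hypothetical helper, the MapFiles are modelled as plain dictionaries ordered oldest to newest, and versions, timestamps and deletes are ignored for brevity.

```python
def read_cell(memcache, mapfiles, key):
    """Return the value for `key`, checking memory first, then files newest-first."""
    if key in memcache:
        return memcache[key]
    # Assumption: `mapfiles` is ordered oldest to newest, so search in reverse.
    for mapfile in reversed(mapfiles):
        if key in mapfile:
            return mapfile[key]
    return None


memcache = {("r2", "f1:c1"): "fresh"}
mapfiles = [{("r1", "f1:c1"): "old"}, {("r1", "f1:c1"): "newer"}]

print(read_cell(memcache, mapfiles, ("r2", "f1:c1")))  # served from the Memcache
print(read_cell(memcache, mapfiles, ("r1", "f1:c1")))  # served from the newest MapFile
```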

 

Cache Flushes

 

When the Memcache reaches a configurable size, it is flushed to disk, creating a new MapFile, and a marker is written to the HLog so that, when the log is replayed, entries before the last flush can be skipped. A flush may also be triggered to relieve memory pressure on the region server.

 

Cache flushes happen concurrently with the region server processing read and write requests. Just before the new MapFile is moved into place, reads and writes are suspended until the MapFile has been added to the list of active MapFiles for the HStore.
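The flush-and-marker behaviour in this section can be illustrated with a small single-threaded model. Everything here is an assumption made for the sketch: `ToyStore`, the `FLUSH` marker value and measuring the cache size in number of entries rather than bytes. Concurrency and the brief suspension of reads and writes are not modelled.

```python
FLUSH = "FLUSH"  # hypothetical marker record written to the log after a flush


class ToyStore:
    """Toy store: log, in-memory cache, and a list of flushed 'MapFiles'."""

    def __init__(self, flush_size=3):
        self.flush_size = flush_size  # assumption: size counted in entries
        self.hlog = []
        self.memcache = {}
        self.mapfiles = []

    def put(self, key, value):
        self.hlog.append((key, value))
        self.memcache[key] = value
        if len(self.memcache) >= self.flush_size:
            self.flush()

    def flush(self):
        # Write the cache out as a new sorted, immutable file, then mark the log.
        self.mapfiles.append(dict(sorted(self.memcache.items())))
        self.memcache = {}
        self.hlog.append(FLUSH)

    def replay(self):
        """Rebuild the cache from log entries written after the last marker."""
        last = max((i for i, e in enumerate(self.hlog) if e == FLUSH), default=-1)
        return dict(self.hlog[last + 1:])


store = ToyStore(flush_size=2)
store.put("r1", "a")
store.put("r2", "b")   # reaches the threshold and triggers a flush
store.put("r3", "c")   # stays in memory only
print(store.mapfiles)
print(store.replay())  # only the edit after the marker needs replaying
```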

 

 

 

3.    Basic Operations on HBase Tables

 

Operation                   HBase shell command

Create table                create 't1', {NAME => 'f1', VERSIONS => 5}
Add a column family         alter 't1', {NAME => 'f1', VERSIONS => 5}
Delete a column family      alter 't1', {NAME => 'f1', METHOD => 'delete'}
Get a row                   get 't1', 'r1'
Get cell content            get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
List all tables             list
Count the rows in a table   count 't1'
Scan a table                scan (see below)

Scanner specifications may include one or more of the following: LIMIT, STARTROW, STOPROW, TIMESTAMP, or COLUMNS:

1.      scan '.META.', {COLUMNS => 'info:regioninfo'}

2.      scan 't1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, \

             STARTROW => 'xyz'}
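To make the STARTROW, STOPROW and LIMIT options concrete, here is a toy scan over an in-memory table. The function `scan` below is a hypothetical stand-in, not the HBase client API. It assumes rows are kept sorted by key, that the start row is inclusive and the stop row exclusive, and that keys are plain ASCII strings.

```python
from bisect import bisect_left


def scan(rows, startrow="", stoprow=None, limit=None):
    """Return (key, value) pairs in key order, honouring start, stop and limit."""
    keys = sorted(rows)
    start = bisect_left(keys, startrow)   # first key >= startrow (inclusive)
    results = []
    for key in keys[start:]:
        if stoprow is not None and key >= stoprow:
            break                          # stop row itself is excluded
        if limit is not None and len(results) >= limit:
            break
        results.append((key, rows[key]))
    return results


table = {"abc": 1, "xyz": 2, "xyzz": 3, "zzz": 4}
print(scan(table, startrow="xyz", limit=2))
print(scan(table, startrow="b", stoprow="xyzz"))
```

Because rows are sorted, a scan with a start row can jump straight to the right position instead of reading the table from the beginning.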

 

 

4.    Check the status:

1)      Check the DFS namenode status: http://{DFSNameNodeIP}:50070/dfshealth.jsp

2)      Check the HBase status:  http://{masterServerIP}:60010/master.jsp
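The two status pages can also be probed from a script. The sketch below builds the URLs given above and checks whether a page answers. The helper names `status_urls` and `is_up` are hypothetical, the host addresses are placeholders, and the ports and paths are the ones listed in this section, which may differ in other Hadoop and HBase versions.

```python
from urllib.error import URLError
from urllib.request import urlopen


def status_urls(namenode_host, master_host):
    """Build the status page URLs listed above for the given hosts."""
    return {
        "namenode": f"http://{namenode_host}:50070/dfshealth.jsp",
        "hbase_master": f"http://{master_host}:60010/master.jsp",
    }


def is_up(url, timeout=3.0):
    """Return True if the page answers with HTTP 200 within the timeout."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (URLError, OSError):
        return False


# Placeholder addresses; replace them with the real NameNode and master hosts.
urls = status_urls("192.0.2.10", "192.0.2.11")
for name, url in urls.items():
    print(name, url)
```

Calling `is_up(urls["hbase_master"])` then gives a simple yes or no that a monitoring script could act on.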

 

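5.    Common errors and how to resolve them

Error: org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /tmp/hadoop-root/dfs/data: namenode namespaceID = 1743665947; datanode namespaceID = 352063137

Cause: Incompatible namespaceIDs. The namespaceID stored in the DataNode's data directory no longer matches the NameNode's, which typically happens after the NameNode has been reformatted.

Resolution: rm -rf /tmp/*  (this deletes the DataNode's stored blocks under /tmp/hadoop-root/dfs/data along with everything else in /tmp)

Error: NativeException: org.apache.hadoop.hbase.TableNotDisabledException: org.apache.hadoop.hbase.TableNotDisabledException: t1

Cause: A table must be disabled before one of its column families can be altered.

Resolution: hbase> disable 't1'  (then run the alter command again and enable the table afterwards)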

 

6.    How to deploy HBase

http://hi.baidu.com/webcell/blog/item/83ee17303e7d2391a8018e5d.html/cmtid/1ae33bfa2d7bfe14a9d311b5

 

7.    How to tune HBase performance

http://wiki.apache.org/hadoop/PerformanceTuning

 
