How does HBase perform load balancing?

Source: Internet. Editor: 程序博客网. Time: 2024/06/14 09:26

MauMau asked the following question (the HBase version is presumably 0.20.xx):

[Q1] Load balancing
Does HBase move regions to a newly added region server (logically, not
physically on storage) immediately? If not immediately, what timing?
On what criteria does the master unassign and assign regions among region
servers? CPU load, read/write request rates, or just the number of regions
the region servers are handling?

According to the HBase design overview on the page below, the master monitors
the load of each region server and moves regions.

http://wiki.apache.org/hadoop/Hbase/DesignOve...

The related part is the following:

----------------------------------------
HMaster duties:
Assigning/unassigning regions to/from HRegionServers (unassigning is for
load balance)
Monitor the health and load of each HRegionServer
...
If HMaster detects overloaded or low loaded HRegionServer, it will unassign
(close) some regions from most loaded HRegionServer. Unassigned regions
will be assigned to low loaded servers.
----------------------------------------

When I read the above, I thought that the master checks the load of region
servers periodically (once a few minutes or so) and performs load balancing.
And I thought that the master unassigns some regions from the existing
loaded region servers to a newly added one immediately when the new server
joins the cluster and contacts the master.
However, the benchmark report by Yahoo! Research says the following. It
says that HBase does not move regions until compaction (blogger's note: this
should read "move data"), so I cannot get the effect of adding new servers
immediately, even if I add a new server to solve an overload problem.
What is actually the case?

Benchmark report by Yahoo!

----------------------------------------

6.7 Elastic Speedup
As the figure shows, the read latency spikes initially after
the sixth server is added, before the latency stabilizes at a
value slightly lower than the latency for five servers. This result
indicates that HBase is able to shift read and write load
to the new server, resulting in lower latency. HBase does
not move existing data to the new server until compactions
occur [2]. The result is less latency variance compared to Cassandra
since there is no repartitioning process competing
for the disk. However, the new server is underutilized, since
existing data is served off the old servers.
...
[2] It is possible to run the HDFS load balancer to force data
to the new servers, but this greatly disrupts HBase’s ability
to serve data partitions from the same servers on which they
are stored.

Ryan Rawson's reply:

hey,

HBase currently uses region count to load balance. Regions are
assigned in a semi-randomish order to other regionservers.
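
To make the region-count idea concrete, here is a minimal sketch in Python. It is not HBase's balancer code: the function name, the `slop` parameter, and the server and region names are all made up for illustration, and it picks regions deterministically rather than in the semi-random order described above. It only shows the principle that the number of regions per server, and nothing else, drives the moves.

```python
def balance_by_region_count(assignments, slop=1):
    """Plan moves that even out the number of regions per server.

    assignments: dict mapping server name -> list of region names.
    slop: largest tolerated gap between the busiest and idlest server
          (hypothetical knob; must be >= 1 so the loop can settle).
    Returns (moves, new_assignments), where moves is a list of
    (region, source_server, target_server) tuples.
    """
    if slop < 1:
        raise ValueError("slop must be at least 1")
    # Work on a copy so the caller's view of the cluster is untouched.
    state = {server: list(regions) for server, regions in assignments.items()}
    moves = []
    while True:
        busiest = max(state, key=lambda s: len(state[s]))
        idlest = min(state, key=lambda s: len(state[s]))
        # Only region counts are compared; CPU and request rate are ignored.
        if len(state[busiest]) - len(state[idlest]) <= slop:
            break
        region = state[busiest].pop()
        state[idlest].append(region)
        moves.append((region, busiest, idlest))
    return moves, state


# Five servers with six regions each, plus a newly added empty server.
cluster = {f"rs{i}": [f"region-{i}-{j}" for j in range(6)] for i in range(1, 6)}
cluster["rs6"] = []
moves, balanced = balance_by_region_count(cluster)
for region, source, target in moves:
    print(f"move {region}: {source} -> {target}")
```

In this toy run the new server ends up with one region from each old server. Only the assignment changes; as the benchmark report notes, the underlying data files stay where they are until compactions rewrite them.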

The paper is somewhat correct in that we are not moving data around
aggressively, because then people would write in complaining we move
data around too much :-)

So a few notes, HBase is not a key-value store, it's a tabular data
store, which maintains key order, and allows the easy construction of
left-match key indexes.
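
The point about key order and left-match indexes can be illustrated with a small Python sketch. It does not use the HBase client API. It assumes only that rows are kept sorted by key, and the row keys and the `prefix_scan` helper are invented for the example. With sorted keys, a prefix lookup is one seek to the first candidate followed by a sequential read.

```python
import bisect


def prefix_scan(sorted_keys, prefix):
    """Return all keys starting with prefix from an already sorted list.

    Sorted order puts every key that shares a prefix in one contiguous
    run, so we binary-search for the start of the run and read forward
    until the prefix stops matching.
    """
    index = bisect.bisect_left(sorted_keys, prefix)
    matches = []
    while index < len(sorted_keys) and sorted_keys[index].startswith(prefix):
        matches.append(sorted_keys[index])
        index += 1
    return matches


# Hypothetical composite row keys of the form "<entity>:<attribute>".
rows = sorted([
    "user1:email",
    "user1:name",
    "user2:email",
    "user10:name",
    "item7:price",
])
print(prefix_scan(rows, "user1:"))
```

A hash-partitioned store cannot answer this kind of query without touching every partition, because keys with a common prefix are scattered by the hash.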

One other thing... if you are using a DHT (eg: cassandra), when a node
fails the load moves to the other servers in the ring-segment. For
example if you have N=3 and you lose a node in a segment, the load of
a server would move to 2 other servers. Your monitoring system should
probably be tied into the DHT topology since if a second node fails in
the same ring you probably want to take action. Ironically nodes in
cassandra are special (unlike the publicly stated info) and they
"belong" to a particular ring segment and cannot be used to store
other data. There are tools to do node swap in, but you want your
cluster management to be as automated as possible.

Compared to a bigtable architecture, the load of a failed regionserver
is evenly spread across the entire rest of the cluster. No node has a
special role in HDFS and HBase, any data can be hosted and served from
any node. As nodes fail, as long as you have enough nodes to serve
the load you are in good shape. The HDFS missing block report lets you
know when you have lost too many nodes. Nodes have no special role and
can host and hold any data.
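
The difference between the two failure modes can be put into numbers with a toy model. The Python sketch below is a simplification under stated assumptions: every node starts with the same load, the DHT replicas are taken to be the next nodes on the ring (real placement depends on the replication strategy), and both function names are hypothetical. It compares where the load of one failed node lands in each design.

```python
def dht_failover(loads, failed, replicas=3):
    """Move the failed node's load onto the other replicas of its segment.

    Assumption: the replicas - 1 surviving replicas are the next nodes
    clockwise on the ring.
    """
    n = len(loads)
    heirs = [(failed + step) % n for step in range(1, replicas)]
    share = loads[failed] / len(heirs)
    result = list(loads)
    result[failed] = 0.0
    for node in heirs:
        result[node] += share
    return result


def bigtable_failover(loads, failed):
    """Spread the failed node's load evenly over every surviving node."""
    survivors = [i for i in range(len(loads)) if i != failed]
    share = loads[failed] / len(survivors)
    result = list(loads)
    result[failed] = 0.0
    for node in survivors:
        result[node] += share
    return result


# Ten nodes, each carrying 100 units of load; node 4 fails.
before = [100.0] * 10
after_dht = dht_failover(before, failed=4)
after_bigtable = bigtable_failover(before, failed=4)
print("DHT peak load:     ", max(after_dht))
print("Bigtable peak load:", round(max(after_bigtable), 2))
```

With N=3 the two neighbours each take on half of the failed node's load, while in the Bigtable-style layout each of the nine survivors takes on one ninth of it.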

In the future we want to add a load balancing based on
requests/second. We have all the requisite data and architecture, but
other things are more important right now. Pure region count load
balancing tends to work fairly well in practice.

If you are interested, read the original thread, which has more information:

http://web.archiveorange.com/archive/v/gMxNAM9Cdnlhwpo7e6uC