HBase Study Notes (1): Starting from scratch

Source: Internet  Editor: 程序博客网  Time: 2024/06/08 08:52

A word about Java

The vast majority of code used in this book is written in Java. We use pseudo-code
here and there to help teach concepts, but the working code is Java. Java is a practical
reality of using HBase. The entire Hadoop stack, including HBase, is implemented
in Java. The HBase client library is Java. The MapReduce library is Java. An HBase
deployment requires tuning the JVM for optimal performance. But there are means
for interacting with Hadoop and HBase from non-Java and non-JVM languages. We
cover many of these options in chapter 6.

1. Starting from scratch

$ hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.92.0, r1231986, Mon Jan 16 13:16:35 UTC 2012

hbase(main):001:0>

The shell opens a connection to HBase and greets you with a prompt. With the shell
prompt ahead of you, create your first table:

hbase(main):001:0> create 'users', 'info'



Presumably 'users' is the name of the table, but what about this 'info' business?
Just like tables in a relational database, tables in HBase are organized into rows and columns.
HBase treats columns a little differently than a relational database. Columns in
HBase are organized into groups called column families. info is a column family in the
users table. A table in HBase must have at least one column family. Among other
things, column families impact physical characteristics of the data store in HBase. For
this reason, at least one column family must be specified at table creation time. You
can alter column families after the table is created, but doing so is a little tedious.
We’ll discuss column families in more detail later. For now, know that your users table
is as simple as it gets: a single column family with default parameters.
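To make the table / column family / column relationship concrete, here is a small conceptual model in plain Java. It is only a sketch built on nested maps and is not the HBase client API: the class SketchTable, its methods, and the row keys, column names, and values in the demo are all made up for illustration. It captures two points from the text: a table cannot exist without at least one column family, and a column always lives inside a family.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Conceptual model only: NOT the HBase client API. All names are hypothetical.
class SketchTable {
    private final String name;
    // column family -> (row key -> (column name -> value))
    private final Map<String, Map<String, Map<String, String>>> families = new LinkedHashMap<>();

    SketchTable(String name, String... familyNames) {
        // Mirrors the rule that at least one family is given at creation time.
        if (familyNames.length == 0) {
            throw new IllegalArgumentException("a table needs at least one column family");
        }
        this.name = name;
        for (String family : familyNames) {
            families.put(family, new TreeMap<>());
        }
    }

    String name() {
        return name;
    }

    Set<String> familyNames() {
        return families.keySet();
    }

    // Columns are not declared anywhere; they appear when a value is stored.
    void put(String rowKey, String family, String column, String value) {
        Map<String, Map<String, String>> rows = families.get(family);
        if (rows == null) {
            throw new IllegalArgumentException("unknown column family: " + family);
        }
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column, value);
    }

    // Returns null when the family, row, or column does not exist.
    String get(String rowKey, String family, String column) {
        Map<String, Map<String, String>> rows = families.get(family);
        if (rows == null) {
            return null;
        }
        Map<String, String> columns = rows.get(rowKey);
        return columns == null ? null : columns.get(column);
    }
}

class SketchTableDemo {
    public static void main(String[] args) {
        SketchTable users = new SketchTable("users", "info");
        users.put("row1", "info", "name", "Example User");
        users.put("row2", "info", "email", "user@example.com");
        System.out.println(users.familyNames());
        System.out.println(users.get("row1", "info", "name"));
    }
}
```

Grouping the data by family first is a choice made for this sketch; it loosely echoes the remark that column families affect how data is physically stored, but it says nothing about HBase's real storage format.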

2. Examine table schema

If you’re familiar with relational databases, you’ll notice right away that the table creation
didn’t involve any columns or types. Other than the column family name, HBase
doesn’t require you to tell it anything about your data ahead of time. That’s why HBase
is often described as a schema-less database.

You can verify that your users table was created by asking HBase for a listing of all
registered tables:

hbase(main):002:0> list
TABLE
users
1 row(s) in 0.0220 seconds

hbase(main):003:0>


The list command proves the table exists, but HBase can also give you extended
details about your table. You can see all those default parameters using the describe
command:

hbase(main):003:0> describe 'users'
DESCRIPTION                                             ENABLED
 {NAME => 'users', FAMILIES => [{NAME => 'info',        true
 BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0',
 COMPRESSION => 'NONE', VERSIONS => '3',
 TTL => '2147483647', BLOCKSIZE => '65536',
 IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}
1 row(s) in 0.0330 seconds

hbase(main):004:0>

The shell describes your table as a map with two properties: the table name and a list
of column families. Each column family has a number of associated configuration
parameters.

3. Establish a connection

The shell is well and good, but who wants to implement TwitBase in shell commands?
Those wise HBase developers thought of this and equipped HBase with a complete
Java client library. A similar API is exposed to other languages too; we’ll cover those in
chapter 6. For now, we’ll stick with Java. The Java code for opening a connection to
the users table looks like this:

HTableInterface usersTable = new HTable("users");

The HTable constructor reads the default configuration information to locate HBase,
similar to the way the shell did. It then locates the users table you created earlier and
gives you a handle to it.

You can also pass a custom configuration object to the HTable object:

Configuration myConf = HBaseConfiguration.create();
HTableInterface usersTable = new HTable(myConf, "users");

This is equivalent to letting the HTable object create the configuration object on its
own. To customize the configuration, you can define parameters like this:

myConf.set("parameter_name", "parameter_value");

HBase client configuration

HBase client applications need to have only one configuration piece available to them
to access HBase: the ZooKeeper quorum address. You can manually set this configuration
like this:

myConf.set("hbase.zookeeper.quorum", "serverip");

Both ZooKeeper and the exact interaction between client and the HBase cluster are
covered in the next chapter, where we go into details of HBase as a distributed store.
For now, all you need to know is that the configuration parameters can be picked up either
by the Java client from the hbase-site.xml file on its classpath or by you setting
the configuration explicitly in the connection. When you leave the configuration completely
unspecified, as you do in this sample code, the default configuration is read
and localhost is used for the ZooKeeper quorum address. When working in local
mode, as you are here, that’s exactly what you want.
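The lookup order described above (built-in default, then hbase-site.xml, then values set explicitly in code) can be sketched with the layered defaults of java.util.Properties. This is a model of the idea only, not the real Configuration class: SketchClientConfig is a hypothetical name, the site file is represented by a plain Properties object rather than parsed XML, and the precedence of explicit values over file values is stated here as an assumption of the sketch.

```java
import java.util.Properties;

// Model of layered settings; NOT the real Hadoop/HBase Configuration class.
class SketchClientConfig {
    static final String QUORUM_KEY = "hbase.zookeeper.quorum";

    private final Properties explicit;

    // siteFileValues stands in for whatever hbase-site.xml would provide.
    SketchClientConfig(Properties siteFileValues) {
        // Lowest layer: the built-in default mentioned in the text.
        Properties defaults = new Properties();
        defaults.setProperty(QUORUM_KEY, "localhost");

        // Middle layer: values from the site file override the defaults.
        Properties siteFile = new Properties(defaults);
        siteFile.putAll(siteFileValues);

        // Top layer: values set in code override everything below.
        explicit = new Properties(siteFile);
    }

    void set(String key, String value) {
        explicit.setProperty(key, value);
    }

    // Falls through explicit -> site file -> defaults; null if unset everywhere.
    String get(String key) {
        return explicit.getProperty(key);
    }
}

class SketchClientConfigDemo {
    public static void main(String[] args) {
        // Nothing in the site file and nothing set in code: the default applies.
        SketchClientConfig conf = new SketchClientConfig(new Properties());
        System.out.println(conf.get(SketchClientConfig.QUORUM_KEY));

        // An explicit value wins over the default.
        conf.set(SketchClientConfig.QUORUM_KEY, "serverip");
        System.out.println(conf.get(SketchClientConfig.QUORUM_KEY));
    }
}
```

Chaining Properties objects keeps each source of settings separate, so it stays easy to see which layer a value came from.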


Connection management

Creating a table instance is a relatively expensive operation, requiring a bit of network
overhead. Rather than create a new table handle on demand, it’s better to use a
connection pool. Connections are allocated from and returned to the pool. Using an
HTablePool is more common in practice than instantiating HTables directly:

HTablePool pool = new HTablePool();
HTableInterface usersTable = pool.getTable("users");
... // work with the table
usersTable.close();

Closing the table when you’re finished with it allows the underlying connection
resources to be returned to the pool.
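The allocate-and-return cycle can be illustrated with a tiny pool written in plain Java. This is a sketch of the general pooling idea, not how HTablePool is implemented: SketchPool and SketchHandle are hypothetical classes, and the handle does no real work. The point it shows is that close() hands the handle back instead of destroying it, so the next getTable call for the same table can reuse it.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an expensive-to-create table handle.
class SketchHandle implements AutoCloseable {
    final String tableName;
    private final SketchPool owner;
    boolean inUse = true;

    SketchHandle(String tableName, SketchPool owner) {
        this.tableName = tableName;
        this.owner = owner;
    }

    // Closing does not destroy the handle; it goes back to the pool.
    @Override
    public void close() {
        owner.giveBack(this);
    }
}

// Minimal pool: one stack of idle handles per table name.
class SketchPool {
    private final Map<String, Deque<SketchHandle>> idle = new HashMap<>();
    private int created = 0;

    synchronized SketchHandle getTable(String tableName) {
        Deque<SketchHandle> handles = idle.get(tableName);
        if (handles != null && !handles.isEmpty()) {
            SketchHandle reused = handles.pop();
            reused.inUse = true;
            return reused;
        }
        created++; // the "expensive" path: build a new handle
        return new SketchHandle(tableName, this);
    }

    synchronized void giveBack(SketchHandle handle) {
        if (!handle.inUse) {
            return; // already returned; ignore a second close()
        }
        handle.inUse = false;
        idle.computeIfAbsent(handle.tableName, k -> new ArrayDeque<>()).push(handle);
    }

    synchronized int createdCount() {
        return created;
    }
}

class SketchPoolDemo {
    public static void main(String[] args) {
        SketchPool pool = new SketchPool();
        try (SketchHandle users = pool.getTable("users")) {
            // ... work with the table ...
        }
        try (SketchHandle users = pool.getTable("users")) {
            // ... the handle returned above is reused here ...
        }
        System.out.println("handles created: " + pool.createdCount()); // prints "handles created: 1"
    }
}
```

Using try-with-resources guarantees the handle is returned even if the work in between throws, which is the main reason the sketch makes the handle AutoCloseable.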
What good is a table without data in it? No good at all. Let’s store some data.




