Installing and configuring Thrift, and testing a Python connection to HBase through Thrift


Prerequisites and test environment:

1) Hadoop cluster [already set up]
Version: hadoop-0.20.2
Install path: /usr/local/hadoop-0.20.2
NameNode: 192.168.85.128 h1
DataNodes: 192.168.85.130 h2
           192.168.85.131 h3

2) HBase
Version: 0.90.5
Install path: /usr/local/hbase-0.90.5
192.168.85.128 h1 (Master)
192.168.85.130 h2
192.168.85.131 h3

3) Python
The system ships with 2.4.3; these tests use 2.7.3
Package: Python-2.7.3.tar.bz2
Download: http://www.python.org/ftp/python/2.7.3/Python-2.7.3.tar.bz2

4) Thrift
Version: 0.9.0
Package: thrift-0.9.0.tar.gz
Download: https://dist.apache.org/repos/di … thrift-0.9.0.tar.gz

I. Install Python
1. Build and install
[root@h1 ~]# wget http://www.python.org/ftp/python/2.7.3/Python-2.7.3.tar.bz2
[root@h1 ~]# tar jxvf Python-2.7.3.tar.bz2
[root@h1 ~]# cd Python-2.7.3
[root@h1 Python-2.7.3]# ./configure --prefix=/usr/local/python-2.7.3   # install into /usr/local/python-2.7.3
[root@h1 Python-2.7.3]# make
[root@h1 Python-2.7.3]# make install

2. Environment variables
To avoid disturbing the system Python, user grid's environment is pointed at Python 2.7.3:
[root@h1 Python-2.7.3]# su - grid
Append to ~/.bash_profile:
PATH=$HOME/bin:/usr/local/pig-0.9.2/bin:/usr/local/hadoop-0.20.2/bin:/usr/local/thrift-0.9.0/bin:/usr/local/python-2.7.3/bin:$PATH
export PATH
[grid@h1 ~]$ source .bash_profile
[grid@h1 ~]$ thrift -version
Thrift version 0.9.0
[grid@h1 ~]$ which python
/usr/local/python-2.7.3/bin/python

II. Install Thrift
1. Build and install
[root@h1 ~]# wget https://dist.apache.org/repos/di … thrift-0.9.0.tar.gz
[root@h1 ~]# tar zxvf thrift-0.9.0.tar.gz
[root@h1 ~]# cd thrift-0.9.0
[root@h1 thrift-0.9.0]# ./configure --prefix=/usr/local/thrift-0.9.0
[root@h1 thrift-0.9.0]# make
[root@h1 thrift-0.9.0]# make install

2. Generate the hbase module
[root@h1 thrift-0.9.0]# su - grid
[grid@h1 ~]$ thrift --gen py /usr/local/hbase-0.90.5/src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift
[grid@h1 ~]$ tree gen-py/
gen-py/
|-- __init__.py
`-- hbase
    |-- Hbase-remote
    |-- Hbase.py
    |-- __init__.py
    |-- constants.py
    `-- ttypes.py

1 directory, 6 files

3. Copy the generated hbase module into Python's site-packages directory
[root@h1 grid]# cp -r gen-py/hbase/ /usr/local/python-2.7.3/lib/python2.7/site-packages/   # requires root
[root@h1 grid]# ls /usr/local/python-2.7.3/lib/python2.7/site-packages/
hbase README
[root@h1 grid]# ls /usr/local/python-2.7.3/lib/python2.7/site-packages/hbase/
constants.py Hbase.py Hbase-remote __init__.py ttypes.py

4. Install the thrift module with easy_install
[root@h1 src]# wget http://pypi.python.org/packages/ … 613b509fb44feefe74e
[root@h1 src]# tar zxvf setuptools-0.6c11.tar.gz
[root@h1 src]# cd setuptools-0.6c11
[root@h1 setuptools-0.6c11]# /usr/local/python-2.7.3/bin/python setup.py install
[root@h1 setuptools-0.6c11]# /usr/local/python-2.7.3/bin/easy_install-2.7 thrift
[root@h1 setuptools-0.6c11]# ls /usr/local/python-2.7.3/lib/python2.7/site-packages/
easy-install.pth hbase README setuptools-0.6c11-py2.7.egg setuptools.pth thrift-0.9.0-py2.7-linux-i686.egg
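To confirm that the copied hbase package and the easy_install'd thrift module are both on the import path, a small sanity check can report anything missing. This is a sketch added here for convenience; the module names match the installation above:

```python
# Sanity check: report which of the required modules cannot be imported.
import importlib

def missing_modules(names):
    """Return the subset of module names that fail to import."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing

# After the steps above, this should print an empty list.
print(missing_modules(['thrift', 'hbase.Hbase', 'hbase.ttypes']))
```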

III. Testing a Python connection to HBase through Thrift
1. Start the Thrift server [Hadoop and HBase must already be running]
[grid@h1 ~]$ /usr/local/hbase-0.90.5/bin/hbase thrift -p 9090 start
12/12/16 14:47:21 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.3.2-1031432, built on 11/05/2010 05:32 GMT
12/12/16 14:47:21 INFO zookeeper.ZooKeeper: Client environment:host.name=h1
12/12/16 14:47:21 INFO zookeeper.ZooKeeper: Client environment:java.version=1.7.0_07
12/12/16 14:47:21 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
12/12/16 14:47:21 INFO zookeeper.ZooKeeper: Client environment:java.home=/usr/local/jdk1.7.0_07/jre
12/12/16 14:47:21 INFO zookeeper.ZooKeeper: Client environment:java.class.path=/usr/local/hbase-0.90.5//conf:/usr/local/jdk1.7.0_07//lib/tools.jar:/usr/local/hbase-0.90.5/:/usr/local/hbase-0.90.5//hbase-0.90.5.jar:/usr/local/hbase-0.90.5//hbase-0.90.5-tests.jar:/usr/local/hbase-0.90.5//lib/activation-1.1.jar:/usr/local/hbase-0.90.5//lib/asm-3.1.jar:/usr/local/hbase-0.90.5//lib/avro-1.3.3.jar:/usr/local/hbase-0.90.5//lib/commons-cli-1.2.jar:/usr/local/hbase-0.90.5//lib/commons-codec-1.4.jar:/usr/local/hbase-0.90.5//lib/commons-el-1.0.jar:/usr/local/hbase-0.90.5//lib/commons-httpclient-3.1.jar:/usr/local/hbase-0.90.5//lib/commons-lang-2.5.jar:/usr/local/hbase-0.90.5//lib/commons-logging-1.1.1.jar:/usr/local/hbase-0.90.5//lib/commons-net-1.4.1.jar:/usr/local/hbase-0.90.5//lib/core-3.1.1.jar:/usr/local/hbase-0.90.5//lib/guava-r06.jar:/usr/local/hbase-0.90.5//lib/hadoop-0.20.2-core.jar:/usr/local/hbase-0.90.5//lib/jackson-core-asl-1.5.5.jar:/usr/local/hbase-0.90.5//lib/jackson-jaxrs-1.5.5.jar:/usr/local/hbase-0.90.5//lib/jackson-mapper-asl-1.4.2.jar:/usr/local/hbase-0.90.5//lib/jackson-xc-1.5.5.jar:/usr/local/hbase-0.90.5//lib/jasper-compiler-5.5.23.jar:/usr/local/hbase-0.90.5//lib/jasper-runtime-5.5.23.jar:/usr/local/hbase-0.90.5//lib/jaxb-api-2.1.jar:/usr/local/hbase-0.90.5//lib/jaxb-impl-2.1.12.jar:/usr/local/hbase-0.90.5//lib/jersey-core-1.4.jar:/usr/local/hbase-0.90.5//lib/jersey-json-1.4.jar:/usr/local/hbase-0.90.5//lib/jersey-server-1.4.jar:/usr/local/hbase-0.90.5//lib/jettison-1.1.jar:/usr/local/hbase-0.90.5//lib/jetty-6.1.26.jar:/usr/local/hbase-0.90.5//lib/jetty-util-6.1.26.jar:/usr/local/hbase-0.90.5//lib/jruby-complete-1.6.0.jar:/usr/local/hbase-0.90.5//lib/jsp-2.1-6.1.14.jar:/usr/local/hbase-0.90.5//lib/jsp-api-2.1-6.1.14.jar:/usr/local/hbase-0.90.5//lib/jsr311-api-1.1.1.jar:/usr/local/hbase-0.90.5//lib/log4j-1.2.16.jar:/usr/local/hbase-0.90.5//lib/protobuf-java-2.3.0.jar:/usr/local/hbase-0.90.5//lib/servlet-api-2.5-6.1.14.jar:/usr/local/hbase-0.90.5//lib/sl
f4j-api-1.5.8.jar:/usr/local/hbase-0.90.5//lib/slf4j-log4j12-1.5.8.jar:/usr/local/hbase-0.90.5//lib/stax-api-1.0.1.jar:/usr/local/hbase-0.90.5//lib/thrift-0.2.0.jar:/usr/local/hbase-0.90.5//lib/xmlenc-0.52.jar:/usr/local/hbase-0.90.5//lib/zookeeper-3.3.2.jar:/usr/local/hbase-0.90.5/conf
12/12/16 14:47:21 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/usr/java/packages/lib/i386:/lib:/usr/lib
12/12/16 14:47:21 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
12/12/16 14:47:21 INFO zookeeper.ZooKeeper: Client environment:java.compiler=
12/12/16 14:47:21 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
12/12/16 14:47:21 INFO zookeeper.ZooKeeper: Client environment:os.arch=i386
12/12/16 14:47:21 INFO zookeeper.ZooKeeper: Client environment:os.version=2.6.18-92.el5
12/12/16 14:47:21 INFO zookeeper.ZooKeeper: Client environment:user.name=grid
12/12/16 14:47:21 INFO zookeeper.ZooKeeper: Client environment:user.home=/home/grid
12/12/16 14:47:21 INFO zookeeper.ZooKeeper: Client environment:user.dir=/home/grid
12/12/16 14:47:21 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=h3:2181,h2:2181,h1:2181 sessionTimeout=180000 watcher=hconnection
12/12/16 14:47:21 INFO zookeeper.ClientCnxn: Opening socket connection to server h2/192.168.85.130:2181
12/12/16 14:47:21 INFO zookeeper.ClientCnxn: Socket connection established to h2/192.168.85.130:2181, initiating session
12/12/16 14:47:21 INFO zookeeper.ClientCnxn: Session establishment complete on server h2/192.168.85.130:2181, sessionid = 0x13ba2472f650003, negotiated timeout = 180000
12/12/16 14:47:22 INFO ThriftServer: starting HBase ThreadPool Thrift server on /0.0.0.0:9090
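Before running the client it can help to confirm the Thrift port is actually reachable. The following check (a sketch, using the host and port from this setup) just attempts a plain TCP connection:

```python
# Plain TCP reachability check for the Thrift server port.
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect((host, port))
        return True
    except socket.error:
        return False
    finally:
        s.close()

# With the Thrift server above running, this should print True.
print(port_open('192.168.85.128', 9090))
```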

2. Create a test table hbase_thrift in HBase
[grid@h1 ~]$ /usr/local/hbase-0.90.5/bin/hbase shell
HBase Shell; enter 'help' for list of supported commands.
Type "exit" to leave the HBase Shell
Version 0.90.5, r1212209, Fri Dec 9 05:40:36 UTC 2011

hbase(main):001:0> create 'hbase_thrift', 'key1', 'key2', 'key3'
0 row(s) in 3.9830 seconds

hbase(main):002:0> list
TABLE
hbase_thrift
1 row(s) in 0.0510 seconds

hbase(main):004:0> describe 'hbase_thrift'
DESCRIPTION                                                                      ENABLED
{NAME => 'hbase_thrift', FAMILIES => [{NAME => 'key1', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'key2', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'key3', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}  true
1 row(s) in 0.0790 seconds

3. Python test script

#!/usr/local/python-2.7.3/bin/python

import sys

# The Python files generated from Hbase.thrift live here
sys.path.append('/usr/local/python-2.7.3/lib/python2.7/site-packages/hbase')

from thrift import Thrift
from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol
from hbase import Hbase
# ColumnDescriptor and the like are defined in hbase.ttypes
from hbase.ttypes import *

# Make socket; change the address and port here as needed
transport = TSocket.TSocket('192.168.85.128', 9090)

# Buffering is critical: raw sockets are very slow.
# TFramedTransport is another efficient transport option.
transport = TTransport.TBufferedTransport(transport)

# Wrap in a protocol. Protocol and transport are decoupled,
# so multiple protocols are supported.
protocol = TBinaryProtocol.TBinaryProtocol(transport)

# A client represents one user
client = Hbase.Client(protocol)

# Open the connection
transport.open()

# Fetch and print the table names
print(client.getTableNames())

# Create a test table: the user-info table 'users'
colusername = ColumnDescriptor(name='username:', maxVersions=1)
colpass = ColumnDescriptor(name='pass:', maxVersions=1)
colage = ColumnDescriptor(name='age:', maxVersions=1)
colinfo = ColumnDescriptor(name='info:', maxVersions=1)

try:
    client.createTable('users', [colusername, colpass, colage, colinfo])
except AlreadyExists as e:
    print(e.message)

print(client.getTableNames())
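As a follow-up sketch (not part of the original walkthrough), the same client can write and read a row through mutateRow/getRow. Mutation comes from the generated hbase.ttypes; the namedtuple fallback below is only a stand-in so the helper can be tried without the generated module, and the commented calls assume the live client and 'users' table created above:

```python
# Sketch: building the Mutation list for a Thrift-side row write.
try:
    from hbase.ttypes import Mutation  # generated by `thrift --gen py`
except ImportError:
    # Stand-in with the same two fields, so the helper below is usable
    # for illustration without the generated code.
    from collections import namedtuple
    Mutation = namedtuple('Mutation', ['column', 'value'])

def to_mutations(row_data):
    """Build a list of Mutation objects from a plain column -> value dict."""
    return [Mutation(column=col, value=val)
            for col, val in sorted(row_data.items())]

muts = to_mutations({'username:': 'alice', 'pass:': 'secret', 'age:': '30'})

# With the live client and 'users' table from the script above:
#   client.mutateRow('users', 'row1', muts)
#   result = client.getRow('users', 'row1')
#   result[0].columns['username:'].value  # the stored cell value
```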

4. Run the script
[grid@h1 ~]$ python test.py
['hbase_thrift']
['hbase_thrift', 'users']

Output like the above means the script executed successfully.

5. Check in the HBase shell that the users table was created
hbase(main):001:0> list
TABLE
hbase_thrift
users
2 row(s) in 2.4330 seconds

hbase(main):002:0> describe 'users'
DESCRIPTION                                                                      ENABLED
{NAME => 'users', FAMILIES => [{NAME => 'age', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '1', TTL => '-1', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'false'}, {NAME => 'info', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '1', TTL => '-1', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'false'}, {NAME => 'pass', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '1', TTL => '-1', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'false'}, {NAME => 'username', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '1', TTL => '-1', BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'false'}]}  true
1 row(s) in 0.0630 seconds

The users table was created from Python through Thrift.

In summary, connecting to HBase from Python via Thrift tested successfully.


The following error came up during the experiment:
[grid@h1 lib]$ /usr/local/hbase-0.90.5/bin/start-hbase.sh
h2: starting zookeeper, logging to /usr/local/hbase-0.90.5/bin/../logs/hbase-grid-zookeeper-h2.out
h3: starting zookeeper, logging to /usr/local/hbase-0.90.5/bin/../logs/hbase-grid-zookeeper-h3.out
h1: starting zookeeper, logging to /usr/local/hbase-0.90.5//logs/hbase-grid-zookeeper-h1.out
starting master, logging to /usr/local/hbase-0.90.5//logs/hbase-grid-master-h1.out
h2: regionserver running as process 14833. Stop it first.
h3: regionserver running as process 14511. Stop it first.
[grid@h1 lib]$ tail -f /usr/local/hbase-0.90.5/logs/hbase-grid-master-h1.log
2012-12-15 20:59:58,276 INFO org.apache.hadoop.hbase.metrics: new MBeanInfo
2012-12-15 20:59:58,277 INFO org.apache.hadoop.hbase.metrics: new MBeanInfo
2012-12-15 20:59:58,277 INFO org.apache.hadoop.hbase.master.metrics.MasterMetrics: Initialized
2012-12-15 20:59:58,326 INFO org.apache.hadoop.hbase.master.ActiveMasterManager: Master=h1:60000
2012-12-15 20:59:58,926 INFO org.apache.hadoop.hdfs.DFSClient: No node available for block: blk_-3179415130917630864_1002 file=/hbase/hbase.version
2012-12-15 20:59:58,926 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_-3179415130917630864_1002 from any node: java.io.IOException: No live nodes contain current block
2012-12-15 21:00:01,935 INFO org.apache.hadoop.hdfs.DFSClient: No node available for block: blk_-3179415130917630864_1002 file=/hbase/hbase.version
2012-12-15 21:00:01,935 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_-3179415130917630864_1002 from any node: java.io.IOException: No live nodes contain current block
2012-12-15 21:00:04,939 INFO org.apache.hadoop.hdfs.DFSClient: No node available for block: blk_-3179415130917630864_1002 file=/hbase/hbase.version
2012-12-15 21:00:04,939 INFO org.apache.hadoop.hdfs.DFSClient: Could not obtain block blk_-3179415130917630864_1002 from any node: java.io.IOException: No live nodes contain current block
2012-12-15 21:00:07,946 WARN org.apache.hadoop.hdfs.DFSClient: DFS Read: java.io.IOException: Could not obtain block: blk_-3179415130917630864_1002 file=/hbase/hbase.version
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1812)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1638)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1767)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1695)
at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:337)
at java.io.DataInputStream.readUTF(DataInputStream.java:589)
at org.apache.hadoop.hbase.util.FSUtils.getVersion(FSUtils.java:175)
at org.apache.hadoop.hbase.util.FSUtils.checkVersion(FSUtils.java:211)
at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:251)
at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:106)
at org.apache.hadoop.hbase.master.MasterFileSystem.&lt;init&gt;(MasterFileSystem.java:91)
at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:342)
at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:279)

2012-12-15 21:00:07,947 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
java.io.IOException: Could not obtain block: blk_-3179415130917630864_1002 file=/hbase/hbase.version
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1812)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1638)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1767)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1695)
at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:337)
at java.io.DataInputStream.readUTF(DataInputStream.java:589)
at org.apache.hadoop.hbase.util.FSUtils.getVersion(FSUtils.java:175)
at org.apache.hadoop.hbase.util.FSUtils.checkVersion(FSUtils.java:211)
at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:251)
at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:106)
at org.apache.hadoop.hbase.master.MasterFileSystem.&lt;init&gt;(MasterFileSystem.java:91)
at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:342)
at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:279)

Cause: a Hadoop-side problem; the block data on the DataNodes was bad, so HDFS could not serve /hbase/hbase.version.
Fix: delete the contents of the data directory on each DataNode, reformat the NameNode, then start the cluster again. [Use with extreme caution in production: this destroys all data.]
