Toward Cloud Computing: HBase Shell Commands and the Java API in Detail
Source: Internet | Editor: 程序博客网 | 2024/06/05 10:14
I. Starting HBase
As the previous post explained, HBase is built on top of Hadoop HDFS, so make sure Hadoop is running before you start HBase. Hadoop can be started with `start-all.sh`; in Hadoop 2.x, the recommended approach is to run `start-dfs.sh` and `start-yarn.sh` instead of `start-all.sh`. Once the Hadoop cluster is up, start HBase with `start-hbase.sh`.
My setup here is fully distributed mode: one machine acts as the NameNode and HMaster, and the other two act as DataNodes and HRegionServers. After Hadoop and HBase have started, running `jps` on each of the three machines shows the running daemons (screenshots of the three machines omitted).
You can see an overview of the HBase cluster at http://192.168.2.120:60010/master-status.
A problem you may run into
Sometimes, after starting Hadoop and HBase, the Hadoop cluster's web UI loads fine but the HBase web page does not. Running `jps` again then shows that the HMaster process, HBase's main daemon, has exited on its own. Check the HBase logs to pinpoint the cause. In my case, the system clocks of the three virtual machines were out of sync, and HBase is sensitive to clock skew between nodes. Syncing the time on all three machines with `ntpdate time.nist.gov`, then running `stop-hbase.sh` followed by `start-hbase.sh`, fixed it.
II. HBase Shell Operations
1. Entering the HBase shell
Enter the shell by running the command `hbase shell`.
Note first that the Backspace key alone does not work in the HBase shell; to delete what you have typed, hold Ctrl while pressing Backspace.
2. Listing all tables
Use the `list` command.
Here a table named "users" already exists, which I created earlier; on a fresh HBase installation, `list` returns an empty result.
3. Deleting a table
Deleting a table in HBase takes two steps: first disable it, then drop it.
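For the users table shown above, that two-step sequence in the shell would look like the following (don't run it yet if you want to follow the rest of the examples):

```
disable 'users'
drop 'users'
```

If you try to `drop` a table that is still enabled, the shell refuses and reports that the table must be disabled first.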
4. Creating a table
The following creates a table users with three column families: user_id, address, and info.

```
create 'users','user_id','address','info'
```
5. Describing a table
Use the command `describe 'users'`.
6. Create, read, update, delete
- Adding a record: put

```
put 'users','xiaoming','info:age','24'
```

This command writes the value 24 to row xiaoming, column info:age, of table users. The following commands work the same way:

```
put 'users','xiaoming','info:birthday','1987-06-17'
put 'users','xiaoming','info:company','alibaba'
put 'users','xiaoming','address:country','china'
put 'users','xiaoming','address:province','zhejiang'
put 'users','xiaoming','address:city','hangzhou'
```
- Scanning all records in users: scan
Use the command `scan 'users'`.
- Reading a record: get
  1. All data for one row key: `get 'users','xiaoming'`
  2. All data in one column family of a row: `get 'users','xiaoming','info'`
  3. One column of one column family of a row: `get 'users','xiaoming','info:age'`
- Updating a record: put
To set xiaoming's age in users to 29: `put 'users','xiaoming','info:age','29'`
- Deleting records: delete and deleteall
  1. Delete the info:age cell of row xiaoming: `delete 'users','xiaoming','info:age'`
  2. Delete the entire row xiaoming: `deleteall 'users','xiaoming'`
- A few other useful commands
count: count the rows in a table, e.g. `count 'users'`; at this point the users table holds only the single row xiaoming.
truncate: empty a table, e.g. `truncate 'users'`; internally this drops the table and then recreates it with the same schema.
III. The HBase Java API in Detail
1. How the Java API maps to the HBase data model
(The comparison diagram from the original post is not reproduced here.)
2. HBaseConfiguration
This class configures HBase.
Example:

```java
Configuration conf = HBaseConfiguration.create();
```

create() builds a Configuration from HBase's default resources; by default it loads hbase-site.xml from the classpath to initialize the Configuration.
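A minimal hbase-site.xml on the classpath might look like the sketch below. The hostnames and the HDFS path are placeholders taken from this post's cluster; substitute your own values.

```xml
<configuration>
  <!-- Where the client finds the ZooKeeper quorum -->
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>sparkproject1,sparkproject2,sparkproject3</value>
  </property>
  <!-- Root directory of HBase data in HDFS -->
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://sparkproject1:9000/hbase</value>
  </property>
</configuration>
```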
3. HBaseAdmin
This class provides the interface for managing HBase table metadata, such as creating, enabling, disabling, and deleting tables.
Example:

```java
HBaseAdmin admin = new HBaseAdmin(conf);
admin.disableTable(tableName);
admin.deleteTable(tableName);
```
4. HTableDescriptor
This class holds a table's name and its column families.
Example:

```java
HTableDescriptor tableDesc = new HTableDescriptor(tableName);
for (int i = 0; i < familys.length; i++) {
    tableDesc.addFamily(new HColumnDescriptor(familys[i]));
}
```
5. HColumnDescriptor
This class maintains column-family settings such as the number of versions and compression, and is typically used when creating a table or adding a column family to one.
Example (the same snippet as above, seen from the column-family side):

```java
HTableDescriptor tableDesc = new HTableDescriptor(tableName);
for (int i = 0; i < familys.length; i++) {
    tableDesc.addFamily(new HColumnDescriptor(familys[i]));
}
```
6. HTable
This class communicates with an HBase table. It is not thread-safe for updates; in multithreaded code, use the HTablePool class instead.
Example:

```java
HTable table = new HTable(conf, tableName);
Scan s = new Scan();
ResultScanner ss = table.getScanner(s);
```
7. Put
This class adds or updates data in a single row.
Example:

```java
HTable table = new HTable(conf, tableName);
Put put = new Put(Bytes.toBytes(rowKey));
put.add(Bytes.toBytes(family), Bytes.toBytes(qualifier), Bytes.toBytes(value));
table.put(put);
```
8. Get
This class retrieves the data of a single row.
Example:

```java
HTable table = new HTable(conf, tableName);
Get get = new Get(rowKey.getBytes());
Result rs = table.get(get);
```
9. Result
This class holds the single-row result of a Get or Scan operation.
Example:

```java
HTable table = new HTable(conf, tableName);
Get get = new Get(rowKey.getBytes());
Result rs = table.get(get);
for (KeyValue kv : rs.raw()) {
    System.out.print(new String(kv.getRow()) + " ");
    System.out.print(new String(kv.getFamily()) + ":");
    System.out.print(new String(kv.getQualifier()) + " ");
    System.out.print(kv.getTimestamp() + " ");
    System.out.println(new String(kv.getValue()));
}
```
10. ResultScanner
This class is the client-side interface for iterating over scan results.
Example:

```java
HTable table = new HTable(conf, tableName);
Scan s = new Scan();
ResultScanner ss = table.getScanner(s);
for (Result r : ss) {
    for (KeyValue kv : r.raw()) {
        System.out.print(new String(kv.getRow()) + " ");
        System.out.print(new String(kv.getFamily()) + ":");
        System.out.print(new String(kv.getQualifier()) + " ");
        System.out.print(kv.getTimestamp() + " ");
        System.out.println(new String(kv.getValue()));
    }
}
```
IV. HBase Java API Examples
1. Importing the required jars
Running an HBase program from Eclipse requires:
- all of the Hadoop jars
- a subset of the HBase jars
The HBase jar set must be exact: extra jars cause conflicts, and missing jars produce class-not-found errors. (The original post's screenshot listing these jars is not reproduced here.)
2. Loading the configuration

```java
// Load the configuration
static {
    conf = HBaseConfiguration.create();
    // Point the client at the ZooKeeper quorum
    conf.set("hbase.zookeeper.quorum", "sparkproject1,sparkproject2,sparkproject3");
    System.out.println(conf.get("hbase.zookeeper.quorum"));
}
```

sparkproject1, sparkproject2, and sparkproject3 are the hostnames of the Hadoop cluster nodes; remember to map these three hostnames to their IP addresses in your local hosts file.
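Concretely, the hosts file (/etc/hosts on Linux) would gain entries like the following. Only 192.168.2.120 appears earlier in this post; the other two addresses are examples, so substitute the real IPs of your machines:

```
192.168.2.120  sparkproject1
192.168.2.121  sparkproject2
192.168.2.122  sparkproject3
```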
3. Creating and deleting tables

```java
/**
 * Create a table
 */
public static void creatTable(String tableName, String[] familys) throws Exception {
    HBaseAdmin admin = new HBaseAdmin(conf);
    if (admin.tableExists(tableName)) {
        System.out.println("table already exists!");
    } else {
        HTableDescriptor tableDesc = new HTableDescriptor(tableName);
        for (int i = 0; i < familys.length; i++) {
            tableDesc.addFamily(new HColumnDescriptor(familys[i]));
        }
        admin.createTable(tableDesc);
        System.out.println("create table " + tableName + " ok.");
    }
}

/**
 * Delete a table
 */
public static void deleteTable(String tableName) throws Exception {
    try {
        HBaseAdmin admin = new HBaseAdmin(conf);
        admin.disableTable(tableName);
        admin.deleteTable(tableName);
        System.out.println("delete table " + tableName + " ok.");
    } catch (MasterNotRunningException e) {
        e.printStackTrace();
    } catch (ZooKeeperConnectionException e) {
        e.printStackTrace();
    }
}
```
4. Creating, reading, updating, and deleting table data

```java
/**
 * Insert one record
 */
public static void addRecord(String tableName, String rowKey, String family, String qualifier, String value)
        throws Exception {
    try {
        HTable table = new HTable(conf, tableName);
        Put put = new Put(Bytes.toBytes(rowKey));
        put.add(Bytes.toBytes(family), Bytes.toBytes(qualifier), Bytes.toBytes(value));
        table.put(put);
        System.out.println("insert record " + rowKey + " to table " + tableName + " ok.");
    } catch (IOException e) {
        e.printStackTrace();
    }
}

/**
 * Delete one row
 */
public static void delRecord(String tableName, String rowKey) throws IOException {
    HTable table = new HTable(conf, tableName);
    List<Delete> list = new ArrayList<Delete>();
    Delete del = new Delete(rowKey.getBytes());
    list.add(del);
    table.delete(list);
    System.out.println("del record " + rowKey + " ok.");
}

/**
 * Look up one row
 */
public static void getOneRecord(String tableName, String rowKey) throws IOException {
    HTable table = new HTable(conf, tableName);
    Get get = new Get(rowKey.getBytes());
    Result rs = table.get(get);
    for (KeyValue kv : rs.raw()) {
        System.out.print(new String(kv.getRow()) + " ");
        System.out.print(new String(kv.getFamily()) + ":");
        System.out.print(new String(kv.getQualifier()) + " ");
        System.out.print(kv.getTimestamp() + " ");
        System.out.println(new String(kv.getValue()));
    }
}

/**
 * Print all rows
 */
public static void getAllRecord(String tableName) {
    try {
        HTable table = new HTable(conf, tableName);
        Scan s = new Scan();
        ResultScanner ss = table.getScanner(s);
        for (Result r : ss) {
            for (KeyValue kv : r.raw()) {
                System.out.print(new String(kv.getRow()) + " ");
                System.out.print(new String(kv.getFamily()) + ":");
                System.out.print(new String(kv.getQualifier()) + " ");
                System.out.print(kv.getTimestamp() + " ");
                System.out.println(new String(kv.getValue()));
            }
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}
```
5. Putting it all together
The full source:

```java
package com.kang.hbase;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.MasterNotRunningException;
import org.apache.hadoop.hbase.ZooKeeperConnectionException;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseTest {
    private static final String TABLE_NAME = "demo_table";
    public static Configuration conf = null;
    public HTable table = null;
    public HBaseAdmin admin = null;

    // Load the configuration
    static {
        conf = HBaseConfiguration.create();
        // Point the client at the ZooKeeper quorum
        conf.set("hbase.zookeeper.quorum", "sparkproject1,sparkproject2,sparkproject3");
        System.out.println(conf.get("hbase.zookeeper.quorum"));
    }

    /** Create a table */
    public static void creatTable(String tableName, String[] familys) throws Exception {
        HBaseAdmin admin = new HBaseAdmin(conf);
        if (admin.tableExists(tableName)) {
            System.out.println("table already exists!");
        } else {
            HTableDescriptor tableDesc = new HTableDescriptor(tableName);
            for (int i = 0; i < familys.length; i++) {
                tableDesc.addFamily(new HColumnDescriptor(familys[i]));
            }
            admin.createTable(tableDesc);
            System.out.println("create table " + tableName + " ok.");
        }
    }

    /** Delete a table */
    public static void deleteTable(String tableName) throws Exception {
        try {
            HBaseAdmin admin = new HBaseAdmin(conf);
            admin.disableTable(tableName);
            admin.deleteTable(tableName);
            System.out.println("delete table " + tableName + " ok.");
        } catch (MasterNotRunningException e) {
            e.printStackTrace();
        } catch (ZooKeeperConnectionException e) {
            e.printStackTrace();
        }
    }

    /** Insert one record */
    public static void addRecord(String tableName, String rowKey, String family, String qualifier, String value)
            throws Exception {
        try {
            HTable table = new HTable(conf, tableName);
            Put put = new Put(Bytes.toBytes(rowKey));
            put.add(Bytes.toBytes(family), Bytes.toBytes(qualifier), Bytes.toBytes(value));
            table.put(put);
            System.out.println("insert record " + rowKey + " to table " + tableName + " ok.");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    /** Delete one row */
    public static void delRecord(String tableName, String rowKey) throws IOException {
        HTable table = new HTable(conf, tableName);
        List<Delete> list = new ArrayList<Delete>();
        Delete del = new Delete(rowKey.getBytes());
        list.add(del);
        table.delete(list);
        System.out.println("del record " + rowKey + " ok.");
    }

    /** Look up one row */
    public static void getOneRecord(String tableName, String rowKey) throws IOException {
        HTable table = new HTable(conf, tableName);
        Get get = new Get(rowKey.getBytes());
        Result rs = table.get(get);
        for (KeyValue kv : rs.raw()) {
            System.out.print(new String(kv.getRow()) + " ");
            System.out.print(new String(kv.getFamily()) + ":");
            System.out.print(new String(kv.getQualifier()) + " ");
            System.out.print(kv.getTimestamp() + " ");
            System.out.println(new String(kv.getValue()));
        }
    }

    /** Print all rows */
    public static void getAllRecord(String tableName) {
        try {
            HTable table = new HTable(conf, tableName);
            Scan s = new Scan();
            ResultScanner ss = table.getScanner(s);
            for (Result r : ss) {
                for (KeyValue kv : r.raw()) {
                    System.out.print(new String(kv.getRow()) + " ");
                    System.out.print(new String(kv.getFamily()) + ":");
                    System.out.print(new String(kv.getQualifier()) + " ");
                    System.out.print(kv.getTimestamp() + " ");
                    System.out.println(new String(kv.getValue()));
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        try {
            String tablename = "scores";
            String[] familys = { "grade", "course" };
            HBaseTest.creatTable(tablename, familys);
            // add record zkb
            HBaseTest.addRecord(tablename, "zkb", "grade", "", "5");
            HBaseTest.addRecord(tablename, "zkb", "course", "", "90");
            HBaseTest.addRecord(tablename, "zkb", "course", "math", "97");
            HBaseTest.addRecord(tablename, "zkb", "course", "art", "87");
            // add record baoniu
            HBaseTest.addRecord(tablename, "baoniu", "grade", "", "4");
            HBaseTest.addRecord(tablename, "baoniu", "course", "math", "89");
            System.out.println("===========get one record========");
            HBaseTest.getOneRecord(tablename, "zkb");
            System.out.println("===========show all record========");
            HBaseTest.getAllRecord(tablename);
            System.out.println("===========del one record========");
            HBaseTest.delRecord(tablename, "baoniu");
            System.out.println("===========show all record========");
            HBaseTest.getAllRecord(tablename);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
```
Run As → Run on Hadoop; the console output and the result of checking in the HBase shell follow (screenshots omitted).
V. HBase with MapReduce
Building on the mobile-phone internet logs from the earlier post "Toward Cloud Computing: MapReduce code optimization and improvement", the goal here is to import those logs into HBase for storage through a MapReduce job.
1. Creating the table in HBase
Create a table wlan_log through the shell; to keep things simple, it defines just one column family, cf.

```
create 'wlan_log','cf'
```
2. Writing the data into HBase
Create a new class in Eclipse with the following code:

```java
package com.kang.hbase;

import java.text.SimpleDateFormat;
import java.util.Date;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.mapreduce.TableOutputFormat;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Counter;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class MRHbase {
    static class BatchImportMapper extends Mapper<LongWritable, Text, LongWritable, Text> {
        SimpleDateFormat dateformat1 = new SimpleDateFormat("yyyyMMddHHmmss");
        Text v2 = new Text();

        protected void map(LongWritable key, Text value, Context context)
                throws java.io.IOException, InterruptedException {
            final String[] splited = value.toString().split("\t");
            try {
                final Date date = new Date(Long.parseLong(splited[0].trim()));
                final String dateFormat = dateformat1.format(date);
                // Row key: msisdn + ":" + formatted timestamp
                String rowKey = splited[1] + ":" + dateFormat;
                v2.set(rowKey + "\t" + value.toString());
                context.write(key, v2);
            } catch (NumberFormatException e) {
                final Counter counter = context.getCounter("BatchImportJob", "ErrorFormat");
                counter.increment(1L);
                System.out.println("Bad record: " + splited[0] + " " + e.getMessage());
            }
        }
    }

    static class BatchImportReducer extends TableReducer<LongWritable, Text, NullWritable> {
        protected void reduce(LongWritable key, java.lang.Iterable<Text> values, Context context)
                throws java.io.IOException, InterruptedException {
            for (Text text : values) {
                final String[] splited = text.toString().split("\t");
                final Put put = new Put(Bytes.toBytes(splited[0]));
                put.add(Bytes.toBytes("cf"), Bytes.toBytes("date"), Bytes.toBytes(splited[1]));
                put.add(Bytes.toBytes("cf"), Bytes.toBytes("msisdn"), Bytes.toBytes(splited[2]));
                // The other fields are omitted; add them with further put.add(...) calls
                context.write(NullWritable.get(), put);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        final Configuration configuration = new Configuration();
        // Set the ZooKeeper quorum
        configuration.set("hbase.zookeeper.quorum", "sparkproject1,sparkproject2,sparkproject3");
        // Set the target HBase table
        configuration.set(TableOutputFormat.OUTPUT_TABLE, "wlan_log");
        // Raise the timeout so HBase does not give up on slow writes
        configuration.set("dfs.socket.timeout", "180000");
        final Job job = new Job(configuration, "HBaseBatchImportJob");
        job.setMapperClass(BatchImportMapper.class);
        job.setReducerClass(BatchImportReducer.class);
        // Set the map output types; the reduce output types come from TableOutputFormat
        job.setMapOutputKeyClass(LongWritable.class);
        job.setMapOutputValueClass(Text.class);
        job.setInputFormatClass(TextInputFormat.class);
        // No output path is set; the output format writes into HBase instead
        job.setOutputFormatClass(TableOutputFormat.class);
        FileInputFormat.setInputPaths(job, "hdfs://sparkproject1:9000/root/input/");
        boolean success = job.waitForCompletion(true);
        if (success) {
            System.out.println("Batch import to HBase success!");
            System.exit(0);
        } else {
            System.out.println("Batch import to HBase failed!");
            System.exit(1);
        }
    }
}
```
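The mapper above derives each row key as msisdn + ":" + a yyyyMMddHHmmss timestamp, so rows for the same phone number sort together in time order. A minimal standalone sketch of that row-key construction, with no HBase dependency; the phone number and epoch-millis value are made up for illustration:

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class RowKeyBuilder {
    // Build a row key in the same msisdn:yyyyMMddHHmmss shape as the mapper
    static String buildRowKey(String msisdn, long reportTimeMillis) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMddHHmmss");
        fmt.setTimeZone(TimeZone.getTimeZone("UTC")); // fixed zone for a reproducible result
        return msisdn + ":" + fmt.format(new Date(reportTimeMillis));
    }

    public static void main(String[] args) {
        // Hypothetical log values: phone number and epoch-millis report time
        System.out.println(buildRowKey("13600217502", 1369412496000L));
    }
}
```

Because all keys for one phone share the msisdn prefix, a per-phone query becomes a simple row-key range scan, which is exactly what the reader program in the next part does.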
After running this code, the `list` command in the HBase shell shows the imported table (screenshot omitted).
3. Reading the data back with Java code

```java
package com.kang.hbase;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class WlanLogApp {
    private static final String TABLE_NAME = "wlan_log";
    private static final String FAMILY_NAME = "cf";

    /**
     * Basic HBase Java API usage examples
     */
    public static void main(String[] args) throws Exception {
        System.out.println("All records for phone 13600217502:");
        scan(TABLE_NAME, "13600217502");
        System.out.println("All records in the 136 number range:");
        scanPeriod(TABLE_NAME, "136");
    }

    /*
     * Query all records for phone 13600217502
     */
    public static void scan(String tableName, String mobileNum) throws IOException {
        HTable table = new HTable(getConfiguration(), tableName);
        Scan scan = new Scan();
        scan.setStartRow(Bytes.toBytes(mobileNum + ":/"));
        scan.setStopRow(Bytes.toBytes(mobileNum + "::"));
        ResultScanner scanner = table.getScanner(scan);
        int i = 0;
        for (Result result : scanner) {
            System.out.println("Scan: " + i + " " + result);
            i++;
        }
    }

    /*
     * Query all records in the 136 number range
     */
    public static void scanPeriod(String tableName, String period) throws IOException {
        HTable table = new HTable(getConfiguration(), tableName);
        Scan scan = new Scan();
        scan.setStartRow(Bytes.toBytes(period + "/"));
        scan.setStopRow(Bytes.toBytes(period + ":"));
        scan.setMaxVersions(1);
        ResultScanner scanner = table.getScanner(scan);
        int i = 0;
        for (Result result : scanner) {
            System.out.println("Scan: " + i + " " + result);
            i++;
        }
    }

    /*
     * Get the HBase configuration
     */
    private static Configuration getConfiguration() {
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "sparkproject1,sparkproject2,sparkproject3");
        return conf;
    }
}
```

The run output follows (screenshot omitted).
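The start/stop rows in the scans work because HBase orders row keys byte-wise, and in ASCII '/' (0x2F) sorts just before '0' while ':' (0x3A) sorts just after '9'. So every key whose digits continue past a prefix like "136" falls strictly between "136/" and "136:". A small dependency-free sketch of that ordering; the sample row key is made up:

```java
public class RowKeyRangeDemo {
    // Mirrors HBase's lexicographic row-key comparison for ASCII strings
    static boolean inRange(String rowKey, String start, String stop) {
        return start.compareTo(rowKey) <= 0 && rowKey.compareTo(stop) < 0;
    }

    public static void main(String[] args) {
        String rowKey = "13600217502:20130524162136"; // hypothetical msisdn:timestamp key
        // '/' sorts before '0' and ':' sorts after '9', bracketing all 136* numbers
        System.out.println(inRange(rowKey, "136/", "136:"));                       // prefix scan
        System.out.println(inRange(rowKey, "13600217502:/", "13600217502::"));     // single-phone scan
    }
}
```

The same trick bounds a single phone's records: "13600217502:/" through "13600217502::" brackets every timestamp suffix for that number.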