Hadoop Basics Tutorial - Chapter 12 Hive: Advanced (12.5 Hive External Tables) (Draft)
Chapter 12 Hive: Advanced
12.5 Hive External Tables
12.5.1 Preparing the Data
[root@nb0 data]# vi gen.sh
[root@nb0 data]# cat gen.sh
#!/bin/sh
for i in {1..100000};do
    echo -e $i'\t'$RANDOM'\t'$RANDOM'\t'$RANDOM
done;
[root@nb0 data]# sh gen.sh > mydata.txt
You have mail in /var/spool/mail/root
[root@nb0 data]# vi mydata.txt
[root@nb0 data]# hdfs dfs -put mydata.txt input
17/07/19 20:38:55 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[root@nb0 data]# hdfs dfs -ls input
17/07/19 20:39:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
-rw-r--r--   3 root hbase    1698432 2017-07-19 20:38 input/mydata.txt
You have mail in /var/spool/mail/root
[root@nb0 data]#
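The gen.sh script above relies on brace ranges, echo -e and $RANDOM, which work when sh is really bash (as on the host shown) but are not guaranteed under a strict POSIX sh. As a hedged alternative, here is a minimal sketch that lets awk do the looping and the random numbers. The function name gen_rows and its row-count parameter are assumptions for illustration and are not part of the original script.

```shell
#!/bin/sh
# Sketch of a POSIX-portable generator: awk does the looping and the random
# numbers, so the script does not depend on bash-only features.
gen_rows() {
    # $1 = number of rows to emit (hypothetical parameter, not in the original script)
    awk -v n="$1" 'BEGIN {
        srand()
        # int(rand() * 32768) mimics the 0..32767 range of bash $RANDOM
        for (i = 1; i <= n; i++)
            printf "%d\t%d\t%d\t%d\n", i, int(rand() * 32768), int(rand() * 32768), int(rand() * 32768)
    }'
}

# Sanity check on a small sample: every line should have four tab-separated fields.
gen_rows 5 | awk -F'\t' 'NF != 4 { bad++ } END { print NR " rows, " (bad + 0) " malformed" }'
# prints: 5 rows, 0 malformed
```

To reproduce the file used in this section, the equivalent call would be gen_rows 100000 > mydata.txt.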
12.5.2 Creating the HBase Table
Create an HBase table named abc with a single column family, info.
hbase(main):007:0> create 'abc','info'
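12.5.3 Creating the Hive External Table
Create a Hive external table that points to the existing HBase table.
HBase keeps no data type information, so every column is declared as string and values are stored in their string form.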
hive> create external table hbase_t1(rowkey string,data1 string,data2 string,data3 string)
    > stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    > with serdeproperties ("hbase.columns.mapping" = ":key,info:data1,info:data2,info:data3")
    > tblproperties ("hbase.table.name"="abc","hbase.mapred.output.outputtable"="abc");
OK
Time taken: 0.213 seconds
hive>
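The hbase.columns.mapping value in the statement above is positional: Hive expects one entry per Hive column, in the same order, with :key standing for the HBase row key and the other entries written as family:qualifier. The following shell sketch only illustrates that pairing. The helper name show_mapping is hypothetical, and the check it performs is a local sanity test, not something Hive or HBase runs.

```shell
#!/bin/sh
# Sketch: pair each Hive column with the HBase target in the same position.
show_mapping() {
    # $1 = comma-separated Hive column names
    # $2 = the hbase.columns.mapping value
    printf '%s\n%s\n' "$1" "$2" | awk -F',' '
        NR == 1 { n = NF; for (i = 1; i <= NF; i++) col[i] = $i }
        NR == 2 {
            if (NF != n) {
                print "mismatch: " n " columns, " NF " mapping entries"
                exit 1
            }
            for (i = 1; i <= NF; i++) print col[i] " -> " $i
        }'
}

show_mapping "rowkey,data1,data2,data3" ":key,info:data1,info:data2,info:data3"
# prints:
# rowkey -> :key
# data1 -> info:data1
# data2 -> info:data2
# data3 -> info:data3
```

Running a check like this before issuing the create statement can catch a missing or extra mapping entry early.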
12.5.4 Creating the Hive Data Table
Create a Hive table to hold the tab-delimited source data.
hive> create external table hive_t1(rowkey string,data1 string,data2 string,data3 string)
    > row format delimited
    > fields terminated by '\t'
    > stored as textfile;
OK
Time taken: 0.105 seconds
hive>
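Load the data. Without the local keyword, load data inpath moves the file from its HDFS location (input/mydata.txt) into the table's directory rather than copying it.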
hive> load data inpath 'input/mydata.txt' into table hive_t1;
Loading data to table default.hive_t1
Table default.hive_t1 stats: [numFiles=1, totalSize=2287063]
OK
Time taken: 0.795 seconds
hive>
12.5.5 Inserting Data into the HBase Table with HQL
hive> insert overwrite table hbase_t1
    > select rowkey,data1,data2,data3 from hive_t1;
Query ID = root_20170720022542_8b5292a8-7903-4e22-8258-223555fab220
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1500448404940_0002, Tracking URL = http://nb0:8088/proxy/application_1500448404940_0002/
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1500448404940_0002
Hadoop job information for Stage-0: number of mappers: 2; number of reducers: 0
2017-07-20 02:25:52,647 Stage-0 map = 0%, reduce = 0%
2017-07-20 02:26:03,331 Stage-0 map = 100%, reduce = 0%, Cumulative CPU 19.73 sec
MapReduce Total cumulative CPU time: 19 seconds 730 msec
Ended Job = job_1500448404940_0002
MapReduce Jobs Launched:
Stage-Stage-0: Map: 2 Cumulative CPU: 19.73 sec HDFS Read: 2345042 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 19 seconds 730 msec
OK
Time taken: 22.873 seconds
hive>
12.5.6 Checking the Results
hive> select count(*) from hbase_t1;
Query ID = root_20170720022640_c705b5a2-7db5-4bfa-8b43-f218ad588226
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1500448404940_0003, Tracking URL = http://nb0:8088/proxy/application_1500448404940_0003/
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1500448404940_0003
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2017-07-20 02:26:51,257 Stage-1 map = 0%, reduce = 0%
2017-07-20 02:27:01,201 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 6.91 sec
2017-07-20 02:27:07,482 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 9.56 sec
MapReduce Total cumulative CPU time: 9 seconds 560 msec
Ended Job = job_1500448404940_0003
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 9.56 sec HDFS Read: 14530 HDFS Write: 7 SUCCESS
Total MapReduce CPU Time Spent: 9 seconds 560 msec
OK
100000
Time taken: 29.51 seconds, Fetched: 1 row(s)
hive>
hive> select * from hbase_t1 limit 10;
OK
1 199 4567 25943
10 10448 19496 31645
100 26984 29011 13177
1000 4008 1275 8236
10000 10121 14333 24945
100000 14619 17100 556
10001 28304 22506 5836
10002 29960 4367 19187
10003 25065 21803 21932
10004 19965 31442 18762
Time taken: 0.188 seconds, Fetched: 10 row(s)
hive>
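The rows returned by the select above are not in numeric order: HBase keeps rows sorted by the bytes of the row key, and the keys here are strings, so 100000 comes before 10001. The sketch below reproduces that ordering with a byte-wise sort and then shows zero-padding as one common way to make string order match numeric order. The padding is a suggestion only, the tutorial itself does not pad its keys, and the helper name sort_keys is hypothetical.

```shell
#!/bin/sh
# Byte-wise sort, which is how string row keys compare in HBase.
sort_keys() {
    LC_ALL=C sort
}

# Unpadded keys: lexicographic order differs from numeric order.
printf '%s\n' 2 10 1 100000 10001 | sort_keys
# prints:
# 1
# 10
# 100000
# 10001
# 2

# Zero-padded to a fixed width of six digits: both orders now agree.
printf '%06d\n' 2 10 1 100000 10001 | sort_keys
# prints:
# 000001
# 000002
# 000010
# 010001
# 100000
```

A fixed-width key only helps if the width is chosen to cover the largest expected key, which is an assumption the table designer has to make up front.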
hbase(main):007:0> count 'abc'
Current count: 1000, row: 10897
Current count: 2000, row: 11797
Current count: 3000, row: 12697
Current count: 4000, row: 13597
Current count: 5000, row: 14497
Current count: 6000, row: 15397
Current count: 7000, row: 16297
Current count: 8000, row: 17197
Current count: 9000, row: 18097
Current count: 10000, row: 18998
Current count: 11000, row: 19898
Current count: 12000, row: 20797
Current count: 13000, row: 21697
Current count: 14000, row: 22597
Current count: 15000, row: 23497
Current count: 16000, row: 24397
Current count: 17000, row: 25297
Current count: 18000, row: 26197
Current count: 19000, row: 27097
Current count: 20000, row: 27998
Current count: 21000, row: 28898
Current count: 22000, row: 29798
Current count: 23000, row: 30697
Current count: 24000, row: 31597
Current count: 25000, row: 32497
Current count: 26000, row: 33397
Current count: 27000, row: 34297
Current count: 28000, row: 35197
Current count: 29000, row: 36097
Current count: 30000, row: 36998
Current count: 31000, row: 37898
Current count: 32000, row: 38798
Current count: 33000, row: 39698
Current count: 34000, row: 40597
Current count: 35000, row: 41497
Current count: 36000, row: 42397
Current count: 37000, row: 43297
Current count: 38000, row: 44197
Current count: 39000, row: 45097
Current count: 40000, row: 45998
Current count: 41000, row: 46898
Current count: 42000, row: 47798
Current count: 43000, row: 48698
Current count: 44000, row: 49598
Current count: 45000, row: 50497
Current count: 46000, row: 51397
Current count: 47000, row: 52297
Current count: 48000, row: 53197
Current count: 49000, row: 54097
Current count: 50000, row: 54998
Current count: 51000, row: 55898
Current count: 52000, row: 56798
Current count: 53000, row: 57698
Current count: 54000, row: 58598
Current count: 55000, row: 59498
Current count: 56000, row: 60397
Current count: 57000, row: 61297
Current count: 58000, row: 62197
Current count: 59000, row: 63097
Current count: 60000, row: 63998
Current count: 61000, row: 64898
Current count: 62000, row: 65798
Current count: 63000, row: 66698
Current count: 64000, row: 67598
Current count: 65000, row: 68498
Current count: 66000, row: 69398
Current count: 67000, row: 70297
Current count: 68000, row: 71197
Current count: 69000, row: 72097
Current count: 70000, row: 72998
Current count: 71000, row: 73898
Current count: 72000, row: 74798
Current count: 73000, row: 75698
Current count: 74000, row: 76598
Current count: 75000, row: 77498
Current count: 76000, row: 78398
Current count: 77000, row: 79298
Current count: 78000, row: 80197
Current count: 79000, row: 81097
Current count: 80000, row: 81998
Current count: 81000, row: 82898
Current count: 82000, row: 83798
Current count: 83000, row: 84698
Current count: 84000, row: 85598
Current count: 85000, row: 86498
Current count: 86000, row: 87398
Current count: 87000, row: 88298
Current count: 88000, row: 89198
Current count: 89000, row: 90097
Current count: 90000, row: 90998
Current count: 91000, row: 91898
Current count: 92000, row: 92798
Current count: 93000, row: 93698
Current count: 94000, row: 94598
Current count: 95000, row: 95498
Current count: 96000, row: 96398
Current count: 97000, row: 97298
Current count: 98000, row: 98198
Current count: 99000, row: 99098
Current count: 100000, row: 99999
100000 row(s) in 9.7420 seconds
=> 100000
hbase(main):008:0>
hbase(main):013:0> get 'abc','100000'
COLUMN          CELL
 info:data1     timestamp=1500531979008, value=14619
 info:data2     timestamp=1500531979008, value=17100
 info:data3     timestamp=1500531979008, value=556
3 row(s) in 0.0150 seconds
hbase(main):014:0>