Loading Data into Hive with Dynamic Partitions
Source: Internet  Editor: 程序博客网  Time: 2024/05/22 17:15
0. Start the Hive command-line shell
1. Create the partitioned table t12 and inspect its partition information
hive> CREATE TABLE t12(id INT , NAME string) partitioned BY(YEAR INT , MONTH INT) ROW format delimited FIELDS TERMINATED BY '\t';
OK
Time taken: 0.21 seconds
hive> desc t12;
OK
id                  int
name                string
year                int
month               int

# Partition Information
# col_name          data_type           comment

year                int
month               int
Time taken: 0.085 seconds, Fetched: 10 row(s)

2. Create a test data file and load it into the partitioned table t12
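localhost:result_data a6$ pwd
/Users/a6/Applications/apache-hive-2.3.0-bin/result_data
localhost:result_data a6$ more t12_data.txt
1    小华
2    成龙
3    zhangsan
4    李四
5    张龙
6    赵虎
localhost:result_data a6$

The two fields on each line must be separated by a tab character, matching the FIELDS TERMINATED BY '\t' clause of the table definition.
Load the data into four partitions and check the result with the following commands: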
hive> load data local inpath '/Users/a6/Applications/apache-hive-2.3.0-bin/result_data/t12_data.txt' into table t12 partition ( year=2017,month=8);
Loading data to table yyz_workdb.t12 partition (year=2017, month=8)
OK
Time taken: 1.042 seconds
hive> load data local inpath '/Users/a6/Applications/apache-hive-2.3.0-bin/result_data/t12_data.txt' into table t12 partition ( year=2017,month=9);
Loading data to table yyz_workdb.t12 partition (year=2017, month=9)
OK
Time taken: 0.575 seconds
hive> load data local inpath '/Users/a6/Applications/apache-hive-2.3.0-bin/result_data/t12_data.txt' into table t12 partition ( year=2017,month=10);
Loading data to table yyz_workdb.t12 partition (year=2017, month=10)
OK
Time taken: 0.532 seconds
hive> load data local inpath '/Users/a6/Applications/apache-hive-2.3.0-bin/result_data/t12_data.txt' into table t12 partition ( year=2017,month=11);
Loading data to table yyz_workdb.t12 partition (year=2017, month=11)
OK
Time taken: 0.502 seconds
hive> select * from t12;
OK
1    小华        2017    10
2    成龙        2017    10
3    zhangsan    2017    10
4    李四        2017    10
5    张龙        2017    10
6    赵虎        2017    10
1    小华        2017    11
2    成龙        2017    11
3    zhangsan    2017    11
4    李四        2017    11
5    张龙        2017    11
6    赵虎        2017    11
1    小华        2017    8
2    成龙        2017    8
3    zhangsan    2017    8
4    李四        2017    8
5    张龙        2017    8
6    赵虎        2017    8
1    小华        2017    9
2    成龙        2017    9
3    zhangsan    2017    9
4    李四        2017    9
5    张龙        2017    9
6    赵虎        2017    9
Time taken: 0.193 seconds, Fetched: 24 row(s)

3. Create the partitioned table t13
hive> CREATE TABLE t13(id INT , NAME string) partitioned BY(YEAR INT , MONTH INT) ROW format delimited FIELDS TERMINATED BY '\t';
OK
Time taken: 0.075 seconds

4. Load data into the partitioned table dynamically
hive> insert into table t13 partition(year=2015,month) select id,name,month from t12 where year=2017;
WARNING: Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Query ID = a6_20171104164639_f6406180-46ea-496b-8cf6-70f28ca62659
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1509763925736_0003, Tracking URL = http://localhost:8088/proxy/application_1509763925736_0003/
Kill Command = /Users/a6/Applications/hadoop-2.6.5/bin/hadoop job -kill job_1509763925736_0003
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2017-11-04 16:46:46,292 Stage-1 map = 0%, reduce = 0%
2017-11-04 16:46:52,648 Stage-1 map = 100%, reduce = 0%
Ended Job = job_1509763925736_0003
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to directory hdfs://localhost:9002/user/hive/warehouse/yyz_workdb.db/t13/year=2015/.hive-staging_hive_2017-11-04_16-46-39_096_8917617436179357709-1/-ext-10000
Loading data to table yyz_workdb.t13 partition (year=2015, month=null)
Loaded : 4/4 partitions.
Time taken to load dynamic partitions: 0.476 seconds
Time taken for adding to write entity : 0.001 seconds
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 HDFS Read: 5808 HDFS Write: 467 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
Time taken: 15.51 seconds
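Note: this statement inserts every row of t12 where year=2017 into the new partitioned table t13, storing the rows under the static partition value year=2015. Pay attention to the column list id,name,month. t13 has the columns id, name, year and month, of which year and month are partition columns. Because the PARTITION clause already fixes year=2015, the SELECT on t12 only needs to supply three columns: id, name and month. Hive takes the last selected column, month, as the value of the dynamic partition column, so one partition is created per distinct month.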
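To confirm which partitions the statement created, the partition list and the per-partition row counts can be inspected. This is a minimal sketch that assumes the tables t12 and t13 from the steps above; the alias row_count is introduced here only for illustration. Given the "Loaded : 4/4 partitions" line in the log, four partitions under year=2015 would be expected.

```sql
-- List the partitions that now exist for t13.
SHOW PARTITIONS t13;

-- Count the rows in each partition. Partition columns (year, month)
-- can be selected and grouped like ordinary columns.
SELECT year, month, COUNT(*) AS row_count
FROM t13
GROUP BY year, month;
```

The second query runs as a regular Hive job, so on a large table SHOW PARTITIONS alone is the cheaper check.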
4.1. Check the data in t13 after the insert:
hive> select * from t13;
OK
1    小华        2015    10
2    成龙        2015    10
3    zhangsan    2015    10
4    李四        2015    10
5    张龙        2015    10
6    赵虎        2015    10
1    小华        2015    11
2    成龙        2015    11
3    zhangsan    2015    11
4    李四        2015    11
5    张龙        2015    11
6    赵虎        2015    11
1    小华        2015    8
2    成龙        2015    8
3    zhangsan    2015    8
4    李四        2015    8
5    张龙        2015    8
6    赵虎        2015    8
1    小华        2015    9
2    成龙        2015    9
3    zhangsan    2015    9
4    李四        2015    9
5    张龙        2015    9
6    赵虎        2015    9
Time taken: 0.098 seconds, Fetched: 24 row(s)

5. Making all partition columns dynamic
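In the default strict mode, at least one partition column must be given a static value, which is why step 4 ran without extra settings. To make every partition column dynamic, switch to nonstrict mode first:
-- Required before all partition columns can be dynamic
set hive.exec.dynamic.partition.mode=nonstrict;
insert into table t13 partition(year,month) select * from t12;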
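Because dynamic partition values are taken by position from the end of the SELECT list, relying on select * only works while the column order of t12 happens to match id, name, year, month. A hedged sketch of the same fully dynamic insert with the columns named explicitly, assuming the tables t12 and t13 defined earlier:

```sql
-- Dynamic partitioning itself is controlled by this property;
-- it is already enabled by default in recent Hive releases.
SET hive.exec.dynamic.partition=true;
-- Allow every partition column to be dynamic.
SET hive.exec.dynamic.partition.mode=nonstrict;

-- The last two selected columns feed the partition columns
-- (year, month), in the order given in the PARTITION clause.
INSERT INTO TABLE t13 PARTITION (year, month)
SELECT id, name, year, month
FROM t12;
```

Naming the columns keeps the statement correct even if the source table is later altered, at the cost of a slightly longer query.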
Reference: http://blog.csdn.net/liubiaoxin/article/details/48931247