Hive静态分区表

来源:互联网 发布:淘宝客服月总结 编辑:程序博客网 时间:2024/06/10 00:13


Hive的分区表分为动态分区和静态分区,分区表的使用能够为巨量表查询性能的提高提供帮助。
静态分区在数据载入前需要事先将分区建好,使用起来稍显复杂,而动态表可以根据数据自动建立分区,但同时花费了巨大的性能代价。如果分区是可以确定的话,一定不要用动态分区,动态分区的值是在reduce运行阶段确定的;也就是会把所有的记录distribute by。 可想而知表记录非常大的话,只有一个reduce去处理,那简直是疯狂的。如果这个值唯一或者事先已经知道,比如按天分区(i_date=20151105)那就用静态分区吧。静态分区在编译阶段已经确定,不需要reduce处理。所以,在分区表的使用上,一般建议使用静态分区。
partition必须在表定义时创建。
1、单分区建表语句

CREATE TABLE `base_order_partition`(  `order_id` bigint,   `order_date` int,   `order_mo` int,   `status` tinyint,   `status_desc` varchar(12),   `food_num` int,   `food_amount` float,   `order_amount` float)PARTITIONED BY (   `order_month` int)ROW FORMAT DELIMITED   FIELDS TERMINATED BY '$' STORED AS INPUTFORMAT   'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'LOCATION  'hdfs://Master:9000/home/spark/opt/data_dir/hive/warehouse/59store.db/base_order_partition'TBLPROPERTIES (  'transient_lastDdlTime'='1446600982');

 
2、双分区建表语句

CREATE TABLE `base_order_partition2`(  `order_id` bigint,   `order_date` int,   `order_mo` int,   `status` tinyint,   `status_desc` varchar(12),   `food_num` int,   `food_amount` float,   `order_amount` float)PARTITIONED BY (   `order_month` int,   `order_date1` int)ROW FORMAT DELIMITED   FIELDS TERMINATED BY '$' STORED AS INPUTFORMAT   'org.apache.hadoop.mapred.TextInputFormat' OUTPUTFORMAT   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'LOCATION  'hdfs://Master:9000/home/spark/opt/data_dir/hive/warehouse/59store.db/base_order_partition2'TBLPROPERTIES (  'transient_lastDdlTime'='1446620191');先以order_month为文件夹,再以order_date1子文件夹区分。

3、添加分区表语法(表已创建,在此基础上添加分区)

单分区表:alter table base_order_partition add partition (order_month=201501);alter table base_order_partition add partition (order_month=201502);alter table base_order_partition add partition (order_month=201503);alter table base_order_partition add partition (order_month=201504);alter table base_order_partition add partition (order_month=201505);alter table base_order_partition add partition (order_month=201506);alter table base_order_partition add partition (order_month=201507);alter table base_order_partition add partition (order_month=201508);alter table base_order_partition add partition (order_month=201509);alter table base_order_partition add partition (order_month=201510);alter table base_order_partition add partition (order_month=201511);alter table base_order_partition add partition (order_month=201512);双分区表:alter table base_order_partition2 add partition (order_month=201510,order_date1=20151001);alter table base_order_partition2 add partition (order_month=201510,order_date1=20151002);alter table base_order_partition2 add partition (order_month=201510,order_date1=20151003);alter table base_order_partition2 add partition (order_month=201510,order_date1=20151004);alter table base_order_partition2 add partition (order_month=201510,order_date1=20151005);alter table base_order_partition2 add partition (order_month=201510,order_date1=20151006);alter table base_order_partition2 add partition (order_month=201510,order_date1=20151007);alter table base_order_partition2 add partition (order_month=201510,order_date1=20151008);alter table base_order_partition2 add partition (order_month=201510,order_date1=20151009);alter table base_order_partition2 add partition (order_month=201510,order_date1=20151010);alter table base_order_partition2 add partition (order_month=201510,order_date1=20151011);alter table base_order_partition2 add partition (order_month=201510,order_date1=20151012);alter table base_order_partition2 add partition (order_month=201510,order_date1=20151013);alter table base_order_partition2 add partition (order_month=201510,order_date1=20151014);alter table base_order_partition2 add partition (order_month=201510,order_date1=20151015);alter table base_order_partition2 add partition (order_month=201510,order_date1=20151016);alter table base_order_partition2 add partition (order_month=201510,order_date1=20151017);alter table base_order_partition2 add partition (order_month=201510,order_date1=20151018);alter table base_order_partition2 add partition (order_month=201510,order_date1=20151019);alter table base_order_partition2 add partition (order_month=201510,order_date1=20151020);alter table base_order_partition2 add partition (order_month=201510,order_date1=20151021);alter table base_order_partition2 add partition (order_month=201510,order_date1=20151022);alter table base_order_partition2 add partition (order_month=201510,order_date1=20151023);alter table base_order_partition2 add partition (order_month=201510,order_date1=20151024);alter table base_order_partition2 add partition (order_month=201510,order_date1=20151025);alter table base_order_partition2 add partition (order_month=201510,order_date1=20151026);alter table base_order_partition2 add partition (order_month=201510,order_date1=20151027);alter table base_order_partition2 add partition (order_month=201510,order_date1=20151028);alter table base_order_partition2 add partition (order_month=201510,order_date1=20151029);alter table base_order_partition2 add partition (order_month=201510,order_date1=20151030);alter table base_order_partition2 add partition (order_month=201510,order_date1=20151031);hive> alter table base_order_partition2 add partition (order_month=201510,order_date1=20151032);

4、删除分区语法
ALTER TABLE table_name DROP
 partition_spec, partition_spec,...

用户可以用 ALTER TABLE DROP PARTITION 来删除分区。分区的元数据和数据将被一并删除。

hive> alter table base_order_partition2 drop partition (order_month=201510,order_date1=20151032);

5、数据加载进分区表中语法
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)]
例:
LOAD DATA INPATH '/home/spark/opt/data_dir/pv.txt' INTO TABLE day_hour_table PARTITION(dt='2008-08- 08', hour='08'); LOAD DATA local INPATH '/home/spark/opt/data_dir/hua/*' INTO TABLE day_hour partition(dt='2010-07- 07');
当数据被加载至表中时,不会对数据进行任何转换。Load操作只是将数据复制至Hive表对应的位置。数据加载时在表下自动创建一个目录。

表对表方式:

from base_order_partitioninsert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151001)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151001insert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151002)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151002insert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151003)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151003insert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151004)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151004insert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151005)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151005insert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151006)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151006insert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151007)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151007insert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151008)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151008insert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151009)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151009insert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151010)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151010insert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151011)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151011insert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151012)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151012insert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151013)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151013insert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151014)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151014insert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151015)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151015insert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151016)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151016insert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151017)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151017insert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151018)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151018insert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151019)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151019insert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151020)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151020insert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151021)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151021insert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151022)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151022insert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151023)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151023insert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151024)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151024insert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151025)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151025insert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151026)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151026insert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151027)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151027insert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151028)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151028insert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151029)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151029insert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151030)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151030insert overwrite table base_order_partition2 partition (order_month=201510,order_date1=20151031)select `order_id`,`order_date`,`order_mo`,`status`,`status_desc`,`food_num`,`food_amount`,`order_amount`, where order_month=201510 and order_date=20151031;


6、基于分区的查询的语句

SELECT * FROM base_order_partition WHERE order_month=201511;

7、查看分区语句

hive> show partitions base_order_partition;OKorder_month=201501order_month=201502order_month=201503order_month=201504order_month=201505order_month=201506order_month=201507order_month=201508order_month=201509order_month=201510order_month=201511order_month=201512Time taken: 0.076 seconds, Fetched: 12 row(s)

0 0