hive添加分区字段
来源:互联网 发布:计算机二级所用软件 编辑:程序博客网 时间:2024/06/06 01:16
静态分区表:
一级分区表:
CREATE TABLE order_created_partition ( orderNumber STRING , event_time STRING)PARTITIONED BY (event_month string)ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
加载数据方式一:从本地/HDFS目录加载
load data local inpath '/home/spark/software/data/order_created.txt' overwrite into table order_created_partition PARTITION(event_month='2014-05');
select * from order_created_partition where event_month='2014-05';
+-----------------+-----------------------------+--------------+| ordernumber | event_time | event_month |+-----------------+-----------------------------+--------------+| 10703007267488 | 2014-05-01 06:01:12.334+01 | 2014-05 || 10101043505096 | 2014-05-01 07:28:12.342+01 | 2014-05 || 10103043509747 | 2014-05-01 07:50:12.33+01 | 2014-05 || 10103043501575 | 2014-05-01 09:27:12.33+01 | 2014-05 || 10104043514061 | 2014-05-01 09:03:12.324+01 | 2014-05 |+-----------------+-----------------------------+--------------+
加载数据方式二:手工上传文件到hdfs上,然后将数据添加到分区表指定的分区:
1) 创建hdfs目录:在hdfs目录:/user/hive/warehouse/order_created_partition目录下创建event_month=2014-06
hadoop fs -mkdir /user/hive/warehouse/order_created_partition/event_month=2014-06
2)拷贝数据到新创建的目录下:
hadoop fs -put /home/spark/software/data/order_created.txt /user/hive/warehouse/order_created_partition/event_month=2014-06
select * from order_created_partition where event_month='2014-06'; #发现查询结果是空的
3)添加新分区数据到元数据信息中:
msck repair table order_created_partition;
输出日志信息:
Partitions not in metastore: order_created_partition:event_month=2014-06Repair: Added partition to metastore order_created_partition:event_month=2014-06
或者: alter table order_created_partition add partition(dt='2014-06');
select * from order_created_partition where event_month='2014-06';
+-----------------+-----------------------------+--------------+| ordernumber | event_time | event_month |+-----------------+-----------------------------+--------------+| 10703007267488 | 2014-05-01 06:01:12.334+01 | 2014-06 || 10101043505096 | 2014-05-01 07:28:12.342+01 | 2014-06 || 10103043509747 | 2014-05-01 07:50:12.33+01 | 2014-06 || 10103043501575 | 2014-05-01 09:27:12.33+01 | 2014-06 || 10104043514061 | 2014-05-01 09:03:12.324+01 | 2014-06 |+-----------------+-----------------------------+--------------+
加载数据方式三:select查询方式insert/overwrite
CREATE TABLE order_created_4_partition ( orderNumber STRING , event_time STRING)ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
load data local inpath '/home/spark/software/data/order_created.txt' overwrite into table order_created_4_partition;insert into table order_created_partition partition(event_month='2014-07') select * from order_created_4_partition;insert overwrite table order_created_partition partition(event_month='2014-07') select * from order_created_4_partition;
对比:
insert overwrite table order_created_partition partition(event_month='2014-07') select ordernumber,event_time from order_created_4_partition;insert overwrite table order_created_partition partition(event_month='2014-07') select event_time,ordernumber from order_created_4_partition;
发现字段值错位,在使用时一定要注意:字段值顺序要与表中字段顺序一致,名称可以不一致;
查看分区表已有的所有分区:
show partitions order_created_partition;
查看分区表已有的指定分区:
SHOW PARTITIONS order_created_partition PARTITION(event_month='2014-06');
查看表字段信息:
desc order_created_partition;desc extended order_created_partition;desc formatted order_created_partition;desc formatted order_created_partition partition(event_month='2014-05');
二级分区表:
CREATE TABLE order_created_partition2 ( orderNumber STRING , event_time STRING)PARTITIONED BY (event_month string, step string)ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
show partitions order_created_partition2;
显示结果空
load data local inpath '/home/spark/software/data/order_created.txt' into table order_created_partition2 partition(event_month='2014-09',step='1'); show partitions order_created_partition2;
+-----------------------------+| result |+-----------------------------+| event_month=2014-09/step=1 |+-----------------------------+
insert overwrite table order_created_partition2 partition(event_month='2014-09',step='2') select * from order_created_4_partition;show partitions order_created_partition2;
+-----------------------------+| result |+-----------------------------+| event_month=2014-09/step=1 || event_month=2014-09/step=2 |+-----------------------------+
动态分区表
CREATE TABLE order_created_dynamic_partition ( orderNumber STRING , event_time STRING)PARTITIONED BY (event_month string);
insert into table order_created_dynamic_partition PARTITION (event_month)select orderNumber, event_time, substr(event_time, 1, 7) as event_month from order_created;
报错:
FAILED: SemanticException [Error 10096]: Dynamic partition strict mode requires at least one static partition column.
To turn this off set hive.exec.dynamic.partition.mode=nonstrict
解决方案:
set hive.exec.dynamic.partition.mode=nonstrict;
重新执行:
insert into table order_created_dynamic_partition PARTITION (event_month)select orderNumber, event_time, substr(event_time, 1, 7) as event_month from order_created;
select * from order_created_dynamic_partition;
+-----------------+-----------------------------+--------------+| ordernumber | event_time | event_month |+-----------------+-----------------------------+--------------+| 10703007267488 | 2014-05-01 06:01:12.334+01 | 2014-05 || 10101043505096 | 2014-05-01 07:28:12.342+01 | 2014-05 || 10103043509747 | 2014-05-01 07:50:12.33+01 | 2014-05 || 10103043501575 | 2014-05-01 09:27:12.33+01 | 2014-05 || 10104043514061 | 2014-05-01 09:03:12.324+01 | 2014-05 |+-----------------+-----------------------------+--------------+
阅读全文
0 0
- hive添加分区字段
- hive 建立表结构 ,添加分区
- Hive分区表添加分区时的注意事项
- hive 添加字段导致数据出错
- HIVE分区表新增字段后新增字段值为空,需要带分区加字段
- hive 分区
- hive分区
- hive分区
- hive 分区
- hive分区
- Hive 分区
- hive--分区
- hive分区
- HIVE分区
- Hive分区
- hive 分区
- Hive 修改表添加分区(add partition)和 加载数据时添加分区 的区别
- hive partition ||| hive分区
- c# 定时任务
- 获得最长公共子串问题
- chrome 开发者工具
- ubuntu开机进入initramfs的解决办法
- 集体智慧编程错误总结第三章
- hive添加分区字段
- How to set up minimal Redhat 7.3
- spring按类型进行依赖注入引起的问题
- 修改keil5工程名称
- 自定义控件 带删除按钮的EditText
- 欢迎使用CSDN-markdown编辑器
- Linux进行yum 程序安装时遇到的公钥没有安装的问题解决方法
- 解决Git报错:error: You have not concluded your merge (MERGE_HEAD exists).
- 10、Android开发基础之点击事件的处理