Copying Hive Tables and Emptying the HDFS Trash

Source: Internet  Published by: 淘宝代运营公司w863  Editor: 程序博客网  Date: 2024/05/26 02:54

Copying Hive tables

1. Copying a non-partitioned table

create table t_copy as select * from t_temp;

This is a plain CREATE TABLE AS SELECT, the same as in ordinary SQL.

2. Copying a partitioned table

First copy the source table's schema:

create table t_copy like t_part;

Then insert the partition data, for example:

insert overwrite table t_copy partition(year, month) select id, name, orderdate, substring(orderdate,1,4), substring(orderdate,6,2) from t_part;

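Notes:
1. The dynamic partition columns must come last in the select list, in the same order as in the partition clause.
2. Dynamic partitioning must be enabled with set hive.exec.dynamic.partition=true; (this switch turns dynamic partitioning on or off; the default was false before Hive 0.9.0 and is true from 0.9.0 onward).
3. [Optional] set hive.exec.dynamic.partition.mode=nonstrict; (the default has traditionally been strict, which requires at least one static partition; nonstrict requires none). It is needed whenever every partition level is dynamic, as with year and month in the statement above.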

After this runs, t_copy has the same partitions as the source table. However, this is not the fastest way to copy it.
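The settings and the insert above can be collected into one script. The following is a minimal sketch, assuming the hive CLI is on the PATH and that the tables t_copy and t_part already exist; the variable name hql_file is only illustrative. The script is written to a temporary file and executed only if hive is available.

```shell
# Write the session settings and the dynamic-partition insert to a temp file.
hql_file="$(mktemp)"
cat > "$hql_file" <<'EOF'
-- Enable dynamic partitioning for this session
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

-- The partition columns (year, month) come last in the select list
insert overwrite table t_copy partition(year, month)
select id, name, orderdate,
       substring(orderdate, 1, 4),
       substring(orderdate, 6, 2)
from t_part;
EOF

# Run the script only when the hive CLI is installed (assumption: hive -f works here).
if command -v hive >/dev/null 2>&1; then
  hive -f "$hql_file"
else
  echo "hive not found; generated script is at $hql_file"
fi
```

The substring offsets assume orderdate is a string of the form yyyy-MM-dd; a different date format would need different offsets.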

3. Repairing partitions with msck

Again, first copy the source table's schema:

hive> create table t_copy like t_part;

Then run:

hdfs@master$ hadoop fs -cp /data/use/hive/warehouse/fdm.db/t_part/* /data/use/hive/warehouse/fdm.db/t_copy/

Finally, repair the new table's partition metadata:

hive> msck repair table t_copy;

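This also copies a partitioned table, and it is faster than the dynamic-partition insert because the data is copied with an HDFS file copy instead of a MapReduce job. msck repair table then registers the copied partition directories in the metastore.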
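The three steps can be strung together in one script. This is a sketch under stated assumptions: the database name fdm is inferred from the fdm.db directory in the path above, and the helper names run and copy_partitioned_table are hypothetical. By default it runs in dry-run mode and only prints the commands.

```shell
SRC_TABLE="t_part"
DST_TABLE="t_copy"
WAREHOUSE="/data/use/hive/warehouse/fdm.db"  # warehouse path from the example above
DRY_RUN="${DRY_RUN:-1}"                      # 1 = print commands only, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

copy_partitioned_table() {
  # 1. Copy the schema (assumption: the tables live in a database named fdm).
  run hive -e "use fdm; create table ${DST_TABLE} like ${SRC_TABLE};"
  # 2. Copy the partition directories; the glob is quoted so that hadoop,
  #    not the local shell, expands it.
  run hadoop fs -cp "${WAREHOUSE}/${SRC_TABLE}/*" "${WAREHOUSE}/${DST_TABLE}/"
  # 3. Register the copied partitions in the metastore.
  run hive -e "use fdm; msck repair table ${DST_TABLE};"
}

copy_partitioned_table
```

Setting DRY_RUN=0 executes the commands for real, which requires the hive and hadoop CLIs and write access to the warehouse directory.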

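Emptying the trash

Command to empty the trash: hdfs dfs -expunge (this permanently deletes trash checkpoints older than the retention threshold and creates a new checkpoint)
Or delete the .Trash directory directly: hadoop fs -rmr .Trash (on current Hadoop releases, where -rmr is deprecated: hadoop fs -rm -r .Trash)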

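Note: trash data in HDFS is kept under the directory

/user/$USER/.Trash/Current/user/$USER/

that is, the original absolute path appended to /user/$USER/.Trash/Current.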
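The path mapping can be written as a small helper. This is an illustrative sketch only; the function name trash_path is hypothetical, and it covers just the Current checkpoint, not older timestamped checkpoints.

```shell
# Map an absolute HDFS path to its location in the given user's trash.
trash_path() {
  local user="$1" path="$2"
  printf '/user/%s/.Trash/Current%s\n' "$user" "$path"
}

trash_path grid /user/grid/in
# prints /user/grid/.Trash/Current/user/grid/in
```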
Inspecting the trash

[hdfs@master hadoop-0.20.2]$ bin/hadoop dfs -ls /user/grid/.Trash

Restoring data from the trash

[hdfs@master hadoop-0.20.2]$ bin/hadoop dfs -mv /user/grid/.Trash/Current/user/grid/in /user/grid/in
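Because the trash keeps the original absolute path, the restore target can be derived from the trash path itself. The sketch below uses hypothetical helper names (original_path, restore_cmd) and only prints the move command instead of running it; it assumes the item sits in the Current checkpoint.

```shell
# Recover the original location by stripping the
# /user/<user>/.Trash/Current prefix from a trash path.
original_path() {
  local trashed="$1"
  printf '%s\n' "${trashed#/user/*/.Trash/Current}"
}

# Print (not run) the move command that would restore the item.
restore_cmd() {
  local trashed="$1"
  printf 'hadoop fs -mv %s %s\n' "$trashed" "$(original_path "$trashed")"
}

restore_cmd /user/grid/.Trash/Current/user/grid/in
# prints hadoop fs -mv /user/grid/.Trash/Current/user/grid/in /user/grid/in
```

Printing the command first makes it easy to check the target path before anything is moved.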
