hive 同一个表不同分区的数据求和

来源:互联网 发布:网络信息抓取软件 合法 编辑:程序博客网 时间:2024/05/29 04:36

业务表如下:

hive> desc sc_kekaoxing;OKid                      string                  可靠小表id              car_id                  string                  car表id              month                   string                  月份                  name                    string                  可靠名称父级              count                   string                  可靠数量                dt                      string                                      dp                      string                                      # Partition Information          # col_name              data_type               comment             dt                      string                                      dp                      string                                      Time taken: 0.128 seconds, Fetched: 13 row(s)
hive> select * from sc_kekaoxing limit 5;OK1       10      201612  车身外观        5       201612  qiche1       11      201612  车身外观        6       201612  qiche1       12      201612  车身外观        7       201612  qiche1       13      201612  车身外观        2       201612  qiche1       14      201612  车身外观        17      201612  qicheTime taken: 0.386 seconds, Fetched: 5 row(s)

业务需求:不同dt分区下的字段count的值需要累加:

INSERT OVERWRITE local directory '/tmp/t01'row format delimited fields terminated by '\t'select if(aa.id is null,bb.id,aa.id),if(aa.car_id is null,bb.car_id,aa.car_id),'201702',if(aa.name is null,bb.name,aa.name),case when aa.count is not null and bb.count is not null   then aa.count+bb.count when aa.count is null and bb.count is not null   then bb.count when aa.count is not null and bb.count is null   then aa.countelse 0 endfrom   (select * from sc_kekaoxing where dt='201612') aa full outer join   (select * from sc_kekaoxing where dt='201702') bb on aa.id=bb.id and aa.car_id=bb.car_id and aa.name=bb.name;
0 0