表关联优化方法分享

来源:互联网 发布:c语言关键字是什么 编辑:程序博客网 时间:2024/06/05 14:42

在数据库中,表与表之间的关联,通过JOIN连接。可以理解为“横向关联”,如果是多个大表,“横向关联”,效率比较慢;

 “纵向关联”:UNION每个表,再GROUPBY去重,得到“关联”的效果。“纵向关联”效率比“横向关联”强很多。

举例:T1,T2,T3,T4,T5,每个表有5000万条数据。

“横向关联”: JOIN关联,实际是5000万*5000万*5000万*5000万*5000万,实际是在笛卡尔集基础上,再过滤条件。【Hadoop:超过几个小时】

 

“纵向关联”:5000万+5000万+5000万+5000万+5000万。【Hadoop:30分内】

“纵向关联”示例代码:

INSERT OVERWRITE TABLET_DM_SCORE

SELECT   

 '20160606',

        CUST_NO,

  ID_NUMBER,

 NAME,

        MAX(SCORE1) AS SCORE1,

        MAX(SCORE2) AS SCORE2,

        MAX(SCORE3) AS SCORE3,

        MAX(SCORE4) AS SCORE4,

        MAX(SCORE5) AS SCORE5,

        0 AS SCORE6,

        0 AS SCORE7,

        (MAX(SCORE1) + MAX(SCORE2) + MAX(SCORE3) + MAX(SCORE4) + MAX(SCORE5)) ASSCORE_ALL,

        MAX(ASSET_MAVG_SUM) AS ASSET_MAVG_SUM

FROM

(SELECTCUST_NO, CERT_ID AS ID_NUMBER, CUST_NAME AS NAME,

        -90000 AS SCORE1, -90000 AS SCORE2, -90000 AS SCORE3, -90000 AS SCORE4, -90000AS SCORE5, NVL(ASSET_MAVG_SUM,0) AS ASSET_MAVG_SUM

        FROM T_GET_WAIVER_ALL_CUST_BPH_INFO T0

UNION ALL

SELECTCUST_NO, ID_NUMBER, NAME, SCORE AS SCORE1, -90000 AS SCORE2, -90000 AS SCORE3,-90000 AS SCORE4, -90000 AS SCORE5, -90000 AS ASSET_MAVG_SUM

FROM T_DM_ABILITY_SCORET1

UNION ALL

SELECTCUST_NO, ID_NUMBER, NAME, -90000 AS SCORE1, SCORE AS SCORE2, -90000 AS SCORE3,-90000 AS SCORE4, -90000 AS SCORE5, -90000 AS ASSET_MAVG_SUM

FROM T_DM_ACTIVETY_SCORET2

UNION ALL

SELECTCUST_NO, ID_NUMBER, NAME, -90000 AS SCORE1, -90000 AS SCORE2, SCORE AS SCORE3,-90000 AS SCORE4, -90000 AS SCORE5, -90000 AS ASSET_MAVG_SUM

FROM T_DM_BEHAVIOR_SCORET3

UNION ALL

SELECTCUST_NO, ID_NUMBER, NAME, -90000 AS SCORE1, -90000 AS SCORE2, -90000 AS SCORE3,SCORE AS SCORE4, -90000 AS SCORE5, -90000 AS ASSET_MAVG_SUM

FROM T_DM_CREDIT_SCORET4

UNION ALL

SELECTCUST_NO, ID_NUMBER, NAME, -90000 AS SCORE1, -90000 AS SCORE2, -90000 AS SCORE3,-90000 AS SCORE4, SCORE AS SCORE5, -90000 AS ASSET_MAVG_SUM

FROM T_DM_IDENTITY_SCORET5

)TA

GROUPBY  CUST_NO, ID_NUMBER, NAME;

 

“纵向关联”理解:第2步骤为UNION合并表,第3步GROUP BY去重,达到“关联”的目的





注:不适用一对多的关系表关联


0 0
原创粉丝点击