InfoCube Compression and Aggregation Internals

What is InfoCube compression:
To answer this, we need to go into the specifics of how data gets updated into the cube.
A cube has two fact tables.
The F Fact table, or the uncompressed fact table: this table is partitioned based on request ID.
In other words, the F Fact table has a partition for each data load, identified by its request ID. The partitioning index for the cube is the 900 index or the P Index.

Then there is the E Fact Table, or the compressed fact table. The E Fact Table may or may not be partitioned, depending on the cube design. If the cube is partitioned, then the E Fact table is partitioned.

What happens during InfoCube compression

The selected requests in the F Table are summarized and then inserted into the E Fact Table as a single request. In other words, the requests are merged together by summarizing them.
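
The summarization step can be pictured with a minimal Python sketch. The row layout, the sample values, and the `compress` helper are hypothetical and only illustrate the idea; the one detail taken from the job log further down is that the summarized rows carry 0 as their request (package) ID, which is also why request-based deletion no longer works afterwards.

```python
from collections import defaultdict

# Hypothetical F-table rows: (request_id, customer, product, sales)
f_rows = [
    (101, "ABC", "sprocket", 100),
    (102, "ABC", "sprocket", 50),
    (102, "XYZ", "gear", 30),
]

def compress(rows):
    """Sum the key figure per characteristic combination, dropping the request ID."""
    totals = defaultdict(int)
    for _request_id, customer, product, sales in rows:
        totals[(customer, product)] += sales
    # Request ID 0 stands for "compressed"; the original request is no longer visible.
    return [(0, customer, product, sales)
            for (customer, product), sales in sorted(totals.items())]

e_rows = compress(f_rows)
print(e_rows)
```

Two requests that touch the same customer and product end up as one row, which is where the reduction in record count comes from.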

Advantages of InfoCube compression:
Query access is faster
Cube size comes down - this helps in tasks like rebuild of aggregates etc.
Indexes are better maintained
Data Load is more efficient because the F Table has fewer requests and less data.

Disadvantages of InfoCube compression:
Request-based deletion is not possible after compression. Only selective deletion is possible.


Let us look at the basic compression job. This job log was taken from the SM37 monitor.
Here references to actual cubes have been removed, and Key Figures and characteristics are highlighted.

Compression starts here - parameters for compression:
Zero elimination


What is zero elimination?
If a record has been nullified in the cube, it may currently be listed as:
Customer | Product  | Sales
ABC      | sprocket |  100
ABC      | sprocket | -100

The DSO will be reading zero for the same record.

If you have zero suppression in the query, then in all likelihood this record is not reported. In such a scenario the record can be removed from the cube to reduce the number of records.
Zero elimination is basically for records in the fact table that do not have any facts. In other words, all the key figures for the record in the fact table are zero.
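
As a rough illustration of that rule, the sketch below drops a summarized record only when every key figure is zero. The record layout and the `eliminate_zero_records` helper are assumptions made for this example, not part of BW.

```python
def eliminate_zero_records(records, key_figures):
    """Keep a record only if at least one of its key figures is non-zero."""
    return [r for r in records if any(r[kf] != 0 for kf in key_figures)]

# Hypothetical records after summarization (100 and -100 have already netted to 0).
summarized = [
    {"customer": "ABC", "product": "sprocket", "sales": 0, "quantity": 0},
    {"customer": "ABC", "product": "gear", "sales": 0, "quantity": 3},
    {"customer": "XYZ", "product": "bolt", "sales": 40, "quantity": 1},
]

remaining = eliminate_zero_records(summarized, ["sales", "quantity"])
print(remaining)
```

The gear record survives even though its sales are zero, because one key figure still carries a value.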



Job sequence

Job started
Step 001 started (program RSCOMP1, variant &0000000001187, user ID <USERID>)
Performing check and potential update for status control table             

A preliminary check is done to find out how many requests are to be compressed and whether any request or the cube is locked for some reason.

FB RSM1_CHECK_DM_GOT_REQUEST called from PRG RSSM_PROCESS_COMPRESS; row 000200
Request '465,550'; DTA '<CUBE>'; action 'C'; with dialog 'X'                     
Leave RSM1_CHECK_DM_GOT_REQUEST in row 70; Req_State ''                             
FB RSM1_CHECK_DM_GOT_REQUEST called from PRG RSSM_PROCESS_COMPRESS; row 000200      
Request '466,617'; DTA '<CUBE>'; action 'C'; with dialog 'X'                     
Leave RSM1_CHECK_DM_GOT_REQUEST in row 70; Req_State ''                             
FB RSM1_CHECK_DM_GOT_REQUEST called from PRG RSSM_PROCESS_COMPRESS; row 000200      
Request '467,673'; DTA '<CUBE>'; action 'C'; with dialog 'X'                     
Leave RSM1_CHECK_DM_GOT_REQUEST in row 70; Req_State ''                             
FB RSM1_CHECK_DM_GOT_REQUEST called from PRG RSSM_PROCESS_COMPRESS; row 000200      
Request '468,424'; DTA '<CUBE>'; action 'C'; with dialog 'X'                     
Leave RSM1_CHECK_DM_GOT_REQUEST in row 70; Req_State ''                             
FB RSM1_CHECK_DM_GOT_REQUEST called from PRG RSSM_PROCESS_COMPRESS; row 000200      
Request '469,143'; DTA '<CUBE>'; action 'C'; with dialog 'X'


This also sets the clock symbol against the requests on the Manage tab, where each request is shown as being compressed / summarized.

Leave RSM1_CHECK_DM_GOT_REQUEST in row 70; Req_State ''                             
RSS2_DTP_RNR_SUBSEQ_PROC_SET GET_INSTANCE_FOR_RNR    469143 LINE 43                 
RSS2_DTP_RNR_SUBSEQ_PROC_SET GET_TSTATE_FOR_RNR 7 LINE 243                          
RSS2_DTP_RNR_SUBSEQ_PROC_SET GET_INSTANCE_FOR_RNR    468424 LINE 43                 
RSS2_DTP_RNR_SUBSEQ_PROC_SET GET_TSTATE_FOR_RNR 7 LINE 243                          
RSS2_DTP_RNR_SUBSEQ_PROC_SET GET_INSTANCE_FOR_RNR    467673 LINE 43                 
RSS2_DTP_RNR_SUBSEQ_PROC_SET GET_TSTATE_FOR_RNR 7 LINE 243                          
RSS2_DTP_RNR_SUBSEQ_PROC_SET GET_INSTANCE_FOR_RNR    466617 LINE 43                 
RSS2_DTP_RNR_SUBSEQ_PROC_SET GET_TSTATE_FOR_RNR 7 LINE 243                          
RSS2_DTP_RNR_SUBSEQ_PROC_SET GET_INSTANCE_FOR_RNR    465550 LINE 43                 
RSS2_DTP_RNR_SUBSEQ_PROC_SET GET_TSTATE_FOR_RNR 7 LINE 243                          

Now the summarization query is fired. It is rather long, but it carries a lot of information.

This query runs against the F Table: the records for the selected requests are summarized.
(Notice that the Key Figures, KF, are all summed up.)

SQL:   <USERID>
ALTER TABLE "/BIC/F<CUBE>" MONITORING
SQL-END:   00:00:00
MERGE /*+ USE_NL ( FACT E ) INDEX( E,"/BIC/E<CUBE>~P" ) */ INTO "/BIC/E<CUBE>" E
USING ( SELECT /*+ PARALLEL ( FACT , 3 ) */ 0 "PDIMID" , "KEY_<CUBE>T" ,
"KEY_<CUBE>U" , "KEY_<CUBE>1" , "KEY_<CUBE>2" , "KEY_<CUBE>3" , "KEY_<CUBE>4" ,
"KEY_<CUBE>5" , "SID_0CALMONTH" ,
SUM( "KF1" ) AS "KF1" , SUM( "KF2" ) AS "KF2" , SUM( "KF3" ) AS "KF3" ,
SUM( "KF4" ) AS "KF4" , SUM( "KF5" ) AS "KF5" , SUM( "KF6" ) AS "KF6" ,
SUM( "KF7" ) AS "KF7" , SUM( "KF8" ) AS "KF8" , SUM( "KF9" ) AS "KF9" ,
SUM( "KF10" ) AS "KF10" , SUM( "KF11" ) AS "KF11" , SUM( "KF12" ) AS "KF12" ,
SUM( "KF13" ) AS "KF13" , SUM( "KF14" ) AS "KF14" , SUM( "KF15" ) AS "KF15" ,
SUM( "KF16" ) AS "KF16" , SUM( "KF17" ) AS "KF17" , SUM( "KF18" ) AS "KF18" ,
SUM( "KF19" ) AS "KF19" , SUM( "KF20" ) AS "KF20" , SUM( "KF21" ) AS "KF21" ,
SUM( "KF22" ) AS "KF22" , SUM( "KF23" ) AS "KF23" , SUM( "KF24" ) AS "KF24" ,
SUM( "KF25" ) AS "KF25" , SUM( "KF26" ) AS "KF26" , SUM( "KF27" ) AS "KF27" ,
SUM( "KF28" ) AS "KF28" , SUM( "KF29" ) AS "KF29"
FROM "/BIC/F<CUBE>" FACT
WHERE "KEY_<CUBE>P" = 74 AND "KEY_<CUBE>T" IN (4751 ,4755 ,4756 ,4757 ,4759 ,4777 ,
<< Request numbers Selected >>
4778 ,4779 ,4784 ,4790 ,47938 ,4804 ,4806 ,4807 ,4817 ,4818 ,4828 ,4831 ,4835 ,4836 ,
4837 ,4839 ,4845 ,4850 ,4851 ,48853 ,4854 ,4855 ,4856 ,4857 ,4858 ,4859 ,48601 ,4862 ,
4863 ,4864 ,4865 ,4866 ,4867 ,4868 ,4874 ,4876 ,4881 ,4882 ,4886 ,4887 ,4888 ,48890 ,
4891 ,4892 ,4893 ,4894 ,4895 ,4896 ,48978 ,4899 ,4900 ,4901 ,4902 ,4903 ,4904 ,4905 ,
,4907 ,4908 ,4937 ,4938 ,4939 ,4940 ,4941 ,49943 ,4944 ,4945 ,4946 ,4947)
GROUP BY "KEY_<CUBE>T" , "KEY_<CUBE>U" , "KEY_<CUBE>1" , "KEY_<CUBE>2" , "KEY_<CUBE>3" ,
"KEY_<CUBE>4" , "KEY_<CUBE>5" , "SID_0CALMONTH"
HAVING ( SUM( "KF1" ) <> 0 ) OR ( SUM( "KF2" ) <> 0 ) OR ( SUM( "KF3" ) <> 0 ) OR
( SUM( "KF4" ) <> 0 ) OR ( SUM( "KF5" ) <> 0 ) OR ( SUM( "KF6" ) <> 0 ) OR
( SUM( "KF7" ) <> 0 ) OR ( SUM( "KF8" ) <> 0 ) OR ( SUM( "KF9" ) <> 0 ) OR
( SUM( "KF10" ) <> 0 ) OR ( SUM( "KF11" ) <> 0 ) OR ( SUM( "KF12" ) <> 0 ) OR
( SUM( "KF13" ) <> 0 ) OR ( SUM( "KF14" ) <> 0 ) OR ( SUM( "KF15" ) <> 0 ) OR
( SUM( "KF16" ) <> 0 ) OR ( SUM( "KF17" ) <> 0 ) OR ( SUM( "KF18" ) <> 0 ) OR
( SUM( "KF19" ) <> 0 ) OR ( SUM( "KF20" ) <> 0 ) OR ( SUM( "KF21" ) <> 0 ) OR
( SUM( "KF22" ) <> 0 ) OR ( SUM( "KF23" ) <> 0 ) OR ( SUM( "KF24" ) <> 0 ) OR
( SUM( "KF25" ) <> 0 ) OR ( SUM( "KF26" ) <> 0 ) OR ( SUM( "KF27" ) <> 0 ) OR
( SUM( "KF28" ) <> 0 ) OR ( SUM( "KF29" ) <> 0 ) ) F
ON ( E."KEY_<CUBE>P" = "PDIMID" AND E."KEY_<CUBE>T" = F."KEY_<CUBE>T" AND
E."KEY_<CUBE>U" = F."KEY_<CUBE>U" AND E."KEY_<CUBE>1" = F."KEY_<CUBE>1" AND
E."KEY_<CUBE>2" = F."KEY_<CUBE>2" AND E."KEY_<CUBE>3" = F."KEY_<CUBE>3" AND
E."KEY_<CUBE>4" = F."KEY_<CUBE>4" AND E."KEY_<CUBE>5" = F."KEY_<CUBE>5" AND
E."SID_0CALMONTH" = F."SID_0CALMONTH" )
WHEN NOT MATCHED THEN INSERT ( E."KEY_<CUBE>P" , E."KEY_<CUBE>T" , E."KEY_<CUBE>U" ,
E."KEY_<CUBE>1" , E."KEY_<CUBE>2" , E."KEY_<CUBE>3" , E."KEY_<CUBE>4" , E."KEY_<CUBE>5" ,
E."SID_0CALMONTH" , E."KF1" , E."KF2" , E."KF3" , E."KF4" , E."KF5" , E."KF6" ,
E."KF7" , E."KF8" , E."KF9" , E."KF10" , E."KF11" , E."KF12" , E."KF13" , E."KF14" ,
E."KF15" , E."KF16" , E."KF17" , E."KF18" , E."KF19" , E."KF20" , E."KF21" , E."KF22" ,
E."KF23" , E."KF24" , E."KF25" , E."KF26" , E."KF27" , E."KF28" , E."KF29" )
VALUES ( "PDIMID" , F."KEY_<CUBE>T" , F."KEY_<CUBE>U" ,
F."KEY_<CUBE>1" , F."KEY_<CUBE>2" , F."KEY_<CUBE>3" , F."KEY_<CUBE>4" , F."KEY_<CUBE>5" ,
F."SID_0CALMONTH" , F."KF1" , F."KF2" , F."KF3" , F."KF4" , F."KF5" , F."KF6" ,
F."KF7" , F."KF8" , F."KF9" , F."KF10" , F."KF11" , F."KF12" , F."KF13" , F."KF14" ,
F."KF15" , F."KF16" , F."KF17" , F."KF18" , F."KF19" , F."KF20" , F."KF21" , F."KF22" ,
F."KF23" , F."KF24" , F."KF25" , F."KF26" , F."KF27" , F."KF28" , F."KF29" )
WHEN MATCHED THEN UPDATE /*+ INDEX("/BIC/E<CUBE>" "/BIC/E<CUBE>~P") */ SET
E."KF1" = E."KF1" + F."KF1" , E."KF2" = E."KF2" + F."KF2" ,
E."KF3" = E."KF3" + F."KF3" , E."KF4" = E."KF4" + F."KF4" ,
E."KF5" = E."KF5" + F."KF5" , E."KF6" = E."KF6" + F."KF6" ,
E."KF7" = E."KF7" + F."KF7" , E."KF8" = E."KF8" + F."KF8" ,
E."KF9" = E."KF9" + F."KF9" , E."KF10" = E."KF10" + F."KF10" ,
E."KF11" = E."KF11" + F."KF11" , E."KF12" = E."KF12" + F."KF12" ,
E."KF13" = E."KF13" + F."KF13" , E."KF14" = E."KF14" + F."KF14" ,
E."KF15" = E."KF15" + F."KF15" , E."KF16" = E."KF16" + F."KF16" ,
E."KF17" = E."KF17" + F."KF17" , E."KF18" = E."KF18" + F."KF18" ,
E."KF19" = E."KF19" + F."KF19" , E."KF20" = E."KF20" + F."KF20" ,
E."KF21" = E."KF21" + F."KF21" , E."KF22" = E."KF22" + F."KF22" ,
E."KF23" = E."KF23" + F."KF23" , E."KF24" = E."KF24" + F."KF24" ,
E."KF25" = E."KF25" + F."KF25" , E."KF26" = E."KF26" + F."KF26" ,
E."KF27" = E."KF27" + F."KF27" , E."KF28" = E."KF28" + F."KF28" ,
E."KF29" = E."KF29" + F."KF29"
DELETE WHERE ( E."KF1" = 0 AND E."KF2" = 0 AND E."KF3" = 0 AND E."KF4" = 0 AND
E."KF5" = 0 AND E."KF6" = 0 AND E."KF7" = 0 AND E."KF8" = 0 AND
E."KF9" = 0 AND E."KF10" = 0 AND E."KF11" = 0 AND E."KF12" = 0 AND
E."KF13" = 0 AND E."KF14" = 0 AND E."KF15" = 0 AND E."KF16" = 0 AND
E."KF17" = 0 AND E."KF18" = 0 AND E."KF19" = 0 AND E."KF20" = 0 AND
E."KF21" = 0 AND E."KF22" = 0 AND E."KF23" = 0 AND E."KF24" = 0 AND
E."KF25" = 0 AND E."KF26" = 0 AND E."KF27" = 0 AND E."KF28" = 0 AND
E."KF29" = 0 )
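
The MERGE statement above can be read as an upsert with zero elimination built in. The following Python sketch mirrors its four parts under simplifying assumptions: tables are plain dictionaries keyed by the characteristic combination, and `merge_into_e` is a hypothetical name, not an SAP or Oracle function.

```python
def merge_into_e(e_table, f_summary):
    """Apply summarized F rows to the E table, mimicking the MERGE in the log.

    Both arguments map a characteristic key to a list of key figure values.
    """
    for key, f_values in f_summary.items():
        if all(value == 0 for value in f_values):
            continue  # HAVING: all-zero summaries never reach the E table
        if key not in e_table:
            e_table[key] = list(f_values)  # WHEN NOT MATCHED THEN INSERT
            continue
        # WHEN MATCHED THEN UPDATE: add the F values to the existing E values
        updated = [e + f for e, f in zip(e_table[key], f_values)]
        if all(value == 0 for value in updated):
            del e_table[key]  # DELETE WHERE: the updated row has no facts left
        else:
            e_table[key] = updated
    return e_table

e_table = {("ABC", "sprocket"): [100, 2], ("XYZ", "gear"): [30, 1]}
f_summary = {
    ("ABC", "sprocket"): [-100, -2],  # cancels the existing row
    ("XYZ", "gear"): [5, 0],          # adds to the existing row
    ("NEW", "bolt"): [7, 1],          # new combination
    ("ZERO", "nut"): [0, 0],          # filtered out before the merge
}
print(merge_into_e(e_table, f_summary))
```

Doing the update, insert, and delete in one pass is what lets the database handle each request with a single statement rather than separate lookups per record.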

Now a similar query is fired on the E Table to determine how the records get inserted.

After this, the F Table request partitions are dropped since they are no longer required.
The F Table partitions are based on requests, and once compressed the partitions become empty.

ALTER TABLE "/BIC/F<CUBE>" DROP PARTITION /BIC/F<CUBE>0000000079
SQL-END:  12:46:38 00:00:01
SQL:  12:46:38 <USERID>
ALTER TABLE "/BIC/F<CUBE>" DROP PARTITION /BIC/F<CUBE>0000000078
SQL-END:  12:46:39 00:00:01
SQL:  12:46:39 <USERID>
ALTER TABLE "/BIC/F<CUBE>" DROP PARTITION /BIC/F<CUBE>0000000077
SQL-END:  12:46:40 00:00:01
SQL:  12:46:40 <USERID>
ALTER TABLE "/BIC/F<CUBE>" DROP PARTITION /BIC/F<CUBE>0000000076
SQL-END:  12:46:44 00:00:04
SQL:  12:46:44 <USERID>
ALTER TABLE "/BIC/F<CUBE>" DROP PARTITION /BIC/F<CUBE>0000000075
SQL-END:  12:46:45 00:00:01

Now the compressed status for the requests is updated in the respective tables.
 
RSS2_DTP_RNR_SUBSEQ_PROC_SET GET_INSTANCE_FOR_RNR    469143 LINE 43                 
RSS2_DTP_RNR_SUBSEQ_PROC_SET GET_TSTATE_FOR_RNR 7 LINE 273                          
Status transition 7 / 7 to 8 / 8 completed successfully                             
RSS2_DTP_RNR_SUBSEQ_PROC_SET SET_TSTATE_FURTHER_OK LINE 317                         
RSS2_DTP_RNR_SUBSEQ_PROC_SET GET_INSTANCE_FOR_RNR    468424 LINE 43                 
RSS2_DTP_RNR_SUBSEQ_PROC_SET GET_TSTATE_FOR_RNR 7 LINE 273                          
Status transition 7 / 7 to 8 / 8 completed successfully                             
RSS2_DTP_RNR_SUBSEQ_PROC_SET SET_TSTATE_FURTHER_OK LINE 317                         
RSS2_DTP_RNR_SUBSEQ_PROC_SET GET_INSTANCE_FOR_RNR    467673 LINE 43                 
RSS2_DTP_RNR_SUBSEQ_PROC_SET GET_TSTATE_FOR_RNR 7 LINE 273                          
Status transition 7 / 7 to 8 / 8 completed successfully                             
RSS2_DTP_RNR_SUBSEQ_PROC_SET SET_TSTATE_FURTHER_OK LINE 317                         
RSS2_DTP_RNR_SUBSEQ_PROC_SET GET_INSTANCE_FOR_RNR    466617 LINE 43                 
RSS2_DTP_RNR_SUBSEQ_PROC_SET GET_TSTATE_FOR_RNR 7 LINE 273                          
Status transition 7 / 7 to 8 / 8 completed successfully                             
RSS2_DTP_RNR_SUBSEQ_PROC_SET SET_TSTATE_FURTHER_OK LINE 317                         
RSS2_DTP_RNR_SUBSEQ_PROC_SET GET_INSTANCE_FOR_RNR    465550 LINE 43                 
RSS2_DTP_RNR_SUBSEQ_PROC_SET GET_TSTATE_FOR_RNR 7 LINE 273                          
Status transition 7 / 7 to 8 / 8 completed successfully                             
RSS2_DTP_RNR_SUBSEQ_PROC_SET SET_TSTATE_FURTHER_OK LINE 317                         
InfoCube <CUBE> Successfully Compressed Up To Request 469,143                    
Aggregation of InfoCube <CUBE> to request 469143                                 
Zero elimination switched on                
                                       

Now the insert of records into the E Table is done and the number of records is indicated.

Mass upsert of transaction data executed (982130 data records)                      
Mass insert of transaction data executed (105234 data records)                      
P-DIMID 74 deleted from table /BIC/D<CUBE>P (InfoCube <CUBE>)                 
Request 465550 was summarized correctly                                             
Mass upsert of transaction data executed (299651 data records)                      
Mass insert of transaction data executed (279998 data records)                      
P-DIMID 75 deleted from table /BIC/D<CUBE>P (InfoCube <CUBE>)                 
Request 466617 was summarized correctly                                             
Mass upsert of transaction data executed (153193 data records)                      
P-DIMID 76 deleted from table /BIC/D<CUBE>P (InfoCube <CUBE>)                 
Request 467673 was summarized correctly                                             
Mass upsert of transaction data executed (11378 data records)                       
P-DIMID 77 deleted from table /BIC/D<CUBE>P (InfoCube <CUBE>)                 
Request 468424 was summarized correctly                                             
Mass upsert of transaction data executed (389506 data records)                      
P-DIMID 78 deleted from table /BIC/D<CUBE>P (InfoCube <CUBE>)                 
Request 469143 was summarized correctly                                             
DB Statistics: 00000000 selects, 02221090 inserts, 00000000 updates, deletes 00000000

Net updates and inserts in the E Table are given in the DB Statistics.
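
The figures in the log can be cross-checked with a little arithmetic. The sketch below only re-adds the record counts printed in the log; grouping them by request follows the order of the "Request ... was summarized correctly" lines and is an interpretation, not something the log states explicitly.

```python
# Record counts from the "Mass upsert" / "Mass insert" lines, grouped by the
# request whose "was summarized correctly" line follows them in the log.
records_per_request = {
    465550: [982130, 105234],
    466617: [299651, 279998],
    467673: [153193],
    468424: [11378],
    469143: [389506],
}

per_request_total = {req: sum(counts) for req, counts in records_per_request.items()}
grand_total = sum(per_request_total.values())

for req, total in per_request_total.items():
    print(req, total)
print("total", grand_total)
```

The grand total comes to 2221090, the insert count reported in the DB Statistics line, and the per-request totals (for example 1087364 for request 465550) agree with the "deleted (... data records)" lines that follow.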

I am not too sure why there are multiple repetitions of the insert into the E Table ...

Request with P-Dimid 469143 from InfoCube <CUBE> deleted (389506 data records)   
Request with P-Dimid 468424 from InfoCube <CUBE> deleted (11378 data records)    
Request with P-Dimid 467673 from InfoCube <CUBE> deleted (153193 data records)   
Request with P-Dimid 466617 from InfoCube <CUBE> deleted (579649 data records)   
Request with P-Dimid 465550 from InfoCube <CUBE> deleted (1087364 data records)  
Aggregation run ended successfully                                                  
InfoCube <CUBE> Successfully Compressed Up To Request 469,143                    
Aggregation of InfoCube <CUBE> to request 469143                                 
Zero elimination switched on                                                        
Mass upsert of transaction data executed (982130 data records)                      
Mass insert of transaction data executed (105234 data records)                      
P-DIMID 74 deleted from table /BIC/D<CUBE>P (InfoCube <CUBE>)                 
Request 465550 was summarized correctly                                             
Mass upsert of transaction data executed (299651 data records)                      
Mass insert of transaction data executed (279998 data records)                      
P-DIMID 75 deleted from table /BIC/D<CUBE>P (InfoCube <CUBE>)                 
Request 466617 was summarized correctly                                             
Mass upsert of transaction data executed (153193 data records)                      
P-DIMID 76 deleted from table /BIC/D<CUBE>P (InfoCube <CUBE>)                 
Request 467673 was summarized correctly                                             
Mass upsert of transaction data executed (11378 data records)                       
P-DIMID 77 deleted from table /BIC/D<CUBE>P (InfoCube <CUBE>)                 
Request 468424 was summarized correctly                                             
Mass upsert of transaction data executed (389506 data records)                      
P-DIMID 78 deleted from table /BIC/D<CUBE>P (InfoCube <CUBE>)                 
Request 469143 was summarized correctly                                             
DB Statistics: 00000000 selects, 02221090 inserts, 00000000 updates, deletes 00000000
Request with P-Dimid 469143 from InfoCube <CUBE> deleted (389506 data records)   
Request with P-Dimid 468424 from InfoCube <CUBE> deleted (11378 data records)    
Request with P-Dimid 467673 from InfoCube <CUBE> deleted (153193 data records)   
Request with P-Dimid 466617 from InfoCube <CUBE> deleted (579649 data records)   
Request with P-Dimid 465550 from InfoCube <CUBE> deleted (1087364 data records)  
Aggregation run ended successfully                                                  

Aggregation is referred to here since the records are aggregated and then inserted into the E Fact Table.

InfoCube <CUBE> Successfully Compressed Up To Request 469,143                    
Job finished                                                                        

 

I have made an attempt to explain what happens during compression. I find this one of the basic tasks in maintaining an EDW in BI and something that should be done regularly.

Also, compression is an activity that is heavy on archive logs. Whenever triggering compression, it is advisable to monitor archive log growth.

Of course, archive log settings also determine the log generation.

source link: https://www.sdn.sap.com/irj/sdn/weblogs?blog=/pub/wlg/10672
