ENDPOINT_NUMBER column in histograms

来源:互联网 发布:淘宝人群画像 编辑:程序博客网 时间:2024/05/18 01:16

-- Thanks for the question regarding "How to interpret strange values in ENDPOINT_NUMBER column in histograms", version 9.2.0

Submitted on 6-Aug-2008 0:29 Central time zone
Tom's latest followup | Bookmark | Bottom
Last updated 8-Aug-2008 13:10

You Asked

Tom, consider this:

SQL> column table_name format a11SQL> column column_name format a11SQL> column column_name format a11SQL> column endpoint_number format 999,999SQL> column endpoint_value format 9.999999EEEESQL> column endpoint_actual_value format a22SQL> column owner format A8SQL> SQL> create table test as select owner from dba_objects where owner < 'C'  2  /Table created.SQL> select owner, count(*) from test group by owner  2  /OWNER      COUNT(*)-------- ----------ABM             464AHL             518AHM             167AK              188ALR             128AMF             106AMS            1557AMV             189AMW             244AP              795APPLSYS        2766APPLSYSPUB       17APPS         140786AR             2211ASF              10ASG             154ASL              53ASN               4ASO             285ASP               4AST              38AX              141AZ               74BEN            2783BIC              70BIL             226BIM             838BIS            1497BIV              39BIX             193BNE             150BOM             787BSC             30033 rows selected.SQL> exec dbms_stats.gather_table_stats(ownname => user, tabname=> 'TEST', method_opt=>'for columns owner size 33');PL/SQL procedure successfully completed.SQL> select * from user_histograms where table_name = 'TEST' and column_name = 'OWNER'  2  /TABLE_NAME  COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALUE----------- ----------- --------------- -------------- ----------------------TEST        OWNER                     0   3.388440E+35TEST        OWNER                     1   3.391283E+35TEST        OWNER                    30   3.391283E+35TEST        OWNER                    31   3.391625E+35TEST        OWNER                    32   3.440973E+35TEST        OWNER                    33   3.443803E+356 rows selected.SQL> exec dbms_stats.gather_table_stats(ownname => user, tabname=> 'TEST', method_opt=>'for columns owner size 40');PL/SQL procedure successfully completed.SQL> select * from user_histograms where table_name = 'TEST' and column_name = 'OWNER'  2  /TABLE_NAME  COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALUE----------- ----------- --------------- -------------- ----------------------TEST        OWNER                     0   3.388440E+35TEST        OWNER                     1   3.391219E+35TEST        OWNER                    37   3.391283E+35TEST        OWNER                    38   3.391625E+35TEST        OWNER                    39   3.441782E+35TEST        OWNER                    40   3.443803E+356 rows selected.SQL> exec dbms_stats.gather_table_stats(ownname => user, tabname=> 'TEST', method_opt=>'for columns owner size 50');PL/SQL procedure successfully completed.SQL> select * from user_histograms where table_name = 'TEST' and column_name = 'OWNER'  2  /TABLE_NAME  COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALUE----------- ----------- --------------- -------------- ----------------------TEST        OWNER                   464   3.388440E+35 ABMTEST        OWNER                   982   3.389657E+35 AHLTEST        OWNER                 1,149   3.389657E+35 AHMTEST        OWNER                 1,337   3.390205E+35 AKTEST        OWNER                 1,465   3.390473E+35 ALRTEST        OWNER                 1,571   3.390666E+35 AMFTEST        OWNER                 3,128   3.390676E+35 AMSTEST        OWNER                 3,317   3.390679E+35 AMVTEST        OWNER                 3,561   3.390679E+35 AMWTEST        OWNER                 4,356   3.391219E+35 APTEST        OWNER                 7,122   3.391283E+35 APPLSYSTEST        OWNER                 7,139   3.391283E+35 APPLSYSPUBTEST        OWNER               147,925   3.391283E+35 APPSTEST        OWNER               150,136   3.391625E+35 ARTEST        OWNER               150,146   3.391883E+35 ASFTEST        OWNER               150,300   3.391884E+35 ASGTEST        OWNER               150,353   3.391888E+35 ASLTEST        OWNER               150,357   3.391889E+35 ASNTEST        OWNER               150,642   3.391890E+35 ASOTEST        OWNER               150,646   3.391891E+35 ASPTEST        OWNER               150,684   3.391894E+35 ASTTEST        OWNER               150,825   3.392841E+35 AXTEST        OWNER               150,899   3.393247E+35 AZTEST        OWNER               153,682   3.440973E+35 BENTEST        OWNER               153,752   3.441775E+35 BICTEST        OWNER               153,978   3.441782E+35 BILTEST        OWNER               154,816   3.441783E+35 BIMTEST        OWNER               156,313   3.441788E+35 BISTEST        OWNER               156,352   3.441790E+35 BIVTEST        OWNER               156,545   3.441792E+35 BIXTEST        OWNER               156,695   3.442791E+35 BNETEST        OWNER               157,482   3.443000E+35 BOMTEST        OWNER               157,782   3.443803E+35 BSC33 rows selected.SQL> drop table test  2  /Table dropped.SQL> select * from v$version  2  /BANNER--------------------------------------------------------------------------------Oracle9i Enterprise Edition Release 9.2.0.7.0 - 64bit ProductionPL/SQL Release 9.2.0.7.0 - ProductionCORE    9.2.0.7.0    ProductionTNS for IBM/AIX RISC System/6000: Version 9.2.0.7.0 - ProductionNLSRTL Version 9.2.0.7.0 - Production5 rows selected.



There are 33 distinct values in OWNER column. After gathering statistics for this column with number of buckets equal and even more than number of distinct values (N between 33 and 44), I see in ENDPOINT_NUMBER column ordinal number of bucket and nothing in ENDPOINT_ACTUAL_VALUE. But when N in 'for columns owner size N' is sufficient greater than the number of distinct values (it seems the rule is "N >= 1.33333*number_of_distinct_values+1") then in ENDPOINT_NUMBER column appears the cumulative numbers of rows (according metalink Note:72539.1) and ENDPOINT_ACTUAL_VALUE is not empty. Here are my questions:

1. Why is it so? In above example I am expecting to see same results after

exec dbms_stats.gather_table_stats(ownname => user, tabname=> 'TEST', method_opt=>'for columns owner size 33');


  and after

exec dbms_stats.gather_table_stats(ownname => user, tabname=> 'TEST', method_opt=>'for columns owner size 50');


2. Why there are only 6 rows (buckets) in USER_HISTOGRAMS when I calling dbms_stats with 'for columns owner size 33' (for example)? I am expecting to see 33 rows (buckets) after this call...
3. Is it better to have in ENDPOINT_NUMBER column cumulative numbers of rows than just bucket numbers? Can optimizer make its decisions more accurately in this case?

and we said...

This happens when you have some values that utterly dominate the other values - as you do - that one really high value can be used to infer the other buckets.

consider the differences between t1 and t2 below:

<code>
ops$tkyte%ORA9IR2> create table t1
2 as
3 select 'user' || mod(rownum,33) owner, object_name
4  from all_objects;

Table created.

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> create table t2
2 as
3 select owner, object_name
4  from all_objects;

Table created.

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> select 't1', owner, count(*) from t1 group by owner union all
2 select 't2', owner, count(*) from t2 group by owner
3 order by 1, 2;

'T OWNER                            COUNT(*)
-- -------------------------------------------- ----------
t1 user0                              929
t1 user1                              930
t1 user10                              930
t1 user11                              930
t1 user12                              930
t1 user13                              930
t1 user14                              930
t1 user15                              930
t1 user16                              930
t1 user17                              930
t1 user18                              929
t1 user19                              929
t1 user2                              930
t1 user20                              929
t1 user21                              929
t1 user22                              929
t1 user23                              929
t1 user24                              929
t1 user25                              929
t1 user26                              929
t1 user27                              929
t1 user28                              929
t1 user29                              929
t1 user3                              930
t1 user30                              929
t1 user31                              929
t1 user32                              929
t1 user4                              930
t1 user5                              930
t1 user6                              930
t1 user7                              930
t1 user8                              930
t1 user9                              930
t2 A                                  1
t2 BIG_TABLE                              2
t2 CTXSYS                              261
t2 DEMO                                1
t2 HR                                  34
t2 MDSYS                              234
t2 ODM                                439
t2 ODM_MTR                              12
t2 OE                                  86
t2 OLAPSYS                              662
t2 OPS$TKYTE                            16
t2 ORDPLUGINS                            29
t2 ORDSYS                              969
t2 OUTLN                                7
t2 PERFSTAT                              86
t2 PM                                  9
t2 PUBLIC                            12285
t2 QS                                  41
t2 QS_ADM                                7
t2 QS_CBADM                              24
t2 QS_CS                                23
t2 QS_ES                                39
t2 QS_OS                                39
t2 QS_WS                                39
t2 SCOTT                                6
t2 SH                                173
t2 SYS                              14073
t2 SYSTEM                              382
t2 USER1                                9
t2 USER2                                9
t2 WKSYS                              279
t2 WMSYS                              129
t2 XDB                                270

66 rows selected.

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T1', method_opt => 'for columns owner size 33' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA9IR2> select * from user_tab_col_statistics where table_name = 'T1';

TABLE_NAME    COLUMN_NAME  NUM_DISTINCT LOW_VALUE HIGH_VALUE  DENSITY NUM_NULLS NUM_BUCKETS LAST_ANAL SAMPLE_SIZE GLO USE AVG_COL_LEN
--------------- --------------- ------------ ---------- ---------- ---------- ---------- ----------- --------- ----------- --- --- -----------
T1        OWNER              33 7573657230 7573657239 .030302545      0      32 06-AUG-08    30674 YES NO        7

ops$tkyte%ORA9IR2> select * from user_histograms where table_name = 'T1' and column_name = 'OWNER';

TABLE_NAME    COLUMN_NAME  ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL
--------------- --------------- --------------- -------------- ---------------
T1        OWNER                0  6.0984E+35
T1        OWNER                1  6.0984E+35
T1        OWNER                2  6.0984E+35
T1        OWNER                3  6.0984E+35
T1        OWNER                4  6.0984E+35
T1        OWNER                5  6.0984E+35
T1        OWNER                6  6.0984E+35
T1        OWNER                7  6.0984E+35
T1        OWNER                8  6.0984E+35
T1        OWNER                9  6.0984E+35
T1        OWNER                10  6.0984E+35
T1        OWNER                11  6.0984E+35
T1        OWNER                12  6.0984E+35
T1        OWNER                13  6.0984E+35
T1        OWNER                14  6.0984E+35
T1        OWNER                15  6.0984E+35
T1        OWNER                16  6.0984E+35
T1        OWNER                17  6.0984E+35
T1        OWNER                18  6.0984E+35
T1        OWNER                19  6.0984E+35
T1        OWNER                20  6.0984E+35
T1        OWNER                21  6.0984E+35
T1        OWNER                22  6.0984E+35
T1        OWNER                23  6.0984E+35
T1        OWNER                24  6.0984E+35
T1        OWNER                25  6.0984E+35
T1        OWNER                26  6.0984E+35
T1        OWNER                27  6.0984E+35
T1        OWNER                28  6.0984E+35
T1        OWNER                29  6.0984E+35
T1        OWNER                30  6.0984E+35
T1        OWNER                31  6.0984E+35
T1        OWNER                33  6.0984E+35

33 rows selected.

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T2', method_opt => 'for columns owner size 33' );

PL/SQL procedure successfully completed.

ops$tkyte%ORA9IR2> select * from user_tab_col_statistics where table_name = 'T2';

TABLE_NAME    COLUMN_NAME 

Reviews   
5 stars Where is dominant value?   August 7, 2008 - 2pm Central time zone
Bookmark | Bottom |
Reviewer: jian huang zheng from China
TomThat is very interesting and enlightening,butIf you use dbms_xplan to view the results, you can see that we just used a low value for the "low values" and the dominating values appear: how does oracle caculate 477 and *KNOW* the dominant value since dominant values are not in the histograms.Thanks!


Followup   August 7, 2008 - 2pm Central time zone:

it does now the value, you just cannot see it, but it can. That big ugly number there - it has everything we need to know.
5 stars could it be the bug in 9i and 10.1?   August 7, 2008 - 3pm Central time zone
Bookmark | Bottom |
Reviewer: Bo Jin 
With 33 distinct values and 33 buckets, you should get a frequency histogram, where endpoint_number is the cumulative number of rows. But instead you got a height balanced histogram, where endpoint_number is the bucket number. Could it be the bug that Jonathan mentioned in page 164 of "Cost base optimizer Oracle fundamentals"? He also said that the bug was fixed in 10.2.


Followup   August 7, 2008 - 3pm Central time zone:

same results in 10gr2 and 11gr1

and everything I see in the problem database says "we meant to do this", meaning - not a bug.

it is the predominant value, dbms_stats is a "program", it gathered them and said "ok - this is what I want to store"
3 stars Continuing the first example   August 8, 2008 - 1am Central time zone
Bookmark | Bottom |
Reviewer: A reader 
Not a bug? Are you sure, Tom?  How about this:<code>SQL> create table test as select owner from dba_objects where owner < 'C' and owner <> 'APPS'  2  /Table created.SQL> select owner, count(*) from test group by owner  2  /OWNER          COUNT(*)------------ ----------ABM                 464AHL                 518AHM                 167AK                  188ALR                 128AMF                 106AMS                1557AMV                 189AMW                 244AP                  795APPLSYS            2791APPLSYSPUB           17AR                 2211ASF                  10ASG                 154ASL                  53ASN                   4ASO                 285ASP                   4AST                  38AX                  141AZ                   74BEN                2783BIC                  70BIL                 226BIM                 838BIS                1497BIV                  39BIX                 193BNE                 150BOM                 787BSC                 30032 rows selected.SQL> exec dbms_stats.gather_table_stats(ownname => user, tabname=> 'TEST', method_opt=>'for columns owner size 33');PL/SQL procedure successfully completed.SQL> SQL> select * from user_histograms where table_name = 'TEST' and column_name = 'OWNER'  2  /TABLE_NAME  COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALUE----------- ----------- --------------- -------------- ----------------------TEST        OWNER                     0   3.388440E+35TEST        OWNER                     1   3.389657E+35TEST        OWNER                     2   3.389657E+35TEST        OWNER                     3   3.390666E+35TEST        OWNER                     6   3.390676E+35TEST        OWNER                     8   3.391219E+35TEST        OWNER                    13   3.391283E+35TEST        OWNER                    18   3.391625E+35TEST        OWNER                    19   3.391890E+35TEST        OWNER                    25   3.440973E+35TEST        OWNER                    27   3.441783E+35TEST        OWNER                    30   3.441788E+35TEST        OWNER                    32   3.443000E+35TEST        OWNER                    33   3.443803E+3514 rows selected.SQL> analyze table test compute statistics for columns owner size 33  2  /Table analyzed.SQL> select * from user_histograms where table_name = 'TEST' and column_name = 'OWNER'  2  /TABLE_NAME  COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALUE----------- ----------- --------------- -------------- ----------------------TEST        OWNER                   464   3.388440E+35 ABMTEST        OWNER                   982   3.389657E+35 AHLTEST        OWNER                 1,149   3.389657E+35 AHMTEST        OWNER                 1,337   3.390205E+35 AKTEST        OWNER                 1,465   3.390473E+35 ALRTEST        OWNER                 1,571   3.390666E+35 AMFTEST        OWNER                 3,128   3.390676E+35 AMSTEST        OWNER                 3,317   3.390679E+35 AMVTEST        OWNER                 3,561   3.390679E+35 AMWTEST        OWNER                 4,356   3.391219E+35 APTEST        OWNER                 7,147   3.391283E+35 APPLSYSTEST        OWNER                 7,164   3.391283E+35 APPLSYSPUBTEST        OWNER                 9,375   3.391625E+35 ARTEST        OWNER                 9,385   3.391883E+35 ASFTEST        OWNER                 9,539   3.391884E+35 ASGTEST        OWNER                 9,592   3.391888E+35 ASLTEST        OWNER                 9,596   3.391889E+35 ASNTEST        OWNER                 9,881   3.391890E+35 ASOTEST        OWNER                 9,885   3.391891E+35 ASPTEST        OWNER                 9,923   3.391894E+35 ASTTEST        OWNER                10,064   3.392841E+35 AXTEST        OWNER                10,138   3.393247E+35 AZTEST        OWNER                12,921   3.440973E+35 BENTEST        OWNER                12,991   3.441775E+35 BICTEST        OWNER                13,217   3.441782E+35 BILTEST        OWNER                14,055   3.441783E+35 BIMTEST        OWNER                15,552   3.441788E+35 BISTEST        OWNER                15,591   3.441790E+35 BIVTEST        OWNER                15,784   3.441792E+35 BIXTEST        OWNER                15,934   3.442791E+35 BNETEST        OWNER                16,721   3.443000E+35 BOMTEST        OWNER                17,021   3.443803E+35 BSC32 rows selected.SQL> drop table test  2  /Table dropped.


1. The good old ANALYZE command (obsolete as stated in Oracle docs) produces more expected, correct result, but new dbms_stat does not. So, where is bug? Or both work correct, but every in its own fashion?

2. You write "we meant to do this". Please, explain that exactly do you mean. Does Oracle try to compress histogram? But why? There is no need in above case to do this! There are 32 distinct values and I try to create the same number of buckets - "please, just do it!". You write about one dominating value - I excluded that one - no changes. Furthermore, histograms are "born" to deal with uneven distributed data! Aren't they?

3. Where are values in ENDPOINT_ACTUAL_VALUE column? Why they appears and disappears so mystically?</code>

Followup   August 8, 2008 - 1pm Central time zone:

dbms_stats is the thing that the optimizer expects to have generated the statistics.

dbms_stats is what is tested with the optimizer, analyze is not.

dbms_stats and analyze have always returned different results.

If you believe this to be a bug, you'll want to file it in metalink, but I do believe they'll close it as "not a bug, we programmed it that way".
原创粉丝点击