ENDPOINT_NUMBER column in histograms
来源:互联网 发布:淘宝人群画像 编辑:程序博客网 时间:2024/05/18 01:16
-- Thanks for the question regarding "How to interpret strange values in ENDPOINT_NUMBER column in histograms", version 9.2.0
You Asked
Tom, consider this:
SQL> column table_name format a11SQL> column column_name format a11SQL> column column_name format a11SQL> column endpoint_number format 999,999SQL> column endpoint_value format 9.999999EEEESQL> column endpoint_actual_value format a22SQL> column owner format A8SQL> SQL> create table test as select owner from dba_objects where owner < 'C' 2 /Table created.SQL> select owner, count(*) from test group by owner 2 /OWNER COUNT(*)-------- ----------ABM 464AHL 518AHM 167AK 188ALR 128AMF 106AMS 1557AMV 189AMW 244AP 795APPLSYS 2766APPLSYSPUB 17APPS 140786AR 2211ASF 10ASG 154ASL 53ASN 4ASO 285ASP 4AST 38AX 141AZ 74BEN 2783BIC 70BIL 226BIM 838BIS 1497BIV 39BIX 193BNE 150BOM 787BSC 30033 rows selected.SQL> exec dbms_stats.gather_table_stats(ownname => user, tabname=> 'TEST', method_opt=>'for columns owner size 33');PL/SQL procedure successfully completed.SQL> select * from user_histograms where table_name = 'TEST' and column_name = 'OWNER' 2 /TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALUE----------- ----------- --------------- -------------- ----------------------TEST OWNER 0 3.388440E+35TEST OWNER 1 3.391283E+35TEST OWNER 30 3.391283E+35TEST OWNER 31 3.391625E+35TEST OWNER 32 3.440973E+35TEST OWNER 33 3.443803E+356 rows selected.SQL> exec dbms_stats.gather_table_stats(ownname => user, tabname=> 'TEST', method_opt=>'for columns owner size 40');PL/SQL procedure successfully completed.SQL> select * from user_histograms where table_name = 'TEST' and column_name = 'OWNER' 2 /TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALUE----------- ----------- --------------- -------------- ----------------------TEST OWNER 0 3.388440E+35TEST OWNER 1 3.391219E+35TEST OWNER 37 3.391283E+35TEST OWNER 38 3.391625E+35TEST OWNER 39 3.441782E+35TEST OWNER 40 3.443803E+356 rows selected.SQL> exec dbms_stats.gather_table_stats(ownname => user, tabname=> 'TEST', method_opt=>'for columns owner size 50');PL/SQL procedure successfully completed.SQL> select * from user_histograms where table_name = 'TEST' and column_name = 'OWNER' 2 /TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALUE----------- ----------- --------------- -------------- ----------------------TEST OWNER 464 3.388440E+35 ABMTEST OWNER 982 3.389657E+35 AHLTEST OWNER 1,149 3.389657E+35 AHMTEST OWNER 1,337 3.390205E+35 AKTEST OWNER 1,465 3.390473E+35 ALRTEST OWNER 1,571 3.390666E+35 AMFTEST OWNER 3,128 3.390676E+35 AMSTEST OWNER 3,317 3.390679E+35 AMVTEST OWNER 3,561 3.390679E+35 AMWTEST OWNER 4,356 3.391219E+35 APTEST OWNER 7,122 3.391283E+35 APPLSYSTEST OWNER 7,139 3.391283E+35 APPLSYSPUBTEST OWNER 147,925 3.391283E+35 APPSTEST OWNER 150,136 3.391625E+35 ARTEST OWNER 150,146 3.391883E+35 ASFTEST OWNER 150,300 3.391884E+35 ASGTEST OWNER 150,353 3.391888E+35 ASLTEST OWNER 150,357 3.391889E+35 ASNTEST OWNER 150,642 3.391890E+35 ASOTEST OWNER 150,646 3.391891E+35 ASPTEST OWNER 150,684 3.391894E+35 ASTTEST OWNER 150,825 3.392841E+35 AXTEST OWNER 150,899 3.393247E+35 AZTEST OWNER 153,682 3.440973E+35 BENTEST OWNER 153,752 3.441775E+35 BICTEST OWNER 153,978 3.441782E+35 BILTEST OWNER 154,816 3.441783E+35 BIMTEST OWNER 156,313 3.441788E+35 BISTEST OWNER 156,352 3.441790E+35 BIVTEST OWNER 156,545 3.441792E+35 BIXTEST OWNER 156,695 3.442791E+35 BNETEST OWNER 157,482 3.443000E+35 BOMTEST OWNER 157,782 3.443803E+35 BSC33 rows selected.SQL> drop table test 2 /Table dropped.SQL> select * from v$version 2 /BANNER--------------------------------------------------------------------------------Oracle9i Enterprise Edition Release 9.2.0.7.0 - 64bit ProductionPL/SQL Release 9.2.0.7.0 - ProductionCORE 9.2.0.7.0 ProductionTNS for IBM/AIX RISC System/6000: Version 9.2.0.7.0 - ProductionNLSRTL Version 9.2.0.7.0 - Production5 rows selected.
There are 33 distinct values in OWNER column. After gathering statistics for this column with number of buckets equal and even more than number of distinct values (N between 33 and 44), I see in ENDPOINT_NUMBER column ordinal number of bucket and nothing in ENDPOINT_ACTUAL_VALUE. But when N in 'for columns owner size N' is sufficient greater than the number of distinct values (it seems the rule is "N >= 1.33333*number_of_distinct_values+1") then in ENDPOINT_NUMBER column appears the cumulative numbers of rows (according metalink Note:72539.1) and ENDPOINT_ACTUAL_VALUE is not empty. Here are my questions:
1. Why is it so? In above example I am expecting to see same results after
exec dbms_stats.gather_table_stats(ownname => user, tabname=> 'TEST', method_opt=>'for columns owner size 33');
and after
exec dbms_stats.gather_table_stats(ownname => user, tabname=> 'TEST', method_opt=>'for columns owner size 50');
2. Why there are only 6 rows (buckets) in USER_HISTOGRAMS when I calling dbms_stats with 'for columns owner size 33' (for example)? I am expecting to see 33 rows (buckets) after this call...
3. Is it better to have in ENDPOINT_NUMBER column cumulative numbers of rows than just bucket numbers? Can optimizer make its decisions more accurately in this case?
and we said...
This happens when you have some values that utterly dominate the other values - as you do - that one really high value can be used to infer the other buckets.
consider the differences between t1 and t2 below:
<code>
ops$tkyte%ORA9IR2> create table t1
2 as
3 select 'user' || mod(rownum,33) owner, object_name
4 from all_objects;
Table created.
ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> create table t2
2 as
3 select owner, object_name
4 from all_objects;
Table created.
ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> select 't1', owner, count(*) from t1 group by owner union all
2 select 't2', owner, count(*) from t2 group by owner
3 order by 1, 2;
'T OWNER COUNT(*)
-- -------------------------------------------- ----------
t1 user0 929
t1 user1 930
t1 user10 930
t1 user11 930
t1 user12 930
t1 user13 930
t1 user14 930
t1 user15 930
t1 user16 930
t1 user17 930
t1 user18 929
t1 user19 929
t1 user2 930
t1 user20 929
t1 user21 929
t1 user22 929
t1 user23 929
t1 user24 929
t1 user25 929
t1 user26 929
t1 user27 929
t1 user28 929
t1 user29 929
t1 user3 930
t1 user30 929
t1 user31 929
t1 user32 929
t1 user4 930
t1 user5 930
t1 user6 930
t1 user7 930
t1 user8 930
t1 user9 930
t2 A 1
t2 BIG_TABLE 2
t2 CTXSYS 261
t2 DEMO 1
t2 HR 34
t2 MDSYS 234
t2 ODM 439
t2 ODM_MTR 12
t2 OE 86
t2 OLAPSYS 662
t2 OPS$TKYTE 16
t2 ORDPLUGINS 29
t2 ORDSYS 969
t2 OUTLN 7
t2 PERFSTAT 86
t2 PM 9
t2 PUBLIC 12285
t2 QS 41
t2 QS_ADM 7
t2 QS_CBADM 24
t2 QS_CS 23
t2 QS_ES 39
t2 QS_OS 39
t2 QS_WS 39
t2 SCOTT 6
t2 SH 173
t2 SYS 14073
t2 SYSTEM 382
t2 USER1 9
t2 USER2 9
t2 WKSYS 279
t2 WMSYS 129
t2 XDB 270
66 rows selected.
ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T1', method_opt => 'for columns owner size 33' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA9IR2> select * from user_tab_col_statistics where table_name = 'T1';
TABLE_NAME COLUMN_NAME NUM_DISTINCT LOW_VALUE HIGH_VALUE DENSITY NUM_NULLS NUM_BUCKETS LAST_ANAL SAMPLE_SIZE GLO USE AVG_COL_LEN
--------------- --------------- ------------ ---------- ---------- ---------- ---------- ----------- --------- ----------- --- --- -----------
T1 OWNER 33 7573657230 7573657239 .030302545 0 32 06-AUG-08 30674 YES NO 7
ops$tkyte%ORA9IR2> select * from user_histograms where table_name = 'T1' and column_name = 'OWNER';
TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL
--------------- --------------- --------------- -------------- ---------------
T1 OWNER 0 6.0984E+35
T1 OWNER 1 6.0984E+35
T1 OWNER 2 6.0984E+35
T1 OWNER 3 6.0984E+35
T1 OWNER 4 6.0984E+35
T1 OWNER 5 6.0984E+35
T1 OWNER 6 6.0984E+35
T1 OWNER 7 6.0984E+35
T1 OWNER 8 6.0984E+35
T1 OWNER 9 6.0984E+35
T1 OWNER 10 6.0984E+35
T1 OWNER 11 6.0984E+35
T1 OWNER 12 6.0984E+35
T1 OWNER 13 6.0984E+35
T1 OWNER 14 6.0984E+35
T1 OWNER 15 6.0984E+35
T1 OWNER 16 6.0984E+35
T1 OWNER 17 6.0984E+35
T1 OWNER 18 6.0984E+35
T1 OWNER 19 6.0984E+35
T1 OWNER 20 6.0984E+35
T1 OWNER 21 6.0984E+35
T1 OWNER 22 6.0984E+35
T1 OWNER 23 6.0984E+35
T1 OWNER 24 6.0984E+35
T1 OWNER 25 6.0984E+35
T1 OWNER 26 6.0984E+35
T1 OWNER 27 6.0984E+35
T1 OWNER 28 6.0984E+35
T1 OWNER 29 6.0984E+35
T1 OWNER 30 6.0984E+35
T1 OWNER 31 6.0984E+35
T1 OWNER 33 6.0984E+35
33 rows selected.
ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> exec dbms_stats.gather_table_stats( user, 'T2', method_opt => 'for columns owner size 33' );
PL/SQL procedure successfully completed.
ops$tkyte%ORA9IR2> select * from user_tab_col_statistics where table_name = 'T2';
TABLE_NAME COLUMN_NAME
TomThat is very interesting and enlightening,butIf you use dbms_xplan to view the results, you can see that we just used a low value for the "low values" and the dominating values appear: how does oracle caculate 477 and *KNOW* the dominant value since dominant values are not in the histograms.Thanks!
Followup August 7, 2008 - 2pm Central time zone:
it does now the value, you just cannot see it, but it can. That big ugly number there - it has everything we need to know.With 33 distinct values and 33 buckets, you should get a frequency histogram, where endpoint_number is the cumulative number of rows. But instead you got a height balanced histogram, where endpoint_number is the bucket number. Could it be the bug that Jonathan mentioned in page 164 of "Cost base optimizer Oracle fundamentals"? He also said that the bug was fixed in 10.2.
Followup August 7, 2008 - 3pm Central time zone:
same results in 10gr2 and 11gr1and everything I see in the problem database says "we meant to do this", meaning - not a bug.
it is the predominant value, dbms_stats is a "program", it gathered them and said "ok - this is what I want to store"
Not a bug? Are you sure, Tom? How about this:<code>SQL> create table test as select owner from dba_objects where owner < 'C' and owner <> 'APPS' 2 /Table created.SQL> select owner, count(*) from test group by owner 2 /OWNER COUNT(*)------------ ----------ABM 464AHL 518AHM 167AK 188ALR 128AMF 106AMS 1557AMV 189AMW 244AP 795APPLSYS 2791APPLSYSPUB 17AR 2211ASF 10ASG 154ASL 53ASN 4ASO 285ASP 4AST 38AX 141AZ 74BEN 2783BIC 70BIL 226BIM 838BIS 1497BIV 39BIX 193BNE 150BOM 787BSC 30032 rows selected.SQL> exec dbms_stats.gather_table_stats(ownname => user, tabname=> 'TEST', method_opt=>'for columns owner size 33');PL/SQL procedure successfully completed.SQL> SQL> select * from user_histograms where table_name = 'TEST' and column_name = 'OWNER' 2 /TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALUE----------- ----------- --------------- -------------- ----------------------TEST OWNER 0 3.388440E+35TEST OWNER 1 3.389657E+35TEST OWNER 2 3.389657E+35TEST OWNER 3 3.390666E+35TEST OWNER 6 3.390676E+35TEST OWNER 8 3.391219E+35TEST OWNER 13 3.391283E+35TEST OWNER 18 3.391625E+35TEST OWNER 19 3.391890E+35TEST OWNER 25 3.440973E+35TEST OWNER 27 3.441783E+35TEST OWNER 30 3.441788E+35TEST OWNER 32 3.443000E+35TEST OWNER 33 3.443803E+3514 rows selected.SQL> analyze table test compute statistics for columns owner size 33 2 /Table analyzed.SQL> select * from user_histograms where table_name = 'TEST' and column_name = 'OWNER' 2 /TABLE_NAME COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALUE----------- ----------- --------------- -------------- ----------------------TEST OWNER 464 3.388440E+35 ABMTEST OWNER 982 3.389657E+35 AHLTEST OWNER 1,149 3.389657E+35 AHMTEST OWNER 1,337 3.390205E+35 AKTEST OWNER 1,465 3.390473E+35 ALRTEST OWNER 1,571 3.390666E+35 AMFTEST OWNER 3,128 3.390676E+35 AMSTEST OWNER 3,317 3.390679E+35 AMVTEST OWNER 3,561 3.390679E+35 AMWTEST OWNER 4,356 3.391219E+35 APTEST OWNER 7,147 3.391283E+35 APPLSYSTEST OWNER 7,164 3.391283E+35 APPLSYSPUBTEST OWNER 9,375 3.391625E+35 ARTEST OWNER 9,385 3.391883E+35 ASFTEST OWNER 9,539 3.391884E+35 ASGTEST OWNER 9,592 3.391888E+35 ASLTEST OWNER 9,596 3.391889E+35 ASNTEST OWNER 9,881 3.391890E+35 ASOTEST OWNER 9,885 3.391891E+35 ASPTEST OWNER 9,923 3.391894E+35 ASTTEST OWNER 10,064 3.392841E+35 AXTEST OWNER 10,138 3.393247E+35 AZTEST OWNER 12,921 3.440973E+35 BENTEST OWNER 12,991 3.441775E+35 BICTEST OWNER 13,217 3.441782E+35 BILTEST OWNER 14,055 3.441783E+35 BIMTEST OWNER 15,552 3.441788E+35 BISTEST OWNER 15,591 3.441790E+35 BIVTEST OWNER 15,784 3.441792E+35 BIXTEST OWNER 15,934 3.442791E+35 BNETEST OWNER 16,721 3.443000E+35 BOMTEST OWNER 17,021 3.443803E+35 BSC32 rows selected.SQL> drop table test 2 /Table dropped.
1. The good old ANALYZE command (obsolete as stated in Oracle docs) produces more expected, correct result, but new dbms_stat does not. So, where is bug? Or both work correct, but every in its own fashion?
2. You write "we meant to do this". Please, explain that exactly do you mean. Does Oracle try to compress histogram? But why? There is no need in above case to do this! There are 32 distinct values and I try to create the same number of buckets - "please, just do it!". You write about one dominating value - I excluded that one - no changes. Furthermore, histograms are "born" to deal with uneven distributed data! Aren't they?
3. Where are values in ENDPOINT_ACTUAL_VALUE column? Why they appears and disappears so mystically?</code>
Followup August 8, 2008 - 1pm Central time zone:
dbms_stats is the thing that the optimizer expects to have generated the statistics.dbms_stats is what is tested with the optimizer, analyze is not.
dbms_stats and analyze have always returned different results.
If you believe this to be a bug, you'll want to file it in metalink, but I do believe they'll close it as "not a bug, we programmed it that way".
- ENDPOINT_NUMBER column in histograms
- Oracle SQL tuning with column histograms
- How To Quickly Add/Remove Column Statistics (Histograms) For A Column (文档 ID 390249.1
- 5 Ways to Do 2D Histograms in R
- Column Statistics in Hive
- column mode in VIM
- Change Column in MySQL
- In-Memory Column optional
- Disable Column Resizing in ListView
- Column not found: 1054 Unknown column 'tl10709' in 'where clause'
- Repeated column in mapping for entity
- How To Split column In SQL
- WPF DataGrid edit column content in MultiLine.
- Order a column conditional in SQL
- Unknown column 'Mary' in 'field list'
- Merge Same Data Column in Gridview
- Unknown column 'rownum' in 'where clause'解决方案
- 05 trigger the checkbox column in datagridview
- EJB初学者常有的十一个疑惑
- 数据库快照
- WPF编程入门
- 绩优股男人
- 练精化气
- ENDPOINT_NUMBER column in histograms
- 代理模式
- 咖啡
- 获取指定dll中的某个函数的地址
- union的用法
- editplus 3 注册码
- 一个实现了和数据库打交道的源文件(组装sql语句版)
- FTPEx.au3
- 实用单表千万级分页存储过程二(不敢独享,特此分享)