继续上一篇文章的实验,越发的让人搞不明白了

来源:互联网 发布:网络推广怎么拿提成 编辑:程序博客网 时间:2024/04/19 19:49

我分别在10.2.0.1.0和10.2.0.4.0的版本上进行测试,发现结果是一样的,所以这里只列举10.2.0.4版本上的实验结果了:
create table zsj_test_dtime(num number,dtime date not null) pctfree 70;

insert into zsj_test_dtime(num,dtime)
                    select rownum,to_date('2011-01-01','yyyy-mm-dd')- 90 -90*6*15 + numtodsinterval(rownum,'hour')
                      from dual
                   connect by to_date('2011-01-01','yyyy-mm-dd')- 90 -90*6*15 + numtodsinterval(rownum,'hour')<to_date('2011-01-01','yyyy-mm-dd') -90 ;
插入了194399 行数据
insert into zsj_test_dtime(num,dtime)
                    select rownum,to_date('2011-01-01','yyyy-mm-dd')- 90 + numtodsinterval(10*rownum,'minute')
                      from dual
                   connect by to_date('2011-01-01','yyyy-mm-dd')- 90 + numtodsinterval(10*rownum,'minute')<to_date('2011-01-01','yyyy-mm-dd');
插入了12959行数据
COMMIT;
插入2011-01-01之前90天的数据,每10分钟插入一行数据;插入2011-01-01 - 90前90*6*15天的数据,每1小时插入一行数据,数据行数是前者的15倍,时间跨度是8100天,这样dtime>to_date('2011-01-01','yyyy-mm-dd') -90的选择率应该是1/16=0.0625.

create index ind1_zsj_test_dtime on zsj_test_dtime(dtime desc);

exec dbms_stats.gather_table_stats(user,'ZSJ_TEST_DTIME',CASCADE=>TRUE,ESTIMATE_PERCENT=>100,METHOD_OPT=>'FOR COLUMNS SIZE 254 DTIME,SYS_NC00003$,NUM SIZE 1');
SELECT COLUMN_NAME,num_buckets,histogram FROM USER_TAB_COLS WHERE TABLE_NAME='ZSJ_TEST_DTIME';
COLUMN_NAME                    NUM_BUCKETS HISTOGRAM
------------------------------ ----------- ---------------
NUM                                      1 NONE
DTIME                                  254 HEIGHT BALANCED
SYS_NC00003$                           254 HEIGHT BALANCED

alter session set tracefile_identifier='zhao_10204_254_254';
alter session set events '10053 trace name context forever';
select /*+ zhao_10204_254_254*/count(num) from zsj_test_dtime where dtime>to_date('2011-01-01','yyyy-mm-dd') -90;

10053文件信息:
Table Stats::
  Table: ZSJ_TEST_DTIME  Alias: ZSJ_TEST_DTIME
    #Rows: 207358  #Blks:  1636  AvgRowLen:  22.00
Index Stats::
  Index: IND1_ZSJ_TEST_DTIME  Col#: 3
    LVLS: 1  #LB: 582  #DK: 207358  LB/K: 1.00  DB/K: 1.00  CLUF: 1539.00

  Column (#2): DTIME(DATE)
    AvgLen: 8.00 NDV: 207358 Nulls: 0 Density: 4.8226e-06 Min: 2447373 Max: 2455563
    Histogram: HtBal  #Bkts: 254  UncompBkts: 254  EndPtVals: 255
  Table: ZSJ_TEST_DTIME  Alias: ZSJ_TEST_DTIME    
    Card: Original: 207358  Rounded: 12695  Computed: 12694.65  Non Adjusted: 12694.65
这里card的估算是准确的,但这里好像是使用dtime上的统计信息得到的结果.
  Column (#3): SYS_NC00003$(RAW)
    AvgLen: 10.00 NDV: 207358 Nulls: 0 Density: 4.8226e-06
    Histogram: HtBal  #Bkts: 254  UncompBkts: 254  EndPtVals: 255
  Access Path: index (RangeScan)
    Index: IND1_ZSJ_TEST_DTIME
    resc_io: 42.00  resc_cpu: 2878770
    ix_sel: 0.059123  ix_sel_with_filters: 0.0036196
    Cost: 42.23  Resp: 42.23  Degree: 1
  Best:: AccessPath: IndexRange  Index: IND1_ZSJ_TEST_DTIME
         Cost: 42.23  Degree: 1  Resp: 42.23  Card: 12694.65  Bytes: 0
这里ix_sel,ix_sel_with_filters两者还是不相等,前者可以认为是准确的.
-----------------------------------------------------------+-----------------------------------+
| Id  | Operation                     | Name               | Rows  | Bytes | Cost  | Time      |
-----------------------------------------------------------+-----------------------------------+
| 0   | SELECT STATEMENT              |                    |       |       |    42 |           |
| 1   |  SORT AGGREGATE               |                    |     1 |    13 |       |           |
| 2   |   TABLE ACCESS BY INDEX ROWID | ZSJ_TEST_DTIME     |   12K |  161K |    42 |  00:00:01 |
| 3   |    INDEX RANGE SCAN           | IND1_ZSJ_TEST_DTIME|   751 |       |    36 |  00:00:01 |
-----------------------------------------------------------+-----------------------------------+
Predicate Information:
----------------------
3 - access("ZSJ_TEST_DTIME"."SYS_NC00003$"<HEXTORAW('8791F5FCFEF8FEFAFF') )
3 - filter(SYS_OP_UNDESCEND("ZSJ_TEST_DTIME"."SYS_NC00003$")>TO_DATE(' 2010-10-03 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

select round(207358*0.0036196,2) from dual;
750.55   --这个就是id=3步骤给出的rows,而且这里尤为让人疑惑的是这一点:索引访问之后返回了751行,而回访表之后居然返回了12k,具体就是12695行(虽然说这个数字是准确的吧),居然比索引访问返回的行数还要多.
我们计算一下这里的cost是如何得来的:
  Using NOWORKLOAD Stats
  CPUSPEED: 1263 millions instruction/sec
  IOTFRSPEED: 4096 bytes per millisecond (default is 4096)
  IOSEEKTIM: 10 milliseconds (default is 10)
cost= 1 + ceil(582*0.059123) + ceil(1539*0.0036196) + 2878770*100/1263000000/(10+8192/4096) =42.02   和42.23相差不大.
应该来说确实是ix_sel_with_filters决定了回访表的代价.可这里的估算偏差太大了.

再来看看另一个收集选项下的结果:
exec dbms_stats.gather_table_stats(user,'ZSJ_TEST_DTIME',CASCADE=>TRUE,ESTIMATE_PERCENT=>100,METHOD_OPT=>'FOR COLUMNS SIZE 1 NUM,SYS_NC00003$,DTIME SIZE 254');
SELECT COLUMN_NAME,num_buckets,histogram FROM USER_TAB_COLS WHERE TABLE_NAME='ZSJ_TEST_DTIME';

COLUMN_NAME                    NUM_BUCKETS HISTOGRAM
------------------------------ ----------- ---------------
NUM                                      1 NONE
DTIME                                  254 HEIGHT BALANCED
SYS_NC00003$                             1 NONE

alter session set tracefile_identifier='zhao_10204_254_1';
alter session set events '10053 trace name context forever';
select /*+ zhao_10204_254_1*/count(num) from zsj_test_dtime where dtime>to_date('2011-01-01','yyyy-mm-dd') -90;

10053文件内容:(和上面相同的内容就不显示了)
  Column (#3): SYS_NC00003$(RAW)
    AvgLen: 10.00 NDV: 207358 Nulls: 0 Density: 4.8226e-06
  Access Path: index (RangeScan)
    Index: IND1_ZSJ_TEST_DTIME
    resc_io: 3.00  resc_cpu: 23534
    ix_sel: 4.6293e-05  ix_sel_with_filters: 2.8341e-06
    Cost: 3.00  Resp: 3.00  Degree: 1
  Best:: AccessPath: IndexRange  Index: IND1_ZSJ_TEST_DTIME
         Cost: 3.00  Degree: 1  Resp: 3.00  Card: 12694.65  Bytes: 0
-----------------------------------------------------------+-----------------------------------+
| Id  | Operation                     | Name               | Rows  | Bytes | Cost  | Time      |
-----------------------------------------------------------+-----------------------------------+
| 0   | SELECT STATEMENT              |                    |       |       |     3 |           |
| 1   |  SORT AGGREGATE               |                    |     1 |    13 |       |           |
| 2   |   TABLE ACCESS BY INDEX ROWID | ZSJ_TEST_DTIME     |   12K |  161K |     3 |  00:00:01 |
| 3   |    INDEX RANGE SCAN           | IND1_ZSJ_TEST_DTIME|     1 |       |     2 |  00:00:01 |
-----------------------------------------------------------+-----------------------------------+
Predicate Information:
----------------------
3 - access("ZSJ_TEST_DTIME"."SYS_NC00003$"<HEXTORAW('8791F5FCFEF8FEFAFF') )
3 - filter(SYS_OP_UNDESCEND("ZSJ_TEST_DTIME"."SYS_NC00003$")>TO_DATE(' 2010-10-03 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
这里估算更不靠谱,也存在上面提到的问题:估算回访表后返回的行数比索引返回的行数还要多得多.

使用这个索引时的执行统计信息:
-----------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                | Starts | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------------------
|   1 |  SORT AGGREGATE              |                     |      1 |      1 |00:00:00.04 | 132 |
|   2 |   TABLE ACCESS BY INDEX ROWID| ZSJ_TEST_DTIME      |      1 |  12959 |00:00:00.03 | 132 |
|*  3 |    INDEX RANGE SCAN          | IND1_ZSJ_TEST_DTIME |      1 |  12959 |00:00:00.01 |  39 |
----------------------------------------------------------------------------------------------------
索引返回的行数确实应该是12K,13k的,而不应该是1,也不应该是751的.

就记录到这里吧,实在是搞不懂了,优化器这东西实在是太深奥了.

原创粉丝点击