SQL优化--分片思想的运用

来源:互联网 发布:windows安装器怎么用 编辑:程序博客网 时间:2024/05/22 17:48

     A表和B表都有个字段是code,code是表示部门编码的意思,如03表示一个大部门,0301,0302是03的小部门,030101,030102是0301的子部门,现在有个需求,就是查A表中部门的子部门在B表中的记录,可以是两种方式:

     一种substr(A.code,1,length(B.code))=B.code,

     另一种B.code like A.code||'%'。

    下面我们来做个试验:

SQL> create table  test_object as select * from all_objects;
SQL> create table test_obj1 as select * from test_object where length(object_name)<8;
SQL> insert into test_obj1 select * from test_object
       where length(object_name) < 16
         and length(object_name) >= 8;
SQL> insert into test_obj1 select * from test_object where
    length(object_name)<24 and length(object_name)>=16;
SQL> commit;
SQL> select count(*) from test_object;
  COUNT(*)
----------
     50250

SQL> select count(*) from test_obj1;
  COUNT(*)
----------
     22564
SQL> exec dbms_stats.gather_table_stats(user,'test_object');
SQL> exec dbms_stats.gather_table_stats(user,'test_obj1');

SQL> set timing on
SQL> set autotrace traceonly
SQL> select a.*, b.object_name
      from TEST_OBJECT a, TEST_OBJ1 b
     where substr(a.object_name, 1, length(b.object_name)) = b.object_name;
已选择67744行。
已用时间:  00: 07: 36.09
执行计划
----------------------------------------------------------
Plan hash value: 661671976
----------------------------------------------------------------------------------
| Id  | Operation          | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |             | 77053 |  8427K|  3169K  (1)| 11:42:32 |
|   1 |  NESTED LOOPS      |             | 77053 |  8427K|  3169K  (1)| 11:42:32 |
|   2 |   TABLE ACCESS FULL| TEST_OBJECT | 50250 |  4563K|   147   (1)| 00:00:02 |
|*  3 |   TABLE ACCESS FULL| TEST_OBJ1   |     2 |    38 |    63   (0)| 00:00:01 |
----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("B"."OBJECT_NAME"=SUBSTR("A"."OBJECT_NAME",1,LENGTH("B"."OBJ
              ECT_NAME")))
统计信息
----------------------------------------------------------
          0  recursive calls
          0  db block gets
   15586427  consistent gets
          0  physical reads
          0  redo size
    2210117  bytes sent via SQL*Net to client
      50061  bytes received via SQL*Net from client
       4518  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      67744  rows processed

    执行计划是nested loops,特别是需要这种大表,效率低下可想而知。没有试验like,因为用它结果会更糟。下面我们采用分片处理的方法,object_name长度大于3的用一种方式处理,小于等于3的用一种方式处理。

SQL> select a.*, b.object_name
      from TEST_OBJECT a, TEST_OBJ1 b
     where substr(a.object_name, 1, length(b.object_name)) = b.object_name
       and substr(a.object_name, 1, 4) = substr(b.object_name, 1, 4)
       and length(b.object_name) > 3
    union all
    select a.*, b.object_name
      from TEST_OBJECT a, TEST_OBJ1 b
     where substr(a.object_name, 1, length(b.object_name)) = b.object_name
      and length(b.object_name) < 4;
已选择67744行。
已用时间:  00: 00: 08.57
执行计划
----------------------------------------------------------
Plan hash value: 4080738151
-----------------------------------------------------------------------------------
| Id  | Operation           | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |             |  3892 |   425K|   164K(100)| 00:36:27 |
|   1 |  UNION-ALL          |             |       |       |            |          |
|*  2 |   HASH JOIN         |             |    39 |  4368 |   214   (2)| 00:00:03 |
|*  3 |    TABLE ACCESS FULL| TEST_OBJ1   |  1128 | 21432 |    65   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| TEST_OBJECT | 50250 |  4563K|   147   (1)| 00:00:02 |
|   5 |   NESTED LOOPS      |             |  3853 |   421K|   164K  (1)| 00:36:24 |
|*  6 |    TABLE ACCESS FULL| TEST_OBJ1   |  1128 | 21432 |    65   (0)| 00:00:01 |
|*  7 |    TABLE ACCESS FULL| TEST_OBJECT |     3 |   279 |   145   (0)| 00:00:02 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access(SUBSTR("A"."OBJECT_NAME",1,4)=SUBSTR("B"."OBJECT_NAME",1,4))
       filter("B"."OBJECT_NAME"=SUBSTR("A"."OBJECT_NAME",1,LENGTH("B"."OBJE
              CT_NAME")))
   3 - filter(LENGTH("B"."OBJECT_NAME")>3)
   6 - filter(LENGTH("B"."OBJECT_NAME")<4)
   7 - filter("B"."OBJECT_NAME"=SUBSTR("A"."OBJECT_NAME",1,LENGTH("B"."OBJE
              CT_NAME")))
统计信息
----------------------------------------------------------
          1  recursive calls
          0  db block gets
      44813  consistent gets
          0  physical reads
          0  redo size
    2310942  bytes sent via SQL*Net to client
      50061  bytes received via SQL*Net from client
       4518  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)

      67744  rows processed

    可以看到执行计划是复杂了很多,但执行时间缩短了60倍,逻辑读也大大降低。分片的思想是大部分数据处理采用hash join(由于引起了等值条件substr(a.object_name, 1, 4) = substr(b.object_name, 1, 4),所以执行计划改变),少部分数据处理采用nested loops。这种思想对我们的日常调优非常有用。