not in , not exists 语句的N种写法

来源:互联网 发布:linux创建txt文件命令 编辑:程序博客网 时间:2024/05/16 10:18

今天,发现production上有一个SQL 耗了特多CPU time,于是抓了下来,看看是否可以tuning .

原始 SQL 如下(table名字改了一下):

代码:

DELETE FROM test_table_1 a
      WHERE NOT EXISTS
(SELECT *
                          
FROM test_table_2 b
                         WHERE a
.parent_id = b.id)

执行时间 60S
Buffer get
: 160W
执行计划

Operation    Object Name    Rows    Bytes    Cost    Object Node    In
/Out    PStart    PStop

DELETE STATEMENT Optimizer Mode
=CHOOSE        1           2069                                      
  DELETE    TEST_TABLE_1                                                    
    HASH JOIN ANTI        1      26      2069                                      
      TABLE ACCESS FULL    TEST_TABLE_1    5 M    106 M    1210                                      
      TABLE ACCESS FULL    TEST_TABLE_2    46 K    181 K    48                                      
                                    

其中
test_table_1 row count 为 4百万
test_table_2 为 为 4万


TEST_TABLE_1上的 parent_id上有index
.
  
...

 

第一种变换(not in)  

现在来试一下变换

用not in 代替 not exists

代码:

DELETE FROM test_table_1 a
      WHERE parent_id  NOT IN
(SELECT id
                          FROM test_table_2 b
)

..


在9i 里面,Oracle把这个语句和前面的语句看为同一个语句,当然执行计划和执行时间也没有差别。

 

第二种变换(minus)

第二种变换(minus)

用minus,

把 not exists 部分先用minus去掉,再用in 来选择data。

代码:

DELETE FROM test_table_1 a
      WHERE parent_id IN
(SELECT parent_id
                            FROM test_table_1
                          MINUS
                          SELECT ID
                            FROM test_table_2 b
)

执行时间 : 8S
Buffer get
:20000

执行计划

Operation    Object Name    Rows    Bytes    Cost    Object Node    In
/Out    PStart    PStop

DELETE STATEMENT Optimizer Mode
=CHOOSE        2 G         23052                                      
  DELETE    TEST_TABLE_1                                                    
    HASH JOIN        2 G    89G    23052                                      
      VIEW    SYS
.VW_NSO_1    5 M    63 M    5488                                      
        MINUS                                                        
          SORT UNIQUE        5 M    19 M                                         
            INDEX FAST FULL SCAN    TEST_INDX_1    5 M    19 M    457                                      
          SORT UNIQUE        46 K    181 K                                         
            TABLE ACCESS FULL    TEST_TABLE_2    46 K    181 K    48                                      
      TABLE ACCESS FULL    TEST_TABLE_1    5 M    106 M    1210                                      

速度提高了不少。
...

 


(minus + distinct )

在 minus之前加个 distinct看看

代码:

DELETE FROM test_table_1 a
      WHERE parent_id IN
(SELECT distinct parent_id
                            FROM test_table_1
                          MINUS
                          SELECT ID
                            FROM test_table_2 b
)

执行时间 : 8S
Buffer get
:20000

执行计划

Operation    Object Name    Rows    Bytes    Cost    Object Node    In
/Out    PStart    PStop

DELETE STATEMENT Optimizer Mode
=CHOOSE        2 G         23052                                       
  DELETE    TEST_TABLE_1                                                     
    HASH JOIN        2 G    89G    23052                                       
      VIEW    SYS
.VW_NSO_1    5 M    63 M    5488                                       
        MINUS                                                         
          SORT UNIQUE        5 M    19 M                                          
            INDEX FAST FULL SCAN    TEST_INDX_1    5 M    19 M    457                                       
          SORT UNIQUE        46 K    181 K                                          
            TABLE ACCESS FULL    TEST_TABLE_2    46 K    181 K    48                                       
      TABLE ACCESS FULL    TEST_TABLE_1    5 M    106 M    1210           

和上一个一模一样。
由于用了in 的操作符,Oracle内部就给加了个  
(SORT UNIQUE        )

 


可以用 in ,当然就可以用 exists 或者join 了(delete不好用join,只好改为SELECT)

代码:

Join

SELECT
* FROM (SELECT DISTINCT parent_id
                                   FROM TEST_TABLE_1
                                 MINUS
                                 SELECT id
                                   FROM TEST_TABLE_2 b
) tmp,TEST_TABLE_1 a
      WHERE  tmp
.parent_id   = a.id
执行时间 8S
Buffer get
: 20000

Operation    OBJECT Name    ROWS    Bytes    Cost    OBJECT Node    IN
/OUT    PStart    PStop

SELECT STATEMENT Optimizer MODE
=CHOOSE        5 M         4493                                      
  HASH JOIN        5 M    237 M    4493                                      
    VIEW        9 K    125 K    3161                                      
      MINUS                                                        
        SORT UNIQUE        9 K    38 K                                         
          INDEX FAST FULL SCAN    TEST_INDX_1    5 M    19 M    457                                      
        SORT UNIQUE        46 K    181 K                                         
          TABLE ACCESS FULL    TEST_TABLE_2    46 K    181 K    48                                      
    TABLE ACCESS FULL    TEST_TABLE_1    5 M    164 M    1210                

用exists的

DELETE FROM TEST_TABLE_1 a  WHERE EXISTS
(SELECT 1 FROM (SELECT DISTINCT parent_id
                                   FROM TEST_TABLE_1
                                 MINUS
                                 SELECT id
                                   FROM TEST_TABLE_2 b
)     
   
WHERE parent_id  = a.parent_id
执行效率惨不忍睹

...

 


除了上面几种,还有一种不是很常用的方法
就是(outer join + is null )(也是为了方便,先用SELECT来演示,)

代码:

select
* from test_table_1 a,test_table_2 b
where a
.parent_id = b.id(+)
and
b.id is null

在本例中执行效率也挺差的。(超过1min)
Operation    OBJECT Name    ROWS    Bytes    Cost    OBJECT Node    IN
/OUT    PStart    PStop

SELECT STATEMENT Optimizer MODE
=CHOOSE        24 M         3311                                      
  FILTER                                                        
    HASH JOIN OUTER                                                        
      TABLE ACCESS FULL    TEST_TABLE_1    5 M    164 M    1210                                      
      TABLE ACCESS FULL    TEST_TABLE_2    46 K    5 M    48       <



代码:

加个 distinct 看看

SELECT a
.*   FROM  TEST_TABLE_2 b ,(SELECT DISTINCT parentid FROM TEST_TABLE_1) a
                                  WHERE a
.parentid = b.id(+)
                                        AND
b.id IS NULL
                            
执行时间:1S
Buffer gets
: 1000    
                                        
Operation    OBJECT Name    ROWS    Bytes    Cost    OBJECT Node    IN
/OUT    PStart    PStop

SELECT STATEMENT Optimizer MODE
=CHOOSE        47 K         3117                                      
  FILTER                                                        
    HASH JOIN OUTER                                                        
      VIEW        9 K    125 K    3067                                      
        SORT UNIQUE        9 K    38 K    3067                                      
          INDEX FAST FULL SCAN    TEST_INDX_1    5 M    19 M    457                                      
      TABLE ACCESS FULL    TEST_TABLE_2    46 K    181 K    48           
      
...

 


看来最后一种是最有效哦
改成DELETE形式看看

代码:

DELETE FROM TEST_TABLE_1 WHERE parentid IN
(
SELEC parentid   FROM  TEST_TABLE_2 b ,(SELECT DISTINCT parentid FROM TEST_TABLE_1) a
                                  WHERE a
.parentid = b.id(+)
                                        AND
b.id IS NULL
)


成功了,执行时间 1S (实际上是 800-900ms)
Buffer_gets :1000
执行计划
Operation    OBJECT Name    ROWS    Bytes    Cost    OBJECT Node    IN
/OUT    PStart    PStop

DELETE STATEMENT Optimizer MODE
=CHOOSE        6 K         119                                      
  DELETE    TEST_TABLE_1                                                    
    HASH JOIN SEMI        6 K    171 K    119                                      
      TABLE ACCESS FULL    TEST_TABLE_1    6 K    91 K    26                                      
      VIEW    SYS
.VW_NSO_1    2 M    30 M    71                                      
        FILTER                                                        
          HASH JOIN OUTER                                                        
            INDEX FAST FULL SCAN    TEST_INDX_1    6 K    18 K    23                                      
            TABLE ACCESS FULL    TEST_TABLE_2    46 K    136 K    47      

...

 


 

quote:
最初由 rollingpig 发布
在 9i 里面,in 和 exists已经基本一样,Oracle会在内部自动作转换

倒是 我提到的其中几个转换值的深究

1。in / exists ==> join

2。Add distinct in IN Clause or subquery

3. Add condition where rownum < 2 in exists clause (这个忘了演示了)

4, not exists / not in ==> outrt join + is null condition



甚至,可以反过来转换,有时候也有意想不到的效果。

某些时候

1。join ==> in / exists

2. 去掉某些distinct

3. outrt join + is null condition ==> not exists / not in

正确的应该是
outer join + is null condition

代码:

select
* from a
where a
.col1 not in (select col2 from b)

==>
select a.* from a,b
where a
.col1 = b.col2(+)
and
col2 is null

..

 


原创粉丝点击