Oracle 子查询展开(subquery unnesting)

来源:互联网 发布:淘宝3o一45岁女防晒衣q 编辑:程序博客网 时间:2024/06/17 23:44
一、什么叫子查询
       查询是从表或视图中获取数据的一个操作,最上层的语句称之为查询,而嵌套在另外一个查询中的语句称之为子查询。

如:

SELECT * FROM   salesWHERE  cust_id IN ( SELECT cust_id                     FROM   customers );

在上面的语句中,

( SELECT cust_id                     FROM   customers );
我们称之为子查询。
About Queries and Subqueries A query is an operation that retrieves data from one or more tables or views. In this reference, a top-level SELECT statement is called a
query, and a query nested within another SQL statement is called a subquery.


二、什么叫子查询展开以及为什么要展开
        就是优化器将嵌套的子查询展开成一个等价的join,然后去优化这个join。上面的语句,不展开的情形是,从sales表中获取的每1条数据,都要代入子查询进行匹配。一般情况下效率都是比较低的。展开的结果可能是将sales表和customers表做一个hash join semi,从而提高效率。因为转换为join后可以从访问路径、连接方法、连接顺序等方面优化整个查询。
       In subquery unnesting, the optimizer transforms a nested query into an equivalent join statement, and then optimizes the join. This transformation enables the optimizer to consider the subquery tables during access path, join method, and join order selection. The optimizer can perform this transformation only if the resulting join statement is guaranteed to return the same rows as the original statement, and if subqueries do not contain aggregate functions such as AVG.


三、展开分类
  1)semi-join
    当子查询前的where条件:exists/in/=any
    CBO会这样转换:SU: Transform an ANY subquery to semi-join or distinct
  2)anti-join
    当子查询前的where条件:not exists/not in/<>all
    CBO会这样转换:SU: Transform ALL/NOTEXISTS subquery into a regular anti-join


四、相关参数(默认为true)

KSPPINM                                  KSPPSTVL   KSPPDESC---------------------------------------- ---------- --------------------------------------------------_unnest_subquery                         TRUE       enables unnesting of complex subqueries_optimizer_unnest_all_subqueries         TRUE       enables unnesting of every type of subquery
hint:unnest/no_unnest


五、限制
        必要条件:对于semi-join和anti-join都有一个共同的限制就是_unnest_subquery和_optimizer_unnest_all_subqueries需要同时为true,而且子查询不能与or在一个查询块内,如:select ... from xxx where exists(subquery) or xxx

 1)semi-join
    A.子查询内只存在一个表的
这种情况属于启发式查询转化,只要满足上述的必要条件,CBO就会做子查询展开,而不去考虑cost。但是实验证明在没有统计信息采用动态采样的情况下,子查询是否展开还是会基于cost的。
这里CBO选择了cost较低的不让子查询展开。

    B.子查询内存在多表关联的
这种情况CBO会将子查询转化成一个内嵌视图后,再与外部查询做半连接,但是必须满足子查询展开后的COST<未子查询展开的COST

 2)anti-join
    A.NOT EXISTS
这种情况需要满足上述的必要条件,并且必须满足子查询展开后的COST<未子查询展开的COST

    B.NOT IN/<>ALL
这种情况除了要需要满足上述的必要条件和子查询展开后的COST<未子查询展开的COST之外,由于对null敏感还需要满足下列条件:
(1)11G前必须连接列上有not null约束或者在sql上写明连接列is not null,CBO会对子查询展开做反连接(ANTI JOIN)
(2)11G以后只需要参数_optimizer_null_aware_antijoin=true,CBO会对子查询展开做一个针对null的反连接(ANTI NA JOIN)

注:11g之后如果_optimizer_null_aware_antijoin=false,但是连接列上有not null约束或者在sql上写明连接列is not null,CBO仍然会对子查询展开做一个正常的反连接(ANTI JOIN)


六、例子

6.1、数据准备,拷贝scott用户下的两张表,规避主外键约束、索引的影响[oracle@sean ~]$ sqlplus scott/tigerSQL*Plus: Release 11.2.0.4.0 Production on Mon Jul 10 14:24:28 2017SCOTT@sean> create table t_emp as select * from emp;Table created.SCOTT@sean> create table t_dept as select * from dept;Table created.SCOTT@sean> exec dbms_stats.gather_schema_stats(user);PL/SQL procedure successfully completed.SCOTT@sean> set linesize 200SCOTT@sean> set pagesize 200SCOTT@sean> select * from t_emp;     EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO---------- ---------- --------- ---------- --------- ---------- ---------- ----------      7369 SMITH      CLERK           7902 17-DEC-80        800                    20      7499 ALLEN      SALESMAN        7698 20-FEB-81       1600        300         30      7521 WARD       SALESMAN        7698 22-FEB-81       1250        500         30      7566 JONES      MANAGER         7839 02-APR-81       2975                    20      7654 MARTIN     SALESMAN        7698 28-SEP-81       1250       1400         30      7698 BLAKE      MANAGER         7839 01-MAY-81       2850                    30      7782 CLARK      MANAGER         7839 09-JUN-81       2450                    10      7788 SCOTT      ANALYST         7566 19-APR-87       3000                    20      7839 KING       PRESIDENT            17-NOV-81       5000                    10      7844 TURNER     SALESMAN        7698 08-SEP-81       1500          0         30      7876 ADAMS      CLERK           7788 23-MAY-87       1100                    20      7900 JAMES      CLERK           7698 03-DEC-81        950                    30      7902 FORD       ANALYST         7566 03-DEC-81       3000                    20      7934 MILLER     CLERK           7782 23-JAN-82       1300                    1014 rows selected.SCOTT@sean> select * from t_dept;    DEPTNO DNAME          LOC---------- -------------- -------------        10 ACCOUNTING     NEW YORK        20 RESEARCH       DALLAS        30 SALES          CHICAGO        40 OPERATIONS     BOSTON        SCOTT@sean> set autotrace traceonly exp6.2、我们要观察的是以下这条语句,oracle将子查询展开,然后进行HASH JOIN SEMI连接。SCOTT@sean> select * from t_emp where deptno in (select deptno from t_dept where dname='SALES');Execution Plan----------------------------------------------------------Plan hash value: 741372481-----------------------------------------------------------------------------| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |-----------------------------------------------------------------------------|   0 | SELECT STATEMENT   |        |     5 |   255 |     6   (0)| 00:00:01 ||*  1 |  HASH JOIN SEMI    |        |     5 |   255 |     6   (0)| 00:00:01 ||   2 |   TABLE ACCESS FULL| T_EMP  |    14 |   532 |     3   (0)| 00:00:01 ||*  3 |   TABLE ACCESS FULL| T_DEPT |     1 |    13 |     3   (0)| 00:00:01 |-----------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------   1 - access("DEPTNO"="DEPTNO")   3 - filter("DNAME"='SALES')6.3、使用no_unnest这个hint禁用子查询展开,可见使用了FILTER操作,执行计划总成本为12,高于上面展开的执行计划。SCOTT@sean> select * from t_emp where deptno in (select /*+ no_unnest */ deptno from t_dept where dname='SALES');Execution Plan----------------------------------------------------------Plan hash value: 3716950600-----------------------------------------------------------------------------| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |-----------------------------------------------------------------------------|   0 | SELECT STATEMENT   |        |     5 |   190 |    12   (0)| 00:00:01 ||*  1 |  FILTER            |        |       |       |            |          ||   2 |   TABLE ACCESS FULL| T_EMP  |    14 |   532 |     3   (0)| 00:00:01 ||*  3 |   TABLE ACCESS FULL| T_DEPT |     1 |    13 |     3   (0)| 00:00:01 |-----------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------   1 - filter( EXISTS (SELECT /*+ NO_UNNEST */ 0 FROM "T_DEPT" "T_DEPT"              WHERE "DEPTNO"=:B1 AND "DNAME"='SALES'))   3 - filter("DEPTNO"=:B1 AND "DNAME"='SALES') 6.4、子查询和or连用,可见子查询无法展开SCOTT@sean> select * from t_emp where deptno in (select deptno from t_dept where dname='SALES') or deptno=10;Execution Plan----------------------------------------------------------Plan hash value: 3716950600-----------------------------------------------------------------------------| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |-----------------------------------------------------------------------------|   0 | SELECT STATEMENT   |        |     5 |   190 |     3   (0)| 00:00:01 ||*  1 |  FILTER            |        |       |       |            |          ||   2 |   TABLE ACCESS FULL| T_EMP  |    14 |   532 |     3   (0)| 00:00:01 ||*  3 |   TABLE ACCESS FULL| T_DEPT |     1 |    13 |     3   (0)| 00:00:01 |-----------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------   1 - filter("DEPTNO"=10 OR  EXISTS (SELECT 0 FROM "T_DEPT" "T_DEPT"              WHERE "DEPTNO"=:B1 AND "DNAME"='SALES'))   3 - filter("DEPTNO"=:B1 AND "DNAME"='SALES')SCOTT@sean> select * from t_emp where deptno in (select /*+ unnest */ deptno from t_dept where dname='SALES') or deptno=10;Execution Plan----------------------------------------------------------Plan hash value: 3716950600-----------------------------------------------------------------------------| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |-----------------------------------------------------------------------------|   0 | SELECT STATEMENT   |        |     5 |   190 |     3   (0)| 00:00:01 ||*  1 |  FILTER            |        |       |       |            |          ||   2 |   TABLE ACCESS FULL| T_EMP  |    14 |   532 |     3   (0)| 00:00:01 ||*  3 |   TABLE ACCESS FULL| T_DEPT |     1 |    13 |     3   (0)| 00:00:01 |-----------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------   1 - filter("DEPTNO"=10 OR  EXISTS (SELECT /*+ UNNEST */ 0 FROM              "T_DEPT" "T_DEPT" WHERE "DEPTNO"=:B1 AND "DNAME"='SALES'))   3 - filter("DEPTNO"=:B1 AND "DNAME"='SALES')6.5、NOT IN/<>ALL对null敏感,注意规避和不同优化器版本的选择SCOTT@sean> select * from t_emp where deptno not in (select deptno from t_dept where dname='SALES');Execution Plan----------------------------------------------------------Plan hash value: 2687964391-----------------------------------------------------------------------------| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |-----------------------------------------------------------------------------|   0 | SELECT STATEMENT   |        |     9 |   459 |     6   (0)| 00:00:01 ||*  1 |  HASH JOIN ANTI NA |        |     9 |   459 |     6   (0)| 00:00:01 ||   2 |   TABLE ACCESS FULL| T_EMP  |    14 |   532 |     3   (0)| 00:00:01 ||*  3 |   TABLE ACCESS FULL| T_DEPT |     1 |    13 |     3   (0)| 00:00:01 |-----------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------   1 - access("DEPTNO"="DEPTNO")   3 - filter("DNAME"='SALES')SCOTT@sean> select value from v$parameter_valid_values where name = 'optimizer_features_enable';VALUE--------------------------------------------------------------------------------......9.2.0.810.1.010.1.0.310.1.0.410.1.0.510.2.0.110.2.0.210.2.0.310.2.0.410.2.0.511.1.0.611.1.0.711.2.0.111.2.0.211.2.0.311.2.0.411.2.0.4.132 rows selected.优化器在10g模式下的情形,列上没有not null约束,也没有显示写明连接列is not nullSCOTT@sean> select /*+ optimizer_features_enable('10.1.0') */ * from t_emp where deptno not in (select deptno from t_dept where dname='SALES');Execution Plan----------------------------------------------------------Plan hash value: 3716950600-----------------------------------------------------------------------------| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |-----------------------------------------------------------------------------|   0 | SELECT STATEMENT   |        |     9 |   342 |    12   (0)| 00:00:01 ||*  1 |  FILTER            |        |       |       |            |          ||   2 |   TABLE ACCESS FULL| T_EMP  |    14 |   532 |     3   (0)| 00:00:01 ||*  3 |   TABLE ACCESS FULL| T_DEPT |     1 |    13 |     3   (0)| 00:00:01 |-----------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------   1 - filter( NOT EXISTS (SELECT 0 FROM "T_DEPT" "T_DEPT" WHERE              "DNAME"='SALES' AND LNNVL("DEPTNO"<>:B1)))   3 - filter("DNAME"='SALES' AND LNNVL("DEPTNO"<>:B1))优化器在10g模式下的情形,写明连接列is not null--外层限制 not null,无效果SCOTT@sean> select /*+ optimizer_features_enable('10.1.0') */ * from t_emp where deptno not in (select deptno from t_dept where dname='SALES') and deptno is not null;Execution Plan----------------------------------------------------------Plan hash value: 3716950600-----------------------------------------------------------------------------| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |-----------------------------------------------------------------------------|   0 | SELECT STATEMENT   |        |     9 |   342 |    12   (0)| 00:00:01 ||*  1 |  FILTER            |        |       |       |            |          ||*  2 |   TABLE ACCESS FULL| T_EMP  |    14 |   532 |     3   (0)| 00:00:01 ||*  3 |   TABLE ACCESS FULL| T_DEPT |     1 |    13 |     3   (0)| 00:00:01 |-----------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------   1 - filter( NOT EXISTS (SELECT 0 FROM "T_DEPT" "T_DEPT" WHERE              "DNAME"='SALES' AND LNNVL("DEPTNO"<>:B1)))   2 - filter("DEPTNO" IS NOT NULL)   3 - filter("DNAME"='SALES' AND LNNVL("DEPTNO"<>:B1))--内层限制 not null,无效果SCOTT@sean> select /*+ optimizer_features_enable('10.1.0') */ * from t_emp where deptno not in (select deptno from t_dept where dname='SALES' and deptno is not null);Execution Plan----------------------------------------------------------Plan hash value: 3716950600-----------------------------------------------------------------------------| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |-----------------------------------------------------------------------------|   0 | SELECT STATEMENT   |        |     9 |   342 |    12   (0)| 00:00:01 ||*  1 |  FILTER            |        |       |       |            |          ||   2 |   TABLE ACCESS FULL| T_EMP  |    14 |   532 |     3   (0)| 00:00:01 ||*  3 |   TABLE ACCESS FULL| T_DEPT |     1 |    13 |     3   (0)| 00:00:01 |-----------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------   1 - filter( NOT EXISTS (SELECT 0 FROM "T_DEPT" "T_DEPT" WHERE              "DNAME"='SALES' AND "DEPTNO" IS NOT NULL AND LNNVL("DEPTNO"<>:B1)))   3 - filter("DNAME"='SALES' AND "DEPTNO" IS NOT NULL AND              LNNVL("DEPTNO"<>:B1))--内外层限制 not nullSCOTT@sean> select /*+ optimizer_features_enable('10.1.0') */ * from t_emp where deptno not in (select deptno from t_dept where dname='SALES' and deptno is not null);Execution Plan----------------------------------------------------------Plan hash value: 3716950600-----------------------------------------------------------------------------| Id  | Operation          | Name   | Rows  | Bytes | Cost (%CPU)| Time     |-----------------------------------------------------------------------------|   0 | SELECT STATEMENT   |        |     9 |   342 |    12   (0)| 00:00:01 ||*  1 |  FILTER            |        |       |       |            |          ||   2 |   TABLE ACCESS FULL| T_EMP  |    14 |   532 |     3   (0)| 00:00:01 ||*  3 |   TABLE ACCESS FULL| T_DEPT |     1 |    13 |     3   (0)| 00:00:01 |-----------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------   1 - filter( NOT EXISTS (SELECT 0 FROM "T_DEPT" "T_DEPT" WHERE              "DNAME"='SALES' AND "DEPTNO" IS NOT NULL AND LNNVL("DEPTNO"<>:B1)))   3 - filter("DNAME"='SALES' AND "DEPTNO" IS NOT NULL AND              LNNVL("DEPTNO"<>:B1))


七、注意事项

CBO会选择cost最小的执行计划,所以他会在合适的时候做适当的展开,我们需要注意的就是不要一些限制因素使得子查询无法展开。
比如:
子查询不能与or在一个查询块内,如:select ... from xxx where exists(subquery) or xxx
NOT IN/<>ALL由于对null敏感
......      

三、四、五参考博客 http://blog.itpub.net/31347199/viewspace-2120832/


原创粉丝点击