对Group By 语句的一次优化过程(hellodba)
来源:互联网 发布:淘宝宝贝上架时间技巧 编辑:程序博客网 时间:2024/06/06 14:14
生产环境中发现一条语句很慢,拿回来一看,其实是一个简单的Group By语句:
表CCMMT的数据量比较大,5M多条记录。
1、SQL> select CDE, CID 2 from CCMMT 3 GROUP BY CDE, CID 4 having max(ADT) < sysdate - 180; 707924 rows selected. Elapsed: 00:06:17.49 Execution Plan---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=414 Card=238583 Bytes=4771660) 1 0 FILTER 2 1 SORT (GROUP BY NOSORT) (Cost=414 Card=238583 Bytes=4771660) 3 2 TABLE ACCESS (BY INDEX ROWID) OF 'CCMMT' (Cost=414 Card=57969096 Bytes=1159381920) 4 3 INDEX (FULL SCAN) OF 'CCMMT_TEMP_IDX' (NON-UNIQUE) (Cost=26 Card=57969096) Statistics---------------------------------------------------------- 0 recursive calls 0 db block gets 2769177 consistent gets 1089991 physical reads 0 redo size 23926954 bytes sent via SQL*Net to client 519785 bytes received via SQL*Net from client 47196 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 707924 rows processed
要6min多返回。尝试调整语句写法,用minus代替Group By:
2、
SQL> select DISTINCT CDE, CID 2 from CCMMT 3 where ADT < sysdate - 180 4 minus 5 select DISTINCT CDE, CID 6 from CCMMT 7 where ADT >= sysdate - 180; 707924 rows selected. Elapsed: 00:00:21.53 Execution Plan---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=190624 Card=2794940 Bytes=111797600) 1 0 MINUS 2 1 SORT* (UNIQUE) (Cost=95312 Card=2794940 Bytes=55898800) :Q13049001 3 2 INDEX* (FAST FULL SCAN) OF 'CCMMT_UQ1' (UNIQUE) (Cost=77305 Card=2898455 Bytes=57969100) :Q13049000 4 1 SORT* (UNIQUE) (Cost=95312 Card=2794940 Bytes=55898800) :Q13050001 5 4 INDEX* (FAST FULL SCAN) OF 'CCMMT_UQ1' (UNIQUE) (Cost=77305 Card=2898455 Bytes=57969100) :Q13050000 2 PARALLEL_TO_SERIAL SELECT DISTINCT C0 C0,C1 C1 FROM :Q13049000 ORDER BY C0,C1 3 PARALLEL_TO_PARALLEL SELECT /*+ INDEX_RRS(A1 "CCMMT_UQ1")*/ A1."CDE" C0,A1."CA 4 PARALLEL_TO_SERIAL SELECT DISTINCT C0 C0,C1 C1 FROM :Q13050000 ORDER BY C0,C1 5 PARALLEL_TO_PARALLEL SELECT /*+ INDEX_RRS(A1 "CCMMT_UQ1")*/ A1."CDE" C0,A1."CA Statistics---------------------------------------------------------- 0 recursive calls 33 db block gets 126566 consistent gets 129243 physical reads 0 redo size 18461368 bytes sent via SQL*Net to client 519785 bytes received via SQL*Net from client 47196 SQL*Net roundtrips to/from client 4 sorts (memory) 2 sorts (disk) 707924 rows processed
效果不错,Consistent gets 和 Physical Reads都下降了,同时只需要21s就返回了。但从查询计划看,用到了并行查询,因此会消耗更多的CPU。
在(ADT, CDE, CID )上创建索引,再次执行:
3、
SQL> select DISTINCT CDE, CID 2 from CCMMT 3 where ADT < sysdate - 180 4 minus 5 select DISTINCT CDE, CID 6 from CCMMT 7 where ADT >= sysdate - 180; 707924 rows selected. Elapsed: 00:00:26.94 Execution Plan---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=36018 Card=2794940 Bytes=111797600) 1 0 MINUS 2 1 SORT (UNIQUE) (Cost=18009 Card=2794940 Bytes=55898800) 3 2 INDEX (RANGE SCAN) OF 'CCMMT_IDX3' (NON-UNIQUE) (Cost=2 Card=2898455 Bytes=57969100) 4 1 SORT (UNIQUE) (Cost=18009 Card=2794940 Bytes=55898800) 5 4 INDEX (RANGE SCAN) OF 'CCMMT_IDX3' (NON-UNIQUE) (Cost=2 Card=2898455 Bytes=57969100) Statistics---------------------------------------------------------- 0 recursive calls 118 db block gets 22565 consistent gets 31604 physical reads 0 redo size 18461368 bytes sent via SQL*Net to client 519785 bytes received via SQL*Net from client 47196 SQL*Net roundtrips to/from client 1 sorts (memory) 1 sorts (disk) 707924 rows processed
效果也比较理想,consistent gets和physical reads再次大大下降,返回时间和上面差不多,在一个数量级上,但是不再使用并行查询了。
用NOT Exists代替minus:
4、
SQL> select DISTINCT CDE, CID 2 from CCMMT a 3 where ADT < sysdate - 180 4 AND NOT EXISTS 5 (SELECT CDE, CID FROM 6 (select DISTINCT CDE, CID 7 from CCMMT 8 where ADT >= sysdate - 180) b 9 WHERE a.CDE = b.CDE 10 AND a.CID = b.CID); 707924 rows selected. Elapsed: 00:10:35.70 Execution Plan---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=600 Card=144923 Bytes=2898460) 1 0 SORT (UNIQUE) (Cost=600 Card=144923 Bytes=2898460) 2 1 INDEX (RANGE SCAN) OF 'CCMMT_IDX3' (NON-UNIQUE)(Cost=2 Card=144923 Bytes=2898460) 3 2 TABLE ACCESS (BY INDEX ROWID) OF 'CCMMT' (Cost=2 Card=1 Bytes=20) 4 3 INDEX (RANGE SCAN) OF 'CCMMT_TEMP_IDX' (NON-UNIQUE) (Cost=1 Card=9) Statistics---------------------------------------------------------- 5 recursive calls 118 db block gets 40535587 consistent gets 3157604 physical reads 0 redo size 18461368 bytes sent via SQL*Net to client 519785 bytes received via SQL*Net from client 47196 SQL*Net roundtrips to/from client 2 sorts (memory) 1 sorts (disk) 707924 rows processed FT! consistent gets和physical reads爆涨,10min才返回结果!用Not In换掉Not Exists:
5、
SQL> select DISTINCT CDE, CID 2 from CCMMT a 3 where ADT < sysdate - 180 4 AND (CDE, CID) NOT IN 5 (select DISTINCT CDE, CID 6 from CCMMT 7 where ADT >= sysdate - 180); 707924 rows selected. Elapsed: 00:01:00.70 Execution Plan---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=36425 Card=1 Bytes=40) 1 0 SORT (UNIQUE NOSORT) (Cost=36425 Card=1 Bytes=40) 2 1 MERGE JOIN (ANTI) (Cost=36423 Card=1 Bytes=40) 3 2 SORT (JOIN) (Cost=18212 Card=2898455 Bytes=57969100) 4 3 INDEX (RANGE SCAN) OF 'CCMMT_IDX3' (NON-UNIQUE) (Cost=2 Card=2898455 Bytes=57969100) 5 2 SORT (UNIQUE) (Cost=18212 Card=2898455 Bytes=57969100) 6 5 INDEX (RANGE SCAN) OF 'CCMMT_IDX3' (NON-UNIQUE) (Cost=2 Card=2898455 Bytes=57969100) Statistics---------------------------------------------------------- 0 recursive calls 419 db block gets 22565 consistent gets 98692 physical reads 0 redo size 18461368 bytes sent via SQL*Net to client 519785 bytes received via SQL*Net from client 47196 SQL*Net roundtrips to/from client 1 sorts (memory) 1 sorts (disk) 707924 rows processed
恩,consistent gets和建了索引时的minus方式一样,但是physical reads太大,返回时间太长---1min。同时用到了刚才建的索引。(呵呵,所以说,NOT EXISTS并不是什么情况下都比NOT IN更优啊)
在尝试用left join + is null代替not in:
6、
SQL> SELECT a.CDE, a.CID 2 FROM 3 (select DISTINCT CDE, CID 4 from CCMMT 5 where ADT < sysdate - 180) a, 6 (select DISTINCT CDE, CID 7 from CCMMT 8 where ADT >= sysdate - 180) b 9 WHERE a.CDE = b.CDE(+) 10 AND a.CID = b.CID(+) 11 AND b.CDE IS NULL; 707924 rows selected. Elapsed: 00:00:25.46 Execution Plan---------------------------------------------------------- 0 SELECT STATEMENT Optimizer=CHOOSE (Cost=54675 Card=2794940 Bytes=117387480) 1 0 FILTER 2 1 MERGE JOIN (OUTER) 3 2 VIEW (Cost=18009 Card=2794940 Bytes=58693740) 4 3 SORT (UNIQUE) (Cost=18009 Card=2794940 Bytes=55898800) 5 4 INDEX (RANGE SCAN) OF 'CCMMT_IDX3' (NON-UNIQUE) (Cost=2 Card=2898455 Bytes=57969100) 6 2 SORT (JOIN) (Cost=36667 Card=2794940 Bytes=58693740) 7 6 VIEW (Cost=18009 Card=2794940 Bytes=58693740) 8 7 SORT (UNIQUE) (Cost=18009 Card=2794940 Bytes=55898800) 9 8 INDEX (RANGE SCAN) OF 'CCMMT_IDX3' (NON-UNIQUE) (Cost=2 Card=2898455 Bytes=57969100) Statistics---------------------------------------------------------- 10 recursive calls 118 db block gets 22569 consistent gets 31300 physical reads 0 redo size 18461368 bytes sent via SQL*Net to client 519785 bytes received via SQL*Net from client 47196 SQL*Net roundtrips to/from client 6 sorts (memory) 1 sorts (disk) 707924 rows processed
效果不错,和有索引时使用minus在同一数量级上。
总结,以上几种方式中,效果最好的应该是第3种和第6种,buffer gets、磁盘IO和CPU消耗都比较少,返回时间大大减少,但是需要新建一个索引,消耗更多磁盘空间,并存在影响其它语句的正常查询计划的风险。而第2种方式应该是次好的。在返回时间上,和上面两种差不多,不需要新的索引,但是会消耗更多的内存、磁盘和CPU资源。
出于综合考虑,采用了第2种方式对生产库进行了优化。
- 对Group By 语句的一次优化过程(hellodba)
- 对Group By 语句的一次优化过程
- 对Group By 语句的一次优化过程
- 对Group By 语句的一次优化过程
- 对Group By 语句的一次优化过程
- 对Group By 语句的一次优化过程
- 对Group By 语句的一次优化过程
- 优化Group By语句
- Mysql group by语句的优化
- Mysql group by语句的优化
- 含join,order by,group by的复杂语句优化
- MySQL Group By的优化
- Vertica的这些事<十>—— vertica中group by 和join 语句的优化
- 关于having,group by 的Sql 语句
- SQL group by语句的使用
- 数据库 GROUP BY 语句的操作方式
- oracle的group by语句注意
- 关于having,group by 的Sql 语句
- 【好文】傅盛:如何快慢“炼”金山?
- 从360出走的傅盛,正在Copy一个傅氏的360帝国
- androidmanifest.xml权限中文说明
- Android 学习笔记四(开发环境搭建)
- 消除通过重写Object的equals方法重复的对象
- 对Group By 语句的一次优化过程(hellodba)
- c# arrayList LinkedList Dictionary
- 递归算法,及其在树形结构中的应用
- freemarker入门 list指令
- speex 基本编码/解码流程说明
- [知识库分享系列] 一、开篇
- Android的AlertDialog详解
- SQLite介绍及使用
- 也许能启发你的素材(不断更新)