数据库优化

来源:互联网 发布:淘宝361官方旗舰店 编辑:程序博客网 时间:2024/06/05 06:15

前几天做了数据库优化的培训,现将培训的内容做整理,知识得自己加工一遍才会记忆深刻,有点疑惑的是,SQL优化属于DBA的工作,却是开发来培训的。
1、首先来看看如何查看sql执行效率

--环境构造set autotrace offDROP TABLE t1 CASCADE CONSTRAINTS PURGE; DROP TABLE t2 CASCADE CONSTRAINTS PURGE; CREATE TABLE t1 (id NUMBER NOT NULL,n NUMBER,contents VARCHAR2(4000)); CREATE TABLE t2 (id NUMBER NOT NULL,t1_id NUMBER NOT NULL,n NUMBER,contents VARCHAR2(4000)); execute dbms_random.seed(0); INSERT INTO t1     SELECT  rownum,  rownum, dbms_random.string('a', 50)       FROM dual     CONNECT BY level <= 1000      ORDER BY dbms_random.random; INSERT INTO t2 SELECT rownum, rownum, rownum, dbms_random.string('b', 50) FROM dual CONNECT BY level <= 100000    ORDER BY dbms_random.random; COMMIT; 
CREATE INDEX t1_n ON t1 (n);CREATE INDEX t2_t1_id ON t2(t1_id);

t1有1000条数据,在n上建立索引,t2有100000条数据,在t1_id上建立索引。
/*
下面我们将会用多种方法来查看如下语句的执行计划
SELECT *
FROM t1, t2
WHERE t1.id = t2.t1_id
AND t1.n in(18,19);

*/
—-方法1(explain plan for 的方式。类似PLSQL DEVELOPE里的F5)

/*  步骤1:explain plan for "你的SQL"  步骤2select * from table(dbms_xplan.display()); */set autotrace offset linesize 1000set pagesize 2000explain plan forSELECT  *FROM t1, t2WHERE t1.id = t2.t1_idAND t1.n in(18,19);select * from table(dbms_xplan.display());

执行结果:

SQL> select * from table(dbms_xplan.display());PLAN_TABLE_OUTPUT--------------------------------------------------------------------------------Plan hash value: 3373872153--------------------------------------------------------------------------------| Id  | Operation                              | Name     | Rows  | Bytes | Cost--------------------------------------------------------------------------------|   0 | SELECT STATEMENT                       |          |     2 |  8138 ||   1 |  NESTED LOOPS                          |          |     2 |  8138 ||   2 |   NESTED LOOPS                         |          |     2 |  8138 ||   3 |    INLIST ITERATOR                     |          |       |       ||   4 |     TABLE ACCESS BY INDEX ROWID BATCHED| T1       |     2 |  4056 ||*  5 |      INDEX RANGE SCAN                  | T1_N     |     4 |       ||*  6 |    INDEX RANGE SCAN                    | T2_T1_ID |     1 |       ||   7 |   TABLE ACCESS BY INDEX ROWID          | T2       |     1 |  2041 |--------------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------   5 - access("T1"."N"=18 OR "T1"."N"=19)   6 - access("T1"."ID"="T2"."T1_ID")Note-----   - dynamic statistics used: dynamic sampling (level=2)24 rows selected

/*
优点: 无需真正执行,快捷方便

缺陷: 1.没有输出运行时的相关统计信息(产生多少逻辑读,多少次递归调用,多少次物理读的情况);
2.可能不是真实的执行计划
*/
—-方法2(set autotrace on 方式)
/*
步骤1:set autotrace on
步骤2:在此处执行你的SQL,后续会有结果输出

另,有如下几种方式:
set autotrace on (得到执行计划,输出运行结果)
set autotrace traceonly (得到执行计划,不输出运行结果)
set autotrace traceonly explain (得到执行计划,不输出运行结果和统计信息部分,仅展现执行计划部分)
set autotrace traceonl statistics(不输出运行结果和执行计划部分,仅展现统计信息部分)
*/

set autotrace on SELECT  *FROM t1, t2WHERE t1.id = t2.t1_idAND t1.n in(18,19);

/*
–优点:可以输出运行时的相关统计信息(产生多少逻辑读,多少次递归调用,多少次物理读的情况);

–缺陷:必须要等到语句真正执行完毕后,才可以出结果,可以用traceonly开关来控制返回结果不打屏输出;

*/
—-方法3(知道sql_id后,直接带入的方式,简单,就步骤1)

/*
步骤1:得到sql_id (session,awr)

步骤2: select * from table(dbms_xplan.display_cursor(‘&sq_id’)); (该方法是从共享池里得到)

*/

set autotrace offSELECT sq.sql_id  FROM v$sql                    sq,       v$active_session_history s WHERE s.sql_id = sq.sql_id       AND sq.sql_fulltext LIKE '%AND t1.n in(18,19)%' ORDER BY s.sample_time DESC;  --显示执行效率 select * from table(dbms_xplan.display_cursor('ds48yh2v179kx'));

/*
–优点:1.知道sql_id立即可得到执行计划;
2.可以得到真实的执行计划。

–缺陷 没有输出运行时的相关统计信息(产生多少逻辑读,多少次递归调用,多少次物理读的情况);
*/

2、count的说明
drop table t purge;
create table t as select * from dba_objects;
update t set object_id =rownum ;
set timing on
set linesize 1000
set autotrace on

–Oracle有数据缓冲池和共享池,在比较性能的时候多执行几次,让执行过程稳定了再作比较。
select count(*) from t;
/
select count(object_id) from t;
/

–看来count(列)比count(*)更快是谣传,明明是一样快,真相是这样吗?
–建个索引看看
create index idx_object_id on t(object_id);
select count(*) from t;
/
select count(object_id) from t;
/

–看来真的是用COUNT(列)比COUNT()要快啊,因为COUNT()不能用到索引,而COUNT(列)可以?

alter table T modify object_id not null;
select count(*) from t;
/
select count(object_id) from t;
/
–看来count(列)和count()其实一样快,如果索引列是非空的,count()可用到索引,此时一样快!

–其实两者根本没有可比性,性能比较首先要考虑写法等价,这两个语句根本就不等价!
alter table T modify object_id null;
update t set object_id =null where rownum<=2;
set autotrace off

select count(*) from t;
/
select count(object_id) from t;
/
3、in和exist之争
drop table emp_test purge;
drop table dept_test purge;
create table emp_test as select * from emp;
create table dept_test as select * from dept;
set timing on
set linesize 1000
set autotrace traceonly explain

–NOT IN
select * from dept_test where deptno NOT IN ( select deptno from emp_test );
select * from dept_test where not exists ( select deptno from emp_test where emp_test.deptno=dept_test.deptno);
select * from dept_test where deptno NOT IN ( select deptno from emp_test where deptno is not null) and deptno is not null;
/一般来说,anti的反连接算法比filter更高效,但是在10g时,Oracle的这个算法不完善,必须要制定非空,才可以让not in 用anti算法。在11g的时候,这个情况已经改变了,无论not in 还是not exists,无论是否列为空,都可以走到Oracle比较先进高效的anti反连接算法/

–IN
select * from dept_test where deptno IN ( select deptno from emp_test );
select * from dept_test where exists ( select deptno from emp_test where emp_test.deptno=dept_test.deptno);
4、表的链接顺序
drop table tab_big;
drop table tab_small;
create table tab_big as select * from dba_objects where rownum<=30000;
create table tab_small as select * from dba_objects where rownum<=10;
set autotrace traceonly
set linesize 1000
set timing on
select count(*) from tab_big,tab_small;
select count(*) from tab_small,tab_big;

—其实任何谣言,都是有一定的影子的。看看下面的语句,比较一下性能。
select /+rule/ count(*) from tab_big,tab_small ;
select /+rule/ count(*) from tab_small,tab_big ;

select /+cost/ count(*) from tab_big,tab_small ;
select /+cost/ count(*) from tab_small,tab_big ;
/*结论:原来表连接顺序的说法早就过时了,那是基于规则的时代,现在我们是基于代价的。
RBO: Rule-Based Optimization 基于规则的优化器
CBO: Cost-Based Optimization 基于代价的优化器
在RBO中,SQL的写法往往会影响执行计划,CBO则不会。*/
5、索引影响效率
建立索引查询的效率提高了,但是插入的效率降低了
–构造表
drop table test1 purge;
drop table test2 purge;
drop table test3 purge;
drop table t purge;
create table t as select * from dba_objects;
create table test1 as select * from t;
create table test2 as select * from t;
create table test3 as select * from t;
–构造索引
create index test2_n1 on test2(status);
create index test2_n2 on test2(object_id);
create index test3_n1 on test3(owner);
create index test3_n2 on test3(object_name);
create index test3_n3 on test3(data_object_id);
create index test3_n4 on test3(created);
create index test3_n5 on test3(last_ddl_time);
create index test3_n6 on test3(status);
–开启时间
set timing on
set autotrace off;

select count(1) from dba_objects;

–语句1(test1表无索引)
insert into test1 select * from t;
commit;
–语句2(test2表有2个索引)
insert into test2 select * from t;
commit;
–语句3(test3表有6个索引)
insert into test3 select * from t;
commit;
6、避免SQL中调用函数
drop table people purge;
drop table sex purge;
create table people (first_name varchar2(200),last_name varchar2(200),sex_id number);
create table sex (name varchar2(20), sex_id number);
insert into people (first_name,last_name,sex_id) select object_name,object_type,1 from dba_objects;
insert into sex (name,sex_id) values (‘男’,1);
insert into sex (name,sex_id) values (‘女’,2);
insert into sex (name,sex_id) values (‘不详’,3);
commit;

create or replace function get_sex_name(p_id sex.sex_id%type) return sex.name%type is
v_name sex.name%type;
begin
select name
into v_name
from sex
where sex_id=p_id;
return v_name;
end;
/

–以下两种写法是等价的,都是为了查询people 表信息,同时通过sex 表,获取人员的性别信息。
–但是通过autotrace 比较观察发现两种写法性能上存在巨大差异
set autotrace traceonly statistics
select sex_id,
first_name||’ ‘||last_name full_name,
get_sex_name(sex_id) gender
from people;

select p.sex_id,
p.first_name||’ ‘||p.last_name full_name,
s.name
from people p, sex s
where s.sex_id=p.sex_id;

select p.sex_id,
p.first_name||’ ‘||p.last_name full_name,
(select s.name from sex s where s.sex_id = p.sex_id) as gender
from people p;

–标量子查询缓存
select sex_id,
first_name||’ ‘||last_name full_name,
(select get_sex_name(sex_id) from dual) gender
from people where sex_id<50;
6、警惕字段类型转化
drop table t_col_type purge;
create table t_col_type(id varchar2(30),col2 varchar2(30),col3 varchar2(30));
insert into t_col_type select rownum,’abc’,’efg’ from dual connect by level<=10000;
commit;
create index idx_id on t_col_type(id);
set linesize 1000
set autotrace traceonly

–错误的类型
select * from t_col_type where id = 6;
select * from t_col_type where to_number(id) = 6;
–实际上只有如下写法才可以用到索引,是什么类型的取值就设置什么样的字段。
select * from t_col_type where id = ‘6’;

–做查询时候不要对列做函数计算,如果有必要,对查询条件做转换
select COUNT(*) from pur_headers_all where to_char(creation_date, ‘yyyymmdd’) >= ‘20161219’;
select COUNT(*) from pur_headers_all where creation_date >= to_date(‘20161219’, ‘yyyymmdd’);


–关于nvl
update t_col_type t set t.id = null where rownum <= 3;
commit;
exec dbms_stats.gather_table_stats(‘MYDEV’,’T_COL_TYPE’,cascade=>true);
select * from t_col_type where nvl(id, ‘-1’) = ‘-1’;

–查询条件is null / is not null
select * from t_col_type where id is null;
select * from t_col_type where id is not null;

–is not null在选择率小的情况下,可能走索引。
update t_col_type t set t.id = null where rownum <=9900;
commit;
exec dbms_stats.gather_table_stats(‘MYDEV’,’T_COL_TYPE’,cascade=>true);
select * from t_col_type where id is not null;

–结论:在设计表字段的时候,尽量把字段设置为非空,可以有默认值的就设置默认值,以免在查询时用nvl或者is null做过滤,效率太低。

—以上很多实践在12c上是差别不大的,这块还有待深究,培训的内容不一定是对的。

原创粉丝点击