SQL语句级别的优化总结

来源：互联网发布：人工智能 04 哈工大编辑：程序博客网时间：2024/04/30 13:15

第一部分：sql级别的优化：
1、 SQL语句尽量都大写字母出现。
2、查询时，如果基表（from最后面的表）,数据库语句处理from后面的语句时，是从右侧往
左侧处理的，那么选择数据量最小的表作为基表，可加快查询速度，同时，如果三张表，中间的那个交叉表，作为基表。
例如：A，1万条， B，10万条， C，100万条数据。
关联条件： A.id=B.id and C.sid=B.sid此时，B就是交叉表。
查询时： select * from C,A,B where A.id=B.id and C.sid=B.sid (此时，B作为基表出现。)

3、where后面的条件顺序：连接条件放前面，而其他的条件放后面，由于sql从右侧往左侧执行，此时可以过滤掉大部分数据，
较少不必要的连接次数。
4、sql语句，尽量较少对数据库表的访问次数。
这里介绍两种：
（1）子查询时，没用到组函数的情况下，可以使用外连接做，这样可以较少对表访问次数。
例如：查询跟SMITH同一部门的员工信息：
select * from emp where deptno=(select deptno from emp where ename='SMITH') and ename!='SMITH'
（效率低下，访问多次表）

select d.* from emp e left join emp d on e.deptno=d.deptno where e.ename='SMITH' and d.ename!='SMITH'

select d.* from emp e , emp d where e.deptno=d.deptno and e.ename='SMITH' and d.ename!='SMITH'

(2) 一般用于统计时，用到了组函数，每次统计数据时都分了很多情况，此时可考虑用
case when then end来做。

例如：成绩表：score
编号学号成绩课程编号
coreNum stno scoreNum clno
create table score(corenum number(5), stno number(5),scorenum number(5),clno number(5))
查询及格率以及不及格率：


下面查询了一个不及格率：效率低下。
select a1/a2 from (select count(*) a1 from score where scoreNum<60 and clno=111) B1, (select count(distinct stno) a2 from score where clno=111) B2

下面效率高：（如果是每个课程的，那么直接加group by clno即可）
select round(sum(case when scorenum <60 then 1 else 0 end)/count(distinct stno),2), round(sum(case when scorenum >=60 then 1 else 0 end)/count(distinct stno),2 ) from score where clno=111

insert into score values(1,1001,40,111);
insert into score values(2,1001,60,112);
insert into score values(3,1001,80,113);

insert into score values(4,1002,100,111);
insert into score values(5,1002,40,112);
insert into score values(6,1002,80,113);

insert into score values(7,1003,50,111);
insert into score values(8,1003,40,112);
insert into score values(9,1003,66,113);

insert into score values(10,1004,51,111);
insert into score values(11,1004,43,112);
insert into score values(12,1004,69,113);

insert into score values(13,1005,51,111);
insert into score values(14,1005,43,112);
insert into score values(15,1005,69,113);

insert into score values(16,1006,54,111);
insert into score values(17,1006,90,112);
insert into score values(18,1006,69,113);

（3）如果是等值的多条件，选择，可以通过case when等效的函数，decode来做。
例如：
decode用法为： select decode(to_char(comm),null,'no　comm',0 ,'为0',10) from emp;

（4）使用where子句替换having
例如：这里可以用having，但是效率低，要改为where
select empno,sal from emp having sal> (select avg(sal) from emp ) group by sal,empno

例如：select empno,sal from emp having sal>300 group by sal,empno 查询员工工资大于300的。这样写跟where查询一样结果，但是效率低下。
select empno,sal from emp where sal>300

（5）适当的使用内部函数（PLSQL中的自定义函数），以及存储过程（数据库端的业务逻辑比较复杂的）。

（6）适当使用表的别名或者列别名。
（7）用exists替代 in或者not exists 替代 not in：
例如：查询跟SMITH同部门的员工信息。
select * from emp e where e.deptno in( select d.deptno from emp d where d.ename='SMITH') 最底下。

select * from emp e where exists(select 'x' from emp d where d.ename='SMITH' and e.deptno=d.deptno ) （最好）

select e.* from emp e left join emp d on e.deptno=d.deptno where d.ename='SMITH' （较好）

（8）使用索引：
经常被访问的列，并且这个列是非主键（或者外键），查询列数据范围跨度比较大，在行记录中出现次数比率低，这行记录中空值多。不是每个列都要加索引。

跟索引有关的列的操作：
（8.1）使用索引的列尽量不要使用not（条件），也尽量不要使用is null或者 is not null。
原因：使用not后，索引失效，数据库会将这列数据全扫描。

（8.2）使用索引的列尽量不要使用计算操作。例如：
select * from emp where sal*100>3000;（效率低）
select * from emp where sal>3000/100;(效率高)

（8.3）针对多个索引列出现在where子句的or条件下时：使用union比or效率高：（使用or时，这些列忽略了索引）
例如：假设emp表中sal和ename都使用了索引，并且语句中sal和ename都是or的条件。考虑用union不用or：
select * from emp where sal>3000 or ename like '%张%' （效率低下）
select * from emp where sal >3000 union select * from emp where ename like '%张%'（效率高）

（8.4）一个索引，加在一个表中多列上，此时如果where子句出现了这几列，此时起作用的索引列是靠前的列。
例如： create index b on(emp.ename,emp.sal,emp.comm)
select * from emp where e.sal >3000 and e.comm >200 and e.ename like '%S%' (此时，索引对ename起作用，对其他两列失效-这里我指的是全盘扫描数据)。
（8.5） order by后面使用索引列时，不要使用约束为null的列，这列索引失效。
（8.6）在对索引列使用to_char 或者to_number函数时，此时索引失效。
例如：select * from emp where empno='10001' 这种隐士转化，对索引无影响。
select * from emp where empno=to_number('10001') 对非索引列使用转换函数，对索引无影响。
select * from emp where to_char(empno)='10001' 此时，如果该empno是索引列，该索引失效。



（9）distinct 去重复效率低下：可以通过exists实现：
select distinct d.dname from dept d,emp e where d.deptno=e.deptno （效率低下）

select d.dname from dept d where exists(select 'x' from emp e where e.deptno=d.deptno) （效率高）

（10）从java角度出发：避免过多的使用string字段来 “+”号连接，拼接sql语句。
可以考虑通过StringBuffer 中append方法追加sql。

（11）使用>= 替代>
例如： select * from emp where deptno>=10(效率高，因为depto直接定位到10，效率高)
select * from emp where deptno>9(先定位到9 ，然后还要排除9 ，效率低)

（12）如果使用union或者union all了，此时如果不需要考虑去掉重复的数据，尽量不要使用union ，因为union默认是去重复的
在去重复过程中，也会影响效率。

（13）对group by子句的优化上面：
尽量在group by前面将数据过滤掉。
例如：
select job,avg(sal) from emp where job='CLERK' group by job （效率高）

select job,avg(sal) from emp group by job having job='CLERK' （效率低）

（14）使用视图：（数据量非常大的情况下）
经常被查询的列数据，并且这些数据不被经常的修改，删除。

第二部分：

Oracle面试题：
（1）行变列： case when then end；
empno ename hiredate
7499ALLEN1981/2/20
7521WARD1981/2/22
7566JONES1981/4/2
7654MARTIN1981/9/28
7698BLAKE1981/5/1
7782CLARK1981/6/9
7788SCOTT1987/4/19
7839KING1981/11/17
7844TURNER1981/9/8

查询结果要求：
empno ename 1981/1/1-1981/12/1 1982/1/1-1982/12/1 1987/1/1-1987/12/1
7499ALLEN   1981/2/20
7498SMITH   1982/4/20
7844TURNER   1981/9/8
7788SCOTT   1987/4/19


查询语句：
select empno,ename ,case when hiredate between '1-1月-1980' and '1-12月-1980' then hiredate else null end "1980/1/1-1980/12/1",
case when hiredate between '1-1月-1981' and '1-12月-1981' then hiredate else null end "1981/1/1-1981/12/1",
case when hiredate between '1-1月-1982' and '1-12月-1982' then hiredate else null end "1982/1/1-1982/12/1",
case when hiredate between '1-1月-1983' and '1-12月-1983' then hiredate else null end "1983/1/1-1983/12/1",
case when hiredate between '1-1月-1984' and '1-12月-1984' then hiredate else null end "1984/1/1-1984/12/1",
case when hiredate between '1-1月-1985' and '1-12月-1985' then hiredate else null end "1985/1/1-1985/12/1",
case when hiredate between '1-1月-1986' and '1-12月-1986' then hiredate else null end "1986/1/1-1986/12/1",
case when hiredate between '1-1月-1987' and '1-12月-1987' then hiredate else null end "1987/1/1-1987/12/1"

from emp

(2) 删除重复行：(效率最好的)
delete from emp e where e.rowid>(
select min(d.rowid) from emp d where e.empno=d.empno
)

问：delete 和trunc区别？
trunc是截断表，将表所有数据删除，释放表空间。
不可以回滚。
delete：删除表数据，不释放表空间，可以回滚。

（3）删除所有表中数据：（truncate-截断表比delete效率高）
truncate的特点是，释放表空间，事务不能回滚。

（4）数据库表语句的类型有哪些：
DML：数据库操纵语言：主要对数据库数据进行操作的。delete，update，select等。
DCL：数据库控制语言：主要有：commit，rollback，授权，用户等操作。
DDL：数据库定义语言：主要有：drop，alter （对表结构操作的）。

(5)数据库的三范式：
第一范式：数据库表中列，拆到不能拆为止。

例如：此时，电话列，存放固定电话号码也存移动电话，此时这一列使整体数据有冗余。
编号姓名电话
1001 zyg 15845689182
1001 zyg 010-67676767
修改方法：
编号姓名固定电话移动电话
1001 zyg 010-67676767 15845689182



第二范式：（满足第一范式前提下）较少非主键列之间的依赖关系，适当的使用外键（拆成多个表）

例如：学生信息表以及成绩，课程都设计在一个表中。（有冗余数据）
学号姓名课程成绩
1001 zyg 1 78
1001 zyg 2 90

修改为：成绩表：
学生表：课程表：序号成绩，课程编号，学号
学号姓名   课程课程名称 1 89 1 1001
1001 zyg   1 css 290 2 1001
   2 HTML

例如：下面满足一范式，二范式，不满足三范式：

学生表：课程表：序号成绩，课程编号，学号
学号姓名学院学院名称课程课程名称 1 89 1 1001
1001 zyg 1aaa 1 css 290 2 1001

学院信息表：
学院编号学院名称

第三范式：在满足一范式，二范式基础上，在表中不能含有另一张表的非主键列。
例如上面学生表中，出现的学院名称，在学员信息表中也存在，这是不满足第三范式的。

0 0