OVER(PARTITION BY)函数介绍

来源:互联网 发布:如何下载spss13软件 编辑:程序博客网 时间:2024/06/09 18:51
不错,学习了

原文地址:BY)函数介绍">OVER(PARTITION BY)函数介绍作者:
开窗函数          
    Oracle从8.1.6开始提供分析函数,分析函数用于计算基于组的某种聚合值,它和聚合函数的不同之处是:对于每个组返回多行,而聚合函数对于每个组只返回一行。

     开窗函数指定了分析函数工作的数据窗口大小,这个数据窗口大小可能会随着行的变化而变化,举例如下:
1:over后的写法:    
   over(order by salary)按照salary排序进行累计,order by是个默认的开窗函数
   over(partition bydeptno)按照部门分区

 

   over(partition by deptno orderby salary)

 

2:开窗的窗口范围
over(order by salary range between5 preceding and 5 following):窗口范围为当前行数据幅度减5加5后的范围内的。

举例:

 

--sum(s)over(orderby s range between 2 preceding and 2following) 表示加2或2的范围内的求和

 select name,class,s, sum(s)over(order by s rangebetween 2 preceding and 2 following) mm from t2
adf             45       45  --45加2减2即43到47,但是s在这个范围内只有45
asdf            55       55
cfe             74       74
3dd             78       158 --78在76到80范围内有78,80,求和得158
fda             80       158
gds             92       92
ffd             95       190
dss             95       190
ddd             99       198

gf              99       198

 

 

 

over(order by salary rows between5 preceding and 5 following):窗口范围为当前行前后各移动5行。

举例:

 

--sum(s)over(order by s rows between 2 preceding and 2following)表示在上下两行之间的范围内
select name,class,s, sum(s)over(order by s rows between 2 precedingand 2 following) mm from t2
adf             45       174  (45+55+74=174)
asdf            55       252   (45+55+74+78=252)
cfe             74       332   (74+55+45+78+80=332)
3dd             78       379   (78+74+55+80+92=379)
fda             80       419
gds             92       440
ffd             95       461
dss             95       480
ddd             99       388
gf              99       293

 

 

over(order by salary range betweenunbounded preceding and unbounded following)或者
over(order by salary rows betweenunbounded preceding and unbounded following):窗口不做限制

 

3、与over函数结合的几个函数介绍

row_number()over()、rank()over()和dense_rank()over()函数的使用

下面以班级成绩表t2来说明其应用

t2表信息如下:
cfe             74
dss             95
ffd             95
fda             80
gds             92
gf              99
ddd             99
adf             45
asdf            55
3dd             78

select *from                                                                      
                                                                             
    selectname,class,s,rank()over(partition by class order by s desc) mm fromt2
                                                                             
    wheremm=1;
得到的结果是:
dss             95       1
ffd             95       1
gds             92       1
gf              99       1
ddd             99       

注意:
   1.在求第一名成绩的时候,不能用row_number(),因为如果同班有两个并列第一,row_number()只返回一个结果;
select *from                                                                      
                                                                             
    selectname,class,s,row_number()over(partition by class order by s desc)mm from t2
                                                                             
    wheremm=1;
      95        --95有两名但是只显示一个
      92       1
      99       1 --99有两名但也只显示一个

   2.rank()和dense_rank()可以将所有的都查找出来:
如上可以看到采用rank可以将并列第一名的都查找出来;
    rank()和dense_rank()区别:
    --rank()是跳跃排序,有两个第二名时接下来就是第四名;
select name,class,s,rank()over(partition by class order by s desc)mm from t2
dss             95       1
ffd             95       1
fda             80       3 --直接就跳到了第三
gds             92       1
cfe             74       2
gf              99       1
ddd             99       1
3dd             78       3
asdf            55       4
adf             45       5
    --dense_rank()l是连续排序,有两个第二名时仍然跟着第三名
select name,class,s,dense_rank()over(partition by class order by sdesc) mm from t2
dss             95       1
ffd             95       1
fda             80       2 --连续排序(仍为2)
gds             92       1
cfe             74       2
gf              99       1
ddd             99       1
3dd             78       2
asdf            55       3
adf             45       4

--sum()over()的使用
select name,class,s, sum(s)over(partition by class order by s desc)mm from t2 --根据班级进行分数求和
dss             95       190  --由于两个95都是第一名,所以累加时是两个第一名的相加
ffd             95       190 
fda             80       270  --第一名加上第二名的
gds             92       92
cfe             74       166
gf              99       198
ddd             99       198
3dd             78       276
asdf            55       331
adf             45       376

first_value() over()和last_value() over()的使用 



--找出这三条电路每条电路的第一条记录类型和最后一条记录类型

SELECT opr_id,res_type,
      first_value(res_type) over(PARTITION BYopr_id ORDERBY res_type) low,
      last_value(res_type) over(PARTITION BYopr_id ORDERBY res_type rowsBETWEEN unbounded preceding AND unbounded following)high
  FROM rm_circuit_route
WHERE opr_id IN('000100190000000000021311','000100190000000000021355','000100190000000000021339')
 ORDER BY opr_id;

 

注:rowsBETWEEN unbounded preceding AND unboundedfollowing 的使用

--取last_value时不使用rowsBETWEEN unbounded preceding AND unboundedfollowing的结果

 

SELECT opr_id,res_type,
      first_value(res_type) over(PARTITION BY opr_id ORDER BY res_type)low,
      last_value(res_type) over(PARTITION BY opr_id ORDER BY res_type)high
  FROM rm_circuit_route
 WHERE opr_id IN('000100190000000000021311','000100190000000000021355','000100190000000000021339')
 ORDER BY opr_id;

如下图可以看到,如果不使用

rows BETWEEN unbounded preceding AND unboundedfollowing,取出的last_value由于与res_type进行进行排列,因此取出的电路的最后一行记录的类型就不是按照电路的范围提取了,而是以res_type为范围进行提取了。

 

 

 

 

 

在first_value和last_value中ignore nulls的使用
数据如下:

 

 

取出该电路的第一条记录,加上ignorenulls后,如果第一条是判断的那个字段是空的,则默认取下一条,结果如下所示:

 

 

--lag() over()函数用法(取出前n行数据)
lag(expresstion,<offset>,<default>)
with a as 
(select 1 id,'a' name from dual
 union
 select 2 id,'b' name from dual
 union
 select 3 id,'c' name from dual
 union
 select 4 id,'d' name from dual
 union
 select 5 id,'e' name from dual

select id,name,lag(id,1,'')over(orderby name) from a;

--lead() over()函数用法(取出后N行数据)

lead(expresstion,<offset>,<default>)
with a as 
(select 1 id,'a' name from dual
 union
 select 2 id,'b' name from dual
 union
 select 3 id,'c' name from dual
 union
 select 4 id,'d' name from dual
 union
 select 5 id,'e' name from dual

select id,name,lead(id,1,'')over(orderby name) from a;

--ratio_to_report(a)函数用法 Ratio_to_report() 括号中就是分子,over()括号中就是分母
with a as (select 1 a from dual
          union all
select 1 a from dual
          union  all
select 1 a from dual
          union all
select 2 a from dual
          union all 
select 3 a from dual
          union all
select 4 a from dual
          union all
select 4 a from dual
          union all
select 5 a from dual
          )
select a, ratio_to_report(a)over(partitionby a) b from a 
order by a; 

with a as (select 1 a from dual
          union all
select 1 a from dual
          union  all
select 1 a from dual
          union all
select 2 a from dual
          union all 
select 3 a from dual
          union all
select 4 a from dual
          union all
select 4 a from dual
          union all
select 5 a from dual
          )
select a, ratio_to_report(a)over() bfrom a --分母缺省就是整个占比
order by a; 

with a as (select 1 a from dual
          union all
select 1 a from dual
          union  all
select 1 a from dual
          union all
select 2 a from dual
          union all 
select 3 a from dual
          union all
select 4 a from dual
          union all
select 4 a from dual
          union all
select 5 a from dual
          )
select a, ratio_to_report(a)over()b from a
group by a order by a;--分组后的占比