6. Aggregate Functions

Source: Internet. Editor: 程序博客网. Time: 2024/05/22 08:29

Definition:

Aggregate functions are functions that take a collection (a set or multiset) of values as input and return a single value. SQL offers five built-in aggregate functions:

– Average: avg

– Minimum: min

– Maximum: max

– Total: sum

– Count: count

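By default, duplicates are retained. Add distinct to eliminate them before the aggregate is computed: for example, count(*) keeps duplicates, whereas count(distinct teaches.semester) counts each distinct semester value only once.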

E.g. “Find the total number of instructors who teach a course in the Spring 2010 semester.”

select count(distinct ID)

from teaches

where semester = 'Spring' and year = 2010;
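The effect of distinct in the query above can be checked with a small sketch. It uses Python's built-in sqlite3 module with an in-memory database; the teaches rows are made up for illustration and are not taken from the original notes.

```python
import sqlite3

# In-memory database with hypothetical sample rows (assumed data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table teaches (ID text, course_id text, semester text, year integer)"
)
conn.executemany(
    "insert into teaches values (?, ?, ?, ?)",
    [
        ("10101", "CS-101", "Spring", 2010),
        ("10101", "CS-315", "Spring", 2010),  # same instructor, second course
        ("45565", "CS-319", "Spring", 2010),
        ("83821", "CS-190", "Fall", 2009),    # filtered out by the where clause
    ],
)

# Without distinct, every matching tuple is counted.
with_duplicates = conn.execute(
    "select count(ID) from teaches where semester = 'Spring' and year = 2010"
).fetchone()[0]

# With distinct, each instructor ID is counted once.
without_duplicates = conn.execute(
    "select count(distinct ID) from teaches "
    "where semester = 'Spring' and year = 2010"
).fetchone()[0]

print(with_duplicates, without_duplicates)
```

With this sample data, instructor 10101 teaches two Spring 2010 courses, so the plain count is one higher than the distinct count.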

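We use the aggregate function count frequently to count the number of tuples in a relation. More generally, once tuples have been grouped, aggregate functions can compute a value for each group, such as the total (sum), the average (avg), or the number of tuples (count). Note that avg and sum apply only to numeric values such as int or double, not to strings and other non-numeric types, since a sum or average of strings has no meaning.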

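To count the tuples in a relation, use count(*). Note that count(*) cannot be combined with distinct. With max and min, distinct may be applied to a named attribute, for example max(distinct salary) or min(distinct salary).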

select count(*)

from course;

SQL does not allow the use of distinct with count(*). It is legal to use distinct with max and min, even though the result does not change.
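A minimal sketch of these two points, again using Python's sqlite3 module with a made-up course table. It relies on SQLite rejecting count(distinct *) as a syntax error; other database systems report the error in their own way.

```python
import sqlite3

# Hypothetical course rows (assumed data for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("create table course (course_id text, title text, credits integer)")
conn.executemany(
    "insert into course values (?, ?, ?)",
    [
        ("CS-101", "Intro. to Computer Science", 4),
        ("CS-315", "Robotics", 3),
        ("BIO-101", "Intro. to Biology", 4),
    ],
)

# count(*) counts tuples, duplicates included.
tuple_count = conn.execute("select count(*) from course").fetchone()[0]

# distinct is legal inside max, but it cannot change the maximum.
max_plain = conn.execute("select max(credits) from course").fetchone()[0]
max_distinct = conn.execute(
    "select max(distinct credits) from course"
).fetchone()[0]

# count(distinct *) is not valid SQL; SQLite raises an error when parsing it.
try:
    conn.execute("select count(distinct *) from course")
    rejected = False
except sqlite3.Error:
    rejected = True

print(tuple_count, max_plain, max_distinct, rejected)
```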

Grouping:

E.g. “Find the average salary in each department.”

select dept_name, avg (salary) as avg_salary

from instructor

group by dept_name;
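The grouping query can be tried out as follows. This is a sketch under assumed data: the instructor rows are invented, and Python's sqlite3 module stands in for whatever database the course uses.

```python
import sqlite3

# Hypothetical instructor rows (assumed data for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table instructor (ID text, name text, dept_name text, salary real)"
)
conn.executemany(
    "insert into instructor values (?, ?, ?, ?)",
    [
        ("10101", "Srinivasan", "Comp. Sci.", 65000),
        ("45565", "Katz", "Comp. Sci.", 75000),
        ("22222", "Einstein", "Physics", 95000),
        ("33456", "Gold", "Physics", 87000),
        ("15151", "Mozart", "Music", 40000),
    ],
)

# One output row per department; avg is computed within each group.
rows = conn.execute(
    "select dept_name, avg(salary) as avg_salary "
    "from instructor group by dept_name"
).fetchall()
avg_by_dept = dict(rows)

for dept_name, avg_salary in sorted(avg_by_dept.items()):
    print(dept_name, avg_salary)
```

Each department that has at least one instructor yields exactly one row, so the result here has three rows.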

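A group exists only if at least one tuple carries its grouping value, so a department with no instructors produces no group and does not appear in the result at all. Likewise, null salary values inside a group are skipped by avg: they change neither the numerator (the sum) nor the denominator (the count).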

A rule for group by:

Any attribute that appears in the select clause without being aggregated must also appear in the group by clause.

The having clause filters the groups produced by group by:

Try: Find the names and average salaries of all departments whose average salary is greater than 42000.

select dept_name, avg (salary)

from instructor

group by dept_name

having avg (salary) > 42000;
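To see that having is applied to whole groups rather than to individual tuples, here is a small sketch using Python's sqlite3 module. The instructor rows are assumed sample data, chosen so that one department falls below the threshold.

```python
import sqlite3

# Hypothetical instructor rows (assumed data for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table instructor (ID text, name text, dept_name text, salary real)"
)
conn.executemany(
    "insert into instructor values (?, ?, ?, ?)",
    [
        ("10101", "Srinivasan", "Comp. Sci.", 65000),
        ("45565", "Katz", "Comp. Sci.", 75000),
        ("22222", "Einstein", "Physics", 95000),
        ("33456", "Gold", "Physics", 87000),
        ("15151", "Mozart", "Music", 40000),
    ],
)

# Groups are formed first; having then keeps only the groups
# whose average salary exceeds 42000.
high_paying = conn.execute(
    "select dept_name, avg(salary) from instructor "
    "group by dept_name having avg(salary) > 42000 "
    "order by dept_name"
).fetchall()

print(high_paying)
```

The Music group has an average of 40000 and is dropped, while the other two departments remain. A where clause, by contrast, would filter individual tuples before the groups are formed.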

Note: all aggregate functions except count(*) ignore null values in their input collection.
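The null-handling rule can be illustrated with a short sketch, again with Python's sqlite3 module and invented rows. Two of the assumed instructors have no recorded salary (null, written as None in Python).

```python
import sqlite3

# Hypothetical instructor rows; None is stored as SQL null (assumed data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table instructor (ID text, name text, dept_name text, salary real)"
)
conn.executemany(
    "insert into instructor values (?, ?, ?, ?)",
    [
        ("1", "A", "Physics", 60000),
        ("2", "B", "Physics", 80000),
        ("3", "C", "Physics", None),  # null salary
        ("4", "D", "Music", None),    # the only Music salary is null
    ],
)

# count(*) counts tuples; count(salary) and avg(salary) skip nulls.
total_rows, salary_count, salary_avg = conn.execute(
    "select count(*), count(salary), avg(salary) "
    "from instructor where dept_name = 'Physics'"
).fetchone()

# When every input value is null, count yields 0 and sum yields null.
music_count, music_sum = conn.execute(
    "select count(salary), sum(salary) "
    "from instructor where dept_name = 'Music'"
).fetchone()

print(total_rows, salary_count, salary_avg, music_count, music_sum)
```

The Physics average is computed over the two non-null salaries only, which is why it is 70000 rather than a value divided by three.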