Hive查询

来源：互联网发布：格兰杰詹姆斯数据编辑：程序博客网时间：2024/05/01 15:44

1、Hive的查询与SQL类似，基本语法如下：

    SELECT [ALL | DISTINCT] select_expr, select_expr, ...     FROM table_reference     [WHERE where_condition]     [GROUP BY col_list]     [HAVING having_condition]     [CLUSTER BY col_list | [DISTRIBUTE BY col_list] [SORT BY col_list]]     [LIMIT number];

查询emp表中，salary>45000的员工信息：

    SELECT * FROM employee WHERE salary>45000;

执行结果如下：
这里写图片描述
2、通过python访问hive，进行数据查询，代码如下：

    # coding:utf-8    from pyhive import hive    from TCLIService.ttypes import TOperationState    # 打开hive连接    hiveConn = hive.connect(host='192.168.83.135',port=11111,username='hadoop',database='userdbbypy')    cursor = hiveConn.cursor()    sql = ''' SELECT * FROM emp WHERE salary>45000 '''    cursor.execute(sql, async=True)    # 得到执行语句的状态    status = cursor.poll().operationState    print "status:",status    for eid,ename,salary,destination,dept, in cursor.fetchall():        print eid,ename,salary,destination,dept    # 关闭hive连接    cursor.close()    hiveConn.close()

执行代码，结果如下：
这里写图片描述
3、Order By查询

    SELECT [ALL | DISTINCT] select_expr, select_expr, ...     FROM table_reference     [WHERE where_condition]     [GROUP BY col_list]     [HAVING having_condition]     [ORDER BY col_list]]     [LIMIT number];

查询emp表，并将salary按照由高到低排序：

    SELECT * FROM emp ORDER BY salary;

这里写图片描述
中间出了很多执行过程，不明所以。。
4、Group By查询
统计emp中工资一样的数量：

    SELECT count(*),salary from emp GROUP BY salary;

这里写图片描述

阅读全文

1 0