Hive Queries


1. Hive queries are similar to SQL. The basic syntax is:

    SELECT [ALL | DISTINCT] select_expr, select_expr, ...
    FROM table_reference
    [WHERE where_condition]
    [GROUP BY col_list]
    [HAVING having_condition]
    [CLUSTER BY col_list | [DISTRIBUTE BY col_list] [SORT BY col_list]]
    [LIMIT number];

Query the emp table for employees whose salary is greater than 45000:

    SELECT * FROM emp WHERE salary>45000;

The result is as follows:
[Image: query result]
2. Accessing Hive from Python to run the query. The code is as follows:

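    # coding:utf-8
    from pyhive import hive
    from TCLIService.ttypes import TOperationState

    # Open the Hive connection
    hiveConn = hive.connect(host='192.168.83.135', port=11111, username='hadoop', database='userdbbypy')
    cursor = hiveConn.cursor()
    sql = 'SELECT * FROM emp WHERE salary>45000'
    # Recent PyHive versions use async_ because async is a reserved word in Python 3.7+
    cursor.execute(sql, async_=True)
    # Poll the statement status until it is no longer initializing or running
    status = cursor.poll().operationState
    while status in (TOperationState.INITIALIZED_STATE, TOperationState.RUNNING_STATE):
        status = cursor.poll().operationState
    print("status:", status)
    for eid, ename, salary, destination, dept in cursor.fetchall():
        print(eid, ename, salary, destination, dept)
    # Close the Hive connection
    cursor.close()
    hiveConn.close()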

Running the code gives the following result:
[Image: program output]
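Unpacking each row into five variables only works while the column list stays the same. As a minimal sketch, the rows could instead be returned as dictionaries keyed by column name, using the standard DB-API `cursor.description` attribute. The helper name `fetch_dicts` and the parameter name `min_salary` are made up for illustration; the prefix stripping assumes Hive may report columns as `emp.salary`, which depends on the server configuration.

```python
def fetch_dicts(cursor, sql, params=None):
    """Execute sql on a DB-API cursor and return the rows as dicts keyed by column name."""
    cursor.execute(sql, params)
    # cursor.description holds one tuple per column; item 0 is the column name.
    # Hive may prefix the name with the table ("emp.salary"), so keep only the last part.
    names = [col[0].split(".")[-1] for col in cursor.description]
    return [dict(zip(names, row)) for row in cursor.fetchall()]


# Hypothetical usage with the PyHive cursor opened above (pyformat-style parameter):
# rows = fetch_dicts(cursor,
#                    "SELECT * FROM emp WHERE salary > %(min_salary)s",
#                    {"min_salary": 45000})
# for row in rows:
#     print(row["ename"], row["salary"])
```

Passing the threshold as a parameter rather than formatting it into the string lets the driver handle escaping.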
3. ORDER BY queries

    SELECT [ALL | DISTINCT] select_expr, select_expr, ...
    FROM table_reference
    [WHERE where_condition]
    [GROUP BY col_list]
    [HAVING having_condition]
    [ORDER BY col_list]
    [LIMIT number];

Query the emp table and sort by salary from high to low (ORDER BY sorts ascending by default, so DESC is required):

    SELECT * FROM emp ORDER BY salary DESC;

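[Image: query result]
The console prints many intermediate execution steps before the result. These are most likely the progress logs of the job that Hive launches to perform the global sort.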
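The sort direction is easy to check without a cluster. The sketch below uses an in-memory SQLite database as a stand-in for Hive, on the assumption that both follow the standard SQL rule that ORDER BY is ascending unless DESC is given. The sample rows are hypothetical and only three of the emp columns are modelled.

```python
import sqlite3

# Hypothetical sample rows; the actual contents of emp are not shown in the post.
ROWS = [
    (1201, "Gopal", 45000),
    (1202, "Manisha", 45000),
    (1203, "Khalil", 40000),
    (1204, "Kiran", 50000),
]


def query_salaries(sql):
    """Run sql against an in-memory SQLite copy of emp and return the first column."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE emp (eid INTEGER, ename TEXT, salary INTEGER)")
        conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", ROWS)
        return [row[0] for row in conn.execute(sql)]
    finally:
        conn.close()


ascending = query_salaries("SELECT salary FROM emp ORDER BY salary")
descending = query_salaries("SELECT salary FROM emp ORDER BY salary DESC")
print(ascending)   # lowest salary first
print(descending)  # highest salary first
```

SQLite sorts locally in one process, so it shows only the ordering semantics, not the distributed job that Hive runs.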
4. GROUP BY queries
Count how many employees in emp share each salary:

    SELECT count(*), salary FROM emp GROUP BY salary;

[Image: query result]
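To make the grouping concrete, here is a rough Python equivalent of what the query computes: one output row per distinct salary, with the number of rows that have that salary. The salary list is hypothetical, and `count_by_salary` is an illustrative name, not part of Hive or PyHive.

```python
from collections import Counter

# Hypothetical salary column; the post does not show the actual emp rows.
salaries = [45000, 45000, 40000, 50000, 40000]


def count_by_salary(values):
    """Return (count, salary) pairs, mirroring SELECT count(*), salary ... GROUP BY salary."""
    counts = Counter(values)
    # Sorted by salary only to make the output stable here;
    # Hive does not guarantee any row order without ORDER BY.
    return [(n, salary) for salary, n in sorted(counts.items())]


print(count_by_salary(salaries))
```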
