Source: Internet  Published: java char  Editor: 程序博客网  Date: 2024/05/29 23:47

Hive

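Hive is a data warehouse tool built on Hadoop. It sits on top of Hadoop's HDFS and MapReduce and is used to manage and query structured and unstructured data. It can map a structured data file to a database table and provides SQL-like queries.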

1. Create a table
create table city(
province_code INT,
province_name string,
city_code INT,
city_name string
)
row format delimited
fields terminated by ','
lines terminated by '\n';
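
The row format above describes how each line of a plain text file is split into columns. As a rough sketch, assuming a hypothetical local file /tmp/city.txt (the path and the sample rows in the comments are made up for illustration and are not from the original data), loading and reading it might look like this:

```sql
-- Hypothetical contents of /tmp/city.txt: one row per line, comma-separated fields
--   1,ProvinceA,101,CityA
--   1,ProvinceA,102,CityB
--   2,ProvinceB,201,CityC

-- Copy the local file into the table's storage directory
load data local inpath '/tmp/city.txt' into table city;

-- Each comma-separated field is now readable as a typed column
select province_name, city_name from city where province_code = 1;
```

Hive applies the schema when the file is read, so a field that cannot be parsed as the declared type (for example text in an INT column) comes back as NULL instead of failing the load.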
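2. View table information
show create table city;
3. View table contents
select * from city limit 10;
4. The 10 rows with the largest city_code
select * from city order by city_code desc limit 10;
5. Provinces ranked by number of cities, most first
select province_name, count(city_name) as city_cnt from city group by province_name order by city_cnt desc limit 10;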
6. Count distinct values
select count(distinct province_name) from city;

-- equivalent form: deduplicate with group by, then count the groups
select count(*) from (select province_name from city group by province_name) a;
7. Provinces with only one city
select province_name, city_cnt from (
select province_name, count(*) as city_cnt
from city group by province_name) a where city_cnt = 1;

-- same result with having instead of a subquery
select province_name, count(*) as city_cnt
from city group by province_name having count(*) = 1;
8. Managed (internal) and external tables
External table:
create external table city_ex
(province_code int,
province_name string,
city_code int,
city_name string
)
row format delimited fields terminated by ','
lines terminated by '\n'
location '/user/hdfs/lyy/city';
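Managed (internal) table:
-- user is a reserved keyword in Hive 1.2.0 and later, so it is quoted with backticks
create table `user`(
uid INT,
city_code INT,
model string,
access string
)
row format delimited
fields terminated by ','
lines terminated by '\n';
Load a local file into the table:
load data local inpath '/home/bigdata/hive/user.txt' into table `user`;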
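
The practical difference between the two kinds of table in section 8 shows up when a table is dropped. A minimal sketch, assuming two hypothetical scratch tables (tmp_managed, tmp_external) and a hypothetical HDFS path, none of which come from the original notes:

```sql
-- Hypothetical scratch tables used only for this comparison
create table tmp_managed (id int);
create external table tmp_external (id int)
location '/tmp/hive_demo/tmp_external';

-- The "Table Type" row reads MANAGED_TABLE or EXTERNAL_TABLE
describe formatted tmp_managed;
describe formatted tmp_external;

-- Managed table: removes the metadata and the table's data directory
drop table tmp_managed;

-- External table: removes only the metadata;
-- files under /tmp/hive_demo/tmp_external stay in HDFS
drop table tmp_external;
```

This is why an external table is the usual choice when other jobs also read the files at that location.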

Compute a proportion (matching rows divided by the total):
select sum(if(access='2G',1,0))/count(1) from `user`;
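
The query above gives the share of a single access type. As a sketch of one way to get the share of every access type in a single pass, assuming a Hive version with window functions (0.11 or later):

```sql
-- Count rows per access type, then divide each count by the grand total.
-- sum(cnt) over () is a window aggregate over all rows of the subquery.
select access,
       cnt,
       cnt / sum(cnt) over () as ratio
from (
  select access, count(*) as cnt
  from `user`
  group by access
) a;
```

The ratios of all rows add up to 1, which makes a quick sanity check on the result.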

Bucket rows by a condition and count each bucket:
-- interval is a reserved keyword in Hive 1.2.0 and later, so the alias is uid_range
select
case
when uid % 10 in (0, 1, 2, 3) then '0-3'
when uid % 10 in (4, 5, 6, 7) then '4-7'
else '8-9'
end as uid_range,
count(*) as cnt
from `user`
group by
case
when uid % 10 in (0, 1, 2, 3) then '0-3'
when uid % 10 in (4, 5, 6, 7) then '4-7'
else '8-9'
end;
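
The case expression above has to be written twice, once in select and once in group by. One possible way to avoid the repetition, sketched here with a hypothetical alias uid_range, is to compute the bucket in a subquery and group by the alias:

```sql
-- Compute the bucket label once, then aggregate on it
select uid_range, count(*) as cnt
from (
  select case
           when uid % 10 in (0, 1, 2, 3) then '0-3'
           when uid % 10 in (4, 5, 6, 7) then '4-7'
           else '8-9'
         end as uid_range
  from `user`
) t
group by uid_range;
```

Keeping the bucket rule in one place means a later change to the ranges cannot leave select and group by out of sync.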

Collect values into an array:
Deduplicated: select collect_set(access) from `user`;
With duplicates: select collect_list(access) from `user`;
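
These two functions are aggregates, so they are more often used per group than over the whole table. A small sketch, assuming the same `user` table, that lists the distinct access types seen in each city:

```sql
-- One row per city: the distinct access types as an array,
-- the same values joined into a comma-separated string,
-- and the number of distinct access types
select city_code,
       collect_set(access)                 as access_set,
       concat_ws(',', collect_set(access)) as access_csv,
       size(collect_set(access))           as access_kinds
from `user`
group by city_code;
```

The order of elements inside the array is not guaranteed, so the comma-separated string should not be compared across runs without sorting it first.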
Joining tables (a full outer join in this example):
select u.uid, u.city_code, c.city_name
from
(select * from `user` where uid <= 100) u
full join
(select * from city where province_code <= 30) c
on (u.city_code = c.city_code)
limit 20;
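
A full join keeps unmatched rows from both sides. For comparison, here is a sketch of a left join over the same two tables, which keeps every user and can be used to find users whose city_code has no match in city:

```sql
-- Left join: every row of `user` is kept; city columns are NULL when nothing matches
select u.uid, u.city_code, c.city_name
from `user` u
left join city c
on (u.city_code = c.city_code)
where c.city_code is null   -- keep only the users without a matching city
limit 20;
```

Dropping the where clause returns all users, with city_name filled in wherever a match exists.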

Top N per group (here N = 1: the row with the largest city_code for each access type):
select access, city_code, uid
from
(
select uid, access, city_code,
row_number() over (partition by access order by city_code desc) as row_num
from `user`
) a
where row_num = 1;
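
The same pattern generalizes to any N by changing the filter on row_num. A sketch for the top 3 rows per access type:

```sql
-- row_number() numbers the rows 1, 2, 3, ... inside each access group,
-- so keeping row_num <= 3 yields at most three rows per group
select access, city_code, uid, row_num
from (
  select uid, access, city_code,
         row_number() over (partition by access order by city_code desc) as row_num
  from `user`
) a
where row_num <= 3;
```

row_number() breaks ties arbitrarily. If rows with equal city_code should share a position, rank() or dense_rank() can be used in its place, at the cost of possibly returning more than N rows per group.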

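Running total by date:
select p_date,
sum(cnt) over (order by p_date asc rows between unbounded preceding and current row) as running_cnt
from (
select p_date, count(*) as cnt
from user_daily
where p_date between '2017-09-01' and '2017-09-30'
group by p_date
) a;
Here unbounded preceding means the window frame starts at the first row, so each row sums every daily count up to and including its own date.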
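
Changing the frame bounds turns the running total into a moving sum. A sketch over the same user_daily table, assuming it has at most one aggregated row per p_date after the group by:

```sql
-- Moving sum over the current row and the 6 rows before it
select p_date,
       cnt,
       sum(cnt) over (order by p_date asc
                      rows between 6 preceding and current row) as cnt_last_7_rows
from (
  select p_date, count(*) as cnt
  from user_daily
  where p_date between '2017-09-01' and '2017-09-30'
  group by p_date
) a;
```

A rows frame counts rows, not calendar days, so if some dates are missing from user_daily the window spans more than seven days.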
