Hive: randomly sampling a fixed number of rows per group

Source: Internet  Editor: 程序博客网  Time: 2024/05/16 06:12


Requirement: group employees by rank, then randomly pick 2 rows from each rank.


Create the table:

create table temp.a (
    id    string,
    name  string,
    age   string,
    rank  string
)
ROW format delimited FIELDS TERMINATED BY ',';

load data local inpath 'a.txt' into table temp.a;

select * from temp.a;



id  name  age  rank
1   a     10   p1
2   b     34   p1
3   c     23   p2
4   d     33   p2
5   e     23   p2
6   f     67   p3
7   g     34   p3
8   h     12   p4
9   i     54   p5

SQL:

select id,
       name,
       age,
       rank
from (
    -- number the rows of each rank in random order
    select id, name, age, rank,
           row_number() over (partition by rank order by rand()) as rn
    from temp.a
) t
where t.rn <= 2;   -- keep at most 2 rows per rank

Result 1: [output screenshot not preserved]


Result 2: [output screenshot not preserved]




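Note: with order by rand(1), that is, a fixed seed, the rows are ordered the same way on every run (as long as the input data and how it is read do not change), so the query returns the same result each time.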
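The seeded variant described in the note can be sketched as below. This is only an illustration: the variable name n is an assumption introduced here (it is not in the original post) to make the per-rank sample size adjustable through Hive variable substitution, and the seed 1 is the one used in the note.

```sql
-- Assumption: n is a hypothetical variable name for the rows-per-rank limit.
set hivevar:n=2;

select id,
       name,
       age,
       rank
from (
    -- rand(1) uses a fixed seed, so the random order is repeatable
    select id, name, age, rank,
           row_number() over (partition by rank order by rand(1)) as rn
    from temp.a
) t
where t.rn <= ${hivevar:n};   -- at most n rows per rank
```

With the sample table above, ranks p1, p2 and p3 would each contribute 2 rows, while p4 and p5 contribute their single row, since a rank with fewer than n rows is returned in full. Replacing rand(1) with rand() restores a different sample on each run.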




