Spark concat_ws, collect_set

Source: Internet | Editor: 程序博客网 | Time: 2024/06/05 20:53

concat_ws

hive > select product_id, concat_ws('_', collect_set(promotion_id)) as promotion_ids from product_promotion group by product_id;
OK
5112    960024_960025_960026_960027_960028
5113    960043_960044_960045_960046
Time taken: 3.116 seconds
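concat_ws merges multiple rows into a single row: collect_set first gathers the promotion_id values of each product_id group into a de-duplicated array, and concat_ws then joins the elements of that array with the given separator ('_' here).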
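To make the two steps concrete, here is a minimal plain-Python sketch of what the query above computes. It is not Hive or Spark code: the helper functions only mimic the behaviour of collect_set and concat_ws, and the sample rows are hypothetical. One difference to keep in mind is that this sketch keeps values in first-seen order, while Hive and Spark make no promise about the order of elements returned by collect_set.

```python
def collect_set(values):
    """Drop duplicates. Keeps first-seen order here; Hive/Spark guarantee no order."""
    return list(dict.fromkeys(values))


def concat_ws(sep, values):
    """Join string values with a separator, like concat_ws on an array of strings."""
    return sep.join(values)


def promotion_ids_by_product(rows):
    """Emulate: select product_id, concat_ws('_', collect_set(promotion_id)) ... group by product_id."""
    groups = {}
    for product_id, promotion_id in rows:
        groups.setdefault(product_id, []).append(promotion_id)
    return {pid: concat_ws("_", collect_set(ids)) for pid, ids in groups.items()}


# Hypothetical rows of product_promotion: (product_id, promotion_id).
# Values are strings on purpose, since str.join (like concat_ws) expects strings.
rows = [
    ("5112", "960024"), ("5112", "960025"), ("5112", "960024"),
    ("5113", "960043"), ("5113", "960044"),
]

result = promotion_ids_by_product(rows)
for product_id, promotion_ids in result.items():
    print(product_id, promotion_ids)
```

The duplicate ("5112", "960024") row appears only once in the joined string, which is the de-duplication that collect_set contributes; using collect_list instead would keep the repeat.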

collect_set

from pyspark.sql import functions as F
F.collect_set("di_ware_no")
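Here collect_set gathers the values of di_ware_no into an array with duplicates removed. Note that when the result is passed to concat_ws, di_ware_no must be of string type, because concat_ws accepts only string or array-of-string arguments; a non-string column can be cast first, for example F.col("di_ware_no").cast("string").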