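Running Hive statements from a shell script | Creating date-partitioned Hive tables | Scheduled jobs on Linux | Replacing strings in a file with sed | Checking in shell whether an HDFS file or directory exists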
Source: Internet  Editor: 程序博客网  Time: 2024/05/29 16:33
#!/bin/bash
source /etc/profile;

##################################################
# Author: ouyangyewei                            #
#                                                #
# Content: Combineorder Algorithm                #
##################################################

# change workspace to here
cd /
cd /home/deploy/recsys/algorithm/schedule/project/combineorder

# generate product_sell data (sales of the last week, stored in yesterday's partition)
yesterday=$(date -d '-1 day' '+%Y-%m-%d')
lastweek=$(date -d '-1 week' '+%Y-%m-%d')
/usr/local/cloud/hive/bin/hive<<EOF
CREATE EXTERNAL TABLE IF NOT EXISTS product_sell(
    category_id bigint,
    province_id bigint,
    product_id bigint,
    price double,
    sell_num bigint
)
PARTITIONED BY (ds string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;

INSERT OVERWRITE TABLE product_sell PARTITION (ds='$yesterday')
select a.category_id,
       b.good_receiver_province_id as province_id,
       a.id as product_id,
       (b.sell_amount/b.sell_num) as price,
       b.sell_num
from product a
join (select si.product_id,
             s.good_receiver_province_id,
             sum(si.order_item_amount) sell_amount,
             sum(si.order_item_num) sell_num
      from so_item si
      join so s on (si.order_id=s.id)
      where si.is_gift=0 and si.is_hidden=0
        and si.ds between '$lastweek' and '$yesterday'
      group by s.good_receiver_province_id, si.product_id) b
on (a.id=b.product_id);
EOF

# generate yhd_gmv_month data (sales of the last month, bucketed by price range)
yesterday=$(date -d '-1 day' '+%Y-%m-%d')
lastmonth=$(date -d '-1 month' '+%Y-%m-%d')
/usr/local/cloud/hive/bin/hive<<EOF
CREATE EXTERNAL TABLE IF NOT EXISTS yhd_gmv_month(
    province_id bigint,
    price_area int,
    product_id bigint,
    sell_num bigint
)
PARTITIONED BY (ds string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;

INSERT OVERWRITE TABLE yhd_gmv_month PARTITION (ds='$yesterday')
select ssi.province_id,
       (case when price>0.0 and price<=10.0 then 0
             when price>10.0 and price<=20.0 then 1
             when price>20.0 and price<=30.0 then 2
             when price>30.0 then 3
             else -1 end) as price_area,
       ssi.product_id,
       ssi.sell_num
from (select s.good_receiver_province_id as province_id,
             si.product_id,
             sum(si.order_item_num) as sell_num,
             sum(si.order_item_amount)/sum(si.order_item_num) as price
      from so_item si
      join so s on (si.order_id=s.id)
      where si.is_hidden=0 and si.is_gift=0
        and si.ds between '$lastmonth' and '$yesterday'
      group by s.good_receiver_province_id, si.product_id) ssi;
EOF

# execute the combineorder algorithm job
cd /
cd /home/deploy/recsys/algorithm/schedule/project/combineorder/schedule/pms_category_rec_prod
hadoop jar /home/deploy/recsys/algorithm/schedule/project/combineorder/schedule/recommender-dm-1.0-SNAPSHOT.jar com.yhd.recommender.combineorder.schedule.CombineorderRecommendScheduler

# export "pms_category_rec_prod" data to mysql
cd /
cd /home/deploy/recsys/algorithm/schedule/project/combineorder/schedule/pms_category_rec_prod
hadoop jar /home/deploy/recsys/algorithm/schedule/project/combineorder/schedule/recommender-dm-1.0-SNAPSHOT.jar com.yhd.recommender.exporter.db.HdfsToDBProcessor

# check whether yesterday's "yhd_gmv_month" partition exists on HDFS
yesterday=$(date -d '-1 day' '+%Y-%m-%d')
hadoop fs -test -e /user/hive/warehouse/yhd_gmv_month/ds=$yesterday
if [ $? -ne 0 ]; then
    echo 'Error! Directory does not exist'
else
    # automatically update the dates in verification.properties
    oldestVersionDay=$(date -d '-3 day' '+%Y-%m-%d')
    olderVersionDay=$(date -d '-2 day' '+%Y-%m-%d')
    newVersionDay=$(date -d '-1 day' '+%Y-%m-%d')
    sed -r -i '{s/oldestVersion=\/user\/hive\/warehouse\/yhd_gmv_month\/ds=.*/oldestVersion=\/user\/hive\/warehouse\/yhd_gmv_month\/ds='"${oldestVersionDay}"'/}' \
        /home/deploy/recsys/algorithm/schedule/verifaction/combineorder/yhd_gmv_month/input/verification.properties
    sed -r -i '{s/olderVersion=\/user\/hive\/warehouse\/yhd_gmv_month\/ds=.*/olderVersion=\/user\/hive\/warehouse\/yhd_gmv_month\/ds='"${olderVersionDay}"'/}' \
        /home/deploy/recsys/algorithm/schedule/verifaction/combineorder/yhd_gmv_month/input/verification.properties
    sed -r -i '{s/newVersion=\/user\/hive\/warehouse\/yhd_gmv_month\/ds=.*/newVersion=\/user\/hive\/warehouse\/yhd_gmv_month\/ds='"${newVersionDay}"'/}' \
        /home/deploy/recsys/algorithm/schedule/verifaction/combineorder/yhd_gmv_month/input/verification.properties

    # export "yhd_gmv_month" data to mysql
    cd /
    cd /home/deploy/recsys/algorithm/schedule/project/combineorder/schedule/yhd_gmv_month
    hadoop jar /home/deploy/recsys/algorithm/schedule/project/combineorder/schedule/recommender-dm-1.0-SNAPSHOT.jar com.yhd.recommender.exporter.db.HdfsToDBProcessor
fi
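
The last step of the script (test the HDFS partition directory, then rewrite the date in the properties file) could also be factored into two small functions. The following is only a sketch under assumptions: `HADOOP_BIN`, `WAREHOUSE_DIR`, `partition_exists` and `set_version_path` are hypothetical names that do not appear in the original script, and GNU sed is assumed because the script already relies on `sed -r -i`. Using `|` as the sed delimiter means the slashes in the HDFS path do not need escaping.

```shell
#!/bin/bash
# Sketch only: every name below is hypothetical, introduced for illustration.

# Command used to talk to HDFS; overridable, defaults to "hadoop" on PATH.
HADOOP_BIN="${HADOOP_BIN:-hadoop}"
# Warehouse directory of the partitioned table, as used in the script above.
WAREHOUSE_DIR='/user/hive/warehouse/yhd_gmv_month'

# Exit status 0 when the partition directory ds=<day> exists on HDFS.
partition_exists() {
    local day="$1"
    "$HADOOP_BIN" fs -test -e "${WAREHOUSE_DIR}/ds=${day}"
}

# Replace "<key>=<WAREHOUSE_DIR>/ds=<anything>" with the given day, in place.
# '|' is the sed delimiter, so the path can be written without backslashes.
set_version_path() {
    local key="$1" day="$2" file="$3"
    sed -r -i "s|${key}=${WAREHOUSE_DIR}/ds=.*|${key}=${WAREHOUSE_DIR}/ds=${day}|" "$file"
}
```

A call such as `partition_exists "$yesterday" && set_version_path newVersion "$yesterday" verification.properties` would then cover one of the three sed lines; the other two keys follow the same pattern with their own dates.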