Using Parameters in the Oozie Hive Action

Source: Internet  Editor: 程序博客网  Time: 2024/06/16 10:21

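Goal: capture the result of one Hive action and pass it to the next Hive action.

Current state: unfortunately, the Hive action does not currently support capture-output.

Approach: use an SSH action to run the Hive script and capture its output, then pass that output to the Hive action.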

Detail:

  • The SSH script echoes the parameter as a key=value pair
echo "minDate=${minDate}"
  • The Hive action passes the parameter to the Hive script as follows
<param>DATE_TODAY=${wf:actionData('name_of_the_shell_action_goes_here')['minDate']}</param>
  • The Hive script reads the parameter under the name given in <param>
SELECT * FROM foo where date = ${DATE_TODAY};
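
Oozie reads captured output as Java-properties-style key=value lines, so the script should print only such lines on stdout. The following is a minimal sketch of a helper that trims stray whitespace before echoing the pair. The function name `emit_kv` is hypothetical (it is not part of Oozie or of the original post), and a literal date stands in for the value a Hive query would return.

```shell
#!/bin/sh
# emit_kv is a hypothetical helper (not part of Oozie): prints KEY=VALUE
# with leading and trailing whitespace stripped from the value.
emit_kv() {
    key=$1
    # Trim leading and trailing whitespace from the value.
    value=$(printf '%s' "$2" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')
    printf '%s=%s\n' "$key" "$value"
}

# Example: a literal date stands in for the result of a Hive query.
emit_kv minDate "  2016-04-27  "
```

The key printed here (`minDate`) is the one that `wf:actionData(...)['minDate']` looks up in the workflow.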

Demo

  • testShell.sh
minDate=`hive -e "select current_date;"`
echo "minDate='${minDate}'"
  • testParaShellToHive.hql
select current_date>${MYDATE};
  • workflow.xml
<?xml version="1.0" encoding="UTF-8"?>
<workflow-app xmlns="uri:oozie:workflow:0.5" name="testHive" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="uri:oozie:workflow:0.5">
    <start to="ssh-demo"/>
    <action name="ssh-demo">
        <ssh xmlns="uri:oozie:ssh-action:0.1">
            <host>${sshUser}@${hostName}</host>
            <command>sh ${SCRIPTS}/testShell.sh</command>
            <capture-output/>
        </ssh>
        <ok to="hive-demo"/>
        <error to="kill"/>
    </action>
    <action name="hive-demo">
        <hive xmlns="uri:oozie:hive-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <script>${dbScripts}/testParaShellToHive.hql</script>
            <param>MYDATE=${wf:actionData('ssh-demo')['minDate']}</param>
        </hive>
        <ok to="end"/>
        <error to="kill"/>
    </action>
    <kill name="kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
  • hive.properties
oozie.wf.application.path=hdfs:..._ftr_dtls/oozie
appPath=hdfs...ing_ftr_dtls/oozie
nameNode=hdfs://cvldhdpds1
jobTracker=ip-1...:8050
user.name=yyang
metaStoreURI=thrift://ip-10-1...ernal:9083
sshUser=bisusr
hostName=10.177.228.19
dbScripts=hdfs://cvldhdpd...ftr_dtls/oozie/hql
SCRIPTS=hdfs://cvldhdpd...ftr_dtls/oozie/shell
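
To see why testShell.sh wraps the value in single quotes, it can help to preview the query text after the parameter is filled in. The sketch below only imitates the substitution locally with sed; it is not how Oozie or Hive performs it, and `render_hql` is a hypothetical helper name introduced for illustration.

```shell
#!/bin/sh
# render_hql is a hypothetical helper: it replaces ${MYDATE} in a query
# template with the value taken from a captured "minDate=..." line.
render_hql() {
    captured=$1                  # e.g. minDate='2016-04-27'
    template=$2                  # e.g. select current_date>${MYDATE};
    value=${captured#minDate=}   # drop the "minDate=" prefix
    # Escape characters that are special in a sed replacement (\, &, |).
    escaped=$(printf '%s' "$value" | sed -e 's/[\\&|]/\\&/g')
    printf '%s\n' "$template" | sed -e 's|[$]{MYDATE}|'"$escaped"'|g'
}

# Preview the demo query with a literal date standing in for real output.
render_hql "minDate='2016-04-27'" 'select current_date>${MYDATE};'
```

With the quotes included in the captured value, the rendered query compares current_date against a quoted date literal rather than against the arithmetic expression 2016-04-27.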

Oozie run command

oozie job -oozie http://10.177.2...0/oozie -config hive.properties -run

Oozie command for viewing the log

oozie job -oozie http://10...000/oozie -log 0000074-160427121156888-oozie-oozi-W > output
vi output
Note: 0000074-160427121156888-oozie-oozi-W is your job ID.
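
The job ID needed by -log is printed by the -run command on a line of the form `job: <id>`. Here is a small sketch, assuming that output format, of a helper that pulls the ID out so it can be reused. The name `extract_job_id` is hypothetical, and a literal line stands in for real command output.

```shell
#!/bin/sh
# extract_job_id is a hypothetical helper: it reads the text printed by
# "oozie job ... -run" on stdin and prints the ID from the "job: <id>" line.
extract_job_id() {
    sed -n 's/^job:[[:space:]]*//p'
}

# Example with a literal line standing in for real "oozie job -run" output.
echo "job: 0000074-160427121156888-oozie-oozi-W" | extract_job_id
```

In practice the output of the run command would be piped into `extract_job_id` and the result passed to -log.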

Notes:

  • testParaShellToHive.hql and workflow.xml must be copied to HDFS
  • testShell.sh and hive.properties should be kept on the local filesystem

Appendix

  • Hadoop: delete a file from HDFS
    hadoop fs -rm hdfs://.../workflow.xml
  • Hadoop: copy a file from local to HDFS
    hadoop fs -put workflow.xml hdfs://.../workflow.xml
  • Hadoop: create an HDFS directory
    hadoop fs -mkdir hdfs://.../test

Reference

[1](http://blog.csdn.net/xiao_jun_0820/article/details/40370783 "oozie 知识整合 (Oozie knowledge summary)")
[2](https://www.mail-archive.com/user@oozie.apache.org/msg01136.html "Oozie - capture output and pass it to hive script as input")
