Using a Python script in Hive

Source: Internet | Published by: 小白网络技术论坛 | Editor: 程序博客网 | Time: 2024/05/19 20:59

First, for the related MapReduce operations, see the blog post http://blog.csdn.net/longshenlmj/article/details/24380041

The Python script itself was quick to write. Testing it was another matter: add file had me completely stumped.

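import sys

# Input rows come from:
# select score,down_count,collected_count,comment_count,view_count,product_code,product_name,year,month,day from zshtest
for line in sys.stdin:
    # Hive TRANSFORM passes the columns to the script separated by tabs
    (score, down_count, collected_count, comment_count, view_count,
     product_code, product_name, year, month, day) = line.rstrip('\n').split('\t')
    # Every field arrives as a string, so convert before doing arithmetic
    score, view_count = float(score), float(view_count)
    interactions = float(down_count) + float(collected_count) + float(comment_count)
    if view_count > 0:
        rank = score + interactions / view_count
    else:
        rank = 0
    print("%s\t%s\t%s\t%s\t%s\t%s" % (product_code, product_name, year, month, day, rank))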
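Since testing was the hard part, here is a minimal sketch of how the ranking logic could be checked locally before going anywhere near add file. The function names compute_rank and transform_line and the sample rows are made up for illustration; they are not part of the original script.

```python
def compute_rank(score, down_count, collected_count, comment_count, view_count):
    """Rank = score + interactions per view; 0 when there are no views."""
    if view_count > 0:
        return score + (down_count + collected_count + comment_count) / view_count
    return 0.0


def transform_line(line):
    """Turn one tab-separated input row into the tab-separated output row."""
    fields = line.rstrip("\n").split("\t")
    score, down, collected, comment, views = (float(x) for x in fields[:5])
    product_code, product_name, year, month, day = fields[5:10]
    rank = compute_rank(score, down, collected, comment, views)
    return "\t".join([product_code, product_name, year, month, day, str(rank)])


# Hypothetical sample rows (invented for this sketch, not real data).
sample_rows = [
    "4.0\t10\t5\t5\t40\tP001\tDemo App\t2014\t04\t24",
    "3.0\t0\t0\t0\t0\tP002\tOther App\t2014\t04\t24",
]

for row in sample_rows:
    print(transform_line(row))
```

Keeping the arithmetic in its own function makes it possible to check the zero-view case without a Hive session at all.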

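Does this script really have to be placed directly on the big-data server? And is the path that add file resolves actually relative to Hive's default path?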

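The path turned out to be fine; only the way I invoked the script was wrong. Why does every reference I found show the wrong approach? Did none of them hit an error?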

add file ./wanyi/t_product.py ;
from zshtest
select transform(score)
using ' python t_product.py'
as rank,score

I was missing the command that launches python. Everyone else writes using 't_product.py'. Do none of them get an error?
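One possible explanation, offered as an assumption rather than something verified against Hive itself: running a script by its bare file name only works when the operating system can execute the file directly (execute permission plus a shebang line), whereas naming the interpreter explicitly always works. The sketch below is only a local analogy in plain Python; the file t_demo.py and its contents are hypothetical stand-ins, and it does not reproduce how Hive launches the script.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical stand-in for t_product.py: echoes each input line upper-cased.
SCRIPT = "import sys\nfor line in sys.stdin:\n    sys.stdout.write(line.upper())\n"

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "t_demo.py")
    with open(path, "w") as handle:
        handle.write(SCRIPT)
    os.chmod(path, 0o644)  # readable, but no execute bit

    # Analogue of: using 'python t_demo.py' (interpreter named explicitly)
    via_interpreter = subprocess.run(
        [sys.executable, path], input="abc\n", capture_output=True, text=True
    ).stdout

    # Analogue of: using 't_demo.py' (the OS must execute the file itself)
    try:
        subprocess.run([path], input="abc\n", capture_output=True, text=True)
        direct_ok = True
    except OSError:
        direct_ok = False

print(repr(via_interpreter), direct_ok)
```

If this is indeed the cause, the people writing using 't_product.py' may simply have had executable scripts with a shebang line, which would explain why they saw no error.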

This one had me stuck for two whole days!
