Hadoop Python Streaming: Parsing a Special Text Format
Source: Internet | Editor: 程序博客网 | Time: 2024/05/05 18:47
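#!/usr/bin/env python
# Hadoop Streaming mapper: extract selected keys from key\x01value records.
import sys

# Comma-separated list of keys to extract, passed as the first argument.
skey = sys.argv[1].split(',')

for line in sys.stdin:
    dic = {}
    # Fields are tab-separated; strip the newline so it cannot leak into a value.
    cols = line.rstrip('\n').split('\t')
    # The first field is skipped; each remaining field is key\x01value.
    for kv in cols[1:]:
        kv_tmp = kv.split('\x01')
        if len(kv_tmp) >= 2:  # ignore fields that have no \x01 separator
            dic[kv_tmp[0]] = kv_tmp[1]
    # Emit the requested values in order, tab-separated; missing keys become ''.
    sys.stdout.write('\t'.join(dic.get(i, '') for i in skey) + '\n')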
hadoop jar /home/hadoop/opt/hadoop-0.20.2-cdh3u2/contrib/streaming/hadoop-streaming-0.20.2-cdh3u2.jar -input /2.txt -output /out1/ -mapper '1.py cookie_aa_ad_gid,trackid,prereferer' -reducer cat -file /home/hadoop/1.py
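To check the parsing logic without a cluster, the per-line step of the mapper can be sketched as a standalone function. This is a minimal sketch, not part of the original script: the function name `extract_fields` and the sample record (including its values `abc` and `42`) are hypothetical, and it assumes the same layout the mapper relies on, namely tab-separated fields where the first field is skipped and each later field is a key and a value joined by `\x01`.

```python
def extract_fields(line, wanted_keys):
    """Return the values of wanted_keys from one record, joined by tabs.

    Hypothetical helper mirroring the mapper's per-line logic.
    """
    cols = line.rstrip('\n').split('\t')
    pairs = {}
    for kv in cols[1:]:  # the first field is not a key/value pair
        parts = kv.split('\x01')
        if len(parts) >= 2:  # skip fields without a \x01 separator
            pairs[parts[0]] = parts[1]
    # Keys that are absent from the record produce an empty column.
    return '\t'.join(pairs.get(k, '') for k in wanted_keys)


# Made-up sample record: an id field followed by two key\x01value fields.
sample = 'row1\tcookie_aa_ad_gid\x01abc\ttrackid\x0142\n'
result = extract_fields(sample, ['cookie_aa_ad_gid', 'trackid', 'prereferer'])
print(repr(result))
```

Because `prereferer` is missing from the sample record, the output line ends with an empty third column, so every output row keeps the same number of columns as the key list passed on the command line.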