Grep with Hadoop

Source: Internet  Editor: 程序博客网  Time: 2024/06/06 17:03

hadoop streaming  -D stream.non.zero.exit.is.failure=false ...
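The flag matters because a grep-style mapper reports "no match" through its exit status. Below is a minimal local sketch of that behaviour, assuming `grep -E` (equivalent to the `egrep` used in the script) and a two-line inline sample standing in for one input split. `grep_status` is a hypothetical helper name, not part of Hadoop.

```shell
#!/bin/sh
# Hypothetical helper: print the exit status of grep -E for pattern $1
# applied to a small inline sample (illustrative data only).
grep_status() {
    printf 'alpha\nbeta\n' | grep -E "$1" >/dev/null
    echo $?
}

# A pattern that matches at least one line: status 0.
echo "pattern 'alpha' -> exit status $(grep_status 'alpha')"
# A pattern that matches nothing: status 1, even though nothing went wrong.
echo "pattern 'gamma' -> exit status $(grep_status 'gamma')"
```

A split that contains no matching line therefore ends with status 1. Streaming normally treats a non-zero mapper exit as a task failure, and stream.non.zero.exit.is.failure=false tells it to accept that status instead.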


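#!/bin/sh
isdebug=false
# Locations of the streaming jar and the hadoop launcher on this cluster.
STREAMING=/home/work/software/hadoop/contrib/streaming/hadoop-streaming.jar
HADOOP=/home/work/software/hadoop/bin/hadoop
# $1: input path to search; results go to a fixed output directory.
input_path=$1
output_path=/home/user1/tmp
# Remove any previous output first; the job will not start if the output directory already exists.
echo "$HADOOP fs -rmr $output_path"
$HADOOP fs -rmr $output_path
# Print an abbreviated form of the command for reference, then run it.
echo "$HADOOP jar $STREAMING -D mapred.reduce.tasks=1 -D mapred.job.priority=VERY_HIGH -D mapred.job.name='sunlin-s:grep' -input $input_path  -output $output_path -mapper 'egrep \"$2\"' "
# $2: pattern passed to egrep, which runs as the mapper on every input split.
$HADOOP jar $STREAMING -D mapred.reduce.tasks=1 -D mapred.job.priority=VERY_HIGH -D mapred.job.name='grep' -D stream.non.zero.exit.is.failure=false -D mapred.max.split.size=800000000 -input $input_path  -output $output_path -mapper "egrep '$2'"
exit 0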

usage: sh grep.sh to_grep_path grep_str
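Before submitting a job, the pattern can be tried on a local sample. The sketch below is only an approximation of what the script does on the cluster: it filters a local file with `grep -E` and sorts the matches, which roughly mimics the shuffle in front of the single reducer. `local_grep` and the sample lines are hypothetical, and ordering and separators in the real job output may differ.

```shell
#!/bin/sh
# Hypothetical helper, not part of Hadoop.
# $1: local input file, $2: extended regular expression.
local_grep() {
    # "|| true" plays the role of stream.non.zero.exit.is.failure=false:
    # an input with no match is not treated as an error.
    { grep -E "$2" "$1" || true; } | LC_ALL=C sort
}

# Illustrative sample data written to a temporary file.
sample=$(mktemp)
printf 'b ERROR disk\na INFO ok\na ERROR net\n' > "$sample"
local_grep "$sample" 'ERROR'
rm -f "$sample"
```

Once the pattern selects the expected lines locally, the same string can be passed as grep_str to grep.sh.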


